Last Updated : 07 Aug, 2024
Python is a great language for data analysis, primarily because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier. This article shows how to extract rows using Pandas .iloc[] in Python.
Pandas .iloc[] Syntax
Syntax: pandas.DataFrame.iloc[]
Parameters: The integer position of a row, a list of integer positions, or a slice of positions.
Return type: DataFrame or Series, depending on the parameters
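As a small illustration of the return types listed above, the sketch below uses a made-up three-row DataFrame (the column names and values are assumptions for demonstration, not part of the dataset used later). A single integer yields a Series, while a list of integers or a slice yields a DataFrame.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"Age": [25, 30, 22], "Salary": [50000, 60000, 45000]})

single = df.iloc[0]        # one integer -> Series holding that row
several = df.iloc[[0, 2]]  # list of integers -> DataFrame
sliced = df.iloc[0:2]      # slice -> DataFrame (end position excluded)

print(type(single).__name__)   # Series
print(type(several).__name__)  # DataFrame
print(sliced.shape)            # (2, 2)
```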
What is Pandas .iloc[] in Python?
In the Python Pandas library, .iloc[] is an indexer used for integer-location-based indexing of data in a DataFrame. It lets you select specific rows and columns by their integer positions, which makes it useful whenever you want to access or manipulate data by position rather than by label.
Dataset Used: The first two examples read a CSV file named nba.csv, which must be available in the working directory.
Extracting Rows using Pandas .iloc[] in Python
Pandas provides the DataFrame.iloc[] indexer to retrieve rows by position. It is useful when the index labels of a DataFrame are something other than the default numeric series 0, 1, 2, 3, ..., n, or when the user doesn't know the index labels. Rows are extracted by their integer position, which is implicit and not displayed in the DataFrame.
There are several ways to extract rows using Pandas .iloc[] in Python. The commonly used ones covered here are:
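To make the difference between position and label concrete, here is a minimal sketch that assumes a small hypothetical DataFrame whose index labels are 10, 20, 30 instead of the default 0, 1, 2. The first row is reachable with position 0 through .iloc[], while .loc[] needs the actual label.

```python
import pandas as pd

# Hypothetical data with a non-default integer index
df = pd.DataFrame({"Score": [88, 92, 79]}, index=[10, 20, 30])

first_by_position = df.iloc[0]  # first row, regardless of its label
row_by_label = df.loc[10]       # row whose index label is 10

print(first_by_position.equals(row_by_label))  # True

# No row is labelled 0, so a label-based lookup fails
try:
    df.loc[0]
except KeyError:
    print("label 0 does not exist")
```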
- Selecting rows using Pandas .iloc and loc
- Selecting Multiple Rows using Pandas .iloc[] in Python
- Select Rows by Name or Index using Pandas .iloc[] in Python
Selecting rows using Pandas .iloc and loc
In this example, the row at the same index number is extracted with both .iloc[] and .loc[], and the results are compared. Since the default index is numeric, the index labels are also integers, so position 3 and label 3 refer to the same row.
# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv('nba.csv')

# retrieving row by loc method
row1 = data.loc[3]

# retrieving row by iloc method
row2 = data.iloc[3]

# checking if values are equal
print(row1 == row2)
Output:
Name True
Team True
Number True
Position True
Age True
Height True
Weight True
College True
Salary True
Name: 3, dtype: bool
As the output shows, both methods return the same result.
Selecting Multiple Rows using Pandas .iloc[] in Python
In this example, multiple rows are extracted, first by passing a list of positions and then by passing a slice that covers the same range. After that, both results are compared.
# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv('nba.csv')

# retrieving rows by iloc with a list of positions
row1 = data.iloc[[4, 5, 6, 7]]

# retrieving rows by iloc with a slice (position 8 is excluded)
row2 = data.iloc[4:8]

# comparing values
print(row1 == row2)
Output:
Name Team Number Position Age Height Weight College Salary
4 True True True True True True True False True
5 True True True True True True True False True
6 True True True True True True True True True
7 True True True True True True True True True
As the output shows, both selections return the same rows. Every comparison is True except the two values in the College column that are NaN: NaN never compares equal to anything, including itself, so == reports False even though the rows are identical.
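The NaN behaviour can be reproduced without the CSV file. The sketch below assumes a tiny hypothetical DataFrame with one missing value. It shows the element-wise comparison returning False for the NaN cell, and DataFrame.equals(), which treats NaN values in the same location as equal, as one way to confirm that the two selections match.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
df = pd.DataFrame({"Name": ["A", "B"], "Weight": [np.nan, 180.0]})

by_list = df.iloc[[0, 1]]
by_slice = df.iloc[0:2]

# Element-wise comparison: the NaN cell compares as False
elementwise = by_list == by_slice
print(elementwise)

# equals() treats NaN values in the same position as equal
print(by_list.equals(by_slice))  # True
```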
Select Rows by Name or Index using Pandas .iloc[] in Python
This code uses Pandas to create a DataFrame with information about individuals (Geek1 to Geek5) regarding their age and salary. It sets the ‘Name’ column as the index for clarity. The original DataFrame is displayed, and then it demonstrates the extraction of a single row (Geek1) and multiple rows (Geek2 to Geek3) using Pandas .iloc[] for integer-location based indexing. The extracted rows are printed for verification.
import pandas as pd

# Creating a sample DataFrame
data = pd.DataFrame({
    'Name': ['Geek1', 'Geek2', 'Geek3', 'Geek4', 'Geek5'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000]
})

# Setting 'Name' column as the index for clarity
data.set_index('Name', inplace=True)

# Displaying the original DataFrame
print("Original DataFrame:")
print(data)

# Extracting a single row by position
row_geek1 = data.iloc[0, :]
print("\nExtracted Row (Geek1):")
print(row_geek1)

# Extracting multiple rows using a slice (position 3 is excluded)
rows_geek2_to_geek3 = data.iloc[1:3, :]
print("\nExtracted Rows (Geek2 to Geek3):")
print(rows_geek2_to_geek3)
Output:
Original DataFrame:
Age Salary
Name
Geek1 25 50000
Geek2 30 60000
Geek3 22 45000
Geek4 35 70000
Geek5 28 55000
Extracted Row (Geek1):
Age 25
Salary 50000
Name: Geek1, dtype: int64
Extracted Rows (Geek2 to Geek3):
Age Salary
Name
Geek2 30 60000
Geek3 22 45000
Conclusion
In conclusion, Pandas .iloc[] is a practical tool for extracting rows by integer position. It is most valuable in datasets where position matters more than labels, or where the labels are unknown. It can retrieve a single row, a list of rows, or a slice, and it gives the same result no matter what the index labels are.
Extracting rows using Pandas .iloc[] in Python – FAQs
How to Drop Rows Using iloc in Pandas?
While iloc is generally used for indexing and selecting data in Pandas, it does not drop rows directly. To drop rows by index position, you can combine positional lookup with the drop method, or use iloc slicing to create a new DataFrame that excludes the rows you want to drop.
Example using slicing to exclude rows:
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'A': range(10),
    'B': range(10, 20)
})

# Drop the first 5 rows
df = df.iloc[5:]  # Keeps rows from position 5 onwards
print(df)
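Slicing only covers contiguous blocks. For scattered rows, one possible approach is to translate positions into index labels with df.index and pass those labels to drop. This is a sketch that assumes the index labels are unique, and the positions chosen here are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({"A": range(10), "B": range(10, 20)})

# Translate positions 1, 3 and 5 into index labels, then drop those labels
positions = [1, 3, 5]
trimmed = df.drop(df.index[positions])

print(trimmed["A"].tolist())  # [0, 2, 4, 6, 7, 8, 9]
```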
What is the Difference Between iloc and [] in Pandas?
iloc: This is an integer-location based indexer used to access data at specific positions in the DataFrame. It is strictly integer-based, from 0 to length-1 of the axis, and retrieves rows and columns by integer position.
df.iloc[0]  # Retrieves the first row of the DataFrame
[]: This indexing operator is more versatile and can select columns by column name or rows by a boolean array.
df['A']  # Retrieves the column named 'A'
df[df['A'] > 5]  # Retrieves rows where the value in column 'A' is greater than 5
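The two operators can be compared side by side. The sketch below assumes a small hypothetical DataFrame with string index labels, so that positions and labels cannot be confused.

```python
import pandas as pd

# Hypothetical frame with string labels so positions and labels differ
df = pd.DataFrame({"A": [3, 6, 9], "B": [1, 2, 3]}, index=["x", "y", "z"])

col_a = df["A"]             # [] with a column name -> column as a Series
filtered = df[df["A"] > 5]  # [] with a boolean mask -> matching rows
first_row = df.iloc[0]      # iloc with an integer -> row at position 0
cell = df.iloc[1, 0]        # iloc with (row, column) positions -> scalar

print(filtered.index.tolist())  # ['y', 'z']
print(cell)                     # 6
```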
What Does iloc[:0] Do?
The expression iloc[:0] selects rows up to but not including position 0, which returns an empty DataFrame with the same column headers.
Example:
df.iloc[:0]
This returns an empty DataFrame with the same structure (columns and types) as df but no rows.
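A quick check of this behaviour, using a hypothetical two-column DataFrame (the data is an assumption for illustration only):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [0.5, 1.5]})

empty = df.iloc[:0]

print(len(empty))                      # 0
print(list(empty.columns))             # ['A', 'B']
print(empty.dtypes.equals(df.dtypes))  # True
```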
How to Drop the First 5 Rows in Pandas?
To drop the first 5 rows in a Pandas DataFrame, you can use the drop method with the row indices you want to remove, or you can simply slice the DataFrame to skip the first 5 rows.
Example using slicing:
df = df.iloc[5:]  # Keeps rows starting from position 5, dropping the first 5 rows
What is the Difference Between [] and {} in Python?
Square brackets [] are used to define lists in Python. Lists are ordered, mutable collections of items that can be of mixed types.
my_list = [1, 'apple', 3.14]
Curly braces {} are used to define dictionaries or sets in Python.
- As a dictionary, it contains key-value pairs where each key is unique.
- When used with distinct elements without key-value pairs, it defines a set, which is an unordered collection of unique elements.
my_dict = {'name': 'Alice', 'age': 25}
my_set = {1, 2, 3}
Both are fundamental data structures in Python, used extensively in various types of applications.
Kartikaybhutani