Pandas Articles
How to select subset of data with Index Labels in Python Pandas?
Pandas provides powerful selection capabilities to extract subsets of data using either index positions or index labels. This article demonstrates how to select data using index labels with the .loc accessor. The .loc accessor works much like a Python dictionary: it selects data by index label rather than by position. This differs from .iloc, which selects by integer position, like a Python list.
Setting Up the Dataset
Let's start by importing a movies dataset with the title as the index:
import pandas as pd
movies = pd.read_csv("https://raw.githubusercontent.com/sasankac/TestDataSet/master/movies_data.csv", ...
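The contrast between label-based and position-based selection can be sketched on a small stand-in table. The DataFrame below is hypothetical (its titles, columns, and values are placeholders, not the contents of the movies dataset), so treat it as an illustration rather than the article's own example.

```python
import pandas as pd

# Hypothetical stand-in for the movies dataset; all values are placeholders.
movies = pd.DataFrame(
    {"year": [2001, 2005, 2010], "duration": [90, 120, 105]},
    index=["Movie A", "Movie B", "Movie C"],
)

# Label-based selection: one row, looked up by its index label.
by_label = movies.loc["Movie B"]

# Position-based selection: the same row, looked up by integer position.
by_position = movies.iloc[1]

# Label slices include both endpoints; position slices exclude the stop.
label_slice = movies.loc["Movie A":"Movie B"]
position_slice = movies.iloc[0:1]

print(by_label)
print(label_slice)
```

Note the slicing difference: the .loc slice returns two rows, while the .iloc slice with the same start and a stop of 1 returns one.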
Boolean Indexing in Pandas
Boolean indexing in Pandas allows you to select data from DataFrames using boolean vectors. This powerful feature lets you filter rows based on True/False values, either through boolean indices or by passing boolean arrays directly to the DataFrame.
Creating a DataFrame with Boolean Index
First, let's create a DataFrame with a boolean index vector:
import pandas as pd
# data
data = {
    'Name': ['Hafeez', 'Srikanth', 'Rakesh'],
    'Age': [19, 20, 19]
}
# creating a DataFrame with boolean index vector
data_frame = pd.DataFrame(data, index=[True, False, ...
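As a minimal sketch of the second approach mentioned above (passing a boolean array directly), the snippet below reuses the names and ages from the teaser but keeps the default integer index. The variable names mask, selected, and also_selected are illustrative.

```python
import pandas as pd

data = {"Name": ["Hafeez", "Srikanth", "Rakesh"], "Age": [19, 20, 19]}
df = pd.DataFrame(data)

# A comparison on a column produces a boolean Series, one value per row.
mask = df["Age"] == 19

# Indexing with the mask keeps only the rows where it is True.
selected = df[mask]

# A plain list of booleans with one entry per row works the same way.
also_selected = df[[True, False, True]]

print(selected)
```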
Add new column in Pandas Data Frame Using a Dictionary
A Pandas DataFrame is a two-dimensional tabular data structure with rows and columns. You can add a new column by mapping values from a Python dictionary to an existing column using the map() function.
Creating a DataFrame
First, create a DataFrame from a Pandas Series:
import pandas as pd
s = pd.Series([6, 8, 3, 1, 12])
df = pd.DataFrame(s, columns=['Month_No'])
print(df)
   Month_No
0         6
1         8
2 ...
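The mapping step can be sketched as follows. The teaser does not show the dictionary used in the full article, so month_names and the Month_Name column are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Month_No": [6, 8, 3, 1, 12]})

# Hypothetical lookup table; the original article's dictionary is not shown.
month_names = {1: "Jan", 3: "Mar", 6: "Jun", 8: "Aug", 12: "Dec"}

# map() looks up each value of Month_No in the dictionary.
# Values with no matching key would become NaN.
df["Month_Name"] = df["Month_No"].map(month_names)

print(df)
```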
Accessing elements of a Pandas Series
A Pandas Series is a one-dimensional labeled array that can hold any data type. Elements can be accessed using integer position, custom index labels, or slicing.
Creating a Series
import pandas as pd
s = pd.Series([11, 8, 6, 14, 25], index=['a', 'b', 'c', 'd', 'e'])
print(s)
a    11
b     8
c     6
d    14
e    25
dtype: int64
Accessing a Single Element
Use integer position or custom label to access individual elements: ...
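A short sketch of the three access styles on the same Series. It uses .iloc for positions explicitly, which is an editorial choice here: plain s[0] on a label-indexed Series is ambiguous and deprecated in recent pandas versions.

```python
import pandas as pd

s = pd.Series([11, 8, 6, 14, 25], index=["a", "b", "c", "d", "e"])

# Single element by custom label.
by_label = s["c"]

# Single element by integer position.
by_position = s.iloc[2]

# Slicing by label includes the end label.
label_slice = s["b":"d"]

# Slicing by position excludes the stop position.
position_slice = s.iloc[1:3]

print(by_label, by_position)
print(label_slice.tolist(), position_slice.tolist())
```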
How to iterate over rows in a DataFrame in Pandas?
To iterate over the rows of a DataFrame in Pandas, we can use the iterrows() method, which yields each row as an (index, Series) pair.
Steps
- Create a DataFrame, df (two-dimensional, size-mutable, potentially heterogeneous tabular data).
- Iterate over df using the df.iterrows() method.
- Print each row with its index.
Example
import pandas as pd
df = pd.DataFrame(
    {
        "x": [5, 2, 1, 9],
        "y": [4, 1, 5, 10],
        "z": [4, 1, 5, 0]
    }
)
# sep="\n" puts the table on its own lines so the header stays aligned
print("Given DataFrame:", df, sep="\n")
# iterrows() yields (index, Series) pairs, one per row
for index, row in df.iterrows():
    print("Row", index, "contains:")
    print(row["x"], row["y"], row["z"])
Output
Given DataFrame:
   x   y  z
0  5   4  4
1  2   1  1
2  1   5  5
3  9  10  0
Row 0 contains:
5 4 4
Row 1 contains:
2 1 1
Row 2 contains:
1 5 5
Row 3 contains:
9 10 0
Select rows from a Pandas DataFrame based on column values
To select rows from a DataFrame based on column values, we can follow these steps:
Steps
- Create a DataFrame, df (two-dimensional, size-mutable, potentially heterogeneous tabular data).
- Print the input DataFrame.
- Use df.loc[df["x"] == 2] to print the rows where x == 2.
- Similarly, print the rows where x >= 2 and the rows where x < 2.
Example
import pandas as pd
df = pd.DataFrame(
    {
        "x": [5, 2, 1, 9],
        "y": [4, 1, 5, 10],
        "z": [4, 1, 5, 0]
    }
)
print("Given DataFrame is:", df, sep="\n")
print("When column x value == 2:", df.loc[df["x"] == 2], sep="\n")
...
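Since the listing above is cut off, here is a minimal self-contained sketch of the three selections described in the steps. The result variable names are illustrative and not taken from the original article.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [5, 2, 1, 9], "y": [4, 1, 5, 10], "z": [4, 1, 5, 0]}
)

# Each comparison builds a boolean mask; .loc keeps the rows where it is True.
equal_two = df.loc[df["x"] == 2]
at_least_two = df.loc[df["x"] >= 2]
below_two = df.loc[df["x"] < 2]

print(equal_two)
print(at_least_two)
print(below_two)
```

The selected rows keep their original index labels, so equal_two still carries index 1.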
How to rename column names in a Pandas DataFrame?
To rename columns in a Pandas DataFrame, we can overwrite df.columns with a list of new column names. The list must contain one name per existing column.
Steps
- Create a DataFrame, df (two-dimensional, size-mutable, potentially heterogeneous tabular data).
- Print the input DataFrame.
- Overwrite df.columns with the new list of column names.
- Print the DataFrame again with the renamed columns.
Example
import pandas as pd
df = pd.DataFrame(
    {
        "x": [5, 2, 1, 9],
        "y": [4, 1, 5, 10],
        "z": [4, 1, 5, 0]
    }
)
# sep="\n" puts the table on its own lines so the header stays aligned
print("Input DataFrame is:", df, sep="\n")
df.columns = ["a", "b", "c"]
print("After renaming, DataFrame is:", df, sep="\n")
Output
Input DataFrame is:
   x   y  z
0  5   4  4
1  2   1  1
2  1   5  5
3  9  10  0
After renaming, DataFrame is:
   a   b  c
0  5   4  4
1  2   1  1
2  1   5  5
3  9  10  0
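Overwriting df.columns replaces every name at once. When only some columns need new names, the DataFrame.rename() method with a mapping is a common alternative. The sketch below shows it as a different technique from the one in the article, not a replacement for it.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [5, 2, 1, 9], "y": [4, 1, 5, 10], "z": [4, 1, 5, 0]}
)

# rename() takes an {old_name: new_name} mapping and returns a new DataFrame;
# columns missing from the mapping keep their names, and df is left unchanged.
renamed = df.rename(columns={"x": "a"})

print(list(renamed.columns))
print(list(df.columns))
```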
Select multiple columns in a Pandas DataFrame
To select multiple columns in a Pandas DataFrame, we can create a new DataFrame from the existing one by passing a list of column names.
Steps
- Create a DataFrame, df (two-dimensional, size-mutable, potentially heterogeneous tabular data).
- Print the input DataFrame.
- Create a new DataFrame, df1, by selecting multiple columns.
- Print the new DataFrame with the selected columns.
Example
import pandas as pd
df = pd.DataFrame(
    {
        "x": [5, 2, 1, 9],
        "y": [4, 1, 5, 10],
        "z": [4, 1, 5, 0]
    }
)
# sep="\n" puts the table on its own lines so the header stays aligned
print("Input DataFrame is:", df, sep="\n")
# a list of column names inside [] returns a DataFrame with those columns
df1 = df[['x', 'y']]
print("After selecting multiple columns:", df1, sep="\n")
Output
Input DataFrame is:
   x   y  z
0  5   4  4
1  2   1  1
2  1   5  5
3  9  10  0
After selecting multiple columns:
   x   y
0  5   4
1  2   1
2  1   5
3  9  10
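One detail worth illustrating is why the example uses double brackets. A list of names returns a DataFrame, while a single name returns a Series. The following is a small sketch with illustrative variable names.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [5, 2, 1, 9], "y": [4, 1, 5, 10], "z": [4, 1, 5, 0]}
)

# A list of column names yields a DataFrame, even if the list has one entry.
two_columns = df[["x", "y"]]
one_column_frame = df[["x"]]

# A bare column name yields a Series instead.
one_column_series = df["x"]

print(type(two_columns).__name__)
print(type(one_column_frame).__name__)
print(type(one_column_series).__name__)
```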
How to get the row count of a Pandas DataFrame?
To get the row count of a Pandas DataFrame, we can take the length of the DataFrame's index.
Steps
- Create a DataFrame, df (two-dimensional, size-mutable, potentially heterogeneous tabular data).
- Print the input DataFrame.
- Print the length of the DataFrame index, len(df.index).
Example
import pandas as pd
df = pd.DataFrame(
    {
        "x": [5, 2, 1, 9],
        "y": [4, 1, 5, 10],
        "z": [4, 1, 5, 0]
    }
)
# sep="\n" puts the table on its own lines so the header stays aligned
print("Input DataFrame is:", df, sep="\n")
print("Row count of DataFrame is:", len(df.index))
Output
Input DataFrame is:
   x   y  z
0  5   4  4
1  2   1  1
2  1   5  5
3  9  10  0
Row count of DataFrame is: 4
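For comparison, two other common ways to get the same number are len(df) and df.shape[0]. The sketch below shows these alongside the article's len(df.index); they are alternatives, not the article's method.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [5, 2, 1, 9], "y": [4, 1, 5, 10], "z": [4, 1, 5, 0]}
)

rows_from_index = len(df.index)  # the approach used in the article
rows_from_len = len(df)          # len() of a DataFrame counts its rows
rows_from_shape = df.shape[0]    # shape is a (rows, columns) tuple

print(rows_from_index, rows_from_len, rows_from_shape)
```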
How to get the list of column headers from a Pandas DataFrame?
To get a list of Pandas DataFrame column headers, we can use df.columns.values.
Steps
- Create a DataFrame, df (two-dimensional, size-mutable, potentially heterogeneous tabular data).
- Print the input DataFrame.
- Print df.columns.values converted to a list.
Example
import pandas as pd
df = pd.DataFrame(
    {
        "x": [5, 2, 1, 9],
        "y": [4, 1, 5, 10],
        "z": [4, 1, 5, 0]
    }
)
# sep="\n" puts the table on its own lines so the header stays aligned
print("Input DataFrame is:", df, sep="\n")
print("List of headers:", list(df.columns.values))
Output
Input DataFrame is:
   x   y  z
0  5   4  4
1  2   1  1
2  1   5  5
3  9  10  0
List of headers: ['x', 'y', 'z']
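Two shorter spellings give the same list: df.columns.tolist() and list(df). Both are shown below next to the article's approach as a sketch of equivalent alternatives.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [5, 2, 1, 9], "y": [4, 1, 5, 10], "z": [4, 1, 5, 0]}
)

headers_from_values = list(df.columns.values)  # the approach used in the article
headers_from_tolist = df.columns.tolist()      # Index objects have tolist()
headers_from_iteration = list(df)              # iterating a DataFrame yields its column labels

print(headers_from_tolist)
```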