Articles by Vani Nalliappan
Page 4 of 13
Write a Python code to calculate percentage change between Id and Age columns of the top 2 and bottom 2 values
Sometimes we need the percentage change between consecutive rows in specific columns of a DataFrame. The pct_change() method computes this change relative to the previous row, which is useful for analyzing trends in data.
Understanding Percentage Change
For each element, pct_change() returns (current - previous) / previous. The result is a fraction, so multiply it by 100 to express it as a percentage. The first row has no previous element, so its result is NaN.
Example Dataset
Let's start by creating a sample DataFrame with Id and Age columns:
import pandas as pd
df = pd.DataFrame({ "Id": [1, 2, 3, None, 5], ...
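As a minimal sketch of the idea, the snippet below applies pct_change() to the top two and bottom two rows of a small DataFrame. The values are hypothetical and are not the article's truncated dataset; missing values are left out on purpose so the result depends only on the formula above.

```python
import pandas as pd

# Hypothetical data (assumption): not the dataset from the article
df = pd.DataFrame({"Id": [1, 2, 4, 5], "Age": [10, 15, 30, 24]})

# Percentage change within the top 2 rows and within the bottom 2 rows
top = df.head(2).pct_change()
bottom = df.tail(2).pct_change()

print(top)     # first row is NaN because it has no previous row
print(bottom)
```

In each result only the second row carries a value, because each slice contains a single pair of consecutive rows.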
Write a Python program to perform table-wise pipe function in a dataframe
The pipe() function in Pandas applies a custom function to an entire DataFrame. This is useful for table-wise operations, where a user-defined function transforms the whole dataset.
Understanding DataFrame pipe() Function
The pipe() method passes the DataFrame as the first argument to a function, along with any additional arguments you specify. This enables method chaining and cleaner code organization.
Syntax
DataFrame.pipe(func, *args, **kwargs)
Example: Table-wise Operation
Let's create a DataFrame and apply a custom function using pipe():
import pandas as pd ...
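The following sketch illustrates how pipe() forwards the DataFrame and extra arguments to a function. The helper add_offset and the sample values are hypothetical names chosen for illustration; they do not come from the article.

```python
import pandas as pd

def add_offset(frame, offset):
    """Hypothetical helper: return a new frame with `offset` added to every cell."""
    return frame + offset

df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# df.pipe(add_offset, 5) is equivalent to add_offset(df, 5)
result = df.pipe(add_offset, 5)

# pipe() calls can be chained; keyword arguments are forwarded as well
chained = df.pipe(add_offset, offset=1).pipe(lambda f: f * 2)

print(result)
print(chained)
```

Chaining reads top to bottom in the order the transformations are applied, which is the main reason to prefer pipe() over nested function calls.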
Write a Python program to trim the minimum and maximum threshold value in a dataframe
Sometimes you need to limit values in a DataFrame to fall within specific minimum and maximum thresholds. Pandas provides the clip() method to trim values that exceed these boundaries.
Understanding DataFrame Clipping
The clip() method constrains values between a lower and upper limit:
- lower parameter sets the minimum threshold
- upper parameter sets the maximum threshold
- Values below the lower limit are replaced with the lower limit
- Values above the upper limit are replaced with the upper limit
Syntax
DataFrame.clip(lower=None, upper=None, axis=None)
Creating Sample Data
Let's create a DataFrame with ...
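A short sketch of clipping, assuming hypothetical sample values and thresholds of 10 and 50 (neither is taken from the article):

```python
import pandas as pd

# Hypothetical data (assumption): values on both sides of the thresholds
df = pd.DataFrame({"A": [5, 25, 60], "B": [-3, 40, 100]})

# Values below 10 become 10, values above 50 become 50, the rest are unchanged
clipped = df.clip(lower=10, upper=50)

print(clipped)
```

clip() returns a new DataFrame here; the original df keeps its unclipped values.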
Write a Python program to quantify the shape of a distribution in a dataframe
Distribution shape analysis is crucial in data science for understanding data characteristics. Python's Pandas provides built-in methods to calculate kurtosis (tail weight, often loosely described as peakedness) and skewness (asymmetry) to quantify distribution shapes.
What is Kurtosis and Skewness?
Kurtosis compares the tails and peak of a distribution with those of a normal distribution. Pandas reports excess kurtosis, so a normal distribution scores about 0: values above 0 indicate heavier tails and a sharper peak, while negative values indicate lighter tails and a flatter shape.
Skewness measures the asymmetry of a distribution. Positive skewness indicates a tail extending toward higher values, while negative skewness indicates a tail extending toward lower values.
Creating a Sample DataFrame
Let's ...
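As an illustration, the sketch below computes both measures with DataFrame.skew() and DataFrame.kurt() on two hypothetical columns: one symmetric and evenly spread, one with a single large value pulling the tail to the right. The column names and values are assumptions made for this example.

```python
import pandas as pd

# Hypothetical data (assumption): one symmetric column, one right-tailed column
df = pd.DataFrame({
    "symmetric": [1, 2, 3, 4, 5],
    "right_tailed": [1, 1, 2, 2, 10],
})

skewness = df.skew()   # asymmetry per column
kurtosis = df.kurt()   # excess kurtosis per column (normal distribution is about 0)

print(skewness)
print(kurtosis)
```

The symmetric column has zero skewness and negative excess kurtosis, since its values are spread evenly rather than concentrated around the mean; the right-tailed column has positive skewness.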
Write a Python program to find the mean absolute deviation of rows and columns in a dataframe
Mean Absolute Deviation (MAD) measures the average distance between each data point and the mean of the dataset. In older versions of pandas, you can calculate MAD for both rows and columns of a DataFrame using the mad() method; it was deprecated in pandas 1.5 and removed in 2.0, where the formula has to be applied directly.
What is Mean Absolute Deviation?
MAD is calculated as the mean of absolute deviations from the arithmetic mean: MAD = mean(|x - mean(x)|)
Creating a Sample DataFrame
Let's start by creating a DataFrame with sample data:
import pandas as pd
data = {"Column1": [6, 5.3, 5.9, 7.8, 7.6, 7.45, 7.75], ...
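Because DataFrame.mad() is not available in pandas 2.0 and later, a version-independent sketch can apply the formula MAD = mean(|x - mean(x)|) directly. The data below is hypothetical and is not the article's truncated dataset.

```python
import pandas as pd

# Hypothetical data (assumption): small values that are easy to check by hand
df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 4, 9]})

# Column-wise MAD: deviations from each column's mean, averaged down the rows
col_mad = (df - df.mean()).abs().mean()

# Row-wise MAD: deviations from each row's mean, averaged across the columns
row_mad = df.sub(df.mean(axis=1), axis=0).abs().mean(axis=1)

print(col_mad)
print(row_mad)
```

For column A the mean is 2, the absolute deviations are 1, 0 and 1, and their mean is 2/3, which matches the formula term by term.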
Write a Python program to find the average of first row in a Panel
A Panel was a 3-dimensional data structure in older versions of pandas (deprecated in v0.20 and removed in v0.25), so the code below runs only on pandas versions older than 0.25. To find the average of the first row, we use the major_xs() method to select a specific row and then calculate its mean.
Creating a Panel
First, let's create a Panel with sample data:
import pandas as pd
import numpy as np
# Create data dictionary with DataFrame
data = {'Column1': pd.DataFrame(np.random.randn(5, 3))}
# Create Panel from data
p = pd.Panel(data)
print("Panel values:")
print(p['Column1'])
Panel values: ...
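On current pandas, where pd.Panel no longer exists, a comparable result can be sketched with a plain dictionary of DataFrames. This is a substitute technique, not the Panel API: the dictionary stands in for the Panel items, and the fixed values from np.arange replace the article's random data so the result is reproducible.

```python
import numpy as np
import pandas as pd

# Assumption: a dict of DataFrames stands in for the removed Panel
data = {"Column1": pd.DataFrame(np.arange(15.0).reshape(5, 3))}

# Rough counterpart of selecting the first major-axis row:
# take row 0 of every item and line the items up as columns
first_rows = pd.DataFrame({name: frame.iloc[0] for name, frame in data.items()})

# Mean of the first row, one value per item
first_row_mean = first_rows.mean()

print(first_rows)
print(first_row_mean)
```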
Write a program in Python to find the minimum rank of a particular column in a dataframe
Finding the minimum rank of values in a DataFrame column is useful for data analysis and ranking operations. Pandas provides the rank() method with different ranking strategies, including the minimum rank approach.
Understanding Minimum Rank
Minimum rank assigns the lowest possible rank to tied values. For example, if two values tie for 1st place, both get rank 1, and the next rank becomes 3 (skipping rank 2).
Creating the DataFrame
Let's start by creating a sample DataFrame with age data:
import pandas as pd
data = {'Id': [1, 2, 3, 4, 5], ...
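The tie-handling rule can be seen in a small sketch using rank(method="min"). The Age values are hypothetical, chosen so that two pairs of rows tie; they are not the article's truncated data.

```python
import pandas as pd

# Hypothetical data (assumption): two ties, at ages 20 and 22
df = pd.DataFrame({"Id": [1, 2, 3, 4, 5], "Age": [20, 22, 20, 25, 22]})

# Tied values share the lowest rank of their group; the following rank is skipped
df["Age_rank"] = df["Age"].rank(method="min")

print(df)
```

Both rows with age 20 receive rank 1, both rows with age 22 receive rank 3 (rank 2 is skipped), and age 25 receives rank 5.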
Write a program in Python to create a panel from a dictionary of dataframe and print the maximum value of the first column
Pandas Panel was a 3-dimensional data structure in older versions of pandas. Though deprecated, understanding how to work with multi-dimensional data and extract maximum values from specific axes remains valuable for data analysis.
Problem Statement
We need to create a panel from a dictionary of DataFrames and find the maximum value in the first column across all items in the panel.
Solution Approach
To solve this problem, we will follow these steps:
- Create a dictionary with DataFrame containing random data
- Convert the dictionary to a Panel structure
- Use minor_xs() to select the first ...
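Since Panel is not available in current pandas, the same question can be approached by concatenating the dictionary into one DataFrame with a MultiIndex. The sketch below uses that substitute technique rather than the Panel methods named above; the two items and their fixed values are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumption: two hypothetical items with fixed values instead of random data
frames = {
    "Column1": pd.DataFrame(np.arange(15.0).reshape(5, 3)),
    "Column2": pd.DataFrame(np.arange(15.0, 30.0).reshape(5, 3)),
}

# Stack the items into one frame; the outer index level holds the item name
stacked = pd.concat(frames)

# First column of every item
first_column = stacked.iloc[:, 0]

per_item_max = first_column.groupby(level=0).max()  # maximum per item
overall_max = first_column.max()                    # maximum across all items

print(per_item_max)
print(overall_max)
```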
Write a program in Python to shift a dataframe index by two periods in positive and negative direction
In pandas, you can shift DataFrame values by a specified number of periods using the shift() method. This is useful for time series analysis, creating lag variables, or comparing data across different time periods.
Understanding DataFrame Shifting
The shift() method moves data along the specified axis:
- Positive values shift data down (forward in time)
- Negative values shift data up (backward in time)
- Shifted positions are filled with NaN values unless fill_value is given
Syntax
DataFrame.shift(periods=1, freq=None, axis=0, fill_value=None)
Creating the DataFrame
Let's create a time-indexed DataFrame to demonstrate shifting:
import ...
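A brief sketch of both directions, assuming a hypothetical daily index starting on 2020-01-01 and a single Value column. It also shows the freq argument, which moves the index labels instead of the data.

```python
import pandas as pd

# Hypothetical time-indexed data (assumption)
idx = pd.date_range("2020-01-01", periods=5, freq="D")
df = pd.DataFrame({"Value": [1, 2, 3, 4, 5]}, index=idx)

down = df.shift(2)    # values move two rows down; the first two rows become NaN
up = df.shift(-2)     # values move two rows up; the last two rows become NaN

# With freq, the index moves by two days and the values stay attached to their rows
moved_index = df.shift(2, freq="D")

print(down)
print(up)
print(moved_index)
```

Without freq the index is unchanged and data is lost at the edges; with freq no data is lost because only the labels change.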
Write a program in Python to remove first duplicate rows in a given dataframe
Duplicate rows in a DataFrame can clutter your data analysis. In pandas, you can remove duplicate rows using the drop_duplicates() method. When you set keep='last', it keeps only the last occurrence of each duplicated row and drops every earlier occurrence.
Understanding the Problem
Let's start by creating a DataFrame with duplicate rows to see how duplicate removal works:
import pandas as pd
df = pd.DataFrame({ 'Id': [1, 2, 3, 4, 5, 6, 2, 7, 3, 9, 10], 'Age': [12, 13, 14, 13, 14, 12, 13, 16, 14, 15, 14] ...
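The effect of keep='last' can be sketched with the two columns visible in the excerpt above. Treating Id and Age as the only columns is an assumption, since the original code is truncated.

```python
import pandas as pd

# Assumption: only the Id and Age columns shown in the excerpt are used
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5, 6, 2, 7, 3, 9, 10],
    "Age": [12, 13, 14, 13, 14, 12, 13, 16, 14, 15, 14],
})

# Rows (2, 13) and (3, 14) each appear twice; keep='last' drops the earlier copies
result = df.drop_duplicates(keep="last")

print(result)
```

The rows at index 1 and 2 are removed, while their later copies at index 6 and 8 are kept, so the original index labels show which occurrence survived.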