Articles by Vani Nalliappan
Write a program in Python to compute grouped data covariance and calculate grouped data covariance between two columns in a given dataframe
Covariance measures how much two variables change together. In pandas, you can compute grouped data covariance using groupby() with cov() to analyze relationships within different groups of your data.
Understanding Grouped Covariance
When you have categorical data, computing covariance within each group helps identify patterns specific to each category. The cov() function returns a covariance matrix showing relationships between all numeric columns.
Creating Sample Data
Let's start with a DataFrame containing student marks grouped by subjects:
import pandas as pd
df = pd.DataFrame({ 'subjects': ['maths', 'maths', 'maths', 'science', ...
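As a minimal sketch of the groupby() plus cov() approach described above: the column names mark1 and mark2 and all of the values are assumptions chosen for illustration, not the article's own data.

```python
import pandas as pd

# Hypothetical sample data (values are illustrative, not from the article)
df = pd.DataFrame({
    'subjects': ['maths', 'maths', 'maths', 'science', 'science', 'science'],
    'mark1': [80, 90, 85, 95, 93, 85],
    'mark2': [85, 90, 70, 75, 95, 65],
})

# One covariance matrix per group, stacked under a (group, column) MultiIndex
cov_matrices = df.groupby('subjects')[['mark1', 'mark2']].cov()
print(cov_matrices)

# Covariance between the two columns only: take the 'mark1' row of each
# group's matrix and read its 'mark2' entry, giving one value per subject
pairwise = cov_matrices.xs('mark1', level=1)['mark2']
print(pairwise)
```

The full matrix is useful when there are many numeric columns; the last step shows one way to reduce it to a single value per group when only one pair of columns matters.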
Write a program to truncate a dataframe time series data based on index value
When working with time series data in pandas, you often need to extract a specific date range from your DataFrame. The truncate() method allows you to filter data based on index values, making it useful for time-based filtering.
Understanding DataFrame Truncation
The truncate() method filters DataFrame rows based on index values using before and after parameters. For time series data, this is particularly useful when your DataFrame has a datetime index.
Creating Sample Time Series Data
Let's start by creating a DataFrame with time series data:
import pandas as pd
# Create ...
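A short sketch of truncate() on a datetime index may help here. The dates, the column name value, and the cut-off points are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical daily series: ten consecutive days starting 2020-01-01
dates = pd.date_range('2020-01-01', periods=10, freq='D')
df = pd.DataFrame({'value': range(10)}, index=dates)

# Keep only rows whose index lies between the two dates (both ends included).
# truncate() expects the index to be sorted.
truncated = df.truncate(before='2020-01-03', after='2020-01-06')
print(truncated)
```

Rows before 2020-01-03 and after 2020-01-06 are dropped, leaving four rows.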
Write a program in Python to compute autocorrelation between series and number of lags
Autocorrelation measures the correlation between a time series and its lagged version. In pandas, the autocorr() method computes the Pearson correlation coefficient between a series and its lagged values.
Understanding Autocorrelation
Autocorrelation helps identify patterns and dependencies in time series data. A lag of 1 compares each value with the previous value, a lag of 2 compares it with the value two positions back, and so on.
Creating a Series
Let's create a pandas Series with some sample data including a NaN value:
import pandas as pd
import numpy as np
series = pd.Series([2, ...
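The following sketch illustrates autocorr() at several lags and checks it against an explicit shift-and-correlate computation. The series values are hypothetical and are not the article's data.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing value
series = pd.Series([1.0, 3.0, 2.0, 5.0, np.nan, 4.0, 6.0])

# Autocorrelation at lags 1 to 3
for lag in range(1, 4):
    print(f"lag {lag}: {series.autocorr(lag=lag)}")

# The same quantity computed by hand: Pearson correlation between the
# series and a copy shifted by one position (pairs containing NaN are skipped)
lag1 = series.autocorr(lag=1)
manual_lag1 = series.corr(series.shift(1))
```

Computing the lag-1 value both ways makes it clear that autocorr() is a correlation between the series and a shifted copy of itself.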
Write a program in Python to export a given dataframe into Pickle file format and read the content from the Pickle file
Pickle is a Python serialization format that preserves the exact data types and structure of pandas DataFrames. This tutorial shows how to export a DataFrame to a pickle file and read it back.
What is Pickle Format?
Pickle is Python's native binary serialization format that maintains data types, index information, and DataFrame structure perfectly. Unlike CSV, pickle preserves datetime objects, categorical data, and multi-level indexes.
Creating and Exporting DataFrame to Pickle
Let's create a sample DataFrame and export it to pickle format:
import pandas as pd
# Create a sample DataFrame
df ...
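A minimal round-trip sketch, assuming a small made-up DataFrame and a temporary file named frame.pkl (both hypothetical), shows how to_pickle() and read_pickle() preserve datetime and categorical dtypes.

```python
import os
import tempfile

import pandas as pd

# Hypothetical DataFrame with datetime and categorical columns
df = pd.DataFrame({
    'id': [1, 2, 3],
    'when': pd.to_datetime(['2021-01-01', '2021-02-01', '2021-03-01']),
    'grade': pd.Categorical(['a', 'b', 'a']),
})

# Write to a temporary pickle file and read it back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'frame.pkl')  # illustrative file name
    df.to_pickle(path)
    restored = pd.read_pickle(path)

print(restored.dtypes)
```

Pickle files can execute arbitrary code when loaded, so read_pickle() should only be used on files from a trusted source.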
Write a program in Python to resample a given time series data and find the maximum month-end frequency
Time series resampling in Pandas allows you to change the frequency of your data and apply aggregation functions. In this tutorial, we'll learn how to resample time series data to find the maximum month-end frequency using the resample() method.
Understanding the Problem
We have a time series dataset with weekly data points and want to group them by month, then find the maximum values for each month. The result will show month-end dates with the corresponding maximum values.
Step-by-Step Solution
Step 1: Create the DataFrame
First, we'll create a DataFrame with ID numbers and generate ...
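The idea can be sketched as follows. The start date, the Id column, and the eight weekly rows are assumptions for illustration. The sketch passes a pd.offsets.MonthEnd() object rather than a string alias, since the month-end alias string differs between pandas versions.

```python
import pandas as pd

# Hypothetical weekly data: eight points, seven days apart, from 2020-01-05
dates = pd.date_range('2020-01-05', periods=8, freq='7D')
df = pd.DataFrame({'Id': range(1, 9)}, index=dates)

# Group the rows into calendar months labelled by month-end date,
# then take the maximum of each month
monthly_max = df.resample(pd.offsets.MonthEnd()).max()
print(monthly_max)
```

With this data, the January rows (Id 1 to 4) collapse to a single row dated 2020-01-31, and the February rows (Id 5 to 8) to a row dated 2020-02-29.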
Write a Python program to read Excel data from a file and read all rows of the first and last columns
Reading specific columns from an Excel file is a common data analysis task. Python's pandas library provides the iloc indexer to select rows and columns by their position, making it easy to extract the first and last columns from any dataset.
Understanding Column Selection with iloc
The iloc indexer uses integer-based indexing where:
df.iloc[:, 0] selects all rows of the first column (index 0)
df.iloc[:, -1] selects all rows of the last column (index -1)
The colon (:) means "all rows"
Creating Sample Data
First, let's create a sample Excel file to demonstrate ...
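The iloc selections listed above can be tried on any DataFrame. This sketch uses a small in-memory DataFrame with made-up columns instead of an Excel file, on the assumption that the frame returned by pd.read_excel() would be indexed the same way.

```python
import pandas as pd

# Hypothetical data standing in for the contents of an Excel sheet
df = pd.DataFrame({
    'id': [1, 2, 3],
    'name': ['a', 'b', 'c'],
    'score': [90, 80, 70],
})

first_column = df.iloc[:, 0]    # all rows, first column
last_column = df.iloc[:, -1]    # all rows, last column
both = df.iloc[:, [0, -1]]      # both columns together, as a DataFrame
print(both)
```

Passing a list of positions returns a DataFrame, while a single position returns a Series.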
Write a program in Python to read CSV data from a file and print the total sum of last two rows
When working with CSV files in Python, you often need to perform calculations on specific rows. This tutorial shows three different methods to read CSV data and calculate the sum of the last two rows using pandas.
Sample CSV Data
First, let's create a sample CSV file named pandas.csv:
Id, Data
1, 11
2, 22
3, 33
4, 44
5, 55
6, 66
7, 77
8, 88
9, 99
10, 100
Using tail() Method
The tail() method returns the last n rows of a DataFrame. Combined with sum(), it provides a clean solution ...
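A sketch of the tail() approach, using the sample data above held in an in-memory string rather than a pandas.csv file on disk (an assumption made so the example is self-contained):

```python
import io

import pandas as pd

# The sample data from the article, held in a string instead of a file
csv_text = """Id, Data
1, 11
2, 22
3, 33
4, 44
5, 55
6, 66
7, 77
8, 88
9, 99
10, 100
"""

# skipinitialspace=True drops the space that follows each comma
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

last_two = df.tail(2)                     # rows with Id 9 and 10
column_sums = last_two.sum()              # per-column sums
total = int(last_two.to_numpy().sum())    # sum over every cell of both rows

print(column_sums)
print(total)
```

Here the per-column sums are 19 for Id and 199 for Data, and the sum over all cells is 218. Which of these counts as the "total sum" depends on how the task is read.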
Write a Python program to export dataframe into an Excel file with multiple sheets
Exporting a Pandas DataFrame to an Excel file with multiple sheets is a common requirement for data analysis and reporting. Python provides several approaches to accomplish this using libraries like xlsxwriter, openpyxl, or Pandas' built-in Excel writer.
Using xlsxwriter Engine
The xlsxwriter engine provides excellent formatting options and performance for creating Excel files:
import pandas as pd
import xlsxwriter
# Create sample DataFrame
df = pd.DataFrame({
'Fruits': ["Apple", "Orange", "Mango", "Kiwi"],
'City': ["Shimla", "Sydney", "Lucknow", "Wellington"]
})
print("Original DataFrame:")
print(df)
# Create Excel writer ...
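As a hedged sketch of writing several sheets with pd.ExcelWriter: this version relies on the default engine that pandas picks for .xlsx files (which needs the openpyxl package installed) rather than the xlsxwriter engine named above. The sheet names and the temporary file name are assumptions.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'Fruits': ["Apple", "Orange", "Mango", "Kiwi"],
    'City': ["Shimla", "Sydney", "Lucknow", "Wellington"],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'fruits.xlsx')  # illustrative file name

    # Each to_excel() call on the same writer adds one sheet
    with pd.ExcelWriter(path) as writer:
        df.to_excel(writer, sheet_name='all_rows', index=False)
        df.head(2).to_excel(writer, sheet_name='first_two', index=False)

    # sheet_name=None reads every sheet into a dict keyed by sheet name
    sheets = pd.read_excel(path, sheet_name=None)

print(list(sheets))
```

Using the writer as a context manager saves and closes the file when the block exits.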
Write a Python program to separate a series of alphabets and digits and convert them to a dataframe
When working with mixed alphanumeric data in Pandas, you often need to separate alphabetic and numeric parts into different columns. This is commonly done using the str.extract() method with regular expressions.
Problem Statement
Given a Pandas Series containing strings with both letters and digits, we need to separate them into two columns in a DataFrame:
Original Series:
0    abx123
1    bcd25
2    cxy30
dtype: object
Expected DataFrame:
     0    1
0  abx  123
1  bcd ...
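One way to do this with str.extract() is sketched below, using the series from the problem statement. The regular expression is an assumption about the intended pattern: a run of letters followed by a run of digits.

```python
import pandas as pd

series = pd.Series(['abx123', 'bcd25', 'cxy30'])

# Two capture groups: letters first, then digits.
# Each group becomes one column of the resulting DataFrame.
result = series.str.extract(r'([A-Za-z]+)(\d+)')
print(result)
```

The extracted digits are still strings. Something like result[1].astype(int) would convert them if numeric values are needed.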
Write a program in Python to filter armstrong numbers in a given series
An Armstrong number (also called a narcissistic number) is a number that equals the sum of its digits raised to the power of the number of digits. For 3-digit numbers, each digit is cubed and summed. In this tutorial, we'll filter Armstrong numbers from a Pandas Series.
What is an Armstrong Number?
For a 3-digit number, if the sum of cubes of its digits equals the original number, it's an Armstrong number:
153: 1³ + 5³ + 3³ = 1 + 125 + 27 = 153 ✓
371: 3³ + 7³ + 1³ = 27 + ...
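The definition above can be sketched as a small filter. The helper name is_armstrong and the sample values are hypothetical, chosen for this illustration.

```python
import pandas as pd


def is_armstrong(n):
    """Return True if n equals the sum of its digits, each raised to the digit count."""
    digits = str(n)
    return sum(int(d) ** len(digits) for d in digits) == n


# Hypothetical series mixing Armstrong and non-Armstrong numbers
series = pd.Series([153, 323, 371, 420, 500])

# Boolean mask from the helper, then keep only the matching values
armstrong = series[series.apply(is_armstrong)]
print(armstrong)
```

Because the helper uses the number of digits as the exponent, it also handles numbers that are not three digits long.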