Articles by Vani Nalliappan
Write a program in Python to print numeric index array with sorted distinct values in a given series
When working with pandas Series, you often need to convert categorical data into numeric indices. The pd.factorize() function creates numeric indices for distinct values, with an option to sort the unique values alphabetically.

Understanding pd.factorize()
The pd.factorize() function returns two arrays:
- codes: numeric indices for each element
- uniques: array of distinct values

Without Sorting
By default, pd.factorize() assigns indices based on the order of first appearance:

import pandas as pd
fruits = ['mango', 'orange', 'apple', 'orange', 'mango', 'kiwi', 'pomegranate']
index, unique_values = pd.factorize(fruits)
print("Without sorting of ...
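As a rough sketch of the sorted variant that the excerpt mentions but cuts off before showing, the example below reuses the fruit list from the excerpt. Wrapping the list in a pd.Series and the variable names are choices made here, not taken from the original article.

```python
import pandas as pd

# Fruit list taken from the excerpt; wrapping it in a Series is our choice
fruits = pd.Series(['mango', 'orange', 'apple', 'orange',
                    'mango', 'kiwi', 'pomegranate'])

# Default: codes follow the order of first appearance
codes, uniques = pd.factorize(fruits)

# sort=True: uniques are sorted, and codes index into the sorted uniques
sorted_codes, sorted_uniques = pd.factorize(fruits, sort=True)

print(list(codes))           # [0, 1, 2, 1, 0, 3, 4]
print(list(sorted_codes))    # [2, 3, 0, 3, 2, 1, 4]
print(list(sorted_uniques))  # ['apple', 'kiwi', 'mango', 'orange', 'pomegranate']
```

In both cases, indexing the uniques with the codes reproduces the original values.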
Write a program in Python to perform average of rolling window size 3 calculation in a given dataframe
A rolling window calculation computes statistics over a sliding window of fixed size. In pandas, you can calculate the average of a rolling window using the rolling() method with mean().

What is Rolling Window?
A rolling window of size 3 means we calculate the average of the current row and the previous 2 rows. For the first 2 rows, where there is not enough previous data, the result is NaN.

Creating Sample DataFrame
Let's create a sample DataFrame to demonstrate rolling window calculations:

import pandas as pd
df = pd.DataFrame({ ...
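The excerpt's DataFrame is cut off, so the following is only a minimal sketch with made-up data. The column name `value` and its numbers are hypothetical. It shows rolling(window=3).mean() producing NaN for the first two rows.

```python
import pandas as pd

# Hypothetical data: the original article's DataFrame is not shown in full
df = pd.DataFrame({'value': [10, 20, 30, 40, 50]})

# Mean of the current row and the two rows before it
df['rolling_mean'] = df['value'].rolling(window=3).mean()

print(df)
# The first two rolling_mean entries are NaN; the rest are 20.0, 30.0, 40.0
```

If partial windows are acceptable, passing min_periods=1 to rolling() would fill the first two rows with averages over the rows available so far.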
Write a program in Python to slice substrings from each element in a given series
In Pandas, you can slice substrings from each element in a Series using string methods. This is useful for extracting specific characters or patterns from text data.

Creating a Sample Series
Let's start by creating a Series with fruit names:

import pandas as pd
data = pd.Series(['Apple', 'Orange', 'Mango', 'Kiwis'])
print("Original Series:")
print(data)

Original Series:
0     Apple
1    Orange
2     Mango
3     Kiwis
dtype: object

Method 1: Using str.slice()
The str.slice() method allows ...
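The description of str.slice() is truncated above, so here is a small sketch of how it is typically used on the same Series. The particular start, stop, and step values are picked for illustration only.

```python
import pandas as pd

data = pd.Series(['Apple', 'Orange', 'Mango', 'Kiwis'])

# First three characters of every element
first_three = data.str.slice(0, 3)

# Equivalent shorthand using Python slice syntax on the .str accessor
first_three_alt = data.str[:3]

# Every second character, starting from the first
every_other = data.str.slice(step=2)

print(first_three.tolist())  # ['App', 'Ora', 'Man', 'Kiw']
print(every_other.tolist())  # ['Ape', 'Oag', 'Mno', 'Kws']
```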
Write a Python function to split the string based on delimiter and convert to series
When working with strings in Python, you often need to split them based on a delimiter and convert the result into a Pandas Series for further data analysis. This is commonly done when processing CSV-like data or text files.

Understanding the Problem
Let's say we have a tab-separated string like 'apple\torange\tmango\tkiwi' and want to split it into individual elements, then convert them to a Pandas Series:

0     apple
1    orange
2     mango
3      kiwi
dtype: object

Method 1: Using a Function ...
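The article's own function is not visible in the excerpt. One possible version is sketched below, where the function name `split_to_series` and its default delimiter are assumptions made for this example.

```python
import pandas as pd

def split_to_series(text, delimiter='\t'):
    """Split text on delimiter and return the pieces as a Series.

    Hypothetical helper; the original article's function is not shown.
    """
    return pd.Series(text.split(delimiter))

result = split_to_series('apple\torange\tmango\tkiwi')
print(result)
```

The same helper works for other delimiters, for example split_to_series('a,b,c', ',').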
Write a program in Python to print the first and last three days from a given time series data
When working with time series data in Pandas, you often need to extract specific time periods. The first() and last() methods allow you to retrieve data from the beginning and end of a time series based on a time offset.

Creating Time Series Data
First, let's create a time series with city names indexed by dates:

import pandas as pd
# Create a series with city names
data = pd.Series(['Chennai', 'Delhi', 'Mumbai', 'Pune', 'Kolkata'])
# Create a date range with 2-day frequency
time_series = pd.date_range('2020-01-01', periods=5, freq='2D')
# Set the date range ...
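Series.first() and Series.last() have been deprecated since pandas 2.1, so the sketch below swaps in plain boolean masks on the index instead of the methods the article names. It rebuilds the series from the excerpt and assumes that "three days" means a 3-day window measured from the first and from the last timestamp.

```python
import pandas as pd

# Series and date range as described in the excerpt
data = pd.Series(['Chennai', 'Delhi', 'Mumbai', 'Pune', 'Kolkata'])
data.index = pd.date_range('2020-01-01', periods=5, freq='2D')

window = pd.Timedelta(days=3)

# Rows within 3 days of the first timestamp
first_three_days = data[data.index < data.index[0] + window]

# Rows within 3 days of the last timestamp
last_three_days = data[data.index > data.index[-1] - window]

print(first_three_days.tolist())  # ['Chennai', 'Delhi']
print(last_three_days.tolist())   # ['Pune', 'Kolkata']
```

Because the index advances in 2-day steps, each 3-day window covers two rows.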
Write a program in Python to generate a random array of 30 elements from 1 to 100 and calculate maximum by minimum of each row in a dataframe
In this tutorial, we'll learn how to generate a random array of 30 elements from 1 to 100, reshape it into a DataFrame, and calculate the ratio of maximum to minimum values for each row.

Understanding the Problem
We need to create a 6×5 DataFrame with random integers and calculate the max/min ratio for each row using pandas operations.

Solution Approach
Follow these steps to solve the problem:
- Generate 30 random integers from 1 to 100 using np.random.randint()
- Reshape the array to (6, 5) to create a 2D structure
- Convert to DataFrame and apply ...
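The steps above can be sketched roughly as follows. The seed is added here only to make the run repeatable, and treating 100 as inclusive (hence the upper bound of 101) is an assumption, since the excerpt does not say which is intended.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # our addition, for repeatable output only

# 30 random integers in [1, 100]; randint's upper bound is exclusive
values = np.random.randint(1, 101, size=30)

# Reshape to 6 rows x 5 columns and wrap in a DataFrame
df = pd.DataFrame(values.reshape(6, 5))

# Row-wise maximum divided by row-wise minimum
ratio = df.max(axis=1) / df.min(axis=1)

print(df)
print(ratio)
```

Since every value is at least 1, the division never hits zero and each ratio is at least 1.0.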
Write a program in Python to find which column has the minimum number of missing values in a given dataframe
In data analysis, it's common to encounter missing values in DataFrames. Sometimes you need to identify which column has the minimum number of missing values to help guide your data cleaning strategy.

Problem Statement
Given a DataFrame with missing values, we need to find which column has the fewest NaN values. This is useful for determining which columns are most complete in your dataset.

Sample DataFrame
Let's start by creating a sample DataFrame with missing values to demonstrate the solution:

import pandas as pd
import numpy as np
df = ...
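Since the sample DataFrame is cut off in the excerpt, the sketch below uses invented data (the column names and values are hypothetical) to show one common way of doing this: count NaN values per column with isna().sum() and take idxmin().

```python
import numpy as np
import pandas as pd

# Hypothetical data: the article's DataFrame is not shown in the excerpt
df = pd.DataFrame({
    'Id':   [1, 2, 3, np.nan, 5],
    'Name': ['A', np.nan, np.nan, 'D', np.nan],
    'Mark': [80, np.nan, np.nan, 75, 90],
})

# Number of missing values in each column
missing_counts = df.isna().sum()

# Label of the column with the fewest missing values
most_complete = missing_counts.idxmin()

print(missing_counts.to_dict())  # {'Id': 1, 'Name': 3, 'Mark': 2}
print(most_complete)             # Id
```

If several columns tie for the minimum, idxmin() returns the first one.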
Write a Python function to calculate the total number of business days from a range of start and end date
Business days are weekdays (Monday through Friday), excluding weekends and holidays. Python's Pandas library provides several methods to calculate business days between two dates.

Understanding Business Days
First, let's see what business days look like in a date range:

import pandas as pd
dates = pd.bdate_range('2020-01-01', '2020-01-31')
print("Business days in January 2020:")
print(dates)
print(f"Total business days: {len(dates)}")

Business days in January 2020:
DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-06', '2020-01-07', '2020-01-08', '2020-01-09', '2020-01-10', ...
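The title asks for a function, which the excerpt does not reach. A minimal version built on the same pd.bdate_range() call might look like this, with the name `count_business_days` chosen here for illustration. With its default settings, bdate_range() only skips Saturdays and Sundays, so public holidays are still counted.

```python
import pandas as pd

def count_business_days(start, end):
    """Count Monday-Friday dates from start to end, both inclusive.

    Hypothetical helper; public holidays are not excluded.
    """
    return len(pd.bdate_range(start, end))

total = count_business_days('2020-01-01', '2020-01-31')
print(total)  # 23
```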
Write a program in Python to perform flatten the records in a given dataframe by C and F order
When working with DataFrames, you may need to flatten the data into a one-dimensional array. NumPy's ravel() function, applied to the DataFrame's underlying array, can flatten data in different orders: C order (row-major) and F order (column-major).

Understanding C and F Order
The order parameter determines how multi-dimensional data is flattened:
- C order (row-major): flattens row by row, reading elements from left to right
- F order (column-major): flattens column by column, reading elements from top to bottom

Creating the DataFrame
Let's start by creating a sample DataFrame with ID and Age columns: ...
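To make the two orders concrete, here is a small sketch that flattens a DataFrame's values with NumPy's ravel(). The Id and Age values are invented, since the article's DataFrame is not shown in the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical values for the Id and Age columns
df = pd.DataFrame({'Id': [1, 2, 3], 'Age': [13, 12, 14]})

# 2D array: rows are records, columns are Id and Age
array = df.to_numpy()

c_order = np.ravel(array, order='C')  # row by row
f_order = np.ravel(array, order='F')  # column by column

print(c_order.tolist())  # [1, 13, 2, 12, 3, 14]
print(f_order.tolist())  # [1, 2, 3, 13, 12, 14]
```

C order interleaves each record's Id and Age, while F order lists all Id values followed by all Age values.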
Write a program in Python to print dataframe rows as orderDict with a list of tuple values
In Pandas, you can convert DataFrame rows to OrderedDict objects with a list of tuple values. This is useful when you need to maintain the order of columns and access row data in a structured dictionary format.

Understanding the Problem
When working with DataFrames, sometimes you need each row as an OrderedDict where each column-value pair is represented as a tuple. The expected output format is:

OrderedDict([('Index', 0), ('Name', 'Raj'), ('Age', 13), ('City', 'Chennai'), ('Mark', 80)])
OrderedDict([('Index', 1), ('Name', 'Ravi'), ('Age', 12), ('City', 'Delhi'), ('Mark', 90)])
OrderedDict([('Index', 2), ('Name', 'Ram'), ('Age', 13), ('City', 'Chennai'), ('Mark', 95)]) ...
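One way to get rows in this shape is sketched below. It is not necessarily the article's approach, and it uses only the three rows visible in the expected output above. It pairs each named tuple from itertuples() with its field names and builds the OrderedDict explicitly, because on Python 3.8 and later the named tuple's _asdict() returns a plain dict.

```python
from collections import OrderedDict

import pandas as pd

# Only the three rows visible in the expected output are used here
df = pd.DataFrame({
    'Name': ['Raj', 'Ravi', 'Ram'],
    'Age':  [13, 12, 13],
    'City': ['Chennai', 'Delhi', 'Chennai'],
    'Mark': [80, 90, 95],
})

# itertuples() yields named tuples whose first field is 'Index'
rows = [OrderedDict(zip(row._fields, row)) for row in df.itertuples()]

for row in rows:
    print(row)
```

How the OrderedDict is printed differs between Python versions (3.12 and later show a dict-style repr), but the key order and values are the same.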