Articles by Vani Nalliappan
Write a program in Python to generate a random array of 30 elements from 1 to 100 and calculate maximum by minimum of each row in a dataframe
In this tutorial, we'll learn how to generate a random array of 30 elements from 1 to 100, reshape it into a DataFrame, and calculate the ratio of maximum to minimum values for each row.
Understanding the Problem
We need to create a 6×5 DataFrame with random integers and calculate the max/min ratio for each row using pandas operations.
Solution Approach
Follow these steps to solve the problem:
- Generate 30 random integers from 1 to 100 using np.random.randint()
- Reshape the array to (6, 5) to create a 2D structure
- Convert to DataFrame and apply ...
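The steps listed in the excerpt can be sketched as follows. This is a minimal illustration rather than the article's own code: the fixed seed and the variable names are assumptions, and the upper bound is passed as 101 on the assumption that 100 should be included (the `high` argument of `np.random.randint()` is exclusive).

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumed seed, only to make the run repeatable

# 30 random integers in [1, 100]; `high` is exclusive, hence 101
values = np.random.randint(1, 101, size=30)

# Reshape into 6 rows of 5 values and wrap in a DataFrame
df = pd.DataFrame(values.reshape(6, 5))

# Row-wise maximum divided by row-wise minimum
ratio = df.max(axis=1) / df.min(axis=1)

print(df)
print(ratio)
```

Because every value is at least 1, the division is always defined and each ratio is at least 1.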
Write a program in Python to find which column has the minimum number of missing values in a given dataframe
When working with data analysis, it's common to encounter missing values in DataFrames. Sometimes you need to identify which column has the minimum number of missing values to help guide your data cleaning strategy.
Problem Statement
Given a DataFrame with missing values, we need to find which column has the fewest NaN values. This is useful for determining which columns are most complete in your dataset.
Sample DataFrame
Let's start by creating a sample DataFrame with missing values to demonstrate the solution:
    import pandas as pd
    import numpy as np
    df = ...
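One way to approach the problem statement is to count NaN values per column and take the label of the smallest count. The sketch below uses a hypothetical DataFrame (the column names and values are assumptions, since the article's sample data is cut off in the excerpt).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: Id has 1 NaN, Salary has 2, Age has 3
df = pd.DataFrame({
    "Id": [1, 2, 3, np.nan, 5],
    "Salary": [20000, np.nan, 50000, np.nan, 80000],
    "Age": [22, np.nan, np.nan, 25, np.nan],
})

# Number of missing values in each column
missing_counts = df.isnull().sum()

# Label of the column with the fewest missing values
column_with_fewest = missing_counts.idxmin()

print(missing_counts)
print(column_with_fewest)
```

If several columns tie for the minimum, `idxmin()` returns the first of them.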
Write a Python function to calculate the total number of business days from a range of start and end date
Business days are weekdays (Monday through Friday), excluding weekends and holidays. Python's Pandas library provides several methods to calculate business days between two dates.
Understanding Business Days
First, let's see what business days look like in a date range:
    import pandas as pd
    dates = pd.bdate_range('2020-01-01', '2020-01-31')
    print("Business days in January 2020:")
    print(dates)
    print(f"Total business days: {len(dates)}")
Output:
    Business days in January 2020:
    DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-06', '2020-01-07', '2020-01-08', '2020-01-09', '2020-01-10', ...
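The `pd.bdate_range()` call shown in the excerpt can be wrapped in a small function. The function name `count_business_days` is hypothetical. Note that with its default settings `pd.bdate_range()` skips weekends only, so public holidays such as 1 January are still counted.

```python
import pandas as pd


def count_business_days(start, end):
    """Count Monday-Friday dates from start to end, both inclusive.

    Holidays are not excluded; only Saturdays and Sundays are skipped.
    """
    return len(pd.bdate_range(start, end))


total = count_business_days('2020-01-01', '2020-01-31')
print(total)  # 23: January 2020 has 31 days, 8 of which fall on weekends
```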
Write a program in Python to flatten the records in a given dataframe by C and F order
When working with DataFrames, you may need to flatten the data into a one-dimensional array. NumPy's ravel() function, applied to a DataFrame's underlying values, can flatten data in two orders: C order (row-major) and F order (column-major).
Understanding C and F Order
The order parameter determines how multi-dimensional data is flattened:
- C order (row-major): Flattens row by row, reading elements from left to right
- F order (column-major): Flattens column by column, reading elements from top to bottom
Creating the DataFrame
Let's start by creating a sample DataFrame with ID and Age columns: ...
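As a rough sketch of the two orders, the example below flattens a small DataFrame through its NumPy array. The ID and Age values are assumptions (the article's data is truncated), and calling `ravel()` on `df.to_numpy()` is one possible approach, not necessarily the one the article uses.

```python
import pandas as pd

# Hypothetical data with the ID and Age columns mentioned in the excerpt
df = pd.DataFrame({"ID": [1, 2, 3], "Age": [13, 12, 14]})

array = df.to_numpy()

# C order: row by row -> ID, Age of row 0, then ID, Age of row 1, ...
c_order = array.ravel(order="C")

# F order: column by column -> all ID values, then all Age values
f_order = array.ravel(order="F")

print(c_order)  # [ 1 13  2 12  3 14]
print(f_order)  # [ 1  2  3 13 12 14]
```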
Write a program in Python to print dataframe rows as OrderedDict with a list of tuple values
In Pandas, you can convert DataFrame rows to OrderedDict objects with a list of tuple values. This is useful when you need to maintain the order of columns and access row data in a structured dictionary format.
Understanding the Problem
When working with DataFrames, sometimes you need each row as an OrderedDict where each column-value pair is represented as a tuple. The expected output format is:
    OrderedDict([('Index', 0), ('Name', 'Raj'), ('Age', 13), ('City', 'Chennai'), ('Mark', 80)])
    OrderedDict([('Index', 1), ('Name', 'Ravi'), ('Age', 12), ('City', 'Delhi'), ('Mark', 90)])
    OrderedDict([('Index', 2), ('Name', 'Ram'), ('Age', 13), ('City', 'Chennai'), ('Mark', 95)]) ...
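One way to obtain rows in that shape is to iterate with `itertuples()` and pair each field name with its value. The sketch below rebuilds the three rows visible in the expected output; building the OrderedDict with `zip(row._fields, row)` is an assumption about the approach, chosen because it does not depend on what `_asdict()` returns in a given Python version.

```python
from collections import OrderedDict

import pandas as pd

# Data reconstructed from the expected output shown in the excerpt
df = pd.DataFrame({
    "Name": ["Raj", "Ravi", "Ram"],
    "Age": [13, 12, 13],
    "City": ["Chennai", "Delhi", "Chennai"],
    "Mark": [80, 90, 95],
})

# itertuples() yields namedtuples whose first field is 'Index'
rows = [OrderedDict(zip(row._fields, row)) for row in df.itertuples()]

for row in rows:
    print(row)
```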
Write a program in Python to calculate the adjusted and non-adjusted EWM in a given dataframe
The Exponentially Weighted Moving Average (EWM) is a statistical technique that gives more weight to recent observations. Pandas provides two modes: adjusted (default) and non-adjusted, which handle the calculation differently during the initial periods.
Understanding EWM Parameters
The key difference between adjusted and non-adjusted EWM lies in how they handle the bias correction:
- Adjusted EWM (default): Applies bias correction to account for the initialization period
- Non-adjusted EWM: Uses raw exponential weighting without bias correction
- com parameter: Center of mass, controls the decay rate (higher values = slower decay)
Creating Sample Data
First, ...
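To make the difference between the two modes concrete, here is a small sketch on hypothetical data (the column name, values and `com=0.5` are assumptions). With `com=0.5` the smoothing factor is alpha = 1 / (1 + com) = 2/3. The non-adjusted mode uses the recursion y_t = (1 - alpha) * y_(t-1) + alpha * x_t, while the adjusted mode divides a weighted sum by the sum of the weights (1 - alpha)^i.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"Value": [10.0, 20.0, 30.0, 40.0, 50.0]})

# Adjusted mode (the default): normalised weighted average
adjusted = df.ewm(com=0.5).mean()

# Non-adjusted mode: plain recursive exponential smoothing
non_adjusted = df.ewm(com=0.5, adjust=False).mean()

print(adjusted)
print(non_adjusted)
```

For the second row, the adjusted value is (20 + 10/3) / (1 + 1/3) = 17.5, whereas the non-adjusted value is (1/3) * 10 + (2/3) * 20, roughly 16.67. The two series start from the same first value and converge as more observations arrive.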
Write a Python code to fill all the missing values in a given dataframe
When working with datasets, missing values (NaN) are common. Pandas provides the interpolate() method to fill missing values using various interpolation techniques like linear, polynomial, or time-based methods.
Syntax
    df.interpolate(method='linear', limit_direction='forward', limit=None)
Parameters
- method − Interpolation technique ('linear', 'polynomial', 'spline', etc.)
- limit_direction − Direction to fill ('forward', 'backward', 'both')
- limit − Maximum number of consecutive NaNs to fill
Example
Let's create a DataFrame with missing values and apply linear interpolation:
    import pandas as pd
    df = pd.DataFrame({"Id": [1, 2, 3, None, 5], ...
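The syntax above might be used as in the following sketch. Only the `Id` column comes from the excerpt; the `Age` and `Mark` columns and their values are assumptions added to show gaps in different positions.

```python
import pandas as pd

df = pd.DataFrame({
    "Id": [1, 2, 3, None, 5],
    "Age": [12, None, 14, 13, 15],     # hypothetical column
    "Mark": [80, 90, None, 95, 85],    # hypothetical column
})

# Linear interpolation: each NaN becomes the value on the straight line
# between its nearest valid neighbours in the same column
filled = df.interpolate(method="linear", limit_direction="forward")

print(filled)
```

Here the missing `Id` becomes 4.0 (midway between 3 and 5) and the missing `Mark` becomes 92.5 (midway between 90 and 95).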
Write a Python code to rename the given axis in a dataframe
In Pandas, you can set the name of an axis (the row index or the columns) of a DataFrame using the rename_axis() method. This is useful when you want to give a meaningful name to your DataFrame's index or columns axis.
Syntax
    DataFrame.rename_axis(mapper, axis=None, copy=None, inplace=False)
Parameters
- mapper − The new name for the axis
- axis − 0 or 'index' for row axis, 1 or 'columns' for column axis
- inplace − If True, modify the DataFrame in place
Renaming the Column Axis
Let's create a DataFrame and rename its column axis: ...
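A short sketch of both axis choices follows. The DataFrame contents and the axis names `"Student"` and `"row_id"` are hypothetical. Note that `rename_axis()` changes the name attached to the axis, not the row or column labels themselves.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"Id": [1, 2, 3], "Age": [12, 13, 14]})

# Name the columns axis; the column labels Id and Age are unchanged
renamed_columns = df.rename_axis("Student", axis="columns")

# Name the row index instead
renamed_index = df.rename_axis("row_id", axis="index")

print(renamed_columns)
print(renamed_index)
```

Both calls return a new DataFrame because `inplace` defaults to False, so `df` itself keeps its unnamed axes.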
Write a Python code to find a cross tabulation of two dataframes
Cross-tabulation (crosstab) creates a frequency table showing relationships between categorical variables from different DataFrames. Pandas provides the pd.crosstab() function to compute cross-tabulations between two or more factors.
Creating Sample DataFrames
Let's start by creating two DataFrames with related data:
    import pandas as pd
    # First DataFrame with Id and Age
    df = pd.DataFrame({'Id': [1, 2, 3, 4, 5], 'Age': [12, 13, 12, 13, 14]})
    print("DataFrame 1:")
    print(df)
    # Second DataFrame with Mark
    df1 = pd.DataFrame({'Mark': [80, 90, 80, 90, 85]})
    print("DataFrame 2:")
    print(df1)
Output:
    DataFrame 1:
    Id ...
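Using the two DataFrames from the excerpt, a cross-tabulation could be computed as sketched below. Crossing `Age` against `Mark` is an assumption, since the excerpt is cut off before showing which columns the article pairs.

```python
import pandas as pd

# The two DataFrames from the excerpt
df = pd.DataFrame({'Id': [1, 2, 3, 4, 5], 'Age': [12, 13, 12, 13, 14]})
df1 = pd.DataFrame({'Mark': [80, 90, 80, 90, 85]})

# Rows are Age values, columns are Mark values, cells are co-occurrence counts.
# The two Series are matched on their shared default index 0..4.
table = pd.crosstab(df['Age'], df1['Mark'])

print(table)
```

Each cell counts how many rows share a given (Age, Mark) pair, so the whole table sums to the number of rows, 5.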
Write a program in Python to print the length of elements in all columns in a dataframe using applymap
The applymap() function in Pandas allows you to apply a function element-wise to every cell in a DataFrame. This is useful when you want to calculate the length of string elements across all columns.
Understanding applymap()
The applymap() method applies a function to each element of the DataFrame. Unlike apply(), which works on rows or columns, applymap() works on individual elements.
Syntax
    DataFrame.applymap(func)
Where func is the function to apply to each element.
Example
Let's create a DataFrame and calculate the length of elements in all columns:
    import ...
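A minimal sketch of the element-wise length calculation is shown below, on hypothetical data. Since pandas 2.1, `applymap()` is deprecated in favor of `DataFrame.map()`, which does the same thing, so the sketch picks whichever method the installed version offers. The `str(x)` conversion is an added safeguard so that non-string cells do not raise an error.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Fruits": ["Apple", "Orange", "Mango", "Kiwi"],
    "City": ["Shimla", "Sydney", "Lucknow", "Wellington"],
})

# DataFrame.map exists from pandas 2.1 onward; older versions only have applymap
elementwise = df.map if hasattr(df, "map") else df.applymap

# Apply len() to every cell
lengths = elementwise(lambda x: len(str(x)))

print(lengths)
```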