Analysing Mobile Data Speeds from TRAI with Pandas in Python

The Telecom Regulatory Authority of India (TRAI) publishes data regarding mobile internet speeds for various telecom operators across India. This data is useful for users and telecom companies to evaluate and compare the performance of different service providers.

In this article, we use pandas in Python to analyze TRAI mobile data speed reports. The examples assume a CSV file named demo.csv with the following structure, where download_speed and upload_speed are in Mbps.

Sample Data Structure

date        operator  circle      download_speed  upload_speed
2025-06-01  Airtel    Hyderabad   23.4            8.5
2025-06-01  Idea      Vijayawada  22.5            8.0
2025-06-01  Jio       Mumbai      10.6            6.0
2025-06-01  BSNL      Kerala      7.4             3.0

Loading the Dataset

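In practice, you would load the file with the read_csv() function, which reads a CSV (Comma-Separated Values) file and returns a DataFrame, for example pd.read_csv('demo.csv'). To keep the examples in this article self-contained, the code below builds the same rows in memory with pd.DataFrame() instead −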

import pandas as pd

# Create sample data for demonstration
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0]
}

df = pd.DataFrame(data)
print(df.head())
        date operator     circle  download_speed  upload_speed
0 2025-06-01   Airtel  Hyderabad            23.4           8.5
1 2025-06-01     Idea Vijayawada            22.5           8.0
2 2025-06-01      Jio     Mumbai            10.6           6.0
3 2025-06-01     BSNL     Kerala             7.4           3.0
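
For data stored in a file, the same DataFrame can be obtained through read_csv(). The following is a minimal sketch under stated assumptions: demo.csv is taken to be comma-separated with a header row, and io.StringIO stands in for the file so that the example runs without anything on disk. The parse_dates argument is optional and is used here only to turn the date column into real datetime values −

```python
import io
import pandas as pd

# Stand-in for demo.csv (assumed: comma-separated, with a header row)
csv_text = """date,operator,circle,download_speed,upload_speed
2025-06-01,Airtel,Hyderabad,23.4,8.5
2025-06-01,Idea,Vijayawada,22.5,8.0
2025-06-01,Jio,Mumbai,10.6,6.0
2025-06-01,BSNL,Kerala,7.4,3.0
"""

# With a real file, pass the path instead: pd.read_csv('demo.csv', parse_dates=['date'])
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])

print(df.shape)   # (rows, columns)
print(df.dtypes)  # column types inferred by pandas
```

Checking shape and dtypes right after loading is a quick way to confirm that the speed columns were read as numbers rather than text.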

Using Boolean Indexing

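Boolean indexing, also known as Boolean masking, is a technique used in libraries such as NumPy and Pandas to filter data by condition. A comparison such as df['circle'] == 'Hyderabad' produces a Series of True/False values, and passing that Series to df[...] keeps only the rows where the value is True.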
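
To make the idea of a mask concrete, the small sketch below prints the Boolean Series before using it as an index. The Series named speeds is an illustrative stand-in built from the sample download speeds, not part of the TRAI data format −

```python
import pandas as pd

# Illustrative Series: download speeds (Mbps) labelled by operator
speeds = pd.Series([23.4, 22.5, 10.6, 7.4],
                   index=['Airtel', 'Idea', 'Jio', 'BSNL'])

# The comparison is evaluated element-wise and returns a Boolean Series
mask = speeds > 15
print(mask)

# Indexing with the mask keeps only the entries where it is True
print(speeds[mask])
```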

Filtering Data by Circle

Consider the following example, where we filter data for a specific circle −

import pandas as pd

# Create sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0]
}

df = pd.DataFrame(data)

# Filter data for Hyderabad circle
hyderabad_data = df[df['circle'] == 'Hyderabad']
print(hyderabad_data)
        date operator     circle  download_speed  upload_speed
0 2025-06-01   Airtel  Hyderabad            23.4           8.5

Filtering by Download Speed

You can also filter operators with download speeds above a certain threshold −

import pandas as pd

# Create sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0]
}

df = pd.DataFrame(data)

# Filter operators with download speed greater than 15 Mbps
fast_operators = df[df['download_speed'] > 15]
print(fast_operators)
        date operator     circle  download_speed  upload_speed
0 2025-06-01   Airtel  Hyderabad            23.4           8.5
1 2025-06-01     Idea Vijayawada            22.5           8.0
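
Conditions can also be combined. The sketch below, which reuses the four sample rows, joins two comparisons with & (element-wise AND) and uses isin() to match any value from a list. The thresholds and the operator list are arbitrary values chosen for illustration. Each comparison needs its own parentheses, because & binds more tightly than > in Python −

```python
import pandas as pd

data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0]
}

df = pd.DataFrame(data)

# Both conditions must hold: fast download AND fast upload
fast_both = df[(df['download_speed'] > 15) & (df['upload_speed'] > 8.2)]
print(fast_both)

# isin() builds a mask that is True for any operator in the list
selected = df[df['operator'].isin(['Jio', 'BSNL'])]
print(selected)
```

Use | in place of & when either condition is enough, and ~ to negate a mask.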

Using groupby() Method

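The Pandas groupby() method splits a DataFrame into groups based on the values in one or more columns. It follows the "split-apply-combine" strategy: the rows are split into groups, a function such as mean() is applied to each group, and the per-group results are combined into a single Series or DataFrame.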
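
The three steps can be written out by hand to show what groupby() does internally. This is only an illustrative sketch on a reduced two-column DataFrame; in real code, the one-line form at the end is the usual choice −

```python
import pandas as pd

df = pd.DataFrame({
    'circle': ['Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [10.6, 7.4, 20.1, 8.2],
})

# Split: iterating over a groupby object yields (key, sub-DataFrame) pairs
# Apply: compute the mean download speed of each sub-DataFrame
per_group = {circle: group['download_speed'].mean()
             for circle, group in df.groupby('circle')}

# Combine: collect the per-group results into one Series
manual = pd.Series(per_group, name='download_speed')
print(manual)

# The usual one-line form, which should produce the same numbers
direct = df.groupby('circle')['download_speed'].mean()
print(direct)
```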

Average Speed by Circle

In the following example, we find the top 3 circles with the highest average download speed −

import pandas as pd

# Create expanded sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL', 'Airtel', 'Jio'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4, 20.1, 8.2],
    'upload_speed': [8.5, 8.0, 6.0, 3.0, 7.8, 3.5]
}

df = pd.DataFrame(data)

# Group by circle, average the download speed, sort descending, and keep the top 3
avg_speed_by_circle = df.groupby('circle')['download_speed'].mean().sort_values(ascending=False).head(3)
print(avg_speed_by_circle)
circle
Hyderabad     23.40
Vijayawada    22.50
Mumbai        15.35
Name: download_speed, dtype: float64

Performance by Operator

You can also analyze performance statistics by telecom operator −

import pandas as pd

# Create sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL', 'Airtel', 'Jio'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4, 20.1, 8.2],
    'upload_speed': [8.5, 8.0, 6.0, 3.0, 7.8, 3.5]
}

df = pd.DataFrame(data)

# Group by operator and calculate statistics
operator_stats = df.groupby('operator')[['download_speed', 'upload_speed']].agg(['mean', 'max', 'min'])
print(operator_stats.round(2))
        download_speed              upload_speed            
                  mean   max   min         mean  max  min
operator                                                  
Airtel           21.75  23.4  20.1         8.15  8.5  7.8
BSNL              7.40   7.4   7.4         3.00  3.0  3.0
Idea             22.50  22.5  22.5         8.00  8.0  8.0
Jio               9.40  10.6   8.2         4.75  6.0  3.5
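
Passing a list of functions to agg() produces two-level column labels, as seen above. When flat column names are easier to work with, pandas also supports named aggregation, where each keyword argument defines one output column. The sketch below is one possible way to use it on the same sample data; the output names avg_download, avg_upload, and circles are chosen here for illustration −

```python
import pandas as pd

data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL', 'Airtel', 'Jio'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4, 20.1, 8.2],
    'upload_speed': [8.5, 8.0, 6.0, 3.0, 7.8, 3.5]
}

df = pd.DataFrame(data)

# Named aggregation: output_column=(input_column, function)
summary = (
    df.groupby('operator')
      .agg(avg_download=('download_speed', 'mean'),
           avg_upload=('upload_speed', 'mean'),
           circles=('circle', 'nunique'))   # number of distinct circles per operator
      .sort_values('avg_download', ascending=False)
)
print(summary.round(2))
```

Because the result has ordinary single-level columns, it can be sorted or filtered by name directly, as done here with sort_values('avg_download').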

Conclusion

With the help of pandas, analyzing mobile data speeds from TRAI becomes straightforward. From filtering operator-specific data to ranking performance by region, we can get clear insights using just a few lines of code.

Updated on: 2026-03-25T06:44:01+05:30
