Analysing Mobile Data Speeds from TRAI with Pandas in Python
The Telecom Regulatory Authority of India (TRAI) publishes data regarding mobile internet speeds for various telecom operators across India. This data is useful for users and telecom companies to evaluate and compare the performance of different service providers.
In this article, we use pandas in Python to analyze the TRAI mobile data speed reports. The examples assume a CSV file named demo.csv with the following structure.
Sample Data Structure
| date | operator | circle | download_speed | upload_speed |
|---|---|---|---|---|
| 2025-06-01 | Airtel | Hyderabad | 23.4 | 8.5 |
| 2025-06-01 | Idea | Vijayawada | 22.5 | 8.0 |
| 2025-06-01 | Jio | Mumbai | 10.6 | 6.0 |
| 2025-06-01 | BSNL | Kerala | 7.4 | 3.0 |
Loading the Dataset
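You can load a dataset using the read_csv() function, which reads a CSV (Comma-Separated Values) file and returns a DataFrame, for example pd.read_csv('demo.csv'). To keep the example self-contained, the code below builds the same DataFrame in memory instead of reading demo.csv −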
import pandas as pd
# Create sample data for demonstration
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0]
}
df = pd.DataFrame(data)
print(df.head())
date operator circle download_speed upload_speed
0 2025-06-01 Airtel Hyderabad 23.4 8.5
1 2025-06-01 Idea Vijayawada 22.5 8.0
2 2025-06-01 Jio Mumbai 10.6 6.0
3 2025-06-01 BSNL Kerala 7.4 3.0
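As a minimal sketch of the read_csv() route, the snippet below parses the same four rows from CSV text. The csv_text string and the use of io.StringIO are assumptions made so the example runs without a file on disk; with a real file you would pass its path instead. The parse_dates argument asks pandas to convert the date column to datetime values rather than leaving it as text.

```python
import io
import pandas as pd

# Hypothetical CSV contents mirroring the sample table above.
# With a real file, pass the path instead: pd.read_csv("demo.csv", ...)
csv_text = """date,operator,circle,download_speed,upload_speed
2025-06-01,Airtel,Hyderabad,23.4,8.5
2025-06-01,Idea,Vijayawada,22.5,8.0
2025-06-01,Jio,Mumbai,10.6,6.0
2025-06-01,BSNL,Kerala,7.4,3.0
"""

# parse_dates converts the 'date' column from strings to datetime values
df_csv = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

print(df_csv.shape)   # number of rows and columns
print(df_csv.dtypes)  # column types after parsing
```

Checking dtypes right after loading is a quick way to confirm that the speed columns were read as numbers and the date column as dates.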
Using Boolean Indexing
Boolean indexing filters data by condition. A comparison such as df['circle'] == 'Hyderabad' produces a Series of True/False values, and indexing the DataFrame with that Series keeps only the rows marked True. The technique is common in both NumPy and pandas and is also known as Boolean masking.
Filtering Data by Circle
Consider the following example, where we filter data for a specific circle −
import pandas as pd
# Create sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0]
}
df = pd.DataFrame(data)
# Filter data for Hyderabad circle
hyderabad_data = df[df['circle'] == 'Hyderabad']
print(hyderabad_data)
date operator circle download_speed upload_speed
0 2025-06-01 Airtel Hyderabad 23.4 8.5
Filtering by Download Speed
You can also filter operators with download speeds above a certain threshold −
import pandas as pd
# Create sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0]
}
df = pd.DataFrame(data)
# Filter operators with download speed greater than 15 Mbps
fast_operators = df[df['download_speed'] > 15]
print(fast_operators)
date operator circle download_speed upload_speed
0 2025-06-01 Airtel Hyderabad 23.4 8.5
1 2025-06-01 Idea Vijayawada 22.5 8.0
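Conditions can also be combined. The sketch below uses the same sample data; the 8.2 Mbps upload threshold and the operator list are arbitrary values chosen for illustration. Each condition needs its own parentheses, because & (AND) and | (OR) bind more tightly than comparison operators in Python.

```python
import pandas as pd

df = pd.DataFrame({
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4],
    'upload_speed': [8.5, 8.0, 6.0, 3.0],
})

# AND: rows that satisfy both conditions (thresholds are illustrative)
fast_both = df[(df['download_speed'] > 15) & (df['upload_speed'] > 8.2)]
print(fast_both)

# isin(): rows whose operator is one of several values
selected = df[df['operator'].isin(['Jio', 'BSNL'])]
print(selected)
```

Using the Python keywords and/or here instead of &/| raises an error, because pandas cannot reduce a whole Series to a single True or False.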
Using groupby() Method
The pandas groupby() method splits a DataFrame into groups based on one or more columns, applies a function such as mean() to each group, and combines the results into a single object. This is known as the "split-apply-combine" strategy.
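To make the three steps concrete, the following sketch performs them by hand on a small subset of the sample data and compares the result with a direct groupby() call. The variable names manual_result and direct_result are illustrative only, and the explicit loop is for explanation rather than something you would normally write.

```python
import pandas as pd

df = pd.DataFrame({
    'circle': ['Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [10.6, 7.4, 20.1, 8.2],
})

manual = {}
# Split: iterating over a groupby yields (group key, sub-DataFrame) pairs
for circle, group in df.groupby('circle'):
    # Apply: compute one value for this group
    manual[circle] = group['download_speed'].mean()

# Combine: collect the per-group values into one Series
manual_result = pd.Series(manual, name='download_speed')

# The same computation in a single call
direct_result = df.groupby('circle')['download_speed'].mean()

print(manual_result)
print(direct_result)
```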
Average Speed by Circle
In the following example, we find the three circles with the highest average download speed −
import pandas as pd
# Create expanded sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL', 'Airtel', 'Jio'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4, 20.1, 8.2],
    'upload_speed': [8.5, 8.0, 6.0, 3.0, 7.8, 3.5]
}
df = pd.DataFrame(data)
# Group by circle and calculate average download speed
avg_speed_by_circle = df.groupby('circle')['download_speed'].mean().sort_values(ascending=False).head(3)
print(avg_speed_by_circle)
circle
Hyderabad     23.40
Vijayawada    22.50
Mumbai        15.35
Name: download_speed, dtype: float64
Performance by Operator
You can also analyze performance statistics by telecom operator −
import pandas as pd
# Create sample data
data = {
    'date': ['2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01', '2025-06-01'],
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL', 'Airtel', 'Jio'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4, 20.1, 8.2],
    'upload_speed': [8.5, 8.0, 6.0, 3.0, 7.8, 3.5]
}
df = pd.DataFrame(data)
# Group by operator and calculate statistics
operator_stats = df.groupby('operator')[['download_speed', 'upload_speed']].agg(['mean', 'max', 'min'])
print(operator_stats.round(2))
download_speed upload_speed
mean max min mean max min
operator
Airtel 21.75 23.4 20.1 8.15 8.5 7.8
BSNL 7.40 7.4 7.4 3.00 3.0 3.0
Idea 22.50 22.5 22.5 8.00 8.0 8.0
Jio 9.40 10.6 8.2 4.75 6.0 3.5
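Grouping can also answer which operator is fastest in each circle. The sketch below is one possible approach, not the only one: idxmax() returns the row label of the highest download speed within each group, and loc then pulls those rows from the original DataFrame. The names best_idx and best_per_circle are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'operator': ['Airtel', 'Idea', 'Jio', 'BSNL', 'Airtel', 'Jio'],
    'circle': ['Hyderabad', 'Vijayawada', 'Mumbai', 'Kerala', 'Mumbai', 'Kerala'],
    'download_speed': [23.4, 22.5, 10.6, 7.4, 20.1, 8.2],
})

# For each circle, find the index label of the row with the top download speed
best_idx = df.groupby('circle')['download_speed'].idxmax()

# Look up those rows to see which operator they belong to
best_per_circle = df.loc[best_idx, ['circle', 'operator', 'download_speed']]
print(best_per_circle)
```

With this sample data, Airtel leads in Hyderabad and Mumbai, Idea in Vijayawada, and Jio in Kerala.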
Conclusion
With the help of pandas, analyzing mobile data speeds from TRAI becomes straightforward. From filtering operator-specific data to ranking performance by region, we can get clear insights using just a few lines of code.
