How to Count the Number of Lines in a CSV File in Python?

Counting lines in a CSV file is a common task in data analysis. Python offers several approaches, from Pandas DataFrame methods to reading the file directly with the built-in csv module.

Prerequisites

First, ensure you have Pandas installed:

pip install pandas

Sample CSV File

Let's create a sample CSV file to work with:

import pandas as pd

# Create sample data
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['New York', 'London', 'Tokyo', 'Paris', 'Sydney']
}

df = pd.DataFrame(data)
df.to_csv('sample.csv', index=False)
print("Sample CSV created:")
print(df)

Output

Sample CSV created:
      Name  Age      City
0    Alice   25  New York
1      Bob   30    London
2  Charlie   35     Tokyo
3    Diana   28     Paris
4      Eve   32    Sydney

Using DataFrame Shape Attribute

The shape attribute returns a tuple of (rows, columns). The first element is the number of data rows; the header line is not counted because read_csv uses it for the column names:

import pandas as pd

# Read the CSV file
df = pd.read_csv('sample.csv')

# Get number of lines using shape attribute
num_lines = df.shape[0]
total_columns = df.shape[1]

print(f"Number of lines: {num_lines}")
print(f"Number of columns: {total_columns}")

Output

Number of lines: 5
Number of columns: 3

Using the len() Function

The len() function directly returns the number of rows in the DataFrame:

import pandas as pd

# Read the CSV file
df = pd.read_csv('sample.csv')

# Count lines using len() function
num_lines = len(df)

print(f"Number of lines: {num_lines}")

Output

Number of lines: 5

Counting Lines Without Loading Entire File

For very large CSV files, you can count lines without loading all the data into memory:

import csv

def count_csv_lines(filename):
    # newline='' lets the csv module handle line endings itself
    with open(filename, 'r', newline='') as file:
        csv_reader = csv.reader(file)
        line_count = sum(1 for row in csv_reader)
    return line_count

# Count lines in our sample file
line_count = count_csv_lines('sample.csv')
print(f"Total lines (including header): {line_count}")
print(f"Data lines (excluding header): {line_count - 1}")

Output

Total lines (including header): 6
Data lines (excluding header): 5
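
Note that csv.reader counts records, not physical lines: a quoted field containing a line break spans several lines in the file but is still one record. The sketch below contrasts the two counts. The file name multiline.csv and both helper function names are made up for this illustration.

```python
import csv

def count_physical_lines(filename):
    # Iterate over the file object directly; nothing is parsed as CSV
    with open(filename, 'r', newline='') as file:
        return sum(1 for _ in file)

def count_csv_records(filename):
    # csv.reader joins a quoted field that spans several lines into one record
    with open(filename, 'r', newline='') as file:
        return sum(1 for _ in csv.reader(file))

# Hypothetical file with a line break inside a quoted field
with open('multiline.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Name', 'Note'])
    writer.writerow(['Alice', 'first line\nsecond line'])
    writer.writerow(['Bob', 'single line'])

print(f"Physical lines: {count_physical_lines('multiline.csv')}")
print(f"CSV records: {count_csv_records('multiline.csv')}")
```

Here the file has 4 physical lines but only 3 CSV records. If your data may contain multi-line fields, the record count is usually the one that matches the number of rows Pandas would report.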

Comparison of Methods

Method        Memory Usage   Speed   Best For
df.shape[0]   High           Fast    When you need the data anyway
len(df)       High           Fast    Simple syntax, data analysis
csv.reader    Low            Slow    Very large files, memory constraints
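
A possible middle ground, not covered in the table, is to let Pandas read the file in chunks so that only part of it is in memory at a time. The following is a minimal sketch using the chunksize parameter of read_csv. The function name count_rows_in_chunks and the default chunk size are arbitrary choices for this example.

```python
import pandas as pd

def count_rows_in_chunks(filename, chunksize=100_000):
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows
    total = 0
    for chunk in pd.read_csv(filename, chunksize=chunksize):
        total += len(chunk)
    return total

# Recreate the sample file so this snippet runs on its own
pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['New York', 'London', 'Tokyo', 'Paris', 'Sydney']
}).to_csv('sample.csv', index=False)

# A tiny chunk size is used here only to force several chunks
print(f"Data rows: {count_rows_in_chunks('sample.csv', chunksize=2)}")
```

As with len(df), the result excludes the header row. A larger chunk size generally means fewer iterations at the cost of more memory per chunk.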

Including Header Count

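Remember that CSV files typically have a header row, which read_csv uses for column names by default and does not count as data. Here's how to get both the data-row count and the total line count: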

import pandas as pd

# Read CSV file
df = pd.read_csv('sample.csv')

# Data rows only (excluding header)
data_lines = len(df)

# Total lines including header
total_lines = data_lines + 1

print(f"Data lines: {data_lines}")
print(f"Total lines (with header): {total_lines}")

Output

Data lines: 5
Total lines (with header): 6
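
The same distinction can be handled without Pandas. The sketch below assumes you already know whether the file has a header and passes that in as a flag. The function name count_data_rows and the file names no_header.csv and empty.csv are hypothetical. Skipping the header with next(reader, None) also avoids a result of -1 for an empty file, which a plain subtraction of 1 would give.

```python
import csv

def count_data_rows(filename, has_header=True):
    with open(filename, 'r', newline='') as file:
        reader = csv.reader(file)
        if has_header:
            # Skip the header; the None default avoids StopIteration on an empty file
            next(reader, None)
        return sum(1 for _ in reader)

# Hypothetical file without a header row
with open('no_header.csv', 'w', newline='') as file:
    csv.writer(file).writerows([['Alice', 25], ['Bob', 30]])

# Hypothetical empty file
open('empty.csv', 'w').close()

print(count_data_rows('no_header.csv', has_header=False))
print(count_data_rows('empty.csv'))
```

This prints 2 for the header-less file and 0 for the empty one.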

Conclusion

Use len(df) or df.shape[0] for most cases when working with DataFrames. For very large files where memory is a concern, use the csv.reader approach to count lines without loading all data.

Updated on: 2026-03-27T09:38:53+05:30
