Python Data Persistence - CSV Module



CSV stands for comma separated values. This file format is a commonly used data format while exporting/importing data to/from spreadsheets and data tables in databases. The csv module was incorporated in Python’s standard library as a result of PEP 305. It presents classes and methods to perform read/write operations on CSV file as per recommendations of PEP 305.

CSV is a preferred export data format by Microsoft’s Excel spreadsheet software. However, csv module can handle data represented by other dialects also.

The CSV API interface consists of following writer and reader classes −

writer()

This function in csv module returns a writer object that converts data into a delimited string and stores in a file object. The function needs a file object with write permission as a parameter. Every row written in the file issues a newline character. To prevent additional space between lines, newline parameter is set to ''.

The writer class has following methods −

writerow()

This method writes items in an iterable (list, tuple or string), separating them by comma character.

writerows()

This method takes a list of iterables, as parameter and writes each item as a comma separated line of items in the file.

Example

Following example shows use of writer() function. First a file is opened in ‘w’ mode. This file is used to obtain writer object. Each tuple in list of tuples is then written to file using writerow() method.

import csv
   persons=[('Lata',22,45),('Anil',21,56),('John',20,60)]
   csvfile=open('persons.csv','w', newline='')
   obj=csv.writer(csvfile)
   for person in persons:
      obj.writerow(person)
csvfile.close()

Output

This will create ‘persons.csv’ file in current directory. It will show following data.

Lata,22,45
Anil,21,56
John,20,60

Instead of iterating over the list to write each row individually, we can use writerows() method.

csvfile=open('persons.csv','w', newline='')
persons=[('Lata',22,45),('Anil',21,56),('John',20,60)]
   obj=csv.writer(csvfile)
   obj.writerows(persons)
   obj.close()

reader()

This function returns a reader object which returns an iterator of lines in the csv file. Using the regular for loop, all lines in the file are displayed in following example −

Example

csvfile=open('persons.csv','r', newline='')
   obj=csv.reader(csvfile)
   for row in obj:
      print (row)

Output

['Lata', '22', '45']
['Anil', '21', '56']
['John', '20', '60']

The reader object is an iterator. Hence, it supports next() function which can also be used to display all lines in csv file instead of a for loop.

csvfile=open('persons.csv','r', newline='')
   obj=csv.reader(csvfile)
   while True:
   try:
      row=next(obj)
      print (row)
   except StopIteration:
      break

As mentioned earlier, csv module uses Excel as its default dialect. The csv module also defines a dialect class. Dialect is set of standards used to implement CSV protocol. The list of dialects available can be obtained by list_dialects() function.

>>> csv.list_dialects()
['excel', 'excel-tab', 'unix']

In addition to iterables, csv module can export a dictionary object to CSV file and read it to populate Python dictionary object. For this purpose, this module defines following classes −

DictWriter()

This function returns a DictWriter object. It is similar to writer object, but the rows are mapped to dictionary object. The function needs a file object with write permission and a list of keys used in dictionary as fieldnames parameter. This is used to write first line in the file as header.

writeheader()

This method writes list of keys in dictionary as a comma separated line as first line in the file.

In following example, a list of dictionary items is defined. Each item in the list is a dictionary. Using writrows() method, they are written to file in comma separated manner.

persons=[
   {'name':'Lata', 'age':22, 'marks':45}, 
   {'name':'Anil', 'age':21, 'marks':56}, 
   {'name':'John', 'age':20, 'marks':60}
]
csvfile=open('persons.csv','w', newline='')
fields=list(persons[0].keys())
obj=csv.DictWriter(csvfile, fieldnames=fields)
obj.writeheader()
obj.writerows(persons)
csvfile.close()

The persons.csv file shows following contents −

name,age,marks
Lata,22,45
Anil,21,56
John,20,60

DictReader()

This function returns a DictReader object from the underlying CSV file. As, in case of, reader object, this one is also an iterator, using which contents of the file are retrieved.

csvfile=open('persons.csv','r', newline='')
obj=csv.DictReader(csvfile)

The class provides fieldnames attribute, returning the dictionary keys used as header of file.

print (obj.fieldnames)
['name', 'age', 'marks']

Use loop over the DictReader object to fetch individual dictionary objects.

for row in obj:
   print (row)

This results in following output −

OrderedDict([('name', 'Lata'), ('age', '22'), ('marks', '45')])
OrderedDict([('name', 'Anil'), ('age', '21'), ('marks', '56')])
OrderedDict([('name', 'John'), ('age', '20'), ('marks', '60')])

To convert OrderedDict object to normal dictionary, we have to first import OrderedDict from collections module.

from collections import OrderedDict
   r=OrderedDict([('name', 'Lata'), ('age', '22'), ('marks', '45')])
   dict(r)
{'name': 'Lata', 'age': '22', 'marks': '45'}
Advertisements