Python CSV module

The Python csv module is a built-in library for reading and writing Comma-Separated Values (CSV) files. CSV files can be parsed by hand with primitive string operations, but the csv module is recommended: building data structures such as dictionaries line by line in hand-written code is inefficient and CPU-intensive on large datasets.^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md]
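
Hand-written parsing also tends to be fragile, a point the source does not make but that is easy to demonstrate. The sketch below parses one made-up line (the sample data is an assumption for illustration) with both str.split and csv.reader; only the latter keeps a quoted field containing a comma intact.

```python
import csv
import io

# Hypothetical sample line: the second field contains a comma inside quotes.
line = '42,"Doe, Jane",jane@example.com\n'

# Naive approach: splitting on commas breaks the quoted field apart.
naive_fields = line.strip().split(',')

# csv.reader understands quoting and keeps the field in one piece.
csv_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)
print(csv_fields)
```

The naive split yields four fragments, while csv.reader returns the three fields the line actually holds.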

Reading CSV Files

To read a CSV file, first open it with the open() function.^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md] It is best practice to use a with open(...) statement so that the file is closed automatically, even if an error occurs.^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md] Passing newline='' to open() lets the csv module handle line endings itself.

The module provides csv.DictReader, which maps each row to a dictionary.^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md] The keys come from the optional fieldnames parameter; when it is omitted, the values in the first row of the file (the header row) are used as keys. This allows accessing field values by column name (e.g., row['customerID']) rather than by index.^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md]

import csv

# newline='' lets the csv module handle line endings itself
with open('customers.log', newline='') as customerFile:
    reader = csv.DictReader(customerFile)  # the first row supplies the keys
    for row in reader:
        print("customer id: " + row['customerID'] + " fullName: " + row['firstName'] + " " + row['lastName'])
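
When a file has no header row, the keys can be supplied through DictReader's fieldnames parameter instead. The following is a minimal sketch under that assumption; the header-less sample data is made up and kept in memory with io.StringIO so that the example runs without a file on disk.

```python
import csv
import io

# Hypothetical header-less data, held in memory so the sketch is self-contained.
data = io.StringIO("1,Jane,Doe\n2,John,Smith\n")

# With no header row in the data, the keys are supplied via fieldnames.
reader = csv.DictReader(data, fieldnames=['customerID', 'firstName', 'lastName'])
rows = list(reader)

for row in rows:
    print(f"customer id: {row['customerID']} fullName: {row['firstName']} {row['lastName']}")
```

Because fieldnames is given, the first line is treated as data rather than as a header, so both records are returned.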

Writing CSV Files

To write data to a CSV file, use csv.writer.^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md] A common pattern is to define a list of field names (headers), write them to the file with writer.writerow(), and then iterate over the data to write one record per row.^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md]

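import csv

fields = ['customerID', 'firstName', 'lastName']

# Load the existing records first: opening the file with 'w' truncates it.
# getCustomers() is the helper defined under Data Transformation.
customers = getCustomers()

with open('customers.log', 'w', newline='') as customerFile:
    writer = csv.writer(customerFile)
    writer.writerow(fields)  # header row

    for customer in customers.values():
        # each customer is a DictReader row, so fields are read by key
        writer.writerow([customer['customerID'], customer['firstName'], customer['lastName']])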
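
Since the records are dictionaries keyed by column name, csv.DictWriter is a possible alternative to csv.writer. The source does not use it, so treat the following as a sketch rather than the author's method: the customer records are made-up stand-ins for the result of getCustomers(), and an in-memory buffer replaces the real file.

```python
import csv
import io

fields = ['customerID', 'firstName', 'lastName']

# Hypothetical records standing in for the result of getCustomers().
customers = {
    '1': {'customerID': '1', 'firstName': 'Jane', 'lastName': 'Doe'},
    '2': {'customerID': '2', 'firstName': 'John', 'lastName': 'Smith'},
}

# An in-memory buffer replaces the real file for this sketch.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fields)
writer.writeheader()                   # writes the field names as the first row
writer.writerows(customers.values())   # one row per dictionary, ordered by fieldnames

output = buffer.getvalue()
print(output)
```

DictWriter orders each row's values according to fieldnames, which avoids building the value list by hand for every record.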

Data Transformation

The CSV module facilitates easy transformation between file data and Python in-memory data structures^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md]. For example, data read via DictReader can be converted into a list and then projected into a dictionary for fast lookups^[400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md].

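import csv
import os.path

def getCustomers():
    # Return all customers keyed by customerID, or an empty dict if the file is missing
    if os.path.isfile("customers.log"):
        with open('customers.log', newline='') as customerFile:
            reader = csv.DictReader(customerFile)
            rows = list(reader)  # read every row while the file is still open
            customers = {c["customerID"]: c for c in rows}
            return customers
    else:
        return {}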
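
To show the lookup side of this pattern, the sketch below writes a small sample file, loads it into a dictionary, and fetches one record by key. load_customers is a hypothetical variant of getCustomers() that takes the path as a parameter, and the sample records and temporary directory are assumptions made so that the example runs on its own.

```python
import csv
import os
import tempfile

def load_customers(path):
    """Hypothetical variant of getCustomers() that takes the file path as a parameter."""
    if not os.path.isfile(path):
        return {}
    with open(path, newline='') as customer_file:
        return {row['customerID']: row for row in csv.DictReader(customer_file)}

# Build a small sample file in a temporary directory (made-up data).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'customers.log')
    with open(path, 'w', newline='') as customer_file:
        writer = csv.writer(customer_file)
        writer.writerow(['customerID', 'firstName', 'lastName'])
        writer.writerow(['1', 'Jane', 'Doe'])
        writer.writerow(['2', 'John', 'Smith'])

    customers = load_customers(path)
    missing = load_customers(os.path.join(tmp_dir, 'no-such-file.log'))

# Look up one record by key instead of scanning the whole list.
customer = customers.get('2')
print(customer['firstName'], customer['lastName'])
```

Keying the dictionary by customerID means each lookup is a single hash access, at the cost of holding all rows in memory.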

See also

  • [[Python file handling]]
  • [[Python dictionaries]]

Sources

  • 400-devops__09-Scripting-Language__python__introduction__part-2.files__README.md