Have you ever needed to exchange data between programs without sharing variables? Have you ever wanted a simple text file for passing data from one program to another? The idea is to store data in an external file and read it back whenever it is needed, and one of the most common file formats for doing this is CSV.
What is a CSV file?
A CSV file is a type of file that is used to store data in a structured tabular (row/column) form. It is a plain text file and, as its name (comma-separated values) indicates, it stores values separated by commas.
In this post, we will have a detailed discussion on reading, writing, and parsing a CSV file in Python.
Origin of CSV file
The CSV format grew out of the need to move large amounts of data from one program to another, for example exporting data from a spreadsheet and importing it into a database.
Different programs store data in their own formats, so programmers who had to move data from one program to another needed a kind of universal file type: one that any program can read and parse into its own format.
Understand the Structure of a CSV file
The structure of the CSV file will look something like this:
Column 1, Column 2, Column 3
Value 1, Value 2, Value 3
..., ..., ...
Just as a database table or a spreadsheet organizes data into rows and columns, a CSV file stores the same kind of tabular data in a simple text file, with the values separated by commas.
Each column is separated by a comma, and each row is on a new line.
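To make the structure concrete, here is a minimal sketch that parses a small CSV string held in memory. The sample data and the variable names (sample_text, header, records) are made up for illustration, and io.StringIO is used only so that no file on disk is needed.

```python
import csv
import io

# Hypothetical sample data: one header line followed by two data lines
sample_text = "Name,Age,City\nAlice,30,London\nBob,25,Paris\n"

# io.StringIO wraps the string in a file-like object that csv.reader can consume
rows = list(csv.reader(io.StringIO(sample_text)))

header = rows[0]    # the first line holds the column names
records = rows[1:]  # every remaining line is one row of values

print(header)
print(records)
```

Each line of text becomes one list, and each comma-separated value becomes one string inside that list.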
Alright, after understanding the core concept, origin, and structure of the CSV file, let’s learn to read, write, and parse CSV in Python.
Reading a CSV file in Python
For parsing CSV files, Python luckily provides a built-in csv module. The csv module is easy to use and handles both reading from and writing to CSV files. Let’s start with reading a CSV file.
For reading a CSV file, the reader object will be used. Let’s start writing the code for reading the CSV file and understand it in a step-by-step procedure:
Reading a CSV file with the Default (Comma) Delimiter
First of all, we have to import the CSV module:
import csv
After importing the csv module, we can use it in our Python program.
Next, we have to open the CSV file using the open() function in read mode:
with open('students.csv', 'r') as csvfile:
After opening the CSV file, create a CSV reader object:
csvreader = csv.reader(csvfile)
Since the comma is the default delimiter, we do not have to specify it. For any other delimiter, we have to pass the delimiter used by the CSV file.
Finally, to extract each row, use a for loop to iterate over the csvreader object and print each row:
for student in csvreader:
    print(student)
All in all, the final code will look like this:
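import csv

# Open the file in read mode
with open('students.csv', 'r') as csvfile:
    # Create a reader that splits each line on commas (the default delimiter)
    csvreader = csv.reader(csvfile)
    # Each row is returned as a list of strings
    for student in csvreader:
        print(student)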
Once all the code is written, execute it and each row of the CSV file will be printed as a Python list of strings.
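A common next step is to separate the header from the data and convert values to the types you need, since the reader returns every field as a string. The following is only a sketch under assumptions: the file contents, the temporary path, and the names sample_path and students are hypothetical, and the sample file is created first so that the example runs on its own.

```python
import csv
import os
import tempfile

# Create a small sample file so the sketch is self-contained (hypothetical data)
sample_path = os.path.join(tempfile.mkdtemp(), "students.csv")
with open(sample_path, "w", newline="") as f:
    f.write("Name,Age\nAlice,20\nBob,22\n")

with open(sample_path, "r", newline="") as csvfile:
    csvreader = csv.reader(csvfile)
    header = next(csvreader)  # consume the first row as the header
    # Every field arrives as a string, so convert Age to int explicitly
    students = [(name, int(age)) for name, age in csvreader]

print(header)
print(students)
```

Calling next() on the reader advances it by one row, so the loop that follows only sees the data rows.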
Reading a CSV file with a Custom Delimiter
To read a CSV file with a custom delimiter, we just have to specify the delimiter when creating the CSV reader object.
For example, if students.csv uses a semicolon (;) in place of the comma and we want to split each row on the semicolon, the delimiter is specified like this:
csvreader = csv.reader(csvfile, delimiter = ';')
The final code with the specific delimiter will look like this:
import csv

# Open the file in read mode
with open('students.csv', 'r') as csvfile:
    # Tell the reader to split each line on semicolons instead of commas
    csvreader = csv.reader(csvfile, delimiter=';')
    for student in csvreader:
        print(student)
The output will be the same as in the previous example.
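To see why the delimiter matters, this small sketch reads the same semicolon-separated text twice, once with the default delimiter and once with delimiter=';'. The sample text and the variable names are assumptions made for illustration, and the data is kept in memory with io.StringIO instead of a file.

```python
import csv
import io

# Hypothetical semicolon-separated data
semicolon_text = "Name;Age\nAlice;20\nBob;22\n"

# With the default comma delimiter, each line is treated as a single value
default_rows = list(csv.reader(io.StringIO(semicolon_text)))

# With delimiter=';', each line is split into separate fields
custom_rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))

print(default_rows)
print(custom_rows)
```

If the rows come back with only one element each, a mismatched delimiter is a likely cause.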
This is how we can provide a custom delimiter and read the CSV file in Python. Now, let’s learn to write a CSV file in Python.
Writing a CSV file in Python
For writing a CSV file, the writer object will be used. There are two ways to write rows to a CSV file:
- Write lines one by one using writerow() function
- Write multiple lines using writerows() function
Let’s start writing the code for writing the CSV file and understand both ways better:
How to Write a CSV file using writerow() function in Python
Using the writerow() function, we can write only one row at a time to a CSV file.
For example, to write three rows into a new employees.csv file, the Python code will look like this:
import csv

# newline='' prevents extra blank lines between rows on Windows
with open('employees.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["ID", "Name", "Age"])  # header row
    writer.writerow([1, "John", 35])
    writer.writerow([2, "Harry", 25])
The above code will create a file with the name of employees.csv and add three rows to that employees.csv file.
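To inspect exactly what writerow() produces, the sketch below writes into an in-memory buffer instead of a file. The buffer, the variable names, and the extra row containing a comma are assumptions added for illustration; they show that the writer quotes a value when it contains the delimiter.

```python
import csv
import io

# An in-memory text buffer stands in for the file (illustrative only)
buffer = io.StringIO()
writer = csv.writer(buffer)

writer.writerow(["ID", "Name", "Age"])
writer.writerow([1, "John", 35])
# A value that contains a comma is wrapped in quotes by the writer
writer.writerow([3, "Smith, Jane", 41])

csv_text = buffer.getvalue()
print(csv_text)
```

Numbers are converted to text on the way out, and the quoting keeps "Smith, Jane" as a single field when the data is read back.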
However, this method becomes tedious when we have to add hundreds of rows, because each row needs its own call. To solve this problem, Python also provides a writerows() function that writes many rows to a CSV file in a single call.
Write Multiple Rows using writerows() function
With the writerows() function, we can simply pass a list of lists (one inner list per row) and it will write all the rows into the CSV file.
For example, to write three rows into an employees.csv file, we first create a list named employees_csv whose elements are the rows, and then pass the employees_csv list to the writerows() function.
All in all, the Python code will look like this:
import csv

# Each inner list is one row of the CSV file
employees_csv = [["ID", "Name", "Age"], [1, "John", 35], [2, "Harry", 25]]
with open('employees.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(employees_csv)  # write all rows in one call
The above code will create a file with the name of employees.csv and add three rows to that employees.csv file, the same as it did for the earlier procedure.
By using this method, you can add thousands of rows to the CSV file with a single call by providing the data as a list of rows.
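As a quick check that writing and reading fit together, here is a minimal round-trip sketch: it writes the rows with writerows() and reads them back with csv.reader. The temporary directory and the names employees, path, and rows_read are assumptions chosen so the example does not touch any real file.

```python
import csv
import os
import tempfile

employees = [["ID", "Name", "Age"], [1, "John", 35], [2, "Harry", 25]]

# Write to a file in a temporary directory (hypothetical location)
path = os.path.join(tempfile.mkdtemp(), "employees.csv")
with open(path, "w", newline="") as csvfile:
    csv.writer(csvfile).writerows(employees)

# Read the same file back into a list of rows
with open(path, "r", newline="") as csvfile:
    rows_read = list(csv.reader(csvfile))

print(rows_read)
```

Note that the numbers come back as strings, because a CSV file stores only text; convert them with int() or float() if numeric values are needed.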
Conclusion
CSV is a data format that is used to store data in a tabular format and transfer it between different applications. Python has a built-in csv module for reading, writing, and parsing CSV data.
In this post, we learned to read and write data in the form of a CSV file using Python.