Learn to work with CSV files in Python. CSV (Comma Separated Values) format is a very popular import and export format used in spreadsheets and databases. Python language contains the csv
module which has classes to read and write data in the CSV format.
Table of Contents Reading CSV file with csv.reader() Reading CSV file with csv.DictReader Writing CSV file using csv.writer() Quoting CSV Dialects Custom CSV Dialects
Reading CSV file with csv.reader()
The csv.reader()
method returns a reader object which will iterate over lines in the given CSV file.
Let’s suppose that we have the following numbers.csv
file containing numbers:
6,5,3,9,8,6,7
The following python script reads data from this CSV file.
#!/usr/bin/python3 import csv f = open( 'numbers.csv' , 'r' ) with f: reader = csv.reader(f) for row in reader: print(row) |
In above code example, we opened the numbers.csv
for reading and loaded the data with the csv.reader()
method.
Now imagine that the CSV file would use a different delimiter. (Strictly speaking, it would not be a CSV file, but this practice is common.) For instance, we have the following items.csv
file where the elements are separated by the pipe character (|):
pen|table|keyboard
The following script reads the data from the items.csv
file.
#!/usr/bin/python3 import csv f = open( 'items.csv' , 'r' ) with f: reader = csv.reader(f, delimiter= "|" ) for row in reader: for e in row: print(e) |
We specify the new separating character with the delimiter
parameter in the csv.reader()
method.
Reading CSV file with csv.DictReader
The csv.DictReader
class operates like a regular reader but maps the information read into a dictionary.
The keys for the dictionary can be passed in with the fieldnames
parameter or inferred from the first row of the CSV file.
We have the following values.csv
file:
min, avg, max 1, 5.5, 10
The first row represents the keys to the dictionary and second row represents the values.
#!/usr/bin/python3 import csv f = open( 'values.csv' , 'r' ) with f: reader = csv.DictReader(f) for row in reader: print(row) |
Above python script reads the values from the values.csv
file using the csv.DictReader
.
This is the output of the example.
$ ./read_csv3.py {' max': ' 10', 'min': '1', ' avg': ' 5.5'}
Writing CSV file using csv.writer()
The csv.writer()
method returns a writer object responsible for converting the user’s data into delimited strings on the given file-like object.
#!/usr/bin/python3 import csv nms = [[ 1 , 2 , 3 , 4 , 5 , 6 ], [ 7 , 8 , 9 , 10 , 11 , 12 ]] f = open( 'numbers2.csv' , 'w' ) with f: writer = csv.writer(f) for row in nms: writer.writerow(row) |
The script writes numbers into the numbers2.csv
file. The writerow()
method writes a row of data into the specified file.
The script produces the following file (numbers2.csv
):
1,2,3,4,5,6 7,8,9,10,11,12
It is possible to write all data in one shot. The writerows()
method writes all given rows to the CSV file.
The next code example writes a Python list to the numbers3.csv
file. The script writes three rows of numbers into the file.
#!/usr/bin/python3 import csv nms = [[ 1 , 2 , 3 ], [ 7 , 8 , 9 ], [ 10 , 11 , 12 ]] f = open( 'numbers3.csv' , 'w' ) with f: writer = csv.writer(f) writer.writerows(nms) |
When running the above program, below output is written in numbers3.csv
file:
1,2,3 7,8,9 10,11,12
Quoting
It is possible to quote words in CSV files. There are four different quoting modes in the Python CSV module:
- QUOTE_ALL — quotes all fields
- QUOTE_MINIMAL — quotes only those fields which contain special characters
- QUOTE_NONNUMERIC — quotes all non-numeric fields
- QUOTE_NONE — does not quote fields
In the next example, we write three rows to the items2.csv
file. All non-numeric fields are quoted.
#!/usr/bin/python3 import csv f = open( 'items2.csv' , 'w' ) with f: writer = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC) writer.writerows(([ "coins" , 3 ], [ "pens" , 2 ], [ "bottles" , 7 ])) |
The program creates the following items2.csv
file. The item names are quoted, the quantities expressed in numbers are not quoted.
"coins",3 "pens",2 "bottles",7
CSV Dialects
Despite CSV format being a very simple format, there can be many differecies, such as different delimiters, new lines, or quoting characters. Therefore, there are different CSV dialects available.
The next code example prints the available dialects and their characteristics.
#!/usr/bin/python3 import csv names = csv.list_dialects() for name in names: print(name) dialect = csv.get_dialect(name) print(repr(dialect.delimiter), end= " " ) print(dialect.doublequote, end= " " ) print(dialect.escapechar, end= " " ) print(repr(dialect.lineterminator), end= " " ) print(dialect.quotechar, end= " " ) print(dialect.quoting, end= " " ) print(dialect.skipinitialspace, end= " " ) print(dialect.strict) |
The csv.list_dialects()
returns the list of dialect names and the csv.get_dialect()
method returns the dialect associated with the dialect name.
$ ./dialects.py excel ',' 1 None '\r\n' " 0 0 0 excel-tab '\t' 1 None '\r\n' " 0 0 0 unix ',' 1 None '\n' " 1 0 0
The program prints this output. There are three built-in dialects: excel
, excel-tab
and unix
.
Custom CSV Dialect
In the last example of this tutorial, we will create a custom dialect. A custom dialect is created with the csv.register_dialect()
method.
#!/usr/bin/python3 import csv csv.register_dialect( "hashes" , delimiter= "#" ) f = open( 'items3.csv' , 'w' ) with f: writer = csv.writer(f, dialect= "hashes" ) writer.writerow(( "pencils" , 2 )) writer.writerow(( "plates" , 1 )) writer.writerow(( "books" , 4 )) |
The program uses a (#) character as a delimiter. The dialect is specified with the dialect
option in the csv.writer()
method.
The program produces the following file (items3.csv
):
pencils#2 plates#1 books#4
In this tutorial, we have explored the Python csv
module and gone through some examples of reading and writing CSV files in python.
No comments:
Post a Comment
Note: only a member of this blog may post a comment.