我有一个大约有2000条记录的CSV文件。
每条记录都有一个字符串和一个类别:
This is the first line,Line1
This is the second line,Line2
This is the third line,Line3
我需要把这个文件读入一个列表,看起来像这样:
data = [('This is the first line', 'Line1'),
('This is the second line', 'Line2'),
('This is the third line', 'Line3')]
如何使用Python将此CSV导入到我需要的列表?
扩展一下您的需求,假设您不关心行顺序,并希望将它们分组到类别下,下面的解决方案可能适合您:
>>> fname = "lines.txt"
>>> from collections import defaultdict
>>> dct = defaultdict(list)
>>> with open(fname) as f:
... for line in f:
... text, cat = line.rstrip("\n").split(",", 1)
... dct[cat].append(text)
...
>>> dct
defaultdict(<type 'list'>, {' CatA': ['This is the first line', 'This is the another line'], ' CatC': ['This is the third line'], ' CatB': ['This is the second line', 'This is the last line']})
通过这种方式,您可以在类别的键下获得字典中所有可用的相关行。
扩展一下您的需求,假设您不关心行顺序,并希望将它们分组到类别下,下面的解决方案可能适合您:
>>> fname = "lines.txt"
>>> from collections import defaultdict
>>> dct = defaultdict(list)
>>> with open(fname) as f:
... for line in f:
... text, cat = line.rstrip("\n").split(",", 1)
... dct[cat].append(text)
...
>>> dct
defaultdict(<type 'list'>, {' CatA': ['This is the first line', 'This is the another line'], ' CatC': ['This is the third line'], ' CatB': ['This is the second line', 'This is the last line']})
通过这种方式,您可以在类别的键下获得字典中所有可用的相关行。
Python3的更新:
import csv
from pprint import pprint
with open('text.csv', newline='') as file:
reader = csv.reader(file)
res = list(map(tuple, reader))
pprint(res)
输出:
[('This is the first line', ' Line1'),
('This is the second line', ' Line2'),
('This is the third line', ' Line3')]
如果csvfile是一个文件对象,它应该用newline= "打开。
csv模块
使用csv模块:
import csv
with open('file.csv', newline='') as f:
reader = csv.reader(f)
data = list(reader)
print(data)
输出:
[['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]
如果你需要元组:
import csv
with open('file.csv', newline='') as f:
reader = csv.reader(f)
data = [tuple(row) for row in reader]
print(data)
输出:
[('This is the first line', 'Line1'), ('This is the second line', 'Line2'), ('This is the third line', 'Line3')]
旧的Python 2答案,同样使用csv模块:
import csv
with open('file.csv', 'rb') as f:
reader = csv.reader(f)
your_list = list(reader)
print your_list
# [['This is the first line', 'Line1'],
# ['This is the second line', 'Line2'],
# ['This is the third line', 'Line3']]