I have a CSV file with about 2000 records.
Each record has a string and a category:
This is the first line,Line1
This is the second line,Line2
This is the third line,Line3
I need to read this file into a list that looks like this:
data = [('This is the first line', 'Line1'),
        ('This is the second line', 'Line2'),
        ('This is the third line', 'Line3')]
How can I import this CSV into the list I need using Python?
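Expanding on your requirements a bit: assuming you don't care about line order and want the lines grouped under their categories, the following solution may work for you: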
>>> fname = "lines.txt"
>>> from collections import defaultdict
>>> dct = defaultdict(list)
>>> with open(fname) as f:
...     for line in f:
...         text, cat = line.rstrip("\n").split(",", 1)
...         dct[cat].append(text)
...
>>> dct
defaultdict(<type 'list'>, {' CatA': ['This is the first line', 'This is the another line'], ' CatC': ['This is the third line'], ' CatB': ['This is the second line', 'This is the last line']})
This way, all the related lines are available in the dictionary under the key for their category.
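The same grouping can also be sketched with the csv module, which copes with quoted fields that contain commas. This is only a minimal sketch: the function name group_by_category and the sample lines are made up for illustration, and skipinitialspace=True is used on the assumption that the category is preceded by a space, as in the output above.

```python
import csv
from collections import defaultdict


def group_by_category(lines):
    """Group the text column under its category.

    `lines` is any iterable of CSV lines, for example an open file.
    """
    groups = defaultdict(list)
    # skipinitialspace=True drops the space right after the comma,
    # so " CatA" becomes "CatA".
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        text, category = row
        groups[category].append(text)
    return dict(groups)


# Hypothetical sample data standing in for the contents of lines.txt
sample = [
    "This is the first line, CatA",
    "This is the second line, CatB",
    "This is another line, CatA",
]
grouped = group_by_category(sample)
print(grouped)
```

To use it on a real file, pass the open file object: with open(fname, newline="") as f: grouped = group_by_category(f).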
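As already said in the comments, you can use the csv library in Python. CSV stands for comma-separated values, which seems to be exactly your case: a label and a value separated by a comma.
Since the data is a category and a value, I would rather use a dictionary than a list of tuples.
Anyway, the code below shows both ways: d is the dictionary and l is the list of tuples.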
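import csv
import sys

file_name = "test.txt"

try:
    # newline="" is what the csv module recommends when opening files
    csvfile = open(file_name, "rt", newline="")
except FileNotFoundError:
    print("File not found")
    sys.exit(1)

d = dict()  # category -> text (a repeated category overwrites the earlier entry)
l = list()  # (text, category) tuples, in file order
with csvfile:
    csvReader = csv.reader(csvfile, delimiter=",")
    for row in csvReader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        d[row[1]] = row[0]
        l.append((row[0], row[1]))

print(d)
print(l)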
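If only the list of tuples from the question is needed, the loop can be reduced to a list comprehension. The following is a minimal sketch, not the only way to do it: read_pairs is a hypothetical helper name, UTF-8 encoding is an assumption about the file, and the temporary file exists only so the example runs on its own.

```python
import csv
import os
import tempfile


def read_pairs(path):
    """Return the rows of a CSV file as a list of tuples, in file order."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as f:
        # `if row` skips completely empty lines.
        return [tuple(row) for row in csv.reader(f) if row]


# Write a small demo file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("This is the first line,Line1\n")
    tmp.write("This is the second line,Line2\n")
    tmp.write("This is the third line,Line3\n")
    demo_path = tmp.name

data = read_pairs(demo_path)
os.remove(demo_path)
print(data)
```

With the real file, data = read_pairs("test.txt") should give the structure shown in the question, as long as every record has exactly two fields.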