I have a JSON file that I want to convert to a CSV file. How can I do this with Python?
I tried:
import json
import csv
f = open('data.json')
data = json.load(f)
f.close()
f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)
f.close()
However, it did not work. I am using Django, and the error I received is:
`'file' object has no attribute 'writerow'`
Then I tried the following:
import json
import csv
f = open('data.json')
data = json.load(f)
f.close()
f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)  # ← changed
f.close()
And then I got the error:
`sequence expected`
Sample JSON file:
[{
    "pk": 22,
    "model": "auth.permission",
    "fields": {
        "codename": "add_logentry",
        "name": "Can add log entry",
        "content_type": 8
    }
}, {
    "pk": 23,
    "model": "auth.permission",
    "fields": {
        "codename": "change_logentry",
        "name": "Can change log entry",
        "content_type": 8
    }
}, {
    "pk": 24,
    "model": "auth.permission",
    "fields": {
        "codename": "delete_logentry",
        "name": "Can delete log entry",
        "content_type": 8
    }
}, {
    "pk": 4,
    "model": "auth.permission",
    "fields": {
        "codename": "add_group",
        "name": "Can add group",
        "content_type": 2
    }
}, {
    "pk": 10,
    "model": "auth.permission",
    "fields": {
        "codename": "add_message",
        "name": "Can add message",
        "content_type": 4
    }
}
]
I am assuming that your JSON file will decode into a list of dictionaries. First, we need a function that flattens the JSON objects:
def flattenjson(b, delim):
    # Recursively flatten nested dicts, joining nested keys with delim.
    val = {}
    for i in b.keys():
        if isinstance(b[i], dict):
            get = flattenjson(b[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = b[i]
    return val
The result of running this snippet on your JSON object:
flattenjson({
    "pk": 22,
    "model": "auth.permission",
    "fields": {
        "codename": "add_message",
        "name": "Can add message",
        "content_type": 8
    }
}, "__")
is
{
    "pk": 22,
    "model": "auth.permission",
    "fields__codename": "add_message",
    "fields__name": "Can add message",
    "fields__content_type": 8
}
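After applying this function to each dict in the input array of JSON objects:
# Build a list rather than a lazy map object, so it can be iterated more than once in Python 3.
input = [flattenjson(x, "__") for x in input]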
and finding the relevant column names:
columns = [x for row in input for x in row.keys()]
columns = list(set(columns))
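it's not hard to run this through the csv module:
# Python 3: text mode with newline=''; in Python 2, use open(fname, 'wb') instead.
with open(fname, 'w', newline='') as out_file:
    csv_w = csv.writer(out_file)
    csv_w.writerow(columns)
    for i_r in input:
        # Fill in an empty string for any column this row does not have.
        csv_w.writerow([i_r.get(x, "") for x in columns])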
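Putting the steps above together, here is a minimal end-to-end sketch, assuming Python 3.7+ and a JSON document that decodes to a list of dicts. The names `flatten` and `json_to_csv` are hypothetical helpers introduced for illustration. Unlike the `set`-based approach above, it keeps columns in first-seen order via `dict.fromkeys`, and it uses `csv.DictWriter` with `restval` to fill in missing keys.

```python
import csv
import io
import json


def flatten(obj, delim="__"):
    """Flatten nested dicts: {"a": {"b": 1}} -> {"a__b": 1}."""
    flat = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten(value, delim).items():
                flat[key + delim + sub_key] = sub_value
        else:
            flat[key] = value
    return flat


def json_to_csv(json_text, out_file, delim="__"):
    """Write a JSON array of objects to out_file as CSV; return the columns."""
    rows = [flatten(item, delim) for item in json.loads(json_text)]
    # Union of all keys, in first-seen order (dicts keep insertion order).
    columns = list(dict.fromkeys(key for row in rows for key in row))
    # restval supplies "" for any column a given row does not have.
    writer = csv.DictWriter(out_file, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


sample = (
    '[{"pk": 22, "model": "auth.permission", '
    '"fields": {"codename": "add_logentry", "content_type": 8}}]'
)
buffer = io.StringIO()  # stands in for a real file in this sketch
json_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("data.csv", "w", newline="")`.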
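As mentioned in the previous answers, the difficulty in converting JSON to CSV is that a JSON file can contain nested dictionaries and is therefore a multidimensional data structure, whereas a CSV is a 2D data structure. However, a good way to turn a multidimensional structure into CSV is to produce multiple CSVs that tie together with primary keys.
In your example, the first CSV output has the columns "pk", "model", and "fields". The values for "pk" and "model" are easy to get. Because the "fields" column contains a dictionary, it should be its own CSV, and because "codename" appears to be the primary key, you can use it as the value for "fields" to complete the first CSV. The second CSV contains the dictionary from the "fields" column, with codename as the primary key that ties the two CSVs together.
Here is a solution for your JSON file that converts the nested dictionaries to two CSVs.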
import csv
import json

def readAndWrite(inputFileName, primaryKey=""):
    # Load the JSON array of objects from <inputFileName>.json.
    with open(inputFileName + ".json") as input_file:
        data = json.load(input_file)

    # Collect the CSV column names.
    header = set()
    if primaryKey != "":
        outputFileName = inputFileName + "-" + primaryKey
        if inputFileName == "data":
            for i in data:
                header.update(i["fields"].keys())
    else:
        outputFileName = inputFileName
        for i in data:
            header.update(i.keys())

    # Python 3: open in text mode with newline='' (Python 2 used 'wb').
    with open(outputFileName + ".csv", 'w', newline='') as output_file:
        fieldnames = sorted(header)  # sorted for a stable column order
        writer = csv.DictWriter(output_file, fieldnames, delimiter=',', quotechar='"')
        writer.writeheader()
        for x in data:
            row_value = {}
            if primaryKey == "":
                # First CSV: top-level keys, with the nested dict replaced by its primary key.
                for y in x.keys():
                    yValue = x.get(y)
                    if isinstance(yValue, dict):
                        if inputFileName == "data":
                            row_value[y] = yValue["codename"]
                            # Writes the second CSV (redone for every record; simple but redundant).
                            readAndWrite(inputFileName, primaryKey="codename")
                    else:
                        row_value[y] = yValue
                writer.writerow(row_value)
            elif primaryKey == "codename":
                # Second CSV: one row per "fields" dict.
                for y in x["fields"].keys():
                    yValue = x["fields"].get(y)
                    if not isinstance(yValue, dict):
                        row_value[y] = yValue
                writer.writerow(row_value)

readAndWrite("data")
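The same two-CSV idea can also be sketched without recursion or hard-coded file names. This is only an illustrative variant, assuming Python 3.7+ and that every record has a nested dict under `nested_key` containing `primary_key`. The names `split_rows` and `write_csv` are hypothetical, and it writes to in-memory buffers rather than to `data.csv` and `data-codename.csv`.

```python
import csv
import io


def split_rows(records, nested_key="fields", primary_key="codename"):
    """Split records into (parent_rows, child_rows) linked by primary_key."""
    parent_rows, child_rows = [], []
    for record in records:
        child = dict(record[nested_key])
        parent = dict(record)
        # In the parent table, replace the nested dict with its primary key.
        parent[nested_key] = child[primary_key]
        parent_rows.append(parent)
        child_rows.append(child)
    return parent_rows, child_rows


def write_csv(rows, out_file):
    """Write a list of flat dicts as CSV, columns in first-seen order."""
    columns = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(out_file, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)


records = [
    {"pk": 22, "model": "auth.permission",
     "fields": {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8}},
    {"pk": 4, "model": "auth.permission",
     "fields": {"codename": "add_group", "name": "Can add group", "content_type": 2}},
]
parents, children = split_rows(records)
parent_csv, child_csv = io.StringIO(), io.StringIO()
write_csv(parents, parent_csv)
write_csv(children, child_csv)
print(parent_csv.getvalue())
print(child_csv.getvalue())
```

Each row in the first output carries the `codename` under `fields`, which can be used to look up the matching row in the second output.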