我有一个JSON文件,我想转换为CSV文件。我如何用Python做到这一点?

我试着:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)

f.close()

然而,这并没有起作用。我正在使用Django和我收到的错误是:

`file' object has no attribute 'writerow'`

然后我尝试了以下方法:

import json
import csv

f = open('data.json')
data = json.load(f)
f.close()

f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)  # ← changed

f.close()

然后得到错误:

`sequence expected`

样本json文件:

[{
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    }, {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    }, {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }, {
        "pk": 4,
        "model": "auth.permission",
        "fields": {
            "codename": "add_group",
            "name": "Can add group",
            "content_type": 2
        }
    }, {
        "pk": 10,
        "model": "auth.permission",
        "fields": {
            "codename": "add_message",
            "name": "Can add message",
            "content_type": 4
        }
    }
]

当前回答

这不是一个很聪明的方法,但我也遇到过同样的问题,这对我来说很有效:

import csv

f = open('data.json')
data = json.load(f)
f.close()

new_data = []

for i in data:
   flat = {}
   names = i.keys()
   for n in names:
      try:
         if len(i[n].keys()) > 0:
            for ii in i[n].keys():
               flat[n+"_"+ii] = i[n][ii]
      except:
         flat[n] = i[n]
   new_data.append(flat)  

f = open(filename, "r")
writer = csv.DictWriter(f, new_data[0].keys())
writer.writeheader()
for row in new_data:
   writer.writerow(row)
f.close()

其他回答

令人惊讶的是,我发现到目前为止贴在这里的答案都没有正确处理所有可能的场景(例如,嵌套字典,嵌套列表,无值等)。

这个解决方案应该适用于所有场景:

def flatten_json(json):
    def process_value(keys, value, flattened):
        if isinstance(value, dict):
            for key in value.keys():
                process_value(keys + [key], value[key], flattened)
        elif isinstance(value, list):
            for idx, v in enumerate(value):
                process_value(keys + [str(idx)], v, flattened)
        else:
            flattened['__'.join(keys)] = value

    flattened = {}
    for key in json.keys():
        process_value([key], json[key], flattened)
    return flattened

这不是一个很聪明的方法,但我也遇到过同样的问题,这对我来说很有效:

import csv

f = open('data.json')
data = json.load(f)
f.close()

new_data = []

for i in data:
   flat = {}
   names = i.keys()
   for n in names:
      try:
         if len(i[n].keys()) > 0:
            for ii in i[n].keys():
               flat[n+"_"+ii] = i[n][ii]
      except:
         flat[n] = i[n]
   new_data.append(flat)  

f = open(filename, "r")
writer = csv.DictWriter(f, new_data[0].keys())
writer.writeheader()
for row in new_data:
   writer.writerow(row)
f.close()

解决这个问题的简单方法是:

创建一个新的Python文件,如:json_to_csv.py

添加以下代码:

import csv, json, sys
#if you are not using utf-8 files, remove the next line
sys.setdefaultencoding("UTF-8")
#check if you pass the input file and output file
if sys.argv[1] is not None and sys.argv[2] is not None:

    fileInput = sys.argv[1]
    fileOutput = sys.argv[2]

    inputFile = open(fileInput)
    outputFile = open(fileOutput, 'w')
    data = json.load(inputFile)
    inputFile.close()

    output = csv.writer(outputFile)

    output.writerow(data[0].keys())  # header row

    for row in data:
        output.writerow(row.values())

添加代码后,保存文件并在终端上运行:

Python json_to_csv.py input.txt output.csv

我希望这对你有帮助。

韩国歌手组合!

Alec的回答很好,但在存在多层嵌套的情况下行不通。下面是一个支持多层嵌套的修改版本。如果嵌套对象已经指定了自己的键(例如Firebase Analytics / BigTable / BigQuery数据),它也会使头名称更好一些:

"""Converts JSON with nested fields into a flattened CSV file.
"""

import sys
import json
import csv
import os

import jsonlines

from orderedset import OrderedSet

# from https://stackoverflow.com/a/28246154/473201
def flattenjson( b, prefix='', delim='/', val=None ):
  if val is None:
    val = {}

  if isinstance( b, dict ):
    for j in b.keys():
      flattenjson(b[j], prefix + delim + j, delim, val)
  elif isinstance( b, list ):
    get = b
    for j in range(len(get)):
      key = str(j)

      # If the nested data contains its own key, use that as the header instead.
      if isinstance( get[j], dict ):
        if 'key' in get[j]:
          key = get[j]['key']

      flattenjson(get[j], prefix + delim + key, delim, val)
  else:
    val[prefix] = b

  return val

def main(argv):
  if len(argv) < 2:
    raise Error('Please specify a JSON file to parse')

  print "Loading and Flattening..."
  filename = argv[1]
  allRows = []
  fieldnames = OrderedSet()
  with jsonlines.open(filename) as reader:
    for obj in reader:
      # print 'orig:\n'
      # print obj
      flattened = flattenjson(obj)
      #print 'keys: %s' % flattened.keys()
      # print 'flattened:\n'
      # print flattened
      fieldnames.update(flattened.keys())
      allRows.append(flattened)

  print "Exporting to CSV..."
  outfilename = filename + '.csv'
  count = 0
  with open(outfilename, 'w') as file:
    csvwriter = csv.DictWriter(file, fieldnames=fieldnames)
    csvwriter.writeheader()
    for obj in allRows:
      # print 'allRows:\n'
      # print obj
      csvwriter.writerow(obj)
      count += 1

  print "Wrote %d rows" % count



if __name__ == '__main__':
  main(sys.argv)

JSON可以表示各种各样的数据结构——JS的“对象”大致类似于Python的dict(带有字符串键),JS的“数组”大致类似于Python列表,只要最后的“叶子”元素是数字或字符串,你就可以嵌套它们。

CSV本质上只能表示一个2-D表——可选的第一行是“标题”,即“列名”,这可以使表可解释为字典列表,而不是正常的解释,一个列表的列表(同样,“叶子”元素可以是数字或字符串)。

So, in the general case, you can't translate an arbitrary JSON structure to a CSV. In a few special cases you can (array of arrays with no further nesting; arrays of objects which all have exactly the same keys). Which special case, if any, applies to your problem? The details of the solution depend on which special case you do have. Given the astonishing fact that you don't even mention which one applies, I suspect you may not have considered the constraint, neither usable case in fact applies, and your problem is impossible to solve. But please do clarify!