import collections
import csv

with open('thefile.csv', 'rb') as f:
    data = list(csv.reader(f))

# Count how often each value in column 10 occurs.
counter = collections.defaultdict(int)
for row in data:
    counter[row[10]] += 1

with open('/pythonwork/thefile_subset11.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    for row in data:
        if counter[row[10]] >= 504:
            writer.writerow(row)

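This code reads thefile.csv, makes changes, and writes the results to thefile_subset11.csv.

However, when I open the resulting csv in Microsoft Excel, there is an extra blank line after each record!

Is there a way to make it not put an extra blank line?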


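Current answer

When using Python 3, the blank lines can be avoided by using the codecs module. As stated in the documentation, codecs.open opens the underlying file in binary mode, so no change to the newline kwarg is needed. I ran into the same problem recently and this worked for me: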

import codecs
import csv

with codecs.open(csv_file, mode='w', encoding='utf-8') as out_csv:
    csv_out_file = csv.DictWriter(out_csv, fieldnames=fieldnames)  # fieldnames: your list of column names

Other answers

with open(destPath+'\\'+csvXML, 'a+') as csvFile:
    writer = csv.writer(csvFile, delimiter=';', lineterminator='\r')
    writer.writerows(xmlList)

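Setting lineterminator='\r' moves the writer to the next row without leaving an empty row between two records.

Borrowing from this answer, it seems the cleanest solution is to use io.TextIOWrapper. I solved the problem for myself as follows: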

from io import TextIOWrapper

...

with open(filename, 'wb') as csvfile, TextIOWrapper(csvfile, encoding='utf-8', newline='') as wrapper:
    csvwriter = csv.writer(wrapper)
    for data_row in data:
        csvwriter.writerow(data_row)

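The answer above is not compatible with Python 2. To stay compatible, I suppose one would simply need to wrap all of the writing logic in an if block: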

import sys

if sys.version_info < (3,):
    # Python 2 way of handling CSVs
    pass
else:
    # The above logic
    pass
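
As a rough sketch of how that if block might be filled in: the helper names open_csv_for_writing and write_rows, the demo path, and the sample rows are all made up for illustration. The Python 3 branch here uses plain open() with newline='' rather than the TextIOWrapper approach shown above, which should give the same line endings.

```python
import csv
import os
import sys
import tempfile


def open_csv_for_writing(path):
    """Open path so that csv.writer emits no extra blank lines.

    Hypothetical helper, not part of the csv module.
    """
    if sys.version_info < (3,):
        # Python 2: the csv module expects a binary-mode file.
        return open(path, "wb")
    # Python 3: text mode with newline translation turned off.
    return open(path, "w", newline="", encoding="utf-8")


def write_rows(path, rows):
    """Write rows to path using whichever open mode fits the interpreter."""
    with open_csv_for_writing(path) as csvfile:
        csv.writer(csvfile).writerows(rows)


# Usage with a hypothetical temporary path and made-up rows.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
write_rows(demo_path, [["a", "b"], ["1", "2"]])
```

Keeping the version check inside one small function means the rest of the writing code does not need to know which interpreter it runs under.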

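In the ten years since the original question, many of the other answers have become outdated. For Python 3, the answer is right in the documentation:

If csvfile is a file object, it should be opened with newline=''

The footnote explains in more detail:

If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.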
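
A minimal Python 3 sketch of what the documentation describes. The output path and the sample rows are invented for illustration. With newline='' the '\r\n' row terminator written by csv.writer reaches the file unchanged, so it is not expanded to '\r\r\n' on Windows, which is what Excel displays as a blank row.

```python
import csv
import os
import tempfile

# Hypothetical sample data, standing in for the rows from the question.
rows = [["id", "name"], ["1", "Spam"], ["2", "Baked Beans"]]

# Hypothetical output path inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "subset.csv")

# newline='' turns off the text layer's newline translation, so the
# '\r\n' written by csv.writer is passed through as-is on every platform.
with open(out_path, "w", newline="", encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    writer.writerows(rows)

# Read the raw bytes back to inspect the actual line endings.
with open(out_path, "rb") as f:
    raw = f.read()
```

Inspecting raw in binary mode is a quick way to check whether any '\r\r\n' sequences slipped into a file you already have.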

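Note: It seems this is not the preferred solution, because of how the extra line was being added on a Windows system. As stated in the Python 2 documentation:

If csvfile is a file object, it must be opened with the 'b' flag on platforms where that makes a difference.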

Windows is one such platform where that makes a difference. While changing the line terminator as I described below may have fixed the problem, the problem could be avoided altogether by opening the file in binary mode. One might say this solution is more "elegant". "Fiddling" with the line terminator would likely have resulted in code that is not portable between systems in this case, whereas opening a file in binary mode on a Unix system has no effect, i.e. it results in cross-system compatible code.

From the Python documentation:

On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn’t hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.

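Original:

This is one of the optional parameters of csv.writer: if you are getting extra blank lines, you may have to change the lineterminator. The example below is adapted from the Python csv docs page. Change '\n' to whatever it should be. As this is just a stab in the dark at the problem, it may or may not work, but it is my best guess.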

>>> import csv
>>> spamWriter = csv.writer(open('eggs.csv', 'w'), lineterminator='\n')
>>> spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
>>> spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
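
To see what the lineterminator parameter actually changes, here is a small Python 3 sketch that writes the same row into in-memory buffers, so no platform newline translation is involved. The variable names are illustrative only.

```python
import csv
import io

row = ["Spam", "Lovely Spam", "Wonderful Spam"]

# Default dialect: each row ends with '\r\n'.
default_buf = io.StringIO()
csv.writer(default_buf).writerow(row)

# Overridden terminator: each row ends with a bare '\n'.
lf_buf = io.StringIO()
csv.writer(lf_buf, lineterminator="\n").writerow(row)

default_text = default_buf.getvalue()
lf_text = lf_buf.getvalue()
```

When the '\n' version is written to a text-mode file on Windows, the text layer converts each '\n' to '\r\n', which is why this setting appears to cure the blank rows there. The resulting bytes then depend on the platform, unlike with the binary-mode or newline='' approaches.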