I'm developing a part of an application that's responsible for exporting some data into CSV files. The application always uses UTF-8 because of its multilingual nature at all levels. But opening such CSV files (containing e.g. diacritics, cyrillic letters, Greek letters) in Excel does not achieve the expected results showing something like Г„/Г¤, Г–/Г¶. And I don't know how to force Excel understand that the open CSV file is encoded in UTF-8. I also tried specifying UTF-8 BOM EF BB BF, but Excel ignores that.

有什么解决办法吗?

附注:哪些工具可能像Excel一样?


更新

I have to say that I've confused the community with the formulation of the question. When I was asking this question, I asked for a way of opening a UTF-8 CSV file in Excel without any problems for a user, in a fluent and transparent way. However, I used a wrong formulation asking for doing it automatically. That is very confusing and it clashes with VBA macro automation. There are two answers for this questions that I appreciate the most: the very first answer by Alex https://stackoverflow.com/a/6002338/166589, and I've accepted this answer; and the second one by Mark https://stackoverflow.com/a/6488070/166589 that have appeared a little later. From the usability point of view, Excel seemed to have lack of a good user-friendly UTF-8 CSV support, so I consider both answers are correct, and I have accepted Alex's answer first because it really stated that Excel was not able to do that transparently. That is what I confused with automatically here. Mark's answer promotes a more complicated way for more advanced users to achieve the expected result. Both answers are great, but Alex's one fits my not clearly specified question a little better.


更新2

在最后一次编辑5个月后,我注意到Alex的答案不知为何消失了。我真的希望这不是一个技术问题,我希望现在不再有关于哪个答案更好的讨论。所以我认为马克的答案是最好的。


当前回答

如果你想让它完全自动化,点击一下,或者从一个网页自动加载到Excel中,但不能生成适当的Excel文件,那么我建议考虑SYLK格式作为替代方案。好吧,它不像CSV那么简单,但它是基于文本的,非常容易实现,它支持UTF-8没有问题。

我写了一个PHP类,接收数据并输出一个SYLK文件,该文件将通过单击文件直接在Excel中打开(或者如果您将文件写入具有正确mime类型的web页面,将自动启动Excel)。你甚至可以添加格式(如粗体,以特定的方式格式化数字等),改变列的大小,或自动调整列的文本,所有的代码可能不超过100行。

通过创建一个简单的电子表格并保存为SYLK,然后用文本编辑器读取它,就可以非常容易地对SYLK进行逆向工程。第一个块是您可以识别的标头和标准数字格式(您只需在创建的每个文件中反刍它们),然后数据只是一个X/Y坐标和一个值。

其他回答

这是一个老问题,但我刚刚遇到过类似的问题,解决方案可能会帮助其他人:

同样的问题是,将CSV文本数据写入文件,然后在Excel中打开生成的. CSV,将所有文本转移到单个列中。在阅读了上面的答案后,我尝试了下面的答案,这似乎可以解决问题。

在创建StreamWriter时应用UTF-8编码。就是这样。

例子:

using (StreamWriter output = new StreamWriter(outputFileName, false, Encoding.UTF8, 2 << 22)) {
   /* ... do stuff .... */
   output.Close();
}

php生成的CSV文件也有同样的问题。 当分隔符在内容开头通过“sep=,\n”定义时(当然是在BOM之后),Excel会忽略BOM。

因此,在内容的开头添加一个BOM ("\xEF\xBB\xBF"),并通过fputcsv($fh, $data_array, ";")设置分号作为分隔符;很管用。

Alex是正确的,但是由于你必须导出到csv,你可以在打开csv文件时给用户这样的建议:

另存为csv格式 打开Excel 使用“data”导入数据——>导入外部数据——>导入数据 选择文件类型“csv”并浏览到您的文件 在导入向导中将File_Origin更改为“65001 UTF”(或选择正确的语言字符标识符) 将分隔符更改为逗号 选择要导入的位置并完成

这样特殊字符才能正确显示。

我过去也遇到过同样的问题(如何生成Excel可以读取的文件,以及其他工具也可以读取的文件)。我使用的是TSV而不是CSV,但同样的编码问题出现了。

我没能找到任何方法让Excel自动识别UTF-8,我也不愿意/不能给文件的使用者复杂的如何打开它们的指令。所以我将它们编码为UTF-16le(带有BOM)而不是UTF-8。大小是原来的两倍,但Excel可以识别编码。而且它们的压缩性很好,所以尺寸很少(但遗憾的是并非永远)重要。

这是我的工作解决方案:

vbFILEOPEN = "your_utf8_file.csv"
Workbooks.OpenText Filename:=vbFILEOPEN, DataType:=xlDelimited, Semicolon:=True, Local:=True, Origin:=65001

密钥是Origin:=65001