Is it possible to force Excel to recognize UTF-8 CSV files automatically?

I'm developing a part of an application that's responsible for exporting some data into CSV files. The application always uses UTF-8 because of its multilingual nature at all levels. But opening such CSV files (containing e.g. diacritics, Cyrillic letters, Greek letters) in Excel does not give the expected results; it shows something like Г„/Г¤, Г–/Г¶. And I don't know how to force Excel to understand that the opened CSV file is encoded in UTF-8. I also tried prepending the UTF-8 BOM EF BB BF, but Excel ignores that.

Is there any workaround?

P.S. Which tools may potentially behave like Excel does?

UPDATE

I have to say that I've confused the community with the wording of the question. When I asked it, I was looking for a way to open a UTF-8 CSV file in Excel without any problems for the user, in a fluent and transparent way. However, I used the wrong wording by asking for it to happen automatically. That is very confusing, and it clashes with VBA macro automation. There are two answers to this question that I appreciate the most: the very first answer by Alex https://stackoverflow.com/a/6002338/166589, which I've accepted; and the second one by Mark https://stackoverflow.com/a/6488070/166589, which appeared a little later. From the usability point of view, Excel seemed to lack good user-friendly UTF-8 CSV support, so I consider both answers correct, and I accepted Alex's answer first because it plainly stated that Excel was not able to do that transparently. That is what I confused with "automatically" here. Mark's answer promotes a more complicated way for more advanced users to achieve the expected result. Both answers are great, but Alex's fits my not clearly specified question a little better.

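UPDATE 2

Five months after my last edit, I noticed that Alex's answer has disappeared for some reason. I really hope it wasn't a technical issue, and I hope there is no more discussion now about which answer is better. So I'm accepting Mark's answer as the best one.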

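Current answer

Yes, this is possible. As previously noted by multiple users, Excel seems to have a problem reading the byte order mark correctly when the file is encoded in UTF-8. With UTF-16 it does not seem to have a problem, so the issue is specific to UTF-8. The solution I use for this is adding the BOM, twice. To do that, I run the following sed command twice: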

sed -i '1s/^/\xef\xbb\xbf/' *.csv

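Here the wildcard can be replaced with any file name. However, this mangles the sep= line at the beginning of the .csv file. The .csv file will then open normally in Excel, but with an extra row containing "sep=" in the first cell. The "sep=" line can also be removed from the source .csv itself, but then the delimiter should be specified when opening the file with VBA: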

Workbooks.Open name, Format:=6, Delimiter:=";", Local:=True

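Format 6 tells Excel to use the custom character given in Delimiter as the separator. Set Local to True in case there are dates in the file. If Local is not set to True, the dates will be Americanized, which in some cases will corrupt the .csv format.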
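For readers without GNU sed, the BOM-prepending step can be sketched in Python. This is only an illustration of the workaround described above, not the answerer's own tooling: the function names `prepend_utf8_bom` and `count_leading_boms` are hypothetical, and whether Excel needs one BOM or two is the answerer's observation, so the count is left as a parameter.

```python
import codecs
from pathlib import Path


def prepend_utf8_bom(path, times=1):
    """Prepend the UTF-8 BOM (EF BB BF) to the file at `path`, `times` times."""
    file_path = Path(path)
    data = file_path.read_bytes()
    # codecs.BOM_UTF8 is the byte string b"\xef\xbb\xbf".
    file_path.write_bytes(codecs.BOM_UTF8 * times + data)


def count_leading_boms(path):
    """Return how many consecutive UTF-8 BOMs the file starts with."""
    data = Path(path).read_bytes()
    count = 0
    # Check for another BOM at each successive 3-byte offset.
    while data.startswith(codecs.BOM_UTF8, count * len(codecs.BOM_UTF8)):
        count += 1
    return count
```

Calling `prepend_utf8_bom("export.csv", times=2)` on a hypothetical `export.csv` mirrors running the sed command twice; `count_leading_boms` is just a way to check what ended up in the file.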

2016-12-01 13:03:53

Other answers

A simple VBA macro for opening UTF-8 text and CSV files

Sub OpenTextFile()

   Dim filetoopen As Variant
   filetoopen = Application.GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
   ' GetOpenFilename returns False when the dialog is cancelled
   If filetoopen = False Then Exit Sub

   ' Origin 65001 is the UTF-8 code page
   Workbooks.OpenText Filename:=filetoopen, _
   Origin:=65001, DataType:=xlDelimited, Comma:=True

End Sub

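Origin:=65001 is UTF-8. Comma:=True is for .csv files whose columns are separated by commas.

Save it in Personal.xlsb to have it always available. Customize the Excel toolbar by adding a button that calls the macro, and open files from there. You can add more formatting to the macro, such as column autofit, alignment, etc.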

2012-03-19 15:02:40

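The UTF-8 byte order mark will clue Excel 2007+ in to the fact that you're using UTF-8. (See this SO post.)

In case anybody runs into the same issue I did: .NET's UTF8 encoding class does not output a byte order mark in a GetBytes() call. You need to use streams (or a workaround) to get the BOM written out.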
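A similar pitfall exists in Python, where `str.encode("utf-8")` emits no BOM while the `utf-8-sig` codec does. The following is a minimal sketch of producing CSV bytes with a leading BOM; the helper name `csv_bytes_with_bom` is made up for illustration, and it is an analogy to the .NET case above, not .NET code.

```python
import codecs
import csv
import io


def csv_bytes_with_bom(rows):
    """Serialize `rows` as CSV and return UTF-8 bytes that start with a BOM."""
    # newline="" keeps the csv module's own "\r\n" line endings untouched.
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # "utf-8-sig" prepends EF BB BF; plain "utf-8" would not.
    return buffer.getvalue().encode("utf-8-sig")
```

The returned bytes can be written to a file or an HTTP response as-is. When writing straight to disk, passing `encoding="utf-8-sig"` to `open()` has the same effect.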

2012-07-09 16:40:03

This is my working solution:

vbFILEOPEN = "your_utf8_file.csv"
Workbooks.OpenText Filename:=vbFILEOPEN, DataType:=xlDelimited, Semicolon:=True, Local:=True, Origin:=65001

The key is Origin:=65001

2013-10-18 09:49:47

A working solution for Office 365

Save the file in UTF-16 (plain UTF-16, not the LE or BE variant) and use \t as the separator.

Code in PHP

$header = ['číslo', 'vytvořeno', 'ěščřžýáíé'];
$fileName = 'excel365.csv';
$fp = fopen($fileName, 'w');
fputcsv($fp, $header, "\t");
fclose($fp);

$handle = fopen($fileName, "r");
$contents = fread($handle, filesize($fileName));
$contents = iconv('UTF-8', 'UTF-16', $contents);
fclose($handle);

$handle = fopen($fileName, "w");
fwrite($handle, $contents);
fclose($handle);
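The same idea (tab-separated values written as UTF-16 with a BOM) can be sketched in Python for comparison. The function name `write_utf16_tsv` is hypothetical, and that Excel opens such a file correctly is this answer's claim rather than something the sketch verifies.

```python
import csv


def write_utf16_tsv(path, rows):
    """Write `rows` as a tab-separated file encoded in UTF-16 with a BOM."""
    # The "utf-16" codec writes a BOM first and then uses the platform's
    # native byte order; newline="" lets the csv module control line endings.
    with open(path, "w", encoding="utf-16", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)
```

For example, `write_utf16_tsv("excel365.csv", [["číslo", "vytvořeno", "ěščřžýáíé"]])` writes the header row from the PHP snippet in a single pass, without re-reading and converting the file afterwards.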

2020-04-23 06:06:38

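I tried everything I could find in this thread and similar ones, and nothing worked completely. However, importing the file into Google Sheets and simply downloading it as CSV worked like a charm. Try it if you reach my point of frustration.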

2018-12-20 20:06:50
