I'm developing a part of an application that's responsible for exporting some data into CSV files. The application always uses UTF-8 because of its multilingual nature at all levels. But opening such CSV files (containing e.g. diacritics, cyrillic letters, Greek letters) in Excel does not achieve the expected results showing something like Г„/Г¤, Г–/Г¶. And I don't know how to force Excel understand that the open CSV file is encoded in UTF-8. I also tried specifying UTF-8 BOM EF BB BF, but Excel ignores that.

有什么解决办法吗?

附注:哪些工具可能像Excel一样?


更新

I have to say that I've confused the community with the formulation of the question. When I was asking this question, I asked for a way of opening a UTF-8 CSV file in Excel without any problems for a user, in a fluent and transparent way. However, I used a wrong formulation asking for doing it automatically. That is very confusing and it clashes with VBA macro automation. There are two answers for this questions that I appreciate the most: the very first answer by Alex https://stackoverflow.com/a/6002338/166589, and I've accepted this answer; and the second one by Mark https://stackoverflow.com/a/6488070/166589 that have appeared a little later. From the usability point of view, Excel seemed to have lack of a good user-friendly UTF-8 CSV support, so I consider both answers are correct, and I have accepted Alex's answer first because it really stated that Excel was not able to do that transparently. That is what I confused with automatically here. Mark's answer promotes a more complicated way for more advanced users to achieve the expected result. Both answers are great, but Alex's one fits my not clearly specified question a little better.


更新2

在最后一次编辑5个月后,我注意到Alex的答案不知为何消失了。我真的希望这不是一个技术问题,我希望现在不再有关于哪个答案更好的讨论。所以我认为马克的答案是最好的。


当前回答

Alex是正确的,但是由于你必须导出到csv,你可以在打开csv文件时给用户这样的建议:

另存为csv格式 打开Excel 使用“data”导入数据——>导入外部数据——>导入数据 选择文件类型“csv”并浏览到您的文件 在导入向导中将File_Origin更改为“65001 UTF”(或选择正确的语言字符标识符) 将分隔符更改为逗号 选择要导入的位置并完成

这样特殊字符才能正确显示。

其他回答

php生成的CSV文件也有同样的问题。 当分隔符在内容开头通过“sep=,\n”定义时(当然是在BOM之后),Excel会忽略BOM。

因此,在内容的开头添加一个BOM ("\xEF\xBB\xBF"),并通过fputcsv($fh, $data_array, ";")设置分号作为分隔符;很管用。

简单的vba宏用于打开utf-8文本和csv文件

Sub OpenTextFile()

   filetoopen = Application.GetOpenFilename("Text Files (*.txt;*.csv), *.txt;*.csv")
   If filetoopen = Null Or filetoopen = Empty Then Exit Sub

   Workbooks.OpenText Filename:=filetoopen, _
   Origin:=65001, DataType:=xlDelimited, Comma:=True

End Sub

原点:=65001为UTF-8。 逗号:对于按列分布的.csv文件为True

保存在个人。XLSB使它始终可用。 个性化excel工具栏添加一个宏调用按钮,并从那里打开文件。 您可以添加更多的格式到宏,如列自动拟合,对齐等。

嗨,我正在使用ruby on rails生成CSV。在我们的应用程序中,我们计划使用多语言(I18n),但在windows excel的CSV文件中查看I18n内容时遇到了一个问题。

Linux (Ubuntu)和mac都没问题。

我们发现windows excel需要重新导入数据才能查看实际数据。在导入时,我们将获得更多选择字符集的选项。

但这不能教育每一个用户,所以我们寻找的解决方案是只需双击打开。

然后利用aghuddleston gist确定了在windows excel中以open模式显示数据和bom格式显示数据的方法。在引用时添加。

示例I18n内容

在Mac和Linux中

瑞典语:Förnamn 中文:名字

在Windows中

瑞典语:Förnamn 中文:名字

def user_information_report(report_file_path, user_id)
    user = User.find(user_id)
    I18n.locale = user.current_lang
    open_mode = "w+:UTF-16LE:UTF-8"
    bom = "\xEF\xBB\xBF"
    body user, open_mode, bom
  end

def headers
    headers = [
        "ID", "SDN ID",
        I18n.t('sys_first_name'), I18n.t('sys_last_name'), I18n.t('sys_dob'),
        I18n.t('sys_gender'), I18n.t('sys_email'), I18n.t('sys_address'),
        I18n.t('sys_city'), I18n.t('sys_state'), I18n.t('sys_zip'),
        I18n.t('sys_phone_number')
    ]
  end

def body tenant, open_mode, bom
    File.open(report_file_path, open_mode) do |f|
      csv_file = CSV.generate(col_sep: "\t") do |csv|
        csv << headers
        tenant.patients.find_each(batch_size: 10) do |patient|
          csv <<  [
              patient.id, patient.patientid,
              patient.first_name, patient.last_name, "#{patient.dob}",
              "#{translate_gender(patient.gender)}", patient.email, "#{patient.address_1.to_s} #{patient.address_2.to_s}",
              "#{patient.city}", "#{patient.state}",  "#{patient.zip}",
              "#{patient.phone_number}"
          ]
        end
      end
      f.write bom
      f.write(csv_file)
    end
  end

这里需要注意的重要事项是open mode和bom

open_mode = "w+:UTF-16LE:UTF-8"

好= "\xEF\xBB\xBF"

在写入CSV之前插入BOM

f.write好

f.write (csv_file)

Windows和Mac

双击即可直接打开文件。

Linux (ubuntu)

当打开一个文件时,询问分隔符选项->选择“TAB”

正如我在http://thinkinginsoftware.blogspot.com/2017/12/correctly-generate-csv-that-excel-can.html:上发表的

告诉负责生成CSV的软件开发人员纠正它。作为一个快速的解决方法,你可以使用gsed在字符串的开头插入UTF-8 BOM:

gsed -i '1s/^\(\xef\xbb\xbf\)\?/\xef\xbb\xbf/' file.csv

如果UTF-4 BOM不存在,该命令将插入。因此这是一个幂等命令。现在您应该能够双击该文件并在Excel中打开它。

我们使用了以下方法:

转换CSV到UTF-16 LE 在文件开头插入BOM 使用制表符作为字段分隔符