I'm developing a part of an application that's responsible for exporting some data into CSV files. The application always uses UTF-8 because of its multilingual nature at all levels. But opening such CSV files (containing e.g. diacritics, Cyrillic letters, Greek letters) in Excel does not give the expected results: it shows something like Г„/Г¤, Г–/Г¶. I don't know how to force Excel to understand that the opened CSV file is encoded in UTF-8. I also tried adding the UTF-8 BOM (EF BB BF), but Excel ignores it.

Is there any solution?

P.S. Which tools might behave the way Excel does?
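The garbled characters quoted in the question are consistent with UTF-8 bytes being decoded as Windows-1251. That code page is an assumption here, since the question does not say which one Excel used. A minimal Ruby sketch that reproduces the effect (`misread_as_cp1251` is a hypothetical helper name):

```ruby
# Hypothetical helper: takes UTF-8 text and decodes its raw bytes as if they
# were Windows-1251, which is what a reader that ignores UTF-8 would do.
def misread_as_cp1251(text)
  raw = text.encode("UTF-8").b                    # binary copy of the UTF-8 bytes
  raw.force_encoding("Windows-1251").encode("UTF-8")
end

puts misread_as_cp1251("Ä/ä, Ö/ö")   # prints: Г„/Г¤, Г–/Г¶
```

Each two-byte UTF-8 sequence turns into two unrelated single-byte characters, which is why every accented letter shows up as a pair.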


UPDATE

I have to say that I confused the community with the wording of the question. When I asked it, I was looking for a way to open a UTF-8 CSV file in Excel without any problems for the user, in a fluent and transparent way. However, I worded it badly by asking for this to happen automatically. That is very confusing, and it clashes with VBA macro automation. There are two answers to this question that I appreciate the most: the very first answer by Alex https://stackoverflow.com/a/6002338/166589, which I accepted; and the second one by Mark https://stackoverflow.com/a/6488070/166589, which appeared a little later. From a usability point of view, Excel seemed to lack good user-friendly UTF-8 CSV support, so I consider both answers correct. I accepted Alex's answer first because it plainly stated that Excel was not able to do this transparently; that is what I confused with "automatically" here. Mark's answer promotes a more complicated way for more advanced users to achieve the expected result. Both answers are great, but Alex's fits my not clearly specified question a little better.


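UPDATE 2

Five months after my last edit, I noticed that Alex's answer has disappeared for some reason. I really hope it wasn't a technical issue, and I hope there is no further discussion about which answer is better. So I am accepting Mark's answer as the best one.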


Current answer

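Hi, I'm using Ruby on Rails to generate CSV. Our application is planned to be multilingual (I18n), and we ran into a problem when viewing I18n content in CSV files with Excel on Windows.

It works fine on Linux (Ubuntu) and Mac.

We found that Windows Excel needs the data to be re-imported before it displays the actual content. During import, we get additional options for choosing the character set.

But we can't teach every user to do this, so we were looking for a solution that works by simply double-clicking the file.

With the help of aghuddleston's gist, we then found a way to make Windows Excel display the data correctly by combining a file open mode with a BOM. The gist is added as a reference.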

Sample I18n content

On Mac and Linux

Swedish: Förnamn Chinese: 名字

On Windows

Swedish: Förnamn Chinese: 名字

require 'csv'

def user_information_report(report_file_path, user_id)
  user = User.find(user_id)
  I18n.locale = user.current_lang
  # External encoding UTF-16LE, internal encoding UTF-8:
  # everything written to the file is transcoded to UTF-16LE.
  open_mode = "w+:UTF-16LE:UTF-8"
  # U+FEFF given as its UTF-8 bytes; it is transcoded like the rest.
  bom = "\xEF\xBB\xBF"
  body report_file_path, user, open_mode, bom
end

def headers
  [
    "ID", "SDN ID",
    I18n.t('sys_first_name'), I18n.t('sys_last_name'), I18n.t('sys_dob'),
    I18n.t('sys_gender'), I18n.t('sys_email'), I18n.t('sys_address'),
    I18n.t('sys_city'), I18n.t('sys_state'), I18n.t('sys_zip'),
    I18n.t('sys_phone_number')
  ]
end

# The report path is passed in explicitly so it is in scope here.
def body(report_file_path, tenant, open_mode, bom)
  File.open(report_file_path, open_mode) do |f|
    # Build the whole report as tab-separated text in memory.
    csv_file = CSV.generate(col_sep: "\t") do |csv|
      csv << headers
      tenant.patients.find_each(batch_size: 10) do |patient|
        csv << [
          patient.id, patient.patientid,
          patient.first_name, patient.last_name, "#{patient.dob}",
          "#{translate_gender(patient.gender)}", patient.email, "#{patient.address_1} #{patient.address_2}",
          "#{patient.city}", "#{patient.state}", "#{patient.zip}",
          "#{patient.phone_number}"
        ]
      end
    end
    f.write bom        # the BOM must come before any CSV data
    f.write(csv_file)
  end
end

The important things to note here are the open mode and the BOM:

open_mode = "w+:UTF-16LE:UTF-8"

bom = "\xEF\xBB\xBF"

Insert the BOM before writing the CSV:

f.write bom

f.write(csv_file)
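To see what this open mode and BOM actually put on disk, here is a minimal standalone sketch without Rails. The helper name `write_excel_friendly_csv`, the temp file name, and the sample rows are made up for illustration; only the open mode, the BOM literal, and the tab separator come from the answer.

```ruby
require "csv"
require "tmpdir"

# Hypothetical helper (not from the answer): writes rows as a tab-separated
# UTF-16LE file that starts with a byte order mark.
def write_excel_friendly_csv(path, rows)
  # Build the CSV text in memory as UTF-8 first.
  csv_text = CSV.generate(col_sep: "\t") do |csv|
    rows.each { |row| csv << row }
  end

  # External encoding UTF-16LE, internal UTF-8: Ruby transcodes on write.
  File.open(path, "w+:UTF-16LE:UTF-8") do |f|
    f.write "\xEF\xBB\xBF"   # U+FEFF as UTF-8 bytes, transcoded like any text
    f.write csv_text
  end
end

path = File.join(Dir.tmpdir, "excel_friendly_demo.csv")
write_excel_friendly_csv(path, [["ID", "Förnamn"], [1, "Åsa"]])

# Inspect the first two bytes on disk.
puts File.binread(path, 2).bytes.map { |b| format("%02X", b) }.join(" ")   # prints: FF FE
```

Although the BOM is written as the UTF-8 bytes EF BB BF, the file ends up starting with FF FE, the UTF-16LE byte order mark, because the open mode transcodes everything that is written.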

Windows and Mac

The file can be opened directly by double-clicking.

Linux (Ubuntu)

When opening the file, you are asked for the separator options -> choose "TAB"

Other answers

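Just to help users who land on this thread and are interested in opening the file in Excel.

I used the wizard below and it worked fine for me when importing a UTF-8 file. It is not transparent, but it is very useful if you already have the file.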

1. Open Microsoft Excel 2007.
2. Click on the Data menu bar option.
3. Click on the From Text icon.
4. Navigate to the location of the file that you want to import. Click on the filename and then click on the Import button. The Text Import Wizard - Step 1 of 3 window will now appear on the screen.
5. Choose the file type that best describes your data - Delimited or Fixed Width.
6. Choose 65001: Unicode (UTF-8) from the drop-down list that appears next to File origin.
7. Click on the Next button to display the Text Import Wizard - Step 2 of 3 window.
8. Place a checkmark next to the delimiter that was used in the file you wish to import into Microsoft Excel 2007. The Data preview window will show you how your data will appear based on the delimiter that you chose.
9. Click on the Next button to display the Text Import Wizard - Step 3 of 3.
10. Choose the appropriate data format for each column of data that you want to import. You also have the option to not import one or more columns of data if you want.
11. Click on the Finish button to finish importing your data into Microsoft Excel 2007.

Source: https://www.itg.ias.edu/content/how-import-csv-file-uses-utf-8-character-encoding-0

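A UTF-8 byte order mark will clue Excel 2007+ in to the fact that you're using UTF-8. (See this SO post.)

In case anybody runs into the same issue I did: .NET's UTF8 encoding class does not output a byte order mark in a GetBytes() call. You need to use streams (or a workaround) to get the BOM written out.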
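The same pitfall exists outside .NET: encoding a string on its own does not produce a BOM, so it has to be written explicitly. A minimal sketch in Ruby (not .NET, to match the Ruby code elsewhere in this thread); `write_utf8_csv_with_bom` and the sample data are hypothetical:

```ruby
require "csv"
require "tmpdir"

UTF8_BOM = "\uFEFF"   # encodes to the bytes EF BB BF in UTF-8

# Hypothetical helper: writes rows as UTF-8 CSV with the BOM as the first bytes.
def write_utf8_csv_with_bom(path, rows)
  File.open(path, "w:UTF-8") do |f|
    f.write UTF8_BOM
    f.write CSV.generate { |csv| rows.each { |row| csv << row } }
  end
end

# Encoding a string by itself never adds a BOM: just the two bytes of "Ä".
plain = "Ä".encode("UTF-8").bytes

path = File.join(Dir.tmpdir, "utf8_bom_demo.csv")
write_utf8_csv_with_bom(path, [["Name"], ["Ärger"]])
puts File.binread(path, 3).bytes.map { |b| format("%02X", b) }.join(" ")   # prints: EF BB BF
```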

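First save the Excel spreadsheet as Unicode text. Then open the TXT file in Internet Explorer, click "Save as" with TXT encoding, and choose the appropriate encoding, e.g. Win Cyrillic 1251.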
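The re-encoding step can also be scripted instead of going through a browser. A minimal Ruby sketch, assuming the "Unicode text" export is UTF-16LE with a leading BOM; `unicode_text_to_cp1251` is a hypothetical helper name:

```ruby
# Hypothetical helper: converts the raw bytes of a "Unicode text" export
# (assumed to be UTF-16LE with a BOM) into a Windows-1251 encoded string.
def unicode_text_to_cp1251(raw_bytes)
  text = raw_bytes.dup.force_encoding("UTF-16LE").encode("UTF-8")
  text = text.delete_prefix("\uFEFF")   # drop the BOM if present
  # Raises Encoding::UndefinedConversionError for characters that
  # Windows-1251 cannot represent.
  text.encode("Windows-1251")
end

# Sample input built in memory instead of reading a real export.
sample = "\uFEFFПривет\tмир\n".encode("UTF-16LE")
converted = unicode_text_to_cp1251(sample.b)
puts converted.bytes.length   # prints: 11 (one byte per character)
```

For a real file, the raw bytes would come from `File.binread` and the result would go to `File.binwrite`.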

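This doesn't exactly address the question, but since I stumbled across it and the solutions above either didn't work for me or had requirements I couldn't meet, here is another way to add the BOM when you have access to vim: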

vim -e -s +"set bomb|set encoding=utf-8|wq" filename.csv
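Where vim is not available, the same effect can be had with a few lines of script. A minimal Ruby sketch, assuming the file is already UTF-8; `add_utf8_bom` and the demo file name are hypothetical:

```ruby
require "tmpdir"

BOM_BYTES = "\xEF\xBB\xBF".b   # UTF-8 byte order mark as raw bytes

# Hypothetical helper: prepends a UTF-8 BOM to the file unless it already
# starts with one. Returns true when the file was modified.
def add_utf8_bom(path)
  data = File.binread(path)
  return false if data.start_with?(BOM_BYTES)
  File.binwrite(path, BOM_BYTES + data)
  true
end

# Demo on a throwaway file.
demo_path = File.join(Dir.tmpdir, "add_bom_demo.csv")
File.binwrite(demo_path, "id,name\n1,Ärger\n")
add_utf8_bom(demo_path)
```

Calling it a second time on the same file leaves the file unchanged, so it is safe to run repeatedly.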

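A truly amazing list of answers, but since one pretty good option is still missing, I'll mention it here: open the CSV file with Google Sheets and save it back to your local computer as an Excel file.

In contrast to Microsoft, Google has managed to support UTF-8 CSV files, so opening the file there just works. The export to Excel format also just works. So even though this may not be the preferred solution for everyone, it is pretty fail-safe, and the number of clicks is not as high as it may sound, especially when you are already logged in to Google.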