Excel to CSV with UTF-8 encoding

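I have an Excel file containing some Spanish characters (tildes, etc.) that I need to convert to a CSV file to use as an import file. However, when I use Save As CSV, it mangles the "special" Spanish characters that are not ASCII. It also seems to do this to the left and right quotation marks and long dashes, which appear to come from the original user creating the Excel file on a Mac.

Since CSV is just a text file, I am sure it can handle UTF-8 encoding, so I am guessing this is an Excel limitation. I am looking for a way to get from Excel to CSV and keep the non-ASCII characters intact.

Current answer

The only "easy way" of doing this is as follows. First, realize that there is a difference between what is displayed and what is kept hidden in the Excel .csv file.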

1. Open the Excel file that holds the data (.xls, .xlsx).
2. In Excel, choose "CSV (Comma Delimited) (*.csv)" as the file type and save as that type.
3. Open the saved .csv file in Notepad (found under "Programs" and then "Accessories" in the Start menu).
4. Choose File -> Save As... At the bottom of the "Save As" box there is a select box labelled "Encoding". Select UTF-8 (do NOT use ANSI or you lose all accents etc.).
5. After selecting UTF-8, save the file under a slightly different file name from the original.

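The resulting file is in UTF-8 format, retains all characters and accents, and can be imported into, for example, MySQL and other database programs.

This answer comes from this forum.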
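The Notepad step above amounts to decoding the file from the Windows "ANSI" code page and writing it back out as UTF-8. A minimal Python sketch of that idea follows. The function name `reencode_to_utf8` is hypothetical, and it assumes the CSV that Excel saved is in Windows-1252 (`cp1252`), which is the usual ANSI code page on Western European and US systems but may differ on yours.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-save a text file as UTF-8 (hypothetical helper).

    source_encoding is an assumption: cp1252 is a common "ANSI" code page,
    but the file Excel wrote may use a different one on other systems.
    """
    # Read raw bytes and decode explicitly so no platform default is involved.
    text = Path(src).read_bytes().decode(source_encoding)
    # Write plain UTF-8 without a byte-order mark; line endings are untouched.
    Path(dst).write_bytes(text.encode("utf-8"))
```

Calling `reencode_to_utf8("export.csv", "export-utf8.csv")` would write the converted copy under a new name, as the last step recommends. Characters that Excel already replaced with "?" when it saved the CSV cannot be recovered this way.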

2015-01-27 21:05:03

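Other answers

A second option, in addition to the one from "nevets1219", is to open the CSV file in Notepad++ and convert it to ANSI.

In the top menu, choose: Encoding -> Convert to ANSI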

2011-02-16 18:57:40

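I had the same problem and found this post through Google. None of the above worked for me. In the end I converted my Unicode .xls to .xml (choose Save As... XML Spreadsheet 2003), which produced the correct characters. I then wrote code to parse the XML and extract the content for my own use.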
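The answer does not show its parsing code. As one possible illustration, here is a rough Python sketch that reads the first worksheet of an "XML Spreadsheet 2003" file and writes it out as UTF-8 CSV. The function name `spreadsheetml_to_csv` is hypothetical. The sketch assumes the standard SpreadsheetML namespace and only handles the `ss:Index` attribute on cells; skipped rows, merged cells and additional worksheets are ignored.

```python
import csv
import xml.etree.ElementTree as ET

# Namespace used by Excel's "XML Spreadsheet 2003" format.
SS = "urn:schemas-microsoft-com:office:spreadsheet"


def spreadsheetml_to_csv(src, dst):
    """Write the first worksheet of a SpreadsheetML file as UTF-8 CSV (sketch)."""
    root = ET.parse(src).getroot()
    table = root.find(f"{{{SS}}}Worksheet/{{{SS}}}Table")
    if table is None:
        raise ValueError("no Worksheet/Table element found")
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        writer = csv.writer(fout)
        for row in table.iter(f"{{{SS}}}Row"):
            values = []
            for cell in row.iter(f"{{{SS}}}Cell"):
                # ss:Index is the 1-based column of this cell; Excel sets it
                # when empty cells before it were left out of the file.
                index = cell.get(f"{{{SS}}}Index")
                if index is not None:
                    values.extend([""] * (int(index) - 1 - len(values)))
                data = cell.find(f"{{{SS}}}Data")
                values.append("".join(data.itertext()) if data is not None else "")
            writer.writerow(values)
```

Because the XML parser decodes the file according to its own XML declaration, the accented characters arrive as proper Unicode strings and are only encoded once, on output.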

2015-09-01 15:57:16

Another option I have found useful: "Numbers" (Apple's spreadsheet application) allows you to choose the encoding when saving as CSV.

2011-04-04 08:30:15

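1. Save the Excel sheet as "Unicode Text (.txt)". The good news is that all the international characters are in UTF-16 (note, not UTF-8). However, the new "*.txt" file is TAB delimited, not comma delimited, and is therefore not a true CSV.
2. (Optional) Unless you can use a TAB-delimited file for the import, use your favorite text editor to replace the TAB characters with commas ",".
3. Import the *.txt file into the target application. Make sure it can accept UTF-16.

If UTF-16 has been implemented properly, with support for non-BMP code points, you can convert a UTF-16 file to UTF-8 without losing information. I leave it to you to find your favorite method of doing so.

I use this procedure to import data from Excel into Moodle.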
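One way to do both the TAB-to-comma step and the UTF-16 to UTF-8 conversion in a single pass is sketched below in Python. The function name `unicode_text_to_csv` is hypothetical, and the sketch assumes the input is Excel's "Unicode Text" export, that is, UTF-16 with a byte-order mark and TAB-separated fields.

```python
import csv


def unicode_text_to_csv(src, dst):
    """Convert a UTF-16, TAB-delimited text file to a UTF-8 CSV (sketch)."""
    # The "utf-16" codec uses the byte-order mark to detect endianness.
    # newline="" leaves line-ending handling to the csv module.
    with open(src, encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        reader = csv.reader(fin, delimiter="\t")
        writer = csv.writer(fout)
        for row in reader:
            # csv.writer quotes any field that itself contains a comma.
            writer.writerow(row)
```

Going through a CSV writer rather than a plain search-and-replace matters when a cell already contains a comma: a blind replacement of TABs would split that cell into two columns.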

2013-03-19 12:51:59

I needed to automate this process on my Mac. I originally tried using catdoc/xls2csv as suggested by mpowered, but xls2csv had trouble detecting the original encoding of the document and not all documents were the same. What I ended up doing was setting the default webpage output encoding to be UTF-8 and then providing the files to Apple's Automator, applying the Convert Format of Excel Files action to convert to Web Page (HTML). Then using PHP, DOMDocument and XPath, I queried the documents and formatted them to CSV.

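Here is the PHP script (process.php):

<?php
// Usage: php process.php <file.htm>
// Writes <file>.csv next to the input HTML file.
$pi = pathinfo($argv[1]);
$file = $pi['dirname'] . '/' . $pi['filename'] . '.csv';
$fp = fopen($file, 'w+');
if ($fp === false) {
    fwrite(STDERR, "Cannot open $file for writing\n");
    exit(1);
}
$doc = new DOMDocument;
$doc->loadHTMLFile($argv[1]);
$xpath = new DOMXPath($doc);
// One CSV line per table row. Only <td> cells are read, so <th> cells are skipped.
foreach ($xpath->query('//tr') as $row) {
    $_r = [];
    foreach ($xpath->query('td', $row) as $col) {
        $_r[] = trim($col->textContent);
    }
    fputcsv($fp, $_r);
}
fclose($fp);
?>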

Here is the shell command I used to convert the HTML documents to CSV:

find . -name '*.htm' | xargs -I{} php ./process.php {}

This is a very, very roundabout method, but it is the most reliable one I have found.

2016-06-07 18:46:01
