What is the ANSI encoding format? Is it the system default format? How does it differ from ASCII?


Current answer

ANSI encoding is a slightly generic term used to refer to the standard code page on a system, usually Windows. It is more properly referred to as Windows-1252 on Western/U.S. systems. (It can represent certain other Windows code pages on other systems.) This is essentially an extension of the ASCII character set in that it includes all the ASCII characters with an additional 128 character codes. This difference is due to the fact that "ANSI" encoding is 8-bit rather than 7-bit as ASCII is (ASCII is almost always encoded nowadays as 8-bit bytes with the MSB set to 0). See the article for an explanation of why this encoding is usually referred to as ANSI.

The name "ANSI" is a misnomer, since it does not correspond to any actual ANSI standard, but the name has stuck. ANSI is not the same as UTF-8.
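To make the ASCII / "ANSI" / UTF-8 distinction concrete, here is a minimal sketch in Python (my choice of language for illustration; the answer itself does not use it). It assumes "ANSI" means Windows-1252, which Python exposes as the `cp1252` codec, and the sample strings are made up for the example.

```python
# Compare how the same text is encoded as ASCII, Windows-1252 ("ANSI" on
# Western Windows systems, an assumption here) and UTF-8.
plain = "ANSI"          # only ASCII characters
accented = "caf\u00e9"  # "cafe" with e-acute: U+00E9 lies outside ASCII

# Pure-ASCII text yields identical bytes in all three encodings.
ascii_bytes = plain.encode("ascii")
same_everywhere = (
    ascii_bytes == plain.encode("cp1252") == plain.encode("utf-8")
)

# Outside ASCII the encodings diverge: one byte in cp1252, two in UTF-8.
cp1252_bytes = accented.encode("cp1252")
utf8_bytes = accented.encode("utf-8")

# ASCII itself cannot represent the character at all.
try:
    accented.encode("ascii")
    ascii_can_encode = True
except UnicodeEncodeError:
    ascii_can_encode = False

print(same_everywhere)        # True
print(cp1252_bytes.hex(" "))  # 63 61 66 e9
print(utf8_bytes.hex(" "))    # 63 61 66 c3 a9
print(ascii_can_encode)       # False
```

So a file containing only ASCII characters is byte-for-byte the same in all three encodings, and the differences only show up once characters above code 127 appear.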

Other answers

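Basically, "ANSI" refers to the legacy code page on Windows. See Raymond Chen's article on this topic:

This is because Windows code page 1252 was originally based on an ANSI draft, which later became ISO standard 8859-1.

In most code pages, the first 128 characters (codes 0-127) are the same as ASCII, but the characters above them differ.

However, ANSI does not automatically mean CP1252 or Latin-1.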
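A quick way to see both points is to compare a few Windows code pages with Python's built-in codecs. This is only a sketch: the three code pages and the sample byte 0xE9 are arbitrary picks for illustration, not anything the linked article prescribes.

```python
# Compare several Windows "ANSI" code pages using Python's built-in codecs.
code_pages = ["cp1251", "cp1252", "cp1253"]  # Cyrillic, Western, Greek

lower_half = bytes(range(128))               # byte values 0x00-0x7F
as_ascii = lower_half.decode("ascii")

# Each code page listed here decodes the lower half exactly like ASCII.
lower_half_matches = {
    cp: lower_half.decode(cp) == as_ascii for cp in code_pages
}

# The same upper-half byte maps to a different character in each of them.
upper_byte = b"\xe9"
code_points = {
    cp: "U+%04X" % ord(upper_byte.decode(cp)) for cp in code_pages
}

print(lower_half_matches)
# {'cp1251': True, 'cp1252': True, 'cp1253': True}
print(code_points)
# {'cp1251': 'U+0439', 'cp1252': 'U+00E9', 'cp1253': 'U+03B9'}
```

The byte 0xE9 is a Cyrillic letter under 1251, a Latin letter with an acute accent under 1252, and a Greek letter under 1253, which is why "ANSI" alone does not tell you what a byte means.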

All the confusion aside, these days you should simply sidestep these problems and use Unicode.

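If your computer is not a "Western" one and you do not know which code page it uses, you can look at this page: National Language Support (NLS) API Reference

[Microsoft removed this reference; retrieve it from the web archive copy of the National Language Support (NLS) API Reference]

Or you can query your registry: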

C:\>reg query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage /f ACP

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage
    ACP    REG_SZ    1252

End of search: 1 match(es) found.

C:\>
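If you would rather ask from a program than from the registry, something like the following Python sketch should work. `GetACP` is the Win32 function that reports the active ANSI code page; the helper name `active_code_page` is made up for this example, and on non-Windows systems it simply returns `None`.

```python
import locale
import sys


def active_code_page():
    """Return the Windows ANSI code page number, or None on other systems."""
    if sys.platform != "win32":
        return None
    import ctypes  # imported here because ctypes.windll exists only on Windows
    # GetACP() is the Win32 call that returns the active ANSI code page.
    return ctypes.windll.kernel32.GetACP()


acp = active_code_page()

# The locale module reports the encoding Python prefers for text data.
# On Windows this is typically the same code page written as "cpNNNN",
# unless Python's UTF-8 mode is enabled.
preferred = locale.getpreferredencoding(False)

print(acp, preferred)
```

On a machine whose registry shows ACP 1252, as in the output above, I would expect this to print `1252 cp1252`, though I have not checked it on every Windows and Python version.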

Once upon a time Microsoft, like everyone else, used 7-bit character sets, and they invented their own when it suited them, though they kept ASCII as a core subset. Then they realised the world had moved on to 8-bit encodings and that there were international standards around, such as the ISO-8859 family. In those days, if you wanted to get hold of an international standard and you lived in the US, you bought it from the American National Standards Institute, ANSI, who republished international standards with their own branding and numbers (that's because the US government wants conformance to American standards, not international standards). So Microsoft's copy of ISO-8859 said "ANSI" on the cover. And because Microsoft weren't very used to standards in those days, they didn't realise that ANSI published lots of other standards as well. So they referred to the standards in the ISO-8859 family (and the variants that they invented, because they didn't really understand standards in those days) by the name on the cover, "ANSI", and it found its way into Microsoft user documentation and hence into the user community. That was about 30 years ago, but you still sometimes hear the name today.

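When single-byte characters are used, ASCII defines the first 128 characters (codes 0-127). The extended characters 128-255 are defined by the various ANSI code pages to allow limited support for other languages. To make sense of an ANSI-encoded string, you need to know which code page it uses.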
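Here is a small Python sketch of what goes wrong when the code page is guessed incorrectly. The sample word and the choice of code pages 1251 and 1252 are assumptions made purely for illustration.

```python
# Russian "hello", written with escapes so the source stays ASCII-only.
original = "\u041f\u0440\u0438\u0432\u0435\u0442"

# Bytes as a program using code page 1251 (Cyrillic) would store them.
raw = original.encode("cp1251")

# Decoding with the right code page recovers the text.
right = raw.decode("cp1251")

# Decoding with the wrong one raises no error but yields other characters.
wrong = raw.decode("cp1252")

print(raw.hex(" "))       # cf f0 e8 e2 e5 f2
print(right == original)  # True
print(wrong == original)  # False
print(ascii(wrong))       # '\xcf\xf0\xe8\xe2\xe5\xf2'
```

The bytes themselves carry no marker saying which code page produced them, so the wrong decode fails silently and produces accented Latin letters instead of Cyrillic.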