What is the ANSI encoding format? Is it the system default? How does it differ from ASCII?


Current answer

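I remember when "ANSI" text referred to the pseudo-VT-100 escape codes that were available in DOS through the ANSI.SYS driver to alter the flow of streaming text... Probably not what you are referring to, but if it is, see http://en.wikipedia.org/wiki/ANSI_escape_code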

Other answers

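If your PC is not a "Western" one and you don't know which code page it uses, you can look at this page: National Language Support (NLS) API Reference

[Microsoft has removed this reference; it was retrieved from the web archive copy of the National Language Support (NLS) API Reference.]

Or you can query your registry: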

C:\>reg query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage /f ACP

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage
    ACP    REG_SZ    1252

End of search: 1 match(es) found.

C:\>
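The same lookup can be done from a script. This is only a minimal sketch: the function name ansi_code_page is made up for illustration, it assumes Python 3, and on non-Windows systems it falls back to the locale's preferred encoding, which is merely the closest analogue of the "ANSI" code page there.

```python
import locale
import sys


def ansi_code_page() -> str:
    """Return a codec name for the system's "ANSI" code page (hypothetical helper)."""
    if sys.platform == "win32":
        import ctypes

        # GetACP is the Win32 call that reports the active ANSI code page,
        # i.e. the same value the ACP registry entry holds.
        return "cp%d" % ctypes.windll.kernel32.GetACP()
    # Assumption: outside Windows there is no ACP, so use the locale encoding.
    return locale.getpreferredencoding(False)


if __name__ == "__main__":
    print(ansi_code_page())
```

On a US or Western European Windows installation this would typically print cp1252, matching the registry output above.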

Once upon a time Microsoft, like everyone else, used 7-bit character sets, and they invented their own when it suited them, though they kept ASCII as a core subset. Then they realised the world had moved on to 8-bit encodings and that there were international standards around, such as the ISO-8859 family. In those days, if you wanted to get hold of an international standard and you lived in the US, you bought it from the American National Standards Institute, ANSI, who republished international standards with their own branding and numbers (that's because the US government wants conformance to American standards, not international standards). So Microsoft's copy of ISO-8859 said "ANSI" on the cover. And because Microsoft weren't very used to standards in those days, they didn't realise that ANSI published lots of other standards as well. So they referred to the standards in the ISO-8859 family (and the variants that they invented, because they didn't really understand standards in those days) by the name on the cover, "ANSI", and it found its way into Microsoft user documentation and hence into the user community. That was about 30 years ago, but you still sometimes hear the name today.

ASCII just defines a 7-bit code page with 128 symbols. ANSI extends this to 8 bits, and there are several different code pages for symbols 128 to 255.

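The name ANSI is incorrect, because it is actually the ISO/IEC 8859 specification that defines these code pages. See ISO/IEC 8859. The parts are numbered from ISO/IEC 8859-1 to ISO/IEC 8859-16 (part 12 was abandoned, so 15 code pages were actually published).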

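Windows-1252 is also based on ISO/IEC 8859-1, with some modifications, mainly in the range 128 to 159 that ISO/IEC 8859-1 reserves for the C1 control set. Wikipedia mentions that Windows-1252 is also referred to as ISO-8859-1, with a second hyphen between ISO and 8859. (Unbelievable! Who does something like that?!?)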
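To see exactly where the two differ, you can compare them byte by byte. A small sketch, assuming Python 3 and its built-in "cp1252" and "latin-1" codecs (the function name differing_bytes is just for illustration):

```python
def differing_bytes():
    """List the byte values that cp1252 and ISO-8859-1 decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")  # every byte maps to the same code point
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            cp1252 = None  # a few bytes are unassigned in Python's cp1252 table
        if cp1252 != latin1:
            diffs.append(value)
    return diffs


if __name__ == "__main__":
    diffs = differing_bytes()
    print("first: 0x%02X, last: 0x%02X, count: %d" % (diffs[0], diffs[-1], len(diffs)))
    print(repr(b"\x80".decode("cp1252")), repr(b"\x80".decode("latin-1")))
```

Every difference falls in 0x80 to 0x9F: ISO-8859-1 decodes those bytes as C1 control characters, while Windows-1252 puts printable characters such as the euro sign there.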

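When using single-byte characters, ASCII defines the first 128 characters (0 to 127). The extended characters from 128 to 255 are defined by the various ANSI code pages to allow limited support for other languages. To make sense of an ANSI-encoded string, you need to know which code page it uses.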
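A quick illustration of why the code page matters, assuming Python 3 and three of its built-in Windows code page codecs (the helper name decode_with is made up for this example):

```python
def decode_with(raw, code_pages=("cp1252", "cp1251", "cp1253")):
    """Decode the same bytes under several "ANSI" code pages."""
    return {cp: raw.decode(cp) for cp in code_pages}


if __name__ == "__main__":
    raw = b"caf\xe9"  # three ASCII bytes plus one byte above 127
    for code_page, text in decode_with(raw).items():
        print(code_page, text)
```

The ASCII part "caf" comes out the same everywhere, but the final byte 0xE9 becomes a Latin, Cyrillic, or Greek letter depending on the code page you guess.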

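Technically, ANSI should be the same as US-ASCII. It refers to the ANSI X3.4 standard, which is simply the ANSI organisation's ratified version of ASCII. Use of top-bit-set characters is not defined in ASCII/ANSI, as it is a 7-bit character set.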
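As a side note, this strict sense still shows up in tooling. A sketch assuming Python 3, whose codec registry accepts the ANSI X3.4 designation as an alias for ASCII (the helper is_pure_ascii is hypothetical):

```python
import codecs

# Python resolves the ANSI X3.4 designation to its plain ASCII codec.
ansi_x34 = codecs.lookup("ANSI_X3.4-1968")


def is_pure_ascii(data: bytes) -> bool:
    """True if every byte is valid under ANSI X3.4, i.e. 7-bit ASCII."""
    try:
        data.decode("ANSI_X3.4-1968")
        return True
    except UnicodeDecodeError:
        # Top-bit-set bytes have no meaning in a 7-bit character set.
        return False


if __name__ == "__main__":
    print(ansi_x34.name)
    print(is_pure_ascii(b"hello"), is_pure_ascii(b"caf\xe9"))
```

Any byte above 127 is rejected here, which is the behaviour you would expect if "ANSI" really meant ANSI X3.4.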

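However, years of misuse of the term by the DOS and subsequently the Windows community have left its practical meaning as "the system code page for whatever machine is being used". The system code page is also sometimes known as "mbcs", since on East Asian systems it can be a multiple-byte-per-character encoding. Some code pages can even use top-bit-clear bytes as trail bytes in a multibyte sequence, so it is not even strictly compatible with plain ASCII... but even then, it is still called "ANSI".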
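The trail-byte problem is easy to reproduce. A minimal sketch, assuming Python 3 and its "cp932" codec (the Japanese Windows code page); the function name trail_bytes_in_ascii_range is only illustrative:

```python
def trail_bytes_in_ascii_range(text, encoding="cp932"):
    """Return (char, encoded bytes) pairs whose multibyte form has a trail byte below 0x80."""
    hits = []
    for ch in text:
        encoded = ch.encode(encoding)
        # Only multibyte sequences have trail bytes; check everything after the lead byte.
        if len(encoded) > 1 and any(b < 0x80 for b in encoded[1:]):
            hits.append((ch, encoded))
    return hits


if __name__ == "__main__":
    for ch, encoded in trail_bytes_in_ascii_range("aソb"):
        print(ch, encoded.hex())
```

The katakana ソ encodes as the bytes 83 5C, and 0x5C is the ASCII backslash, so code that scans such a string byte by byte for path separators or escape characters will misread it.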

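On US and Western European default settings, "ANSI" maps to Windows code page 1252. This is not the same as ISO-8859-1 (although the two are quite similar). On other machines it could be anything else. This makes "ANSI" utterly useless as an external encoding identifier.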