Consider:
Is it true that Unicode = UTF-16?
Many people say that Unicode is a standard, not an encoding, yet most editors offer a "Save as Unicode encoding" option.
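Current answer
Let's start by remembering that data is stored as bytes. Unicode is a character set in which each character maps to a code point (a unique integer), so we need something that turns those code points into bytes. That is where the UTF-8 encoding comes in. Simple!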
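The two steps described above (character to code point, then code point to bytes) can be sketched in Python. This is only an illustration: the helper names `to_code_points` and `to_utf8_bytes` are made up for this example, and the sample string is an arbitrary choice.

```python
def to_code_points(text):
    """Step 1: the Unicode character set maps each character to an integer."""
    return [ord(ch) for ch in text]


def to_utf8_bytes(text):
    """Step 2: an encoding (here UTF-8) maps those integers to bytes."""
    return text.encode("utf-8")


sample = "a\u00e9\u20ac"  # 'a', 'é' (U+00E9), '€' (U+20AC)
print(to_code_points(sample))          # [97, 233, 8364]
print(to_utf8_bytes(sample).hex(" "))  # 61 c3 a9 e2 82 ac
```

Note that the three characters take one, two, and three bytes respectively: the code point is fixed by Unicode, while the byte layout is decided by the encoding.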
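Other answers
Unicode was developed with the aim of creating a new standard that maps the characters of the vast majority of languages in use today, along with other characters that are less essential but may still be needed to create text. UTF-8 is only one of the many ways you can encode a file, because there are many ways to encode the characters in a file into Unicode.
Source:
http://www.differencebetween.net/technology/difference-between-unicode-and-utf-8/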
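UTF-16 and UTF-8 are both encodings of Unicode. They are both Unicode; neither is "more Unicode" than the other.
Don't be misled by an unfortunate historical artifact from Microsoft.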
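To make the "both are Unicode" point concrete, here is a small Python sketch. The function name `encodings_agree` and the sample text are assumptions chosen for illustration. It encodes the same string with UTF-8 and UTF-16 (little-endian, no byte order mark) and checks that both byte sequences decode back to the same code points.

```python
def encodings_agree(text):
    """Encode text two ways and confirm both decode to the same string."""
    as_utf8 = text.encode("utf-8")
    as_utf16 = text.encode("utf-16-le")  # explicit byte order, so no BOM is added
    same_text = as_utf8.decode("utf-8") == as_utf16.decode("utf-16-le") == text
    return same_text, len(as_utf8), len(as_utf16)


print(encodings_agree("h\u00e9llo"))  # (True, 6, 10)
```

The bytes on disk differ in both content and length, but the sequence of Unicode code points they represent is identical.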
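In addition to Trufa's comment: Unicode is explicitly not UTF-16. When Unicode was first being designed, it was assumed that a 16-bit integer would be enough to store any code point, but that turned out not to be the case. UTF-16 is nevertheless another valid encoding of Unicode (alongside the 8-bit and 32-bit variants), and I believe it is the encoding Microsoft uses in memory at runtime on NT-derived operating systems.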
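The claim that 16 bits are not enough can be checked directly. The sketch below (the helper `utf16_code_units` is a hypothetical name, not a standard library function) splits the UTF-16 encoding of a string into its 16-bit code units. A character above U+FFFF needs two of them, a so-called surrogate pair.

```python
def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as big-endian UTF-16."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


print([hex(u) for u in utf16_code_units("A")])           # ['0x41']
print([hex(u) for u in utf16_code_units("\U0001F600")])  # ['0xd83d', '0xde00']
print(hex(ord("\U0001F600")))                            # 0x1f600
```

U+1F600 is a single code point, yet it occupies two 16-bit units in UTF-16, which is why "one 16-bit integer per character" does not describe Unicode.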
As Rasmus puts it in his article "The difference between UTF-8 and Unicode?":
If asked the question, "What is the difference between UTF-8 and Unicode?", would you confidently reply with a short and precise answer? In these days of internationalization all developers should be able to do that. I suspect many of us do not understand these concepts as well as we should. If you feel you belong to this group, you should read this ultra short introduction to character sets and encodings.

Actually, comparing UTF-8 and Unicode is like comparing apples and oranges: UTF-8 is an encoding - Unicode is a character set.

A character set is a list of characters with unique numbers (these numbers are sometimes referred to as "code points"). For example, in the Unicode character set, the number for A is 41 [hexadecimal; 65 in decimal].

An encoding on the other hand, is an algorithm that translates a list of numbers to binary so it can be stored on disk. For example UTF-8 would translate the number sequence 1, 2, 3, 4 like this:

00000001 00000010 00000011 00000100

Our data is now translated into binary and can now be saved to disk.

All together now

Say an application reads the following from the disk:

1101000 1100101 1101100 1101100 1101111

The app knows this data represent a Unicode string encoded with UTF-8 and must show this as text to the user. First step, is to convert the binary data to numbers. The app uses the UTF-8 algorithm to decode the data. In this case, the decoder returns this:

104 101 108 108 111

Since the app knows this is a Unicode string, it can assume each number represents a character. We use the Unicode character set to translate each number to a corresponding character. The resulting string is "hello".

Conclusion

So when somebody asks you "What is the difference between UTF-8 and Unicode?", you can now confidently answer short and precise: UTF-8 (Unicode Transformation Format) and Unicode cannot be compared. UTF-8 is an encoding used to translate numbers into binary data. Unicode is a character set used to translate characters into numbers.
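The article's worked example can be reproduced in a few lines of Python. This is a sketch, and the helper names `decode_bits` and `encode_to_bits` are invented for illustration. It also shows that the "41" quoted for A is a hexadecimal value.

```python
def decode_bits(bit_groups, encoding="utf-8"):
    """Turn space-separated binary groups into bytes, then decode them to text."""
    data = bytes(int(group, 2) for group in bit_groups.split())
    return data.decode(encoding)


def encode_to_bits(code_points, encoding="utf-8"):
    """Map code points to characters, encode them, and show each byte in binary."""
    text = "".join(chr(n) for n in code_points)
    return " ".join(f"{byte:08b}" for byte in text.encode(encoding))


print(decode_bits("1101000 1100101 1101100 1101100 1101111"))  # hello
print(encode_to_bits([1, 2, 3, 4]))  # 00000001 00000010 00000011 00000100
print(ord("A"), hex(ord("A")))       # 65 0x41
```

The one-byte-per-number pattern in these examples holds only because every code point involved is below 128. Beyond that range UTF-8 uses two to four bytes per code point.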