






From Mark Byers's example:




72 101 108 108 111 10 119 111 114 108 100 33


83 71 86 115 98 71 56 75 100 50 57 121 98 71 81 104
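A minimal Python sketch of this example, assuming the twelve values in the first list are the ASCII bytes of `Hello`, a line feed (10), and `world!`. It uses the standard-library `base64` module; the variable names are only illustrative.

```python
import base64

# The twelve byte values from the example: "Hello", LF (10), "world!"
raw = bytes([72, 101, 108, 108, 111, 10, 119, 111, 114, 108, 100, 33])

encoded = base64.b64encode(raw)      # contains only Base64 alphabet characters
decoded = base64.b64decode(encoded)  # the receiver reverses the step

print(raw)             # b'Hello\nworld!'
print(encoded)         # b'SGVsbG8Kd29ybGQh'
print(list(encoded))   # the ASCII codes that actually travel over the channel
print(10 in encoded)   # False: the fragile line-feed byte is gone
print(decoded == raw)  # True
```

Twelve input bytes are a multiple of three, so the encoded form is sixteen characters long and needs no `=` padding.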




Base64 is a binary-to-text encoding scheme with 75% efficiency. It is used so that typical binary data (such as images) can be sent safely over legacy channels that are not 8-bit clean. In early email networks (until the early 1990s), most messages were plain text in the 7-bit US-ASCII character set, so many early communication protocol standards were designed to work over 7-bit links. Scheme efficiency is the ratio of the number of bits in the input to the number of bits in the encoded output. Hexadecimal (Base16) is another binary-to-text encoding scheme, with 50% efficiency.
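The two efficiency figures can be checked with a short Python sketch. It assumes each encoded character occupies one 8-bit byte on the wire, and it uses a payload whose length is a multiple of three so that Base64 padding does not skew the ratio. The payload itself is arbitrary.

```python
import base64
import binascii

# 768 arbitrary bytes; a multiple of 3, so Base64 adds no '=' padding
payload = bytes(range(256)) * 3

b64 = base64.b64encode(payload)
b16 = binascii.hexlify(payload)

# Efficiency = input size / encoded output size
print(len(payload) / len(b64))  # 0.75
print(len(payload) / len(b16))  # 0.5
```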


Binary data is arranged in consecutive chunks of 24 bits (3 bytes) each. Each 24-bit chunk is split into four groups of 6 bits, and each 6-bit group is converted to its corresponding Base64 character; that is, Base64 encoding converts three octets into four encoded characters. The ratio of output bytes to input bytes is 4:3 (33% overhead). Interestingly, the same character is encoded differently depending on its position within the three-octet group that produces the four characters. The receiver reverses this process to recover the original message.
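The chunking step can be sketched by hand in Python. This is an illustration for exactly one three-octet group, not a full encoder: it ignores padding, and `encode_group` is a hypothetical helper name. The alphabet is the standard one (`A-Z`, `a-z`, `0-9`, `+`, `/`).

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three octets into four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three octets")
    chunk = int.from_bytes(three_bytes, "big")  # one 24-bit integer
    # Split into four 6-bit groups, most significant group first.
    indices = [(chunk >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)

print(encode_group(b"Hel"))  # SGVs
print(encode_group(b"aaa"))  # YWFh
print(encode_group(b"aaa") == base64.b64encode(b"aaa").decode("ascii"))  # True
```

The second call shows the position dependence: the same input letter `a` appears three times, yet the four output characters all differ, because each 6-bit group straddles the octet boundaries differently.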








Base encoding of data is used in many situations to store or transfer data in environments that, perhaps for legacy reasons, are restricted to US-ASCII [1] data. Base encoding can also be used in new applications that do not have legacy restrictions, simply because it makes it possible to manipulate objects with text editors. In the past, different applications have had different requirements and thus sometimes implemented base encodings in slightly different ways. Today, protocol specifications sometimes use base encodings in general, and "base64" in particular, without a precise description or reference. Multipurpose Internet Mail Extensions (MIME) [4] is often used as a reference for base64 without considering the consequences for line-wrapping or non-alphabet characters. The purpose of this specification is to establish common alphabet and encoding considerations. This will hopefully reduce ambiguity in other documents, leading to better interoperability.


What does "media that are designed to deal with textual data" mean?


They can deal with binary => they can deal with anything.


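The bytes 0x0A and 0x0D, which are used for line endings and differ by platform. Other control characters, such as 0x00 (NULL = C string terminator), 0x03 (END OF TEXT), 0x04 (END OF TRANSMISSION), or 0x1A (DOS end-of-file), which may prematurely signal the end of the data. Bytes above 0x7F (if the protocol was designed for ASCII). Byte sequences that are invalid UTF-8.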
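A short Python sketch of why Base64 sidesteps these bytes. The sample input is an assumption chosen for illustration: the control bytes named above plus 0xFF as one byte above 0x7F. The check confirms that the encoded form contains only printable ASCII characters from the Base64 alphabet and the `=` padding character.

```python
import base64

# Bytes named above as troublesome, plus 0xFF as an example above 0x7F
troublesome = bytes([0x0A, 0x0D, 0x00, 0x03, 0x04, 0x1A, 0xFF])

encoded = base64.b64encode(troublesome)
print(encoded)

# Every output byte belongs to the Base64 alphabet or is the '=' padding
safe = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=")
print(all(b in safe for b in encoded))           # True
print(base64.b64decode(encoded) == troublesome)  # True
```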


But one question: how is it that systems still have not agreed on a common encoding technique, such as the widely used UTF-8?


In the West, the problem is that there is a lot of old software that assumes 1 byte = 1 character and cannot work with UTF-8.
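A small Python sketch of how the "1 byte = 1 character" assumption breaks. The sample string is arbitrary and written with escapes so that the code points are unambiguous.

```python
text = "na\u00efve caf\u00e9"  # two of the ten characters are non-ASCII
utf8 = text.encode("utf-8")

print(len(text))  # 10 characters
print(len(utf8))  # 12 bytes: each accented letter takes two bytes in UTF-8

# Cutting at a byte offset, as byte-oriented software would, can split a character
try:
    utf8[:3].decode("utf-8")
except UnicodeDecodeError as exc:
    print("cannot decode truncated data:", exc.reason)
```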


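In fact, Microsoft still does not seem to have recovered from choosing the wrong UTF encoding. If you want to use the Windows API or the Microsoft C runtime library, you are limited to UTF-16 or the locale's "ANSI" encoding. This makes using UTF-8 very painful, because you have to convert all the time.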
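The kind of conversion meant here can be sketched in Python. It does not call any Windows API; it only shows the transcoding an application would do on each side of such a call, assuming little-endian UTF-16. The sample string is arbitrary.

```python
# Application data held as UTF-8
utf8_bytes = "Gr\u00fc\u00dfe".encode("utf-8")

# Transcode to UTF-16 (little-endian) before handing text to a wide-character API
utf16_bytes = utf8_bytes.decode("utf-8").encode("utf-16-le")

# Transcode back to UTF-8 for whatever comes out
back = utf16_bytes.decode("utf-16-le").encode("utf-8")

print(len(utf8_bytes))     # 7
print(len(utf16_bytes))    # 10
print(back == utf8_bytes)  # True
```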