是否有一种普遍接受的技术可以有效地将JavaScript字符串转换为arraybuffer,反之亦然?具体来说,我希望能够将ArrayBuffer的内容写入localStorage,然后再将其读回来。


当前回答

我发现这种方法有问题,主要是因为我试图将输出写入一个文件,而它没有正确编码。由于JS似乎使用UCS-2编码(源,源),我们需要进一步扩展这个解决方案,这是我的增强解决方案,对我来说是有效的。

我对一般文本没有任何困难,但当它变成阿拉伯语或韩语时,输出文件没有所有字符,而是显示错误字符

文件输出: ”、“单位”:“10 K”:“O©iuY喜爱”、“遵循% % {screen_name} {screen_name}”:“U”“O©iu“推特:“¤问题”、“推%{标签}”:“%{标签}’一个¤uEY喜爱”,“推特%{名称}”:“%{名称}U”xA¤uEY喜爱”},柯:{“% {followers_count}的追随者”:“% {followers_count}…X \”,“100 K +”:“100我助教”,“10 K单位”:“我e”,遵循:“\°”,“跟着% {screen_name}”:“% {screen_name}Ø\°X0”,凯西:“œ”,男:“我”,推特:“¸”,“推特%{标签}”:“%{标签}

original: ", " 10 k unit ": "万",follow: "关注"," follow百分之百分之;screen _ name} ": " {screen _ name}先生圆场,tweet: "推特"," tweet百分之百分之{hashtag} ": " {hashtag},推特的"," tweet to百分之百分之{name} ": " {name}先生推到百分之":{},ko " {followers _ count}百分之followers ": " {followers _ count}명의팔로워100 k + ": " 100 ", "만이상"," 10 k unit ": "만단위",follow: "팔로우"," follow百分之百分之{screen _ name} ": " {screen _ name}님팔로우하기",k: "천",米:"백만",tweet: "트윗"," tweet百分之百分之{hashtag} ": " {hashtag}

我从dennis的解决方案和我发现的这个帖子中获取了信息。

这是我的代码:

function encode_utf8(s) {
  return unescape(encodeURIComponent(s));
}

function decode_utf8(s) {
  return decodeURIComponent(escape(s));
}

 function ab2str(buf) {
   var s = String.fromCharCode.apply(null, new Uint8Array(buf));
   return decode_utf8(decode_utf8(s))
 }

function str2ab(str) {
   var s = encode_utf8(str)
   var buf = new ArrayBuffer(s.length); 
   var bufView = new Uint8Array(buf);
   for (var i=0, strLen=s.length; i<strLen; i++) {
     bufView[i] = s.charCodeAt(i);
   }
   return bufView;
 }

这允许我将内容保存到一个文件,而没有编码问题。

How it works: It basically takes the single 8-byte chunks composing a UTF-8 character and saves them as single characters (therefore an UTF-8 character built in this way, could be composed by 1-4 of these characters). UTF-8 encodes characters in a format that variates from 1 to 4 bytes in length. What we do here is encoding the sting in an URI component and then take this component and translate it in the corresponding 8 byte character. In this way we don't lose the information given by UTF8 characters that are more than 1 byte long.

其他回答

与这里的解决方案不同,我需要从UTF-8数据转换到UTF-8数据。为此,我使用(un)escape/(en)decodeURIComponent技巧编写了以下两个函数。它们非常浪费内存,分配的长度是编码后utf8-string的9倍,尽管这些应该由gc恢复。只是不要在100mb的文本中使用它们。

function utf8AbFromStr(str) {
    var strUtf8 = unescape(encodeURIComponent(str));
    var ab = new Uint8Array(strUtf8.length);
    for (var i = 0; i < strUtf8.length; i++) {
        ab[i] = strUtf8.charCodeAt(i);
    }
    return ab;
}

function strFromUtf8Ab(ab) {
    return decodeURIComponent(escape(String.fromCharCode.apply(null, ab)));
}

检查它是否工作:

strFromUtf8Ab(utf8AbFromStr('latinкирилицаαβγδεζηあいうえお'))
-> "latinкирилицаαβγδεζηあいうえお"

假设你有一个arrayBuffer binaryStr:

let text = String.fromCharCode.apply(null, new Uint8Array(binaryStr));

然后你把文本赋值给状态。

好吧,这里有一种有点复杂的方式来做同样的事情:

var string = "Blah blah blah", output;
var bb = new (window.BlobBuilder||window.WebKitBlobBuilder||window.MozBlobBuilder)();
bb.append(string);
var f = new FileReader();
f.onload = function(e) {
  // do whatever
  output = e.target.result;
}
f.readAsArrayBuffer(bb.getBlob());

编辑:BlobBuilder早已被弃用,取而代之的是Blob构造函数,在我第一次写这篇文章时,它还不存在。这是一个更新版本。(是的,这一直是一个非常愚蠢的转换方式,但它只是为了好玩!)

var string = "Blah blah blah", output;
var f = new FileReader();
f.onload = function(e) {
  // do whatever
  output = e.target.result;
};
f.readAsArrayBuffer(new Blob([string]));

Yes:

const encstr = (`TextEncoder` in window) ? new TextEncoder().encode(str) : Uint8Array.from(str, c => c.codePointAt(0));

Blob比String.fromCharCode(null,array)慢得多;

但如果数组缓冲区太大,就会失败。我发现的最佳解决方案是使用String.fromCharCode(null,数组);并将其拆分为不会破坏堆栈的操作,但每次比单个char更快。

大数组缓冲区的最佳解决方案是:

function arrayBufferToString(buffer){

    var bufView = new Uint16Array(buffer);
    var length = bufView.length;
    var result = '';
    var addition = Math.pow(2,16)-1;

    for(var i = 0;i<length;i+=addition){

        if(i + addition > length){
            addition = length - i;
        }
        result += String.fromCharCode.apply(null, bufView.subarray(i,i+addition));
    }

    return result;

}

我发现这比使用blob快20倍。它也适用于超过100mb的大字符串。