How to fix: "UnicodeDecodeError: 'ascii' codec can't decode byte"

as3:~/ngokevin-site# nano content/blog/20140114_test-chinese.mkd
as3:~/ngokevin-site# wok
Traceback (most recent call last):
  File "/usr/local/bin/wok", line 4, in <module>
    Engine()
  File "/usr/local/lib/python2.7/site-packages/wok/engine.py", line 104, in __init__
    self.load_pages()
  File "/usr/local/lib/python2.7/site-packages/wok/engine.py", line 238, in load_pages
    p = Page.from_file(os.path.join(root, f), self.options, self, renderer)
  File "/usr/local/lib/python2.7/site-packages/wok/page.py", line 111, in from_file
    page.meta['content'] = page.renderer.render(page.original)
  File "/usr/local/lib/python2.7/site-packages/wok/renderers.py", line 46, in render
    return markdown(plain, Markdown.plugins)
  File "/usr/local/lib/python2.7/site-packages/markdown/__init__.py", line 419, in markdown
    return md.convert(text)
  File "/usr/local/lib/python2.7/site-packages/markdown/__init__.py", line 281, in convert
    source = unicode(source)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 1: ordinal not in range(128). -- Note: Markdown only accepts unicode input!

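How can I fix this?

Some other Python-based static blog applications publish Chinese posts without trouble, for example this one: http://github.com/vrypan/bucket3. On my site, http://bc3.brite.biz/, Chinese posts publish successfully.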

Current answer

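"UnicodeDecodeError: 'ascii' codec can't decode byte"

Cause of this error: input_string must be unicode, but a str was given.

"TypeError: Decoding Unicode is not supported"

Cause of this error: an attempt was made to convert a unicode input_string to unicode.

So first check whether your input_string is a str, and convert it to unicode if necessary: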

if isinstance(input_string, str):
    input_string = unicode(input_string, 'utf-8')
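
The snippet above is Python 2. On Python 3, where str is already Unicode text and raw data has its own bytes type, the same check might look like the sketch below. The helper name to_text and the sample bytes are made up for illustration; the sample is chosen so that it fails the same way as the traceback in the question (byte 0xe8 at position 1).

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is raw bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical input: '#' followed by the UTF-8 bytes of one Chinese character.
raw = b"#\xe8\xbf\x99"

try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    # The ASCII codec rejects the first byte >= 0x80 it meets.
    failure = (exc.encoding, exc.start)

# Decoding with the codec the bytes were actually written in succeeds.
text = to_text(raw)
```

Text that is already a str passes through to_text unchanged, which avoids the "Decoding Unicode is not supported" style of mistake.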

Second, the conversion above only changes the type; it does not remove non-ASCII characters. If you want to remove non-ASCII characters:

if isinstance(input_string, str):
    # Note: this removes the non-ASCII characters and encodes back to a str.
    input_string = input_string.decode('ascii', 'ignore').encode('ascii')
elif isinstance(input_string, unicode):
    input_string = input_string.encode('ascii', 'ignore')
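
A rough Python 3 counterpart of the stripping step, offered as a sketch rather than a drop-in replacement. The function name strip_non_ascii is hypothetical, and unlike the Python 2 code above it always returns text (str) instead of a byte string.

```python
def strip_non_ascii(value):
    """Drop every non-ASCII character and return plain ASCII text."""
    if isinstance(value, bytes):
        # Bytes >= 0x80 cannot be decoded as ASCII and are silently skipped.
        return value.decode("ascii", "ignore")
    # For text, encoding with 'ignore' drops the characters; decode back to str.
    return value.encode("ascii", "ignore").decode("ascii")


cleaned_text = strip_non_ascii("caf\u00e9 ok")
cleaned_bytes = strip_non_ascii(b"#\xe8\xbf\x99x")
```

This is lossy by design: applied to a Chinese post it would throw away nearly all of the content, so decoding with the right codec is usually the better fix.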

2017-08-16 21:07:46

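Other answers

Here is my solution: just add the encoding, that is, use with open(file, encoding='utf8') as f.

Because reading a GloVe file takes a long time, I recommend converting the GloVe file to a numpy file. That will save you time whenever you read the embedding weights.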

import numpy as np
from tqdm import tqdm


def load_glove(file):
    """Loads GloVe vectors in numpy array.
    Args:
        file (str): a path to a glove file.
    Return:
        dict: a dict of numpy arrays.
    """
    embeddings_index = {}
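    # An explicit encoding avoids depending on the locale's default codec.
    with open(file, encoding='utf8') as f:
        for line in tqdm(f):
            values = line.split()
            # The last 300 fields are the vector; everything before is the token.
            word = ''.join(values[:-300])
            coefs = np.asarray(values[-300:], dtype='float32')
            embeddings_index[word] = coefs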

    return embeddings_index

# EMBEDDING_PATH = '../embedding_weights/glove.840B.300d.txt'
EMBEDDING_PATH = 'glove.840B.300d.txt'
embeddings = load_glove(EMBEDDING_PATH)

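# np.save pickles the dict; reload it with
# np.load('glove_embeddings.npy', allow_pickle=True).item()
np.save('glove_embeddings.npy', embeddings)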

Gist link: https://gist.github.com/BrambleXu/634a844cdd3cd04bb2e3ba3c83aef227

2018-09-11 06:06:40

I had the same problem, but the fix did not work for me on Python 3. I followed this approach instead, and it solved my problem:

import sys

enc = sys.getdefaultencoding()  # always 'utf-8' on Python 3
file = open(menu, "r", encoding=enc)  # menu holds the path of the file to read

When reading or writing files, you must set the encoding.
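
As a small illustration of setting the encoding on both sides, the sketch below writes and reads back a throwaway file with an explicit encoding='utf-8'. The sample string and the temporary file are assumptions made for the example. Since sys.getdefaultencoding() returns 'utf-8' on Python 3, passing the literal 'utf-8' has the same effect as the code above; without an explicit encoding, open() typically falls back to a locale-dependent default.

```python
import os
import tempfile

sample = "标题: test"  # hypothetical content containing non-ASCII characters

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

try:
    # Write and read with the same explicit codec.
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    with open(path, "r", encoding="utf-8") as f:
        read_back = f.read()
finally:
    os.remove(path)
```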

2017-08-16 20:12:12

I got this error in Python 2.7. It happened when I tried to run many different Python programs, but I managed to reproduce it with this simple script:

#!/usr/bin/env python

import subprocess
import sys

result = subprocess.Popen([u'svn', u'info'])
if not callable(getattr(result, "__enter__", None)) and not callable(getattr(result, "__exit__", None)):
    print("foo")
print("bar")

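On success, it should print 'foo' and 'bar', possibly along with an error message if you are not in an svn folder.

On failure, it should print "UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 39: ordinal not in range(128)".

After trying to regenerate my locales and many of the other solutions posted in this question, I learned that the error occurred because my PATH environment variable contained a special character (ĺ). After I fixed the PATH in '~/.bashrc' and logged out of my session and back in (apparently sourcing '~/.bashrc' did not work), the problem went away.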
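
One way to look for this kind of problem is to scan the environment for non-ASCII characters. The sketch below is written for Python 3, and the helper name find_non_ascii is hypothetical. The character ĺ (U+013A) is encoded in UTF-8 as the bytes 0xc4 0xba, which is consistent with the byte 0xc4 reported in the error message above.

```python
import os


def find_non_ascii(env):
    """Return {name: offending characters} for values containing non-ASCII text."""
    found = {}
    for name, value in env.items():
        bad = [ch for ch in value if ord(ch) > 127]
        if bad:
            found[name] = bad
    return found


# Check the real environment of the current process.
suspects = find_non_ascii(os.environ)

# A made-up PATH value showing what a hit looks like.
example = find_non_ascii({"PATH": "/usr/bin:/home/\u013a/bin", "HOME": "/root"})
```

Any variable that shows up in suspects is a candidate for cleaning up in the shell start-up file.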

2021-01-25 14:23:05

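I was searching for a way to resolve the following error message:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 5454: ordinal not in range(128)

I finally fixed it by specifying 'encoding':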

f = open('../glove/glove.6B.100d.txt', encoding="utf-8")

Hope it helps you too.

2018-03-06 12:57:11

This worked for me:

    file = open('docs/my_messy_doc.pdf', 'rb')
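
Opening a file in binary mode sidesteps the error because no decoding happens at all. That suits a binary format such as PDF, but the caller then gets bytes rather than text. A minimal sketch, using a temporary file and made-up payload bytes in place of the PDF above:

```python
import os
import tempfile

payload = b"%PDF-1.4\n\xe2\xe3\xcf\xd3\n"  # hypothetical bytes, not valid ASCII

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".pdf")
os.close(fd)

try:
    with open(path, "wb") as f:
        f.write(payload)
    # Binary mode returns bytes; no codec is involved, so nothing can fail to decode.
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)
```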

2019-06-14 08:55:07
