I'm practicing the code from "Web Scraping with Python", and I keep running into this certificate problem:

from urllib.request import urlopen 
from bs4 import BeautifulSoup 
import re

pages = set()
def getLinks(pageUrl):
    global pages
    html = urlopen("http://en.wikipedia.org"+pageUrl)
    bsObj = BeautifulSoup(html)
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                #We have encountered a new page
                newPage = link.attrs['href'] 
                print(newPage) 
                pages.add(newPage) 
                getLinks(newPage)
getLinks("")

The error is:

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1049)>

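By the way, I'm also practicing with Scrapy, but I keep getting this error: command not found: scrapy (I've tried all sorts of solutions found online, but none of them worked... really frustrating).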


Current answer

Two steps worked for me: open the Macintosh HD > Applications > Python 3.7 folder, then double-click "Install Certificates.command".

Other answers

open /Applications/Python\ 3.7/Install\ Certificates.command

Try this command in the terminal.
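
If you want to see what that command changes, it can help to check where your Python build looks for CA certificates before and after running it. Here is a minimal sketch using only the standard library; the helper name `describe_ca_paths` is made up for illustration:

```python
import os
import ssl


def describe_ca_paths():
    """Return the default CA locations for this Python build and whether each exists."""
    paths = ssl.get_default_verify_paths()
    report = {}
    for label, location in (
        ("cafile", paths.cafile),
        ("capath", paths.capath),
        ("openssl_cafile", paths.openssl_cafile),
        ("openssl_capath", paths.openssl_capath),
    ):
        # A location of None or "" means nothing is configured for that slot
        exists = bool(location) and os.path.exists(location)
        report[label] = (location, exists)
    return report


if __name__ == "__main__":
    for label, (location, exists) in describe_ca_paths().items():
        print(f"{label}: {location} (exists: {exists})")
```

As far as I know, on the python.org macOS installs the `openssl_cafile` path does not exist until Install Certificates.command has been run, which would explain the "unable to get local issuer certificate" message.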

Use the requests library. Try this solution, or just put https:// in front of the URL:

import requests
from bs4 import BeautifulSoup
import re

pages = set()
def getLinks(pageUrl):
    global pages
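    # verify=False turns off certificate checking: it avoids the error but is insecure
    html = requests.get("http://en.wikipedia.org"+pageUrl, verify=False).text
    # Name the parser explicitly so BeautifulSoup does not have to guess one
    bsObj = BeautifulSoup(html, "html.parser")
    for link in bsObj.find_all("a", href=re.compile("^(/wiki/)")):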
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                #We have encountered a new page
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)
getLinks("")

See whether this works for you.

For anyone using Anaconda, you need to install the certifi package. More information here:

https://anaconda.org/anaconda/certifi

To install it, type this line in your terminal:

conda install -c anaconda certifi
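
Once certifi is installed, one way to use it with `urlopen` is to pass an SSL context built from its CA bundle. This is only a sketch: `make_verified_context` and `fetch` are hypothetical helper names, and the fallback to the system trust store when certifi is missing is my own assumption about what you would want.

```python
import ssl
from urllib.request import urlopen

try:
    import certifi  # third-party package installed by the conda command above
    CA_FILE = certifi.where()  # path to certifi's bundled CA certificates
except ImportError:
    CA_FILE = None  # assumption: fall back to the system default trust store


def make_verified_context(cafile=CA_FILE):
    """Build an SSL context that still verifies certificates, using the given CA bundle."""
    return ssl.create_default_context(cafile=cafile)


def fetch(url):
    """Open url with certificate verification backed by the chosen CA bundle."""
    return urlopen(url, context=make_verified_context())
```

In the question's code you would then call `fetch("https://en.wikipedia.org" + pageUrl)` instead of `urlopen(...)`. Unlike `verify=False`, this keeps certificate checking turned on.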

I found this solution, and it works fine:

cd /Applications/Python\ 3.7/
./Install\ Certificates.command

Make sure your websockets version is >= 10.0

Additionally: run Install Certificates.command and Update Shell Profile.command

pip3 install websockets==10.0
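
To check which websockets version you actually have, something like the following should work. It is a rough sketch that assumes Python 3.8+ (for `importlib.metadata`) and a plain numeric version string such as "10.0"; the function name `installed_major_version` is made up.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_major_version(package):
    """Return the installed major version of package as an int, or None if not installed."""
    try:
        # Assumes the version starts with a plain integer, e.g. "10.0" -> 10
        return int(version(package).split(".")[0])
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    major = installed_major_version("websockets")
    if major is None:
        print("websockets is not installed")
    elif major >= 10:
        print("websockets is 10.0 or newer")
    else:
        print("websockets is older than 10.0")
```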