I am trying to get the content of App Store > Business:

import requests
from lxml import html

page = requests.get("https://itunes.apple.com/in/genre/ios-business/id6000?mt=8")
tree = html.fromstring(page.text)

flist = []
plist = []
for i in range(0, 100):
    app = tree.xpath("//div[@class='column first']/ul/li/a/@href")
    ap = app[0]
    page1 = requests.get(ap)

When I try a range of (0, 2) it works, but when I set the range to 100 it shows this error:

Traceback (most recent call last):
  File "/home/preetham/Desktop/eg.py", line 17, in <module>
    page1 = requests.get(ap)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 383, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 486, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='itunes.apple.com', port=443): Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8 (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)

Current answer

Running pip install pyopenssl seemed to solve this problem for me.

https://github.com/requests/requests/issues/4246

Other answers

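Even after installing pyopenssl and trying various Python versions, I could not get it to work on Windows (while it worked fine on a Mac), so I switched to urllib, which works on Python 3.6 (from python.org) and 3.7 (Anaconda):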

from urllib.request import urlopen

# Fetch the page and print the raw response body (bytes)
html = urlopen("http://pythonscraping.com/pages/page1.html")
contents = html.read()
print(contents)
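
If the request can still fail (for example with the name-resolution error from the question), the urllib call can be wrapped so that a failure is reported instead of raising. This is only a minimal sketch: the helper name fetch and the 10-second default timeout are illustrative assumptions, not part of the original answer.

```python
from urllib.error import URLError
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Return the response body as bytes, or None if the request fails.

    The name `fetch` and the default timeout are hypothetical choices
    made for this sketch.
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read()
    except URLError as exc:
        # DNS failures (socket.gaierror) and refused connections are
        # wrapped in URLError by urllib, so they end up here.
        print("Request failed:", exc.reason)
        return None
```

Calling fetch("http://pythonscraping.com/pages/page1.html") would then return the page bytes when the network is reachable and None otherwise.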

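Check your network connection. I had this error, and it turned out that my virtual machine did not have a working network connection.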
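
One quick way to check this from Python is to see whether the hostname resolves at all, since the traceback in the question ends in socket.gaierror (a name-resolution failure). A small sketch, assuming a hypothetical helper called can_resolve; the default port 443 is only there because the question uses HTTPS:

```python
import socket


def can_resolve(hostname, port=443):
    """Return True if the hostname resolves to at least one address."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror as exc:
        # Same exception class as the one named in the question's traceback.
        print("DNS lookup failed for {}: {}".format(hostname, exc))
        return False
```

If can_resolve("itunes.apple.com") returns False, the problem is likely in DNS or the network setup of the machine rather than in the requests call itself.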

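This happens when you send too many requests to the public IP address of https://itunes.apple.com. As the error shows, access to the public IP address mapped to https://itunes.apple.com is, for some reason, not allowed or is blocked. A better solution is the following Python script, which looks up the public IP address of any domain and adds a mapping for it to the /etc/hosts file.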

import re
import socket
import subprocess
from typing import Tuple

# ENDPOINT = 'https://anydomainname.example.com/'  # replace with any domain
ENDPOINT = 'https://itunes.apple.com/'

def get_public_ip() -> Tuple[str, str, str]:
    """
    Command to get public_ip address of host machine and endpoint domain
    Returns
    -------
    my_public_ip : str
        Ip address string of host machine.
    end_point_ip_address : str
        Ip address of endpoint domain host.
    end_point_domain : str
        domain name of endpoint.

    """
    # bash_command = """host myip.opendns.com resolver1.opendns.com | \
    #     grep "myip.opendns.com has" | awk '{print $4}'"""
    # bash_command = """curl ifconfig.co"""
    # bash_command = """curl ifconfig.me"""
    bash_command = """ curl icanhazip.com"""
    my_public_ip = subprocess.getoutput(bash_command)
    my_public_ip = re.compile("[0-9.]{4,}").findall(my_public_ip)[0]
    end_point_domain = (
        ENDPOINT.replace("https://", "")
        .replace("http://", "")
        .replace("/", "")
    )
    end_point_ip_address = socket.gethostbyname(end_point_domain)
    return my_public_ip, end_point_ip_address, end_point_domain


def set_etc_host(ip_address: str, domain: str) -> str:
    """
    A function to write mapping of ip_address and domain name in /etc/hosts.
    Ref: https://stackoverflow.com/questions/38302867/how-to-update-etc-hosts-file-in-docker-image-during-docker-build

    Parameters
    ----------
    ip_address : str
        IP address of the domain.
    domain : str
        domain name of endpoint.

    Returns
    -------
    str
        Message to identify success or failure of the operation.

    """
    bash_command = """echo "{}    {}" >> /etc/hosts""".format(ip_address, domain)
    output = subprocess.getoutput(bash_command)
    return output


if __name__ == "__main__":
    my_public_ip, end_point_ip_address, end_point_domain = get_public_ip()
    output = set_etc_host(ip_address=end_point_ip_address, domain=end_point_domain)
    print("My public IP address:", my_public_ip)
    print("ENDPOINT public IP address:", end_point_ip_address)
    print("ENDPOINT Domain Name:", end_point_domain)
    print("Command output:", output)

You can run the script above before calling the function you want :)

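In my case, I deploy some Docker containers from a Python script and then call one of the deployed services. The error was fixed when I added a delay before calling the service. I think the service needs some time before it is ready to accept connections.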

import requests
from time import sleep

# deploy containers
# get the URL of the container and store it in url
sleep(5)  # give the service time to start accepting connections
response = requests.get(url, verify=False)
print(response.json())
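
A fixed delay may be too short on a slow machine and longer than needed on a fast one. As an alternative, the readiness check can be repeated until it succeeds. The sketch below is an assumption-laden illustration, not the answerer's code: wait_until_ready, is_ready, attempts and delay are all hypothetical names.

```python
import time


def wait_until_ready(is_ready, attempts=10, delay=1.0):
    """Call is_ready() until it returns True or the attempts run out.

    is_ready is any zero-argument callable (hypothetical, supplied by the
    caller). It should return True once the service responds, and return
    False or raise OSError while it does not.
    """
    for attempt in range(attempts):
        try:
            if is_ready():
                return True
        except OSError:
            # Connection errors mean the service is not accepting requests yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

With requests, is_ready could be something like lambda: requests.get(url, verify=False).ok, because the exceptions raised by requests derive from IOError (OSError in Python 3) and are therefore caught by the loop.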

I had a similar problem, but the following code worked for me.

import requests

url = "<some REST url>"  # placeholder: replace with the actual URL
page = requests.get(url, verify=False)

verify=False disables SSL certificate verification. try/except can be added as usual.