我尝试了python请求库文档中提供的示例。

使用async.map(rs),我获得了响应代码,但我想获得所请求的每个页面的内容。例如,这是行不通的:

out = async.map(rs)
print out[0].content

当前回答

Async现在是一个独立的模块:grequests。

请看这里:https://github.com/kennethreitz/grequests

还有:通过Python发送多个HTTP请求的理想方法?

安装:

$ pip install grequests

用法:

建立一个堆栈:

import grequests

urls = [
    'http://www.heroku.com',
    'http://tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://kennethreitz.com'
]

rs = (grequests.get(u) for u in urls)

发送堆栈

grequests.map(rs)

结果如下所示

[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]

grequest似乎没有设置并发请求的限制,即当多个请求被发送到同一个服务器时。

其他回答

也许请求-期货是另一种选择。

from requests_futures.sessions import FuturesSession

session = FuturesSession()
# first request is started in background
future_one = session.get('http://httpbin.org/get')
# second requests is started immediately
future_two = session.get('http://httpbin.org/get?foo=bar')
# wait for the first request to complete, if it hasn't already
response_one = future_one.result()
print('response one status: {0}'.format(response_one.status_code))
print(response_one.content)
# wait for the second request to complete, if it hasn't already
response_two = future_two.result()
print('response two status: {0}'.format(response_two.status_code))
print(response_two.content)

办公文档中也有建议。如果你不想卷入gevent,这是一个不错的选择。

我测试了两个请求——未来请求和请求请求。Grequests速度更快,但会带来猴子补丁和依赖关系的其他问题。请求-期货比请求慢几倍。我决定编写自己的请求,并简单地将请求包装到ThreadPoolExecutor中,它几乎和grequest一样快,但没有外部依赖。

import requests
import concurrent.futures

def get_urls():
    return ["url1","url2"]

def load_url(url, timeout):
    return requests.get(url, timeout = timeout)

with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:

    future_to_url = {executor.submit(load_url, url, 10): url for url in     get_urls()}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            resp_err = resp_err + 1
        else:
            resp_ok = resp_ok + 1

上面的答案都没有帮助我,因为他们假设你有一个预定义的请求列表,而在我的情况下,我需要能够侦听请求和异步响应(类似于它在nodejs中的工作方式)。

def handle_finished_request(r, **kwargs):
    print(r)


# while True:
def main():
    while True:
        address = listen_to_new_msg()  # based on your server

        # schedule async requests and run 'handle_finished_request' on response
        req = grequests.get(address, timeout=1, hooks=dict(response=handle_finished_request))
        job = grequests.send(req)  # does not block! for more info see https://stackoverflow.com/a/16016635/10577976


main()

handle_finished_request回调函数将在收到响应时被调用。注意:由于某些原因,超时(或无响应)在这里不会触发错误

这个简单的循环可以触发异步请求,类似于它在nodejs服务器中的工作方式

我知道这已经关闭了一段时间,但我认为推广另一种基于请求库的异步解决方案可能是有用的。

list_of_requests = ['http://moop.com', 'http://doop.com', ...]

from simple_requests import Requests
for response in Requests().swarm(list_of_requests):
    print response.content

文档在这里:http://pythonhosted.org/simple-requests/

我也尝试过使用python中的异步方法做一些事情,然而我使用twisted进行异步编程的运气要好得多。它的问题较少,并且有良好的文档记录。这里有一个类似于你在twisted中尝试的东西的链接。

http://pythonquirks.blogspot.com/2011/04/twisted-asynchronous-http-request.html