我尝试了python请求库文档中提供的示例。
使用async.map(rs),我获得了响应代码,但我想获得所请求的每个页面的内容。例如,这是行不通的:
out = async.map(rs)
print out[0].content
我尝试了python请求库文档中提供的示例。
使用async.map(rs),我获得了响应代码,但我想获得所请求的每个页面的内容。例如,这是行不通的:
out = async.map(rs)
print out[0].content
当前回答
Async现在是一个独立的模块:grequests。
请看这里:https://github.com/kennethreitz/grequests
还有:通过Python发送多个HTTP请求的理想方法?
安装:
$ pip install grequests
用法:
建立一个堆栈:
import grequests
urls = [
'http://www.heroku.com',
'http://tablib.org',
'http://httpbin.org',
'http://python-requests.org',
'http://kennethreitz.com'
]
rs = (grequests.get(u) for u in urls)
发送堆栈
grequests.map(rs)
结果如下所示
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
grequest似乎没有设置并发请求的限制,即当多个请求被发送到同一个服务器时。
其他回答
你可以使用httpx。
import httpx
async def get_async(url):
async with httpx.AsyncClient() as client:
return await client.get(url)
urls = ["http://google.com", "http://wikipedia.org"]
# Note that you need an async context to use `await`.
await asyncio.gather(*map(get_async, urls))
如果你想要一个函数式语法,gamla库将其包装到get_async中。
然后你就可以
await gamla.map(gamla.get_async(10))(["http://google.com", "http://wikipedia.org"])
10是超时时间,单位是秒。
(声明:我是作者)
from threading import Thread
threads=list()
for requestURI in requests:
t = Thread(target=self.openURL, args=(requestURI,))
t.start()
threads.append(t)
for thread in threads:
thread.join()
...
def openURL(self, requestURI):
o = urllib2.urlopen(requestURI, timeout = 600)
o...
也许请求-期货是另一种选择。
from requests_futures.sessions import FuturesSession
session = FuturesSession()
# first request is started in background
future_one = session.get('http://httpbin.org/get')
# second requests is started immediately
future_two = session.get('http://httpbin.org/get?foo=bar')
# wait for the first request to complete, if it hasn't already
response_one = future_one.result()
print('response one status: {0}'.format(response_one.status_code))
print(response_one.content)
# wait for the second request to complete, if it hasn't already
response_two = future_two.result()
print('response two status: {0}'.format(response_two.status_code))
print(response_two.content)
办公文档中也有建议。如果你不想卷入gevent,这是一个不错的选择。
上面的答案都没有帮助我,因为他们假设你有一个预定义的请求列表,而在我的情况下,我需要能够侦听请求和异步响应(类似于它在nodejs中的工作方式)。
def handle_finished_request(r, **kwargs):
print(r)
# while True:
def main():
while True:
address = listen_to_new_msg() # based on your server
# schedule async requests and run 'handle_finished_request' on response
req = grequests.get(address, timeout=1, hooks=dict(response=handle_finished_request))
job = grequests.send(req) # does not block! for more info see https://stackoverflow.com/a/16016635/10577976
main()
handle_finished_request回调函数将在收到响应时被调用。注意:由于某些原因,超时(或无响应)在这里不会触发错误
这个简单的循环可以触发异步请求,类似于它在nodejs服务器中的工作方式
Async现在是一个独立的模块:grequests。
请看这里:https://github.com/kennethreitz/grequests
还有:通过Python发送多个HTTP请求的理想方法?
安装:
$ pip install grequests
用法:
建立一个堆栈:
import grequests
urls = [
'http://www.heroku.com',
'http://tablib.org',
'http://httpbin.org',
'http://python-requests.org',
'http://kennethreitz.com'
]
rs = (grequests.get(u) for u in urls)
发送堆栈
grequests.map(rs)
结果如下所示
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
grequest似乎没有设置并发请求的限制,即当多个请求被发送到同一个服务器时。