Asynchronous requests with Python requests

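I tried the sample provided in the documentation of the Python requests library.

With async.map(rs), I get the response codes, but I want to get the content of each page requested. This, for example, does not work: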

out = async.map(rs)
print out[0].content

Current answer

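I had a lot of issues with most of the answers posted: they either use deprecated libraries that have been ported over with limited features, or they provide a solution with too much magic in the execution of the request, which makes error handling difficult. The ones that fall into neither category are third-party libraries or are deprecated themselves.

Some of the solutions work fine for HTTP requests but fall short for any other kind of request, which is ludicrous. A highly customized solution is not necessary here.

Simply using the Python built-in library asyncio is sufficient to perform asynchronous requests of any type, and it leaves enough flexibility for complex, use-case-specific error handling.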

import asyncio

import requests

loop = asyncio.new_event_loop()

def do_thing(params):
    # perform_grpc_call, do_chores and URL are placeholders for your own code
    async def get_rpc_info_and_do_chores(id):
        # do things
        response = perform_grpc_call(id)
        do_chores(response)

    async def get_httpapi_info_and_do_chores(id):
        # do things
        # note: requests.get blocks the event loop while it waits
        response = requests.get(URL)
        do_chores(response)

    async_tasks = []
    for element in list(params.list_of_things):
        # schedule both kinds of work for every element
        async_tasks.append(loop.create_task(get_rpc_info_and_do_chores(element)))
        async_tasks.append(loop.create_task(get_httpapi_info_and_do_chores(element)))

    # run until every scheduled task has finished
    loop.run_until_complete(asyncio.gather(*async_tasks))

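How it works is simple. You create a series of tasks that you would like to run asynchronously, then ask a loop to execute those tasks and exit upon completion. There are no extra libraries that might go unmaintained, and no required functionality is missing.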
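As a rough illustration of the error-handling point, here is a minimal sketch that gathers several tasks and separates successes from failures. The coroutine fetch_thing and its fail flag are hypothetical stand-ins for a real RPC or HTTP call, not part of the answer above.

```python
import asyncio


async def fetch_thing(name, fail=False):
    # Hypothetical stand-in for a real RPC or HTTP call.
    await asyncio.sleep(0.01)
    if fail:
        raise ValueError("could not fetch " + name)
    return name.upper()


async def run_all():
    tasks = [
        fetch_thing("alpha"),
        fetch_thing("beta", fail=True),
        fetch_thing("gamma"),
    ]
    # return_exceptions=True hands exceptions back as results, in task order,
    # instead of raising the first one out of gather.
    return await asyncio.gather(*tasks, return_exceptions=True)


results = asyncio.run(run_all())
succeeded = [r for r in results if not isinstance(r, Exception)]
failed = [r for r in results if isinstance(r, Exception)]
print(succeeded, failed)
```

With return_exceptions=True, one failing task does not hide the results of the others, so each failure can be handled according to the use case.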

2019-12-08 21:41:27

Other answers

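I have also tried a few things with the asynchronous methods in Python, but I have had much better luck using Twisted for asynchronous programming. It has fewer problems and is well documented. Here is a link to something similar to what you are trying, done in Twisted.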

http://pythonquirks.blogspot.com/2011/04/twisted-asynchronous-http-request.html

2012-02-02 17:06:14

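import urllib2  # Python 2; on Python 3 use urllib.request instead
from threading import Thread

threads = []

# start one thread per URI (here `requests` is the iterable of URIs to fetch)
for requestURI in requests:
    t = Thread(target=self.openURL, args=(requestURI,))
    t.start()
    threads.append(t)

# wait for every thread to finish
for thread in threads:
    thread.join()

...

def openURL(self, requestURI):
    o = urllib2.urlopen(requestURI, timeout = 600)
    o...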

2013-01-16 23:32:54

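Disclaimer: the following code creates a different thread for each function call.

This might be useful in some cases because it is simpler to use. But be aware that it is not async: it gives the illusion of async by using multiple threads, even though the decorator's name suggests otherwise.

The following decorator can be used to fire a callback once the function has finished executing. The callback must handle the data returned by the function.

Please note that once a function is decorated, calling it returns a Future object.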

import asyncio

## Decorator implementation of async runner !!
def run_async(callback, loop=None):
    if loop is None:
        loop = asyncio.get_event_loop()

    def inner(func):
        def wrapper(*args, **kwargs):
            def __exec():
                out = func(*args, **kwargs)
                callback(out)
                return out

            return loop.run_in_executor(None, __exec)

        return wrapper

    return inner

Example implementation:

import requests

urls = ["https://google.com", "https://facebook.com", "https://apple.com", "https://netflix.com"]
loaded_urls = []  # OPTIONAL, used for showing realtime, which urls are loaded !!


def _callback(resp):
    print(resp.url)
    print(resp)
    loaded_urls.append((resp.url, resp))  # OPTIONAL, used for showing realtime, which urls are loaded !!


# Must provide a callback function, callback func will be executed after the func completes execution
# Callback function will accept the value returned by the function.
@run_async(_callback)
def get(url):
    return requests.get(url)


for url in urls:
    get(url)

If you want to see which URLs have loaded in real time, you can add the following code at the end:

while True:
    print(loaded_urls)
    if len(loaded_urls) == len(urls):
        break
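Since the decorated function returns a Future, another option is to await such futures from a coroutine instead of handling results in a callback. The sketch below calls run_in_executor directly rather than going through the decorator, and blocking_fetch is a hypothetical stand-in for a blocking call such as requests.get.

```python
import asyncio
import time


def blocking_fetch(n):
    # Hypothetical stand-in for a blocking call such as requests.get.
    time.sleep(0.1)
    return n * 2


async def main():
    loop = asyncio.get_running_loop()
    # Each call runs in the loop's default thread pool and yields a Future.
    futures = [loop.run_in_executor(None, blocking_fetch, n) for n in range(4)]
    # gather preserves the order of the futures passed in.
    return await asyncio.gather(*futures)


doubled = asyncio.run(main())
print(doubled)
```

The work still happens in threads, so this is the same illusion of async described above, only with results collected in one place.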

2020-12-30 15:29:59

You can use httpx for that.

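import asyncio

import httpx

async def get_async(url):
    # open a client for this request and fetch the URL
    async with httpx.AsyncClient() as client:
        return await client.get(url)

urls = ["http://google.com", "http://wikipedia.org"]

async def main():
    # `await` needs an async context, so gather inside a coroutine.
    return await asyncio.gather(*map(get_async, urls))

responses = asyncio.run(main())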

If you want a functional syntax, the gamla library wraps this into get_async.

Then you can do


await gamla.map(gamla.get_async(10))(["http://google.com", "http://wikipedia.org"])

The 10 is the timeout in seconds.

(Disclaimer: I am its author.)

2020-06-26 22:59:10

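Unfortunately, as far as I know, the requests library is not equipped for performing asynchronous requests. You can wrap async/await syntax around requests, but that makes the underlying requests no less synchronous. If you want true async requests, you must use other tooling that provides them. One such solution is aiohttp (Python 3.5.3+). It works well in my experience using it with the Python 3.7 async/await syntax. Below are three implementations of performing n web requests:

1. Purely synchronous requests (sync_requests_get_all) using the Python requests library
2. Synchronous requests (async_requests_get_all) using the Python requests library wrapped in Python 3.7 async/await syntax and asyncio
3. A truly asynchronous implementation (async_aiohttp_get_all) using the Python aiohttp library wrapped in Python 3.7 async/await syntax and asyncio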

"""
Tested in Python 3.5.10
"""

import time
import asyncio
import requests
import aiohttp

from asgiref import sync

def timed(func):
    """
    records approximate durations of function calls
    """
    def wrapper(*args, **kwargs):
        start = time.time()
        print('{name:<30} started'.format(name=func.__name__))
        result = func(*args, **kwargs)
        duration = "{name:<30} finished in {elapsed:.2f} seconds".format(
            name=func.__name__, elapsed=time.time() - start
        )
        print(duration)
        timed.durations.append(duration)
        return result
    return wrapper

timed.durations = []


@timed
def sync_requests_get_all(urls):
    """
    performs synchronous get requests
    """
    # use session to reduce network overhead
    session = requests.Session()
    return [session.get(url).json() for url in urls]


@timed
def async_requests_get_all(urls):
    """
    asynchronous wrapper around synchronous requests
    """
    session = requests.Session()
    # wrap requests.get into an async function
    def get(url):
        return session.get(url).json()
    async_get = sync.sync_to_async(get)

    async def get_all(urls):
        return await asyncio.gather(*[
            async_get(url) for url in urls
        ])
    # call get_all as a sync function to be used in a sync context
    return sync.async_to_sync(get_all)(urls)

@timed
def async_aiohttp_get_all(urls):
    """
    performs asynchronous get requests
    """
    async def get_all(urls):
        async with aiohttp.ClientSession() as session:
            async def fetch(url):
                async with session.get(url) as response:
                    return await response.json()
            return await asyncio.gather(*[
                fetch(url) for url in urls
            ])
    # call get_all as a sync function to be used in a sync context
    return sync.async_to_sync(get_all)(urls)


if __name__ == '__main__':
    # this endpoint takes ~3 seconds to respond,
    # so a purely synchronous implementation should take
    # little more than 30 seconds and a purely asynchronous
    # implementation should take little more than 3 seconds.
    urls = ['https://postman-echo.com/delay/3']*10

    async_aiohttp_get_all(urls)
    async_requests_get_all(urls)
    sync_requests_get_all(urls)
    print('----------------------')
    for duration in timed.durations:
        print(duration)

On my machine, this is the output:

async_aiohttp_get_all          started
async_aiohttp_get_all          finished in 3.20 seconds
async_requests_get_all         started
async_requests_get_all         finished in 30.61 seconds
sync_requests_get_all          started
sync_requests_get_all          finished in 30.59 seconds
----------------------
async_aiohttp_get_all          finished in 3.20 seconds
async_requests_get_all         finished in 30.61 seconds
sync_requests_get_all          finished in 30.59 seconds
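The same effect can be seen without a network. The sketch below is only an illustration under stated assumptions: time.sleep stands in for a blocking call such as requests.get, asyncio.sleep stands in for an awaitable call such as aiohttp's session.get, and DELAY and COUNT are arbitrary values that are not part of the benchmark above.

```python
import asyncio
import time

DELAY = 0.2  # seconds each simulated request takes (illustrative value)
COUNT = 5    # number of simulated requests


async def blocking_request():
    # time.sleep stands in for a blocking call like requests.get:
    # it never yields, so the event loop cannot start another coroutine.
    time.sleep(DELAY)


async def non_blocking_request():
    # asyncio.sleep stands in for an awaitable call like aiohttp's session.get:
    # it yields to the loop, so the waits overlap.
    await asyncio.sleep(DELAY)


async def time_all(make_request):
    # Run COUNT simulated requests through gather and time the whole batch.
    start = time.perf_counter()
    await asyncio.gather(*[make_request() for _ in range(COUNT)])
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(time_all(blocking_request))
non_blocking_elapsed = asyncio.run(time_all(non_blocking_request))
print("blocking: {:.2f}s, non-blocking: {:.2f}s".format(
    blocking_elapsed, non_blocking_elapsed))
```

The blocking variant should take roughly COUNT times DELAY, while the non-blocking variant should take roughly one DELAY, because only the latter lets the event loop interleave the waits.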

2020-07-30 18:48:25
