try:
    r = requests.get(url, params={'s': thing})
except requests.ConnectionError, e:
    print(e)

Is this correct? Is there a better way to structure this? Will this cover all my bases?


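Have a look at the Requests exception docs. In short:

In the event of a network problem (e.g. DNS failure, refused connection, etc.), Requests will raise a ConnectionError exception. In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception. If a request times out, a Timeout exception is raised. If a request exceeds the configured number of maximum redirections, a TooManyRedirects exception is raised. All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.

To answer your question, what you show will not cover all of your bases. You'll only catch connection-related errors, not ones that time out.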
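The hierarchy described above can be checked directly, without sending any request. This is only a small sketch; it assumes a Requests release that defines ConnectTimeout and ReadTimeout, and the variable names are mine.

```python
from requests import exceptions as rexc

# Every exception Requests raises explicitly derives from RequestException.
named = (rexc.ConnectionError, rexc.HTTPError, rexc.Timeout, rexc.TooManyRedirects)
all_derived = all(issubclass(cls, rexc.RequestException) for cls in named)

# A connect timeout is both a ConnectionError and a Timeout;
# a read timeout is only a Timeout, so "except ConnectionError" misses it.
connect_timeout_is_conn_error = issubclass(rexc.ConnectTimeout, rexc.ConnectionError)
read_timeout_is_conn_error = issubclass(rexc.ReadTimeout, rexc.ConnectionError)

print(all_derived, connect_timeout_is_conn_error, read_timeout_is_conn_error)
```

So catching only ConnectionError does pick up timeouts that happen while connecting, but not a server that accepts the connection and then takes too long to answer.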

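What to do when you catch the exception is really up to the design of your script/program. Is it acceptable to exit? Can you go on and try again? If the error is catastrophic and you can't go on, then yes, you may abort your program by raising SystemExit (a nice way to both print an error and call sys.exit).

You can either catch the base-class exception, which will handle all cases: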

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.RequestException as e:  # This is the correct syntax
    raise SystemExit(e)

Or you can catch them separately and do different things.

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.Timeout:
    # Maybe set up for a retry, or continue in a retry loop
    pass
except requests.exceptions.TooManyRedirects:
    # Tell the user their URL was bad and try a different one
    pass
except requests.exceptions.RequestException as e:
    # catastrophic error. bail.
    raise SystemExit(e)
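As one possible reading of "set up for a retry" in the Timeout branch, here is a minimal retry-loop sketch. The helper name get_with_retries and its parameters (attempts, delay, fetch) are my own inventions, not part of Requests; fetch exists only so the function can be exercised without a network.

```python
import time

import requests


def get_with_retries(url, attempts=3, delay=0.0, fetch=requests.get, **kwargs):
    """Hypothetical helper: retry on Timeout, let every other error propagate."""
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    kwargs.setdefault('timeout', 5)  # without a timeout, Timeout is never raised
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url, **kwargs)
        except requests.exceptions.Timeout:
            if attempt == attempts:
                raise  # out of attempts: hand the Timeout to the caller
            time.sleep(delay)
```

Only Timeout is retried here; TooManyRedirects and the other RequestException subclasses still reach the caller, where the handlers shown above apply.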

As Christian pointed out:

If you want HTTP errors (e.g. 401 Unauthorized) to raise exceptions, you can call Response.raise_for_status. That will raise an HTTPError if the response was an HTTP error.

An example:

try:
    r = requests.get('http://www.google.com/nothere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    raise SystemExit(err)

Will print:

404 Client Error: Not Found for url: http://www.google.com/nothere
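To see the same behaviour without touching the network, the sketch below builds Response objects by hand. That is purely a convenience for illustration (real code gets its Response from requests.get and friends), and fake_response and the example.invalid URLs are hypothetical.

```python
import requests


def fake_response(status_code, reason, url):
    # Hand-built Response, used only so the example needs no network access.
    response = requests.Response()
    response.status_code = status_code
    response.reason = reason
    response.url = url
    return response


ok = fake_response(200, 'OK', 'http://example.invalid/')
ok.raise_for_status()  # a 2xx status raises nothing

missing = fake_response(404, 'Not Found', 'http://example.invalid/nothere')
try:
    missing.raise_for_status()
except requests.exceptions.HTTPError as err:
    message = str(err)
    status = err.response.status_code  # the response travels with the exception
    print(message)
```

The point is that a 4xx or 5xx response is not an exception until raise_for_status is called on it.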

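One additional suggestion: be explicit. It seems best to go from specific to general down the stack of errors to get the desired error to be caught, so the specific ones don't get masked by the general one.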

url='http://www.google.com/blahblah'

try:
    r = requests.get(url,timeout=3)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print ("Http Error:",errh)
except requests.exceptions.ConnectionError as errc:
    print ("Error Connecting:",errc)
except requests.exceptions.Timeout as errt:
    print ("Timeout Error:",errt)
except requests.exceptions.RequestException as err:
    print ("OOps: Something Else",err)

Http Error: 404 Client Error: Not Found for url: http://www.google.com/blahblah

vs

url='http://www.google.com/blahblah'

try:
    r = requests.get(url,timeout=3)
    r.raise_for_status()
except requests.exceptions.RequestException as err:
    print ("OOps: Something Else",err)
except requests.exceptions.HTTPError as errh:
    print ("Http Error:",errh)
except requests.exceptions.ConnectionError as errc:
    print ("Error Connecting:",errc)
except requests.exceptions.Timeout as errt:
    print ("Timeout Error:",errt)     

OOps: Something Else 404 Client Error: Not Found for url: http://www.google.com/blahblah
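The ordering rule can also be checked offline by raising the exception classes directly. A sketch follows; describe is a hypothetical helper. One detail worth knowing: ConnectTimeout derives from both ConnectionError and Timeout, so whichever of those two handlers comes first wins. With ConnectionError listed before Timeout, as in the first example above, a connect timeout is reported as "Error Connecting"; the sketch lists Timeout first instead.

```python
import requests
from requests import exceptions as rexc


def describe(exc):
    """Hypothetical helper: report which handler catches exc."""
    try:
        raise exc
    except rexc.HTTPError:
        return 'http'
    except rexc.Timeout:          # listed before ConnectionError on purpose
        return 'timeout'
    except rexc.ConnectionError:
        return 'connection'
    except rexc.RequestException:  # most general handler goes last
        return 'other'


print(describe(rexc.ConnectTimeout()), describe(rexc.TooManyRedirects()))
```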

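The exception object also contains the original response, e.response, which can be useful if you need to see the error body in the response from the server. For example: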

try:
    r = requests.post('http://somerestapi.com/post-here', data={'birthday': '9/9/3999'})
    r.raise_for_status()
except requests.exceptions.HTTPError as e:
    print (e.response.text)
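One caveat when generalising this: for exceptions raised before any response arrived (a ConnectionError, for instance) the response attribute is None, so it is worth guarding the access. A minimal sketch; error_body is a hypothetical helper and the SimpleNamespace object is only a stand-in for a real Response.

```python
from types import SimpleNamespace

import requests


def error_body(exc):
    """Hypothetical helper: return the server's error body, or None if there is none."""
    response = getattr(exc, 'response', None)
    if response is None:
        return None
    return response.text


# No response was ever received, so there is no body to show.
print(error_body(requests.exceptions.ConnectionError('refused')))

# Stand-in response object, used only to keep the example offline.
stub = SimpleNamespace(text='{"error": "birthday out of range"}')
print(error_body(requests.exceptions.HTTPError('400 Client Error', response=stub)))
```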

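Here's a generic way to do things which at least means that you don't have to surround each and every requests call with try ... except:

Basic version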

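import logging
import traceback

import requests

logger = logging.getLogger(__name__)  # assumes logging is configured elsewhere

# see the docs: if you set no timeout the call never times out! A tuple means
# (max connect time, max read time), in seconds
DEFAULT_REQUESTS_TIMEOUT = (5, 15) # for example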

def log_exception(e, verb, url, kwargs):
    # the reason for making this a separate function will become apparent
    raw_tb = traceback.extract_stack()
    if 'data' in kwargs and len(kwargs['data']) > 500: # anticipate giant data string
        kwargs['data'] = f'{kwargs["data"][:500]}...'  
    msg = f'BaseException raised: {e.__class__.__module__}.{e.__class__.__qualname__}: {e}\n' \
        + f'verb {verb}, url {url}, kwargs {kwargs}\n\n' \
        + 'Stack trace:\n' + ''.join(traceback.format_list(raw_tb[:-2]))
    logger.error(msg) 

def requests_call(verb, url, **kwargs):
    response = None
    exception = None
    try:
        if 'timeout' not in kwargs:
            kwargs['timeout'] = DEFAULT_REQUESTS_TIMEOUT
        response = requests.request(verb, url, **kwargs)
    except BaseException as e:
        log_exception(e, verb, url, kwargs)
        exception = e
    return (response, exception)

NB

Be aware of ConnectionError which is a builtin, nothing to do with the class requests.ConnectionError*. I assume the latter is more common in this context but have no real idea...

When examining a non-None returned exception, requests.RequestException, the superclass of all the requests exceptions (including requests.ConnectionError), is not "requests.exceptions.RequestException" according to the docs. Maybe it has changed since the accepted answer.**

Obviously this assumes a logger has been configured. Calling logger.exception in the except block might seem a good idea but that would only give the stack within this method! Instead, get the trace leading up to the call to this method. Then log (with details of the exception, and of the call which caused the problem).

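*I looked at the source code: requests.ConnectionError subclasses the single class requests.RequestException, which subclasses the single class IOError (builtin).

**However, at the bottom of this page you find "requests.exceptions.RequestException" at the time of writing (2022-02)... but it links to the above page: confusing.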
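The naming questions in these notes can be settled by asking the interpreter. As far as I can tell, the top-level names are simply re-exports of the classes in requests.exceptions, so both spellings refer to the same object; the sketch below checks that, and also that the builtin ConnectionError is unrelated. The variable names are mine.

```python
import builtins

import requests

# The Requests class is not the builtin of the same name...
same_as_builtin = requests.ConnectionError is builtins.ConnectionError
# ...and "except ConnectionError" (the builtin) would not catch it.
caught_by_builtin = issubclass(requests.ConnectionError, builtins.ConnectionError)

# The top-level name and the requests.exceptions name are one and the same class.
same_request_exception = requests.RequestException is requests.exceptions.RequestException

# RequestException derives from IOError, which is an alias of OSError in Python 3.
derives_from_oserror = issubclass(requests.ConnectionError, OSError)

print(same_as_builtin, caught_by_builtin, same_request_exception, derives_from_oserror)
```

In practice this means a bare `except ConnectionError:` in a module that has not imported the Requests class will silently miss Requests connection failures.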


Usage is very simple:

search_response, exception = utilities.requests_call('get',
    f'http://localhost:9200/my_index/_search?q={search_string}')

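First check the response: if it's None, something funny has happened and you will have an exception which has to be acted on in some way depending on context (and on the exception). In GUI applications (PyQt5) I usually implement a "visual log" to give some output to the user (and also log simultaneously to the log file), but messages added there should be non-technical. So something like this might typically follow: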

if search_response is None:
    # you might check here for (e.g.) a requests.Timeout, tailoring the message
    # accordingly, as the kind of error anyone might be expected to understand
    msg = f'No response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    return
response_json = search_response.json()
if search_response.status_code != 200: # NB 201 ("created") may be acceptable sometimes... 
    msg = f'Bad response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    # usually response_json will give full details about the problem
    log_msg = f'search on |{search_string}| bad response\n{json.dumps(response_json, indent=4)}'
    logger.error(log_msg)
    return

# now examine the keys and values in response_json: these may of course 
# indicate an error of some kind even though the response returned OK (status 200)... 

Given that the stack trace is logged automatically, you often need no more than that...
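Why the logging helper uses traceback.extract_stack rather than the exception's own traceback can be seen in a few lines of standard-library code. This is only an illustration; the two function names are made up. The exception's traceback starts at the frame where the exception was caught, whereas extract_stack records every frame leading up to the current one, including the caller.

```python
import traceback


def frames_seen_by_handler():
    try:
        raise ValueError('boom')
    except ValueError as exc:
        # Frames attached to the exception: from the catching frame inwards.
        tb_entries = traceback.extract_tb(exc.__traceback__)
        # Frames of the whole call stack at this point: includes the callers.
        stack_entries = traceback.extract_stack()
    exc_frames = [entry.name for entry in tb_entries]
    stack_frames = [entry.name for entry in stack_entries]
    return exc_frames, stack_frames


def outer_caller():
    return frames_seen_by_handler()


exc_frames, stack_frames = outer_caller()
print(exc_frames)
print(stack_frames[-2:])
```

Here the exception traceback never mentions outer_caller, while the extracted stack does, which is the information you want when working out which call site issued the failing request.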

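Advanced version, when a JSON object is returned

(... potentially sparing a great deal of boilerplate!)

To cross the Ts, when a JSON object is expected to be returned:

If, as above, an exception gives your non-technical user a message "No response", and a non-200 status "Bad response", I suggest that:

a missing expected key in the response's JSON structure should give rise to a message "Anomalous response";
an out-of-range or strange value to a message "Unexpected response";
and the presence of a key such as "error" or "errors", with value True or whatever, to a message "Error response".

These may or may not prevent the code from continuing.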
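The three message categories can be sketched as a small classifier over an already-decoded JSON payload. Everything here is an assumption made for illustration: the function name classify_json, its parameters, and the order in which the checks run are mine, not part of the code that follows.

```python
def classify_json(payload, required_keys=(), validators=None):
    """Hypothetical helper: return a short user-facing label, or None if all is well."""
    validators = validators or {}
    if not isinstance(payload, dict):
        return 'Anomalous response'
    # An explicit error flag from the server takes priority.
    if payload.get('error') or payload.get('errors'):
        return 'Error response'
    # A missing expected key means the structure is not what we asked for.
    for key in required_keys:
        if key not in payload:
            return 'Anomalous response'
    # A present key with an out-of-range or strange value.
    for key, is_valid in validators.items():
        if key in payload and not is_valid(payload[key]):
            return 'Unexpected response'
    return None


label = classify_json({'count': -1}, required_keys=['count'],
                      validators={'count': lambda value: value >= 0})
print(label)
```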


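... and in fact, to my mind, it is worth making the process even more generic. For me, these next functions typically cut 20 lines of code using the above requests_call down to about 3, and standardise most of your handling and your log messages. With more than a handful of requests calls in your project, the code gets a lot nicer and less bloated: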

def log_response_error(response_type, call_name, deliverable, verb, url, **kwargs):
    # NB this function can also be used independently
    if response_type == 'No': # exception was raised (and logged)
        if isinstance(deliverable, requests.Timeout):
            MainWindow.the().visual_log(f'Time out of {call_name} before response received!', logging.ERROR)
            return    
    else:
        if isinstance(deliverable, BaseException):
            # NB if response.json() raises an exception we end up here
            log_exception(deliverable, verb, url, kwargs)
        else:
            # if we get here no exception has been raised, so no stack trace has yet been logged.  
            # a response has been returned, but is either "Bad" or "Anomalous"
            response_json = deliverable.json()

            raw_tb = traceback.extract_stack()
            if 'data' in kwargs and len(kwargs['data']) > 500: # anticipate giant data string
                kwargs['data'] = f'{kwargs["data"][:500]}...'
            added_message = ''     
            if hasattr(deliverable, 'added_message'):
                added_message = deliverable.added_message + '\n'
                del deliverable.added_message
            call_and_response_details = f'{response_type} response\n{added_message}' \
                + f'verb {verb}, url {url}, kwargs {kwargs}\nresponse:\n{json.dumps(response_json, indent=4)}'
            logger.error(f'{call_and_response_details}\nStack trace: {"".join(traceback.format_list(raw_tb[:-1]))}')
    MainWindow.the().visual_log(f'{response_type} response {call_name}. See log.', logging.ERROR)
    
def check_keys(req_dict_structure, response_dict_structure, response):
    # so this function is about checking the keys in the returned json object...
    # NB both structures MUST be dicts
    if not isinstance(req_dict_structure, dict):
        response.added_message = f'req_dict_structure not dict: {type(req_dict_structure)}\n'
        return False
    if not isinstance(response_dict_structure, dict):
        response.added_message = f'response_dict_structure not dict: {type(response_dict_structure)}\n'
        return False
    for dict_key in req_dict_structure.keys():
        if dict_key not in response_dict_structure:
            response.added_message = f'key |{dict_key}| missing\n'
            return False
        req_value = req_dict_structure[dict_key]
        response_value = response_dict_structure[dict_key]
        if isinstance(req_value, dict):
            # if the response at this point is a list apply the req_value dict to each element:
            # failure in just one such element leads to "Anomalous response"... 
            if isinstance(response_value, list):
                for resp_list_element in response_value:
                    if not check_keys(req_value, resp_list_element, response):
                        return False
            elif not check_keys(req_value, response_value, response): # any other response value must be a dict (tested in next level of recursion)
                return False
        elif isinstance(req_value, list):
            if not isinstance(response_value, list): # if the req_value is a list the response must be one
                response.added_message = f'key |{dict_key}| not list: {type(response_value)}\n'
                return False
            # it is OK for the value to be a list, but these must be strings (keys) or dicts
            for req_list_element, resp_list_element in zip(req_value, response_value):
                if isinstance(req_list_element, dict):
                    if not check_keys(req_list_element, resp_list_element, response):
                        return False
                    continue  # dict element validated: skip the string checks below
                if not isinstance(req_list_element, str):
                    response.added_message = f'req_list_element not string: {type(req_list_element)}\n'
                    return False
                if req_list_element not in response_value:
                    response.added_message = f'key |{req_list_element}| missing from response list\n'
                    return False
        # put None as a dummy value (otherwise something like {'my_key'} will be seen as a set, not a dict)
        elif req_value is not None: 
            response.added_message = f'required value of key |{dict_key}| must be None (dummy), dict or list: {type(req_value)}\n'
            return False
    return True

def process_json_requests_call(verb, url, **kwargs):
    # "call_name" is a mandatory kwarg
    if 'call_name' not in kwargs:
        raise Exception('kwarg "call_name" not supplied!')
    call_name = kwargs['call_name']
    del kwargs['call_name']

    required_keys = {}    
    if 'required_keys' in kwargs:
        required_keys = kwargs['required_keys']
        del kwargs['required_keys']

    acceptable_statuses = [200]
    if 'acceptable_statuses' in kwargs:
        acceptable_statuses = kwargs['acceptable_statuses']
        del kwargs['acceptable_statuses']

    exception_handler = log_response_error
    if 'exception_handler' in kwargs:
        exception_handler = kwargs['exception_handler']
        del kwargs['exception_handler']
        
    response, exception = requests_call(verb, url, **kwargs)

    if response is None:
        exception_handler('No', call_name, exception, verb, url, **kwargs)
        return (False, exception)
    try:
        response_json = response.json()
    except BaseException as e:
        logger.error(f'response.status_code {response.status_code} but calling json() raised exception')
        # an exception raised at this point can't truthfully lead to a "No response" message... so say "bad"
        exception_handler('Bad', call_name, e, verb, url, **kwargs)
        return (False, response)
        
    status_ok = response.status_code in acceptable_statuses
    if not status_ok:
        response.added_message = f'status code was {response.status_code}'
        log_response_error('Bad', call_name, response, verb, url, **kwargs)
        return (False, response)
    check_result = check_keys(required_keys, response_json, response)
    if not check_result:
        log_response_error('Anomalous', call_name, response, verb, url, **kwargs)
    return (check_result, response)      

Example call (NB with this version, the "deliverable" is either an exception or a response which delivers a JSON structure):

success, deliverable = utilities.process_json_requests_call('get', 
    f'{ES_URL}{INDEX_NAME}/_doc/1', 
    call_name=f'checking index {INDEX_NAME}',
    required_keys={'_source':{'status_text': None}})
if not success: return False
# here, we know the deliverable is a response, not an exception
# we also don't need to check for the keys being present: 
# the generic code has checked that all expected keys are present
index_status = deliverable.json()['_source']['status_text']
if index_status != 'successfully completed':
    # ... i.e. an example of a 200 response, but an error nonetheless
    msg = f'Error response: ES index {INDEX_NAME} does not seem to have been built OK: cannot search'
    MainWindow.the().visual_log(msg)
    logger.error(f'index |{INDEX_NAME}|: deliverable.json() {json.dumps(deliverable.json(), indent=4)}')
    return False

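So the "visual log" message seen by the user in the case of the missing key "status_text", for example, would be "Anomalous response checking index XYZ. See log." (and the log would give a more detailed technical message, constructed automatically, including the stack trace but also details of the missing key in question).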

NB

mandatory kwarg: call_name; optional kwargs: required_keys, acceptable_statuses, exception_handler.

the required_keys dict can be nested to any depth.

finer-grained exception-handling can be accomplished by including a function exception_handler in kwargs (though don't forget that requests_call will have logged the call details, the exception type and __str__, and the stack trace).

in the above I also implement a check on key "data" in any kwargs which may be logged. This is because a bulk operation (e.g. to populate an index in the case of Elasticsearch) can consist of enormous strings. So curtail to the first 500 characters, for example.
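As a rough idea of what a custom exception_handler might look like, here is a sketch with the same positional signature as log_response_error above. The name lenient_timeout_handler and its policy (timeouts are warnings, everything else is an error) are purely hypothetical. The return value is there only to make the function easy to test; process_json_requests_call ignores it.

```python
import logging

import requests

logger = logging.getLogger(__name__)


def lenient_timeout_handler(response_type, call_name, deliverable, verb, url, **kwargs):
    """Hypothetical handler: treat timeouts as warnings, anything else as an error."""
    if isinstance(deliverable, requests.Timeout):
        level = logging.WARNING
    else:
        level = logging.ERROR
    # requests_call has already logged the details and stack trace; keep this short.
    logger.log(level, '%s response %s (%s %s)', response_type, call_name, verb, url)
    return level
```

It would be passed as exception_handler=lenient_timeout_handler alongside call_name in the example call above.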

Yes, I do know about the elasticsearch Python module (a "thin wrapper" around requests). All the above is for illustration purposes.