如何检查字符串是否代表int，而不使用try/except?

有没有办法告诉一个字符串是否代表一个整数(例如，'3'，'-17'但不是'3.14'或'asfasfas')而不使用try/except机制?

is_int('3.14') == False
is_int('-7')   == True

当前回答

正确的RegEx解决方案应该结合Greg Hewgill和Nowell的思想，但不使用全局变量。可以通过将属性附加到方法来实现这一点。另外，我知道在方法中导入是不受欢迎的，但我想要的是像http://peak.telecommunity.com/DevCenter/Importing#lazy-imports这样的“惰性模块”效果

edit:到目前为止，我最喜欢的技术是使用String对象的独占方法。

#!/usr/bin/env python

# Uses exclusively methods of the String object
def isInteger(i):
    i = str(i)
    return i=='0' or (i if i.find('..') > -1 else i.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

# Uses re module for regex
def isIntegre(i):
    import re
    if not hasattr(isIntegre, '_re'):
        print("I compile only once. Remove this line when you are confident in that.")
        isIntegre._re = re.compile(r"[-+]?\d+(\.0*)?$")
    return isIntegre._re.match(str(i)) is not None

# When executed directly run Unit Tests
if __name__ == '__main__':
    for obj in [
                # integers
                0, 1, -1, 1.0, -1.0,
                '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0',
                # non-integers
                1.1, -1.1, '1.1', '-1.1', '+1.1',
                '1.1.1', '1.1.0', '1.0.1', '1.0.0',
                '1.0.', '1..0', '1..',
                '0.0.', '0..0', '0..',
                'one', object(), (1,2,3), [1,2,3], {'one':'two'}
            ]:
        # Notice the integre uses 're' (intended to be humorous)
        integer = ('an integer' if isInteger(obj) else 'NOT an integer')
        integre = ('an integre' if isIntegre(obj) else 'NOT an integre')
        # Make strings look like strings in the output
        if isinstance(obj, str):
            obj = ("'%s'" % (obj,))
        print("%30s is %14s is %14s" % (obj, integer, integre))

对于那些不太喜欢冒险的同学，输出如下:

I compile only once. Remove this line when you are confident in that.
                             0 is     an integer is     an integre
                             1 is     an integer is     an integre
                            -1 is     an integer is     an integre
                           1.0 is     an integer is     an integre
                          -1.0 is     an integer is     an integre
                           '0' is     an integer is     an integre
                          '0.' is     an integer is     an integre
                         '0.0' is     an integer is     an integre
                           '1' is     an integer is     an integre
                          '-1' is     an integer is     an integre
                          '+1' is     an integer is     an integre
                         '1.0' is     an integer is     an integre
                        '-1.0' is     an integer is     an integre
                        '+1.0' is     an integer is     an integre
                           1.1 is NOT an integer is NOT an integre
                          -1.1 is NOT an integer is NOT an integre
                         '1.1' is NOT an integer is NOT an integre
                        '-1.1' is NOT an integer is NOT an integre
                        '+1.1' is NOT an integer is NOT an integre
                       '1.1.1' is NOT an integer is NOT an integre
                       '1.1.0' is NOT an integer is NOT an integre
                       '1.0.1' is NOT an integer is NOT an integre
                       '1.0.0' is NOT an integer is NOT an integre
                        '1.0.' is NOT an integer is NOT an integre
                        '1..0' is NOT an integer is NOT an integre
                         '1..' is NOT an integer is NOT an integre
                        '0.0.' is NOT an integer is NOT an integre
                        '0..0' is NOT an integer is NOT an integre
                         '0..' is NOT an integer is NOT an integre
                         'one' is NOT an integer is NOT an integre
<object object at 0x103b7d0a0> is NOT an integer is NOT an integre
                     (1, 2, 3) is NOT an integer is NOT an integre
                     [1, 2, 3] is NOT an integer is NOT an integre
                {'one': 'two'} is NOT an integer is NOT an integre

2011-02-04 03:07:14

其他回答

edit:到目前为止，我最喜欢的技术是使用String对象的独占方法。

#!/usr/bin/env python

# Uses exclusively methods of the String object
def isInteger(i):
    i = str(i)
    return i=='0' or (i if i.find('..') > -1 else i.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

# Uses re module for regex
def isIntegre(i):
    import re
    if not hasattr(isIntegre, '_re'):
        print("I compile only once. Remove this line when you are confident in that.")
        isIntegre._re = re.compile(r"[-+]?\d+(\.0*)?$")
    return isIntegre._re.match(str(i)) is not None

# When executed directly run Unit Tests
if __name__ == '__main__':
    for obj in [
                # integers
                0, 1, -1, 1.0, -1.0,
                '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0',
                # non-integers
                1.1, -1.1, '1.1', '-1.1', '+1.1',
                '1.1.1', '1.1.0', '1.0.1', '1.0.0',
                '1.0.', '1..0', '1..',
                '0.0.', '0..0', '0..',
                'one', object(), (1,2,3), [1,2,3], {'one':'two'}
            ]:
        # Notice the integre uses 're' (intended to be humorous)
        integer = ('an integer' if isInteger(obj) else 'NOT an integer')
        integre = ('an integre' if isIntegre(obj) else 'NOT an integre')
        # Make strings look like strings in the output
        if isinstance(obj, str):
            obj = ("'%s'" % (obj,))
        print("%30s is %14s is %14s" % (obj, integer, integre))

对于那些不太喜欢冒险的同学，输出如下:

I compile only once. Remove this line when you are confident in that.
                             0 is     an integer is     an integre
                             1 is     an integer is     an integre
                            -1 is     an integer is     an integre
                           1.0 is     an integer is     an integre
                          -1.0 is     an integer is     an integre
                           '0' is     an integer is     an integre
                          '0.' is     an integer is     an integre
                         '0.0' is     an integer is     an integre
                           '1' is     an integer is     an integre
                          '-1' is     an integer is     an integre
                          '+1' is     an integer is     an integre
                         '1.0' is     an integer is     an integre
                        '-1.0' is     an integer is     an integre
                        '+1.0' is     an integer is     an integre
                           1.1 is NOT an integer is NOT an integre
                          -1.1 is NOT an integer is NOT an integre
                         '1.1' is NOT an integer is NOT an integre
                        '-1.1' is NOT an integer is NOT an integre
                        '+1.1' is NOT an integer is NOT an integre
                       '1.1.1' is NOT an integer is NOT an integre
                       '1.1.0' is NOT an integer is NOT an integre
                       '1.0.1' is NOT an integer is NOT an integre
                       '1.0.0' is NOT an integer is NOT an integre
                        '1.0.' is NOT an integer is NOT an integre
                        '1..0' is NOT an integer is NOT an integre
                         '1..' is NOT an integer is NOT an integre
                        '0.0.' is NOT an integer is NOT an integre
                        '0..0' is NOT an integer is NOT an integre
                         '0..' is NOT an integer is NOT an integre
                         'one' is NOT an integer is NOT an integre
<object object at 0x103b7d0a0> is NOT an integer is NOT an integre
                     (1, 2, 3) is NOT an integer is NOT an integre
                     [1, 2, 3] is NOT an integer is NOT an integre
                {'one': 'two'} is NOT an integer is NOT an integre

2011-02-04 03:07:14

You know, I've found (and I've tested this over and over) that try/except does not perform all that well, for whatever reason. I frequently try several ways of doing things, and I don't think I've ever found a method that uses try/except to perform the best of those tested, in fact it seems to me those methods have usually come out close to the worst, if not the worst. Not in every case, but in many cases. I know a lot of people say it's the "Pythonic" way, but that's one area where I part ways with them. To me, it's neither very performant nor very elegant, so, I tend to only use it for error trapping and reporting.

我本来想抱怨PHP, perl, ruby, C，甚至是该死的shell都有简单的函数来测试字符串是否为整数，但在验证这些假设时，我被绊倒了!显然，这种缺乏是一种常见的疾病。

以下是布鲁诺的帖子的快速编辑:

import sys, time, re

g_intRegex = re.compile(r"^([+-]?[1-9]\d*|0)$")

testvals = [
    # integers
    0, 1, -1, 1.0, -1.0,
    '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0', '06',
    # non-integers
    'abc 123',
    1.1, -1.1, '1.1', '-1.1', '+1.1',
    '1.1.1', '1.1.0', '1.0.1', '1.0.0',
    '1.0.', '1..0', '1..',
    '0.0.', '0..0', '0..',
    'one', object(), (1,2,3), [1,2,3], {'one':'two'},
    # with spaces
    ' 0 ', ' 0.', ' .0','.01 '
]

def isInt_try(v):
    try:     i = int(v)
    except:  return False
    return True

def isInt_str(v):
    v = str(v).strip()
    return v=='0' or (v if v.find('..') > -1 else v.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

def isInt_re(v):
    import re
    if not hasattr(isInt_re, 'intRegex'):
        isInt_re.intRegex = re.compile(r"^([+-]?[1-9]\d*|0)$")
    return isInt_re.intRegex.match(str(v).strip()) is not None

def isInt_re2(v):
    return g_intRegex.match(str(v).strip()) is not None

def check_int(s):
    s = str(s)
    if s[0] in ('-', '+'):
        return s[1:].isdigit()
    return s.isdigit()    


def timeFunc(func, times):
    t1 = time.time()
    for n in range(times):
        for v in testvals: 
            r = func(v)
    t2 = time.time()
    return t2 - t1

def testFuncs(funcs):
    for func in funcs:
        sys.stdout.write( "\t%s\t|" % func.__name__)
    print()
    for v in testvals:
        if type(v) == type(''):
            sys.stdout.write("'%s'" % v)
        else:
            sys.stdout.write("%s" % str(v))
        for func in funcs:
            sys.stdout.write( "\t\t%s\t|" % func(v))
        sys.stdout.write("\r\n") 

if __name__ == '__main__':
    print()
    print("tests..")
    testFuncs((isInt_try, isInt_str, isInt_re, isInt_re2, check_int))
    print()

    print("timings..")
    print("isInt_try:   %6.4f" % timeFunc(isInt_try, 10000))
    print("isInt_str:   %6.4f" % timeFunc(isInt_str, 10000)) 
    print("isInt_re:    %6.4f" % timeFunc(isInt_re, 10000))
    print("isInt_re2:   %6.4f" % timeFunc(isInt_re2, 10000))
    print("check_int:   %6.4f" % timeFunc(check_int, 10000))

以下是性能比较结果:

timings..
isInt_try:   0.6426
isInt_str:   0.7382
isInt_re:    1.1156
isInt_re2:   0.5344
check_int:   0.3452

C语言的方法可以对它进行一次扫描。我认为，用C语言来扫描字符串是正确的做法。

编辑:

我更新了上面的代码，使其能够在Python 3.5中工作，并包含了来自当前投票最多的答案的check_int函数，并使用了我能找到的当前最流行的正则表达式来测试整型。这个正则表达式拒绝'abc 123'这样的字符串。我添加了'abc 123'作为测试值。

我很有趣地注意到，在这一点上，测试的所有函数，包括try方法、流行的check_int函数和最流行的正则表达式，都没有返回所有测试值的正确答案(好吧，这取决于你认为的正确答案是什么;请参阅下面的测试结果)。

内置的int()函数无声地截断浮点数的小数部分并返回小数之前的整数部分，除非浮点数首先转换为字符串。

check_int()函数对于0.0和1.0(技术上是整数)这样的值返回false，对于'06'这样的值返回true。

以下是当前(Python 3.5)的测试结果:

              isInt_try |       isInt_str       |       isInt_re        |       isInt_re2       |   check_int   |
0               True    |               True    |               True    |               True    |       True    |
1               True    |               True    |               True    |               True    |       True    |
-1              True    |               True    |               True    |               True    |       True    |
1.0             True    |               True    |               False   |               False   |       False   |
-1.0            True    |               True    |               False   |               False   |       False   |
'0'             True    |               True    |               True    |               True    |       True    |
'0.'            False   |               True    |               False   |               False   |       False   |
'0.0'           False   |               True    |               False   |               False   |       False   |
'1'             True    |               True    |               True    |               True    |       True    |
'-1'            True    |               True    |               True    |               True    |       True    |
'+1'            True    |               True    |               True    |               True    |       True    |
'1.0'           False   |               True    |               False   |               False   |       False   |
'-1.0'          False   |               True    |               False   |               False   |       False   |
'+1.0'          False   |               True    |               False   |               False   |       False   |
'06'            True    |               True    |               False   |               False   |       True    |
'abc 123'       False   |               False   |               False   |               False   |       False   |
1.1             True    |               False   |               False   |               False   |       False   |
-1.1            True    |               False   |               False   |               False   |       False   |
'1.1'           False   |               False   |               False   |               False   |       False   |
'-1.1'          False   |               False   |               False   |               False   |       False   |
'+1.1'          False   |               False   |               False   |               False   |       False   |
'1.1.1'         False   |               False   |               False   |               False   |       False   |
'1.1.0'         False   |               False   |               False   |               False   |       False   |
'1.0.1'         False   |               False   |               False   |               False   |       False   |
'1.0.0'         False   |               False   |               False   |               False   |       False   |
'1.0.'          False   |               False   |               False   |               False   |       False   |
'1..0'          False   |               False   |               False   |               False   |       False   |
'1..'           False   |               False   |               False   |               False   |       False   |
'0.0.'          False   |               False   |               False   |               False   |       False   |
'0..0'          False   |               False   |               False   |               False   |       False   |
'0..'           False   |               False   |               False   |               False   |       False   |
'one'           False   |               False   |               False   |               False   |       False   |
<obj..>         False   |               False   |               False   |               False   |       False   |
(1, 2, 3)       False   |               False   |               False   |               False   |       False   |
[1, 2, 3]       False   |               False   |               False   |               False   |       False   |
{'one': 'two'}  False   |               False   |               False   |               False   |       False   |
' 0 '           True    |               True    |               True    |               True    |       False   |
' 0.'           False   |               True    |               False   |               False   |       False   |
' .0'           False   |               False   |               False   |               False   |       False   |
'.01 '          False   |               False   |               False   |               False   |       False   |

刚才我试着添加这个函数:

def isInt_float(s):
    try:
        return float(str(s)).is_integer()
    except:
        return False

它的性能几乎与check_int(0.3486)一样好，对于1.0和0.0以及+1.0和0这样的值，它返回true。0等等。但它也为'06'返回true，所以。我想，选择你的毒药吧。

2012-03-25 09:41:13

我认为

s.startswith('-') and s[1:].isdigit()

最好重写为:

s.replace('-', '').isdigit()

因为s[1:]也创建了一个新的字符串

但更好的解决办法是

s.lstrip('+-').isdigit()

2015-03-13 12:03:14

我真的很喜欢Shavais的帖子，但我又添加了一个测试用例(&内置的isdigit()函数):

def isInt_loop(v):
    v = str(v).strip()
    # swapping '0123456789' for '9876543210' makes nominal difference (might have because '1' is toward the beginning of the string)
    numbers = '0123456789'
    for i in v:
        if i not in numbers:
            return False
    return True

def isInt_Digit(v):
    v = str(v).strip()
    return v.isdigit()

而且它一直明显优于其他时代:

timings..
isInt_try:   0.4628
isInt_str:   0.3556
isInt_re:    0.4889
isInt_re2:   0.2726
isInt_loop:   0.1842
isInt_Digit:   0.1577

使用普通2.7 python:

$ python --version
Python 2.7.10

我添加的两个测试用例(isInt_loop和isInt_digit)都通过了完全相同的测试用例(它们都只接受无符号整数)，但我认为人们可以更聪明地修改字符串实现(isInt_loop)而不是内置的isdigit()函数，所以我包括了它，尽管执行时间略有不同。(这两种方法都比其他方法好很多，但没有处理额外的东西:“。/ + / -)

此外，我发现有趣的是，regex (isInt_re2方法)在2012年(目前是2018年)由Shavais执行的相同测试中击败了字符串比较。也许正则表达式库已经改进了?

2018-06-15 23:21:27

在我看来，这可能是最直接和python化的方法。我没有看到这个解它基本上和正则表达式的解是一样的，但是没有正则表达式。

def is_int(test):
    import string
    return not (set(test) - set(string.digits))

2012-10-12 17:07:52

如何检查字符串是否代表int，而不使用try/except?

推荐文章

最新文章

标签