我如何打印一个整数与逗号作为千分隔符?

1234567   ⟶   1,234,567

在句点和逗号之间决定不需要特定于区域设置。


当前回答

我使用的是python 2.5,所以我无法访问内置格式。

我查看了Django代码intcomma(下面代码中的intcomma_recurs),意识到它的效率很低,因为它是递归的,而且每次运行时都编译正则表达式也不是一件好事。这并不是一个必要的“问题”,因为django并没有真正关注这种低级性能。此外,我原本预计性能会有10倍的差异,但实际上只慢了3倍。

出于好奇,我实现了几个版本的intcomma,看看使用regex时的性能优势是什么。我的测试数据显示,这项任务有一点优势,但令人惊讶的是,优势并不大。

我也很高兴地看到了我的怀疑:在没有正则表达式的情况下,使用反向xrange方法是不必要的,但它确实使代码看起来更好,代价是性能下降了10%左右。

另外,我假设你传入的是一个字符串,看起来有点像一个数字。其他结果未定。

from __future__ import with_statement
from contextlib import contextmanager
import re,time

re_first_num = re.compile(r"\d")
def intcomma_noregex(value):
    end_offset, start_digit, period = len(value),re_first_num.search(value).start(),value.rfind('.')
    if period == -1:
        period=end_offset
    segments,_from_index,leftover = [],0,(period-start_digit) % 3
    for _index in xrange(start_digit+3 if not leftover else start_digit+leftover,period,3):
        segments.append(value[_from_index:_index])
        _from_index=_index
    if not segments:
        return value
    segments.append(value[_from_index:])
    return ','.join(segments)

def intcomma_noregex_reversed(value):
    end_offset, start_digit, period = len(value),re_first_num.search(value).start(),value.rfind('.')
    if period == -1:
        period=end_offset
    _from_index,segments = end_offset,[]
    for _index in xrange(period-3,start_digit,-3):
        segments.append(value[_index:_from_index])
        _from_index=_index
    if not segments:
        return value
    segments.append(value[:_from_index])
    return ','.join(reversed(segments))

re_3digits = re.compile(r'(?<=\d)\d{3}(?!\d)')
def intcomma(value):
    segments,last_endoffset=[],len(value)
    while last_endoffset > 3:
        digit_group = re_3digits.search(value,0,last_endoffset)
        if not digit_group:
            break
        segments.append(value[digit_group.start():last_endoffset])
        last_endoffset=digit_group.start()
    if not segments:
        return value
    if last_endoffset:
        segments.append(value[:last_endoffset])
    return ','.join(reversed(segments))

def intcomma_recurs(value):
    """
    Converts an integer to a string containing commas every three digits.
    For example, 3000 becomes '3,000' and 45000 becomes '45,000'.
    """
    new = re.sub("^(-?\d+)(\d{3})", '\g<1>,\g<2>', str(value))
    if value == new:
        return new
    else:
        return intcomma(new)

@contextmanager
def timed(save_time_func):
    begin=time.time()
    try:
        yield
    finally:
        save_time_func(time.time()-begin)

def testset_xsimple(func):
    func('5')

def testset_simple(func):
    func('567')

def testset_onecomma(func):
    func('567890')

def testset_complex(func):
    func('-1234567.024')

def testset_average(func):
    func('-1234567.024')
    func('567')
    func('5674')

if __name__ == '__main__':
    print 'Test results:'
    for test_data in ('5','567','1234','1234.56','-253892.045'):
        for func in (intcomma,intcomma_noregex,intcomma_noregex_reversed,intcomma_recurs):
            print func.__name__,test_data,func(test_data)
    times=[]
    def overhead(x):
        pass
    for test_run in xrange(1,4):
        for func in (intcomma,intcomma_noregex,intcomma_noregex_reversed,intcomma_recurs,overhead):
            for testset in (testset_xsimple,testset_simple,testset_onecomma,testset_complex,testset_average):
                for x in xrange(1000): # prime the test
                    testset(func)
                with timed(lambda x:times.append(((test_run,func,testset),x))):
                    for x in xrange(50000):
                        testset(func)
    for (test_run,func,testset),_delta in times:
        print test_run,func.__name__,testset.__name__,_delta

下面是测试结果:

intcomma 5 5
intcomma_noregex 5 5
intcomma_noregex_reversed 5 5
intcomma_recurs 5 5
intcomma 567 567
intcomma_noregex 567 567
intcomma_noregex_reversed 567 567
intcomma_recurs 567 567
intcomma 1234 1,234
intcomma_noregex 1234 1,234
intcomma_noregex_reversed 1234 1,234
intcomma_recurs 1234 1,234
intcomma 1234.56 1,234.56
intcomma_noregex 1234.56 1,234.56
intcomma_noregex_reversed 1234.56 1,234.56
intcomma_recurs 1234.56 1,234.56
intcomma -253892.045 -253,892.045
intcomma_noregex -253892.045 -253,892.045
intcomma_noregex_reversed -253892.045 -253,892.045
intcomma_recurs -253892.045 -253,892.045
1 intcomma testset_xsimple 0.0410001277924
1 intcomma testset_simple 0.0369999408722
1 intcomma testset_onecomma 0.213000059128
1 intcomma testset_complex 0.296000003815
1 intcomma testset_average 0.503000020981
1 intcomma_noregex testset_xsimple 0.134000062943
1 intcomma_noregex testset_simple 0.134999990463
1 intcomma_noregex testset_onecomma 0.190999984741
1 intcomma_noregex testset_complex 0.209000110626
1 intcomma_noregex testset_average 0.513000011444
1 intcomma_noregex_reversed testset_xsimple 0.124000072479
1 intcomma_noregex_reversed testset_simple 0.12700009346
1 intcomma_noregex_reversed testset_onecomma 0.230000019073
1 intcomma_noregex_reversed testset_complex 0.236999988556
1 intcomma_noregex_reversed testset_average 0.56299996376
1 intcomma_recurs testset_xsimple 0.348000049591
1 intcomma_recurs testset_simple 0.34600019455
1 intcomma_recurs testset_onecomma 0.625
1 intcomma_recurs testset_complex 0.773999929428
1 intcomma_recurs testset_average 1.6890001297
1 overhead testset_xsimple 0.0179998874664
1 overhead testset_simple 0.0190000534058
1 overhead testset_onecomma 0.0190000534058
1 overhead testset_complex 0.0190000534058
1 overhead testset_average 0.0309998989105
2 intcomma testset_xsimple 0.0360000133514
2 intcomma testset_simple 0.0369999408722
2 intcomma testset_onecomma 0.207999944687
2 intcomma testset_complex 0.302000045776
2 intcomma testset_average 0.523000001907
2 intcomma_noregex testset_xsimple 0.139999866486
2 intcomma_noregex testset_simple 0.141000032425
2 intcomma_noregex testset_onecomma 0.203999996185
2 intcomma_noregex testset_complex 0.200999975204
2 intcomma_noregex testset_average 0.523000001907
2 intcomma_noregex_reversed testset_xsimple 0.130000114441
2 intcomma_noregex_reversed testset_simple 0.129999876022
2 intcomma_noregex_reversed testset_onecomma 0.236000061035
2 intcomma_noregex_reversed testset_complex 0.241999864578
2 intcomma_noregex_reversed testset_average 0.582999944687
2 intcomma_recurs testset_xsimple 0.351000070572
2 intcomma_recurs testset_simple 0.352999925613
2 intcomma_recurs testset_onecomma 0.648999929428
2 intcomma_recurs testset_complex 0.808000087738
2 intcomma_recurs testset_average 1.81900000572
2 overhead testset_xsimple 0.0189998149872
2 overhead testset_simple 0.0189998149872
2 overhead testset_onecomma 0.0190000534058
2 overhead testset_complex 0.0179998874664
2 overhead testset_average 0.0299999713898
3 intcomma testset_xsimple 0.0360000133514
3 intcomma testset_simple 0.0360000133514
3 intcomma testset_onecomma 0.210000038147
3 intcomma testset_complex 0.305999994278
3 intcomma testset_average 0.493000030518
3 intcomma_noregex testset_xsimple 0.131999969482
3 intcomma_noregex testset_simple 0.136000156403
3 intcomma_noregex testset_onecomma 0.192999839783
3 intcomma_noregex testset_complex 0.202000141144
3 intcomma_noregex testset_average 0.509999990463
3 intcomma_noregex_reversed testset_xsimple 0.125999927521
3 intcomma_noregex_reversed testset_simple 0.126999855042
3 intcomma_noregex_reversed testset_onecomma 0.235999822617
3 intcomma_noregex_reversed testset_complex 0.243000030518
3 intcomma_noregex_reversed testset_average 0.56200003624
3 intcomma_recurs testset_xsimple 0.337000131607
3 intcomma_recurs testset_simple 0.342000007629
3 intcomma_recurs testset_onecomma 0.609999895096
3 intcomma_recurs testset_complex 0.75
3 intcomma_recurs testset_average 1.68300008774
3 overhead testset_xsimple 0.0189998149872
3 overhead testset_simple 0.018000125885
3 overhead testset_onecomma 0.018000125885
3 overhead testset_complex 0.0179998874664
3 overhead testset_average 0.0299999713898

其他回答

Python 2.5+和Python 3(仅限正int):

''.join(reversed([x + (',' if i and not i % 3 else '') for i, x in enumerate(reversed(str(1234567)))]))

浮点数:

float(filter(lambda x: x!=',', '1,234.52'))
# returns 1234.52

对于整数:

int(filter(lambda x: x!=',', '1,234'))
# returns 1234

我很惊讶,没有人提到你可以在Python 3.6+中使用f-strings做到这一点,就像这样简单:

>>> num = 10000000
>>> print(f"{num:,}")
10,000,000

... 冒号后面的部分是格式说明符。逗号是您想要的分隔符,因此f"{num:_}"使用下划线而不是逗号。此方法只能使用“,”和“_”。

这相当于在旧版本的python 3中使用format(num, ",")。

当你第一次看到它时,它可能看起来像魔法,但它不是。它只是语言的一部分,通常需要有一个可用的快捷方式。要了解更多信息,请查看group子组件。

你也可以使用'{:n}'。区域设置表示的格式(值)。我认为这是最简单的现场解决方案。

有关更多信息,请在Python DOC中搜索数千个。

对于货币,可以使用locale。货币,设置标志分组:

Code

import locale

locale.setlocale( locale.LC_ALL, '' )
locale.currency( 1234567.89, grouping = True )

输出

'Portuguese_Brazil.1252'
'R$ 1.234.567,89'

下面是一行正则表达式替换:

re.sub("(\d)(?=(\d{3})+(?!\d))", r"\1,", "%d" % val)

仅适用于积分输出:

import re
val = 1234567890
re.sub("(\d)(?=(\d{3})+(?!\d))", r"\1,", "%d" % val)
# Returns: '1,234,567,890'

val = 1234567890.1234567890
# Returns: '1,234,567,890'

或者对于小于4位的浮点数,将格式说明符更改为%.3f:

re.sub("(\d)(?=(\d{3})+(?!\d))", r"\1,", "%.3f" % val)
# Returns: '1,234,567,890.123'

注意:不能正确工作与超过三个十进制数字,因为它将尝试分组小数部分:

re.sub("(\d)(?=(\d{3})+(?!\d))", r"\1,", "%.5f" % val)
# Returns: '1,234,567,890.12,346'

它是如何工作的

让我们来分析一下:

re.sub(pattern, repl, string)

pattern = \
    "(\d)           # Find one digit...
     (?=            # that is followed by...
         (\d{3})+   # one or more groups of three digits...
         (?!\d)     # which are not followed by any more digits.
     )",

repl = \
    r"\1,",         # Replace that one digit by itself, followed by a comma,
                    # and continue looking for more matches later in the string.
                    # (re.sub() replaces all matches it finds in the input)

string = \
    "%d" % val      # Format the string as a decimal to begin with