Python允许从给定基数的字符串中轻松创建一个整数

int(str, base). 

我想执行相反的操作:从一个整数创建一个字符串, 例如,我想要一些函数int2base(num, base),这样:

int(int2base(x, b), b) == x

函数名/参数的顺序并不重要。

对于int()将接受的任何以b为底的数字x。

这是一个很容易写的函数:事实上,它比在这个问题中描述它更容易。然而,我觉得我一定是错过了什么。

我知道函数bin, oct, hex,但我不能使用它们的几个原因:

这些函数在旧版本的Python中不可用,我需要与(2.2)兼容 我想要一个通解对于不同的碱都可以用同样的方式表示 我想允许2 8 16以外的底数

相关的

Python优雅的int(string, base)逆函数 在python中使用递归的整数到base-x系统 Python中的Base 62转换 如何在Python中将整数转换为最短的url安全字符串?


当前回答

def baseN(num,b,numerals="0123456789abcdefghijklmnopqrstuvwxyz"):
    return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])

裁判: http://code.activestate.com/recipes/65212/

请注意这可能会导致

RuntimeError: maximum recursion depth exceeded in cmp

对于非常大的整数。

其他回答

另一个简短的(在我看来更容易理解):

def int_to_str(n, b, symbols='0123456789abcdefghijklmnopqrstuvwxyz'):
    return (int_to_str(n/b, b, symbols) if n >= b else "") + symbols[n%b]

通过适当的异常处理:

def int_to_str(n, b, symbols='0123456789abcdefghijklmnopqrstuvwxyz'):
    try:
        return (int_to_str(n/b, b) if n >= b else "") + symbols[n%b]
    except IndexError:
        raise ValueError(
            "The symbols provided are not enough to represent this number in "
            "this base")

The below provided Python code converts a Python integer to a string in arbitrary base ( from 2 up to infinity ) and works in both directions. So all the created strings can be converted back to Python integers by providing a string for N instead of an integer. The code works only on positive numbers by intention (there is in my eyes some hassle about negative values and their bit representations I don't want to dig into). Just pick from this code what you need, want or like, or just have fun learning about available options. Much is there only for the purpose of documenting all the various available approaches ( e.g. the Oneliner seems not to be fast, even if promised to be ).

我喜欢萨尔瓦多·达利提出的无限大基地的格式。一个很好的建议,它在光学上工作得很好,即使是简单的二进制位表示。注意,在infiniteBase=True格式的字符串的情况下,width=x填充参数适用于数字,而不是整个数字。似乎,代码处理无穷大数字格式运行甚至比其他选项快一点-使用它的另一个原因?

我不喜欢使用Unicode来扩展数字可用的符号数量的想法,所以不要在下面的代码中寻找它,因为它不存在。请使用建议的infiniteBase格式,或者将整数存储为字节以进行压缩。

    def inumToStr( N, base=2, width=1, infiniteBase=False,\
    useNumpy=False, useRecursion=False, useOneliner=False, \
    useGmpy=False, verbose=True):
    ''' Positive numbers only, but works in BOTH directions.
    For strings in infiniteBase notation set for bases <= 62 
    infiniteBase=True . Examples of use:
    inumToStr( 17,  2, 1, 1)             # [1,0,0,0,1]
    inumToStr( 17,  3, 5)                #       00122
    inumToStr(245, 16, 4)                #        00F5
    inumToStr(245, 36, 4,0,1)            #        006T
    inumToStr(245245245245,36,10,0,1)    #  0034NWOQBH
    inumToStr(245245245245,62)           #     4JhA3Th 
        245245245245 == int(gmpy2.mpz('4JhA3Th',62))
    inumToStr(245245245245,99,2) # [25,78, 5,23,70,44]
    ----------------------------------------------------
    inumToStr( '[1,0,0,0,1]',2, infiniteBase=True ) # 17 
    inumToStr( '[25,78, 5,23,70,44]', 99) # 245245245245
    inumToStr( '0034NWOQBH', 36 )         # 245245245245 
    inumToStr( '4JhA3Th'   , 62 )         # 245245245245
    ----------------------------------------------------
    --- Timings for N = 2**4096, base=36: 
                                      standard: 0.0023
                                      infinite: 0.0017
                                      numpy   : 0.1277
                                      recursio; 0.0022
                                      oneliner: 0.0146
                For N = 2**8192: 
                                      standard: 0.0075
                                      infinite: 0.0053
                                      numpy   : 0.1369
    max. recursion depth exceeded:    recursio/oneliner
    '''
    show = print
    if type(N) is str and ( infiniteBase is True or base > 62 ):
        lstN = eval(N)
        if verbose: show(' converting a non-standard infiniteBase bits string to Python integer')
        return sum( [ item*base**pow for pow, item in enumerate(lstN[::-1]) ] )
    if type(N) is str and base <= 36:
        if verbose: show('base <= 36. Returning Python int(N, base)')
        return int(N, base)
    if type(N) is str and base <= 62:
        if useGmpy: 
            if verbose: show(' base <= 62, useGmpy=True, returning int(gmpy2.mpz(N,base))')
            return int(gmpy2.mpz(N,base))
        else:
            if verbose: show(' base <= 62, useGmpy=False, self-calculating return value)')
            lstStrOfDigits="0123456789"+ \
                "abcdefghijklmnopqrstuvwxyz".upper() + \
                "abcdefghijklmnopqrstuvwxyz"
            dictCharToPow = {}
            for index, char in enumerate(lstStrOfDigits):
                dictCharToPow.update({char : index}) 
            return sum( dictCharToPow[item]*base**pow for pow, item in enumerate(N[::-1]) )
        #:if
    #:if        
        
    if useOneliner and base <= 36:  
        if verbose: show(' base <= 36, useOneliner=True, running the Oneliner code')
        d="0123456789abcdefghijklmnopqrstuvwxyz"
        baseit = lambda a=N, b=base: (not a) and d[0]  or \
        baseit(a-a%b,b*base)+d[a%b%(base-1) or (a%b) and (base-1)]
        return baseit().rjust(width, d[0])[1:]

    if useRecursion and base <= 36: 
        if verbose: show(' base <= 36, useRecursion=True, running recursion algorythm')
        BS="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        def to_base(n, b): 
            return "0" if not n else to_base(n//b, b).lstrip("0") + BS[n%b]
        return to_base(N, base).rjust(width,BS[0])
        
    if base > 62 or infiniteBase:
        if verbose: show(' base > 62 or infiniteBase=True, returning a non-standard digits string')
        # Allows arbitrary large base with 'width=...' 
        # applied to each digit (useful also for bits )
        N, digit = divmod(N, base)
        strN = str(digit).rjust(width, ' ')+']'
        while N:
            N, digit = divmod(N, base)
            strN = str(digit).rjust(width, ' ') + ',' + strN
        return '[' + strN
    #:if        
    
    if base == 2:
        if verbose: show(" base = 2, returning Python str(f'{N:0{width}b}')")
        return str(f'{N:0{width}b}')
    if base == 8:
        if verbose: show(" base = 8, returning Python str(f'{N:0{width}o}')")
        return str(f'{N:0{width}o}')
    if base == 16:
        if verbose: show(" base = 16, returning Python str(f'{N:0{width}X}')")
        return str(f'{N:0{width}X}')

    if base <= 36:
        if useNumpy: 
            if verbose: show(" base <= 36, useNumpy=True, returning np.base_repr(N, base)")
            import numpy as np
            strN = np.base_repr(N, base)
            return strN.rjust(width, '0') 
        else:
            if verbose: show(' base <= 36, useNumpy=False, self-calculating return value)')
            lstStrOfDigits="0123456789"+"abcdefghijklmnopqrstuvwxyz".upper()
            strN = lstStrOfDigits[N % base] # rightmost digit
            while N >= base:
                N //= base # consume already converted digit
                strN = lstStrOfDigits[N % base] + strN # add digits to the left
            #:while
            return strN.rjust(width, lstStrOfDigits[0])
        #:if
    #:if
    
    if base <= 62:
        if useGmpy: 
            if verbose: show(" base <= 62, useGmpy=True, returning gmpy2.digits(N, base)")
            import gmpy2
            strN = gmpy2.digits(N, base)
            return strN.rjust(width, '0') 
            # back to Python int from gmpy2.mpz with 
            #     int(gmpy2.mpz('4JhA3Th',62))
        else:
            if verbose: show(' base <= 62, useGmpy=False, self-calculating return value)')
            lstStrOfDigits= "0123456789" + \
                "abcdefghijklmnopqrstuvwxyz".upper() + \
                "abcdefghijklmnopqrstuvwxyz"
            strN = lstStrOfDigits[N % base] # rightmost digit
            while N >= base:
                N //= base # consume already converted digit
                strN = lstStrOfDigits[N % base] + strN # add digits to the left
            #:while
            return strN.rjust(width, lstStrOfDigits[0])
        #:if
    #:if    
#:def

下面是一个处理有符号整数和自定义数字的递归版本。

import string

def base_convert(x, base, digits=None):
    """Convert integer `x` from base 10 to base `base` using `digits` characters as digits.
    If `digits` is omitted, it will use decimal digits + lowercase letters + uppercase letters.
    """
    digits = digits or (string.digits + string.ascii_letters)
    assert 2 <= base <= len(digits), "Unsupported base: {}".format(base)
    if x == 0:
        return digits[0]
    sign = '-' if x < 0 else ''
    x = abs(x)
    first_digits = base_convert(x // base, base, digits).lstrip(digits[0])
    return sign + first_digits + digits[x % base]
"{0:b}".format(100) # bin: 1100100
"{0:x}".format(100) # hex: 64
"{0:o}".format(100) # oct: 144

字符串不是表示数字的唯一选择:您可以使用一个整数列表来表示每个数字的顺序。这些可以很容易地转换为字符串。

没有一个答案拒绝底数< 2;对于非常大的数字(如56789 ** 43210),大多数将运行非常缓慢或因堆栈溢出而崩溃。为了避免这种失败,可以像这样快速减少:

def n_to_base(n, b):
    if b < 2: raise # invalid base
    if abs(n) < b: return [n]
    ret = [y for d in n_to_base(n, b*b) for y in divmod(d, b)]
    return ret[1:] if ret[0] == 0 else ret # remove leading zeros

def base_to_n(v, b):
    h = len(v) // 2
    if h == 0: return v[0]
    return base_to_n(v[:-h], b) * (b**h) + base_to_n(v[-h:], b)

assert ''.join(['0123456789'[x] for x in n_to_base(56789**43210,10)])==str(56789**43210)

在速度方面,n_to_base对于较大的数字(在我的机器上约为0.3秒)与str相当,但如果与十六进制进行比较,您可能会感到惊讶(在我的机器上约为0.3毫秒,或快1000倍)。这是因为大整数以256(字节)为基数存储在内存中。每个字节可以简单地转换为两个字符的十六进制字符串。这种对齐只发生在底数为2的幂的情况下,这就是为什么有2、8和16(以及base64, ascii, utf16, utf32)的特殊情况。

Consider the last digit of a decimal string. How does it relate to the sequence of bytes that forms its integer? Let's label the bytes s[i] with s[0] being the least significant (little endian). Then the last digit is sum([s[i]*(256**i) % 10 for i in range(n)]). Well, it happens that 256**i ends with a 6 for i > 0 (6*6=36) so that last digit is (s[0]*5 + sum(s)*6)%10. From this, you can see that the last digit depends on the sum of all the bytes. This nonlocal property is what makes converting to decimal harder.