从字节大小返回人类可读大小的函数:

>>> human_readable(2048)
'2 kilobytes'
>>>

如何做到这一点?


当前回答

我最近提出了一个避免循环的版本,使用log2来确定大小顺序,作为后缀列表的移位和索引:

from math import log2

_suffixes = ['bytes', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']

def file_size(size):
    # determine binary order in steps of size 10 
    # (coerce to int, // still returns a float)
    order = int(log2(size) / 10) if size else 0
    # format file size
    # (.4g results in rounded numbers for exact matches and max 3 decimals, 
    # should never resort to exponent values)
    return '{:.4g} {}'.format(size / (1 << (order * 10)), _suffixes[order])

不过,它的可读性很可能被认为是非python化的。

其他回答

重复作为匆匆.filesize()替代方案提供的代码段,下面的代码段根据所使用的前缀给出不同的精度数字。它不像某些片段那样简洁,但我喜欢这样的结果。

def human_size(size_bytes):
    """
    format a size in bytes into a 'human' file size, e.g. bytes, KB, MB, GB, TB, PB
    Note that bytes/KB will be reported in whole numbers but MB and above will have greater precision
    e.g. 1 byte, 43 bytes, 443 KB, 4.3 MB, 4.43 GB, etc
    """
    if size_bytes == 1:
        # because I really hate unnecessary plurals
        return "1 byte"

    suffixes_table = [('bytes',0),('KB',0),('MB',1),('GB',2),('TB',2), ('PB',2)]

    num = float(size_bytes)
    for suffix, precision in suffixes_table:
        if num < 1024.0:
            break
        num /= 1024.0

    if precision == 0:
        formatted_size = "%d" % num
    else:
        formatted_size = str(round(num, ndigits=precision))

    return "%s %s" % (formatted_size, suffix)

这是我的版本。它不使用for循环。它具有常数复杂度O(1),理论上比这里使用for循环的答案更有效。

from math import log
unit_list = zip(['bytes', 'kB', 'MB', 'GB', 'TB', 'PB'], [0, 0, 1, 2, 2, 2])
def sizeof_fmt(num):
    """Human friendly file size"""
    if num > 1:
        exponent = min(int(log(num, 1024)), len(unit_list) - 1)
        quotient = float(num) / 1024**exponent
        unit, num_decimals = unit_list[exponent]
        format_string = '{:.%sf} {}' % (num_decimals)
        return format_string.format(quotient, unit)
    if num == 0:
        return '0 bytes'
    if num == 1:
        return '1 byte'

为了更清楚地说明发生了什么,我们可以省略字符串格式化的代码。以下是真正起作用的台词:

exponent = int(log(num, 1024))
quotient = num / 1024**exponent
unit_list[exponent]

如果你安装了Django,你也可以试试filesizeformat:

from django.template.defaultfilters import filesizeformat
filesizeformat(1073741824)

=>

"1.0 GB"

该功能在Boltons中可用,这对于大多数项目来说都是一个非常方便的库。

>>> bytes2human(128991)
'126K'
>>> bytes2human(100001221)
'95M'
>>> bytes2human(0, 2)
'0.00B'

下面是一个使用while的选项:

def number_format(n):
   n2, n3 = n, 0
   while n2 >= 1e3:
      n2 /= 1e3
      n3 += 1
   return '%.3f' % n2 + ('', ' k', ' M', ' G')[n3]

s = number_format(9012345678)
print(s == '9.012 G')

https://docs.python.org/reference/compound_stmts.html#while