在我重新发明这个特殊的轮子之前,有没有人有一个很好的用Python计算目录大小的例程?如果该例程能以Mb/Gb等格式格式化大小,那就太好了。


当前回答

Python 3.6+递归文件夹/文件大小使用os.scandir。和@blakev的回答一样强大,但更短,采用EAFP python风格。

import os

def size(path, *, follow_symlinks=False):
    try:
        with os.scandir(path) as it:
            return sum(size(entry, follow_symlinks=follow_symlinks) for entry in it)
    except NotADirectoryError:
        return os.stat(path, follow_symlinks=follow_symlinks).st_size

其他回答

递归的一行代码:

def getFolderSize(p):
   from functools import partial
   prepend = partial(os.path.join, p)
   return sum([(os.path.getsize(f) if os.path.isfile(f) else getFolderSize(f)) for f in map(prepend, os.listdir(p))])

这个脚本告诉您CWD中哪个文件最大,还告诉您文件在哪个文件夹中。 这个脚本适用于win8和python 3.3.3 shell

import os

folder=os.cwd()

number=0
string=""

for root, dirs, files in os.walk(folder):
    for file in files:
        pathname=os.path.join(root,file)
##        print (pathname)
##        print (os.path.getsize(pathname)/1024/1024)
        if number < os.path.getsize(pathname):
            number = os.path.getsize(pathname)
            string=pathname


##        print ()


print (string)
print ()
print (number)
print ("Number in bytes")

python3.5 +

from pathlib import Path

def get_size(folder: str) -> int:
    return sum(p.stat().st_size for p in Path(folder).rglob('*'))

用法::

In [6]: get_size('/etc/not-exist-path')
Out[6]: 0
In [7]: get_size('.')
Out[7]: 12038689
In [8]: def filesize(size: int) -> str:
   ...:     for unit in ("B", "K", "M", "G", "T"):
   ...:         if size < 1024:
   ...:             break
   ...:         size /= 1024
   ...:     return f"{size:.1f}{unit}"
   ...:

In [9]: filesize(get_size('.'))
Out[9]: '11.5M'

接受的答案不考虑硬链接或软链接,并将这些文件计算两次。您可能希望跟踪已看到的inode,而不是为这些文件添加大小。

import os
def get_size(start_path='.'):
    total_size = 0
    seen = {}
    for dirpath, dirnames, filenames in os.walk(start_path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            try:
                stat = os.stat(fp)
            except OSError:
                continue

            try:
                seen[stat.st_ino]
            except KeyError:
                seen[stat.st_ino] = True
            else:
                continue

            total_size += stat.st_size

    return total_size

print get_size()
import os

def get_size(path):
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            if os.path.exists(fp):
                fp = os.path.join(dirpath, f)
                total_size += os.path.getsize(fp)

    return total_size   # in megabytes

谢谢monkut & troex!