在我重新发明这个特殊的轮子之前,有没有人有一个很好的用Python计算目录大小的例程?如果该例程能以Mb/Gb等格式格式化大小,那就太好了。
当前回答
问题的第二部分
def human(size):
B = "B"
KB = "KB"
MB = "MB"
GB = "GB"
TB = "TB"
UNITS = [B, KB, MB, GB, TB]
HUMANFMT = "%f %s"
HUMANRADIX = 1024.
for u in UNITS[:-1]:
if size < HUMANRADIX : return HUMANFMT % (size, u)
size /= HUMANRADIX
return HUMANFMT % (size, UNITS[-1])
其他回答
当计算子目录的大小时,它应该更新其父目录的文件夹大小,这将一直进行下去,直到它到达根父目录。
下面的函数计算文件夹及其所有子文件夹的大小。
import os
def folder_size(path):
parent = {} # path to parent path mapper
folder_size = {} # storing the size of directories
folder = os.path.realpath(path)
for root, _, filenames in os.walk(folder):
if root == folder:
parent[root] = -1 # the root folder will not have any parent
folder_size[root] = 0.0 # intializing the size to 0
elif root not in parent:
immediate_parent_path = os.path.dirname(root) # extract the immediate parent of the subdirectory
parent[root] = immediate_parent_path # store the parent of the subdirectory
folder_size[root] = 0.0 # initialize the size to 0
total_size = 0
for filename in filenames:
filepath = os.path.join(root, filename)
total_size += os.stat(filepath).st_size # computing the size of the files under the directory
folder_size[root] = total_size # store the updated size
temp_path = root # for subdirectories, we need to update the size of the parent till the root parent
while parent[temp_path] != -1:
folder_size[parent[temp_path]] += total_size
temp_path = parent[temp_path]
return folder_size[folder]/1000000.0
使用pathlib,我想出了这个一行程序来获取文件夹的大小:
sum(file.stat().st_size for file in Path(folder).rglob('*'))
这是我为一个漂亮的格式化输出:
from pathlib import Path
def get_folder_size(folder):
return ByteSize(sum(file.stat().st_size for file in Path(folder).rglob('*')))
class ByteSize(int):
_KB = 1024
_suffixes = 'B', 'KB', 'MB', 'GB', 'PB'
def __new__(cls, *args, **kwargs):
return super().__new__(cls, *args, **kwargs)
def __init__(self, *args, **kwargs):
self.bytes = self.B = int(self)
self.kilobytes = self.KB = self / self._KB**1
self.megabytes = self.MB = self / self._KB**2
self.gigabytes = self.GB = self / self._KB**3
self.petabytes = self.PB = self / self._KB**4
*suffixes, last = self._suffixes
suffix = next((
suffix
for suffix in suffixes
if 1 < getattr(self, suffix) < self._KB
), last)
self.readable = suffix, getattr(self, suffix)
super().__init__()
def __str__(self):
return self.__format__('.2f')
def __repr__(self):
return '{}({})'.format(self.__class__.__name__, super().__repr__())
def __format__(self, format_spec):
suffix, val = self.readable
return '{val:{fmt}} {suf}'.format(val=val, fmt=format_spec, suf=suffix)
def __sub__(self, other):
return self.__class__(super().__sub__(other))
def __add__(self, other):
return self.__class__(super().__add__(other))
def __mul__(self, other):
return self.__class__(super().__mul__(other))
def __rsub__(self, other):
return self.__class__(super().__sub__(other))
def __radd__(self, other):
return self.__class__(super().__add__(other))
def __rmul__(self, other):
return self.__class__(super().__rmul__(other))
用法:
>>> size = get_folder_size("c:/users/tdavis/downloads")
>>> print(size)
5.81 GB
>>> size.GB
5.810891855508089
>>> size.gigabytes
5.810891855508089
>>> size.PB
0.005674699077644618
>>> size.MB
5950.353260040283
>>> size
ByteSize(6239397620)
我还遇到了这个问题,它有一些更紧凑、可能更高效的打印文件大小的策略。
这有点晚了,但只要安装了glob2和humanize,就行了。注意,在Python 3中,默认的iglob具有递归模式。如何修改Python 3的代码是留给读者的简单练习。
>>> import os
>>> from humanize import naturalsize
>>> from glob2 import iglob
>>> naturalsize(sum(os.path.getsize(x) for x in iglob('/var/**'))))
'546.2 MB'
问题的第二部分
def human(size):
B = "B"
KB = "KB"
MB = "MB"
GB = "GB"
TB = "TB"
UNITS = [B, KB, MB, GB, TB]
HUMANFMT = "%f %s"
HUMANRADIX = 1024.
for u in UNITS[:-1]:
if size < HUMANRADIX : return HUMANFMT % (size, u)
size /= HUMANRADIX
return HUMANFMT % (size, UNITS[-1])
Python 3.6+递归文件夹/文件大小使用os.scandir。和@blakev的回答一样强大,但更短,采用EAFP python风格。
import os
def size(path, *, follow_symlinks=False):
try:
with os.scandir(path) as it:
return sum(size(entry, follow_symlinks=follow_symlinks) for entry in it)
except NotADirectoryError:
return os.stat(path, follow_symlinks=follow_symlinks).st_size
推荐文章
- 证书验证失败:无法获得本地颁发者证书
- 当使用pip3安装包时,“Python中的ssl模块不可用”
- 无法切换Python与pyenv
- Python if not == vs if !=
- 如何从scikit-learn决策树中提取决策规则?
- 为什么在Mac OS X v10.9 (Mavericks)的终端中apt-get功能不起作用?
- 将旋转的xtick标签与各自的xtick对齐
- 为什么元组可以包含可变项?
- 如何合并字典的字典?
- 如何创建类属性?
- 不区分大小写的“in”
- 在Python中获取迭代器中的元素个数
- 解析日期字符串并更改格式
- 使用try和。Python中的if
- 如何在Python中获得所有直接子目录