How do I get the size of an object in memory in Python?


Current answer

Use sys.getsizeof() if you do NOT want to include the sizes of linked (nested) objects.

However, if you want to count sub-objects nested in lists, dicts, sets, and tuples (and usually this is what you're looking for), use the recursive deep sizeof() function shown below:

import sys
def sizeof(obj):
    size = sys.getsizeof(obj)
    if isinstance(obj, dict): return size + sum(map(sizeof, obj.keys())) + sum(map(sizeof, obj.values()))
    if isinstance(obj, (list, tuple, set, frozenset)): return size + sum(map(sizeof, obj))
    return size
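
For example, here is a quick comparison on a small nested list, reusing the sizeof() defined above (exact byte counts depend on the Python version and platform):

import sys

data = [1, 2, [3, 4, 5], 'hello']
print(sys.getsizeof(data))  # shallow: the outer list object and its pointer array only
print(sizeof(data))         # deep: also counts the inner list, the ints and the string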

You can also find this function in the nifty toolbox, together with many other useful one-liners:

https://github.com/mwojnars/nifty/blob/master/util.py

Other answers

This may be much more complicated than it looks, depending on how you want to count things. For instance, if you have a list of ints, do you want the size of the list holding the references to those ints (i.e. the list itself, not what it contains)? Or do you want to include the actual data pointed to? In that case you need to deal with duplicate references, and decide how to prevent double-counting when two objects contain references to the same object.
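
A minimal sketch of that distinction and of the double-counting pitfall (byte counts vary by platform):

import sys

nums = list(range(1000))
# sys.getsizeof() counts only the list object itself (its header plus the
# array of references), not the int objects those references point to.
print(sys.getsizeof(nums))

big = "x" * 10_000
shared = [big] * 100  # 100 references to the very same string object
# A naive "deep" sum counts the shared string 100 times, even though only
# one copy exists in memory; this is the double-counting problem.
print(sum(sys.getsizeof(x) for x in shared))  # roughly 100x the string size
print(sys.getsizeof(big))                     # the single string's real size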

You may want to check out one of the Python memory profilers, such as pysizer, to see whether they meet your needs.

Use the following function to get the actual size of a Python object:

import sys
import gc

def actualsize(input_obj):
    """Sum sys.getsizeof() over an object and everything reachable from it."""
    memory_size = 0
    ids = set()                      # ids of objects already counted
    objects = [input_obj]
    while objects:
        new = []
        for obj in objects:
            if id(obj) not in ids:   # skip objects we have already seen
                ids.add(id(obj))
                memory_size += sys.getsizeof(obj)
                new.append(obj)
        # descend one level: everything directly referenced by the new objects
        objects = gc.get_referents(*new)
    return memory_size

actualsize([1, 2, [3, 4, 5, 1]])

Reference: https://towardsdatascience.com/the-strange-size-of-python-objects-in-memory-ce87bdfbb97f
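
For comparison, here is what the shallow and deep numbers look like for the same nested list, assuming the actualsize() definition above (values differ between interpreters). Because visited ids are tracked, the repeated 1 is only counted once:

import sys

nested = [1, 2, [3, 4, 5, 1]]
print(sys.getsizeof(nested))  # shallow: only the outer list object
print(actualsize(nested))     # deep: outer list + inner list + each unique int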

I use this trick... It may not be accurate on small objects, but I think it's much more accurate for complex objects (like a pygame Surface) than sys.getsizeof().

import pygame as pg
import os
import psutil
import time


process = psutil.Process(os.getpid())
pg.init()    
vocab = ['hello', 'me', 'you', 'she', 'he', 'they', 'we',
         'should', 'why?', 'necessarily', 'do', 'that']

font = pg.font.SysFont("monospace", 100, True)

dct = {}

newMem = process.memory_info().rss  # don't mind this line
Str = f'store ' + f'Nothing \tsurface use about '.expandtabs(15) + \
      f'0\t bytes'.expandtabs(9)  # don't mind this assignment too

usedMem = process.memory_info().rss

for word in vocab:
    dct[word] = font.render(word, True, pg.Color("#000000"))

    time.sleep(0.1)  # wait a moment

    # get total used memory of this script:
    newMem = process.memory_info().rss
    Str = f'store ' + f'{word}\tsurface use about '.expandtabs(15) + \
          f'{newMem - usedMem}\t bytes'.expandtabs(9)

    print(Str)
    usedMem = newMem

On my Windows 10 machine, with Python 3.7.3, the output is:

store hello          surface use about 225280    bytes
store me             surface use about 61440     bytes
store you            surface use about 94208     bytes
store she            surface use about 81920     bytes
store he             surface use about 53248     bytes
store they           surface use about 114688    bytes
store we             surface use about 57344     bytes
store should         surface use about 172032    bytes
store why?           surface use about 110592    bytes
store necessarily    surface use about 311296    bytes
store do             surface use about 57344     bytes
store that           surface use about 110592    bytes

If you don't need the exact size of the object but only a rough idea of how big it is, one quick (and dirty) way is to let the program run, sleep for an extended period of time, and check the memory usage of that particular Python process (e.g. in Mac's Activity Monitor). This works well when you are trying to find the size of one single large object in a Python process.

For example, I recently wanted to check the memory usage of a new data structure and compare it with that of Python's set data structure. First I wrote the elements (words from a large public-domain book) to a set and checked the size of the process, then did the same thing with the other data structure. I found that the Python process holding the set was taking twice as much memory as the one holding the new data structure.

Again, you can't say exactly that the memory used by the process equals the size of the object. But as the object gets large, the approximation becomes close, since the memory consumed by the rest of the process becomes negligible compared to the object you are trying to monitor.
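
If you'd rather read the number from inside the script than from Activity Monitor, a rough sketch of the same idea using psutil (as in one of the answers above) might look like this; the delta reflects the whole process, including allocator overhead, so treat it only as an approximation:

import os
import psutil

process = psutil.Process(os.getpid())

before = process.memory_info().rss              # resident memory before building the object
words = {f"word{i}" for i in range(1_000_000)}  # the large object under test
after = process.memory_info().rss               # resident memory afterwards

print(f"the set grew the process by roughly {after - before} bytes")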