我最近对算法产生了兴趣,并开始通过编写简单的实现来探索算法,然后以各种方式优化它。

我已经熟悉了用于分析运行时的标准Python模块(对于大多数事情,我发现IPython中的timeit魔术函数就足够了),但我也对内存使用感兴趣,所以我也可以探索这些权衡(例如,缓存以前计算值的表的成本与根据需要重新计算它们的成本)。是否有一个模块,将为我分析给定函数的内存使用情况?


当前回答

也许它有帮助: <看到额外的>

pip install gprof2dot
sudo apt-get install graphviz

gprof2dot -f pstats profile_for_func1_001 | dot -Tpng -o profile.png

def profileit(name):
    """
    @profileit("profile_for_func1_001")
    """
    def inner(func):
        def wrapper(*args, **kwargs):
            prof = cProfile.Profile()
            retval = prof.runcall(func, *args, **kwargs)
            # Note use of name from outer scope
            prof.dump_stats(name)
            return retval
        return wrapper
    return inner

@profileit("profile_for_func1_001")
def func1(...)

其他回答

下面是一个简单的函数装饰器,它可以跟踪函数调用之前,函数调用之后进程消耗了多少内存,以及有什么区别:

import time
import os
import psutil
 
 
def elapsed_since(start):
    return time.strftime("%H:%M:%S", time.gmtime(time.time() - start))
 
 
def get_process_memory():
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss
 
 
def profile(func):
    def wrapper(*args, **kwargs):
        mem_before = get_process_memory()
        start = time.time()
        result = func(*args, **kwargs)
        elapsed_time = elapsed_since(start)
        mem_after = get_process_memory()
        print("{}: memory before: {:,}, after: {:,}, consumed: {:,}; exec time: {}".format(
            func.__name__,
            mem_before, mem_after, mem_after - mem_before,
            elapsed_time))
        return result
    return wrapper

这是我的博客,上面描述了所有的细节。(归档链接)

也许它有帮助: <看到额外的>

pip install gprof2dot
sudo apt-get install graphviz

gprof2dot -f pstats profile_for_func1_001 | dot -Tpng -o profile.png

def profileit(name):
    """
    @profileit("profile_for_func1_001")
    """
    def inner(func):
        def wrapper(*args, **kwargs):
            prof = cProfile.Profile()
            retval = prof.runcall(func, *args, **kwargs)
            # Note use of name from outer scope
            prof.dump_stats(name)
            return retval
        return wrapper
    return inner

@profileit("profile_for_func1_001")
def func1(...)

一个简单的例子,使用memory_profile计算一个代码块/函数的内存使用情况,同时返回函数的结果:

import memory_profiler as mp

def fun(n):
    tmp = []
    for i in range(n):
        tmp.extend(list(range(i*i)))
    return "XXXXX"

在运行代码之前计算内存使用量,然后在代码期间计算最大使用量:

start_mem = mp.memory_usage(max_usage=True)
res = mp.memory_usage(proc=(fun, [100]), max_usage=True, retval=True) 
print('start mem', start_mem)
print('max mem', res[0][0])
print('used mem', res[0][0]-start_mem)
print('fun output', res[1])

在运行函数时计算采样点的使用情况:

res = mp.memory_usage((fun, [100]), interval=.001, retval=True)
print('min mem', min(res[0]))
print('max mem', max(res[0]))
print('used mem', max(res[0])-min(res[0]))
print('fun output', res[1])

学分:@skeept

由于公认的答案和第二高的投票答案,在我看来,有一些问题,我想提供一个更多的答案,这是密切基于Ihor B。在回答中做了一些小而重要的修改。

这个解决方案允许您通过使用profile函数包装函数调用并调用它,或者通过使用@profile装饰器装饰函数/方法来运行分析。

当您希望在不破坏源代码的情况下分析某些第三方代码时,第一种技术非常有用,而当您不介意修改想要分析的函数/方法的源代码时,第二种技术稍微“干净”一些,效果更好。

我还修改了输出,以便获得RSS、VMS和共享内存。我不太关心“之前”和“之后”的值,但只关心delta,所以我删除了那些(如果你是比较Ihor B。的回答)。

分析代码

# profile.py
import time
import os
import psutil
import inspect


def elapsed_since(start):
    #return time.strftime("%H:%M:%S", time.gmtime(time.time() - start))
    elapsed = time.time() - start
    if elapsed < 1:
        return str(round(elapsed*1000,2)) + "ms"
    if elapsed < 60:
        return str(round(elapsed, 2)) + "s"
    if elapsed < 3600:
        return str(round(elapsed/60, 2)) + "min"
    else:
        return str(round(elapsed / 3600, 2)) + "hrs"


def get_process_memory():
    process = psutil.Process(os.getpid())
    mi = process.memory_info()
    return mi.rss, mi.vms, mi.shared


def format_bytes(bytes):
    if abs(bytes) < 1000:
        return str(bytes)+"B"
    elif abs(bytes) < 1e6:
        return str(round(bytes/1e3,2)) + "kB"
    elif abs(bytes) < 1e9:
        return str(round(bytes / 1e6, 2)) + "MB"
    else:
        return str(round(bytes / 1e9, 2)) + "GB"


def profile(func, *args, **kwargs):
    def wrapper(*args, **kwargs):
        rss_before, vms_before, shared_before = get_process_memory()
        start = time.time()
        result = func(*args, **kwargs)
        elapsed_time = elapsed_since(start)
        rss_after, vms_after, shared_after = get_process_memory()
        print("Profiling: {:>20}  RSS: {:>8} | VMS: {:>8} | SHR {"
              ":>8} | time: {:>8}"
            .format("<" + func.__name__ + ">",
                    format_bytes(rss_after - rss_before),
                    format_bytes(vms_after - vms_before),
                    format_bytes(shared_after - shared_before),
                    elapsed_time))
        return result
    if inspect.isfunction(func):
        return wrapper
    elif inspect.ismethod(func):
        return wrapper(*args,**kwargs)

示例用法,假设上面的代码保存为profile.py:

from profile import profile
from time import sleep
from sklearn import datasets # Just an example of 3rd party function call


# Method 1
run_profiling = profile(datasets.load_digits)
data = run_profiling()

# Method 2
@profile
def my_function():
    # do some stuff
    a_list = []
    for i in range(1,100000):
        a_list.append(i)
    return a_list


res = my_function()

这将导致类似于下面的输出:

Profiling:        <load_digits>  RSS:   5.07MB | VMS:   4.91MB | SHR  73.73kB | time:  89.99ms
Profiling:        <my_function>  RSS:   1.06MB | VMS:   1.35MB | SHR       0B | time:   8.43ms

最后有几点重要的说明:

Keep in mind, this method of profiling is only going to be approximate, since lots of other stuff might be happening on the machine. Due to garbage collection and other factors, the deltas might even be zero. For some unknown reason, very short function calls (e.g. 1 or 2 ms) show up with zero memory usage. I suspect this is some limitation of the hardware/OS (tested on basic laptop with Linux) on how often memory statistics are updated. To keep the examples simple, I didn't use any function arguments, but they should work as one would expect, i.e. profile(my_function, arg) to profile my_function(arg)

如果你只想查看一个对象的内存使用情况,(回答其他问题)

有一个叫做Pympler的模块,它包含asizeof 模块。 使用方法如下: 从pympler进口asizeof asizeof.asizeof (my_object) 不像系统。Getsizeof,它适用于你自己创建的对象。 > > > asizeof.asizeof(元组(bcd)) 200 > > > asizeof。Asizeof ({'foo': 'bar', 'baz': 'bar'}) 400 > > > asizeof.asizeof ({}) 280 > > > asizeof.asizeof({“foo”:“酒吧”}) 360 > > > asizeof.asizeof(“foo”) 40 > > > asizeof.asizeof (Bar ()) 352 > > > asizeof.asizeof (Bar () . __dict__) 280

>>> help(asizeof.asizeof)
Help on function asizeof in module pympler.asizeof:

asizeof(*objs, **opts)
    Return the combined size in bytes of all objects passed as positional arguments.