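What is memory fragmentation?

I've heard the term "memory fragmentation" used a few times in the context of C++ dynamic memory allocation. I've found some questions about how to deal with memory fragmentation, but I can't find one that deals with fragmentation itself. So:

What is memory fragmentation? How can I tell whether memory fragmentation is a problem for my application? What kind of program is most likely to suffer? What are the common ways to deal with memory fragmentation?

Also:

I've heard that using dynamic allocation a lot can increase memory fragmentation. Is this true? In the context of C++, I understand that all the standard containers (std::string, std::vector, etc.) use dynamic memory allocation. If these are used throughout a program (especially std::string), is memory fragmentation more likely to be a problem? How can memory fragmentation be dealt with in an STL-heavy application?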

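Current answer

What kind of program is most likely to suffer?

A nice (= horrifying) example of the problems associated with memory fragmentation was the development and release of "Elemental: War of Magic", a computer game by Stardock.

The game was built for 32-bit / 2GB memory, so a lot of memory-management optimization was needed to make it run within those 2GB. Because the "optimization" led to constant allocation and deallocation, heap memory fragmentation built up over time and made the game crash every time.

There is a "war story" interview on YouTube.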

2020-06-03 23:08:06

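Other answers

Memory fragmentation is the same concept as disk fragmentation: it refers to space being wasted because the areas in use are not packed closely enough together.

As a simple toy example, suppose you have ten bytes of memory: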

 |   |   |   |   |   |   |   |   |   |   |
   0   1   2   3   4   5   6   7   8   9

Now let's allocate three three-byte blocks, named A, B, and C:

 | A | A | A | B | B | B | C | C | C |   |
   0   1   2   3   4   5   6   7   8   9

Now deallocate block B:

 | A | A | A |   |   |   | C | C | C |   |
   0   1   2   3   4   5   6   7   8   9

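Now what happens if we try to allocate a four-byte block D? Well, we have four bytes of memory free, but we don't have four contiguous bytes free, so we can't allocate D! This is an inefficient use of memory, because we should have been able to store D, but we were unable to. And we can't move block C to make room, because very likely some variables in our program are pointing at C, and we can't automatically find and change all of those values.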
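The ten-byte example can be replayed with a small simulation. This is only a sketch under assumptions: ToyHeap, allocate, release and free_bytes are made-up names, and the first-fit search is a simplification, not how any real allocator is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr char kFree = ' ';                                    // marks an unused cell
constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1); // "allocation failed"

// Toy model of the diagram: one char per byte, holding the tag of its owner.
class ToyHeap {
public:
    explicit ToyHeap(std::size_t size) : cells_(size, kFree) {}

    // First-fit: claim the first run of n contiguous free cells.
    // Returns the start offset, or kNoSpace if no run is long enough.
    std::size_t allocate(char tag, std::size_t n) {
        assert(tag != kFree && n > 0);
        std::size_t run = 0;
        for (std::size_t i = 0; i < cells_.size(); ++i) {
            run = (cells_[i] == kFree) ? run + 1 : 0;
            if (run == n) {
                const std::size_t start = i + 1 - n;
                for (std::size_t j = start; j <= i; ++j) cells_[j] = tag;
                return start;
            }
        }
        return kNoSpace;
    }

    // Free every cell owned by tag.
    void release(char tag) {
        for (char& c : cells_) {
            if (c == tag) c = kFree;
        }
    }

    // Total free cells, whether contiguous or not.
    std::size_t free_bytes() const {
        std::size_t n = 0;
        for (char c : cells_) {
            if (c == kFree) ++n;
        }
        return n;
    }

private:
    std::vector<char> cells_;
};
```

With a ToyHeap of size 10, allocating A, B and C (three bytes each) and then releasing B leaves free_bytes() at 4, yet allocate('D', 4) returns kNoSpace, while a three-byte request still fits at offset 3.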

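How do you know it's a problem? Well, the biggest sign is that your program's virtual memory size is considerably larger than the amount of memory you're actually using. In a real-world example you would have many more than ten bytes of memory, so D would just get allocated starting at byte 9, and bytes 3-5 would remain unused unless you later allocated something three bytes long or smaller.

In this example, 3 bytes is not a whole lot to waste, but consider a more pathological case where two allocations of a couple of bytes each sit, for example, ten megabytes apart in memory, and you need to allocate a block of size 10 megabytes + 1 byte. You have to ask the OS for over ten megabytes more virtual memory to do that, even though you're just one byte shy of having enough space already.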

How do you prevent it? The worst cases tend to arise when you frequently create and destroy small objects, since that tends to produce a "swiss cheese" effect with many small objects separated by many small holes, making it impossible to allocate larger objects in those holes. When you know you're going to be doing this, an effective strategy is to pre-allocate a large block of memory as a pool for your small objects, and then manually manage the creation of the small objects within that block, rather than letting the default allocator handle it.
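A minimal sketch of such a pool, assuming a fixed capacity and a single object type. SmallObjectPool, create and destroy are hypothetical names chosen for illustration; a production pool would also need to think about growth, thread safety and objects that are still alive when the pool is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity pool for objects of one type T.
// All slots come from one up-front allocation; free slots form a linked list.
template <typename T>
class SmallObjectPool {
public:
    explicit SmallObjectPool(std::size_t capacity) : slots_(capacity) {
        // Thread every slot onto the free list.
        for (std::size_t i = 0; i < capacity; ++i) {
            slots_[i].next = free_;
            free_ = &slots_[i];
        }
    }
    SmallObjectPool(const SmallObjectPool&) = delete;
    SmallObjectPool& operator=(const SmallObjectPool&) = delete;

    // Construct a T in a free slot; returns nullptr when the pool is exhausted.
    template <typename... Args>
    T* create(Args&&... args) {
        if (free_ == nullptr) return nullptr;
        Slot* slot = free_;
        free_ = slot->next;
        return ::new (static_cast<void*>(slot->storage)) T(std::forward<Args>(args)...);
    }

    // Destroy an object obtained from create() and recycle its slot.
    void destroy(T* obj) {
        assert(obj != nullptr);
        obj->~T();
        Slot* slot = reinterpret_cast<Slot*>(obj);
        slot->next = free_;
        free_ = slot;
    }

private:
    union Slot {
        Slot* next;                                   // used while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while the slot holds a T
    };
    std::vector<Slot> slots_;  // never resized after construction
    Slot* free_ = nullptr;
};
```

Because every slot has the same size, a freed slot can always satisfy the next request from the pool, so the pool itself cannot develop holes that are too small to reuse.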

In general, the fewer allocations you do, the less likely memory is to get fragmented. However, the STL deals with this rather effectively. If you have a string which is using the entirety of its current allocation and you append one character to it, it doesn't simply re-allocate to its current length plus one; it typically grows its capacity by a constant factor (commonly 1.5x or 2x, depending on the implementation). This is a variation on the "pool for frequent small allocations" strategy. The string is grabbing a large chunk of memory so that it can deal efficiently with repeated small increases in size without doing repeated small reallocations. std::vector grows the same way, so generally you won't need to worry too much about fragmentation caused by automatically-reallocating STL containers.
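To see this growth behaviour on your own standard library, you can count how often the capacity changes while appending one element at a time. A rough sketch follows; count_reallocations is a made-up helper, and the exact number it returns depends on the implementation's growth factor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how many times a vector's capacity changes while n elements
// are appended one by one. The growth factor is implementation-defined.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // the buffer was replaced
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);
    return reallocations;
}
```

For 100000 appends the count should be a few dozen at most, not 100000, because the capacity grows geometrically.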

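Of course, STL containers don't pool memory between each other, so if you're going to create many small containers (rather than a few containers that get resized frequently), you may have to concern yourself with preventing fragmentation in the same way you would for any frequently-created small objects, STL or not.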

2010-09-22 15:10:20

Imagine that you have a "large" (32 bytes) expanse of free memory:

----------------------------------
|                                |
----------------------------------

Now, allocate some of it (5 allocations):

----------------------------------
|aaaabbccccccddeeee              |
----------------------------------

Now, free the first four allocations but not the fifth:

----------------------------------
|              eeee              |
----------------------------------

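Now, try to allocate 16 bytes. Oops, I can't, even though there's nearly double that much free.

On systems with virtual memory, fragmentation is less of a problem than you might think, because large allocations only need to be contiguous in virtual address space, not in physical address space. So in my example, if I had virtual memory with a page size of 2 bytes, then I could make my 16-byte allocation with no problem. Physical memory would look like this: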

----------------------------------
|ffffffffffffffeeeeff            |
----------------------------------

whereas virtual memory (being much bigger) could look like this:

------------------------------------------------------...
|              eeeeffffffffffffffff                   
------------------------------------------------------...

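The classic symptom of memory fragmentation is that you try to allocate a large block and you can't, even though you appear to have enough memory free. Another possible consequence is that the process cannot release memory back to the OS (because each of the large blocks it has obtained from the OS, for malloc etc. to sub-divide, still has something left in it, even though most of each block is now unused).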

Tactics to prevent memory fragmentation in C++ work by allocating objects from different areas according to their size and/or their expected lifetime. So if you're going to create a lot of objects and destroy them all together later, allocate them from a memory pool. Any other allocations you do in between them won't be from the pool, hence won't be located in between them in memory, so memory will not be fragmented as a result. Or, if you're going to allocate a lot of objects of the same size then allocate them from the same pool. Then a stretch of free space in the pool can never be smaller than the size you're trying to allocate from that pool.
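As a sketch of the "allocate together, destroy together" tactic, here is a tiny bump-pointer arena. Arena, create, reset and used are hypothetical names, and the sketch assumes trivially destructible objects because reset() never runs destructors. C++17's std::pmr::monotonic_buffer_resource provides a standard facility in the same spirit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical arena: objects with the same lifetime are carved out of one
// block and all released in a single step by reset().
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : buffer_(new unsigned char[bytes]), capacity_(bytes) {}

    // Return aligned storage for size bytes, or nullptr if the arena is full.
    void* allocate(std::size_t size, std::size_t alignment) {
        assert(used_ <= capacity_);
        void* p = buffer_.get() + used_;
        std::size_t space = capacity_ - used_;
        if (std::align(alignment, size, p, space) == nullptr) return nullptr;
        used_ = static_cast<std::size_t>(
                    static_cast<unsigned char*>(p) - buffer_.get()) + size;
        return p;
    }

    // Construct a T inside the arena; returns nullptr if it does not fit.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "reset() does not run destructors");
        void* p = allocate(sizeof(T), alignof(T));
        return p ? ::new (p) T(std::forward<Args>(args)...) : nullptr;
    }

    // Release everything at once; the block itself is kept for reuse.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_ = 0;
};
```

Allocations made elsewhere in the program never land between the arena's objects, and after reset() the whole block is one contiguous free region again.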

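Generally you don't need to worry about it much, unless your program is long-running and does a lot of allocation and freeing. You're most at risk when you have a mixture of short-lived and long-lived objects, but even then malloc will do its best to help. Basically, ignore it until your program has allocation failures or unexpectedly causes the system to run low on memory (catch this in testing, for preference!).

The standard libraries are no worse than anything else that allocates memory, and the standard containers all have an Alloc template parameter which you could use to fine-tune their allocation strategy if absolutely necessary.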
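To illustrate the Alloc parameter, here is a minimal allocator that only counts how often the container asks for memory and otherwise forwards to operator new. CountingAllocator, allocation_count and CountedIntVector are hypothetical names; a real tuning allocator would draw from a pool or arena instead.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical global counter of allocate() calls.
inline std::size_t& allocation_count() {
    static std::size_t n = 0;
    return n;
}

// Minimal C++11 allocator: counts requests, then defers to operator new.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        assert(n <= static_cast<std::size_t>(-1) / sizeof(T));  // no overflow
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// The allocator is plugged in through the container's second template argument.
using CountedIntVector = std::vector<int, CountingAllocator<int>>;
```

Swapping the body of allocate and deallocate for a pool or arena changes where the container's memory comes from without touching the code that uses the container.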

2010-09-22 15:02:52

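Update: Google TCMalloc (Thread-Caching Malloc). It has been found to be quite good at handling fragmentation in a long-running process.

I have been developing a server application that had problems with memory fragmentation on HP-UX 11.23/11.31 ia64.

It looked like this. There was a process that made memory allocations and deallocations and ran for days. Even though there were no memory leaks, the memory consumption of the process kept increasing.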

About my experience: on HP-UX it is very easy to find memory fragmentation using HP-UX gdb. You set a breakpoint, and when you hit it you run the command info heap to see all memory allocations for the process and the total size of the heap. Then you continue your program, and some time later you hit the breakpoint again and run info heap again. If the total size of the heap is bigger but the number and size of the separate allocations are the same, then it is likely that you have memory allocation problems. If necessary, repeat this check a few more times.

My way of improving the situation was this. After I had done some analysis with HP-UX gdb, I saw that the memory problems were caused by the fact that I used std::vector for storing some types of information from a database. std::vector requires that its data be kept in one contiguous block. I had a few containers based on std::vector, and these containers were regularly recreated. Often new records were added to the database, and after that the containers were recreated. Since the recreated containers were bigger, they did not fit into the available blocks of free memory, and the runtime asked the OS for a new, bigger block. As a result, even though there were no memory leaks, the memory consumption of the process grew. I improved the situation by changing the containers: instead of std::vector I started using std::deque, which allocates memory for its data in a different way.
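The difference can be observed directly. std::deque is typically implemented as a sequence of fixed-size chunks, so growing it does not require one ever-larger contiguous block, and the standard guarantees that push_back leaves references to existing elements valid. A small sketch, with first_element_stays_put as a made-up helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Append records to a deque and report whether the first element is still
// at the same address afterwards. Unlike std::vector, std::deque::push_back
// never relocates elements that are already stored.
bool first_element_stays_put(std::size_t n) {
    std::deque<int> records;
    records.push_back(0);
    const int* first = &records.front();
    for (std::size_t i = 1; i < n; ++i) {
        records.push_back(static_cast<int>(i));
    }
    assert(!records.empty());
    return &records.front() == first;
}
```

The same experiment with std::vector would usually show the first element moving several times, each move being a request for a new and larger contiguous block.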

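One way I know of to avoid memory fragmentation on HP-UX is to use either the Small Block Allocator or MallocNextGen. On RedHat Linux the default allocator seems to handle allocating a lot of small blocks pretty well. On Windows there is the Low-fragmentation Heap, which addresses the problem of a large number of small allocations.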

My understanding is that in an STL-heavy application you first have to identify the problems. Memory allocators (like the one in libc) actually handle the problem of many small allocations, which is typical for std::string (for instance, in my server application there are lots of STL strings, but as I see from running info heap they are not causing any problems). My impression is that you need to avoid frequent large allocations. Unfortunately, there are situations when you can't avoid them and have to change your code. As I said, in my case I improved the situation by switching to std::deque. If you identify your memory fragmentation, it might be possible to talk about it more precisely.

2010-09-22 15:46:53

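Memory fragmentation is most likely to occur when you allocate and deallocate many objects of varying sizes. Suppose you have the following layout in memory: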

obj1 (10kb) | obj2(20kb) | obj3(5kb) | unused space (100kb)

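Now, when obj2 is released, you have 120kb of unused memory, but you cannot allocate a full block of 120kb, because the memory is fragmented.

Common techniques to avoid that effect include ring buffers and object pools. In the context of the STL, methods like std::vector::reserve() can help.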
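For example, when the final size is known in advance, a single reserve() call replaces a series of progressively larger reallocations. A minimal sketch, where fill_without_reallocating is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill a vector with n values after one reserve() call.
// Returns true if the buffer never moved while the elements were appended.
bool fill_without_reallocating(std::vector<int>& out, std::size_t n) {
    out.clear();
    out.reserve(n);                  // at most one allocation, up front
    assert(out.capacity() >= n);
    const int* buffer = out.data();
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i));
    }
    return out.data() == buffer;     // still the same block
}
```

Without the reserve() call, the vector would request and then abandon a series of smaller blocks on its way to the final size, which is the kind of churn that leaves holes behind.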

2010-09-22 15:01:56
