抛弃std::allocator以支持自定义解决方案的一些真正好的理由是什么?您是否遇到过这样的情况:它对于正确性、性能、可伸缩性等来说是绝对必要的?有什么聪明的例子吗?

自定义分配器一直是标准库的一个特性,但我并不太需要它。我只是想知道是否有人能提供一些令人信服的例子来证明他们的存在。


当前回答

一种基本情况:当编写必须跨模块(EXE/DLL)边界工作的代码时,必须保持分配和删除只发生在一个模块中。

我在Windows上的插件架构中遇到了这种情况。例如,如果你跨DLL边界传递一个std::string,任何字符串的重新分配都发生在它起源的堆中,而不是在DLL中的堆中,这可能是不同的*。

*实际上比这更复杂,如果你动态链接到CRT,这可能会工作。但是,如果每个DLL都有一个到CRT的静态链接,那么您将陷入痛苦的世界,在那里幻影分配错误不断发生。

其他回答

我正在使用一个自定义分配器来计算程序的一部分中的分配/释放的数量,并测量它需要多长时间。还有其他方法可以达到这个目的,但这个方法对我来说非常方便。特别有用的是,我只能对容器的一个子集使用自定义分配器。

One example of I time I have used these was working with very resource constrained embedded systems. Lets say you have 2k of ram free and your program has to use some of that memory. You need to store say 4-5 sequences somewhere that's not on the stack and additionally you need to have very precise access over where these things get stored, this is a situation where you might want to write your own allocator. The default implementations can fragment the memory, this might be unacceptable if you don't have enough memory and cannot restart your program.

One project I was working on was using AVR-GCC on some low powered chips. We had to store 8 sequences of variable length but with a known maximum. The standard library implementation of the memory management is a thin wrapper around malloc/free which keeps track of where to place items with by prepending every allocated block of memory with a pointer to just past the end of that allocated piece of memory. When allocating a new piece of memory the standard allocator has to walk over each of the pieces of memory to find the next block that is available where the requested size of memory will fit. On a desktop platform this would be very fast for this few items but you have to keep in mind that some of these microcontrollers are very slow and primitive in comparison. Additionally the memory fragmentation issue was a massive problem that meant we really had no choice but to take a different approach.

So what we did was to implement our own memory pool. Each block of memory was big enough to fit the largest sequence we would need in it. This allocated fixed sized blocks of memory ahead of time and marked which blocks of memory were currently in use. We did this by keeping one 8 bit integer where each bit represented if a certain block was used. We traded off memory usage here for attempting to make the whole process faster, which in our case was justified as we were pushing this microcontroller chip close to it's maximum processing capacity.

在嵌入式系统上下文中,我还可以看到编写自己的自定义分配器的其他情况,例如,如果序列的内存不在主ram中,而在这些平台上可能经常出现这种情况。

这里我使用的是自定义分配器;您甚至可以说它是为了绕过其他自定义动态内存管理。

背景:我们有malloc, calloc, free的重载,以及操作符new和delete的各种变体,并且链接器很高兴地让STL为我们使用这些。这让我们可以做一些事情,如自动小对象池,泄漏检测,分配填充,自由填充,填充分配与哨兵,缓存线对齐某些分配,和延迟释放。

问题是,我们正在一个嵌入式环境中运行——没有足够的内存来在一段较长的时间内正确地进行泄漏检测。至少,不是在标准RAM中——通过自定义分配函数,在其他地方还有另一堆RAM可用。

解决方案:编写一个使用扩展堆的自定义分配器,并且只在内存泄漏跟踪体系结构的内部使用它……其他所有内容默认为执行泄漏跟踪的普通新建/删除重载。这避免了跟踪器跟踪本身(并且提供了一些额外的打包功能,我们知道跟踪器节点的大小)。

出于同样的原因,我们也使用它来保存功能成本分析数据;为每个函数调用和返回编写一个条目,以及线程切换,成本会很快增加。自定义分配器再次在较大的调试内存区域中为我们提供较小的分配。

当使用gpu或其他协处理器时,以特殊的方式在主存中分配数据结构有时是有益的。这种特殊的内存分配方式可以在自定义分配器中以一种方便的方式实现。

在使用加速器时,通过加速器运行时进行自定义分配是有益的,原因如下:

through custom allocation the accelerator runtime or driver is notified of the memory block in addition the operating system can make sure that the allocated block of memory is page-locked (some call this pinned memory), that is, the virtual memory subsystem of the operating system may not move or remove the page within or from memory if 1. and 2. hold and a data transfer between a page-locked memory block and an accelerator is requested, the runtime can directly access the data in main memory since it knows where it is and it can be sure the operating system did not move/remove it this saves one memory copy that would occur with memory that was allocated in a non-page-locked way: the data has to be copied in main memory to a page-locked staging area from with the accelerator can initialize the data transfer (through DMA)

前一段时间我发现这个解决方案对我非常有用:STL容器的快速c++ 11分配器。它略微加快了VS2017上的STL容器(~5倍)以及GCC上的STL容器(~7倍)。它是一种基于内存池的特殊用途的分配器。它可以与STL容器一起使用,这多亏了您所要求的机制。