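I have a problem with a Java application running under Linux.

When I launch the application using the default maximum heap size (64 MB), top shows that 240 MB of virtual memory are allocated to it. This causes problems for some other software on the computer, which is relatively resource-limited.

As far as I understand, the reserved virtual memory will never be used anyway, because once the heap limit is reached an OutOfMemoryError is thrown. I ran the same application under Windows, and there the virtual memory size and the heap size are similar.

Is there any way to configure the virtual memory used by a Java process under Linux?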

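Edit 1: The problem is not the heap. The problem is that if I set a heap of 128 MB, for example, Linux still allocates 210 MB of virtual memory, which is never needed.

Edit 2: Using ulimit -v makes it possible to limit the amount of virtual memory. If the limit is set below 204 MB, the application won't run, even though it doesn't need 204 MB, only 64 MB. So I want to understand why Java requires so much virtual memory. Can this be changed?

Edit 3: Several other applications are running on the system, which is embedded. The system does have a virtual memory limit (from the comments, an important detail).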


Current answer

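This has been a long-standing complaint with Java, but it's largely meaningless, and usually based on looking at the wrong information. The usual phrasing is something like "Hello World on Java takes 10 megabytes! Why does it need that?" Well, here's a way to make Hello World on a 64-bit JVM claim to take over 4 gigabytes ... at least by one form of measurement.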

java -Xms1024m -Xmx4096m com.example.Hello

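Different Ways to Measure Memory

On Linux, the top command gives you several different numbers for memory. Here's what it says about the Hello World example: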

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2120 kgregory  20   0 4373m  15m 7152 S    0  0.2   0:00.10 java

VIRT is the virtual memory space: the sum of everything in the virtual memory map (see below). It is largely meaningless, except when it isn't (see below).

RES is the resident set size: the number of pages that are currently resident in RAM. In almost all cases, this is the only number that you should use when saying "too big." But it's still not a very good number, especially when talking about Java.

SHR is the amount of resident memory that is shared with other processes. For a Java process, this is typically limited to shared libraries and memory-mapped JAR files. In this example, I only had one Java process running, so I suspect that the 7152 KiB (about 7 MB) is a result of libraries used by the OS.

SWAP isn't turned on by default, and isn't shown here. It indicates the amount of virtual memory that is currently resident on disk, whether or not it's actually in the swap space. The OS is very good about keeping active pages in RAM, and the only cures for swapping are (1) buy more memory, or (2) reduce the number of processes, so it's best to ignore this number.
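To see the same kind of numbers from inside a process, here is a minimal Java sketch, assuming a Linux system where /proc/self/status is readable. The VmSize and VmRSS fields of that file correspond to what top reports as VIRT and RES. The class and method names (ProcStatus, parseKb) are made up for this illustration and are not part of any library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; the name is not from the original answer.
final class ProcStatus {

    // Returns the value in kB of a "Key:   12345 kB" line, or -1 if the key is absent.
    static long parseKb(String statusText, String key) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith(key + ":")) {
                String[] parts = line.substring(key.length() + 1).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status"); // Linux only
        if (!Files.isReadable(status)) {
            System.out.println("/proc/self/status is not available on this system");
            return;
        }
        String text = new String(Files.readAllBytes(status));
        System.out.println("VmSize (top's VIRT): " + parseKb(text, "VmSize") + " kB");
        System.out.println("VmRSS  (top's RES):  " + parseKb(text, "VmRSS") + " kB");
    }
}
```

Running it with different -Xmx values should change the VmSize line much more than the VmRSS line, which is the point the answer is making.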

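The situation for Windows Task Manager is a bit more complicated. Under Windows XP, there are "Memory Usage" and "Virtual Memory Size" columns, but the official documentation is silent on what they mean. Windows Vista and Windows 7 add more columns, and they're actually documented. Of these, the "Working Set" measurement is the most useful; it roughly corresponds to the sum of RES and SHR on Linux.

Understanding the Virtual Memory Map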

The virtual memory consumed by a process is the total of everything that's in the process memory map. This includes data (eg, the Java heap), but also all of the shared libraries and memory-mapped files used by the program. On Linux, you can use the pmap command to see all of the things mapped into the process space (from here on out I'm only going to refer to Linux, because it's what I use; I'm sure there are equivalent tools for Windows). Here's an excerpt from the memory map of the "Hello World" program; the entire memory map is over 100 lines long, and it's not unusual to have a thousand-line list.

0000000040000000     36K r-x--  /usr/local/java/jdk-1.6-x64/bin/java
0000000040108000      8K rwx--  /usr/local/java/jdk-1.6-x64/bin/java
0000000040eba000    676K rwx--    [ anon ]
00000006fae00000  21248K rwx--    [ anon ]
00000006fc2c0000  62720K rwx--    [ anon ]
0000000700000000 699072K rwx--    [ anon ]
000000072aab0000 2097152K rwx--    [ anon ]
00000007aaab0000 349504K rwx--    [ anon ]
00000007c0000000 1048576K rwx--    [ anon ]
...
00007fa1ed00d000   1652K r-xs-  /usr/local/java/jdk-1.6-x64/jre/lib/rt.jar
...
00007fa1ed1d3000   1024K rwx--    [ anon ]
00007fa1ed2d3000      4K -----    [ anon ]
00007fa1ed2d4000   1024K rwx--    [ anon ]
00007fa1ed3d4000      4K -----    [ anon ]
...
00007fa1f20d3000    164K r-x--  /usr/local/java/jdk-1.6-x64/jre/lib/amd64/libjava.so
00007fa1f20fc000   1020K -----  /usr/local/java/jdk-1.6-x64/jre/lib/amd64/libjava.so
00007fa1f21fb000     28K rwx--  /usr/local/java/jdk-1.6-x64/jre/lib/amd64/libjava.so
...
00007fa1f34aa000   1576K r-x--  /lib/x86_64-linux-gnu/libc-2.13.so
00007fa1f3634000   2044K -----  /lib/x86_64-linux-gnu/libc-2.13.so
00007fa1f3833000     16K r-x--  /lib/x86_64-linux-gnu/libc-2.13.so
00007fa1f3837000      4K rwx--  /lib/x86_64-linux-gnu/libc-2.13.so
...

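A quick explanation of the format: each row starts with the virtual memory address of the segment. This is followed by the segment size, permissions, and the source of the segment. This last item is either a file or "anon", which indicates a block of memory allocated via mmap.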
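As a rough illustration of that row format, the following Java sketch splits pmap-style lines into address, size, permissions, and source, and totals the sizes per source. It assumes sizes always end in "K", as in the excerpt above. The class and method names (PmapSummary, sumBySource) are invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration only.
final class PmapSummary {

    // Sums segment sizes (in KB) per source: a file path, or "[ anon ]".
    static Map<String, Long> sumBySource(String pmapOutput) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (String line : pmapOutput.split("\n")) {
            // Fields: address, size, permissions, then the source (which may contain spaces).
            String[] f = line.trim().split("\\s+", 4);
            if (f.length < 4 || !f[1].endsWith("K")) {
                continue; // skip "..." and anything that is not a segment line
            }
            long kb = Long.parseLong(f[1].substring(0, f[1].length() - 1));
            totals.merge(f[3], kb, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // A few rows copied from the excerpt above.
        String excerpt =
              "0000000040000000     36K r-x--  /usr/local/java/jdk-1.6-x64/bin/java\n"
            + "0000000040108000      8K rwx--  /usr/local/java/jdk-1.6-x64/bin/java\n"
            + "00000007c0000000 1048576K rwx--    [ anon ]\n"
            + "...\n"
            + "00007fa1ed1d3000   1024K rwx--    [ anon ]\n"
            + "00007fa1ed2d3000      4K -----    [ anon ]\n";
        sumBySource(excerpt).forEach(
            (source, kb) -> System.out.println(kb + "K  " + source));
    }
}
```

Feeding it the full output of pmap for a JVM process makes it easy to see how much of the total comes from anon blocks (heap, thread stacks) versus mapped files.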

Starting from the top, we have

The JVM loader (i.e., the program that gets run when you type java). This is very small; all it does is load in the shared libraries where the real JVM code is stored.

A bunch of anon blocks holding the Java heap and internal data. This is a Sun JVM, so the heap is broken into multiple generations, each of which is its own memory block. Note that the JVM allocates virtual memory space based on the -Xmx value; this allows it to have a contiguous heap. The -Xms value is used internally to say how much of the heap is "in use" when the program starts, and to trigger garbage collection as that limit is approached.

A memory-mapped JAR file, in this case the file that holds the "JDK classes." When you memory-map a JAR, you can access the files within it very efficiently (versus reading it from the start each time). The Sun JVM will memory-map all JARs on the classpath; if your application code needs to access a JAR, you can also memory-map it.

Per-thread data for two threads. The 1M block is the thread stack. I didn't have a good explanation for the 4k block, but @ericsoe identified it as a "guard block": it does not have read/write permissions, so it will cause a segmentation fault if accessed, and the JVM catches that and translates it to a StackOverflowError. For a real app, you will see dozens if not hundreds of these entries repeated through the memory map.

One of the shared libraries that holds the actual JVM code. There are several of these.

The shared library for the C standard library. This is just one of many things that the JVM loads that are not strictly part of Java.

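Shared libraries are particularly interesting: each shared library has at least two segments: a read-only segment containing the library code, and a read-write segment that contains global per-process data for the library (I don't know what the segment with no permissions is; I've only seen it on x64 Linux). The read-only portion of the library can be shared between all processes that use the library; for example, libc has 1.5M of virtual memory space that can be shared.

When is Virtual Memory Size Important?

The virtual memory map contains a lot of stuff. Some of it is read-only, some of it is shared, and some of it is allocated but never touched (e.g., almost all of the 4 GB of heap in this example). But the operating system is smart enough to only load what it needs, so the virtual memory size is largely irrelevant.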
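The gap between heap that is reserved and heap that is actually touched can be observed from inside the JVM. The sketch below uses the standard java.lang.Runtime methods; how closely these figures line up with the anon blocks in pmap depends on the JVM version and garbage collector, so treat the mapping as an assumption. The class name HeapNumbers is made up for this example.

```java
// Hypothetical class for illustration; only java.lang.Runtime is used.
final class HeapNumbers {

    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();         // ceiling the heap may grow to (roughly -Xmx)
        long committed = rt.totalMemory(); // heap the JVM has claimed so far
        long used = committed - rt.freeMemory(); // part of that occupied by objects

        System.out.println("max heap:       " + toMb(max) + " MB");
        System.out.println("committed heap: " + toMb(committed) + " MB");
        System.out.println("used heap:      " + toMb(used) + " MB");
    }
}
```

Started with the -Xms1024m -Xmx4096m options from the beginning of this answer, a Hello World style program should report a max heap in the gigabytes and a used heap of only a few megabytes.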

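Where virtual memory size is important is if you're running on a 32-bit operating system, where you can only allocate 2 GB (or, in some cases, 3 GB) of process address space. In that case you're dealing with a scarce resource, and might have to make tradeoffs, such as reducing your heap size in order to memory-map a large file or create lots of threads.

But, given that 64-bit machines are ubiquitous, I don't think it will be long before virtual memory size is a completely irrelevant statistic.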

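When is Resident Set Size Important?

Resident set size is the portion of the virtual memory space that is actually in RAM. If your RSS grows to be a significant portion of your total physical memory, it might be time to start worrying. If your RSS grows to take up all your physical memory, and your system starts swapping, it's well past time to start worrying.

But RSS is also misleading, especially on a lightly loaded machine. The operating system doesn't expend a lot of effort to reclaim the pages used by a process. There's little benefit to be gained by doing so, and there's the potential for an expensive page fault if the process touches the page in the future. As a result, the RSS statistic may include lots of pages that aren't in active use.

Bottom Line

Unless you're swapping, don't get overly concerned about what the various memory statistics are telling you. The one caveat is that an ever-growing RSS may indicate some sort of memory leak.

With a Java program, it's far more important to pay attention to what's happening in the heap. The total amount of space consumed is important, and there are some steps that you can take to reduce that. More important is the amount of time that you spend in garbage collection, and which parts of the heap are getting collected.

Accessing the disk (i.e., a database) is expensive, and memory is cheap. If you can trade one for the other, do so.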

Other answers

Sun's Java 1.4 has the following arguments to control memory size:

-Xmsn
Specify the initial size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 greater than 1MB. Append the letter k or K to indicate kilobytes, or m or M to indicate megabytes. The default value is 2MB. Examples:
-Xms6291456
-Xms6144k
-Xms6m

-Xmxn
Specify the maximum size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 greater than 2MB. Append the letter k or K to indicate kilobytes, or m or M to indicate megabytes. The default value is 64MB. Examples:
-Xmx83886080
-Xmx81920k
-Xmx80m

http://java.sun.com/j2se/1.4.2/docs/tooldocs/windows/java.html
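The three example values for each option are the same size written with different suffixes. A small Java sketch of the conversion, covering only the plain, k/K, and m/M forms described above (the class and method names are invented for this example, and this is not how the JVM itself parses its options):

```java
// Hypothetical helper for illustration; handles only the plain, k/K and m/M forms.
final class HeapSizeOption {

    // Converts the size part of -Xms/-Xmx (e.g. "80m", "81920k", "83886080") to bytes.
    // Expects a non-empty string; anything else throws an exception.
    static long toBytes(String size) {
        char last = Character.toLowerCase(size.charAt(size.length() - 1));
        String digits = size.substring(0, size.length() - 1);
        if (last == 'k') {
            return Long.parseLong(digits) * 1024L;
        }
        if (last == 'm') {
            return Long.parseLong(digits) * 1024L * 1024L;
        }
        return Long.parseLong(size); // no suffix: already in bytes
    }

    public static void main(String[] args) {
        String[] options = {"-Xmx83886080", "-Xmx81920k", "-Xmx80m"};
        for (String opt : options) {
            // Strip the 4-character "-Xmx" prefix before converting.
            System.out.println(opt + " -> " + toBytes(opt.substring(4)) + " bytes");
        }
    }
}
```

All three options resolve to 83886080 bytes, which is 80 MB.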

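Java 5 and 6 have some more. See http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp

There is no way to configure the amount of memory the VM itself needs. However, note that this is virtual memory, not resident memory, so it just stays there without harm if it is not actually used.

Alternatively, you can try a JVM other than Sun's, with a smaller memory footprint, but I can't recommend one here.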

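The amount of memory allocated for the Java process is pretty much on par with what I would expect. I've had similar problems running Java on embedded/memory-limited systems. Running any application with arbitrary VM limits, or on systems that don't have adequate amounts of swap, tends to break. It seems to be the nature of many modern apps that aren't designed for use on resource-limited systems.

You have a few more options you can try to limit your JVM's memory footprint. This might reduce the virtual memory footprint:

-XX:ReservedCodeCacheSize=32m
Reserved code cache size (in bytes), the maximum code cache size. [Solaris 64-bit, -server x86: 48m; in 1.5.0_06 and earlier, Solaris 64-bit and amd64: 1024m.]

-XX:MaxPermSize=64m
Size of the permanent generation. [5.0 and newer: 64-bit VMs are scaled 30% larger; 1.4 amd64: 96m; 1.3.1 -client: 32m.]

Also, you should set -Xmx (max heap size) to a value as close as possible to the actual peak memory usage of your application. I believe the default behavior of the JVM is still to double the heap size each time it expands it up to the max. If you start with a 32M heap and your app peaks at 65M, then the heap would end up growing 32M -> 64M -> 128M.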
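The doubling described above can be written out as a few lines of Java. This only models the behavior as the answer states it (the answerer is not sure it is still the default), not what any particular JVM actually does; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the "double on each expansion" claim; not real JVM logic.
final class HeapDoubling {

    // Returns the sequence of heap sizes (in MB) reached while doubling
    // from initialMb until the size covers peakMb.
    static List<Integer> growthSteps(int initialMb, int peakMb) {
        if (initialMb <= 0) {
            throw new IllegalArgumentException("initialMb must be positive");
        }
        List<Integer> steps = new ArrayList<>();
        int size = initialMb;
        steps.add(size);
        while (size < peakMb) {
            size *= 2;
            steps.add(size);
        }
        return steps;
    }

    public static void main(String[] args) {
        // The example from the text: a 32M start and a 65M peak.
        System.out.println(growthSteps(32, 65)); // prints [32, 64, 128]
    }
}
```

Under this model a peak of 65M ends with a 128M heap, almost twice what the application needs, which is why setting -Xmx close to the real peak matters.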

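You might also try this to make the VM less aggressive about growing the heap:

-XX:MinHeapFreeRatio=40
Minimum percentage of heap free after GC to avoid expansion.

Also, from what I recall from experimenting with this a few years ago, the number of native libraries loaded had a huge impact on the minimum footprint. Loading java.net.Socket added more than 15M if I recall correctly (and I probably don't).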

There is a known problem with Java and glibc >= 2.10 (includes Ubuntu >= 10.04, RHEL >= 6).

The cure is to set this environment variable:

export MALLOC_ARENA_MAX=4

If you are running Tomcat, you can add this to the TOMCAT_HOME/bin/setenv.sh file.

For Docker, add this to the Dockerfile:

ENV MALLOC_ARENA_MAX=4

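There is an IBM article about setting MALLOC_ARENA_MAX: https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en

This blog post says

resident memory has been known to creep in a manner similar to a memory leak or memory fragmentation.

There is also an open JDK bug, JDK-8193521, "glibc wastes memory with default configuration".

Search for MALLOC_ARENA_MAX on Google or SO for more references.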
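To get a feel for why capping the arena count matters, here is a back-of-the-envelope Java sketch. It rests on two assumptions about 64-bit glibc that are not stated in this answer: that the default arena limit is 8 per CPU core, and that each arena reserves address space in 64 MB heaps. It counts one heap per arena, so it is a rough figure rather than a bound. The class and method names are invented.

```java
// Rough estimate only; the two constants are assumptions about 64-bit glibc.
final class ArenaEstimate {

    static final long HEAP_MB_PER_ARENA = 64; // assumption: size of one arena heap
    static final int ARENAS_PER_CORE = 8;     // assumption: default limit per core

    // Address space (MB) reserved if every allowed arena maps one heap.
    // mallocArenaMax is the MALLOC_ARENA_MAX value, or null if it is unset.
    static long roughReservedMb(int cores, Integer mallocArenaMax) {
        long arenas;
        if (mallocArenaMax != null && mallocArenaMax > 0) {
            arenas = mallocArenaMax;
        } else {
            arenas = (long) ARENAS_PER_CORE * cores;
        }
        return arenas * HEAP_MB_PER_ARENA;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        Integer limit = null;
        String env = System.getenv("MALLOC_ARENA_MAX");
        if (env != null) {
            try {
                limit = Integer.valueOf(env.trim());
            } catch (NumberFormatException e) {
                System.out.println("MALLOC_ARENA_MAX is not a number: " + env);
            }
        }
        System.out.println("cores: " + cores + ", MALLOC_ARENA_MAX: " + limit);
        System.out.println("rough arena reservation: "
                + roughReservedMb(cores, limit) + " MB");
    }
}
```

Under these assumptions a 4-core machine could reserve about 2048 MB of address space for malloc arenas by default, against 256 MB with MALLOC_ARENA_MAX=4.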

You might also want to tune other malloc options to optimize for low fragmentation of allocated memory:

# tune glibc memory allocation, optimize for low fragmentation
# limit the number of arenas
export MALLOC_ARENA_MAX=2
# disable dynamic mmap threshold, see M_MMAP_THRESHOLD in "man mallopt"
export MALLOC_MMAP_THRESHOLD_=131072
export MALLOC_TRIM_THRESHOLD_=131072
export MALLOC_TOP_PAD_=131072
export MALLOC_MMAP_MAX_=65536