We all know that String is immutable in Java, but take a look at the following code:
String s1 = "Hello World";
String s2 = "Hello World";
String s3 = s1.substring(6);
System.out.println(s1); // Hello World
System.out.println(s2); // Hello World
System.out.println(s3); // World
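// Needs import java.lang.reflect.Field; the cast below assumes a char[]-backed String (Java 8 and earlier)
Field field = String.class.getDeclaredField("value");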
field.setAccessible(true);
char[] value = (char[])field.get(s1);
value[6] = 'J';
value[7] = 'a';
value[8] = 'v';
value[9] = 'a';
value[10] = '!';
System.out.println(s1); // Hello Java!
System.out.println(s2); // Hello Java!
System.out.println(s3); // World
Why does this program behave like this? And why do the values of s1 and s2 change, while the value of s3 does not?
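Visibility modifiers and final (i.e. immutability) are not a defence against malicious code in Java; they are merely tools to protect against mistakes and to make the code more maintainable (one of the big selling points of the system). That is why you can access internal implementation details, such as the backing char array of a String, via reflection.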
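To see that this is not specific to String, here is a minimal sketch using a made-up class. Label and ReflectionBypass are hypothetical names invented for this example; the only real API used is java.lang.reflect.Field. The array is private and final, yet reflection can still reach it and change its contents.

```java
import java.lang.reflect.Field;

// Hypothetical class, invented for this example: the array is private and final.
class Label {
    private final char[] chars;

    Label(String text) {
        this.chars = text.toCharArray();
    }

    @Override
    public String toString() {
        return new String(chars);
    }
}

class ReflectionBypass {
    // Reads the private array through reflection and overwrites its first element.
    static String tamper(Label label) {
        try {
            Field field = Label.class.getDeclaredField("chars");
            field.setAccessible(true);                // switches off the access check
            char[] chars = (char[]) field.get(label); // the same array the object uses
            chars[0] = 'J';                           // no public method allows this
            return label.toString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tamper(new Label("Hello"))); // Jello
    }
}
```

Note that the final field itself is never reassigned; only the contents of the array it refers to are changed, which is exactly what the code in the question does to the String.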
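The second effect you see is that all the Strings change, although it looks as if you only changed s1. It is a specific property of Java String literals that they are automatically interned, i.e. cached. Two String literals with the same value are actually the same object. If you create a String with new, it is not interned automatically and you will not see this effect.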
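The interning behaviour can be checked without any reflection, simply by comparing references with ==. A small sketch (InternDemo and report are names made up for this example):

```java
class InternDemo {
    static String report() {
        String a = "Hello World";
        String b = "Hello World";             // same literal, so the same interned object
        String c = new String("Hello World"); // explicitly a new object, not interned

        return (a == b) + " "                 // literals share one object
             + (a == c) + " "                 // new String(...) is a different object
             + a.equals(c) + " "              // but the contents are equal
             + (a == c.intern());             // intern() returns the pooled object
    }

    public static void main(String[] args) {
        System.out.println(report()); // true false true true
    }
}
```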
#substring until recently (Java 7u6) worked in a similar way, which would have explained the behaviour in the original version of your question. It didn't create a new backing char array but reused the one from the original String; it just created a new String object that used an offset and a length to present only a part of that array. This generally worked as Strings are immutable - unless you circumvent that. This property of #substring also meant that the whole original String couldn't be garbage collected when a shorter substring created from it still existed.
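As a rough illustration of that older layout, here is a toy model. SharedText is a hypothetical class written for this answer and is not the JDK source; the field names value, offset and count are assumptions chosen to describe the idea. The overwrite method stands in for the reflective write from the question.

```java
// Hypothetical toy class that models array sharing between a string and its substring.
final class SharedText {
    private final char[] value; // backing array, possibly shared with other instances
    private final int offset;   // index of the first visible char
    private final int count;    // number of visible chars

    SharedText(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedText(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Reuses the same array and only moves the window; nothing is copied.
    SharedText substring(int beginIndex) {
        return new SharedText(value, offset + beginIndex, count - beginIndex);
    }

    // Stand-in for the reflective write: changes one char of the shared array.
    void overwrite(int index, char c) {
        value[offset + index] = c;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}

class SharedTextDemo {
    static String run() {
        SharedText s1 = new SharedText("Hello World");
        SharedText s3 = s1.substring(6);      // window onto the same array
        String replacement = "Java!";
        for (int i = 0; i < replacement.length(); i++) {
            s1.overwrite(6 + i, replacement.charAt(i));
        }
        return s1 + " / " + s3;
    }

    public static void main(String[] args) {
        System.out.println(run()); // Hello Java! / Java!
    }
}
```

In this model the substring changes together with the original because both look at one array. It also shows the garbage collection point: as long as s3 is reachable, the whole 11-char array stays alive even though s3 only presents 5 chars of it.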
In current Java and in the current version of your question, #substring shows no strange behaviour.
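According to the concept of string pooling, all String literals with the same value refer to the same memory location. So s1 and s2, which both hold the literal "Hello World", point to the same memory location (say M1).
On the other hand, s3 holds "World", so it points to a different memory allocation (say M2).
What happens now is that the value of s1 is changed (through the char[] value). So the value at memory location M1, which both s1 and s2 point to, is changed.
As a result, memory location M1 is modified, which changes the values of both s1 and s2.
But the value at location M2 remains unaltered, so s3 still contains the same original value.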
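The M1/M2 picture can be sketched with plain arrays, no reflection needed. This is only an analogy: the variable names viaS1, viaS2, m1 and m2 are invented here to mirror the explanation above, and the arrays stand in for the memory behind the strings.

```java
import java.util.Arrays;

class AliasDemo {
    static String run() {
        char[] m1 = "Hello World".toCharArray();   // memory location M1
        char[] viaS1 = m1;                         // s1 refers to M1
        char[] viaS2 = m1;                         // s2 refers to the very same M1
        char[] m2 = Arrays.copyOfRange(m1, 6, 11); // separate location M2 holding "World"

        "Java!".getChars(0, 5, viaS1, 6);          // write through one reference only

        return new String(viaS1) + " | " + new String(viaS2) + " | " + new String(m2);
    }

    public static void main(String[] args) {
        System.out.println(run()); // Hello Java! | Hello Java! | World
    }
}
```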
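[Disclaimer: this is a deliberately opinionated style of answer, as I feel a more "don't do this at home, kids" answer is warranted]
The sin is the line field.setAccessible(true); which says to violate the public API by allowing access to a private field. That is a giant security hole, which can be locked down by configuring a security manager.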
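Whether that line is actually permitted depends on the runtime. A security manager that withholds the required permission makes it throw SecurityException, and since JDK 16 the JDK's own internals are strongly encapsulated by default, so the call throws InaccessibleObjectException unless java.lang is opened with --add-opens. A small probe, under those assumptions (AccessProbe is a name made up for this example, and the result is deliberately not hard-coded because it varies by Java version and launch flags):

```java
import java.lang.reflect.Field;

class AccessProbe {
    // Reports whether this runtime lets reflection open String's private "value" field.
    static String probe() {
        try {
            Field field = String.class.getDeclaredField("value");
            field.setAccessible(true);
            return "allowed";
        } catch (NoSuchFieldException | RuntimeException e) {
            // SecurityException and InaccessibleObjectException are both RuntimeExceptions.
            return "denied: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe()); // result depends on the Java version and launch flags
    }
}
```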
The phenomena in the question are implementation details which you would never see without using that dangerous line of code to violate the access modifiers via reflection. Clearly two (normally) immutable strings can share the same char array. Whether a substring shares the same array depends on whether it can and whether the developer thought to share it. Normally these are invisible implementation details which you should not have to know about, unless you shoot the access modifier through the head with that line of code.
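It is not a good idea to rely on such details, since they cannot be experienced at all without violating the access modifiers using reflection. The owner of the class only supports the normal public API and is free to make implementation changes in the future.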
Having said all that, the line of code is really very useful when you have a gun held to your head forcing you to do such dangerous things. Using that back door is usually a code smell telling you that you need to upgrade to better library code where you don't have to sin. Another common use of that dangerous line of code is to write a "voodoo framework" (ORM, injection container, ...). Many folks get religious about such frameworks (both for and against them), so I will avoid inviting a flame war by saying nothing other than that the vast majority of programmers don't have to go there.