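There are invisible characters here that alter how the code is displayed. In IntelliJ, you can find them by copy-pasting the code into an empty string (""), which replaces them with Unicode escapes, removing their effect and revealing the order the compiler sees.
Here is the output of that copy-paste: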
"class M\u202E{public static void main(String[]a\u202D){System.out.print(new char[]\n"+
"{'H','e','l','l','o',' ','W','o','r','l','d','!'});}} "
The source code characters are stored in this order, and the compiler processes them in this order, but they are displayed differently.
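If you don't have IntelliJ at hand, the same kind of escaping can be approximated in plain Java. The following is only a minimal sketch, not what IntelliJ actually does: the class name BidiEscaper and the method escapeFormatChars are made up for illustration, and it assumes that treating every character of Unicode general category Cf (format) as "invisible" is good enough. Both \u202E and \u202D fall into that category.

```java
// Hypothetical helper for illustration; not part of IntelliJ or the JDK.
final class BidiEscaper {

    // Replaces each format character (general category Cf) with a
    // backslash-u escape so it becomes visible and loses its display effect.
    // Assumption: only BMP characters are handled; supplementary code points
    // are copied through unchanged.
    static String escapeFormatChars(String source) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (Character.getType(c) == Character.FORMAT) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escapes in this literal become the real invisible characters
        // at compile time, so the string mimics the tricky source line.
        String tricky = "class M\u202E{public static void main(String[]a\u202D){";
        System.out.println(escapeFormatChars(tricky));
    }
}
```

Running main should print the line with the two override characters spelled out as escapes, much like the pasted string above.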
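Note the \u202E character, which is a right-to-left override: it starts a block in which all characters are forced to display right to left. The \u202D is a left-to-right override: it starts a nested block in which all characters are forced into left-to-right order, overriding the first override.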
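To check which role a given character plays, Java's own Character.getDirectionality can be asked directly. This is a small sketch under the assumption that you only care about the two overrides and the pop character that ends them; the class name BidiOverrideCheck and the method describe are hypothetical.

```java
// Hypothetical helper for illustration only.
final class BidiOverrideCheck {

    // Maps a character to a short label based on its bidi class as
    // reported by the JDK.
    static String describe(char c) {
        switch (Character.getDirectionality(c)) {
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE:
                return "RLO";   // forces right-to-left display
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE:
                return "LRO";   // forces left-to-right display
            case Character.DIRECTIONALITY_POP_DIRECTIONAL_FORMAT:
                return "PDF";   // ends the most recent override or embedding
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe('\u202E')); // RLO
        System.out.println(describe('\u202D')); // LRO
        System.out.println(describe('A'));      // other
    }
}
```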
Ergo, when it displays the original code, class M is displayed normally, but the \u202E reverses the display order of everything from there to the \u202D, which reverses everything again. (Formally, everything from the \u202D to the line terminator gets reversed twice, once due to the \u202D and once with the rest of the text reversed due to the \u202E, which is why this text shows up in the middle of the line instead of the end.) The next line's directionality is handled independently of the first's due to the line terminator, so {'H','e','l','l','o',' ','W','o','r','l','d','!'});}} is displayed normally.
For the full Unicode Bidirectional Algorithm (which is extremely complex and runs to dozens of pages), see Unicode Standard Annex #9.