Python的“虚拟机”似乎很少读到,而在Java中“虚拟机”一直被使用。
两者都解释字节码;为什么一个叫虚拟机,另一个叫解释器?
Python的“虚拟机”似乎很少读到,而在Java中“虚拟机”一直被使用。
两者都解释字节码;为什么一个叫虚拟机,另一个叫解释器?
当前回答
A virtual machine is a virtual computing environment with a specific set of atomic well defined instructions that are supported independent of any specific language and it is generally thought of as a sandbox unto itself. The VM is analogous to an instruction set of a specific CPU and tends to work at a more fundamental level with very basic building blocks of such instructions (or byte codes) that are independent of the next. An instruction executes deterministically based only on the current state of the virtual machine and does not depend on information elsewhere in the instruction stream at that point in time.
另一方面,解释器更复杂,因为它是为解析特定语言和特定语法的某些语法流而定制的,这些语法必须在周围标记的上下文中进行解码。您不能单独地查看每个字节甚至每一行,然后确切地知道下一步该做什么。语言中的令牌不能像相对于VM的指令(字节码)那样孤立地获取。
Java编译器将Java语言转换为字节码流,与C编译器将C语言程序转换为汇编代码没有什么不同。另一方面,解释器并不真正将程序转换为任何定义良好的中间形式,它只是将程序操作作为解释源代码的过程。
VM和解释器区别的另一个测试是你是否认为它是独立于语言的。我们所知道的Java虚拟机并不是Java特有的。您可以使用其他语言制作编译器,生成可以在JVM上运行的字节代码。另一方面,我不认为我们真的会考虑将Python以外的其他语言“编译”为Python以供Python解释器解释。
Because of the sophistication of the interpretation process, this can be a relatively slow process....specifically parsing and identifying the language tokens, etc. and understanding the context of the source to be able to undertake the execution process within the interpreter. To help accelerate such interpreted languages, this is where we can define intermediate forms of pre-parsed, pre-tokenized source code that is more readily directly interpreted. This sort of binary form is still interpreted at execution time, it is just starting from a much less human readable form to improve performance. However, the logic executing that form is not a virtual machine, because those codes still can't be taken in isolation - the context of the surrounding tokens still matter, they are just now in a different more computer efficient form.
其他回答
他们之间没有真正的区别,人们只是遵循创造者选择的惯例。
HotSpot运行时被称为虚拟机,而CPython仅仅被称为解释器,这可能是有原因的
首先,CPython只是普通的、基于栈的字节码解释器。你向它输入Python操作码,CPython内部的软件堆栈机器就会计算你的代码,就像普通的解释器一样。
The Java HotSpot runtime is different. First and foremost, Java has 3 Just-in Time Compilers, C1, C2, and an experimental one that isn't in use yet. But that's not the main reason. The Interpreter inside the JVM is a very special kind of Interpreter called a Template Interpreter. Instead of just executing bytecode directly in a massive opcode switch case statement like CPython (And really almost every other interpreter does) does, the Template Interpreter inside the JVM contains an enormous arraylist. What does it contain? Key-value pairs of bytecodes and native CPU instructions! The arraylist is empty on startup and is filled with mappings of bytecodes pointing to native machine language to be directly run on the hardware just before your application starts up, what this means is that the "Interpreter" inside the JVM isn't actually an Interpreter at all- It's actually a discount Compiler! When Java bytecode is run, the "Interpreter" simply maps the input bytecode directly to native machine language and executes the native mapping directly, rather than implementing it in software. I'm not exactly sure why the JVM was made this way, but I suspect it was to easily execute "Interpreted" Code together with JIT Compiled Code seamlessly, and for speed/performance. If you pitted the JVM without JIT against CPython or most other interpreters it would still probably come out ahead of them, in virtue of its ingenious design which to my knowledge no other language has used before.
A virtual machine is a virtual computing environment with a specific set of atomic well defined instructions that are supported independent of any specific language and it is generally thought of as a sandbox unto itself. The VM is analogous to an instruction set of a specific CPU and tends to work at a more fundamental level with very basic building blocks of such instructions (or byte codes) that are independent of the next. An instruction executes deterministically based only on the current state of the virtual machine and does not depend on information elsewhere in the instruction stream at that point in time.
另一方面,解释器更复杂,因为它是为解析特定语言和特定语法的某些语法流而定制的,这些语法必须在周围标记的上下文中进行解码。您不能单独地查看每个字节甚至每一行,然后确切地知道下一步该做什么。语言中的令牌不能像相对于VM的指令(字节码)那样孤立地获取。
Java编译器将Java语言转换为字节码流,与C编译器将C语言程序转换为汇编代码没有什么不同。另一方面,解释器并不真正将程序转换为任何定义良好的中间形式,它只是将程序操作作为解释源代码的过程。
VM和解释器区别的另一个测试是你是否认为它是独立于语言的。我们所知道的Java虚拟机并不是Java特有的。您可以使用其他语言制作编译器,生成可以在JVM上运行的字节代码。另一方面,我不认为我们真的会考虑将Python以外的其他语言“编译”为Python以供Python解释器解释。
Because of the sophistication of the interpretation process, this can be a relatively slow process....specifically parsing and identifying the language tokens, etc. and understanding the context of the source to be able to undertake the execution process within the interpreter. To help accelerate such interpreted languages, this is where we can define intermediate forms of pre-parsed, pre-tokenized source code that is more readily directly interpreted. This sort of binary form is still interpreted at execution time, it is just starting from a much less human readable form to improve performance. However, the logic executing that form is not a virtual machine, because those codes still can't be taken in isolation - the context of the surrounding tokens still matter, they are just now in a different more computer efficient form.
我认为两者之间的界限是模糊的,人们大多争论的是“解释器”这个词的含义,以及语言与“解释器……编译器”范围的每一方有多接近。然而,没有一个是100%的。我认为编写Java或Python实现是很容易的,这是频谱的任何价值。
目前Java和Python都有虚拟机和字节码,尽管一个操作具体的值大小(如32位整数),而另一个必须确定每次调用的大小,在我看来,这并没有定义术语之间的边界。
Python没有正式定义的字节码,它只存在于内存中,这一论点也不能说服我,只是因为我计划开发只识别Python字节码的设备,编译部分将在浏览器JS机器中完成。
性能只与具体的实现有关。我们不需要知道对象的大小就能处理它,最后,在大多数情况下,我们处理的是结构,而不是基本类型。可以通过重用现有对象来优化Python VM,从而消除每次在表达式计算期间创建新对象的需要。一旦完成,在计算两个整数的和之间没有全局性能差异,这是Java的闪光点。
两者之间没有致命的区别,只有一些与最终用户无关的实现上的细微差别和缺乏优化,可能在她开始注意到性能滞后的时候,但这又是实现而不是架构的问题。
for posts that mention that python does not need to generate byte code, I'm not sure that's true. it seems that all callables in Python must have a .__code__.co_code attribute which contains the byte code. I don't see a meaningful reason to call python "not compiled" just because the compiled artifacts may not be saved; and often aren't saved by design in Python, for example all comprehension compile new bytecode for it's input, this is the reason comprehension variable scope is not consistent between compile(mode='exec, ...) and compile compile(mode='single', ...) such as between running a python script and using pdb