Python是一种解释性语言。但是为什么我的源目录包含。pyc文件,这些文件被Windows识别为“编译过的Python文件”?


当前回答

Python代码经过两个阶段。第一步将代码编译成.pyc文件,这实际上是一个字节码。然后使用CPython解释器解释这个.pyc文件(字节码)。请参考此连结。这里用简单的术语解释了代码编译和执行的过程。

其他回答

我是这么理解的 Python是一种解释性语言…

这个流行的迷因是不正确的,或者,更确切地说,是建立在对(自然)语言水平的误解之上的:类似的错误会说“圣经是一本精装书”。让我来解释一下这个比喻……

"The Bible" is "a book" in the sense of being a class of (actual, physical objects identified as) books; the books identified as "copies of the Bible" are supposed to have something fundamental in common (the contents, although even those can be in different languages, with different acceptable translations, levels of footnotes and other annotations) -- however, those books are perfectly well allowed to differ in a myriad of aspects that are not considered fundamental -- kind of binding, color of binding, font(s) used in the printing, illustrations if any, wide writable margins or not, numbers and kinds of builtin bookmarks, and so on, and so forth.

It's quite possible that a typical printing of the Bible would indeed be in hardcover binding -- after all, it's a book that's typically meant to be read over and over, bookmarked at several places, thumbed through looking for given chapter-and-verse pointers, etc, etc, and a good hardcover binding can make a given copy last longer under such use. However, these are mundane (practical) issues that cannot be used to determine whether a given actual book object is a copy of the Bible or not: paperback printings are perfectly possible!

Similarly, Python is "a language" in the sense of defining a class of language implementations which must all be similar in some fundamental respects (syntax, most semantics except those parts of those where they're explicitly allowed to differ) but are fully allowed to differ in just about every "implementation" detail -- including how they deal with the source files they're given, whether they compile the sources to some lower level forms (and, if so, which form -- and whether they save such compiled forms, to disk or elsewhere), how they execute said forms, and so forth.

The classical implementation, CPython, is often called just "Python" for short -- but it's just one of several production-quality implementations, side by side with Microsoft's IronPython (which compiles to CLR codes, i.e., ".NET"), Jython (which compiles to JVM codes), PyPy (which is written in Python itself and can compile to a huge variety of "back-end" forms including "just-in-time" generated machine language). They're all Python (=="implementations of the Python language") just like many superficially different book objects can all be Bibles (=="copies of The Bible").

If you're interested in CPython specifically: it compiles the source files into a Python-specific lower-level form (known as "bytecode"), does so automatically when needed (when there is no bytecode file corresponding to a source file, or the bytecode file is older than the source or compiled by a different Python version), usually saves the bytecode files to disk (to avoid recompiling them in the future). OTOH IronPython will typically compile to CLR codes (saving them to disk or not, depending) and Jython to JVM codes (saving them to disk or not -- it will use the .class extension if it does save them).

这些较低级别的表单然后由适当的“虚拟机”(也称为“解释器”)执行——CPython VM、. net运行时、Java VM(又名JVM)。

因此,在这个意义上(典型的实现是做什么的),Python是一种“解释型语言”,当且仅当c#和Java是:它们都有一个典型的实现策略,首先产生字节码,然后通过VM/解释器执行它。

More likely the focus is on how "heavy", slow, and high-ceremony the compilation process is. CPython is designed to compile as fast as possible, as lightweight as possible, with as little ceremony as feasible -- the compiler does very little error checking and optimization, so it can run fast and in small amounts of memory, which in turns lets it be run automatically and transparently whenever needed, without the user even needing to be aware that there is a compilation going on, most of the time. Java and C# typically accept more work during compilation (and therefore don't perform automatic compilation) in order to check errors more thoroughly and perform more optimizations. It's a continuum of gray scales, not a black or white situation, and it would be utterly arbitrary to put a threshold at some given level and say that only above that level you call it "compilation"!-)

它们包含字节代码,Python解释器将源代码编译为字节代码。这段代码随后由Python的虚拟机执行。

Python的文档是这样解释定义的:

Python是一种解释性语言,例如 而不是编译的 区别可能是模糊的,因为 字节码编译器的存在。 这意味着源文件可以 直接运行而不显式地运行 创建一个可执行文件,然后 运行。

语言规范与语言实现的重要区别是:

语言规范只是带有语言正式规范的文档,具有上下文无关的语法和语义规则定义(如指定基本类型和作用域动态)。 语言实现只是一个程序(编译器),它根据语言的规范实现语言的使用。

Any compiler consists of two independent parts: a frontend and backend. The frontend receives the source code, validate it and translate it into an intermediate code. After that, a backend translate it to machine code to run in a physical or a virtual machine. An interpreter is a compiler, but in this case it can produce a way of executing the intermediate code directly in a virtual machine. To execute python code, its necessary transform the code in a intermediate code, after that the code is then "assembled" as bytecode that can be stored in a file.pyc, so no need to compile modules of a program every time you run it. You can view this assembled python code using:

from dis import dis
def a(): pass

dis(a)

任何人都可以用Python语言构建静态二进制的编译器,就像可以构建C语言的解释器一样。有一些工具(lex/yacc)可以简化编译器的构建过程并使之自动化。

Python(至少是它最常见的实现)遵循将原始源代码编译为字节码的模式,然后在虚拟机上解释字节码。这意味着(同样,最常见的实现)既不是纯解释器也不是纯编译器。

然而,另一方面,编译过程基本上是隐藏的——.pyc文件基本上被视为缓存;它们能加快速度,但你通常根本不需要注意到它们。它在必要时根据文件时间/日期戳自动使它们失效并重新加载(重新编译源代码)。

我所见过的唯一一次问题是,编译后的字节码文件以某种方式获得了未来的时间戳,这意味着它看起来总是比源文件更新。因为它看起来更新,源文件从来没有重新编译过,所以无论你做了什么更改,它们都被忽略了…

没有解释性语言这种东西。使用解释器还是编译器纯粹是实现的特性,与语言完全无关。

每种语言都可以由解释器或编译器实现。绝大多数语言都至少有每种类型的一个实现。(例如,C和c++有解释器,JavaScript、PHP、Perl、Python和Ruby有编译器。)此外,大多数现代语言实现实际上结合了解释器和编译器(甚至多个编译器)。

A language is just a set of abstract mathematical rules. An interpreter is one of several concrete implementation strategies for a language. Those two live on completely different abstraction levels. If English were a typed language, the term "interpreted language" would be a type error. The statement "Python is an interpreted language" is not just false (because being false would imply that the statement even makes sense, even if it is wrong), it just plain doesn't make sense, because a language can never be defined as "interpreted."

特别是,如果你看一下当前现有的Python实现,这些是他们正在使用的实现策略:

IronPython: compiles to DLR trees which the DLR then compiles to CIL bytecode. What happens to the CIL bytecode depends upon which CLI VES you are running on, but Microsoft .NET, GNU Portable.NET and Novell Mono will eventually compile it to native machine code. Jython: interprets Python sourcecode until it identifies the hot code paths, which it then compiles to JVML bytecode. What happens to the JVML bytecode depends upon which JVM you are running on. Maxine will directly compile it to un-optimized native code until it identifies the hot code paths, which it then recompiles to optimized native code. HotSpot will first interpret the JVML bytecode and then eventually compile the hot code paths to optimized machine code. PyPy: compiles to PyPy bytecode, which then gets interpreted by the PyPy VM until it identifies the hot code paths which it then compiles into native code, JVML bytecode or CIL bytecode depending on which platform you are running on. CPython: compiles to CPython bytecode which it then interprets. Stackless Python: compiles to CPython bytecode which it then interprets. Unladen Swallow: compiles to CPython bytecode which it then interprets until it identifies the hot code paths which it then compiles to LLVM IR which the LLVM compiler then compiles to native machine code. Cython: compiles Python code to portable C code, which is then compiled with a standard C compiler Nuitka: compiles Python code to machine-dependent C++ code, which is then compiled with a standard C compiler

您可能会注意到,该列表中的每一个实现(加上我没有提到的其他一些实现,如tinypy、Shedskin或Psyco)都有一个编译器。事实上,据我所知,目前还没有纯解释的Python实现,没有这样的实现计划,也从来没有这样的实现。

不仅术语“解释语言”没有意义,即使你将其理解为“带有解释实现的语言”,这显然是不正确的。不管是谁告诉你的,显然他不知道自己在说什么。

特别是,您看到的.pyc文件是由CPython、Stackless Python或Unladen Swallow生成的缓存字节码文件。