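What does referential transparency mean? I have heard it described as "it means you can replace equals with equals", but that seems an inadequate explanation.


Current answer

I hope the answer below adds to and qualifies the controversial 1st and 3rd answers.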

Let us grant that an expression denotes or refers to some referent. A further question is whether these referents can be encoded isomorphically as part of the expressions themselves; such expressions are called 'values'. For example, literal number values are a subset of the set of arithmetic expressions, truth values are a subset of the set of boolean expressions, etc. The idea is to evaluate an expression to its value (if it has one). So the word 'value' may refer to a denotation or to a distinguished element of the set of expressions. But if there is an isomorphism (a bijection) between the referents and the values, we can say they are the same thing. (That said, one must be careful to define the referents and the isomorphism, as the field of denotational semantics has shown. To take an example mentioned in replies to the 3rd answer, the algebraic data type definition data Nat = Zero | Suc Nat does not correspond as expected to the set of natural numbers.)

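Let us write E[·] for an expression with a hole, also known in some quarters as a 'context'. Two examples of contexts for C-like expressions are [·]+1 and [·]++.

Let us write [[·]] for the function that takes an expression (with no hole) and delivers its meaning (referent, denotation, etc.) in some meaning-providing universe. (I am borrowing notation from the field of denotational semantics.)

Let us adapt Quine's definition somewhat formally as follows: a context E[·] is referentially transparent iff, given any two expressions E1 and E2 (no holes there) such that [[E1]] = [[E2]] (i.e., the expressions denote/refer to the same referent), then [[E[E1]]] = [[E[E2]]] (i.e., filling the hole with either E1 or E2 results in expressions that denote the same referent).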
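
The definition can be tried out on a toy language. What follows is only a minimal sketch, not something taken from the answer: the object encoding of expressions and the names `size`, `denote`, `plusOne`, `quote`, and `respects` are all hypothetical, and denotations are assumed to be plain numbers. `plusOne` plays the role of [·]+1, while `quote` is a deliberately opaque context built on an invented `len` operator that looks at the syntax of whatever fills the hole.

```javascript
// Expressions are plain objects: {num: n}, {add: [e1, e2]}, or {len: e}.
// `len` is an invented operator whose meaning is the size of its operand's
// syntax tree, so it depends on how the operand is written, not on its referent.

function size(e) {
  if ("num" in e) return 1;
  if ("add" in e) return 1 + size(e.add[0]) + size(e.add[1]);
  return 1 + size(e.len);
}

// [[·]]: maps a hole-free expression to its denotation (a number here).
function denote(e) {
  if ("num" in e) return e.num;
  if ("add" in e) return denote(e.add[0]) + denote(e.add[1]);
  return size(e.len);
}

// Contexts E[·] are functions that plug an expression into the hole.
const plusOne = (hole) => ({ add: [hole, { num: 1 }] }); // [·]+1
const quote = (hole) => ({ len: hole });                 // opaque context

// Checks the definition for one pair:
// if [[E1]] = [[E2]] then [[E[E1]]] = [[E[E2]]].
function respects(context, e1, e2) {
  if (denote(e1) !== denote(e2)) return true; // premise fails, nothing to check
  return denote(context(e1)) === denote(context(e2));
}

const four = { num: 4 };
const twoPlusTwo = { add: [{ num: 2 }, { num: 2 }] };

console.log(respects(plusOne, four, twoPlusTwo)); // true
console.log(respects(quote, four, twoPlusTwo));   // false
```

A single pair of expressions can refute referential transparency, as it does for `quote`, but it cannot establish it: the definition quantifies over all pairs E1, E2.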

Leibniz's rule of substituting equals for equals is typically expressed as 'if E1 = E2 then E[E1] = E[E2]', which says that E[·] is a function. A function (or for that matter a program computing the function) is a mapping from a source to a target such that there is at most one target element for each source element. 'Non-deterministic function' is a misnomer: such things are relations, functions delivering sets, etc. If in Leibniz's rule the equality = is denotational, then the double brackets are simply taken for granted and elided. So a referentially transparent context is a function. And Leibniz's rule is the main ingredient of equational reasoning, so equational reasoning is definitely related to referential transparency.

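Although [[·]] is a function from expressions to denotations, it could also be a function from expressions to 'values', understood as a restricted subset of expressions, and [[·]] can then be understood as evaluation.

Now, if E1 is an expression and E2 is a value, we have what I think most people mean when they define referential transparency in terms of expressions, values, and evaluation. But as the 1st and 3rd answers on this page illustrate, this is an inaccurate definition.

The problem with contexts such as [·]++ is not the side effect, but that its value is not defined in C isomorphically to its meaning. In C, functions are not values (well, pointers to functions are), whereas in functional programming languages they are. Landin, Strachey, and the pioneers of denotational semantics were quite clever in using functional worlds to provide meaning.

For imperative C-like languages we can (roughly) provide semantics to expressions using the function [[·]] : Expression -> (State -> State x Value).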

Value is a subset of Expression. State contains pairs (identifier,value). The semantic function takes an expression and delivers as its meaning a function from the current state to the pair with the updated state and a value. For example, [[x]] is the function from the current state to the pair whose first component is the current state and whose second component is the value of x. In contrast, [[x++]] is the function from the current state to the pair whose first component is a state in which the value of x is incremented, and whose second component is that very value. In this sense, the context [·]++ is referentially transparent iff it satisfies the definition given above.
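
The two meanings just described can be written down directly. This is a rough sketch under stated assumptions: a State is modelled as a frozen plain object mapping identifiers to numbers, the pair State x Value as a two-element array, and `semVar` and `semPostInc` are hypothetical names standing for [[x]] and [[x++]]. Following C's rule for post-increment, the value delivered by `semPostInc` is the one the variable held before the update.

```javascript
// [[·]] : Expression -> (State -> State x Value)
// A State is modelled as a frozen object such as { x: 5 }.
// The pair (State, Value) is modelled as a two-element array.

// [[x]]: leaves the state alone and delivers the current value of the variable.
const semVar = (name) => (state) => [state, state[name]];

// [[x++]]: delivers a new state with the variable incremented, together with
// the value the variable had before the increment (C's post-increment rule).
const semPostInc = (name) => (state) => {
  const old = state[name];
  const next = Object.freeze({ ...state, [name]: old + 1 });
  return [next, old];
};

const s0 = Object.freeze({ x: 5 });

const [s1, v1] = semVar("x")(s0);
console.log(s1 === s0, v1); // true 5

const [s2, v2] = semPostInc("x")(s0);
console.log(s2.x, v2, s0.x); // 6 5 5

// Both meanings are ordinary functions: the same state always gives the same pair.
const [s3, v3] = semPostInc("x")(s0);
console.log(s3.x === s2.x && v3 === v2); // true
```

Seen this way, the side effect of x++ has not disappeared; it has become part of the denotation, which is what lets the definition of a referentially transparent context be applied to it.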

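I think functional programmers are entitled to use referential transparency in the sense that they naturally recover [[·]] as a function from expressions to values. Functions are first-class values, and the state can also be a value rather than a denotation. The state monad is (in part) a clean mechanism for passing (or threading) the state.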
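
State threading can be sketched without any monad library. The following assumes that a state action is just a function from a state to a `[newState, value]` pair, matching the State x Value ordering used earlier; `unit`, `bind`, `tick`, and `twoTicks` are hypothetical names chosen for this illustration and do not come from any particular library.

```javascript
// A state action is a function: state -> [newState, value].

// Wraps a plain value as an action that leaves the state untouched.
const unit = (value) => (state) => [state, value];

// Runs `action`, then feeds its value to `next` and runs the resulting
// action on the updated state. This is the "threading".
const bind = (action, next) => (state) => {
  const [newState, value] = action(state);
  return next(value)(newState);
};

// Delivers the current counter and a state with it incremented, like counter++.
const tick = (counter) => [counter + 1, counter];

// Two ticks in sequence, collecting both delivered values.
const twoTicks = bind(tick, (a) => bind(tick, (b) => unit([a, b])));

console.log(JSON.stringify(twoTicks(0)));  // [2,[0,1]]
console.log(JSON.stringify(twoTicks(0)));  // [2,[0,1]] again
console.log(JSON.stringify(twoTicks(10))); // [12,[10,11]]
```

No variable is ever mutated here, yet the composed action behaves like two successive increments, and `twoTicks` itself remains an ordinary function of the initial state.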

Other answers

The term "referential transparency" comes from analytical philosophy, the branch of philosophy that analyzes natural language constructs, statements and arguments based on the methods of logic and mathematics. In other words, it is the closest subject outside computer science to what we call programming language semantics. The philosopher Willard Quine was responsible for initiating the concept of referential transparency, but it was also implicit in the approaches of Bertrand Russell and Alfred Whitehead.

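At its core, "referential transparency" is a very simple and clear idea. The term "referent" is used in analytical philosophy to talk about the thing that an expression refers to. It is roughly the same as what we mean by "meaning" or "denotation" in programming language semantics. Using the example from Andrew Birkett's blog post, the term "the capital of Scotland" refers to the city of Edinburgh. That is a straightforward example of a "referent".

A context in a sentence is "referentially transparent" if replacing a term in that context by another term that refers to the same entity does not alter the meaning. For example

The Scottish Parliament meets in the capital of Scotland.

means the same as

The Scottish Parliament meets in Edinburgh.

So "The Scottish Parliament meets in ..." is a referentially transparent context. We can replace "the capital of Scotland" with "Edinburgh" without altering the meaning. To put it another way, the context only cares about what the term refers to and nothing else. That is the sense in which the context is "referentially transparent".

On the other hand, in the sentence,

Edinburgh has been the capital of Scotland since 1999.

we cannot do such a replacement. If we did, we would get "Edinburgh has been Edinburgh since 1999", which is a nutty thing to say and does not convey the same meaning as the original sentence. So it would seem that the context "Edinburgh has been ... since 1999" is referentially opaque (the opposite of referentially transparent). It apparently cares about something more than what the term refers to. What is it?

Terms such as "the capital of Scotland" are called definite terms, and for a long time they gave logicians and philosophers no small amount of headache. Russell and Quine sorted them out, saying that they are not actually "referential", i.e., it is a mistake to think that the examples above are used to refer to entities. The right way to understand "Edinburgh has been the capital of Scotland since 1999" is to say

Scotland has had a capital since 1999 and that capital is Edinburgh.

This sentence cannot be transformed into a nutty one. Problem solved! Quine's point was that natural language is messy, or at least complicated, because it is made to be convenient for practical use, but philosophers and logicians should bring clarity by understanding it in the right way. Referential transparency is a tool for bringing about such clarity of meaning.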

What does all this have to do with programming? Not very much, actually. As we said, referential transparency is a tool to be used in understanding language, i.e., in assigning meaning. Christopher Strachey, who founded the field of programming language semantics, used it in his study of meaning. His foundational paper "Fundamental concepts in programming languages" is available on the web. It is a beautiful paper and everybody can read and understand it. So, please do so. You will be much enlightened. He introduces the term "referential transparency" in this paragraph:

One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value. Any other features of the sub-expression, such as its internal structure, the number and nature of its components, the order in which they are evaluated or the colour of the ink in which they are written, are irrelevant to the value of the main expression.

The use of "in essence" suggests that Strachey is paraphrasing it in order to explain it in simple terms. Functional programmers seem to understand this paragraph in their own way. There are 9 other occurrences of "referential transparency" in the paper, but they don't seem to bother about any of the others. In fact, the whole paper of Strachey is devoted to explaining the meaning of imperative programming languages. But, today, functional programmers claim that imperative programming languages are not referentially transparent. Strachey would be turning in his grave.

We can salvage the situation. We said that natural language is "messy, or at least complicated" because it is made to be convenient for practical use. Programming languages are the same way. They are "messy, or at least complicated" because they are made to be convenient for practical use. That does not mean that they need to confuse us. They just have to be understood the right way, using a meta language that is referentially transparent so that we have clarity of meaning. In the paper I cited, Strachey does exactly that. He explains the meaning of imperative programming languages by breaking them down into elementary concepts, never losing clarity anywhere. An important part of his analysis is to point out that expressions in programming languages have two kinds of "values", called l-values and r-values. Before Strachey's paper, this was not understood and confusion reigned supreme. Today, the definition of C mentions it routinely and every C programmer understands the distinction. (Whether the programmers in other languages understand it equally well is hard to say.)

Both Quine and Strachey were concerned with the meaning of language constructions that involve some form of context-dependence. For example, our example "Edinburgh has been the capital of Scotland since 1999" signifies the fact that "capital of Scotland" depends on the time at which it is being considered. Such context-dependence is a reality, both in natural languages and programming languages. Even in functional programming, free and bound variables are to be interpreted with respect to the context in which they appear. Context dependence of any kind blocks referential transparency in some way or the other. If you try to understand the meaning of terms without regard to the contexts they depend on, you would again end up with confusion. Quine was concerned with the meaning of modal logic. He held that modal logic was referentially opaque and that it should be cleaned up by translating it into a referentially transparent framework (e.g., by regarding necessity as provability). He largely lost this debate. Logicians and philosophers alike found Kripke's possible world semantics to be perfectly adequate. A similar situation holds for imperative programming. State-dependence explained by Strachey and store-dependence explained by Reynolds (in a manner similar to Kripke's possible world semantics) are perfectly adequate. Functional programmers don't know much of this research. Their ideas on referential transparency are to be taken with a large grain of salt.

[Additional note: The examples above illustrate that a simple phrase such as "capital of Scotland" has multiple levels of meaning. At one level, we might be talking about the capital at the current time. At another level, we might be talking about all possible capitals that Scotland might have had through the course of time. We can "zoom into" a particular context and "zoom out" to span all contexts quite easily in normal practice. The efficiency of natural language makes use of our ability to do so. Imperative programming languages are efficient in very much the same way. We can use a variable x on the right hand side of an assignment (the r-value) to talk about its value in a particular state. Or, we might talk about its l-value which spans all states. People are rarely confused by such things. However, they may or may not be able to precisely explain all the layers of meaning inherent in language constructs. All such layers of meaning are not necessarily 'obvious' and it is a matter of science to study them properly. However, the inarticulacy of ordinary people to explain such layered meanings doesn't imply that they are confused about them.]

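A separate "postscript" below relates this discussion to the concerns of functional and imperative programming.

A referentially transparent function is one that depends only on its input.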

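When I read the accepted answer, I thought I was on a different site, not on Stack Overflow.

Referential transparency is a more formal way of defining a pure function. So if a function consistently yields the same result for the same input, it is said to be referentially transparent.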

let counter = 0;

// Reads and mutates the outer variable, so successive calls return 0, 1, 2, ...
function count() {
  return counter++;
}

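This is not referentially transparent, because the return value depends on the external variable "counter", which keeps changing.

Here is how we can make it referentially transparent: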

function count(counter) {
  return counter + 1;
}

Now the function is stable: it always returns the same output when given the same input.
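
A quick way to see the difference is to call each version twice with the same arguments. In this sketch the two functions are renamed `countImpure` and `countPure` (names invented here) so that they can live side by side in one file.

```javascript
let counter = 0;

// Depends on, and mutates, state outside the function.
function countImpure() {
  return counter++;
}

// Depends only on its argument.
function countPure(n) {
  return n + 1;
}

console.log(countImpure() === countImpure()); // false: 0 vs 1
console.log(countPure(0) === countPure(0));   // true: 1 vs 1
```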

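Referential transparency can be stated simply as:

An expression always evaluates to the same result in any context [1]; a function, if given the same arguments twice, must produce the same result twice.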
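
One practical consequence is that a call can be replaced by its result without changing what the program computes. A small sketch, in which `square` and `nextId` are hypothetical examples rather than anything from the answers above:

```javascript
const square = (n) => n * n;

let lastId = 0;
const nextId = () => ++lastId; // result depends on hidden state

// Original expression, and the same expression with the repeated call factored out.
const a1 = square(3) + square(3);
const s = square(3);
const a2 = s + s;
console.log(a1 === a2); // true: 18 both ways

// The same rewrite changes the result when the call is not referentially transparent.
lastId = 0;
const b1 = nextId() + nextId(); // 1 + 2
lastId = 0;
const id = nextId();            // 1
const b2 = id + id;             // 1 + 1
console.log(b1, b2); // 3 2
```

This is the property that makes rewrites such as common subexpression elimination or memoization safe for pure functions.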

For example, the programming language Haskell is a purely functional language, which means that it is referentially transparent.