c++是上下文无关的还是上下文敏感的?

我经常听到有人说c++是一种上下文敏感的语言。举个例子:

a b(c);

这是变量定义还是函数声明?这取决于符号c的含义。如果c是一个变量，则ab (c);定义一个名为b的类型为a的变量。它直接用c初始化。但如果c是类型，则ab (c);声明一个名为b的函数，该函数接受c并返回a。

如果您查找上下文无关语言的定义，它基本上会告诉您，所有语法规则的左侧必须恰好包含一个非终结符。另一方面，上下文敏感语法允许在左侧使用任意的终结符和非终结符字符串。

浏览“c++程序设计语言”的附录A，我找不到一条语法规则，它的左边除了一个非终结符之外，还有其他任何东西。这意味着c++是上下文无关的。(当然，每一种与上下文无关的语言也是与上下文相关的，因为与上下文无关的语言构成了与上下文相关的语言的一个子集，但这不是重点。)

那么，c++是上下文无关的还是上下文敏感的?

当前回答

Meta-S”是Quinn Tyler Jackson开发的上下文敏感解析引擎。我没有用过，但他讲了一个令人印象深刻的故事。在comp.compilers中查看他的评论，并查看rnaparse.com/MetaS%20defined.htm - Ira Baxter 7月25日10:42

正确的环节是解析谜题

Meta-S是一家已经倒闭的tho栖公司的财产。我可以将Meta-S的免费副本发送给任何感兴趣的人，我已经在rna解析研究中使用了它。请注意，示例文件夹中包含的“伪结语法”是由一个非生物信息学的、成熟的程序员编写的，基本上不起作用。我的语法采用了一种不同的方法，而且效果很好。

2009-09-18 11:31:35

其他回答

是的。下面的表达式根据解析的上下文类型有不同的操作顺序:

编辑:当实际操作顺序发生变化时，使用“常规”编译器在修饰AST(传播类型信息)之前解析未修饰的AST会变得非常困难。与此相比，提到的其他上下文敏感的事情“相当容易”(并不是说模板计算一点都不容易)。

#if FIRST_MEANING
   template<bool B>
   class foo
   { };
#else
   static const int foo = 0;
   static const int bar = 15;
#endif

紧随其后的是:

static int foobar( foo < 2 ? 1 < 1 : 0 > & bar );

2009-07-23 16:46:09

没有一种类algol语言是与上下文无关的，因为它们有规则约束表达式和语句，标识符可以根据它们的类型出现在这些表达式和语句中，并且因为在声明和使用之间可以出现的语句数量没有限制。

通常的解决方案是编写一个上下文无关的解析器，它实际上接受有效程序的超集，并将上下文敏感的部分放在附加到规则的特殊“语义”代码中。

c++的图灵完备模板系统远远超越了这一点。参见堆栈溢出问题794015。

2013-01-30 18:58:54

这里的一个大问题是术语“上下文无关”和“上下文敏感”在计算机科学中有点不直观。对于c++，上下文敏感性看起来很像歧义，但在一般情况下不一定是这样。

在C/ c++中，if语句只允许在函数体中使用。这似乎是上下文敏感的，对吧?嗯,没有。与上下文无关的语法实际上不需要这样的属性，即您可以提取某一行代码并确定它是否有效。这实际上并不是context-free的意思。它实际上只是一个标签，模糊地暗示了一些与它的发音相关的东西。

现在，如果函数体中的语句根据直接语法祖先之外的定义而被不同地解析(例如，标识符是描述类型还是变量)，如a * b;大小写，那么它实际上是上下文敏感的。这里没有实际的歧义;如果a是类型，则解析为指针声明，否则解析为乘法声明。

Being context-sensitive does not necessarily mean "hard to parse". C is actually not that hard because the infamous a * b; "ambiguity" can be resolved with a symbol table containing typedefs encountered previously. It doesn't require any arbitrary template instantiations (which have been proven to be Turing Complete) to resolve that case like C++ does on occasion. It's not actually possible to write a C program that will not compile in a finite amount of time even though it has the same context-sensitivity that C++ does.

Python (and other whitespace-sensitive languages) is also context-dependent, as it requires state in the lexer to generate indent and dedent tokens, but that doesn't make it any harder to parse than a typical LL-1 grammar. It actually uses a parser-generator, which is part of why Python has such uninformative syntax error messages. It's also important to note here that there is no "ambiguity" like the a * b; problem in Python, giving a good concrete example of a context-sensitive language without "ambiguous" grammar (as mentioned in the first paragraph).

2019-07-29 16:34:11

使用GLR解析器解析c++。这意味着在解析源代码期间，解析器可能会遇到歧义，但它应该继续并决定稍后使用哪个语法规则。

还看,

为什么c++不能用LR(1)解析器解析?

请记住，上下文无关语法不能描述编程语言语法的所有规则。例如，属性语法用于检查表达式类型的有效性。

int x;
x = 9 + 1.0;

你不能用上下文无关的语法描述下面的规则: 作业的右侧应与左侧的类型相同。

2009-07-23 16:37:46

下面是我(目前)最喜欢的关于为什么解析c++(可能)是图灵完备的演示，因为它显示了当且仅当给定整数是素数时语法正确的程序。

因此，我断言c++既不是上下文无关的，也不是上下文敏感的。

If you allow arbitrary symbol sequences on both sides of any production, you produce an Type-0 grammar ("unrestricted") in the Chomsky hierarchy, which is more powerful than a context-sensitive grammar; unrestricted grammars are Turing-complete. A context-sensitive (Type-1) grammar allows multiple symbols of context on the left hand side of a production, but the same context must appear on the right hand side of the production (hence the name "context-sensitive"). [1] Context-sensitive grammars are equivalent to linear-bounded Turing machines.

在示例程序中，质数计算可以由线性有界的图灵机执行，因此它不能完全证明图灵等价，但重要的部分是解析器需要执行计算以便执行语法分析。它可以是任何可以表示为模板实例化的计算，并且有充分的理由相信c++模板实例化是图灵完备的。例如，Todd L. Veldhuizen在2003年的论文。

不管怎样，c++可以被计算机解析，所以它当然可以被图灵机解析。因此，不受限制的语法可以识别它。实际上，编写这样的语法是不切实际的，这就是标准没有尝试这样做的原因。(见下文)。

某些表达的“模糊性”问题主要是转移注意力。首先，歧义是一种特定语法的特征，而不是一种语言。即使一种语言可以被证明没有明确的语法，如果它可以被上下文无关的语法识别，那么它就是上下文无关的。类似地，如果它不能被上下文无关的语法识别，但可以被上下文敏感的语法识别，那么它就是上下文敏感的。模棱两可无关紧要。

But in any event, like line 21 (i.e. auto b = foo<IsPrime<234799>>::typen<1>();) in the program below, the expressions are not ambiguous at all; they are simply parsed differently depending on context. In the simplest expression of the issue, the syntactic category of certain identifiers is dependent on how they have been declared (types and functions, for example), which means that the formal language would have to recognize the fact that two arbitrary-length strings in the same program are identical (declaration and use). This can be modelled by the "copy" grammar, which is the grammar which recognizes two consecutive exact copies of the same word. It's easy to prove with the pumping lemma that this language is not context-free. A context-sensitive grammar for this language is possible, and a Type-0 grammar is provided in the answer to this question: https://math.stackexchange.com/questions/163830/context-sensitive-grammar-for-the-copy-language .

如果有人试图编写上下文敏感的(或不受限制的)语法来解析c++，那么很可能会到处乱写乱写。编写图灵机来解析c++同样是不可能完成的任务。即使是编写c++程序也是困难的，而且据我所知，没有一个被证明是正确的。这就是为什么标准没有尝试提供完整的正式语法，以及为什么它选择用技术英语编写一些解析规则。

What looks like a formal grammar in the C++ standard is not the complete formal definition of the syntax of the C++ language. It's not even the complete formal definition of the language after preprocessing, which might be easier to formalize. (That wouldn't be the language, though: the C++ language as defined by the standard includes the preprocessor, and the operation of the preprocessor is described algorithmically since it would be extremely hard to describe in any grammatical formalism. It is in that section of the standard where lexical decomposition is described, including the rules where it must be applied more than once.)

各种语法(两个用于词法分析的重叠语法，一个发生在预处理之前，另一个，如果有必要，发生在预处理之后，再加上“句法”语法)收集在附录A中，其中有一个重要的注释(强调):

本文对c++语法的总结旨在帮助理解。这不是对语言的准确表述。特别地，这里描述的语法接受有效的c++构造的超集。必须应用消歧规则(6.8,7.1,10.2)来区分表达式和声明。此外，必须使用访问控制、模糊性和类型规则来清除语法上有效但无意义的结构。

最后，这是我们承诺的项目。当且仅当IsPrime中的N <N>是素数时，第21行在语法上是正确的。否则，typen是一个整数，而不是一个模板，因此typen<1>()将被解析为(typen<1)>()，这在语法上是不正确的，因为()不是一个语法上有效的表达式。

template<bool V> struct answer { answer(int) {} bool operator()(){return V;}};

template<bool no, bool yes, int f, int p> struct IsPrimeHelper
  : IsPrimeHelper<p % f == 0, f * f >= p, f + 2, p> {};
template<bool yes, int f, int p> struct IsPrimeHelper<true, yes, f, p> { using type = answer<false>; };
template<int f, int p> struct IsPrimeHelper<false, true, f, p> { using type = answer<true>; };

template<int I> using IsPrime = typename IsPrimeHelper<!(I&1), false, 3, I>::type;
template<int I>
struct X { static const int i = I; int a[i]; }; 

template<typename A> struct foo;
template<>struct foo<answer<true>>{
  template<int I> using typen = X<I>;
};
template<> struct foo<answer<false>>{
  static const int typen = 0;
};

int main() {
  auto b = foo<IsPrime<234799>>::typen<1>(); // Syntax error if not prime
  return 0;
}

更专业地说，上下文敏感语法中的每个结果都必须是这样的:

aB → AGB

其中A是非终结符，α、β可能是语法符号的空序列，γ是非空序列。(语法符号可以是终结符也可以是非终结符)。

这只能在[α， β]中被理解为A→γ。在上下文无关(类型2)语法中，α和β必须为空。

事实证明，你也可以用“单调”限制语法，其中每个结果必须是这样的形式:

α→β,|α|≥|β| > 0(|α|意味着“α”的长度)

可以证明单调语法所识别的语言集与上下文敏感语法所识别的语言集完全相同，而且通常更容易将证明建立在单调语法的基础上。因此，我们经常看到“上下文敏感”被当作“单调”的意思来使用。

2013-01-29 18:18:40

c++是上下文无关的还是上下文敏感的?

推荐文章

最新文章

标签