P=NP问题可能是整个计算机科学中最著名的问题。这是什么意思?为什么这么有趣?

哦,为了获得额外的学分,请张贴声明的真假证明。:)


就我所知做一个简短的总结:

有一些简单的计算问题(比如寻找图中两点之间的最短路径),可以相当快地计算出来(O(n^k),其中n是输入的大小,k是一个常数(在图的情况下,它是顶点或边的数量)。

其他的问题,比如找到一条穿过图中每个顶点的路径,或者从公钥中获取RSA私钥就更难了(O(e^n))。

但是CS语言告诉我们,问题在于我们不能将非确定性图灵机“转换”为确定性图灵机,然而,我们可以将非确定性有限自动机(如正则表达式解析器)转换为确定性自动机(好吧,你可以,但机器的运行时间会很长)。也就是说,我们必须尝试每一条可能的路径(通常聪明的计算机教授会排除一些)。

这很有趣,因为甚至没有人知道答案。有人说这是真的,有人说这是假的,但没有共识。另一件有趣的事情是,解决方案对公钥/私钥加密(如RSA)是有害的。你可以像现在生成RSA密钥一样轻易地破解它们。

这是一个非常鼓舞人心的问题。


P代表多项式时间。NP代表非确定性多项式时间。

定义:

Polynomial time means that the complexity of the algorithm is O(n^k), where n is the size of your data (e. g. number of elements in a list to be sorted), and k is a constant. Complexity is time measured in the number of operations it would take, as a function of the number of data items. Operation is whatever makes sense as a basic operation for a particular task. For sorting, the basic operation is a comparison. For matrix multiplication, the basic operation is multiplication of two numbers.

Now the question is, what does deterministic vs. non-deterministic mean? There is an abstract computational model, an imaginary computer called a Turing machine (TM). This machine has a finite number of states, and an infinite tape, which has discrete cells into which a finite set of symbols can be written and read. At any given time, the TM is in one of its states, and it is looking at a particular cell on the tape. Depending on what it reads from that cell, it can write a new symbol into that cell, move the tape one cell forward or backward, and go into a different state. This is called a state transition. Amazingly enough, by carefully constructing states and transitions, you can design a TM, which is equivalent to any computer program that can be written. This is why it is used as a theoretical model for proving things about what computers can and cannot do.

There are two kinds of TM's that concern us here: deterministic and non-deterministic. A deterministic TM only has one transition from each state for each symbol that it is reading off the tape. A non-deterministic TM may have several such transition, i. e. it is able to check several possibilities simultaneously. This is sort of like spawning multiple threads. The difference is that a non-deterministic TM can spawn as many such "threads" as it wants, while on a real computer only a specific number of threads can be executed at a time (equal to the number of CPUs). In reality, computers are basically deterministic TMs with finite tapes. On the other hand, a non-deterministic TM cannot be physically realized, except maybe with a quantum computer.

事实证明,任何可以用非确定性TM解决的问题都可以用确定性TM解决。然而,尚不清楚这需要多长时间。陈述P=NP意味着如果一个问题在非确定性TM上花费多项式时间,那么人们可以构建一个确定性TM,它也可以在多项式时间内解决相同的问题。到目前为止,还没有人能够证明这是可以做到的,但也没有人能够证明这是不能做到的。

NP完全问题是指一个NP问题X,这样任何NP问题Y都可以通过多项式约简化为X。这意味着,如果有人提出了一个NP完全问题的多项式时间解,那也会给出任何NP问题的多项式时间解。这样就证明了P=NP。相反,如果有人能证明P!=NP,那么我们可以肯定,在传统计算机上没有办法在多项式时间内解决NP问题。

np完全问题的一个例子是寻找一个真值赋值的问题,该值赋值将使包含n个变量的布尔表达式为真。 目前在实践中,任何在非确定性TM上需要多项式时间的问题都只能在确定性TM或传统计算机上以指数时间解决。 例如,解决真值分配问题的唯一方法是尝试2^n种可能性。


对于P=的“什么”和“为什么”,我没有什么可以补充的了。NP部分的问题,但是关于证明。这样的证明不仅可以加分,而且还能解决千禧年难题之一。最近进行了一项有趣的民意调查,就证明的主题而言,发表的结果(PDF)绝对值得一读。


给出我能想到的最简单的答案:

Suppose we have a problem that takes a certain number of inputs, and has various potential solutions, which may or may not solve the problem for given inputs. A logic puzzle in a puzzle magazine would be a good example: the inputs are the conditions ("George doesn't live in the blue or green house"), and the potential solution is a list of statements ("George lives in the yellow house, grows peas, and owns the dog"). A famous example is the Traveling Salesman problem: given a list of cities, and the times to get from any city to any other, and a time limit, a potential solution would be a list of cities in the order the salesman visits them, and it would work if the sum of the travel times was less than the time limit.

如果我们能有效地检查一个潜在的解决方案,看看它是否有效,这样的问题就属于NP。例如,给定一个销售人员要按顺序访问的城市列表,我们可以将每个城市之间的旅行时间加起来,并很容易地查看它是否在时间限制之内。如果我们能有效地找到解决方案,那么问题就在P中。

(高效,在这里有一个精确的数学意义。实际上,这意味着解决大问题并不是不合理的困难。当搜索一个可能的解决方案时,低效的方法是列出所有可能的潜在解决方案,而有效的方法需要搜索一个更有限的集合。)

Therefore, the P=NP problem can be expressed this way: If you can verify a solution for a problem of the sort described above efficiently, can you find a solution (or prove there is none) efficiently? The obvious answer is "Why should you be able to?", and that's pretty much where the matter stands today. Nobody has been able to prove it one way or another, and that bothers a lot of mathematicians and computer scientists. That's why anybody who can prove the solution is up for a million dollars from the Claypool Foundation.

We generally assume that P does not equal NP, that there is no general way to find solutions. If it turned out that P=NP, a lot of things would change. For example, cryptography would become impossible, and with it any sort of privacy or verifiability on the Internet. After all, we can efficiently take the encrypted text and the key and produce the original text, so if P=NP we could efficiently find the key without knowing it beforehand. Password cracking would become trivial. On the other hand, there's whole classes of planning problems and resource allocation problems that we could solve effectively.

你可能听过NP-complete这个描述。一个NP完全问题是一个NP问题(当然),并且有一个有趣的性质:如果它在P中,那么每个NP问题都是,因此P=NP。如果你能找到一种有效解决旅行推销员问题或益智杂志上的逻辑谜题的方法,你就可以有效地解决NP中的任何问题。NP完全问题在某种程度上是NP问题中最难的一类。

所以,如果你能找到任何np完全问题的有效通解技术,或者证明不存在这样的问题,名利就是你的了。


如果答案可以在多项式时间内计算出来,那么一个是或否的问题就是在P(多项式时间)内。 一个是或否的问题是NP(非确定性多项式时间),如果一个是答案可以在多项式时间内验证。

直观地,我们可以看到,如果一个问题在P中,那么它就在NP中。给定P中问题的潜在答案,我们可以通过简单地重新计算答案来验证答案。

不太明显,但更难回答的是,NP中的所有问题是否都在p中,我们可以在多项式时间内验证答案,这是否意味着我们可以在多项式时间内计算答案?

有大量已知的重要问题是NP完全的(基本上,如果这些问题中的任何一个被证明是在P中,那么所有的NP问题都被证明是在P中)。如果P = NP,那么所有这些问题都将被证明有一个有效的(多项式时间)解。

大多数科学家认为P!=NP。然而,目前还没有证据证明P =NP或P!=NP。如果有人能提供任何一个猜想的证据,他们将赢得100万美元。


首先,一些定义:

A particular problem is in P if you can compute a solution in time less than n^k for some k, where n is the size of the input. For instance, sorting can be done in n log n which is less than n^2, so sorting is polynomial time. A problem is in NP if there exists a k such that there exists a solution of size at most n^k which you can verify in time at most n^k. Take 3-coloring of graphs: given a graph, a 3-coloring is a list of (vertex, color) pairs which has size O(n) and you can verify in time O(m) (or O(n^2)) whether all neighbors have different colors. So a graph is 3-colorable only if there is a short and readily verifiable solution.

NP的等效定义是“在多项式时间内由非确定性图灵机解决的问题”。虽然这告诉你这个名字的来源,但它并没有让你对NP问题有同样直观的感觉。

注意P是NP的一个子集:如果你能在多项式时间内找到一个解,那么就有一个解可以在多项式时间内被验证——只要检查给定的解是否等于你能找到的解。

为什么这个问题是P =?NP有趣吗?要回答这个问题,首先需要了解什么是np完全问题。简单地说,

如果(1)L在P中,且(2)解决L的算法可以用于解决NP中的任何L'问题,则问题L是NP完备的;也就是说,给定L'的一个实例,你可以创建一个L的实例,当且仅当L'的实例有解时,它有解。从形式上讲,NP中的每一个问题L’都可约为L。

注意,L的实例必须是多项式时间可计算的,并且具有多项式大小,大小为L';这样,在多项式时间内解决一个NP完全问题,就得到了所有NP问题的多项式时间解。

这里有一个例子:假设我们知道图的3种颜色是一个np困难的问题。我们想证明布尔公式的可满足性也是一个np难题。

对于每个顶点v,有两个布尔变量v_h和v_l,并且要求(v_h或v_l):每对只能有值{01,10,11},我们可以认为是颜色1,2和3。

对于每条边(u, v),都要求(u_h, u_l) != (v_h, v_l)。也就是说,

Not ((u_h and Not u_l) and (v_h and Not v_l) or…) 列举所有相等的构型,并规定它们都不是这样的。

将所有这些约束条件结合在一起,得到一个具有多项式大小(O(n+m))的布尔公式。你可以检查一下,它也需要多项式时间来计算:你对每个顶点和每条边都做了O(1)的事情。

如果你能解出我所做的布尔公式,那么你也可以解出图的着色:对于每一对变量v_h和v_l,让v的颜色与这些变量的值相匹配。根据公式的构造,相邻的颜色不会相等。

因此,如果图的3色是np完备的,那么布尔公式可满足性也是np完备的。

我们知道图的3色是np完备的;然而,历史上我们已经知道,通过首先展示布尔电路可满足性的np完整性,然后将其减少到3-着色性(而不是相反)。