我在问更多关于这对我的代码意味着什么。我能从数学上理解这些概念,只是很难理解它们在概念上的含义。例如,如果有人要对一个数据结构执行O(1)操作,我知道它必须执行的操作数量不会增长,因为有更多的项。O(n)操作意味着您将对每个元素执行一组操作。有人能帮我填一下吗?

比如O(n²)的运算会怎样? 如果一个操作是O(nlog (n))这是什么意思? 有人必须吸可卡因才能写出O(x!)吗?


当前回答

这可能太数学化了,但这是我的尝试。(我是数学家。)

如果某个东西是O(f(n)),那么它在n个元素上的运行时间将等于A f(n) + B(以时钟周期或CPU操作为单位)。理解这些常量A和B是非常关键的,它们来自特定的实现。B本质上代表你的操作的“常量开销”,例如你所做的一些预处理不依赖于集合的大小。A表示实际项目处理算法的速度。

关键在于,你可以使用大O符号来计算某物的可伸缩性。所以这些常数并不重要:如果你想弄清楚如何从10个项目扩展到10000个项目,谁会关心开销常数B呢?类似地,其他问题(见下文)肯定会超过乘法常数A的重要性。

So the real deal is f(n). If f grows not at all with n, e.g. f(n) = 1, then you'll scale fantastically---your running time will always just be A + B. If f grows linearly with n, i.e. f(n) = n, your running time will scale pretty much as best as can be expected---if your users are waiting 10 ns for 10 elements, they'll wait 10000 ns for 10000 elements (ignoring the additive constant). But if it grows faster, like n2, then you're in trouble; things will start slowing down way too much when you get larger collections. f(n) = n log(n) is a good compromise, usually: your operation can't be so simple as to give linear scaling, but you've managed to cut things down such that it'll scale much better than f(n) = n2.

实际上,这里有一些很好的例子:

O(1): retrieving an element from an array. We know exactly where it is in memory, so we just go get it. It doesn't matter if the collection has 10 items or 10000; it's still at index (say) 3, so we just jump to location 3 in memory. O(n): retrieving an element from a linked list. Here, A = 0.5, because on average you''ll have to go through 1/2 of the linked list before you find the element you're looking for. O(n2): various "dumb" sorting algorithms. Because generally their strategy involves, for each element (n), you look at all the other elements (so times another n, giving n2), then position yourself in the right place. O(n log(n)): various "smart" sorting algorithms. It turns out that you only need to look at, say, 10 elements in a 1010-element collection to intelligently sort yourself relative to everyone else in the collection. Because everyone else is also going to look at 10 elements, and the emergent behavior is orchestrated just right so that this is enough to produce a sorted list. O(n!): an algorithm that "tries everything," since there are (proportional to) n! possible combinations of n elements that might solve a given problem. So it just loops through all such combinations, tries them, then stops whenever it succeeds.

其他回答

只是为了回应我上面帖子的一些评论:

Domenic - I'm on this site, and I care. Not for pedantry's sake, but because we - as programmers - typically care about precision. Using O( ) notation incorrectly in the style that some have done here renders it kind of meaningless; we may just as well say something takes n^2 units of time as O( n^2 ) under the conventions used here. Using the O( ) adds nothing. It's not just a small discrepancy between common usage and mathematical precision that I'm talking about, it's the difference between it being meaningful and it not.

我知道很多很多优秀的程序员都准确地使用这些术语。说“哦,我们是程序员,所以我们不在乎”会降低整个企业的成本。

一个接一个-嗯,不完全是,尽管我同意你的观点。对于任意大的n,它不是O(1)这是O()的定义。它只是表明O()对于有界n的适用性有限,在这里我们更愿意讨论所走的步数,而不是这个数字的界限。

有一件事由于某种原因还没有被提及:

当你看到像O(2^n)或O(n^3)这样的算法时,这通常意味着你将不得不接受一个不完美的问题答案,以获得可接受的性能。

在处理优化问题时,像这样的正确解决方案很常见。在合理的时间内给出一个近乎正确的答案,总比在机器腐烂成灰尘很久之后才给出一个正确答案要好。

以国际象棋为例:我不知道正确的解决方案是什么,但它可能是O(n^50)或更糟。从理论上讲,任何计算机都不可能真正计算出正确答案——即使你用宇宙中的每个粒子作为计算元素,在宇宙生命周期内尽可能短的时间内执行一项操作,你仍然会剩下很多零。(量子计算机能否解决这个问题是另一回事。)

我是这样向我那些不懂技术的朋友描述的:

考虑多位数加法。很好的老式铅笔和纸的补充。就是你7-8岁时学的那种。给定两个三位数或四位数,你很容易就能求出它们加起来是多少。

如果我给你两个100位的数字,然后问你它们加起来是多少,即使你必须使用铅笔和纸,计算出来也会非常简单。一个聪明的孩子可以在几分钟内做这样的加法。这只需要大约100次操作。

现在,考虑多位数乘法。你可能在八九岁的时候就学会了。你(希望)做了很多重复的练习来学习它背后的机制。

Now, imagine I gave you those same two 100-digit numbers and told you to multiply them together. This would be a much, much harder task, something that would take you hours to do - and that you'd be unlikely to do without mistakes. The reason for this is that (this version of) multiplication is O(n^2); each digit in the bottom number has to be multiplied by each digit in the top number, leaving a total of about n^2 operations. In the case of the 100-digit numbers, that's 10,000 multiplications.

其中很多都很容易用非编程的东西来演示,比如洗牌。

对一副牌进行排序通过遍历整副牌找到黑桃a,然后遍历整副牌找到黑桃2,以此类推最坏情况是n^2,如果这副牌已经倒着排序了。你看了52张牌52次。

一般来说,真正糟糕的算法不一定是故意的,它们通常是对其他东西的误用,比如在同一集合上线性重复的另一个方法中调用一个线性方法。

假设你有一台可以解决一定规模问题的计算机。现在想象一下,我们可以将性能提高几倍。每加倍一次,我们能解决多大的问题?

如果我们能解决一个两倍大的问题,那就是O(n)

如果我们有一个非1的乘数,那就是某种多项式复杂度。例如,如果每加倍一次,问题的规模就会增加约40%,即O(n²),而约30%则是O(n³)。

如果我们只是增加问题的规模,它是指数级的,甚至更糟。例如,如果每翻一倍意味着我们可以解决一个大1的问题,它就是O(2^n)。(这就是为什么使用合理大小的密钥实际上不可能强制使用密码密钥:128位密钥需要的处理量大约是64位密钥的16万亿倍。)