有没有什么情况下你更喜欢O(log n)时间复杂度而不是O(1)时间复杂度?还是O(n)到O(log n)
你能举个例子吗?
有没有什么情况下你更喜欢O(log n)时间复杂度而不是O(1)时间复杂度?还是O(n)到O(log n)
你能举个例子吗?
当前回答
简单地说:因为系数(与该步骤的设置、存储和执行时间相关的成本)在较小的大o问题中比在较大的大o问题中要大得多。Big-O只是算法可伸缩性的一个衡量标准。
考虑以下来自黑客词典的例子,提出了一个依赖于量子力学的多重世界解释的排序算法:
用量子过程随机排列数组, 如果数组没有排序,毁灭宇宙。 所有剩下的宇宙现在都被排序了(包括你所在的宇宙)。
(来源:http://catb.org/ esr /术语/ html / B / bogo-sort.html)
注意,这个算法的大O是O(n),它击败了迄今为止在一般项目上的任何已知排序算法。线性阶跃的系数也很低(因为它只是一个比较,而不是交换,是线性完成的)。事实上,类似的算法可以用于在多项式时间内解决NP和co-NP中的任何问题,因为每个可能的解(或没有解的可能证明)都可以使用量子过程生成,然后在多项式时间内验证。
然而,在大多数情况下,我们可能不想冒多重世界可能不正确的风险,更不用说实现步骤2的行为仍然是“留给读者的练习”。
其他回答
Alistra指出了这一点,但未能提供任何例子,所以我会。
您有一个包含10,000个UPC代码的列表,用于您的商店销售的产品。10位UPC,整数价格(便士价格)和30个字符的收据描述。
O(log N)方法:你有一个排序的列表。ASCII是44字节,Unicode是84字节。或者,将UPC视为int64,将得到42和72字节。10,000条记录——在最高的情况下,您看到的存储空间略低于1mb。
O(1)方法:不存储UPC,而是将其用作数组的一个条目。在最低的情况下,您将看到近三分之一tb的存储空间。
Which approach you use depends on your hardware. On most any reasonable modern configuration you're going to use the log N approach. I can picture the second approach being the right answer if for some reason you're running in an environment where RAM is critically short but you have plenty of mass storage. A third of a terabyte on a disk is no big deal, getting your data in one probe of the disk is worth something. The simple binary approach takes 13 on average. (Note, however, that by clustering your keys you can get this down to a guaranteed 3 reads and in practice you would cache the first one.)
考虑一个红黑树。它具有O(log n)的访问、搜索、插入和删除操作。与数组相比,数组的访问权限为O(1),其余操作为O(n)。
因此,对于一个插入、删除或搜索比访问更频繁的应用程序,并且只能在这两种结构之间进行选择,我们更喜欢红黑树。在这种情况下,你可能会说我们更喜欢红黑树更麻烦的O(log n)访问时间。
为什么?因为权限不是我们最关心的。我们正在权衡:应用程序的性能更大程度上受到其他因素的影响。我们允许这种特定的算法受到性能影响,因为我们通过优化其他算法获得了很大的收益。
So the answer to your question is simply this: when the algorithm's growth rate isn't what we want to optimize, when we want to optimize something else. All of the other answers are special cases of this. Sometimes we optimize the run time of other operations. Sometimes we optimize for memory. Sometimes we optimize for security. Sometimes we optimize maintainability. Sometimes we optimize for development time. Even the overriding constant being low enough to matter is optimizing for run time when you know the growth rate of the algorithm isn't the greatest impact on run time. (If your data set was outside this range, you would optimize for the growth rate of the algorithm because it would eventually dominate the constant.) Everything has a cost, and in many cases, we trade the cost of a higher growth rate for the algorithm to optimize something else.
There is a good use case for using a O(log(n)) algorithm instead of an O(1) algorithm that the numerous other answers have ignored: immutability. Hash maps have O(1) puts and gets, assuming good distribution of hash values, but they require mutable state. Immutable tree maps have O(log(n)) puts and gets, which is asymptotically slower. However, immutability can be valuable enough to make up for worse performance and in the case where multiple versions of the map need to be retained, immutability allows you to avoid having to copy the map, which is O(n), and therefore can improve performance.
A more general question is if there are situations where one would prefer an O(f(n)) algorithm to an O(g(n)) algorithm even though g(n) << f(n) as n tends to infinity. As others have already mentioned, the answer is clearly "yes" in the case where f(n) = log(n) and g(n) = 1. It is sometimes yes even in the case that f(n) is polynomial but g(n) is exponential. A famous and important example is that of the Simplex Algorithm for solving linear programming problems. In the 1970s it was shown to be O(2^n). Thus, its worse-case behavior is infeasible. But -- its average case behavior is extremely good, even for practical problems with tens of thousands of variables and constraints. In the 1980s, polynomial time algorithms (such a Karmarkar's interior-point algorithm) for linear programming were discovered, but 30 years later the simplex algorithm still seems to be the algorithm of choice (except for certain very large problems). This is for the obvious reason that average-case behavior is often more important than worse-case behavior, but also for a more subtle reason that the simplex algorithm is in some sense more informative (e.g. sensitivity information is easier to extract).
我很惊讶没有人提到内存绑定应用程序。
可能存在一种算法具有较少的浮点运算,这要么是因为它的复杂性(即O(1) < O(log n)),要么是因为复杂度前面的常数更小(即2n2 < 6n2)。无论如何,如果较低的FLOP算法的内存限制更大,您可能仍然更喜欢具有更多FLOP的算法。
我所说的“内存受限”是指您经常访问的数据经常超出缓存。为了获取这些数据,在对其执行操作之前,必须将内存从实际内存空间拉到缓存中。这个抓取步骤通常非常慢——比您的操作本身慢得多。
因此,如果你的算法需要更多的操作(但这些操作是在已经在缓存中的数据上执行的[因此不需要读取]),它仍然会在实际的walltime方面以更少的操作(必须在缓存外的数据上执行[因此需要读取])胜过你的算法。