有没有什么情况下你更喜欢O(log n)时间复杂度而不是O(1)时间复杂度?还是O(n)到O(log n)
你能举个例子吗?
有没有什么情况下你更喜欢O(log n)时间复杂度而不是O(1)时间复杂度?还是O(n)到O(log n)
你能举个例子吗?
当前回答
在n有界且O(1)算法的常数乘子高于log(n)上的界的任意点。例如,在哈希集中存储值是O(1),但可能需要对哈希函数进行昂贵的计算。如果数据项可以简单地进行比较(相对于某些顺序),并且n的边界是这样的,log n明显小于任何一项上的哈希计算,那么存储在平衡二叉树中可能比存储在哈希集中更快。
其他回答
选择大O复杂度高的算法而不是大O复杂度低的算法的原因有很多:
most of the time, lower big-O complexity is harder to achieve and requires skilled implementation, a lot of knowledge and a lot of testing. big-O hides the details about a constant: algorithm that performs in 10^5 is better from big-O point of view than 1/10^5 * log(n) (O(1) vs O(log(n)), but for most reasonable n the first one will perform better. For example the best complexity for matrix multiplication is O(n^2.373) but the constant is so high that no (to my knowledge) computational libraries use it. big-O makes sense when you calculate over something big. If you need to sort array of three numbers, it matters really little whether you use O(n*log(n)) or O(n^2) algorithm. sometimes the advantage of the lowercase time complexity can be really negligible. For example there is a data structure tango tree which gives a O(log log N) time complexity to find an item, but there is also a binary tree which finds the same in O(log n). Even for huge numbers of n = 10^20 the difference is negligible. time complexity is not everything. Imagine an algorithm that runs in O(n^2) and requires O(n^2) memory. It might be preferable over O(n^3) time and O(1) space when the n is not really big. The problem is that you can wait for a long time, but highly doubt you can find a RAM big enough to use it with your algorithm parallelization is a good feature in our distributed world. There are algorithms that are easily parallelizable, and there are some that do not parallelize at all. Sometimes it makes sense to run an algorithm on 1000 commodity machines with a higher complexity than using one machine with a slightly better complexity. in some places (security) a complexity can be a requirement. No one wants to have a hash algorithm that can hash blazingly fast (because then other people can bruteforce you way faster) although this is not related to switch of complexity, but some of the security functions should be written in a manner to prevent timing attack. They mostly stay in the same complexity class, but are modified in a way that it always takes worse case to do something. One example is comparing that strings are equal. In most applications it makes sense to break fast if the first bytes are different, but in security you will still wait for the very end to tell the bad news. somebody patented the lower-complexity algorithm and it is more economical for a company to use higher complexity than to pay money. some algorithms adapt well to particular situations. Insertion sort, for example, has an average time-complexity of O(n^2), worse than quicksort or mergesort, but as an online algorithm it can efficiently sort a list of values as they are received (as user input) where most other algorithms can only efficiently operate on a complete list of values.
Alistra指出了这一点,但未能提供任何例子,所以我会。
您有一个包含10,000个UPC代码的列表,用于您的商店销售的产品。10位UPC,整数价格(便士价格)和30个字符的收据描述。
O(log N)方法:你有一个排序的列表。ASCII是44字节,Unicode是84字节。或者,将UPC视为int64,将得到42和72字节。10,000条记录——在最高的情况下,您看到的存储空间略低于1mb。
O(1)方法:不存储UPC,而是将其用作数组的一个条目。在最低的情况下,您将看到近三分之一tb的存储空间。
Which approach you use depends on your hardware. On most any reasonable modern configuration you're going to use the log N approach. I can picture the second approach being the right answer if for some reason you're running in an environment where RAM is critically short but you have plenty of mass storage. A third of a terabyte on a disk is no big deal, getting your data in one probe of the disk is worth something. The simple binary approach takes 13 on average. (Note, however, that by clustering your keys you can get this down to a guaranteed 3 reads and in practice you would cache the first one.)
Yes.
在实际情况下,我们运行了一些使用短字符串和长字符串键进行表查找的测试。
我们使用了std::map, std::unordered_map和一个哈希,该哈希最多对字符串长度进行10次采样(我们的键倾向于guidlike,所以这是体面的),以及一个哈希,对每个字符进行采样(理论上减少了冲突),一个未排序的向量,其中我们进行==比较,以及(如果我没记错的话)一个未排序的向量,其中我们还存储了一个哈希,首先比较哈希,然后比较字符。
这些算法的范围从O(1) (unordered_map)到O(n)(线性搜索)。
对于中等大小的N,通常O(N)优于O(1)。我们怀疑这是因为基于节点的容器需要我们的计算机在内存中跳跃更多,而基于线性的容器则不需要。
O(lgn)存在于两者之间。我不记得是怎么回事了。
性能差异并不大,在更大的数据集上,基于哈希的表现要好得多。所以我们坚持使用基于哈希的无序映射。
实际上,对于合理的n大小,O(lgn)为O(1)。如果你的计算机在你的表中只有40亿的空间,那么O(lgn)的上界是32。(lg(2^32)=32)(在计算机科学中,lg是log based 2的简称)。
在实践中,lg(n)算法比O(1)算法慢,不是因为对数增长因子,而是因为lg(n)部分通常意味着算法有一定程度的复杂性,并且这种复杂性比lg(n)项中的任何“增长”都增加了更大的常数因子。
然而,复杂的O(1)算法(如哈希映射)很容易具有类似或更大的常数因子。
A more general question is if there are situations where one would prefer an O(f(n)) algorithm to an O(g(n)) algorithm even though g(n) << f(n) as n tends to infinity. As others have already mentioned, the answer is clearly "yes" in the case where f(n) = log(n) and g(n) = 1. It is sometimes yes even in the case that f(n) is polynomial but g(n) is exponential. A famous and important example is that of the Simplex Algorithm for solving linear programming problems. In the 1970s it was shown to be O(2^n). Thus, its worse-case behavior is infeasible. But -- its average case behavior is extremely good, even for practical problems with tens of thousands of variables and constraints. In the 1980s, polynomial time algorithms (such a Karmarkar's interior-point algorithm) for linear programming were discovered, but 30 years later the simplex algorithm still seems to be the algorithm of choice (except for certain very large problems). This is for the obvious reason that average-case behavior is often more important than worse-case behavior, but also for a more subtle reason that the simplex algorithm is in some sense more informative (e.g. sensitivity information is easier to extract).
简单地说:因为系数(与该步骤的设置、存储和执行时间相关的成本)在较小的大o问题中比在较大的大o问题中要大得多。Big-O只是算法可伸缩性的一个衡量标准。
考虑以下来自黑客词典的例子,提出了一个依赖于量子力学的多重世界解释的排序算法:
用量子过程随机排列数组, 如果数组没有排序,毁灭宇宙。 所有剩下的宇宙现在都被排序了(包括你所在的宇宙)。
(来源:http://catb.org/ esr /术语/ html / B / bogo-sort.html)
注意,这个算法的大O是O(n),它击败了迄今为止在一般项目上的任何已知排序算法。线性阶跃的系数也很低(因为它只是一个比较,而不是交换,是线性完成的)。事实上,类似的算法可以用于在多项式时间内解决NP和co-NP中的任何问题,因为每个可能的解(或没有解的可能证明)都可以使用量子过程生成,然后在多项式时间内验证。
然而,在大多数情况下,我们可能不想冒多重世界可能不正确的风险,更不用说实现步骤2的行为仍然是“留给读者的练习”。