有没有什么情况下你更喜欢O(log n)时间复杂度而不是O(1)时间复杂度?还是O(n)到O(log n)

你能举个例子吗?


当前回答

对于安全应用程序来说,这经常是这样的情况,我们希望设计算法缓慢的问题,以阻止某人过快地获得问题的答案。

这里有几个我能想到的例子。

Password hashing is sometimes made arbitrarily slow in order to make it harder to guess passwords by brute-force. This Information Security post has a bullet point about it (and much more). Bit Coin uses a controllably slow problem for a network of computers to solve in order to "mine" coins. This allows the currency to be mined at a controlled rate by the collective system. Asymmetric ciphers (like RSA) are designed to make decryption without the keys intentionally slow in order to prevent someone else without the private key to crack the encryption. The algorithms are designed to be cracked in hopefully O(2^n) time where n is the bit-length of the key (this is brute force).

在CS的其他地方,快速排序在最坏的情况下是O(n²),但在一般情况下是O(n*log(n))。因此,在分析算法效率时,“大O”分析有时并不是您唯一关心的事情。

其他回答

总有一个隐藏常数,在O(log n)算法中可以更低。因此,在实际生活数据中,它可以更快地工作。

还有空间问题(比如在烤面包机上运行)。

还有开发人员的时间问题——O(log n)可能更容易实现和验证1000倍。

人们已经回答了你的确切问题,所以我要回答一个稍微不同的问题,人们来这里时可能会想到这个问题。

许多“O(1)时间”算法和数据结构实际上只需要预期的O(1)时间,这意味着它们的平均运行时间是O(1),可能仅在某些假设下。

常见的例子:哈希表,“数组列表”的扩展(也就是动态大小的数组/向量)。

在这种情况下,您可能更喜欢使用保证时间绝对受对数限制的数据结构或算法,即使它们的平均性能可能更差。 一个例子可能是平衡二叉搜索树,它的运行时间平均较差,但在最坏的情况下更好。

选择大O复杂度高的算法而不是大O复杂度低的算法的原因有很多:

most of the time, lower big-O complexity is harder to achieve and requires skilled implementation, a lot of knowledge and a lot of testing. big-O hides the details about a constant: algorithm that performs in 10^5 is better from big-O point of view than 1/10^5 * log(n) (O(1) vs O(log(n)), but for most reasonable n the first one will perform better. For example the best complexity for matrix multiplication is O(n^2.373) but the constant is so high that no (to my knowledge) computational libraries use it. big-O makes sense when you calculate over something big. If you need to sort array of three numbers, it matters really little whether you use O(n*log(n)) or O(n^2) algorithm. sometimes the advantage of the lowercase time complexity can be really negligible. For example there is a data structure tango tree which gives a O(log log N) time complexity to find an item, but there is also a binary tree which finds the same in O(log n). Even for huge numbers of n = 10^20 the difference is negligible. time complexity is not everything. Imagine an algorithm that runs in O(n^2) and requires O(n^2) memory. It might be preferable over O(n^3) time and O(1) space when the n is not really big. The problem is that you can wait for a long time, but highly doubt you can find a RAM big enough to use it with your algorithm parallelization is a good feature in our distributed world. There are algorithms that are easily parallelizable, and there are some that do not parallelize at all. Sometimes it makes sense to run an algorithm on 1000 commodity machines with a higher complexity than using one machine with a slightly better complexity. in some places (security) a complexity can be a requirement. No one wants to have a hash algorithm that can hash blazingly fast (because then other people can bruteforce you way faster) although this is not related to switch of complexity, but some of the security functions should be written in a manner to prevent timing attack. They mostly stay in the same complexity class, but are modified in a way that it always takes worse case to do something. One example is comparing that strings are equal. In most applications it makes sense to break fast if the first bytes are different, but in security you will still wait for the very end to tell the bad news. somebody patented the lower-complexity algorithm and it is more economical for a company to use higher complexity than to pay money. some algorithms adapt well to particular situations. Insertion sort, for example, has an average time-complexity of O(n^2), worse than quicksort or mergesort, but as an online algorithm it can efficiently sort a list of values as they are received (as user input) where most other algorithms can only efficiently operate on a complete list of values.

Alistra指出了这一点,但未能提供任何例子,所以我会。

您有一个包含10,000个UPC代码的列表,用于您的商店销售的产品。10位UPC,整数价格(便士价格)和30个字符的收据描述。

O(log N)方法:你有一个排序的列表。ASCII是44字节,Unicode是84字节。或者,将UPC视为int64,将得到42和72字节。10,000条记录——在最高的情况下,您看到的存储空间略低于1mb。

O(1)方法:不存储UPC,而是将其用作数组的一个条目。在最低的情况下,您将看到近三分之一tb的存储空间。

Which approach you use depends on your hardware. On most any reasonable modern configuration you're going to use the log N approach. I can picture the second approach being the right answer if for some reason you're running in an environment where RAM is critically short but you have plenty of mass storage. A third of a terabyte on a disk is no big deal, getting your data in one probe of the disk is worth something. The simple binary approach takes 13 on average. (Note, however, that by clustering your keys you can get this down to a guaranteed 3 reads and in practice you would cache the first one.)

在关注数据安全的上下文中,如果更复杂的算法对定时攻击有更好的抵抗能力,那么更复杂的算法可能比不太复杂的算法更可取。