这是我能想到的最好的算法。
def get_primes(n):
numbers = set(range(n, 1, -1))
primes = []
while numbers:
p = numbers.pop()
primes.append(p)
numbers.difference_update(set(range(p*2, n+1, p)))
return primes
>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import get_primes').timeit(1)
1.1499958793645562
还能做得更快吗?
这段代码有一个缺陷:由于numbers是一个无序集,不能保证numbers.pop()将从集合中移除最低的数字。尽管如此,它还是适用于(至少对我来说)一些输入数字:
>>> sum(get_primes(2000000))
142913828922L
#That's the correct sum of all numbers below 2 million
>>> 529 in get_primes(1000)
False
>>> 529 in get_primes(530)
True
假设N < 9,080,191, Miller-Rabin's Primality检验的确定性实现
import sys
def miller_rabin_pass(a, n):
d = n - 1
s = 0
while d % 2 == 0:
d >>= 1
s += 1
a_to_power = pow(a, d, n)
if a_to_power == 1:
return True
for i in range(s-1):
if a_to_power == n - 1:
return True
a_to_power = (a_to_power * a_to_power) % n
return a_to_power == n - 1
def miller_rabin(n):
if n <= 2:
return n == 2
if n < 2_047:
return miller_rabin_pass(2, n)
return all(miller_rabin_pass(a, n) for a in (31, 73))
n = int(sys.argv[1])
primes = [2]
for p in range(3,n,2):
if miller_rabin(p):
primes.append(p)
print len(primes)
根据维基百科(http://en.wikipedia.org/wiki/Miller -Rabin_primality_test)上的文章,对于a = 37和73,测试N < 9,080,191足以判断N是否为合数。
我从原始米勒-拉宾测试的概率实现中改编了源代码:https://www.literateprograms.org/miller-rabin_primality_test__python_.html
使用Sundaram的Sieve,我想我打破了pure-Python的记录:
def sundaram3(max_n):
numbers = range(3, max_n+1, 2)
half = (max_n)//2
initial = 4
for step in xrange(3, max_n+1, 2):
for i in xrange(initial, half, step):
numbers[i-1] = 0
initial += 2*(step+1)
if initial > half:
return [2] + filter(None, numbers)
Comparasion:
C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.get_primes_erat(1000000)"
10 loops, best of 3: 710 msec per loop
C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.daniel_sieve_2(1000000)"
10 loops, best of 3: 435 msec per loop
C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.sundaram3(1000000)"
10 loops, best of 3: 327 msec per loop
下面是我在Python中通常用来生成质数的代码:
$ python -mtimeit -s'import sieve' 'sieve.sieve(1000000)'
10 loops, best of 3: 445 msec per loop
$ cat sieve.py
from math import sqrt
def sieve(size):
prime=[True]*size
rng=xrange
limit=int(sqrt(size))
for i in rng(3,limit+1,+2):
if prime[i]:
prime[i*i::+i]=[False]*len(prime[i*i::+i])
return [2]+[i for i in rng(3,size,+2) if prime[i]]
if __name__=='__main__':
print sieve(100)
它不能与这里发布的更快的解决方案竞争,但至少它是纯python。
谢谢你提出这个问题。我今天真的学到了很多东西。