This is the best algorithm I could come up with.

def get_primes(n):
    numbers = set(range(n, 1, -1))
    primes = []
    while numbers:
        p = numbers.pop()
        primes.append(p)
        numbers.difference_update(set(range(p*2, n+1, p)))
    return primes

>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import get_primes').timeit(1)
1.1499958793645562

Can it be made even faster?

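This code has a flaw: since numbers is an unordered set, there is no guarantee that numbers.pop() will remove the lowest number from the set. Nevertheless, it works (at least for me) for some inputs: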

>>> sum(get_primes(2000000))
142913828922L
# That's the correct sum of all primes below 2 million
>>> 529 in get_primes(1000)
False
>>> 529 in get_primes(530)
True

Current answer

A slightly different implementation of a half sieve using Numpy:

http://rebrained.com/?p=458

import math
import numpy
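def prime6(upto):
    # Candidates are the odd numbers 3, 5, 7, ...; one flag per candidate.
    primes=numpy.arange(3,upto+1,2)
    # // keeps the sizes and indices integral on Python 3 as well as Python 2.
    isprime=numpy.ones((upto-1)//2,dtype=bool)
    for factor in primes[:int(math.sqrt(upto))]:
        # Odd number m sits at index (m-2)//2; strike out 3*factor, 5*factor, ...
        if isprime[(factor-2)//2]: isprime[(factor*3-2)//2:(upto-1)//2:factor]=0
    return numpy.insert(primes[isprime],0,2)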

Can someone compare this with the other timings? On my machine it seems roughly comparable to the other Numpy half sieve.

Other answers

There is a pretty neat example from the Python Cookbook here. The fastest version proposed at that URL is:

import itertools
def erat2( ):
    D = {  }
    yield 2
    for q in itertools.islice(itertools.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = p + q
            while x in D or not (x&1):
                x += p
            D[x] = p

so that would give

def get_primes_erat(n):
  return list(itertools.takewhile(lambda p: p<n, erat2()))

Measuring this at the shell prompt (as I prefer to do), with the code in pri.py, I observe:

$ python2.5 -mtimeit -s'import pri' 'pri.get_primes(1000000)'
10 loops, best of 3: 1.69 sec per loop
$ python2.5 -mtimeit -s'import pri' 'pri.get_primes_erat(1000000)'
10 loops, best of 3: 673 msec per loop

So it looks like the Cookbook solution is over twice as fast.
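
The same kind of measurement can also be taken from inside Python with the timeit module. This is only a sketch: best_time is a helper name made up for this example, and it accepts any function that takes the limit n as its single argument.

```python
import timeit

def best_time(func, n, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` single runs.

    Hypothetical helper, not part of any answer above: `func` is any
    callable taking the upper limit `n`.
    """
    # number=1 runs the call once per measurement; taking the minimum
    # reduces noise from other processes, as the timeit docs suggest.
    return min(timeit.repeat(lambda: func(n), number=1, repeat=repeat))
```

With the module from above this would be called as best_time(pri.get_primes_erat, 1000000); the numbers will of course differ from machine to machine.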

Over time I have collected several prime number sieves. The fastest on my computer is this one:

from time import time
# 175 ms for all the primes up to the value 10**6
def primes_sieve(limit):
    a = [True] * limit
    a[0] = a[1] = False
    #a[2] = True
    for n in xrange(4, limit, 2):
        a[n] = False
    root_limit = int(limit**.5)+1
    for i in xrange(3,root_limit):
        if a[i]:
            for n in xrange(i*i, limit, 2*i):
                a[n] = False
    return a

LIMIT = 10**6
s=time()
primes = primes_sieve(LIMIT)
print time()-s
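
Note that primes_sieve returns the list of flags, not the primes themselves. A rough Python 3 sketch of the same sieve plus the extraction step could look like this. The names primes_sieve_py3 and flags_to_primes are invented here, and limit >= 2 is assumed.

```python
def primes_sieve_py3(limit):
    """Python 3 sketch of the sieve above; a[i] is True iff i is prime, for i < limit.

    Assumes limit >= 2 (hypothetical port, not the original author's code).
    """
    a = [True] * limit
    a[0] = a[1] = False
    # Strike out every even number except 2.
    for n in range(4, limit, 2):
        a[n] = False
    root_limit = int(limit ** 0.5) + 1
    # Only odd i can still be flagged, so step by 2.
    for i in range(3, root_limit, 2):
        if a[i]:
            # Start at i*i and skip even multiples by stepping 2*i.
            for n in range(i * i, limit, 2 * i):
                a[n] = False
    return a

def flags_to_primes(flags):
    """Turn the boolean flag list into the actual list of primes."""
    return [i for i, is_prime in enumerate(flags) if is_prime]
```

The extraction step is an extra pass over the list, so it adds to the time measured above.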

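Your code is fast and about the simplest way to generate primes. But it does not work for larger inputs such as n=10000 or n=10000000; the .pop() method is probably what fails.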

Consider: is N prime?

case 1: You got some factors of N,

for i in range(2, N):

If N is prime the loop is performed ~(N-2) times, otherwise fewer times.

case 2:

for i in range(2, int(math.sqrt(N))):

The loop is performed almost ~(sqrt(N)-2) times if N is prime, otherwise it will break somewhere.

case 3: Better: we divide N only by the primes <= sqrt(N), where the loop is performed only π(sqrt(N)) times. π(sqrt(N)) << sqrt(N) as N increases.

from math import sqrt
from time import *

prime_list = [2]
n = int(input())
s = time()
for n0 in range(2,n+1):
    for i0 in prime_list:
        if n0%i0==0:
            break
        elif i0>=int(sqrt(n0)):
            prime_list.append(n0)
            break
e = time()
print(e-s)
#print(prime_list); print(f'pi({n})={len(prime_list)}')
print(f'{n}: {len(prime_list)}, time: {e-s}')

Output

100: 25, time: 0.00010275840759277344
1000: 168, time: 0.0008606910705566406
10000: 1229, time: 0.015588521957397461
100000: 9592, time: 0.023436546325683594
1000000: 78498, time: 4.1965954303741455
10000000: 664579, time: 109.24591708183289
100000000: 5761455, time: 2289.130858898163

It seems slow below 1000, but below 10^6 I think it is faster.

However, I could not work out the time complexity.
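
One way to get a feel for the cost empirically is to count the modulo operations instead of measuring wall-clock time. The sketch below is an instrumented variant of the trial-division loop, not a derivation of the complexity. trial_division_stats is a hypothetical name, math.isqrt needs Python 3.8 or later, and n >= 2 is assumed.

```python
from math import isqrt

def trial_division_stats(n):
    """Return (primes <= n, number of modulo operations performed).

    Instrumented sketch of trial division by known primes; assumes n >= 2.
    """
    primes = [2]
    divisions = 0
    for candidate in range(3, n + 1):
        limit = isqrt(candidate)  # exact integer square root
        for p in primes:
            divisions += 1
            if candidate % p == 0:
                break  # found a factor, so candidate is composite
            if p >= limit:
                # No prime up to sqrt(candidate) divides it: it is prime.
                primes.append(candidate)
                break
    return primes, divisions
```

Printing the division count for n = 10**3, 10**4, 10**5 and comparing how it grows gives a machine-independent picture of the scaling.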

This algorithm is fast, but it has a serious flaw:

>>> sorted(get_primes(530))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73,
79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163,
167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251,
257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443,
449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 527, 529]
>>> 17*31
527
>>> 23*23
529

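You assume that numbers.pop() returns the smallest number in the set, but this is not guaranteed at all. Sets are unordered and pop() removes and returns an arbitrary element, so it cannot be used to select the next prime from the remaining numbers.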
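
One way around this, sketched here as a minimal variant rather than a tuned implementation, is to keep the set only for fast removal and walk the candidates in increasing order. The name get_primes_ordered is made up for this example; like the original, it includes n itself when n is prime.

```python
def get_primes_ordered(n):
    """Primes <= n, using a set for removal but a fixed ascending scan.

    Hypothetical variant of get_primes that does not rely on set.pop().
    """
    numbers = set(range(2, n + 1))
    primes = []
    for p in range(2, n + 1):
        # p is still in the set only if no smaller prime divides it.
        if p in numbers:
            primes.append(p)
            # Multiples below p*p were already removed by smaller primes.
            numbers.difference_update(range(p * p, n + 1, p))
    return primes
```

With this ordering 527 and 529 no longer show up in the result for n = 530, at the price of scanning every candidate instead of popping from the set.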

Starting from the 2021 answer, I have not found the binary array approach to be advantageous for primes below 1 billion.

But I could speed up primesfrom2to by close to x2 with a few tricks:

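- Using the numexpr library to turn numpy expressions into tight loops with fewer allocations
- Replacing np.ones with a faster alternative
- Manipulating the first 9 elements of the sieve somehow, so that the shape of the array does not need to change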

Altogether, the time for primes < 1 billion went from 25 seconds to 14.5 seconds on my machine.

import numexpr as ne
import numpy as np

def primesfrom2to_numexpr(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=24, Returns a array of primes, 2 <= p < n + a few over"""
    sieve = np.zeros((n // 3 + (n % 6 == 2))//4+1, dtype=np.int32)
    ne.evaluate('sieve + 0x01010101', out=sieve)
    sieve = sieve.view('int8')
    #sieve = np.ones(n // 3 + (n % 6 == 2), dtype=np.bool_)
    sieve[0] = 0
    for i in np.arange(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = 0
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = 0
    sieve[[0,8]] = 1
    result = np.flatnonzero(sieve)
    ne.evaluate('result * 3 + 1 + result%2', out=result)
    result[:9] = [2,3,5,7,11,13,17,19,23]
    return result