What does np.random.seed do?

np.random.seed(0)

Current answer

np.random.seed(0) makes the random numbers predictable:

>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55,  0.72,  0.6 ,  0.54])
>>> numpy.random.seed(0) ; numpy.random.rand(4)
array([ 0.55,  0.72,  0.6 ,  0.54])

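With the seed reset every time, the same set of numbers appears every time.

If the random seed is not reset, different numbers appear with every invocation: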

>>> numpy.random.rand(4)
array([ 0.42,  0.65,  0.44,  0.89])
>>> numpy.random.rand(4)
array([ 0.96,  0.38,  0.79,  0.53])

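(Pseudo-)random numbers work by starting with a number (the seed), multiplying it by a large number, adding an offset, then taking the modulo of that sum. The resulting number is then used as the seed to generate the next "random" number. When you set the seed every time, it does the same thing every time, giving you the same numbers. (This describes a linear congruential generator, the simplest scheme of this kind. numpy.random.seed itself drives a Mersenne Twister, which is more elaborate but equally deterministic.)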
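The multiply, add, take-the-modulo recipe above can be sketched in a few lines. This is only an illustration of the idea: the function names `lcg` and `take` and the constants `a`, `c` and `m` are assumptions chosen for the example, and this is not the algorithm NumPy uses.

```python
def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Yield pseudo-random integers from a toy linear congruential generator."""
    state = seed
    while True:
        # Multiply by a large number, add an offset, take the modulo.
        state = (a * state + c) % m
        # The result is both the output and the "seed" for the next step.
        yield state


def take(gen, n):
    """Collect the next n values from a generator."""
    return [next(gen) for _ in range(n)]


first = take(lcg(0), 3)
again = take(lcg(0), 3)   # same seed, so the same numbers come out
other = take(lcg(1), 3)   # different seed, so a different sequence

print(first == again)
print(first == other)
```

Starting twice from the same seed replays the same sequence, which is all that "setting the seed" means.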

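If you want seemingly random numbers, do not set the seed. If you have code that uses random numbers and you want to debug it, however, it can be very helpful to set the seed before each run so that the code does the same thing every time you run it.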

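To get the most random numbers for each run, call numpy.random.seed() with no argument. This causes numpy to set the seed to a random number obtained from /dev/urandom or its Windows analog or, if neither of those is available, to use the clock.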
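As a small sketch of the no-argument form described above (the variable names are just for illustration), reseeding without a value before each draw should give unrelated numbers each time:

```python
import numpy as np

np.random.seed()            # no argument: reseed from fresh OS entropy
first = np.random.rand(4)

np.random.seed()            # reseed again, again with no fixed value
second = np.random.rand(4)

# The two draws are expected to differ; a collision is possible in
# principle but vanishingly unlikely.
print(np.array_equal(first, second))
```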

For more information on using seeds to generate pseudo-random numbers, see Wikipedia.

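Other answers

I hope to give a really short answer:

seed makes the next series of random numbers predictable. You can think of it this way: each time you call seed, it pre-defines a sequence of numbers, and numpy random keeps an iterator over that sequence. Every time you ask for a random number, it simply calls "get next".

For example: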

np.random.seed(2)
np.random.randn(2) # array([-0.41675785, -0.05626683])
np.random.randn(1) # array([-1.24528809])

np.random.seed(2)
np.random.randn(1) # array([-0.41675785])
np.random.randn(2) # array([-0.05626683, -1.24528809])

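Notice that when I set the same seed, no matter how many random numbers I request from numpy in each call, it always gives the same series of numbers, in this case array([-0.41675785, -0.05626683, -1.24528809]).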
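The same observation can be checked programmatically. This is a minimal sketch using the legacy np.random functions from the example above; the variable names are only illustrative.

```python
import numpy as np

np.random.seed(2)
two_then_one = np.concatenate([np.random.randn(2), np.random.randn(1)])

np.random.seed(2)
one_then_two = np.concatenate([np.random.randn(1), np.random.randn(2)])

np.random.seed(2)
all_at_once = np.random.randn(3)

# However the three draws are split across calls, the stream is the same.
print(np.array_equal(two_then_one, one_then_two))
print(np.array_equal(two_then_one, all_at_once))
```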

A random seed specifies the start point when a computer generates a random number sequence.

For example, let’s say you wanted to generate a random number in Excel (Note: Excel sets a limit of 9999 for the seed). If you enter a number into the Random Seed box during the process, you’ll be able to use the same set of random numbers again. If you typed “77” into the box, and typed “77” the next time you run the random number generator, Excel will display that same set of random numbers. If you type “99”, you’ll get an entirely different set of numbers. But if you revert back to a seed of 77, then you’ll get the same set of random numbers you started with.

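To see how a seed drives the sequence, consider a toy algorithm: "take a number x, add 900 to it, then subtract 52." To get the process started, you have to specify a starting number x (the seed). Let's take 77 as the seed:

900 + 77 = 977, minus 52 = 925. Following the same algorithm, the second "random" number would be:

900 + 925 = 1825, minus 52 = 1773. This simple example follows an obvious pattern, but the algorithms behind computer number generation are much more complicated.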
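The toy rule above is easy to write down as code. The function name `next_number` is made up for this sketch:

```python
def next_number(x):
    """Toy rule from the text: add 900 to x, then subtract 52."""
    return 900 + x - 52


seed = 77
first = next_number(seed)      # 900 + 77 - 52
second = next_number(first)    # 900 + 925 - 52

print(first, second)  # prints: 925 1773
```

Feeding in 77 again always yields 925 and then 1773, while a different seed starts a different chain.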

If you set np.random.seed(a_fixed_number) every time before you call one of numpy's other random functions, the result will be the same:

>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135  0.71518937 0.60276338 0.54488318]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135  0.71518937 0.60276338 0.54488318]

However, if you call it just once and then use various random functions, the results will still differ from call to call:

>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> print(np.random.permutation(10))
[3 5 1 2 9 8 0 6 7 4]
>>> print(np.random.permutation(10))
[2 3 8 4 5 1 0 6 9 7]
>>> print(np.random.rand(4))
[0.64817187 0.36824154 0.95715516 0.14035078]
>>> print(np.random.rand(4))
[0.87008726 0.47360805 0.80091075 0.52047748]
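Put differently, seeding once fixes the whole sequence of later calls, not each individual call. A small sketch of that (the helper name `run` is hypothetical):

```python
import numpy as np


def run(seed):
    """Seed once, then make several different random calls."""
    np.random.seed(seed)
    perm_1 = np.random.permutation(10)
    perm_2 = np.random.permutation(10)   # continues the stream after perm_1
    values = np.random.rand(4)
    return perm_1, perm_2, values


a = run(0)
b = run(0)

# Within one run the two permutations come from different points of the stream...
print(np.array_equal(a[0], a[1]))
# ...but a second run with the same seed replays every call exactly.
print(all(np.array_equal(x, y) for x, y in zip(a, b)))
```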

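As noted, numpy.random.seed(0) sets the random seed to 0, so the pseudo-random numbers you get from numpy.random will start from the same point. This can be good for debugging in some cases. HOWEVER, after some reading, this seems to be the wrong way to go about it if you have threads, because it is not thread-safe.

From differences-between-numpy-random-and-random-random-in-python: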

For numpy.random.seed(), the main difficulty is that it is not thread-safe - that is, it's not safe to use if you have many different threads of execution, because it's not guaranteed to work if two different threads are executing the function at the same time. If you're not using threads, and if you can reasonably expect that you won't need to rewrite your program this way in the future, numpy.random.seed() should be fine for testing purposes. If there's any reason to suspect that you may need threads in the future, it's much safer in the long run to do as suggested, and to make a local instance of the numpy.random.Random class. As far as I can tell, random.random.seed() is thread-safe (or at least, I haven't found any evidence to the contrary).

An example of how to go about this, using local numpy.random.RandomState instances:

from numpy.random import RandomState
# No seed: each instance is seeded differently
prng = RandomState()
print(prng.permutation(10))
prng = RandomState()
print(prng.permutation(10))
# Fixed seed: each instance reproduces the same permutation
prng = RandomState(42)
print(prng.permutation(10))
prng = RandomState(42)
print(prng.permutation(10))

May give:

[3 0 4 6 8 2 1 9 7 5]
[1 6 9 0 2 7 8 3 5 4]
[8 1 5 0 7 2 9 4 3 6]
[8 1 5 0 7 2 9 4 3 6]
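To connect this back to the threading concern, here is a rough sketch in which every thread owns its own RandomState and never touches the global numpy.random state. The helper names `worker` and `run_once` and the seeds are assumptions made for the illustration.

```python
import threading

import numpy as np
from numpy.random import RandomState


def worker(seed, results):
    # A private generator per thread: no shared global state is involved.
    prng = RandomState(seed)
    results[seed] = prng.permutation(10)


def run_once(seeds):
    """Start one thread per seed and collect each thread's permutation."""
    results = {}
    threads = [threading.Thread(target=worker, args=(s, results)) for s in seeds]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


first = run_once([1, 2, 3])
second = run_once([1, 2, 3])

# Each seed reproduces its own permutation regardless of thread scheduling.
same = all(np.array_equal(first[s], second[s]) for s in (1, 2, 3))
print(same)
```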

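Lastly, note that there might be cases where initializing to 0 (as opposed to a seed whose bits are not all 0) results in non-uniform distributions for the first few iterations because of the way xor works, but this depends on the algorithm and is beyond my current worries and the scope of this question.

I use this very often in neural networks. It is well known that when we start training a neural network we randomly initialise the weights. The model is then trained from these weights on a particular dataset. After a number of epochs you get a trained set of weights.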

Now suppose you want to train from scratch again, or you want to pass the model to others so they can reproduce your results. The weights will again be initialised to random numbers, which will mostly differ from the earlier ones. The trained weights obtained after the same number of epochs (keeping the same data and other parameters) will differ from the earlier ones too. The problem is that your model is no longer reproducible: every time you train it from scratch, it gives you a different set of weights. This is because the model is initialised with different random numbers every time.

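What if, every time you start training from scratch, the model is initialised to the same set of random weights? In that case your model becomes reproducible. This is achieved by numpy.random.seed(0). By setting seed() to a particular number, you always hold on to the same set of random numbers.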
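As a minimal sketch of that idea, assuming a single dense layer whose weights are drawn with np.random.randn (the function name `init_weights`, the layer sizes and the 0.01 scale are made up for the example):

```python
import numpy as np


def init_weights(seed, n_inputs=4, n_hidden=3):
    """Return a small random weight matrix, seeded for reproducibility."""
    np.random.seed(seed)
    return np.random.randn(n_inputs, n_hidden) * 0.01


w_first_run = init_weights(0)
w_second_run = init_weights(0)   # same seed: identical starting weights
w_other_seed = init_weights(1)   # different seed: different starting weights

print(np.array_equal(w_first_run, w_second_run))
print(np.array_equal(w_first_run, w_other_seed))
```

Identical starting weights are only one ingredient of a reproducible training run; data order and other sources of randomness would need to be fixed as well.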