What does np.random.seed do?
np.random.seed(0)
Current answer
Imagine you are showing someone how to code something with a bunch of "random" numbers. With a numpy seed, they can use the same seed number and get the same set of "random" numbers.
So it's not exactly random, because an algorithm spits out the numbers, but it looks like a randomly generated batch.
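A minimal sketch of that idea (the seed value 0 and the use of np.random.rand are just an illustration; the printed values are the same ones that appear in the rand(4) example further down):

import numpy as np

np.random.seed(0)            # person A sets the seed
a = np.random.rand(3)

np.random.seed(0)            # person B uses the same seed
b = np.random.rand(3)

print(a)                     # [0.5488135  0.71518937 0.60276338]
print(np.array_equal(a, b))  # True: identical "random" numbers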
Other answers
If you set np.random.seed(a_fixed_number) every time before you call one of numpy's other random functions, the results will be the same:
>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135 0.71518937 0.60276338 0.54488318]
>>> np.random.seed(0)
>>> print(np.random.rand(4))
[0.5488135 0.71518937 0.60276338 0.54488318]
However, if you call it only once and then use various random functions, the results will still be different:
>>> import numpy as np
>>> np.random.seed(0)
>>> perm = np.random.permutation(10)
>>> print(perm)
[2 8 4 9 1 6 7 3 0 5]
>>> np.random.seed(0)
>>> print(np.random.permutation(10))
[2 8 4 9 1 6 7 3 0 5]
>>> print(np.random.permutation(10))
[3 5 1 2 9 8 0 6 7 4]
>>> print(np.random.permutation(10))
[2 3 8 4 5 1 0 6 9 7]
>>> print(np.random.rand(4))
[0.64817187 0.36824154 0.95715516 0.14035078]
>>> print(np.random.rand(4))
[0.87008726 0.47360805 0.80091075 0.52047748]
All random numbers generated after setting a particular seed value are the same across all platforms/systems.
I use this very often with neural networks. As we know, when we start training a neural network we initialize the weights randomly. The model is trained on a particular dataset starting from these weights, and after a number of epochs you obtain a set of trained weights.
Now suppose you want to train from scratch again, or you want to pass the model to others so they can reproduce your results. The weights will again be initialized to random numbers, which will mostly be different from the earlier ones. The trained weights obtained after the same number of epochs (keeping the same data and other parameters) will therefore differ from the earlier ones. The problem is that your model is no longer reproducible: every time you train it from scratch it gives you a different set of weights, because it is initialized with different random numbers each time.
What if, every time you trained from scratch, the model were initialized to the same set of random initial weights? In that case your model becomes reproducible. This is achieved with numpy.random.seed(0). By setting seed() to a particular number, you will always get the same set of random numbers.
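As a rough illustration of the idea (a minimal sketch, not a real training loop; the layer sizes and the helper name init_weights are made up for this example), seeding before drawing the initial weights makes two "from scratch" initializations identical:

import numpy as np

def init_weights(n_in, n_out):
    # Hypothetical helper: draw a small random weight matrix
    return np.random.randn(n_in, n_out) * 0.01

np.random.seed(0)
w_first_run = init_weights(4, 3)

np.random.seed(0)   # re-seed before "retraining from scratch"
w_second_run = init_weights(4, 3)

print(np.array_equal(w_first_run, w_second_run))  # True: same starting weights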
All the answers above show how np.random.seed() is used in code. I'll try to explain briefly why this happens. Computers are machines designed around predefined algorithms: any output from a computer is the result of running an algorithm on some input. So when we ask a computer to generate random numbers, sure, they are random, but the computer did not just come up with them out of nowhere!
So when we write np.random.seed(any_number_here), the algorithm outputs a particular set of numbers that is unique to the argument any_number_here. It's almost as if we could get a specific set of random numbers just by passing the right argument. But that would require knowing how the algorithm works, which is tedious.
So, for example, if I write np.random.seed(10), the particular set of numbers I get will remain the same even if I execute the same line 10 years from now, unless the algorithm changes.
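A minimal sketch of that claim (the seed value 10 is just an example; the exact numbers drawn don't matter, only that they repeat):

import numpy as np

np.random.seed(10)
first = np.random.rand(5)

np.random.seed(10)   # same seed, same generator state
second = np.random.rand(5)

print(np.array_equal(first, second))  # True: the sequence is tied to the seed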
As mentioned earlier, numpy.random.seed(0) sets the random seed to 0, so the pseudo-random numbers you get from random start from the same point. This can help with debugging in some cases. However, after some reading, this appears to be the wrong approach if you have threads, because it is not thread-safe.
From differences-between-numpy-random-and-random-random-in-python:
For numpy.random.seed(), the main difficulty is that it is not thread-safe - that is, it's not safe to use if you have many different threads of execution, because it's not guaranteed to work if two different threads are executing the function at the same time. If you're not using threads, and if you can reasonably expect that you won't need to rewrite your program this way in the future, numpy.random.seed() should be fine for testing purposes. If there's any reason to suspect that you may need threads in the future, it's much safer in the long run to do as suggested, and to make a local instance of the numpy.random.Random class. As far as I can tell, random.random.seed() is thread-safe (or at least, I haven't found any evidence to the contrary).
An example of how to do this:
from numpy.random import RandomState

# Unseeded instances give a different sequence each run
prng = RandomState()
print(prng.permutation(10))
prng = RandomState()
print(prng.permutation(10))

# Seeded instances reproduce the same sequence
prng = RandomState(42)
print(prng.permutation(10))
prng = RandomState(42)
print(prng.permutation(10))
might give:
[3 0 4 6 8 2 1 9 7 5]
[1 6 9 0 2 7 8 3 5 4]
[8 1 5 0 7 2 9 4 3 6]
[8 1 5 0 7 2 9 4 3 6]
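As a rough sketch of the thread-safety point above (the worker function and the per-thread seeds are made up for this example), giving each thread its own RandomState instance keeps the streams independent instead of sharing the global np.random state:

import threading
from numpy.random import RandomState

def worker(seed):
    prng = RandomState(seed)           # local generator, not np.random.*
    print(seed, prng.permutation(10))  # each thread draws from its own stream

threads = [threading.Thread(target=worker, args=(s,)) for s in (1, 2, 3)]
for t in threads:
    t.start()
for t in threads:
    t.join()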
Finally, note that because of how xor works, in some cases initializing with 0 (as opposed to a seed whose bits are not all zero) may lead to non-uniform distributions in the first few iterations, but this depends on the algorithm and is beyond my current concerns and the scope of this question.