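In a multi-process Python program, the np.random functions start from the same initial state in every process by default. This happens when child processes are created with fork, because each child inherits a copy of the parent's global np.random state. Used as-is, every process generates exactly the same random sequence.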
import numpy as np
import multiprocessing

def gen_value():
    # Draws from the global np.random state inherited from the parent process.
    values = list()
    for i in range(10):
        values.append(np.random.randint(100))
    print(values)

procs = [multiprocessing.Process(target=gen_value) for _ in range(10)]
for p in procs:
    p.start()
    p.join()
The output shows that every process prints the same sequence:
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
[62, 96, 91, 48, 18, 72, 21, 78, 74, 86]
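To avoid this and have each process generate random numbers independently, np.random has to be initialized separately for each process. One way is to create a new random engine instance with np.random.RandomState() for each process. Called without a seed, RandomState seeds itself from operating-system entropy where available, so the instances do not share a state: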
import numpy as np
import multiprocessing

def gen_value(randomstate):
    values = []
    for i in range(10):
        values.append(randomstate.randint(100))  # draw from this process's own RandomState
    print(values)

# One freshly seeded RandomState per process, passed in as an argument.
procs = [multiprocessing.Process(target=gen_value, args=(np.random.RandomState(),)) for i in range(10)]
for p in procs:
    p.start()
    p.join()
Output:
[39, 25, 65, 93, 71, 10, 27, 28, 93, 51]
[79, 40, 64, 58, 18, 48, 93, 68, 99, 15]
[39, 31, 85, 31, 69, 91, 85, 71, 59, 82]
[49, 58, 56, 23, 52, 65, 59, 84, 37, 26]
[35, 99, 3, 27, 16, 83, 85, 42, 76, 43]
[37, 62, 2, 30, 75, 14, 18, 79, 81, 9]
[93, 17, 62, 86, 38, 10, 46, 30, 68, 44]
[87, 52, 15, 44, 11, 69, 93, 5, 14, 89]
[83, 2, 81, 75, 95, 33, 21, 98, 92, 43]
[8, 36, 42, 19, 89, 80, 7, 2, 77, 56]
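The RandomState approach above gives different sequences, but they change on every run. If repeatable results are also wanted, one possible variant is to build the generator inside the worker from a root seed plus a worker index. The sketch below assumes NumPy 1.17 or later (np.random.default_rng); ROOT_SEED, gen_values and main are illustrative names that do not come from the original post.

```python
import multiprocessing

import numpy as np

ROOT_SEED = 12345  # hypothetical value; any non-negative integer works


def gen_values(worker_id, root_seed=ROOT_SEED):
    # Build the generator inside the worker, so no state is inherited
    # from the parent. Seeding with [worker_id, root_seed] gives each
    # worker its own stream while keeping runs repeatable.
    rng = np.random.default_rng([worker_id, root_seed])
    return rng.integers(0, 100, size=10).tolist()


def main(n_procs=4):
    # Each task receives a different worker_id, hence a different stream.
    with multiprocessing.Pool(n_procs) as pool:
        for values in pool.map(gen_values, range(n_procs)):
            print(values)


if __name__ == "__main__":
    main()
```

Because the seed depends only on the worker index and the root seed, rerunning the script should print the same lists, while different workers still get different sequences.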
Alternatively, replace np.random with Python's built-in random module, which also reseeds itself in each process.
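A minimal sketch of that alternative follows. It assumes a POSIX system where the fork start method is available, and it requests fork explicitly so that the child really does begin as a copy of the parent. The names worker and collect are made up for this illustration.

```python
import multiprocessing
import random


def worker(queue):
    # Uses the module-level generator of the standard library, with no
    # explicit seeding in the child process.
    queue.put([random.randrange(100) for _ in range(10)])


def collect(n_procs=4):
    # Assumption: the "fork" start method exists on this platform.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(queue,)) for _ in range(n_procs)]
    for p in procs:
        p.start()
    # Read the results before joining so the children can flush the queue.
    results = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    for values in collect():
        print(values)
```

If the reseeding described above takes place, the printed lists should differ from one process to the next even though no process sets a seed itself.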
Reference:
https://stackoverflow.com/questions/29854398/seeding-random-number-generators-in-parallel-programs