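The previous post covered the all-to-all operations in mpi4py. This post covers the scan operations.
Note: only intracommunicators support scan operations.
Method interface
The scan methods in mpi4py (methods of the MPI.Intracomm class) have the following interfaces: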
scan(self, sendobj, op=SUM)
exscan(self, sendobj, op=SUM)
Scan(self, sendbuf, recvbuf, Op op=SUM)
Exscan(self, sendbuf, recvbuf, Op op=SUM)
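The parameters of these methods are the same as the corresponding parameters of the reduce operations. The op parameter specifies the reduction operator; its default is MPI.SUM, the sum operator. For the other built-in reduction operators, see the post on reduce operations.
scan/Scan (also called an inclusive scan) performs a reduction in stages: process i receives the reduction over the data of processes 0, 1, ..., i.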
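exscan/Exscan (also called an exclusive scan) defines a "prefix scan": on process 0 the receive buffer is undefined; on process 1 the receive buffer holds the send buffer data of process 0; on process i > 1 the receive buffer holds the reduction of the send buffer data of processes 0, 1, ..., i - 1.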
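The difference between the two variants can be modeled without MPI. The following is a minimal pure-Python sketch, not mpi4py code: the list index stands in for the process rank, and the helper names inclusive_scan and exclusive_scan are hypothetical. None marks the undefined result on process 0.

```python
from itertools import accumulate
import operator


def inclusive_scan(values, op=operator.add):
    """Model of scan/Scan: entry i is the reduction of values[0..i]."""
    return list(accumulate(values, op))


def exclusive_scan(values, op=operator.add):
    """Model of exscan/Exscan: entry i is the reduction of values[0..i-1].

    Entry 0 is None because the result on process 0 is undefined.
    """
    if not values:
        return []
    return [None] + inclusive_scan(values, op)[:-1]


# One value per simulated rank.
values = [2.5, 0.5, 3.5, 1.5]
print(inclusive_scan(values))       # [2.5, 3.0, 6.5, 8.0]
print(exclusive_scan(values))       # [None, 2.5, 3.0, 6.5]
print(inclusive_scan(values, max))  # [2.5, 2.5, 3.5, 3.5]
print(exclusive_scan(values, max))  # [None, 2.5, 2.5, 3.5]
```

Passing max instead of operator.add plays the role of op=MPI.MAX in this model.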
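For Scan/Exscan, the sendbuf argument can be set to MPI.IN_PLACE. In that case the input data is taken from recvbuf, the staged reduction is performed, and the result overwrites the contents of recvbuf.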
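The in-place data flow can be pictured with a small pure-Python model. This is only a sketch under stated assumptions: scan_in_place is a hypothetical helper, each inner list stands in for one process's recvbuf, all buffers are assumed non-empty and of equal length, and the reduction is a sum. A real MPI implementation reduces across processes rather than looping over them.

```python
def scan_in_place(buffers):
    """Model of Scan(MPI.IN_PLACE, recvbuf) with a sum reduction.

    buffers[i] is the buffer of simulated rank i. Each buffer is read
    as input and then overwritten with the element-wise prefix sum.
    """
    running = [0] * len(buffers[0])
    for buf in buffers:  # visit simulated ranks in order 0, 1, ...
        for k, value in enumerate(buf):
            running[k] += value
            buf[k] = running[k]  # the result replaces the input


# Same initial data as [0, 1] + 2 * rank on 4 simulated ranks.
buffers = [[0 + 2 * rank, 1 + 2 * rank] for rank in range(4)]
scan_in_place(buffers)
print(buffers)  # [[0, 1], [2, 4], [6, 9], [12, 16]]
```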
Example
The following example shows how to use the scan operations.
# scan.py
"""
Demonstrates the usage of scan, exscan, Scan, Exscan.
Run this with 4 processes like:
$ mpiexec -n 4 python scan.py
"""
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
# ------------------------------------------------------------------------------
# scan
send_obj = [2.5, 0.5, 3.5, 1.5][rank]
recv_obj = comm.scan(send_obj)
# scan by SUM:
# rank 0: 2.5
# rank 1: 2.5 + 0.5 = 3.0
# rank 2: 2.5 + 0.5 + 3.5 = 6.5
# rank 3: 2.5 + 0.5 + 3.5 + 1.5 = 8.0
print('scan with SUM: rank %d has %s' % (rank, recv_obj))
recv_obj = comm.scan(send_obj, op=MPI.MAX)
# scan by MAX:
# rank 0: 2.5
# rank 1: max(2.5, 0.5) = 2.5
# rank 2: max(2.5, 0.5, 3.5) = 3.5
# rank 3: max(2.5, 0.5, 3.5, 1.5) = 3.5
print('scan with MAX: rank %d has %s' % (rank, recv_obj))
# ------------------------------------------------------------------------------
# exscan
recv_obj = comm.exscan(send_obj)
# exscan by SUM:
# rank 0: None
# rank 1: 2.5
# rank 2: 2.5 + 0.5 = 3.0
# rank 3: 2.5 + 0.5 + 3.5 = 6.5
print('exscan with SUM: rank %d has %s' % (rank, recv_obj))
recv_obj = comm.exscan(send_obj, op=MPI.MAX)
# exscan by MAX:
# rank 0: None
# rank 1: 2.5
# rank 2: max(2.5, 0.5) = 2.5
# rank 3: max(2.5, 0.5, 3.5) = 3.5
print('exscan with MAX: rank %d has %s' % (rank, recv_obj))
# ------------------------------------------------------------------------------
# Scan
send_buf = np.array([0, 1], dtype='i') + 2 * rank
recv_buf = np.empty(2, dtype='i')
comm.Scan(send_buf, recv_buf, op=MPI.SUM)
# Scan by SUM:
# rank 0: [0, 1]
# rank 1: [0, 1] + [2, 3] = [2, 4]
# rank 2: [0, 1] + [2, 3] + [4, 5] = [6, 9]
# rank 3: [0, 1] + [2, 3] + [4, 5] + [6, 7] = [12, 16]
print('Scan by SUM: rank %d has %s' % (rank, recv_buf))
# ------------------------------------------------------------------------------
# Exscan
send_buf = np.array([0, 1], dtype='i') + 2 * rank
# initialize recv_buf with [-1, -1]
recv_buf = np.zeros(2, dtype='i') - 1
comm.Exscan(send_buf, recv_buf, op=MPI.SUM)
# Exscan by SUM:
# rank 0: [-1, -1] (result is undefined on rank 0; in the run shown below the buffer is left unchanged)
# rank 1: [0, 1]
# rank 2: [0, 1] + [2, 3] = [2, 4]
# rank 3: [0, 1] + [2, 3] + [4, 5] = [6, 9]
print('Exscan by SUM: rank %d has %s' % (rank, recv_buf))
# ------------------------------------------------------------------------------
# Scan with MPI.IN_PLACE
recv_buf = np.array([0, 1], dtype='i') + 2 * rank
comm.Scan(MPI.IN_PLACE, recv_buf, op=MPI.SUM)
# recv_buf used as both send buffer and receive buffer
# result same as Scan
print('Scan by SUM with MPI.IN_PLACE: rank %d has %s' % (rank, recv_buf))
# ------------------------------------------------------------------------------
# Exscan with MPI.IN_PLACE
recv_buf = np.array([0, 1], dtype='i') + 2 * rank
comm.Exscan(MPI.IN_PLACE, recv_buf, op=MPI.SUM)
# recv_buf used as both send buffer and receive buffer
# rank 0: [0, 1] (result is undefined on rank 0; in the run shown below the buffer is left unchanged)
# rank 1: [0, 1]
# rank 2: [0, 1] + [2, 3] = [2, 4]
# rank 3: [0, 1] + [2, 3] + [4, 5] = [6, 9]
print('Exscan by SUM with MPI.IN_PLACE: rank %d has %s' % (rank, recv_buf))
The output is as follows (the order of lines from different processes may vary between runs):
$ mpiexec -n 4 python scan.py
scan with SUM: rank 0 has 2.5
scan with MAX: rank 0 has 2.5
exscan with SUM: rank 0 has None
exscan with MAX: rank 0 has None
scan with SUM: rank 1 has 3.0
scan with MAX: rank 1 has 2.5
exscan with SUM: rank 1 has 2.5
exscan with MAX: rank 1 has 2.5
scan with SUM: rank 2 has 6.5
scan with MAX: rank 2 has 3.5
exscan with SUM: rank 2 has 3.0
exscan with MAX: rank 2 has 2.5
scan with SUM: rank 3 has 8.0
scan with MAX: rank 3 has 3.5
exscan with SUM: rank 3 has 6.5
exscan with MAX: rank 3 has 3.5
Scan by SUM: rank 3 has [12 16]
Exscan by SUM: rank 3 has [6 9]
Scan by SUM with MPI.IN_PLACE: rank 3 has [12 16]
Scan by SUM: rank 0 has [0 1]
Exscan by SUM: rank 0 has [-1 -1]
Scan by SUM with MPI.IN_PLACE: rank 0 has [0 1]
Scan by SUM: rank 1 has [2 4]
Exscan by SUM: rank 1 has [0 1]
Scan by SUM with MPI.IN_PLACE: rank 1 has [2 4]
Scan by SUM: rank 2 has [6 9]
Exscan by SUM: rank 2 has [2 4]
Scan by SUM with MPI.IN_PLACE: rank 2 has [6 9]
Exscan by SUM with MPI.IN_PLACE: rank 1 has [0 1]
Exscan by SUM with MPI.IN_PLACE: rank 2 has [2 4]
Exscan by SUM with MPI.IN_PLACE: rank 3 has [6 9]
Exscan by SUM with MPI.IN_PLACE: rank 0 has [0 1]
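One common use of an exclusive scan is turning per-process element counts into starting offsets within a global array. With mpi4py this would presumably be a call such as comm.exscan(local_count), with the None returned on rank 0 replaced by 0. The sketch below only models that idea in pure Python: exclusive_offsets is a hypothetical helper and the counts are made-up illustration values, one per simulated rank.

```python
from itertools import accumulate


def exclusive_offsets(counts):
    """Exclusive prefix sum of counts, using 0 for the first entry.

    counts[i] is the number of elements owned by simulated rank i;
    the result gives the global start index of each rank's block.
    """
    if not counts:
        return []
    return [0] + list(accumulate(counts))[:-1]


counts = [3, 1, 4, 2]  # hypothetical per-rank element counts
offsets = exclusive_offsets(counts)
print(offsets)  # [0, 3, 4, 8]

# Rank i then owns global indices offsets[i] .. offsets[i] + counts[i] - 1.
for rank, (start, n) in enumerate(zip(offsets, counts)):
    print('rank %d owns [%d, %d)' % (rank, start, start + n))
```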
This post covered the scan methods in mpi4py. The next post covers barrier synchronization.