Source: http://www.rdmamojo.com/2013/06/01/which-queue-pair-type-to-use/
When writing a new RDMA application (just as when writing a new application over sockets), you need to decide which QP (Queue Pair) type to work with.
In this post, I will describe in detail the characteristics of each transport type.
In RDMA, there are several QP types. Each can be written as XY, where X describes reliability and Y describes connectivity.
X can be:
Reliable: There is a guarantee that messages are delivered at most once, in order and without corruption.
Unreliable: There is no guarantee that messages will be delivered, nor about the order of the packets.
In RDMA, every packet carries a CRC, and corrupted packets are dropped (for any transport type). The reliability of a QP transport type refers to the reliability of the whole message.
Y can be:
Connected: one QP sends to and receives from exactly one other QP
Unconnected: one QP sends to and receives from any QP
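To make the connected/unconnected distinction concrete, here is a rough sketch that counts how many QPs a single node needs in order to talk to every other node. The function name `qps_per_node` is made up for this illustration, and it assumes one communicating process per node and a full mesh of traffic; real applications may create more QPs than this.

```python
def qps_per_node(num_nodes: int, connected: bool) -> int:
    """QPs one node needs to reach every other node (hypothetical helper)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    peers = num_nodes - 1
    if connected:
        # A connected QP talks to exactly one remote QP: one QP per peer.
        return peers
    # An unconnected QP can talk to any QP: one QP is enough.
    return 1 if peers else 0


if __name__ == "__main__":
    for n in (2, 16, 1024):
        print(n, qps_per_node(n, connected=True), qps_per_node(n, connected=False))
```

The gap between the two columns grows linearly with the number of nodes, which is the scalability concern that comes up later when choosing a QP type.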
The following mechanisms are used in RDMA:
* CRC: The CRC field validates that packets weren't corrupted along the path.
* PSN: The Packet Sequence Number makes sure that packets are received in order. This helps detect missing and duplicated packets.
* Acknowledgement: (only in RC QP) Only after a message has been written successfully on the responder side is an ack packet sent back to the requester. If the requester doesn't receive an ack, it resends the message according to the QP's attributes. If no ack (or nack) ever arrives from the remote QP, the requester reports an error (retry exceeded).
If there is any kind of error on the responder side (protection, resources, etc.), a nack is sent to the requester, which then reports an error.
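The acknowledgement and retry behaviour can be sketched as a small simulation. This is only a toy model, not the verbs API: `send_reliable`, `flaky_channel` and the string replies are names invented here, a reply of `None` stands for a lost packet or lost ack (a timeout), and every nack is treated as a fatal remote error, which is a simplification.

```python
def send_reliable(message, channel, retry_count):
    """Send one message; resend on timeout up to retry_count times.

    channel is a hypothetical callable returning "ack", "nack" or None.
    Returns (status, number_of_attempts).
    """
    attempts = 1 + retry_count
    for attempt in range(1, attempts + 1):
        reply = channel(message)
        if reply == "ack":
            return ("success", attempt)
        if reply == "nack":
            # The responder hit an error (protection, resources, etc.).
            return ("remote error", attempt)
        # reply is None: nothing came back, so try again.
    return ("retry exceeded", attempts)


def flaky_channel(drops):
    """Build a channel that loses the first `drops` sends, then acks."""
    remaining = [drops]

    def channel(_message):
        if remaining[0] > 0:
            remaining[0] -= 1
            return None
        return "ack"

    return channel


if __name__ == "__main__":
    print(send_reliable(b"hello", flaky_channel(2), retry_count=7))
    print(send_reliable(b"hello", flaky_channel(5), retry_count=3))
```

In the first call the third attempt gets through; in the second the channel stays silent for longer than the retry budget allows, so the requester gives up with "retry exceeded".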
Reliable Connected (RC) QP
One RC QP is connected (i.e. sends and receives messages) to exactly one RC QP in a reliable way. It is guaranteed that messages are delivered from a requester to a responder at most once, in order and without corruption. The maximum supported message size is up to 2GB (this value may be lower, depending on the RDMA device attributes). An RC QP supports Send operations (with or without immediate), RDMA Write operations (with or without immediate), RDMA Read operations and Atomic operations (depending on the RDMA device's level of support for atomic operations).
If a message is bigger than the path MTU, it is fragmented on the sending side and reassembled on the receiving side.
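Fragmentation and reassembly can be pictured with the sketch below. In practice the RDMA device does this work, not the application; `fragment` and `reassemble` are illustrative names, the 4096-byte path MTU is just an example value, and treating a zero-length message as a single empty packet is an assumption of this sketch.

```python
def fragment(message: bytes, path_mtu: int) -> list[bytes]:
    """Split a message into packets carrying at most path_mtu bytes each."""
    if path_mtu <= 0:
        raise ValueError("path_mtu must be positive")
    if not message:
        return [b""]  # assumption: an empty message still takes one packet
    return [message[i:i + path_mtu] for i in range(0, len(message), path_mtu)]


def reassemble(packets: list[bytes]) -> bytes:
    """Join in-order packets back into the original message."""
    return b"".join(packets)


if __name__ == "__main__":
    msg = bytes(10000)
    packets = fragment(msg, path_mtu=4096)
    print([len(p) for p in packets])
    print(reassemble(packets) == msg)
```

A 10000-byte message over a 4096-byte path MTU becomes two full packets and one partial packet, and joining them in order restores the original bytes.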
The requester considers a message operation complete once it receives an ack from the responder confirming that the message was read from or written to the responder's memory.
The responder considers a message operation complete once the message was read from or written to its (local) memory.
Unreliable Connected (UC) QP
One UC QP is connected (i.e. sends and receives messages) to exactly one UC QP in an unreliable way. There is no guarantee that messages will be received by the other side: corrupted or out-of-sequence packets are silently dropped. If a packet is dropped, the whole message that it belongs to is dropped. In this case, the responder won't stop, but continues to receive incoming packets. There isn't any guarantee about packet ordering. The maximum supported message size is up to 2GB (this value may be lower, depending on the RDMA device attributes). A UC QP supports Send operations (with or without immediate) and RDMA Write operations (with or without immediate).
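The "drop the whole message, then keep going" behaviour of a UC responder can be modelled roughly as follows. This is a simplified simulation, not how any particular device implements it: the packet tuple layout and the name `uc_receive` are invented for the example, and PSN wrap-around is ignored.

```python
def uc_receive(packets):
    """Return the messages a UC-like responder would deliver.

    packets: iterable of (psn, is_first, is_last, payload) tuples
    (a hypothetical layout chosen for this sketch).
    """
    delivered = []
    expected_psn = None
    parts = None  # None means no message is currently being assembled
    for psn, is_first, is_last, payload in packets:
        if is_first:
            parts = [payload]  # a new message starts (or restarts) assembly
        elif parts is not None and psn == expected_psn:
            parts.append(payload)
        else:
            # Gap in the sequence, or leftover packets of a dropped message:
            # silently drop and wait for the start of the next message.
            parts = None
            continue
        expected_psn = psn + 1
        if is_last:
            delivered.append(b"".join(parts))
            parts = None
    return delivered


if __name__ == "__main__":
    stream = [
        (0, True, False, b"A0"), (1, False, False, b"A1"), (2, False, True, b"A2"),
        (3, True, False, b"B0"), (5, False, True, b"B2"),  # PSN 4 was lost
        (6, True, False, b"C0"), (7, False, True, b"C1"),
    ]
    print(uc_receive(stream))
```

Message B loses its middle packet, so nothing from B is delivered, while messages A and C arrive intact and the responder never stops processing.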
If a message is bigger than the path MTU, it is fragmented on the sending side and reassembled on the receiving side.
The requester considers a message operation complete once the whole message has been sent to the fabric.
The responder considers a message operation complete once it has received a complete message in the correct sequence and written the data to its (local) memory.
Unreliable Datagram (UD) QP
One UD QP can send messages to and receive messages from any other UD QP, by either unicast (one to one) or multicast (one to many), in an unreliable way. There is no guarantee that messages will be received by the other side: corrupted or out-of-sequence packets are silently dropped. There isn't any guarantee about packet ordering. The maximum supported message size is the maximum path MTU. A UD QP supports only Send operations.
The requester considers a message operation complete once the (one-packet) message has been sent to the fabric.
The responder considers a message operation complete once it has received a complete message and written the data to its (local) memory.
Choosing the right QP type
Choosing the right QP type is critical to the correctness and scalability of an application.
RC QP should be chosen if:
o Reliability by the fabric is needed
o Fabric size isn't big, or the cluster size is big but not all nodes send traffic to the same node (one victim)
Some uses for an RC QP: FTP over RDMA or a file system over RDMA.
UC QP should be chosen if:
o Reliability by the fabric isn't needed (i.e. reliability isn't important at all or it is taken care of by the application)
o Fabric size isn't big, or the cluster size is big but not all nodes send traffic to the same node (one victim)
o Big messages (more than the path MTU) are being sent
One use for a UC QP: video over RDMA.
UD QP should be chosen if:
o Reliability by the fabric isn't needed (i.e. reliability isn't important at all or it is taken care of by the application)
o Fabric size is big and every node sends messages to any other node in the fabric. UD is one of the best solutions for scalability problems.
o Multicast messages are needed
One use for a UD QP: voice over RDMA.
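The three checklists above can be folded into a small helper. This is one possible reading of the criteria, not an official rule: the function name `candidate_qp_types` and its boolean parameters are made up for this sketch, and an empty result just means none of the checklists matches as written.

```python
def candidate_qp_types(reliability_by_fabric: bool,
                       large_all_to_all: bool,
                       multicast: bool,
                       big_messages: bool) -> list[str]:
    """Return the QP types whose checklist matches (hypothetical helper).

    large_all_to_all: big fabric where every node talks to every other node.
    big_messages: messages larger than the path MTU are sent.
    """
    candidates = []
    # Connected types fit when the fabric is not a large all-to-all mesh
    # and multicast is not required.
    if not large_all_to_all and not multicast:
        candidates.append("RC" if reliability_by_fabric else "UC")
    # UD fits when the fabric need not provide reliability and every
    # message fits in a single packet.
    if not reliability_by_fabric and not big_messages:
        candidates.append("UD")
    return candidates


if __name__ == "__main__":
    print(candidate_qp_types(True, False, False, True))    # file transfer style
    print(candidate_qp_types(False, False, False, True))   # video style
    print(candidate_qp_types(False, True, True, False))    # voice style
```

When the result is empty (for example, fabric reliability is required on a very large all-to-all fabric), the application has to compromise, such as handling reliability itself on top of UD.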
Summary
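The following table summarizes the characteristics of each QP transport service type, as described above:

                      RC                 UC            UD
Reliable              yes                no            no
Connected             yes                yes           no
Max message size      up to 2GB          up to 2GB     path MTU
Send                  yes                yes           yes
RDMA Write            yes                yes           no
RDMA Read             yes                no            no
Atomic                device dependent   no            no
Multicast             no                 no            yes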
Note: this post does not cover the RD (Reliable Datagram) transport type.