.. include:: ../disclaimer-zh_CN.rst

:Original: Documentation/infiniband/user_mad.rst

:Translator:

 司延腾 Yanteng Si <siyanteng@loongson.cn>

:Proofreaders:

 王普宇 Puyu Wang <realpuyuwang@gmail.com>
 时奎亮 Alex Shi <alexs@kernel.org>

.. _cn_infiniband_user_mad:

====================
Userspace MAD access
====================

Device files
============

  Each port of each InfiniBand device has a "umad" device and an "issm"
  device attached. For example, a two-port HCA has two umad devices and
  two issm devices, while a switch has one device of each type (for
  switch port 0).

Creating MAD agents
===================

  A MAD agent can be created by filling in a struct ib_user_mad_reg_req
  and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file
  descriptor for the appropriate device file. If the registration
  request succeeds, a 32-bit ID is returned in the structure.
  For example::

        struct ib_user_mad_reg_req req = { /* ... */ };
        ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req);
        if (!ret)
                my_agent = req.id;
        else
                perror("agent register");

  Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT
  ioctl. In addition, all agents registered through a file descriptor
  are unregistered when the descriptor is closed.

  2014
       A new registration ioctl is now provided which allows additional
       fields to be supplied at registration time. Users of this
       registration call implicitly enable the use of pkey_index
       (see below).

Receiving MADs
==============

  MADs are received using read(). The receive side now supports RMPP.
  The buffer passed to read() must be at least one
  struct ib_user_mad + 256 bytes.

  If the buffer passed is not large enough to hold the received MAD
  (RMPP), errno is set to ENOSPC and the length of the buffer needed is
  stored in the length field of the MAD header (mad->hdr.length in the
  examples below).

  Example for normal MAD (non-RMPP) reads::

        struct ib_user_mad *mad;
        mad = malloc(sizeof *mad + 256);
        ret = read(fd, mad, sizeof *mad + 256);
        if (ret != sizeof *mad + 256) {
                perror("read");
                free(mad);
        }

  Example for RMPP reads::

        struct ib_user_mad *mad;
        mad = malloc(sizeof *mad + 256);
        ret = read(fd, mad, sizeof *mad + 256);
        if (ret < 0 && errno == ENOSPC) {
                /* buffer too small: the header reports the length needed */
                length = mad->hdr.length;
                free(mad);
                mad = malloc(sizeof *mad + length);
                ret = read(fd, mad, sizeof *mad + length);
        }
        if (ret < 0) {
                perror("read");
                free(mad);
        }

  In addition to the actual MAD contents, the other struct ib_user_mad
  fields are filled in with information about the received MAD. For
  example, the remote LID is in the header's lid field.

  If a send times out, a receive is generated with the header's status
  field set to ETIMEDOUT. Otherwise, when a MAD has been successfully
  received, status is 0.

  poll()/select() may be used to wait until a MAD can be read.

Sending MADs
============

  MADs are sent using write(). The agent ID for sending should be
  filled into the id field of the MAD, the destination LID into the
  lid field, and so on. The send side supports RMPP, so a MAD of
  arbitrary length can be sent. For example::

        struct ib_user_mad *mad;

        mad = malloc(sizeof *mad + mad_length);

        /* fill in mad->data */

        mad->hdr.id  = my_agent;        /* req.id from agent registration */
        mad->hdr.lid = my_dest;         /* in network byte order... */
        /* etc. */

        /* pass the buffer itself, not the address of the pointer */
        ret = write(fd, mad, sizeof *mad + mad_length);
        if (ret != sizeof *mad + mad_length)
                perror("write");

Transaction IDs
===============

  Users of the umad devices can use the lower 32 bits of the
  transaction ID field (that is, the least significant half of the
  field in network byte order) in MADs being sent to match
  request/response pairs. The upper 32 bits are reserved for use by
  the kernel and are overwritten before a MAD is sent.

P_Key Index Handling
====================

  The old ib_umad interface did not allow setting the P_Key index for
  MADs that are sent and did not provide a way to obtain the P_Key
  index of received MADs. A new layout for struct ib_user_mad_hdr with
  a pkey_index member has been defined. However, to preserve binary
  compatibility with older applications, this new layout is not used
  unless one of the IB_USER_MAD_ENABLE_PKEY or
  IB_USER_MAD_REGISTER_AGENT2 ioctls is called before the file
  descriptor is used for anything else.

  In September 2008, IB_USER_MAD_ABI_VERSION will be incremented to 6,
  the new layout of struct ib_user_mad_hdr will be used by default, and
  the IB_USER_MAD_ENABLE_PKEY ioctl will be removed.

Setting IsSM Capability Bit
===========================

  To set the IsSM capability bit for a port, simply open the
  corresponding issm device file. If the IsSM bit is already set, the
  open call blocks until the bit is cleared (or returns immediately
  with errno set to EAGAIN if the O_NONBLOCK flag is passed to open()).
  The IsSM bit is cleared when the issm file is closed. No read, write
  or other operations can be performed on the issm file.

/dev files
==========

  To create the appropriate character device files automatically with
  udev, a rule like::

    KERNEL=="umad*", NAME="infiniband/%k"
    KERNEL=="issm*", NAME="infiniband/%k"

  can be used. This creates device nodes named::

    /dev/infiniband/umad0
    /dev/infiniband/issm0

  for the first port, and so on. The InfiniBand device and port
  associated with these devices can be determined from the files::

    /sys/class/infiniband_mad/umad0/ibdev
    /sys/class/infiniband_mad/umad0/port

  and::

    /sys/class/infiniband_mad/issm0/ibdev
    /sys/class/infiniband_mad/issm0/port
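
The ibdev and port files listed above can be read like any other text file. The following is a minimal sketch of how a program might map a umad or issm node back to its InfiniBand device and port. The helper names ``umad_read_attr`` and ``umad_print_mapping`` and the ``UMAD_SYSFS_DIR`` macro are hypothetical and are not part of any kernel or library API. The sketch also assumes that each attribute file holds a single line of text.

```c
#include <stdio.h>
#include <string.h>

/* Class directory named in the text above. It is passed as a parameter
 * below so that the helpers can be pointed at another location. */
#define UMAD_SYSFS_DIR "/sys/class/infiniband_mad"

/*
 * Hypothetical helper: read the attribute file <class_dir>/<node>/<attr>
 * into buf and strip the trailing newline. Assumes the file holds one
 * line. Returns 0 on success, -1 on failure.
 */
static int umad_read_attr(const char *class_dir, const char *node,
                          const char *attr, char *buf, size_t len)
{
        char path[512];
        FILE *f;
        int n;

        if (len == 0)
                return -1;

        n = snprintf(path, sizeof path, "%s/%s/%s", class_dir, node, attr);
        if (n < 0 || (size_t) n >= sizeof path)
                return -1;      /* path did not fit */

        f = fopen(path, "r");
        if (!f)
                return -1;
        if (!fgets(buf, (int) len, f)) {
                fclose(f);
                return -1;
        }
        fclose(f);

        buf[strcspn(buf, "\n")] = '\0';
        return 0;
}

/*
 * Hypothetical helper: print which device and port a node such as
 * "umad0" or "issm0" belongs to. Returns 0 on success, -1 on failure.
 */
static int umad_print_mapping(const char *class_dir, const char *node)
{
        char ibdev[64], port[16];

        if (umad_read_attr(class_dir, node, "ibdev", ibdev, sizeof ibdev) ||
            umad_read_attr(class_dir, node, "port", port, sizeof port))
                return -1;

        printf("%s: device %s, port %s\n", node, ibdev, port);
        return 0;
}
```

A caller would typically invoke ``umad_print_mapping(UMAD_SYSFS_DIR, "umad0")`` and treat a return value of -1 as "node not present". Taking the class directory as a parameter is a design choice made here so the helpers can be exercised against a scratch directory on machines without InfiniBand hardware.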