Multicast in Azure RDMA networks (InfiniBand)

Marcin Copik 11 Reputation points
2021-07-31T16:53:42.263+00:00

Hi!

I have deployed two Azure HB60rs machines. I can establish a connection, and a subnet manager (SM) appears to be running, since the NIC reports valid values for sm_lid and port_lid:

hca_id: mlx5_ib0
transport: InfiniBand (0)
fw_ver: 16.28.4000
node_guid: 0015:5dff:fe33:ff1b
sys_image_guid: 9803:9b03:009f:613a
vendor_id: 0x02c9
vendor_part_id: 4120
hw_ver: 0x0
board_id: MT_0000000010
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 214
port_lid: 393
port_lmc: 0x00
link_layer: InfiniBand
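
The check described above can be sketched in a few lines of Python. This is only an illustration: it assumes the listing above is `ibv_devinfo`-style `key: value` text, the helper names `parse_port_info` and `port_looks_configured` are made up for this example, and the rule "port is active and both LIDs are nonzero, so an SM has configured the port" is a heuristic, not an authoritative test.

```python
# Sample text copied from the listing above (abbreviated).
SAMPLE = """\
hca_id: mlx5_ib0
node_guid: 0015:5dff:fe33:ff1b
port: 1
state: PORT_ACTIVE (4)
sm_lid: 214
port_lid: 393
link_layer: InfiniBand
"""


def parse_port_info(text):
    """Collect 'key: value' pairs into a dict (hypothetical helper).

    Only the first ':' splits key from value, so GUIDs such as
    '0015:5dff:fe33:ff1b' are kept intact.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def port_looks_configured(info):
    """Heuristic (an assumption): active port with nonzero SM and port LIDs."""
    state_ok = info.get("state", "").startswith("PORT_ACTIVE")
    sm_lid = int(info.get("sm_lid", "0"))
    port_lid = int(info.get("port_lid", "0"))
    return state_ok and sm_lid != 0 and port_lid != 0


if __name__ == "__main__":
    info = parse_port_info(SAMPLE)
    print(info["sm_lid"], info["port_lid"], port_looks_configured(info))
```

A passing check only says that the port was assigned LIDs. It says nothing about whether the VM is allowed to send management datagrams to the SM, which is what `saquery` and `ibnetdiscover` need.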

I've been trying to use unreliable multicast over RDMA, but I can't find any way to list the available multicast groups:

azureuser@test-rdma-3:~/multicast_udp$ saquery -g
ibwarn: [5619] sa_get_handle: umad_open_port on port (null):0 failed
ibpanic: [5619] main: Failed to bind to the SA: Permission denied

azureuser@test-rdma-3:~/multicast_udp$ sudo saquery -g
ibwarn: [5637] sa_query: umad_recv failed: attr 0x38: Connection timed out

Query SA failed: Connection timed out

Other commands don't work either:

azureuser@test-rdma-3:~/multicast_udp$ ibnodes 
ibwarn: [5944] mad_rpc_open_port: can't open UMAD port ((null):0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:802; can't open MAD port ((null):0)
/usr/sbin/ibnetdiscover: iberror: failed: discover failed
ibwarn: [5949] mad_rpc_open_port: can't open UMAD port ((null):0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:802; can't open MAD port ((null):0)
/usr/sbin/ibnetdiscover: iberror: failed: discover failed
azureuser@test-rdma-3:~/multicast_udp$ sudo ibnodes 
ibwarn: [5965] _do_madrpc: send failed; Invalid argument
ibwarn: [5965] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:811; Failed to resolve self
/usr/sbin/ibnetdiscover: iberror: failed: discover failed
ibwarn: [5970] _do_madrpc: send failed; Invalid argument
ibwarn: [5970] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:811; Failed to resolve self
/usr/sbin/ibnetdiscover: iberror: failed: discover failed



azureuser@test-rdma-3:~/multicast_udp$ ibnetdiscover 
ibwarn: [5979] mad_rpc_open_port: can't open UMAD port ((null):0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:802; can't open MAD port ((null):0)
ibnetdiscover: iberror: failed: discover failed

azureuser@test-rdma-3:~/multicast_udp$ sudo ibnetdiscover 
ibwarn: [5981] _do_madrpc: send failed; Invalid argument
ibwarn: [5981] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:811; Failed to resolve self
ibnetdiscover: iberror: failed: discover failed

Is multicast supported here? I know that multicast is not supported in Azure virtual networks, but I assumed that the InfiniBand fabric is a separate network.

Azure Virtual Machines

1 answer

  1. Cristian SPIRIDON 4,471 Reputation points
    2021-08-03T05:29:46.77+00:00