Multicast in Azure RDMA networks (InfiniBand)

Marcin Copik 11 Reputation points
2021-07-31T16:53:42.263+00:00

Hi!

I have deployed two Azure HB60rs machines. I can establish a connection, and a subnet manager (SM) appears to be running, since the NIC reports a valid LID:

hca_id: mlx5_ib0
transport: InfiniBand (0)
fw_ver: 16.28.4000
node_guid: 0015:5dff:fe33:ff1b
sys_image_guid: 9803:9b03:009f:613a
vendor_id: 0x02c9
vendor_part_id: 4120
hw_ver: 0x0
board_id: MT_0000000010
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 214
port_lid: 393
port_lmc: 0x00
link_layer: InfiniBand
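
The check described above (an active port with a non-zero `port_lid` and `sm_lid`) could be scripted roughly as follows. This is only a sketch: it assumes the text comes from `ibv_devinfo`-style `key: value` output for a single device and port, and the helper names `parse_devinfo` and `sm_looks_reachable` are hypothetical, not part of any InfiniBand tool.

```python
# Sample fields copied from the output above (single device, single port).
SAMPLE = """\
hca_id: mlx5_ib0
node_guid: 0015:5dff:fe33:ff1b
port: 1
state: PORT_ACTIVE (4)
sm_lid: 214
port_lid: 393
link_layer: InfiniBand
"""


def parse_devinfo(text):
    """Collect 'key: value' lines into a dict (a later duplicate key overwrites an earlier one)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so GUID values keep their own colons.
        key, sep, value = line.strip().partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sm_looks_reachable(fields):
    """True if the port is active and both LIDs are non-zero."""
    state_active = fields.get("state", "").startswith("PORT_ACTIVE")
    port_lid = int(fields.get("port_lid", "0"))
    sm_lid = int(fields.get("sm_lid", "0"))
    return state_active and port_lid != 0 and sm_lid != 0


if __name__ == "__main__":
    print(sm_looks_reachable(parse_devinfo(SAMPLE)))
```

A non-zero LID only suggests that a subnet manager has configured the port. It does not show that the VM is allowed to send queries to the subnet administrator.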

I've been trying to use unreliable multicast over RDMA, but I can't find any way to list the available multicast groups:

azureuser@test-rdma-3:~/multicast_udp$ saquery -g
ibwarn: [5619] sa_get_handle: umad_open_port on port (null):0 failed
ibpanic: [5619] main: Failed to bind to the SA: Permission denied

azureuser@test-rdma-3:~/multicast_udp$ sudo saquery -g
ibwarn: [5637] sa_query: umad_recv failed: attr 0x38: Connection timed out

Query SA failed: Connection timed out
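
The "Permission denied" from the first attempt is what one would expect if the current user cannot open the userspace MAD device node. A minimal sketch for checking that, assuming the nodes are exposed as `/dev/infiniband/umad*` (the usual location on Linux); the function name `umad_access` is hypothetical:

```python
import glob
import os


def umad_access(dev_dir="/dev/infiniband"):
    """Map each umad device node to whether this user may open it read/write."""
    result = {}
    for path in sorted(glob.glob(os.path.join(dev_dir, "umad*"))):
        # os.access checks the real UID/GID, which is what matters without sudo.
        result[path] = os.access(path, os.R_OK | os.W_OK)
    return result


if __name__ == "__main__":
    nodes = umad_access()
    if not nodes:
        print("no umad device nodes found")
    for path, ok in nodes.items():
        print(path, "accessible" if ok else "not accessible for this user")
```

This only covers the first failure. The timeout under `sudo` happens after the device has been opened, so it is a separate problem.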

Other commands don't work either:

azureuser@test-rdma-3:~/multicast_udp$ ibnodes 
ibwarn: [5944] mad_rpc_open_port: can't open UMAD port ((null):0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:802; can't open MAD port ((null):0)
/usr/sbin/ibnetdiscover: iberror: failed: discover failed
ibwarn: [5949] mad_rpc_open_port: can't open UMAD port ((null):0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:802; can't open MAD port ((null):0)
/usr/sbin/ibnetdiscover: iberror: failed: discover failed
azureuser@test-rdma-3:~/multicast_udp$ sudo ibnodes 
ibwarn: [5965] _do_madrpc: send failed; Invalid argument
ibwarn: [5965] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:811; Failed to resolve self
/usr/sbin/ibnetdiscover: iberror: failed: discover failed
ibwarn: [5970] _do_madrpc: send failed; Invalid argument
ibwarn: [5970] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:811; Failed to resolve self
/usr/sbin/ibnetdiscover: iberror: failed: discover failed



azureuser@test-rdma-3:~/multicast_udp$ ibnetdiscover 
ibwarn: [5979] mad_rpc_open_port: can't open UMAD port ((null):0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:802; can't open MAD port ((null):0)
ibnetdiscover: iberror: failed: discover failed

azureuser@test-rdma-3:~/multicast_udp$ sudo ibnetdiscover 
ibwarn: [5981] _do_madrpc: send failed; Invalid argument
ibwarn: [5981] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
/var/tmp/rdma-core/rdma-core-52mlnx1/libibnetdisc/ibnetdisc.c:811; Failed to resolve self
ibnetdiscover: iberror: failed: discover failed

Is this feature supported? I know that multicast is not supported in Azure virtual networks, but I assumed that InfiniBand is a separate network.

Azure Virtual Machines

1 answer

  1. Cristian SPIRIDON 4,486 Reputation points Volunteer Moderator
    2021-08-03T05:29:46.77+00:00
    1 person found this answer helpful.
