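In a complex network, troubleshooting a fault such as hosts being unable to reach each other over IP often takes a great deal of time and effort. On an InfiniBand fabric, the iblinkinfo tool makes it possible to quickly check whether the network is healthy and, when a fault occurs, to locate its root cause and recover.
Recently I ran into a rather odd problem at work: in a test environment, after the IB adapters were cabled, the compute layer could not ping the storage layer, even though every IB port in use reported a normal link state and the IP configuration on each port was correct. The environment has three layers: the compute-layer and storage-layer networks are both connected to InfiniBand switches, which is how the two layers reach each other. In a setup like this, finding the cause means checking the compute layer, the switch layer, and the storage layer in turn, first to roughly determine which layer the problem is in.
The compute-layer network is shown below.
#ibdev2netdev # ==> node1 IB port status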
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Up)
mlx4_1 port 1 ==> ib2 (Up)
mlx4_1 port 2 ==> ib3 (Up)
#ifconfig -a | grep -E "mtu|inet" # ==> node1 IP configuration
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.22.12 netmask 255.255.255.0 broadcast xx.xx.22.255
inet6 fe80::f652:1403:93:1791 prefixlen 64 scopeid 0x20<link>
ib1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.23.12 netmask 255.255.255.0 broadcast xx.xx.23.255
inet6 fe80::f652:1403:93:1792 prefixlen 64 scopeid 0x20<link>
ib2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.24.12 netmask 255.255.255.0 broadcast xx.xx.24.255
inet6 fe80::f652:1403:22:ba41 prefixlen 64 scopeid 0x20<link>
ib3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.25.12 netmask 255.255.255.0 broadcast xx.xx.25.255
inet6 fe80::f652:1403:22:ba42 prefixlen 64 scopeid 0x20<link>
#ibdev2netdev # ==> node2 IB port status
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Up)
mlx4_1 port 1 ==> ib2 (Up)
mlx4_1 port 2 ==> ib3 (Up)
#ifconfig -a | grep -E "mtu|inet" # ==> node2 IP configuration
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.22.94 netmask 255.255.255.0 broadcast xx.xx.22.255
inet6 fe80::f652:1403:72:c171 prefixlen 64 scopeid 0x20<link>
ib1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.23.94 netmask 255.255.255.0 broadcast xx.xx.23.255
inet6 fe80::f652:1403:72:c172 prefixlen 64 scopeid 0x20<link>
ib2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.24.94 netmask 255.255.255.0 broadcast xx.xx.24.255
inet6 fe80::26be:5ff:ffc6:78d1 prefixlen 64 scopeid 0x20<link>
ib3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.25.94 netmask 255.255.255.0 broadcast xx.xx.25.255
inet6 fe80::26be:5ff:ffc6:78d2 prefixlen 64 scopeid 0x20<link>
The storage-layer network configuration is as follows.
#ibdev2netdev # ==> node3 IB port status
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Up)
mlx4_1 port 1 ==> ib2 (Down)
mlx4_1 port 2 ==> ib3 (Down)
#ifconfig -a | grep -E "mtu|inet" # ==> node3 IP configuration
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.22.22 netmask 255.255.255.0 broadcast xx.xx.22.255
inet6 fe80::202:c903:31:8321 prefixlen 64 scopeid 0x20<link>
ib1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.24.22 netmask 255.255.255.0 broadcast xx.xx.24.255
inet6 fe80::202:c903:31:8322 prefixlen 64 scopeid 0x20<link>
ib2: flags=4098<BROADCAST,MULTICAST> mtu 4092
ib3: flags=4098<BROADCAST,MULTICAST> mtu 4092
#ibdev2netdev # ==> node4 IB port status
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Up)
mlx4_1 port 1 ==> ib2 (Down)
mlx4_1 port 2 ==> ib3 (Down)
#ifconfig -a | grep -E "mtu|inet" # ==> node4 IP configuration
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.22.23 netmask 255.255.255.0 broadcast xx.xx.22.255
inet6 fe80::202:c903:30:a9a1 prefixlen 64 scopeid 0x20<link>
ib1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 4092
inet xx.xx.24.23 netmask 255.255.255.0 broadcast xx.xx.24.255
inet6 fe80::202:c903:30:a9a2 prefixlen 64 scopeid 0x20<link>
ib2: flags=4098<BROADCAST,MULTICAST> mtu 4092
ib3: flags=4098<BROADCAST,MULTICAST> mtu 4092
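When there are many nodes, reading the port states by eye gets tedious. The following is a minimal sketch, not part of the original procedure, that parses text in the `ibdev2netdev` format shown above and lists the interfaces whose port is not Up. The function names `parse_ibdev2netdev` and `down_interfaces` are made up for illustration, and the regular expression only assumes the line layout printed in this post.

```python
import re

# Matches lines such as "mlx4_1 port 1 ==> ib2 (Down)".
LINE_RE = re.compile(r"^(\S+) port (\d+) ==> (\S+) \((\w+)\)$")


def parse_ibdev2netdev(text):
    """Return a list of (device, port, interface, state) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            device, port, iface, state = match.groups()
            rows.append((device, int(port), iface, state))
    return rows


def down_interfaces(text):
    """Return the interface names whose IB port is not reported as Up."""
    return [iface for _, _, iface, state in parse_ibdev2netdev(text)
            if state != "Up"]


# Sample copied from the node3 output above.
NODE3_SAMPLE = """\
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Up)
mlx4_1 port 1 ==> ib2 (Down)
mlx4_1 port 2 ==> ib3 (Down)
"""

print(down_interfaces(NODE3_SAMPLE))  # ['ib2', 'ib3']
```

On node3 and node4 only ib2 and ib3 are down, and those ports carry no IP address, so they are not involved in the failing pings.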
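Symptoms
With the configuration above, the two compute-layer servers can reach each other over IP, and so can the two storage-layer servers, but the compute layer cannot reach the storage layer.
#ping 192.168.24.22 # storage-layer test node4 ==> node3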
PING xx.xx.24.22 (xx.xx.24.22) 56(84) bytes of data.
64 bytes from xx.xx.24.22: icmp_seq=1 ttl=64 time=0.132 ms
^C
--- xx.xx.24.22 ping statistics ---
1 packets transmitted, received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.119/0.125/0.132/0.012 ms
#ping 192.168.22.94 # compute-layer test node1 ==> node2
PING xx.xx.22.94 (xx.xx.22.94) 56(84) bytes of data.
64 bytes from xx.xx.22.94: icmp_seq=1 ttl=64 time=0.076 ms
^C
--- xx.xx.22.94 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.076/0.076/0.076/0.000 ms
#ping 192.168.22.22 # compute-to-storage test node2 ==> node3
PING xx.xx.22.22 (xx.xx.22.22) 56(84) bytes of data.
^C
--- xx.xx.22.22 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2001ms
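When compute-to-compute and storage-to-storage traffic both work but compute-to-storage traffic does not, the hosts themselves are unlikely to be at fault, which points to the switch layer. Two explanations are possible. The first is that a switch has failed; but then every port attached to it would lose connectivity, including compute-to-compute traffic, which is clearly not what we see. The second is that the packets reach a switch that has no way to deliver them to the destination port, and in some setups a cabling mistake produces exactly this behavior.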
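Before looking at cabling, it is worth confirming that the failing pair of addresses really sits in the same IP subnet, so that a plain addressing mistake can be ruled out. The sketch below does this with Python's standard `ipaddress` module. The `192.168` prefix is taken from the ping commands above (the command output masks it as `xx.xx`), and the helper name `same_subnet` is an illustrative assumption.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask):
    """Return True if both addresses fall in the same network for the given mask."""
    net_a = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/{netmask}", strict=False)
    return net_a == net_b


# node2 ib0 -> node3 ib0: the ping that fails in this post.
print(same_subnet("192.168.22.94", "192.168.22.22", "255.255.255.0"))  # True
# node2 ib0 -> node3 ib1: a different subnet, shown for contrast.
print(same_subnet("192.168.22.94", "192.168.24.22", "255.255.255.0"))  # False
```

Since the failing pair is in the same /24, no routing is involved and the addressing is consistent, which supports looking at the link layer next.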
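Diagnosis
Having narrowed the fault down to the switch layer, and given that this is an InfiniBand fabric whose IB ports were already seen to be up, we know the cables are physically connected to the switches. That does not rule out cables being plugged into the wrong place. To check the IB cabling inside the cluster we use the iblinkinfo tool, which prints details for every link on the switch, including the link speed, the switch port each adapter is attached to, the peer name, and the link width (number of lanes).
Because the switches sit in the middle of the network, the IB port cabling still has to be checked from every node in the cluster.
Below is the IB port cabling topology for each node, starting with the compute layer.
On compute-layer server node1, both ports of HCA-1 (mlx4_0) are connected to switch jfyl-ib-switches-1, and both ports of HCA-2 (mlx4_1) are connected to switch jfyl-ib-switches-2.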
#iblinkinfo -C mlx4_0 -P 1 # ==> iblinkinfo -C mlx4_0 -P 2
Switch: 0xe41d2d0300xxxxxx MF0;jfyl-ib-switches-1:SX6xxx/U1:
12 1[ ] ==( Down/ Polling)==> [ ] "" ( )
12 2[ ] ==( Down/ Polling)==> [ ] "" ( )
12 3[ ] ==( Down/ Polling)==> [ ] "" ( )
12 4[ ] ==( Down/ Polling)==> [ ] "" ( )
12 5[ ] ==( Down/ Polling)==> [ ] "" ( )
12 6[ ] ==( Down/ Polling)==> [ ] "" ( )
12 7[ ] ==( Down/ Polling)==> [ ] "" ( )
12 8[ ] ==( Down/ Polling)==> [ ] "" ( )
12 9[ ] ==( Down/ Polling)==> [ ] "" ( )
12 10[ ] ==( Down/ Polling)==> [ ] "" ( )
12 11[ ] ==( Down/ Polling)==> [ ] "" ( )
12 12[ ] ==( Down/ Polling)==> [ ] "" ( )
12 13[ ] ==( Down/ Polling)==> [ ] "" ( )
12 14[ ] ==( Down/ Polling)==> [ ] "" ( )
12 15[ ] ==( Down/ Polling)==> [ ] "" ( )
12 16[ ] ==( Down/ Polling)==> [ ] "" ( )
12 17[ ] ==( Down/ Polling)==> [ ] "" ( )
12 18[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 2[ ] "node2 HCA-1" ( )
12 19[ ] ==( Down/ Polling)==> [ ] "" ( )
12 20[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 38 1[ ] "node2 HCA-1" ( )
12 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 37 2[ ] "node4 HCA-1" ( )
12 22[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 39 1[ ] "node1 HCA-1" ( )
12 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 41 2[ ] "node3 HCA-1" ( )
12 24[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 40 2[ ] "node1 HCA-1" ( )
12 25[ ] ==( Down/ Polling)==> [ ] "" ( )
12 26[ ] ==( Down/ Polling)==> [ ] "" ( )
12 27[ ] ==( Down/ Polling)==> [ ] "" ( )
12 28[ ] ==( Down/ Polling)==> [ ] "" ( )
12 29[ ] ==( Down/ Polling)==> [ ] "" ( )
12 30[ ] ==( Down/ Polling)==> [ ] "" ( )
12 31[ ] ==( Down/ Polling)==> [ ] "" ( )
12 32[ ] ==( Down/ Polling)==> [ ] "" ( )
12 33[ ] ==( Down/ Polling)==> [ ] "" ( )
12 34[ ] ==( Down/ Polling)==> [ ] "" ( )
12 35[ ] ==( Down/ Polling)==> [ ] "" ( )
12 36[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node1 HCA-1:
0xf452140300xxxxxx 39 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 12 22[ ] "MF0;jfyl-ib-switches-1:SX6xxx/U1" ( )
0xf452140300xxxxxx 40 2[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 12 24[ ] "MF0;jfyl-ib-switches-1:SX6xxx/U1" ( )
#iblinkinfo -C mlx4_1 -P 1 # ==> iblinkinfo -C mlx4_1 -P 2
Switch: 0x7cfe900300xxxxxx MF0;jfyl-ib-switches-2:SX6xxx/U1:
11 1[ ] ==( Down/ Polling)==> [ ] "" ( )
11 2[ ] ==( Down/ Polling)==> [ ] "" ( )
11 3[ ] ==( Down/ Polling)==> [ ] "" ( )
11 4[ ] ==( Down/ Polling)==> [ ] "" ( )
11 5[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 23 1[ ] "node4 HCA-1" ( )
11 6[ ] ==( Down/ Polling)==> [ ] "" ( )
11 7[ ] ==( Down/ Polling)==> [ ] "" ( )
11 8[ ] ==( Down/ Polling)==> [ ] "" ( )
11 9[ ] ==( Down/ Polling)==> [ ] "" ( )
11 10[ ] ==( Down/ Polling)==> [ ] "" ( )
11 11[ ] ==( Down/ Polling)==> [ ] "" ( )
11 12[ ] ==( Down/ Polling)==> [ ] "" ( )
11 13[ ] ==( Down/ Polling)==> [ ] "" ( )
11 14[ ] ==( Down/ Polling)==> [ ] "" ( )
11 15[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 28 2[ ] "node2 HCA-2" ( )
11 16[ ] ==( Down/ Polling)==> [ ] "" ( )
11 17[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 26 1[ ] "node2 HCA-2" ( )
11 18[ ] ==( Down/ Polling)==> [ ] "" ( )
11 19[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 1[ ] "node1 HCA-2" ( )
11 20[ ] ==( Down/ Polling)==> [ ] "" ( )
11 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 29 2[ ] "node1 HCA-2" ( )
11 22[ ] ==( Down/ Polling)==> [ ] "" ( )
11 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 19 1[ ] "node3 HCA-1" ( )
11 24[ ] ==( Down/ Polling)==> [ ] "" ( )
11 25[ ] ==( Down/ Polling)==> [ ] "" ( )
11 26[ ] ==( Down/ Polling)==> [ ] "" ( )
11 27[ ] ==( Down/ Polling)==> [ ] "" ( )
11 28[ ] ==( Down/ Polling)==> [ ] "" ( )
11 29[ ] ==( Down/ Polling)==> [ ] "" ( )
11 30[ ] ==( Down/ Polling)==> [ ] "" ( )
11 31[ ] ==( Down/ Polling)==> [ ] "" ( )
11 32[ ] ==( Down/ Polling)==> [ ] "" ( )
11 33[ ] ==( Down/ Polling)==> [ ] "" ( )
11 34[ ] ==( Down/ Polling)==> [ ] "" ( )
11 35[ ] ==( Down/ Polling)==> [ ] "" ( )
11 36[ ] ==( Down/ Polling)==> [ ] "" ( )
11 37[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node1 HCA-2:
0xf452140300xxxxxx 36 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 11 19[ ] "MF0;jfyl-ib-switches-2:SX6xxx/U1" ( )
0xf452140300xxxxxx 29 2[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 11 21[ ] "MF0;jfyl-ib-switches-2:SX6xxx/U1" ( )
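Most lines in this output are unused switch ports, so the handful of active links is easy to miss. As a rough sketch (not an official parser), the code below extracts only the Active/LinkUp lines from a switch section and reports which peer is attached to which switch port. The regular expressions are written against the layout printed in this post and tolerate extra column padding; the function name `active_switch_links` is hypothetical, and the sample is an abbreviated excerpt of the jfyl-ib-switches-1 output.

```python
import re

# "Switch: <GUID> <node description>:" header line.
SWITCH_RE = re.compile(r"^Switch:\s+\S+\s+(.+?):$")
# "<LID> <port>[ ] ==( ... Active/ LinkUp)==> <peer LID> <peer port>[ ] "<peer name>""
LINK_RE = re.compile(
    r'^(\d+)\s+(\d+)\[.*?\]\s+==\(.*?Active/\s*LinkUp\)==>'
    r'\s+(\d+)\s+(\d+)\[.*?\]\s+"([^"]*)"'
)


def active_switch_links(text):
    """Map each switch name to a list of (switch_port, peer_name, peer_port)."""
    links = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = SWITCH_RE.match(line)
        if header:
            current = header.group(1)
            links.setdefault(current, [])
            continue
        if line.startswith("CA:"):
            current = None  # CA sections repeat the same links from the host side
            continue
        link = LINK_RE.match(line)
        if link and current is not None:
            _lid, sw_port, _peer_lid, peer_port, peer = link.groups()
            links[current].append((int(sw_port), peer, int(peer_port)))
    return links


SAMPLE = '''\
Switch: 0xe41d2d0300xxxxxx MF0;jfyl-ib-switches-1:SX6xxx/U1:
12 17[ ] ==( Down/ Polling)==> [ ] "" ( )
12 18[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 2[ ] "node2 HCA-1" ( )
12 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 37 2[ ] "node4 HCA-1" ( )
12 22[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 39 1[ ] "node1 HCA-1" ( )
CA: node1 HCA-1:
0xf452140300xxxxxx 39 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 12 22[ ] "MF0;jfyl-ib-switches-1:SX6xxx/U1" ( )
'''

for switch, entries in active_switch_links(SAMPLE).items():
    for sw_port, peer, peer_port in entries:
        print(f"{switch} port {sw_port} <-> {peer} port {peer_port}")
```

Read this way, the excerpt already shows that jfyl-ib-switches-1 sees port 2 of node4's HCA-1 alongside the compute-layer HCA-1 ports.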
On compute-layer server node2, both ports of HCA-1 (mlx4_0) are connected to switch jfyl-ib-switches-1, and both ports of HCA-2 (mlx4_1) are connected to switch jfyl-ib-switches-2.
#iblinkinfo -C mlx4_0 -P 1 # ==> iblinkinfo -C mlx4_0 -P 2
Switch: 0xe41d2d0300xxxxxx MF0;jfyl-ib-switches-1:SX6xxx/U1:
12 1[ ] ==( Down/ Polling)==> [ ] "" ( )
12 2[ ] ==( Down/ Polling)==> [ ] "" ( )
12 3[ ] ==( Down/ Polling)==> [ ] "" ( )
12 4[ ] ==( Down/ Polling)==> [ ] "" ( )
12 5[ ] ==( Down/ Polling)==> [ ] "" ( )
12 6[ ] ==( Down/ Polling)==> [ ] "" ( )
12 7[ ] ==( Down/ Polling)==> [ ] "" ( )
12 8[ ] ==( Down/ Polling)==> [ ] "" ( )
12 9[ ] ==( Down/ Polling)==> [ ] "" ( )
12 10[ ] ==( Down/ Polling)==> [ ] "" ( )
12 11[ ] ==( Down/ Polling)==> [ ] "" ( )
12 12[ ] ==( Down/ Polling)==> [ ] "" ( )
12 13[ ] ==( Down/ Polling)==> [ ] "" ( )
12 14[ ] ==( Down/ Polling)==> [ ] "" ( )
12 15[ ] ==( Down/ Polling)==> [ ] "" ( )
12 16[ ] ==( Down/ Polling)==> [ ] "" ( )
12 17[ ] ==( Down/ Polling)==> [ ] "" ( )
12 18[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 2[ ] "node2 HCA-1" ( )
12 19[ ] ==( Down/ Polling)==> [ ] "" ( )
12 20[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 38 1[ ] "node2 HCA-1" ( )
12 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 37 2[ ] "node4 HCA-1" ( )
12 22[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 39 1[ ] "node1 HCA-1" ( )
12 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 41 2[ ] "node3 HCA-1" ( )
12 24[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 40 2[ ] "node1 HCA-1" ( )
12 25[ ] ==( Down/ Polling)==> [ ] "" ( )
12 26[ ] ==( Down/ Polling)==> [ ] "" ( )
12 27[ ] ==( Down/ Polling)==> [ ] "" ( )
12 28[ ] ==( Down/ Polling)==> [ ] "" ( )
12 29[ ] ==( Down/ Polling)==> [ ] "" ( )
12 30[ ] ==( Down/ Polling)==> [ ] "" ( )
12 31[ ] ==( Down/ Polling)==> [ ] "" ( )
12 32[ ] ==( Down/ Polling)==> [ ] "" ( )
12 33[ ] ==( Down/ Polling)==> [ ] "" ( )
12 34[ ] ==( Down/ Polling)==> [ ] "" ( )
12 35[ ] ==( Down/ Polling)==> [ ] "" ( )
12 36[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node2 HCA-1:
0xf452140300xxxxxx 38 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 12 20[ ] "MF0;jfyl-ib-switches-1:SX6xxx/U1" ( )
0xf452140300xxxxxx 36 2[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 12 18[ ] "MF0;jfyl-ib-switches-1:SX6xxx/U1" ( )
#iblinkinfo -C mlx4_1 -P 1 # ==> iblinkinfo -C mlx4_1 -P 2
Switch: 0x7cfe900300xxxxxx MF0;jfyl-ib-switches-2:SX6xxx/U1:
11 1[ ] ==( Down/ Polling)==> [ ] "" ( )
11 2[ ] ==( Down/ Polling)==> [ ] "" ( )
11 3[ ] ==( Down/ Polling)==> [ ] "" ( )
11 4[ ] ==( Down/ Polling)==> [ ] "" ( )
11 5[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 23 1[ ] "node4 HCA-1" ( )
11 6[ ] ==( Down/ Polling)==> [ ] "" ( )
11 7[ ] ==( Down/ Polling)==> [ ] "" ( )
11 8[ ] ==( Down/ Polling)==> [ ] "" ( )
11 9[ ] ==( Down/ Polling)==> [ ] "" ( )
11 10[ ] ==( Down/ Polling)==> [ ] "" ( )
11 11[ ] ==( Down/ Polling)==> [ ] "" ( )
11 12[ ] ==( Down/ Polling)==> [ ] "" ( )
11 13[ ] ==( Down/ Polling)==> [ ] "" ( )
11 14[ ] ==( Down/ Polling)==> [ ] "" ( )
11 15[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 28 2[ ] "node2 HCA-2" ( )
11 16[ ] ==( Down/ Polling)==> [ ] "" ( )
11 17[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 26 1[ ] "node2 HCA-2" ( )
11 18[ ] ==( Down/ Polling)==> [ ] "" ( )
11 19[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 1[ ] "node1 HCA-2" ( )
11 20[ ] ==( Down/ Polling)==> [ ] "" ( )
11 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 29 2[ ] "node1 HCA-2" ( )
11 22[ ] ==( Down/ Polling)==> [ ] "" ( )
11 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 19 1[ ] "node3 HCA-1" ( )
11 24[ ] ==( Down/ Polling)==> [ ] "" ( )
11 25[ ] ==( Down/ Polling)==> [ ] "" ( )
11 26[ ] ==( Down/ Polling)==> [ ] "" ( )
11 27[ ] ==( Down/ Polling)==> [ ] "" ( )
11 28[ ] ==( Down/ Polling)==> [ ] "" ( )
11 29[ ] ==( Down/ Polling)==> [ ] "" ( )
11 30[ ] ==( Down/ Polling)==> [ ] "" ( )
11 31[ ] ==( Down/ Polling)==> [ ] "" ( )
11 32[ ] ==( Down/ Polling)==> [ ] "" ( )
11 33[ ] ==( Down/ Polling)==> [ ] "" ( )
11 34[ ] ==( Down/ Polling)==> [ ] "" ( )
11 35[ ] ==( Down/ Polling)==> [ ] "" ( )
11 36[ ] ==( Down/ Polling)==> [ ] "" ( )
11 37[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node2 HCA-2:
0x24be05ffffxxxxxx 26 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 11 17[ ] "MF0;jfyl-ib-switches-2:SX6xxx/U1" ( )
0x24be05ffffxxxxxx 28 2[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 11 15[ ] "MF0;jfyl-ib-switches-2:SX6xxx/U1" ( )
On storage-layer server node3, port 1 of HCA-1 (mlx4_0) is connected to switch jfyl-ib-switches-2, and port 2 of HCA-1 (mlx4_0) is connected to switch jfyl-ib-switches-1.
#iblinkinfo -C mlx4_0 -P 1
Switch: 0x7cfe900300xxxxxx MF0;jfyl-ib-switches-2:SX6xxx/U1:
11 1[ ] ==( Down/ Polling)==> [ ] "" ( )
11 2[ ] ==( Down/ Polling)==> [ ] "" ( )
11 3[ ] ==( Down/ Polling)==> [ ] "" ( )
11 4[ ] ==( Down/ Polling)==> [ ] "" ( )
11 5[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 23 1[ ] "node4 HCA-1" ( )
11 6[ ] ==( Down/ Polling)==> [ ] "" ( )
11 7[ ] ==( Down/ Polling)==> [ ] "" ( )
11 8[ ] ==( Down/ Polling)==> [ ] "" ( )
11 9[ ] ==( Down/ Polling)==> [ ] "" ( )
11 10[ ] ==( Down/ Polling)==> [ ] "" ( )
11 11[ ] ==( Down/ Polling)==> [ ] "" ( )
11 12[ ] ==( Down/ Polling)==> [ ] "" ( )
11 13[ ] ==( Down/ Polling)==> [ ] "" ( )
11 14[ ] ==( Down/ Polling)==> [ ] "" ( )
11 15[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 28 2[ ] "node2 HCA-2" ( )
11 16[ ] ==( Down/ Polling)==> [ ] "" ( )
11 17[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 26 1[ ] "node2 HCA-2" ( )
11 18[ ] ==( Down/ Polling)==> [ ] "" ( )
11 19[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 1[ ] "node1 HCA-2" ( )
11 20[ ] ==( Down/ Polling)==> [ ] "" ( )
11 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 29 2[ ] "node1 HCA-2" ( )
11 22[ ] ==( Down/ Polling)==> [ ] "" ( )
11 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 19 1[ ] "node3 HCA-1" ( )
11 24[ ] ==( Down/ Polling)==> [ ] "" ( )
11 25[ ] ==( Down/ Polling)==> [ ] "" ( )
11 26[ ] ==( Down/ Polling)==> [ ] "" ( )
11 27[ ] ==( Down/ Polling)==> [ ] "" ( )
11 28[ ] ==( Down/ Polling)==> [ ] "" ( )
11 29[ ] ==( Down/ Polling)==> [ ] "" ( )
11 30[ ] ==( Down/ Polling)==> [ ] "" ( )
11 31[ ] ==( Down/ Polling)==> [ ] "" ( )
11 32[ ] ==( Down/ Polling)==> [ ] "" ( )
11 33[ ] ==( Down/ Polling)==> [ ] "" ( )
11 34[ ] ==( Down/ Polling)==> [ ] "" ( )
11 35[ ] ==( Down/ Polling)==> [ ] "" ( )
11 36[ ] ==( Down/ Polling)==> [ ] "" ( )
11 37[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node3 HCA-1:
0x0002c90300xxxxxx 19 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 11 23[ ] "MF0;jfyl-ib-switches-2:SX6xxx/U1" ( )
#iblinkinfo -C mlx4_0 -P 2
Switch: 0xe41d2d0300xxxxxx MF0;jfyl-ib-switches-1:SX6xxx/U1:
12 1[ ] ==( Down/ Polling)==> [ ] "" ( )
12 2[ ] ==( Down/ Polling)==> [ ] "" ( )
12 3[ ] ==( Down/ Polling)==> [ ] "" ( )
12 4[ ] ==( Down/ Polling)==> [ ] "" ( )
12 5[ ] ==( Down/ Polling)==> [ ] "" ( )
12 6[ ] ==( Down/ Polling)==> [ ] "" ( )
12 7[ ] ==( Down/ Polling)==> [ ] "" ( )
12 8[ ] ==( Down/ Polling)==> [ ] "" ( )
12 9[ ] ==( Down/ Polling)==> [ ] "" ( )
12 10[ ] ==( Down/ Polling)==> [ ] "" ( )
12 11[ ] ==( Down/ Polling)==> [ ] "" ( )
12 12[ ] ==( Down/ Polling)==> [ ] "" ( )
12 13[ ] ==( Down/ Polling)==> [ ] "" ( )
12 14[ ] ==( Down/ Polling)==> [ ] "" ( )
12 15[ ] ==( Down/ Polling)==> [ ] "" ( )
12 16[ ] ==( Down/ Polling)==> [ ] "" ( )
12 17[ ] ==( Down/ Polling)==> [ ] "" ( )
12 18[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 2[ ] "node2 HCA-1" ( )
12 19[ ] ==( Down/ Polling)==> [ ] "" ( )
12 20[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 38 1[ ] "node2 HCA-1" ( )
12 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 37 2[ ] "node4 HCA-1" ( )
12 22[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 39 1[ ] "node1 HCA-1" ( )
12 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 41 2[ ] "node3 HCA-1" ( )
12 24[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 40 2[ ] "node1 HCA-1" ( )
12 25[ ] ==( Down/ Polling)==> [ ] "" ( )
12 26[ ] ==( Down/ Polling)==> [ ] "" ( )
12 27[ ] ==( Down/ Polling)==> [ ] "" ( )
12 28[ ] ==( Down/ Polling)==> [ ] "" ( )
12 29[ ] ==( Down/ Polling)==> [ ] "" ( )
12 30[ ] ==( Down/ Polling)==> [ ] "" ( )
12 31[ ] ==( Down/ Polling)==> [ ] "" ( )
12 32[ ] ==( Down/ Polling)==> [ ] "" ( )
12 33[ ] ==( Down/ Polling)==> [ ] "" ( )
12 34[ ] ==( Down/ Polling)==> [ ] "" ( )
12 35[ ] ==( Down/ Polling)==> [ ] "" ( )
12 36[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node3 HCA-1:
0x0002c90300xxxxxx 41 2[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 12 23[ ] "MF0;jfyl-ib-switches-1:SX6xxx/U1" ( )
On storage-layer server node4, port 1 of HCA-1 (mlx4_0) is connected to switch jfyl-ib-switches-2, and port 2 of HCA-1 (mlx4_0) is connected to switch jfyl-ib-switches-1.
#iblinkinfo -C mlx4_0 -P 1
Switch: 0x7cfe900300xxxxxx MF0;jfyl-ib-switches-2:SX6xxx/U1:
11 1[ ] ==( Down/ Polling)==> [ ] "" ( )
11 2[ ] ==( Down/ Polling)==> [ ] "" ( )
11 3[ ] ==( Down/ Polling)==> [ ] "" ( )
11 4[ ] ==( Down/ Polling)==> [ ] "" ( )
11 5[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 23 1[ ] "node4 HCA-1" ( )
11 6[ ] ==( Down/ Polling)==> [ ] "" ( )
11 7[ ] ==( Down/ Polling)==> [ ] "" ( )
11 8[ ] ==( Down/ Polling)==> [ ] "" ( )
11 9[ ] ==( Down/ Polling)==> [ ] "" ( )
11 10[ ] ==( Down/ Polling)==> [ ] "" ( )
11 11[ ] ==( Down/ Polling)==> [ ] "" ( )
11 12[ ] ==( Down/ Polling)==> [ ] "" ( )
11 13[ ] ==( Down/ Polling)==> [ ] "" ( )
11 14[ ] ==( Down/ Polling)==> [ ] "" ( )
11 15[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 28 2[ ] "node2 HCA-2" ( )
11 16[ ] ==( Down/ Polling)==> [ ] "" ( )
11 17[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 26 1[ ] "node2 HCA-2" ( )
11 18[ ] ==( Down/ Polling)==> [ ] "" ( )
11 19[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 1[ ] "node1 HCA-2" ( )
11 20[ ] ==( Down/ Polling)==> [ ] "" ( )
11 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 29 2[ ] "node1 HCA-2" ( )
11 22[ ] ==( Down/ Polling)==> [ ] "" ( )
11 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 19 1[ ] "node3 HCA-1" ( )
11 24[ ] ==( Down/ Polling)==> [ ] "" ( )
11 25[ ] ==( Down/ Polling)==> [ ] "" ( )
11 26[ ] ==( Down/ Polling)==> [ ] "" ( )
11 27[ ] ==( Down/ Polling)==> [ ] "" ( )
11 28[ ] ==( Down/ Polling)==> [ ] "" ( )
11 29[ ] ==( Down/ Polling)==> [ ] "" ( )
11 30[ ] ==( Down/ Polling)==> [ ] "" ( )
11 31[ ] ==( Down/ Polling)==> [ ] "" ( )
11 32[ ] ==( Down/ Polling)==> [ ] "" ( )
11 33[ ] ==( Down/ Polling)==> [ ] "" ( )
11 34[ ] ==( Down/ Polling)==> [ ] "" ( )
11 35[ ] ==( Down/ Polling)==> [ ] "" ( )
11 36[ ] ==( Down/ Polling)==> [ ] "" ( )
11 37[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node4 HCA-1:
0x0002c90300xxxxxx 23 1[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 11 5[ ] "MF0;jfyl-ib-switches-2:SX6xxx/U1" ( )
#iblinkinfo -C mlx4_0 -P 2
Switch: 0xe41d2d0300xxxxxx MF0;jfyl-ib-switches-1:SX6xxx/U1:
12 1[ ] ==( Down/ Polling)==> [ ] "" ( )
12 2[ ] ==( Down/ Polling)==> [ ] "" ( )
12 3[ ] ==( Down/ Polling)==> [ ] "" ( )
12 4[ ] ==( Down/ Polling)==> [ ] "" ( )
12 5[ ] ==( Down/ Polling)==> [ ] "" ( )
12 6[ ] ==( Down/ Polling)==> [ ] "" ( )
12 7[ ] ==( Down/ Polling)==> [ ] "" ( )
12 8[ ] ==( Down/ Polling)==> [ ] "" ( )
12 9[ ] ==( Down/ Polling)==> [ ] "" ( )
12 10[ ] ==( Down/ Polling)==> [ ] "" ( )
12 11[ ] ==( Down/ Polling)==> [ ] "" ( )
12 12[ ] ==( Down/ Polling)==> [ ] "" ( )
12 13[ ] ==( Down/ Polling)==> [ ] "" ( )
12 14[ ] ==( Down/ Polling)==> [ ] "" ( )
12 15[ ] ==( Down/ Polling)==> [ ] "" ( )
12 16[ ] ==( Down/ Polling)==> [ ] "" ( )
12 17[ ] ==( Down/ Polling)==> [ ] "" ( )
12 18[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 36 2[ ] "node2 HCA-1" ( )
12 19[ ] ==( Down/ Polling)==> [ ] "" ( )
12 20[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 38 1[ ] "node2 HCA-1" ( )
12 21[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 37 2[ ] "node4 HCA-1" ( )
12 22[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 39 1[ ] "node1 HCA-1" ( )
12 23[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 41 2[ ] "node3 HCA-1" ( )
12 24[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 40 2[ ] "node1 HCA-1" ( )
12 25[ ] ==( Down/ Polling)==> [ ] "" ( )
12 26[ ] ==( Down/ Polling)==> [ ] "" ( )
12 27[ ] ==( Down/ Polling)==> [ ] "" ( )
12 28[ ] ==( Down/ Polling)==> [ ] "" ( )
12 29[ ] ==( Down/ Polling)==> [ ] "" ( )
12 30[ ] ==( Down/ Polling)==> [ ] "" ( )
12 31[ ] ==( Down/ Polling)==> [ ] "" ( )
12 32[ ] ==( Down/ Polling)==> [ ] "" ( )
12 33[ ] ==( Down/ Polling)==> [ ] "" ( )
12 34[ ] ==( Down/ Polling)==> [ ] "" ( )
12 35[ ] ==( Down/ Polling)==> [ ] "" ( )
12 36[ ] ==( Down/ Polling)==> [ ] "" ( )
CA: node4 HCA-1:
0x0002c90300xxxxxx 37 2[ ] ==( 4X 14.0625 Gbps Active/ LinkUp)==> 12 21[ ] "MF0;jfyl-ib-switches-1:SX6xxx/U1" ( )
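Comparing the IB topology of the four servers with the IP on each port makes the problem easy to see. On the compute layer, HCA-1 is cabled to jfyl-ib-switches-1, and its ports carry the 192.168.22.x and 192.168.23.x subnets. On the storage layer, the HCA-1 port cabled to jfyl-ib-switches-1 is port 2, whose interface is in the 192.168.24.x subnet. Each iblinkinfo run lists only one switch, so the two switches appear to have no link between them. A ping from the compute layer to the storage layer's 192.168.22.x addresses therefore leaves through jfyl-ib-switches-1, where the only storage ports present belong to 192.168.24.x; in effect 192.168.22.x is trying to talk to 192.168.24.x, and the ping fails.
The same applies to compute-layer HCA-2, which is cabled to jfyl-ib-switches-2 with addresses in 192.168.24.x and 192.168.25.x, while the storage-layer HCA-1 port on that switch is port 1, whose interface is in 192.168.22.x. A ping from the compute layer to the storage layer on this switch amounts to 192.168.24.x trying to reach 192.168.22.x, which naturally fails as well.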
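The comparison can also be done mechanically. The sketch below records, for each interface, the switch it is cabled to (read from the iblinkinfo output above) and its address (read from the ifconfig output, with the `192.168` prefix assumed from the ping commands), then reports every IP subnet whose members sit on more than one switch. It assumes the two switches are not interconnected, which is what the separate iblinkinfo listings suggest; `ATTACHMENTS` and `split_subnets` are illustrative names.

```python
import ipaddress
from collections import defaultdict

# (node, interface, address/prefix, switch), transcribed from the output above.
ATTACHMENTS = [
    ("node1", "ib0", "192.168.22.12/24", "jfyl-ib-switches-1"),
    ("node1", "ib1", "192.168.23.12/24", "jfyl-ib-switches-1"),
    ("node1", "ib2", "192.168.24.12/24", "jfyl-ib-switches-2"),
    ("node1", "ib3", "192.168.25.12/24", "jfyl-ib-switches-2"),
    ("node2", "ib0", "192.168.22.94/24", "jfyl-ib-switches-1"),
    ("node2", "ib1", "192.168.23.94/24", "jfyl-ib-switches-1"),
    ("node2", "ib2", "192.168.24.94/24", "jfyl-ib-switches-2"),
    ("node2", "ib3", "192.168.25.94/24", "jfyl-ib-switches-2"),
    ("node3", "ib0", "192.168.22.22/24", "jfyl-ib-switches-2"),
    ("node3", "ib1", "192.168.24.22/24", "jfyl-ib-switches-1"),
    ("node4", "ib0", "192.168.22.23/24", "jfyl-ib-switches-2"),
    ("node4", "ib1", "192.168.24.23/24", "jfyl-ib-switches-1"),
]


def split_subnets(attachments):
    """Return {subnet: {switch: [node:iface, ...]}} for subnets spanning several switches."""
    by_subnet = defaultdict(lambda: defaultdict(list))
    for node, iface, cidr, switch in attachments:
        subnet = ipaddress.ip_interface(cidr).network
        by_subnet[subnet][switch].append(f"{node}:{iface}")
    return {
        str(subnet): dict(by_switch)
        for subnet, by_switch in by_subnet.items()
        if len(by_switch) > 1
    }


for subnet, by_switch in split_subnets(ATTACHMENTS).items():
    print(subnet)
    for switch, members in by_switch.items():
        print(f"  {switch}: {', '.join(members)}")
```

With the data above, only 192.168.22.0/24 and 192.168.24.0/24 are reported, and in both cases the compute nodes are on one switch while the storage nodes are on the other.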
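With that, the cause of the network failure was found. After the IB cables from the storage layer to the switches were reconnected, communication between the compute layer and the storage layer inside the cluster returned to normal.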