Debugging a 10Gb PCIe NIC limited to 1Gb on the NVIDIA Xavier platform
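1. Background
A 10Gb Ethernet card is installed in the Xavier's PCIe slot.
When traffic exceeds 1Gb/s, a large number of packets are dropped.
netstat confirms that the packets are dropped at the interface.
The system appears to identify the card as 10Gb, but it bottlenecks at 1Gb.
SDK version: JetPack 4.1
Software: a benchmark I/O routine provided by the radio manufacturer. The card is an Intel X520-DA2; ethtool on both GbE interfaces reports firmware version 0x61c10001 and driver ixgbe version 4.6.4.
2. Node debugging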
ifconfig output:
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 128.112.3.3 netmask 255.255.0.0 broadcast 128.112.255.255
inet6 fe80::f97f:4b79:ec64:cea7 prefixlen 64 scopeid 0x20
ether 00:04:4b:cb:9b:a5 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 59 bytes 6200 (6.2 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 40
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet 192.168.30.1 netmask 255.255.255.0 broadcast 192.168.30.255
inet6 fe80::1e3:fc5f:89f3:c358 prefixlen 64 scopeid 0x20
ether 90:e2:ba:f2:1c:18 txqueuelen 1000 (Ethernet)
RX packets 13 bytes 780 (780.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 59 bytes 6266 (6.2 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet 192.168.40.1 netmask 255.255.255.0 broadcast 192.168.40.255
inet6 fe80::79dd:b1f:cdcb:4036 prefixlen 64 scopeid 0x20
ether 90:e2:ba:f2:1c:19 txqueuelen 1000 (Ethernet)
RX packets 13 bytes 780 (780.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 56 bytes 6032 (6.0 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
l4tbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 192.168.55.1 netmask 255.255.255.0 broadcast 192.168.55.255
inet6 fe80::1 prefixlen 128 scopeid 0x20
inet6 fe80::5ce0:5eff:fe90:c37 prefixlen 64 scopeid 0x20
ether 52:ea:3b:35:a5:d6 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6 bytes 534 (534.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10
loop txqueuelen 1 (Local Loopback)
RX packets 627 bytes 39815 (39.8 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 627 bytes 39815 (39.8 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
rndis0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether a2:ea:33:5c:f6:29 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
usb0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 52:ea:3b:35:a5:d6 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
route output:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
128.112.0.0     0.0.0.0         255.255.0.0     U     102    0        0 eth0
link-local      0.0.0.0         255.255.0.0     U     1000   0        0 l4tbr0
192.168.30.0    0.0.0.0         255.255.255.0   U     100    0        0 eth1
192.168.40.0    0.0.0.0         255.255.255.0   U     101    0        0 eth2
192.168.55.0    0.0.0.0         255.255.255.0   U     0      0        0 l4tbr0
ethtool eth1 output:
Settings for eth1:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Advertised link modes: 10000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: external
Auto-negotiation: off
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
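The ethtool output above is the main evidence that the link itself came up at 10Gb/s, so it can be useful to pull out just that field when re-checking eth1 and eth2. The following is a minimal sketch, assuming `ethtool <iface>` prints a `Speed:` line in the form shown above; the function name `link_speed` is hypothetical.

```shell
# Print the negotiated speed from `ethtool <iface>` output read on stdin.
# Splits each line on ": " and prints the value of the "Speed" field.
link_speed() {
  awk -F': *' '/Speed:/ { print $2 }'
}

# Usage (needs ethtool and a real interface):
#   ethtool eth1 | link_speed
```

A reported 10000Mb/s only describes the negotiated link rate; it says nothing about whether the host can actually absorb traffic at that rate.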
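3. Disabling the USB device mode bridge
The "192.168.55.0" bridge can be removed to simplify the setup.
This bridge is actually part of the USB gadget mode sample code.
If you look in "/opt/nvidia/l4t-usb-device-mode/", you can see how the USB port is used to emulate a mass storage device and an Ethernet card. This can be disabled without harm, and should be disabled on most systems.
The following command shows the two files that activate it at boot: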
ls -l `find /etc/systemd -type l` | grep opt
Remove these two symbolic links:
sudo rm /etc/systemd/system/multi-user.target.wants/nv-l4t-usb-device-mode.service
sudo rm /etc/systemd/system/nv-l4t-usb-device-mode.service
Then reboot; the USB device mode bridge will no longer start.
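The symlink check can also be done without parsing `ls -l` output, which makes it easy to repeat after the reboot to confirm nothing is left. This is only a sketch, assuming the links live under /etc/systemd and point into /opt as described above; `links_into_opt` is a hypothetical helper name, and it only lists links, it does not remove anything.

```shell
# List symlinks under a directory whose target path contains "/opt/".
# $1 = directory to search (defaults to /etc/systemd).
# Prints "link -> target" for each match.
links_into_opt() {
  dir="${1:-/etc/systemd}"
  find "$dir" -type l | while IFS= read -r link; do
    target=$(readlink "$link")
    case "$target" in
      */opt/*) printf '%s -> %s\n' "$link" "$target" ;;
    esac
  done
}

# Usage:
#   links_into_opt /etc/systemd
```

If the function prints nothing after the two links are removed, no boot-time unit under /etc/systemd points into /opt any more.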
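After removing USB gadget mode, rerun the performance test above for a longer time and with more traffic, then capture the ifconfig output for eth1 and eth2 again; more traffic gives a better indication.
Also try the same test through a network switch instead of a direct connection.
After disabling the USB device mode bridge as described above, there was still no improvement: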
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 128.112.3.3 netmask 255.255.0.0 broadcast 128.112.255.255
inet6 fe80::f97f:4b79:ec64:cea7 prefixlen 64 scopeid 0x20
ether 00:04:4b:cb:9b:a5 txqueuelen 1000 (Ethernet)
RX packets 955 bytes 81191 (81.1 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 835 bytes 79910 (79.9 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 40
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet 192.168.30.1 netmask 255.255.255.0 broadcast 192.168.30.255
inet6 fe80::1e3:fc5f:89f3:c358 prefixlen 64 scopeid 0x20
ether 90:e2:ba:f2:1c:18 txqueuelen 1000 (Ethernet)
RX packets 3189675 bytes 25139723250 (25.1 GB)
RX errors 0 dropped 139847 overruns 0 frame 0
TX packets 63443 bytes 5196484 (5.1 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet 192.168.40.1 netmask 255.255.255.0 broadcast 192.168.40.255
inet6 fe80::79dd:b1f:cdcb:4036 prefixlen 64 scopeid 0x20
ether 90:e2:ba:f2:1c:19 txqueuelen 1000 (Ethernet)
RX packets 3286221 bytes 25114496876 (25.1 GB)
RX errors 0 dropped 143452 overruns 0 frame 0
TX packets 163941 bytes 11025129 (11.0 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10
loop txqueuelen 1 (Local Loopback)
RX packets 10208 bytes 627832 (627.8 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 10208 bytes 627832 (627.8 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
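To compare runs it helps to turn the raw counters into a drop ratio per interface. The sketch below assumes the `ifconfig` layout shown above (an interface line containing `flags=`, followed by `RX packets N ...` and `RX errors N dropped N ...` lines). The function name `rx_drop_pct` is hypothetical, and dividing dropped by RX packets is only a rough indicator, since whether dropped packets are also counted in "RX packets" is not established here.

```shell
# Read `ifconfig` output on stdin and print "<iface> <percent>" where
# percent = 100 * RX dropped / RX packets (0 when no packets were received).
rx_drop_pct() {
  awk '
    $2 ~ /^flags=/ { iface = $1; sub(/:$/, "", iface) }        # e.g. "eth1:"
    $1 == "RX" && $2 == "packets" { pk = $3 }                   # RX packets N
    $1 == "RX" && $2 == "errors" && $4 == "dropped" {           # RX errors N dropped N
      dr = $5
      printf "%s %.2f\n", iface, (pk > 0 ? 100 * dr / pk : 0)
    }'
}

# Usage:
#   ifconfig | rx_drop_pct
```

For the counters shown above this comes out at roughly 4.4% for both eth1 and eth2, and 0 for the other interfaces.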
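4. Analysis
eth0: operating normally.
eth1: RX is dropping a large number of packets. This could be an Ethernet problem or a problem on the end-user side.
It is probably not a hardware problem: there are no overruns, frame errors, or collisions, so I doubt this is a collision issue.
It is definitely not a bad reaction to interference from another network device.
eth2: same as eth1.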
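The `ifconfig` counters only say that packets were dropped, not where. One way to narrow this down is to look at the per-driver statistics from `ethtool -S`, keeping only the loss-related counters that are nonzero. The sketch below assumes the usual `name: value` layout of `ethtool -S` output; the counter names depend on the driver, so the name pattern used here is a guess rather than a list taken from ixgbe, and `nonzero_loss_counters` is a hypothetical helper name.

```shell
# Read `ethtool -S <iface>` output on stdin and print, as "name=value",
# only counters whose name contains drop, miss, err, or no_ and whose
# value is greater than zero.
nonzero_loss_counters() {
  awk -F': *' '
    tolower($1) ~ /drop|miss|err|no_/ && $2 + 0 > 0 {
      name = $1
      gsub(/^[ \t]+/, "", name)   # strip leading indentation
      printf "%s=%s\n", name, $2
    }'
}

# Usage (needs ethtool and a real interface):
#   ethtool -S eth1 | nonzero_loss_counters
```

Running it before and after a test and comparing the two outputs shows which counter grows together with the RX drops.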
5. Checking the 10GbE interface interrupts
ifconfig eth0
device interrupt 40
On the other hand, I also see two other entries:
42: ... 2490000.ether_qos.rx0
43: ... 2490000.ether_qos.tx0
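The interrupt listed by "ifconfig" is probably specific to that hardware, and tx/rx probably relate to data source/sink.
2490000 would be the address of the controller hardware.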
sudo find /sys -name '*2490000*'
This finds entries related to the controller, including ether_qos:
/sys/kernel/iommu_groups/4/devices/2490000.ether_qos
/sys/bus/platform/devices/2490000.ether_qos
sudo find /sys -name 'eth0'
This gives further confirmation:
/sys/devices/2490000.ether_qos/net/eth0
/sys/class/net/eth0
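The mapping from an interface name to its backing device can also be read directly from sysfs instead of searching with `find`. This is a sketch, assuming the standard `/sys/class/net/<iface>/device` symlink is present (virtual interfaces such as lo do not have one); `iface_device` is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a test directory.

```shell
# Print the name of the device directory backing a network interface.
# $1 = interface name, $2 = sysfs root (defaults to /sys).
# Returns 1 if the interface has no "device" link.
iface_device() {
  root="${2:-/sys}"
  [ -e "$root/class/net/$1/device" ] || return 1
  basename "$(readlink -f "$root/class/net/$1/device")"
}

# Usage:
#   iface_device eth0
#   iface_device eth1
```

For eth0 on this system the result should match the `2490000.ether_qos` directory found above; the name printed for eth1 is the string to look for in /proc/interrupts.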
sudo -s
cd /sys/class/net
ls -l eth1
# The name of the file pointed to should contain some sort of identifier for the driver,
# e.g., in the case of my eth0 I see the controller address concatenated with "ether_qos".
# So I look for "ether_qos" in "/proc/interrupts":
egrep ether_qos /proc/interrupts
40: 8696 0 0 0 0 0 0 0 GICv2 226 Level ether_qos.common_irq
42: 3737 0 0 0 0 0 0 0 GICv2 222 Level 2490000.ether_qos.rx0
43: 2114 0 0 0 0 0 0 0 GICv2 218 Level 2490000.ether_qos.tx0
The query above shows that the interrupts are also normal.
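In the output above every count sits in the first CPU column. To see at a glance how the interrupts of a given device are spread across cores, the lines can be summarized as below. This is a sketch, assuming the usual /proc/interrupts layout with a header row of CPU names followed by one row per IRQ; `irq_summary` is a hypothetical helper name.

```shell
# Read /proc/interrupts on stdin and, for each line matching the pattern
# in $1, print: IRQ number, IRQ name, total count, and the number of CPUs
# that handled at least one interrupt.
irq_summary() {
  awk -v pat="$1" '
    NR == 1 { ncpu = NF; next }            # header row: one field per CPU
    $0 ~ pat {
      total = 0; busy = 0
      for (i = 2; i <= ncpu + 1; i++) {    # per-CPU counts follow the IRQ number
        total += $i
        if ($i > 0) busy++
      }
      printf "%s %s total=%d cpus_used=%d\n", $1, $NF, total, busy
    }'
}

# Usage:
#   irq_summary ether_qos < /proc/interrupts
```

A `cpus_used` value of 1 means a single core services that interrupt line; whether that matters for the dropped packets is not established by this article.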
6. Trying an Ubuntu update
After updating the operating system to Ubuntu 18, all of the problems above were resolved.
The cause is not yet clear.