Debugging PCIe link-speed adjustment on the NVIDIA Xavier NX platform
1. Introduction
How can the maximum PCIe speed be increased on the Jetson Xavier? The link appears to be limited to 2.5 GT/s, even though the Xavier should be able to run at 8 GT/s. I am using JetPack 4.5.
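The link capability and the currently negotiated speed can be read with lspci. On my board the root port in question sits at bus address 0004:00:00.0, so a command along these lines prints the LnkCap/LnkSta fields shown in the excerpt below (the address may differ on other carrier boards):
sudo lspci -vv -s 0004:00:00.0 | grep -E 'LnkCap:|LnkSta:'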
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1) (prog-if 00 [Normal decode])
LnkCap: Port #0, Speed 8GT/s, Width x1, ASPM not supported, Exit Latency L0s <1us, L1 <64us
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
Here is the full root-port listing when no device is connected to the NX:
0004:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad1 (rev a1) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 33
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: 00001000-00001fff
Memory behind bridge: 40000000-400fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x1, ASPM not supported, Exit Latency L0s <1us, L1 <64us
ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
RootCap: CRSVisible+
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
Vector table: BAR=2 offset=00000000
PBA: BAR=2 offset=00010000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [148 v1] #19
Capabilities: [168 v1] #26
Capabilities: [18c v1] #27
Capabilities: [1ac v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=60us
L1SubCtl2: T_PwrOn=60us
Capabilities: [1bc v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
Capabilities: [2bc v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
Capabilities: [2f4 v1] #25
Capabilities: [300 v1] Precision Time Measurement
PTMCap: Requester:+ Responder:+ Root:+
PTMClockGranularity: 16ns
PTMControl: Enabled:- RootSelected:-
PTMEffectiveGranularity: Unknown
Capabilities: [30c v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
Kernel driver in use: pcieport
2. Consulting the documentation
The Jetson Xavier actually supports Gen-4 speed (16 GT/s), and that is the default configuration: when a Gen-4-capable device is attached, the link comes up at Gen-4. Otherwise the link speed depends on what is connected to the root port, and the final speed is negotiated with the device side. The speed can also be changed at runtime with the following script, pcie_set_speed.sh:
#!/bin/bash
# Usage: pcie_set_speed.sh <device> [speed]
#   <device> is a PCI address such as 0004:00:00.0; <speed> is the PCIe
#   link-speed code (1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s, 4 = 16 GT/s).

dev=$1
speed=$2

if [ -z "$dev" ]; then
    echo "Error: no device specified"
    exit 1
fi

# Accept short addresses such as 01:00.0 by prepending the default domain.
if [ ! -e "/sys/bus/pci/devices/$dev" ]; then
    dev="0000:$dev"
fi

if [ ! -e "/sys/bus/pci/devices/$dev" ]; then
    echo "Error: device $dev not found"
    exit 1
fi

# Device/port type from the PCI Express Capabilities register. For endpoints
# (0), legacy endpoints (1) and switch upstream ports (5), the target speed
# has to be written to the upstream bridge instead.
pciec=$(setpci -s $dev CAP_EXP+02.W)
pt=$((("0x$pciec" & 0xF0) >> 4))
port=$(basename $(dirname $(readlink "/sys/bus/pci/devices/$dev")))
if (($pt == 0)) || (($pt == 1)) || (($pt == 5)); then
    dev=$port
fi

# Link Capabilities (offset 0x0c) and Link Status (offset 0x12).
lc=$(setpci -s $dev CAP_EXP+0c.L)
ls=$(setpci -s $dev CAP_EXP+12.W)
max_speed=$(("0x$lc" & 0xF))

echo "Link capabilities:" $lc
echo "Max link speed:" $max_speed
echo "Link status:" $ls
echo "Current link speed:" $(("0x$ls" & 0xF))

# Default to the maximum supported speed and clamp the requested value.
if [ -z "$speed" ]; then
    speed=$max_speed
fi
if (($speed > $max_speed)); then
    speed=$max_speed
fi

echo "Configuring $dev..."

# Write the target link speed into Link Control 2 (offset 0x30).
lc2=$(setpci -s $dev CAP_EXP+30.L)
echo "Original link control 2:" $lc2
echo "Original link target speed:" $(("0x$lc2" & 0xF))
lc2n=$(printf "%08x" $((("0x$lc2" & 0xFFFFFFF0) | $speed)))
echo "New target link speed:" $speed
echo "New link control 2:" $lc2n
setpci -s $dev CAP_EXP+30.L=$lc2n

echo "Triggering link retraining..."

# Set the Retrain Link bit (0x20) in Link Control (offset 0x10).
lc=$(setpci -s $dev CAP_EXP+10.L)
echo "Original link control:" $lc
lcn=$(printf "%08x" $(("0x$lc" | 0x20)))
echo "New link control:" $lcn
setpci -s $dev CAP_EXP+10.L=$lcn

sleep 0.1

# Re-read Link Status to report the newly negotiated speed.
ls=$(setpci -s $dev CAP_EXP+12.W)
echo "Link status:" $ls
echo "Current link speed:" $(("0x$ls" & 0xF))
Is there a more permanent way to change the PCIe speed, rather than running this script every time?
3. Installing an 8 GT/s device
Without running any script at all: when nothing is connected there is nothing to negotiate with, so the link speed stays at 2.5 GT/s. Once an 8 GT/s device is installed, the link trains up accordingly. Here is a snippet from an NVMe device running at 8 GT/s x4:
0005:01:00.0 Non-Volatile memory controller: Micron/Crucial Technology Device 540a (rev 01) (prog-if 02 [NVM Express])
Subsystem: Micron/Crucial Technology Device 540a
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 35
IOMMU group: 61
Region 0: Memory at 1f40000000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [80] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #1, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x4 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
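Which root port the NVMe device hangs off can be read straight from sysfs (essentially the same parent lookup the script performs), or from the PCI topology view; the address below is the NVMe device from the listing above:
readlink -f /sys/bus/pci/devices/0005:01:00.0
lspci -tv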
And here is the root port (bridge) it is connected to:
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 35
IOMMU group: 60
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: 0000f000-00000fff [disabled]
Memory behind bridge: 40000000-400fffff [size=1M]
Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff [disabled]
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM not supported
ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
LnkSta: Speed 8GT/s (downgraded), Width x4 (downgraded)
TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
4. Debugging the mining program
When the mining program starts, it checks the memory space reachable over PCIe. If I change nothing on the Xavier NX, I cannot mine anything, because I get this message:
cuda-0 Using Pci Id : 00:00.0 Xavier (Compute 7.2) Memory : 2.5 GB
The process needs at least 4.2 GB to generate the DAG. If I change the PCIe speed and then run the mining process again, I get the following message instead:
cuda-0 Using Pci Id : 00:00.0 Xavier (Compute 7.2) Memory : 6.19 GB
This time the mining process runs successfully, because it now has enough memory to generate the DAG. So, one way or another, there is a connection between the PCIe link speed and being able to run the mining process on this card.
5. Adjusting the device tree
There is a device-tree property named "nvidia,init-speed" that controls the link speed a PCIe controller initializes at. It can be added to the PCIe controller nodes, for example with a device-tree overlay; the value is the PCIe link-speed code (3 for 8 GT/s, 4 for 16 GT/s):
pcie@14160000 {
    nvidia,init-speed = <3>;
};

pcie@141a0000 {
    nvidia,init-speed = <4>;
};
The method just described involves building a new dtb that the kernel loads at boot.
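A rough sketch of that workflow on a stock L4T install, assuming the FDT entry in /boot/extlinux/extlinux.conf can be pointed at a custom blob (the .dtb filename below is only an example; use the blob that matches your board and release):
# Decompile the current blob to source (example path, adjust for your board)
dtc -I dtb -O dts -o tegra194-nx.dts /boot/dtb/kernel_tegra194-p3668-all-p3509-0000.dtb
# Edit tegra194-nx.dts and add "nvidia,init-speed = <3>;" (or <4>) to the
# pcie@14160000 / pcie@141a0000 nodes, then recompile:
dtc -I dts -O dtb -o tegra194-nx-pcie.dtb tegra194-nx.dts
# Point the FDT line in /boot/extlinux/extlinux.conf at the new blob and reboot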
The simplest way to get the speed set on every boot, however, is to run the pcie_set_speed.sh script automatically at startup. That is easy to do with a systemd service. Save the following as /etc/systemd/system/pcie_set_speed.service:
[Unit]
Description=Set PCIe Speed
[Service]
Type=oneshot
# The script needs its device and speed arguments here as well; this example
# targets the root port 0004:00:00.0 with speed code 3 (8 GT/s).
ExecStart=/root/pcie_set_speed.sh 0004:00:00.0 3
[Install]
WantedBy=sysinit.target
Then copy the pcie_set_speed.sh script to /root/ and make sure it is executable. Now run:
$ sudo systemctl daemon-reload
$ sudo systemctl enable pcie_set_speed
$ sudo systemctl start pcie_set_speed
The configuration is now in place.
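After a reboot, whether the service ran and the link actually retrained can be checked with, for example:
systemctl status pcie_set_speed.service
sudo lspci -vv -s 0004:00:00.0 | grep LnkSta: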