Bug #325

Asus P5QL-EM: NVIDIA card in PCIe X1 port not working

Added by Francois Ollonois about 2 months ago. Updated about 2 months ago.

Status: New
Start date: 12/03/2021
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

When my NVIDIA GeForce GT 710 is installed in the PCIe X1 slot, the screen stops working once the graphics driver is loaded. With the nomodeset parameter the screen works, but without the driver I can't start Xorg.
The same card works in the X16 slot, but I want to use the X16 slot on this board for an NVMe adapter card.
I discussed the problem with the nouveau project. They suspect some sort of DMA problem and found differences between the vendor firmware and coreboot that may be the cause.
They also suggested some kernel parameters as a workaround, but even with these parameters the screen flickers when using Xorg.
With the original vendor firmware the card works in the X1 slot.
I use this card:
https://www.asus.com/Motherboards-Components/Graphics-Cards/ASUS/GT710-4H-SL-2GD5/

Here is the link to the nouveau ticket.
https://gitlab.freedesktop.org/drm/nouveau/-/issues/132

I will also attach the kernel logs, the lspci output, and the cbmem output.
Here is some information about the differences:

Coreboot:
    Latency: 0, Cache Line Size: 64 bytes
    Capabilities: [100 v1] Virtual Channel
            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff

Vendor:
    Latency: 0, Cache Line Size: 32 bytes
    Capabilities: [100 v1] Virtual Channel
            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01

The host bridge also has SERR+ on coreboot vs SERR- on vendor.
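
For anyone comparing raw register dumps (for example from `lspci -xxx`) instead of the decoded output, the three fields above can be decoded as in the following rough sketch. The function names are hypothetical; the bit positions are the standard ones (Cache Line Size at config offset 0x0C counted in 32-bit words, SERR# Enable as bit 8 of the Command register at offset 0x04, and the low byte of the VC Resource Control register as the TC/VC map). The example values are chosen to match the decoded output quoted above, not read from the attached logs.

```python
def cache_line_bytes(reg_value: int) -> int:
    """Cache Line Size register (config offset 0x0C) counts 32-bit words."""
    return (reg_value & 0xFF) * 4


def serr_enabled(command_reg: int) -> bool:
    """SERR# Enable is bit 8 of the PCI Command register (offset 0x04)."""
    return bool(command_reg & (1 << 8))


def mapped_traffic_classes(tc_vc_map: int) -> list:
    """Bit n of the TC/VC map means traffic class n is routed to this VC."""
    return [tc for tc in range(8) if tc_vc_map & (1 << tc)]


# Illustrative register values only (assumed, not taken from the logs):
# 0x10 words -> 64 bytes (coreboot), 0x08 words -> 32 bytes (vendor).
print(cache_line_bytes(0x10), cache_line_bytes(0x08))
# TC/VC=ff maps all eight traffic classes to VC0, TC/VC=01 only TC0.
print(mapped_traffic_classes(0xFF), mapped_traffic_classes(0x01))
```

Read this way, TC/VC=ff on coreboot routes every traffic class to VC0, while the vendor firmware routes only TC0.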

The root ports also show many differences, including:

Coreboot:
    Capabilities: [40] Express (v1) Root Port (Slot-), MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0
            ExtTag- RBE+
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
        LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
            ClockPM- Surprise- LLActRep+ BwNot- ASPMOptComp-
        LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s (ok), Width x0 (downgraded)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

Vendor:
    Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0
            ExtTag- RBE+
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
        LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <256ns, L1 <4us
            ClockPM- Surprise- LLActRep+ BwNot- ASPMOptComp-
        LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s (ok), Width x0 (downgraded)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
            Slot #0, PowerLimit 10.000W; Interlock- NoCompl-
        SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
            Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
        SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
            Changed: MRL- PresDet- LinkState-
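
Spotting the differing lines by eye in listings like these is error-prone, so a small script can help. The following is a minimal sketch using Python's standard `difflib`; the function name `diff_lspci` is hypothetical, and it assumes both dumps were taken for the same device with the same `lspci` options. It prints only the lines that differ, prefixed with `- ` (coreboot) or `+ ` (vendor).

```python
import difflib
from typing import List


def diff_lspci(coreboot_text: str, vendor_text: str) -> List[str]:
    """Return only the lines that differ between two lspci excerpts."""
    # Drop blank lines and indentation so layout changes are not reported.
    a = [ln.strip() for ln in coreboot_text.splitlines() if ln.strip()]
    b = [ln.strip() for ln in vendor_text.splitlines() if ln.strip()]
    # ndiff marks removed lines with "- " and added lines with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are filtered out.
    return [ln for ln in difflib.ndiff(a, b) if ln.startswith(("- ", "+ "))]


# Short excerpts copied from the listings in this report.
coreboot = """
    LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
    LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
    LnkSta: Speed 2.5GT/s (ok), Width x0 (downgraded)
"""
vendor = """
    LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <256ns, L1 <4us
    LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
    LnkSta: Speed 2.5GT/s (ok), Width x0 (downgraded)
"""

for line in diff_lspci(coreboot, vendor):
    print(line)
```

For the full attachments, the same function could be fed the contents of lspci_coreboot_root2.txt and lspci_vendor.txt, ideally cut down to one device at a time.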

cbmem_p5ql-em.log (43.5 KB) Francois Ollonois, 12/03/2021 01:33 PM

kernel_log_X1.txt (56.9 KB) Francois Ollonois, 12/03/2021 01:33 PM

kernel_log_X1_with_para.txt (57.3 KB) Francois Ollonois, 12/03/2021 01:33 PM

kernel_log_X1_nomodeset.txt (55.1 KB) Francois Ollonois, 12/03/2021 01:33 PM

kernel_log_X16.txt (56.7 KB) Francois Ollonois, 12/03/2021 01:33 PM

lspci_coreboot_root2.txt (64.3 KB) Francois Ollonois, 12/03/2021 01:33 PM

lspci_vendor.txt (61.9 KB) Francois Ollonois, 12/03/2021 01:33 PM

p5ql-em.config (17.2 KB) Francois Ollonois, 12/03/2021 02:22 PM

dmesg_iommu_off.txt (60 KB) Francois Ollonois, 12/03/2021 04:55 PM

History

#2 Updated by Nico Huber about 2 months ago

Just a hunch as DMA was mentioned: please try with iommu=off in the kernel command line. If that helps, please provide a kernel log booted with vendor BIOS.
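
To rule out a bootloader typo when testing this, it may help to confirm that the parameter actually reached the running kernel. A minimal sketch, assuming a Linux system that exposes the boot command line in `/proc/cmdline`; both helper names are hypothetical.

```python
def has_kernel_param(cmdline: str, param: str) -> bool:
    """True if `param` appears as a whole word on the kernel command line."""
    return param in cmdline.split()


def running_kernel_has(param: str, path: str = "/proc/cmdline") -> bool:
    """Check the command line of the currently booted kernel (Linux only)."""
    with open(path) as f:
        return has_kernel_param(f.read(), param)


# Example with a made-up command line; on the real machine one would call
# running_kernel_has("iommu=off") instead.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro iommu=off"
print(has_kernel_param(sample, "iommu=off"))
```

Matching whole words avoids a false positive from a longer parameter that merely contains the same text.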

#3 Updated by Francois Ollonois about 2 months ago

Nico Huber wrote:

> Just a hunch as DMA was mentioned: please try with iommu=off in the kernel command line. If that helps, please provide a kernel log booted with vendor BIOS.

With iommu=off I get a little further, but the screen still gets stuck. The kernel log looks slightly different, with some additional USB-related errors.

I have attached the log (dmesg_iommu_off.txt).
