This contains notes about using the "Kintex UltraScale XCKU5P PCIE 3.0 QSFP X 2 or SFP X 2 xilinx board xilinx fpga board xilixn fpga development board pcie board" with Bundle: KU5P full height QSFP.
Reading the Device DNA over JTAG and looking it up using AMD Device Lookup reports:
Device | XCKU5P |
Package-Pin | FFVB676 |
Revision Code | AAZ |
SpeedGrade | 2I |
No schematic or PCB layout is available for the board. On asking the seller after purchase, got sent an email containing:
- Spreadsheet with the XCKU5P pinout. Typo in that the `pcie_p_tx6` and `pcie_n_tx6` pins are set to the pins for `pcie_p_rx6` and `pcie_n_rx6`. Noticed when checking the IO Placed Report of a design. The XDC file used only the `rxp` pins for each lane.
- Picture of the Add Configuration Memory Device dialog in Vivado with name `mt25qu256-spi-x1_x2_x4`.
- A `pcie4_uscale_plus_0_ex` Vivado v2021.1 project. The part is `xcku5p-ffvb676-2-i`. The design has:
  - An UltraScale+ Integrated Block (PCIE4) for PCI Express (1.3) IP block set for 8 lanes and 8.0 GT/s and one 128K BAR.
  - Some Verilog example code which implements 2 Kbytes of memory space.
  - Only the PCIe signals connected.
- An `ibert_ultrascale_gty_0_ex` Vivado v2021.1 project. The part is `xcku5p-ffvb676-2-i`. Can re-generate the bitstream in Vivado 2021.1 under Ubuntu. The design has:
  - IBERT Ultrascale GTY (1.3) with 8 lanes with a Line Rate of 10.3125 Gbps.
  - RefClk of 156.25 MHz provided on GTY Refclk, bank 227 clk1.
  - System Clock of 100 MHz provided on a single ended input.
  - Top level fixing `QSFP_RESET_A` and `QSFP_RESET_B` output signals as `1`.
There is a Micron device with `RW167` FBGA code on the board. The Micron FBGA and component marking decoder identifies this as MT25QU256ABA1EW9-0SIT, which matches the part shown in the Add Configuration Memory Device Vivado dialog. The I/O voltage is 1.8V. MT25QL256ABA1EW9-0SIT on the Micron website has a download link for the MT25QU256ABA datasheet updated 2024-02-21, but registration is needed to download it. MT25QU256ABA on the Mouser website is Rev. L – 02/2024 of the datasheet.
The PCIE section of the pinout spreadsheet has the `RESET` on pin `T19` but doesn't describe the bank voltage.
The `PCIE_5P/PCIE/pcie4_uscale_plus_0_ex/imports/xilinx_pcie4_uscale_plus_x0y0.xdc` file from the example project has:
create_clock -period 10.000 -name sys_clk [get_ports sys_clk_p]
#
set_false_path -from [get_ports sys_rst_n]
set_property PULLUP true [get_ports sys_rst_n]
set_property IOSTANDARD LVCMOS18 [get_ports sys_rst_n]
#
#create_waiver -type METHODOLOGY -id {LUTAR-1} -user "pcie4_uscale_plus" -desc "user link up is synchroized in the user clk so it is safe to ignore" -internal -scoped -tags 1024539 -objects [get_cells { pcie_app_uscale_i/PIO_i/len_i[5]_i_4 }] -objects [get_pins { pcie4_uscale_plus_0_i/inst/user_lnk_up_cdc/arststages_ff_reg[0]/CLR pcie4_uscale_plus_0_i/inst/user_lnk_up_cdc/arststages_ff_reg[1]/CLR }]
set_property PACKAGE_PIN T19 [get_ports sys_rst_n]
This enumerated on the PCIe bus with the following, taken after running `bind_xilinx_devices_to_vfio.sh`:
$ sudo lspci -d 10ee: -vv
[sudo] password for mr_halfword:
01:00.0 Memory controller: Xilinx Corporation Device 9038
Subsystem: Xilinx Corporation Device 0007
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
IOMMU group: 0
Region 0: Memory at fe300000 (32-bit, non-prefetchable) [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D3 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s (downgraded), Width x8 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range BC, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [1c0 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Kernel driver in use: vfio-pci
$ dump_info/dump_pci_info_pciutils
domain=0000 bus=01 dev=00 func=00
vendor_id=10ee (Xilinx Corporation) device_id=9038 (Device 9038) subvendor_id=10ee subdevice_id=0007
iommu_group=0
driver=vfio-pci
control: I/O- Mem- BusMaster- ParErr- SERR- DisINTx-
status: INTx- <ParErr- >TAbort- <TAbort- <MAbort- >SERR- DetParErr-
bar[0] base_addr=fe300000 size=20000 is_IO=0 is_prefetchable=0 is_64=0
Capabilities: [40] Power Management
Capabilities: [48] Message Signaled Interrupts
Capabilities: [70] PCI Express v2 Express Endpoint, MSI 0
Link capabilities: Max speed 8.0 GT/s Max width x8
Negotiated link status: Current speed 5.0 GT/s Width x8
Link capabilities2: Supported link speeds 2.5 GT/s 5.0 GT/s 8.0 GT/s
DevCap: MaxPayload 1024 bytes PhantFunc 0 Latency L0s Maximum of 64 ns L1 Maximum of 1 μs
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port # 0 ASPM not supported
L0s Exit Latency More than 4 μs
L1 Exit Latency More than 64 μs
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- ABWMgmt-
LnkSta: TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
domain=0000 bus=00 dev=01 func=00
vendor_id=8086 (Intel Corporation) device_id=0101 (Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port)
driver=pcieport
control: I/O+ Mem+ BusMaster+ ParErr- SERR- DisINTx+
status: INTx- <ParErr- >TAbort- <TAbort- <MAbort- >SERR- DetParErr-
Capabilities: [88] Bridge subsystem vendor/device ID
Capabilities: [80] Power Management
Capabilities: [90] Message Signaled Interrupts
Capabilities: [a0] PCI Express v2 Root Port, MSI 0
Link capabilities: Max speed 5.0 GT/s Max width x16
Negotiated link status: Current speed 5.0 GT/s Width x8
Link capabilities2: Not implemented
DevCap: MaxPayload 128 bytes PhantFunc 0 Latency L0s Maximum of 64 ns L1 Maximum of 1 μs
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port # 2 ASPM L0s and L1
L0s Exit Latency 128 ns to less than 256 ns
L1 Exit Latency 2 μs to less than 4 μs
ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp-
LnkCtl: ASPM Disabled RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- ABWMgmt-
LnkSta: TrErr- Train- SlotClk+ DLActive- BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #0 PowerLimit 0.000W Interlock- NoCompl+
The negotiated width is x8 as expected.
The negotiated speed is 5 GT/s as expected. The FPGA design supports 2.5-8 GT/s, but the root port only supports a maximum of 5 GT/s.
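The expected link can be worked out as the highest speed and width which both ends support. The following is a minimal Python sketch of that arithmetic, not code from this repository: the function names are made up for illustration, the capabilities are copied from the `Link capabilities` lines above, and the bandwidth is the per-direction rate after line encoding (8b/10b at 2.5 and 5.0 GT/s, 128b/130b at 8.0 GT/s), ignoring TLP and DLLP overhead.

```python
# Line encoding efficiency for each link speed in GT/s.
ENCODING_EFFICIENCY = {2.5: 8 / 10, 5.0: 8 / 10, 8.0: 128 / 130}


def negotiated_link(endpoint, root_port):
    """Each argument is (max_speed_gts, max_width); returns the expected (speed, width).

    Simplification: assumes both ends support every speed and width up to their maximum.
    """
    return min(endpoint[0], root_port[0]), min(endpoint[1], root_port[1])


def raw_bandwidth_gbytes_per_sec(speed_gts, width):
    """Per-direction bandwidth in GB/s (10^9 bytes/s) after line encoding."""
    return speed_gts * ENCODING_EFFICIENCY[speed_gts] * width / 8


fpga_endpoint = (8.0, 8)   # Max speed 8.0 GT/s Max width x8
root_port = (5.0, 16)      # Max speed 5.0 GT/s Max width x16
speed, width = negotiated_link(fpga_endpoint, root_port)
print(f"expected link: {speed} GT/s x{width}, "
      f"about {raw_bandwidth_gbytes_per_sec(speed, width):.2f} GB/s per direction")
```

On that basis a drop from 5.0 GT/s to 2.5 GT/s at the same width would halve the available bandwidth.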
When running a program which uses VFIO to access the device, and which therefore performs a PCI reset during the open, the link speed drops to 2.5 GT/s:
$ dump_info/dump_pci_info_vfio
Opening device 0000:01:00.0 (10ee:9038) with IOMMU group 0
domain=0000 bus=01 dev=00 func=00
vendor_id=10ee (Xilinx Corporation) device_id=9038 (Device 9038) subvendor_id=10ee subdevice_id=0007
iommu_group=0
driver=vfio-pci
control: I/O- Mem+ BusMaster- ParErr- SERR- DisINTx-
status: INTx- <ParErr- >TAbort- <TAbort- <MAbort- >SERR- DetParErr-
bar[0] base_addr=fe300000 size=20000 is_IO=0 is_prefetchable=0 is_64=0
Capabilities: [40] Power Management
Capabilities: [48] Message Signaled Interrupts
Capabilities: [70] PCI Express v2 Express Endpoint, MSI 0
Link capabilities: Max speed 8.0 GT/s Max width x8
Negotiated link status: Current speed 2.5 GT/s Width x8
Link capabilities2: Supported link speeds 2.5 GT/s 5.0 GT/s 8.0 GT/s
DevCap: MaxPayload 1024 bytes PhantFunc 0 Latency L0s Maximum of 64 ns L1 Maximum of 1 μs
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port # 0 ASPM not supported
L0s Exit Latency More than 4 μs
L1 Exit Latency More than 64 μs
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- ABWMgmt-
LnkSta: TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
The drop in negotiated link speed from 5.0 GT/s to 2.5 GT/s, after the PCI reset caused by VFIO opening the device, has also been seen with a different FPGA card in the same PC, so the drop is likely due to something in the PC.
Also, sometimes the width enumerates as x1, but following a reboot of the PC it then enumerates as the expected x8 width. Seen after:
- Powering on the PC, with the FPGA loading the as-delivered bitstream from the configuration flash.
- Loading a new bitstream over JTAG.
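The speed and width shown as the negotiated link status come from the 16-bit Link Status register in the PCI Express capability: bits 3:0 hold the current link speed code and bits 9:4 the negotiated width. A minimal Python sketch of that decode follows. The function name is hypothetical and the raw register values are made up to illustrate the x8 and x1 cases, they were not read from this board.

```python
# Link speed codes as used by the standard Supported Link Speeds encoding.
LINK_SPEED_GTS = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0}


def decode_link_status(lnksta):
    """Return (speed_in_gts, width) from a raw 16-bit Link Status register value."""
    speed_code = lnksta & 0xF       # bits 3:0 Current Link Speed
    width = (lnksta >> 4) & 0x3F    # bits 9:4 Negotiated Link Width
    return LINK_SPEED_GTS.get(speed_code), width


# Illustrative values only; bit 12 (Slot Clock Configuration) is set in both.
for raw in (0x1082, 0x1011):
    speed, width = decode_link_status(raw)
    print(f"LnkSta=0x{raw:04x}: {speed} GT/s x{width}")
```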
The actual configuration flash is a Micron `MT25QU256ABA`. Running the driver, which was written for a Micron N25Q256A on a different board, was able to identify the flash and parse a valid configuration bitstream (with the as-delivered configuration contents):
$ xilinx_quad_spi/quad_spi_flasher
Opening device 0000:01:00.0 (10ee:9038) with IOMMU group 0
Enabled bus master for 0000:01:00.0
Displaying information for SPI flash using XCKU5P_DUAL_QSFP_dma_stream_loopback design in PCI device 0000:01:00.0 IOMMU group 0
Initial device identification incorrect - ignoring due to Quad SPI core not outputting initial clock cycles
FIFO depth=256
Flash device : Micron N25Q256A
Manufacturer ID=0x20 Memory Interface Type=0xbb Density=0x19
Flash Size Bytes=33554432 Page Size Bytes=256 Num Address Bytes=4
Successfully parsed bitstream of length 4958368 bytes with 299662 configuration packets
Read 4980736 bytes from SPI flash starting at address 0
Sync word at byte index 0x50
Type 1 packet opcode NOP
Type 1 packet opcode write register BSPI words 0000066C
Type 1 packet opcode write register CMD command BSPI_READ
Type 1 packet opcode NOP (2 consecutive)
Type 1 packet opcode write register TIMER words 00000000
Type 1 packet opcode write register WBSTAR words 00000000
Type 1 packet opcode write register CMD command NULL
Type 1 packet opcode NOP
Type 1 packet opcode write register CMD command RCRC
Type 1 packet opcode NOP (2 consecutive)
Type 1 packet opcode write register FAR words 00000000
Type 1 packet opcode write register RBCRC_SW words 00000000
Type 1 packet opcode write register COR0 words 383C3FE5
Type 1 packet opcode write register COR1 words 00400000
Type 1 packet opcode write register IDCODE KU5P
Type 1 packet opcode write register CMD command FALL_EDGE
Type 1 packet opcode write register CMD command SWITCH
Type 1 packet opcode NOP
Type 1 packet opcode write register MASK words 00000001
Type 1 packet opcode write register CTL0 words 00000101
Type 1 packet opcode write register MASK words 00001000
Type 1 packet opcode write register CTL1 words 00001000
Type 1 packet opcode NOP (8 consecutive)
Configuration data writes consisting of:
222217 NOPs
38213 FAR writes
371 WCFG commands
371 FDRI writes with a total of 97185 words
80 MFW commands
80 NULL commands
37842 MFWR writes with a total of 529788 words
25 Type 2 packets with a total of 274164 words
Type 1 packet opcode write register CRC words B6F9F409
Type 1 packet opcode NOP (2 consecutive)
Type 1 packet opcode write register CMD command GRESTORE
Type 1 packet opcode NOP (2 consecutive)
Type 1 packet opcode write register CMD command DGHIGH_LFRM
Type 1 packet opcode NOP (20 consecutive)
Type 1 packet opcode write register MASK words 00001000
Type 1 packet opcode write register CTL1 words 00000000
Type 1 packet opcode write register CMD command START
Type 1 packet opcode NOP
Type 1 packet opcode write register FAR words 07FC0000
Type 1 packet opcode write register MASK words 00000101
Type 1 packet opcode write register CTL0 words 00000101
Type 1 packet opcode write register CRC words 5FFE959E
Type 1 packet opcode NOP (2 consecutive)
Type 1 packet opcode write register CMD command DESYNC
Type 1 packet opcode NOP (393 consecutive)
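The packet listing above follows the UltraScale configuration packet format: a sync word of `AA995566`, then 32-bit big-endian words where bits 31:29 give the header type, bits 28:27 the opcode, and for Type 1 headers bits 26:13 the register address and bits 10:0 the word count (Type 2 headers use bits 26:0 as the word count). The following is a minimal Python sketch of that decode, not the parser used by `quad_spi_flasher`. The function names are hypothetical, only a subset of the registers is named, and the test fragment is hand-built to mimic the start of the listing rather than read from the flash.

```python
import struct

SYNC_WORD = bytes.fromhex("AA995566")
OPCODES = {0: "NOP", 1: "read", 2: "write"}
# Subset of the Type 1 register addresses (low 5 bits of the address field).
REGISTERS = {0x00: "CRC", 0x01: "FAR", 0x02: "FDRI", 0x04: "CMD", 0x05: "CTL0",
             0x06: "MASK", 0x09: "COR0", 0x0A: "MFWR", 0x0C: "IDCODE", 0x0E: "COR1",
             0x10: "WBSTAR", 0x11: "TIMER", 0x13: "RBCRC_SW", 0x18: "CTL1", 0x1F: "BSPI"}


def find_sync(data):
    """Byte index of the first sync word, or -1 if not present."""
    return data.find(SYNC_WORD)


def decode_header(word):
    """Decode one 32-bit packet header word into a dict."""
    header_type = (word >> 29) & 0x7
    opcode = OPCODES.get((word >> 27) & 0x3, "reserved")
    if header_type == 1:
        address = (word >> 13) & 0x1F
        register = None if opcode == "NOP" else REGISTERS.get(address, hex(address))
        return {"type": 1, "opcode": opcode, "register": register,
                "word_count": word & 0x7FF}
    if header_type == 2:
        return {"type": 2, "opcode": opcode, "word_count": word & 0x07FFFFFF}
    return {"type": header_type}


# Hand-built fragment: padding, sync word, NOP, then a BSPI write of 0000066C.
fragment = b"\xff" * 0x50 + SYNC_WORD + struct.pack(">3I", 0x20000000, 0x3003E001, 0x0000066C)
sync_index = find_sync(fragment)
words = struct.unpack(">3I", fragment[sync_index + len(SYNC_WORD):])
print(f"Sync word at byte index 0x{sync_index:x}")
print(decode_header(words[0]))
print(decode_header(words[1]), f"data {words[2]:08X}")
```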
Looking at the usage of different banks, to determine potential for VUSER supplies to monitor:
Signal(s) | Bank | IO standard |
---|---|---|
PCIe lanes 0-3 | 225 | |
PCIe lanes 4-7 | 224 | |
CLK_PCIe_100MHz_clk | 225 | |
PCI_PERSTN | 65 | LVCMOS18 |
QSFP lanes 0-3 | 226 | |
QSFP lanes 4-7 | 227 | |
gty_refclk1 | 227 | |
gty_refclk0 | 227 | |
QSFP_RESET_B | 86 | LVCMOS33 |
QSFP_RESET_A | 84 | LVCMOS33 |
gty_sysclkp_i | 66 | LVCMOS18 |
QSFP A SCL | 87 | LVCMOS33 |
The above is partial, and doesn't cover all the discrete signals.
Will enable the following user supplies in the SYSMON, selected to try and get different voltages:
Channel | Bank | Monitored supply | Comment |
---|---|---|---|
USER0 | 226 | AVCC | Analog supply voltage for transceiver circuits |
USER1 | 226 | AVTT | Analog supply voltage for transceiver termination circuits |
USER2 | 65 | VCCO | Expect 1.8V |
USER3 | 87 | VCCO | Expect 3.3V |
The values read were:
$ xilinx_sensors/display_sensor_values
Opening device 0000:01:00.0 (10ee:9038) with IOMMU group 0
Enabled bus master for 0000:01:00.0
Displaying SYSMON values for design XCKU5P_DUAL_QSFP_dma_stream_loopback in PCI device 0000:01:00.0 IOMMU group 0:
SYSMON samples using Continuous sequence mode
Number of samples averaged none
Current enabled channels in sequencer: Temp Vccint Vccaux Vbram Cal Vuser0 Vuser1 Vuser2 Vuser3
Analog Bus configuration 0x0E98
Channel Measurement Min Max
Temp 57.4881C 55.0012C 60.4723C
Vccint 0.8877V 0.8818V 0.8936V
Vccaux 1.8252V 1.8223V 1.8311V
Vbram 0.8877V 0.8818V 0.8936V
Vuser0 0.8965V 0.8936V 0.9023V
Vuser1 1.2041V 1.1982V 1.2100V
Vuser2 1.8457V 1.8398V 1.8574V
Vuser3 3.3164V 3.3105V 3.3398V
The values look valid. For `Vuser3` the scaling has been set to a 6V range as expected, whereas the value displayed by the Vivado Hardware Manager when reading the SYSMON values over JTAG was half the expected value. I.e. it appears the Vivado Hardware Manager doesn't detect when the supply being monitored uses the high IO range.
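The factor of two follows from how a SYSMON supply reading is scaled: the ADC code is a fraction of full scale, which is 3V for the normal range and 6V for the high range. A minimal Python sketch of that conversion is below, assuming the result is treated as a 16-bit MSB-justified register value. The function name is hypothetical, and the raw value `0x8D80` was back-calculated from the 3.3164V shown above rather than captured from the device.

```python
def supply_volts(raw_register, high_range=False):
    """Convert a 16-bit MSB-justified SYSMON supply reading to volts.

    Full scale is 3V for the normal range and 6V for the high range.
    """
    full_scale = 6.0 if high_range else 3.0
    return raw_register / 65536 * full_scale


raw_vuser3 = 0x8D80  # back-calculated from the displayed 3.3164V, not captured
print(f"6V range: {supply_volts(raw_vuser3, high_range=True):.4f}V")
print(f"3V range: {supply_volts(raw_vuser3):.4f}V")
```

Applying the 3V scaling to the same raw value gives half the voltage, which is consistent with what the Vivado Hardware Manager displayed.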
The `XCKU5P_DUAL_QSFP_dma_stream_loopback/XCKU5P_DUAL_QSFP_dma_stream_loopback.gen/sources_1/bd/XCKU5P_DUAL_QSFP_dma_stream_loopback/ip/XCKU5P_DUAL_QSFP_dma_stream_loopback_system_management_wiz_0_0/XCKU5P_DUAL_QSFP_dma_stream_loopback_system_management_wiz_0_0_xadc_core_drp.vhd` generated file in the XCKU5P_DUAL_QSFP_dma_stream_loopback project, which instantiates `SYSMONE4`, has the following, which matches the value reported for `Analog Bus configuration` by the program:
INIT_45 => X"0E98", -- Analog Bus Register
From running `display_sensor_values` it appears that the min/max sensor values are reset when the device is opened or closed, which performs a device reset.
The card was fitted to an HP Pavilion 590-p0053na, which uses an AMD IOMMU, as part of checking the use of VFIO with an AMD IOMMU. The FPGA card is fitted in a x16 PCIe 3.0 slot, so should be able to use the full FPGA capability of 8.0 GT/s at x8 width.
Some issues seen are:
- On one reboot the FPGA enumerated as only x2 rather than x8 width, which caused the throughput to drop for `test_dma_bridge_parallel_streams`.
- A reboot into openSUSE Leap 15.5 made the FPGA enumerate at x8 width. However, when monitoring `dmesg` noticed that when DMA tests were run numerous errors of the following form were reported:
      [ +0.018245] pcieport 0000:00:01.1: AER: Corrected error message received from 0000:00:01.0
      [ +0.000012] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
      [ +0.000003] pcieport 0000:00:01.1: device [1022:15d3] error status/mask=00000040/00006000
      [ +0.000003] pcieport 0000:00:01.1: [ 6] BadTLP
  Where the above errors are reported from the PCIe root port to which the FPGA is connected.
For the PCIe correctable errors:
- Rebooted into AlmaLinux 8.10 and didn't get the AER errors in `dmesg`.
- Rebooted back into openSUSE, and got the AER correctable errors when DMA was running.
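The `error status/mask=00000040/00006000` values in the AER messages are the Correctable Error Status and Correctable Error Mask registers, with one bit per error type. A minimal Python sketch of decoding them is below. The bit positions are those defined for the AER capability, the short names are abbreviations chosen here to resemble the `lspci` and kernel output, and the function name is hypothetical.

```python
# Bit positions in the AER Correctable Error Status / Mask registers.
CORRECTABLE_ERROR_BITS = {
    0: "RxErr",            # Receiver Error
    6: "BadTLP",           # Bad TLP
    7: "BadDLLP",          # Bad DLLP
    8: "Rollover",         # REPLAY_NUM Rollover
    12: "Timeout",         # Replay Timer Timeout
    13: "AdvNonFatalErr",  # Advisory Non-Fatal Error
    14: "CorrIntErr",      # Corrected Internal Error
    15: "HeaderOF",        # Header Log Overflow
}


def decode_correctable(value):
    """Return the names of the bits set in a correctable error status or mask value."""
    return [name for bit, name in sorted(CORRECTABLE_ERROR_BITS.items())
            if value & (1 << bit)]


print("status:", decode_correctable(0x00000040))
print("mask:  ", decode_correctable(0x00006000))
```

The status decodes to only bit 6 set, matching the `[ 6] BadTLP` line in the kernel messages.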
Saved the `lspci -vvv` output for both OSes. Differences are:
- For the PCIe root port, openSUSE is enabling reporting of correctable errors in `DevCtl` and AlmaLinux 8.10 is disabling reporting.
- For the FPGA endpoint, openSUSE is disabling reporting of correctable errors in `DevCtl` and AlmaLinux 8.10 is enabling reporting.
Having installed AlmaLinux 9.4 on the PC noticed:
- When the VFIO device is opened or closed the negotiated link width sometimes changes; have seen x8, x4 and x2 used. The change in negotiated width led to a corresponding change in the bandwidth reported by `test_dma_bridge_parallel_streams`.
- Eventually the link negotiated as x8 2.5 GT/s and didn't change to a higher rate over device opens / closes. Secure boot is enabled, so unable to run either PCIe Set Speed or pcie_set_speed.c to trigger link retraining after the device has been opened.
Considered creating a VFIO based program to provide the same functionality as `pcie_set_speed.sh`. However, the Retrain Link bit in the Link Control Register is not applicable to endpoints. Given that the `vfio-pci` module can't be bound to a PCIe root port, it is not currently possible to use VFIO to perform a link retrain.
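For reference, a retrain of the kind discussed above amounts to two register updates on the upstream (root) port: setting the Target Link Speed field (bits 3:0) of Link Control 2, at offset 0x30 in the PCI Express capability, and then setting the Retrain Link bit (bit 5) of Link Control, at offset 0x10. The following Python sketch shows only the bit arithmetic and performs no hardware access. The function names are hypothetical and the starting register values are illustrative.

```python
LNKCTL_OFFSET = 0x10    # Link Control, relative to the PCI Express capability
LNKCTL2_OFFSET = 0x30   # Link Control 2, relative to the PCI Express capability
LNKCTL_RETRAIN_LINK = 1 << 5
LNKCTL2_TARGET_SPEED_MASK = 0xF
TARGET_SPEED_CODES = {2.5: 1, 5.0: 2, 8.0: 3}


def with_target_speed(lnkctl2, speed_gts):
    """Return the Link Control 2 value with the Target Link Speed field replaced."""
    return (lnkctl2 & 0xFFFF & ~LNKCTL2_TARGET_SPEED_MASK) | TARGET_SPEED_CODES[speed_gts]


def with_retrain(lnkctl):
    """Return the Link Control value with the Retrain Link bit set."""
    return lnkctl | LNKCTL_RETRAIN_LINK


# Illustrative starting values: target speed 2.5 GT/s, Common Clock Configuration set.
print(f"LnkCtl2 0x0001 -> 0x{with_target_speed(0x0001, 8.0):04x}")
print(f"LnkCtl  0x0040 -> 0x{with_retrain(0x0040):04x}")
```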
[Qemu-devel] QEMU PCIe link "negotiation" from 2018 has:
This email is already too long, but I also wonder whether we should consider additional vfio-pci interfaces to trigger a link retraining or allow virtualized access to the physical upstream port config space.
Not sure if there has been any work on allowing vfio-pci to trigger link retraining on real hardware, nor how much of that message was about virtualised PCIe links for QEMU.
In the XCKU5P_DUAL_QSFP_dma_ram design for the DMA/Bridge Subsystem for PCI Express using the Advanced mode:
- Set `enable_auto_rxeq` true, but have still seen link speed and width changes when vfio has opened the device.
- Enabled debug options, but haven't yet used them to investigate.
The HP Pavilion 590-p0053na PC sometimes hangs booting AlmaLinux, and have seen the NVMe with the Windows installation sometimes disappear. I.e. there may be some issues with the PC which contribute to the PCIe link speed and width changing.