==========================
PCIe Device AER statistics
==========================
These attributes show up under all the devices that are AER capable. These
statistical counters indicate the errors "as seen/reported by the device".
Note that this may mean that if an endpoint is causing problems, the AER
counters may increment at its link partner (e.g. root port) because the
errors may be "seen" / reported by the link partner and not the
problematic endpoint itself (which may report all counters as 0 as it never
saw any problems).

What:           /sys/bus/pci/devices/<dev>/aer_dev_correctable
Date:           July 2018
Kernel Version: 4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of correctable errors seen and reported by this
                PCI device using ERR_COR. Note that since multiple errors may
                be reported using a single ERR_COR message,
                TOTAL_ERR_COR at the end of the file may not match the actual
                total of all the errors in the file.
                Sample output:
-------------------------------------------------------------------------
localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable
Receiver Error 2
Bad TLP 0
Bad DLLP 0
RELAY_NUM Rollover 0
Replay Timer Timeout 0
Advisory Non-Fatal 0
Corrected Internal Error 0
Header Log Overflow 0
TOTAL_ERR_COR 2
-------------------------------------------------------------------------

What:           /sys/bus/pci/devices/<dev>/aer_dev_fatal
Date:           July 2018
Kernel Version: 4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of uncorrectable fatal errors seen and reported by this
                PCI device using ERR_FATAL. Note that since multiple errors may
                be reported using a single ERR_FATAL message,
                TOTAL_ERR_FATAL at the end of the file may not match the actual
                total of all the errors in the file.
                Sample output:
-------------------------------------------------------------------------
localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal
Undefined 0
Data Link Protocol 0
Surprise Down Error 0
Poisoned TLP 0
Flow Control Protocol 0
Completion Timeout 0
Completer Abort 0
Unexpected Completion 0
Receiver Overflow 0
Malformed TLP 0
ECRC 0
Unsupported Request 0
ACS Violation 0
Uncorrectable Internal Error 0
MC Blocked TLP 0
AtomicOp Egress Blocked 0
TLP Prefix Blocked Error 0
TOTAL_ERR_FATAL 0
-------------------------------------------------------------------------

What:           /sys/bus/pci/devices/<dev>/aer_dev_nonfatal
Date:           July 2018
Kernel Version: 4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    List of uncorrectable nonfatal errors seen and reported by this
                PCI device using ERR_NONFATAL. Note that since multiple errors
                may be reported using a single ERR_NONFATAL message,
                TOTAL_ERR_NONFATAL at the end of the file may not match the
                actual total of all the errors in the file.
                Sample output:
-------------------------------------------------------------------------
localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal
Undefined 0
Data Link Protocol 0
Surprise Down Error 0
Poisoned TLP 0
Flow Control Protocol 0
Completion Timeout 0
Completer Abort 0
Unexpected Completion 0
Receiver Overflow 0
Malformed TLP 0
ECRC 0
Unsupported Request 0
ACS Violation 0
Uncorrectable Internal Error 0
MC Blocked TLP 0
AtomicOp Egress Blocked 0
TLP Prefix Blocked Error 0
TOTAL_ERR_NONFATAL 0
-------------------------------------------------------------------------

============================
PCIe Rootport AER statistics
============================
These attributes show up under only the rootports (or root complex event
collectors) that are AER capable. These indicate the number of error messages as
"reported to" the rootport. Please note that the rootports also transmit
(internally) the ERR_* messages for errors seen by the internal rootport PCI
device, so these counters include them and are thus cumulative of all the error
messages on the PCI hierarchy originating at that root port.

What:           /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_cor
Date:           July 2018
Kernel Version: 4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_COR messages reported to rootport.

What:           /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_fatal
Date:           July 2018
Kernel Version: 4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_FATAL messages reported to rootport.

What:           /sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_nonfatal
Date:           July 2018
Kernel Version: 4.19.0
Contact:        linux-pci@vger.kernel.org, rajatja@google.com
Description:    Total number of ERR_NONFATAL messages reported to rootport.
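
As an illustration only, the per-device counter files described above
(aer_dev_correctable, aer_dev_fatal, aer_dev_nonfatal) could be consumed from
user space roughly as follows. This is a minimal sketch, not part of the ABI:
the helper names parse_aer_counters and read_aer_counters are hypothetical, and
the parser assumes each line is a counter name followed by whitespace and an
integer, as in the sample outputs. The TOTAL_ERR_* line is returned separately
because it counts messages and may not equal the sum of the other lines.

```python
def parse_aer_counters(text):
    """Parse the text of an aer_dev_* file into (counters, total).

    Assumption: every line is "<name> <integer>", where the name may
    contain spaces, as in the sample outputs in this document.
    """
    counters = {}
    total = None
    for line in text.splitlines():
        # Split once from the right so multi-word names stay intact.
        fields = line.strip().rsplit(None, 1)
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, value = fields
        if name.startswith("TOTAL_ERR_"):
            # Message total; kept apart from the per-error counters.
            total = int(value)
        else:
            counters[name] = int(value)
    return counters, total


def read_aer_counters(dev, attr="aer_dev_correctable"):
    """Read one counter file for a device such as "0000:00:1c.0".

    Hypothetical convenience wrapper; the path follows the "What:"
    entries in this document.
    """
    path = f"/sys/bus/pci/devices/{dev}/{attr}"
    with open(path) as f:
        return parse_aer_counters(f.read())


if __name__ == "__main__":
    # Text copied from the aer_dev_correctable sample output.
    sample = (
        "Receiver Error 2\n"
        "Bad TLP 0\n"
        "Bad DLLP 0\n"
        "TOTAL_ERR_COR 2\n"
    )
    counters, total = parse_aer_counters(sample)
    print(counters, total)
```

On a system with an AER-capable device, read_aer_counters("0000:00:1c.0")
would return the same structure for that device. Comparing the result for an
endpoint with that of its link partner can help when the endpoint itself
reports all counters as 0.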