PCIe Device AER statistics
--------------------------

These attributes show up under all the devices that are AER capable. These
statistical counters indicate the errors "as seen/reported by the device".
This means that if an endpoint is causing problems, the AER counters may
increment at its link partner (e.g. root port) instead, because the errors
may be "seen" and reported by the link partner and not by the problematic
endpoint itself (which may report all counters as 0 because it never saw
any problems).

What:		/sys/bus/pci/devices/<dev>/aer_dev_correctable
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	List of correctable errors seen and reported by this
		PCI device using ERR_COR. Because multiple errors may
		be reported using a single ERR_COR message,
		TOTAL_ERR_COR at the end of the file may not match the actual
		total of all the errors in the file.
		Sample output::

		    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable
		    Receiver Error 2
		    Bad TLP 0
		    Bad DLLP 0
		    RELAY_NUM Rollover 0
		    Replay Timer Timeout 0
		    Advisory Non-Fatal 0
		    Corrected Internal Error 0
		    Header Log Overflow 0
		    TOTAL_ERR_COR 2

What:		/sys/bus/pci/devices/<dev>/aer_dev_fatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	List of uncorrectable fatal errors seen and reported by this
		PCI device using ERR_FATAL. Because multiple errors may
		be reported using a single ERR_FATAL message,
		TOTAL_ERR_FATAL at the end of the file may not match the actual
		total of all the errors in the file.
		Sample output::

		    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal
		    Undefined 0
		    Data Link Protocol 0
		    Surprise Down Error 0
		    Poisoned TLP 0
		    Flow Control Protocol 0
		    Completion Timeout 0
		    Completer Abort 0
		    Unexpected Completion 0
		    Receiver Overflow 0
		    Malformed TLP 0
		    ECRC 0
		    Unsupported Request 0
		    ACS Violation 0
		    Uncorrectable Internal Error 0
		    MC Blocked TLP 0
		    AtomicOp Egress Blocked 0
		    TLP Prefix Blocked Error 0
		    TOTAL_ERR_FATAL 0

What:		/sys/bus/pci/devices/<dev>/aer_dev_nonfatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	List of uncorrectable nonfatal errors seen and reported by this
		PCI device using ERR_NONFATAL. Because multiple errors
		may be reported using a single ERR_NONFATAL message,
		TOTAL_ERR_NONFATAL at the end of the file may not match the
		actual total of all the errors in the file.
		Sample output::

		    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal
		    Undefined 0
		    Data Link Protocol 0
		    Surprise Down Error 0
		    Poisoned TLP 0
		    Flow Control Protocol 0
		    Completion Timeout 0
		    Completer Abort 0
		    Unexpected Completion 0
		    Receiver Overflow 0
		    Malformed TLP 0
		    ECRC 0
		    Unsupported Request 0
		    ACS Violation 0
		    Uncorrectable Internal Error 0
		    MC Blocked TLP 0
		    AtomicOp Egress Blocked 0
		    TLP Prefix Blocked Error 0
		    TOTAL_ERR_NONFATAL 0

PCIe Rootport AER statistics
----------------------------

These attributes show up only under the rootports (or root complex event
collectors) that are AER capable. They indicate the number of error messages
"reported to" the rootport. Note that the rootports also transmit
(internally) the ERR_* messages for errors seen by the internal rootport PCI
device, so these counters include them and are thus cumulative of all the error
messages on the PCI hierarchy originating at that root port.

What:		/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_cor
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	Total number of ERR_COR messages reported to rootport.

What:		/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_fatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	Total number of ERR_FATAL messages reported to rootport.

What:		/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_nonfatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	Total number of ERR_NONFATAL messages reported to rootport.
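
The per-device aer_dev_* files described earlier in this file use a simple
"<error name> <count>" line format, with a TOTAL_ERR_* line at the end. As a
rough illustration only, a userspace reader might parse them as sketched
below. The helper names parse_aer_dev_stats() and read_aer_dev_stats() are
hypothetical and not part of any kernel or library interface, and the sketch
assumes the line format shown in the sample outputs.

```python
from pathlib import Path


def parse_aer_dev_stats(text):
    """Parse the contents of an aer_dev_* file into (counters, total).

    Assumes each line is "<error name> <count>". Error names may contain
    spaces, so only the last whitespace-separated field is the count.
    """
    counters = {}
    total = None
    for line in text.splitlines():
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, value = parts
        if name.startswith("TOTAL_ERR_"):
            total = int(value)
        else:
            counters[name] = int(value)
    return counters, total


def read_aer_dev_stats(dev, kind="correctable", root="/sys/bus/pci/devices"):
    """Hypothetical helper: read aer_dev_<kind> for one device.

    dev is a PCI address such as "0000:00:1c.0"; kind is one of
    "correctable", "fatal" or "nonfatal".
    """
    path = Path(root) / dev / f"aer_dev_{kind}"
    return parse_aer_dev_stats(path.read_text())


if __name__ == "__main__":
    sample = "Receiver Error 2\nBad TLP 0\nTOTAL_ERR_COR 2\n"
    counters, total = parse_aer_dev_stats(sample)
    # The per-error sum and the message total are reported separately
    # because one ERR_* message may carry several errors.
    print(counters, sum(counters.values()), total)
```

Keeping the TOTAL_ERR_* value separate from the per-error counters reflects
the note in the descriptions above: the two are not guaranteed to agree, so
a reader should not derive one from the other.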