PCIe Device AER statistics
--------------------------

These attributes show up under all the devices that are AER capable. These
statistical counters indicate the errors "as seen/reported by the device".
Note that this may mean that if an endpoint is causing problems, the AER
counters may increment at its link partner (e.g. root port) because the
errors may be "seen" / reported by the link partner and not the
problematic endpoint itself (which may report all counters as 0 as it never
saw any problems).

What:		/sys/bus/pci/devices/<dev>/aer_dev_correctable
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	List of correctable errors seen and reported by this
		PCI device using ERR_COR. Note that since multiple errors may
		be reported using a single ERR_COR message, TOTAL_ERR_COR at
		the end of the file may not match the actual total of all the
		errors in the file. Sample output::

		    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_correctable
		    Receiver Error 2
		    Bad TLP 0
		    Bad DLLP 0
		    RELAY_NUM Rollover 0
		    Replay Timer Timeout 0
		    Advisory Non-Fatal 0
		    Corrected Internal Error 0
		    Header Log Overflow 0
		    TOTAL_ERR_COR 2

What:		/sys/bus/pci/devices/<dev>/aer_dev_fatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	List of uncorrectable fatal errors seen and reported by this
		PCI device using ERR_FATAL. Note that since multiple errors may
		be reported using a single ERR_FATAL message, TOTAL_ERR_FATAL
		at the end of the file may not match the actual total of all
		the errors in the file. Sample output::

		    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_fatal
		    Undefined 0
		    Data Link Protocol 0
		    Surprise Down Error 0
		    Poisoned TLP 0
		    Flow Control Protocol 0
		    Completion Timeout 0
		    Completer Abort 0
		    Unexpected Completion 0
		    Receiver Overflow 0
		    Malformed TLP 0
		    ECRC 0
		    Unsupported Request 0
		    ACS Violation 0
		    Uncorrectable Internal Error 0
		    MC Blocked TLP 0
		    AtomicOp Egress Blocked 0
		    TLP Prefix Blocked Error 0
		    TOTAL_ERR_FATAL 0

What:		/sys/bus/pci/devices/<dev>/aer_dev_nonfatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	List of uncorrectable nonfatal errors seen and reported by this
		PCI device using ERR_NONFATAL. Note that since multiple errors
		may be reported using a single ERR_NONFATAL message,
		TOTAL_ERR_NONFATAL at the end of the file may not match the
		actual total of all the errors in the file. Sample output::

		    localhost /sys/devices/pci0000:00/0000:00:1c.0 # cat aer_dev_nonfatal
		    Undefined 0
		    Data Link Protocol 0
		    Surprise Down Error 0
		    Poisoned TLP 0
		    Flow Control Protocol 0
		    Completion Timeout 0
		    Completer Abort 0
		    Unexpected Completion 0
		    Receiver Overflow 0
		    Malformed TLP 0
		    ECRC 0
		    Unsupported Request 0
		    ACS Violation 0
		    Uncorrectable Internal Error 0
		    MC Blocked TLP 0
		    AtomicOp Egress Blocked 0
		    TLP Prefix Blocked Error 0
		    TOTAL_ERR_NONFATAL 0
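
The three aer_dev_* files share one layout: an error name, a count, and a
final TOTAL_ERR_* line. As a rough illustration only, the sketch below parses
that layout and keeps the message total apart from the per-error counts, since
the two need not agree. The helper names (parse_aer_stats, split_total) are
hypothetical and not part of any kernel interface; the input text is the
aer_dev_correctable sample shown earlier in this file.

```python
def parse_aer_stats(text):
    """Parse the contents of an aer_dev_* file into a {name: count} dict."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count is the last whitespace-separated token; the error name
        # is everything before it (names may contain spaces).
        name, count = line.rsplit(None, 1)
        stats[name] = int(count)
    return stats


def split_total(stats):
    """Return (TOTAL_ERR_* value, dict of the remaining per-error counts)."""
    totals = [v for k, v in stats.items() if k.startswith("TOTAL_ERR_")]
    per_error = {k: v for k, v in stats.items()
                 if not k.startswith("TOTAL_ERR_")}
    return totals[0], per_error


# File contents copied from the sample output for aer_dev_correctable.
SAMPLE = """\
Receiver Error 2
Bad TLP 0
Bad DLLP 0
RELAY_NUM Rollover 0
Replay Timer Timeout 0
Advisory Non-Fatal 0
Corrected Internal Error 0
Header Log Overflow 0
TOTAL_ERR_COR 2
"""

total, per_error = split_total(parse_aer_stats(SAMPLE))
# The message total and the sum of individual errors are reported separately
# because one ERR_* message can carry more than one error.
print("messages:", total, "errors:", sum(per_error.values()))
```

On a live system the same text could be read from one of the aer_dev_* paths
listed above, with <dev> replaced by a real device address.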

PCIe Rootport AER statistics
----------------------------

These attributes show up only under the rootports (or root complex event
collectors) that are AER capable. These indicate the number of error messages as
"reported to" the rootport. Please note that the rootports also transmit
(internally) the ERR_* messages for errors seen by the internal rootport PCI
device, so these counters include them and are thus cumulative of all the error
messages on the PCI hierarchy originating at that root port.

What:		/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_cor
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	Total number of ERR_COR messages reported to rootport.

What:		/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_fatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	Total number of ERR_FATAL messages reported to rootport.

What:		/sys/bus/pci/devices/<dev>/aer_stats/aer_rootport_total_err_nonfatal
Date:		July 2018
KernelVersion:	4.19.0
Contact:	linux-pci@vger.kernel.org, rajatja@google.com
Description:	Total number of ERR_NONFATAL messages reported to rootport.
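
A minimal sketch of collecting the three rootport totals from one directory.
It assumes each file holds a single decimal integer, which the entries above
do not spell out, and the helper name read_rootport_totals is hypothetical.
Files that are absent (for example on a device that is not a rootport) are
skipped rather than treated as an error.

```python
from pathlib import Path

# Attribute names taken from the What: entries above.
ROOTPORT_COUNTERS = (
    "aer_rootport_total_err_cor",
    "aer_rootport_total_err_fatal",
    "aer_rootport_total_err_nonfatal",
)


def read_rootport_totals(stats_dir):
    """Return {attribute name: count} for the counter files found in stats_dir."""
    totals = {}
    for name in ROOTPORT_COUNTERS:
        path = Path(stats_dir) / name
        if path.is_file():
            # Assumption: the file contains one decimal number and a newline.
            totals[name] = int(path.read_text().strip())
    return totals
```

A caller would pass the directory that contains the attributes, following the
What: paths above with <dev> replaced by a real rootport address.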