Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29 |
|
#
c35977b0 |
| 15-May-2023 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Decode UMC_V2 ECC errors
The MI200 (Aldebaran) series of devices introduced a new SMCA bank type for Unified Memory Controllers. The MCE subsystem already has support for
x86/MCE/AMD, EDAC/mce_amd: Decode UMC_V2 ECC errors
The MI200 (Aldebaran) series of devices introduced a new SMCA bank type for Unified Memory Controllers. The MCE subsystem already has support for this new type. The MCE decoder module will decode the common MCA error information for the new bank type, but it will not pass the information to the AMD64 EDAC module for detailed memory error decoding.
Have the MCE decoder module recognize the new bank type as an SMCA UMC memory error and pass the MCA information to AMD64 EDAC.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230515113537.1052146-3-muralimk@amd.com
show more ...
|
Revision tags: v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10 |
|
#
91f75eb4 |
| 16-Dec-2021 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
AMD systems currently lay out MCA bank types such that the type of bank number "i" is either the same across all CPUs or is R
x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
AMD systems currently lay out MCA bank types such that the type of bank number "i" is either the same across all CPUs or is Reserved/Read-as-Zero.
For example:
Bank # | CPUx | CPUy 0 LS LS 1 RAZ UMC 2 CS CS 3 SMU RAZ
Future AMD systems will lay out MCA bank types such that the type of bank number "i" may be different across CPUs.
For example:
Bank # | CPUx | CPUy 0 LS LS 1 RAZ UMC 2 CS NBIO 3 SMU RAZ
Change the structures that cache MCA bank types to be per-CPU and update smca_get_bank_type() to handle this change.
Move some SMCA-specific structures to amd.c from mce.h, since they no longer need to be global.
Break out the "count" for bank types from struct smca_hwid, since this should provide a per-CPU count rather than a system-wide count.
Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The values in this array should not change at runtime.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.com
show more ...
|
#
5176a93a |
| 16-Dec-2021 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add HWID and McaType values for new SMCA bank types, and add their error descriptions to edac_mce_amd.
The "PHY" bank types all have the same erro
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add HWID and McaType values for new SMCA bank types, and add their error descriptions to edac_mce_amd.
The "PHY" bank types all have the same error descriptions, and the NBIF and SHUB bank types have the same error descriptions. So reuse the same arrays where appropriate.
[ bp: Remove useless comments over hwid types. ]
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-2-yazen.ghannam@amd.com
show more ...
|
Revision tags: v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49 |
|
#
767f4b62 |
| 28-Jun-2021 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
EDAC/mce_amd: Do not load edac_mce_amd module on guests
Hypervisors likely do not expose the SMCA feature to the guest and loading this module leads to false warnings. This module should not be load
EDAC/mce_amd: Do not load edac_mce_amd module on guests
Hypervisors likely do not expose the SMCA feature to the guest and loading this module leads to false warnings. This module should not be loaded in guests to begin with, but people tend to do so, especially when testing kernels in VMs. And then they complain about those false warnings.
Do the practical thing and do not load this module when running as a guest to avoid all that complaining.
[ bp: Rewrite commit message. ]
Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Tested-by: Kim Phillips <kim.phillips@amd.com> Link: https://lkml.kernel.org/r/20210628172740.245689-1-Smita.KoralahalliChannabasappa@amd.com
show more ...
|
Revision tags: v5.13, v5.10.46, v5.10.43 |
|
#
429b2ba7 |
| 03-Jun-2021 |
Colin Ian King <colin.king@canonical.com> |
EDAC/mce_amd: Fix typo "FIfo" -> "Fifo"
There is an uppercase letter I in one of the MCE error descriptions instead of a lowercase one. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.c
EDAC/mce_amd: Fix typo "FIfo" -> "Fifo"
There is an uppercase letter I in one of the MCE error descriptions instead of a lowercase one. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lkml.kernel.org/r/20210603103349.79117-1-colin.king@canonical.com
show more ...
|
Revision tags: v5.10.42, v5.10.41 |
|
#
94a311ce |
| 26-May-2021 |
Muralidhara M K <muralimk@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add the (HWID, MCATYPE) tuples and names for new SMCA bank types.
Also, add their respective error descriptions to the MCE decoding module edac_mc
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add the (HWID, MCATYPE) tuples and names for new SMCA bank types.
Also, add their respective error descriptions to the MCE decoding module edac_mce_amd. Also while at it, optimize the string names for some SMCA banks.
[ bp: Drop repeated comments, explain why UMC_V2 is a separate entry. ]
Signed-off-by: Muralidhara M K <muralimk@amd.com> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lkml.kernel.org/r/20210526164601.66228-1-nchatrad@amd.com
show more ...
|
Revision tags: v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10 |
|
#
8de0c991 |
| 09-Nov-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and later systems. This function is used in amd64_edac_mod to do syste
EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and later systems. This function is used in amd64_edac_mod to do system-specific decoding for DRAM ECC errors. The function takes a "NodeId" as a parameter.
In AMD documentation, NodeId is used to identify a physical die in a system. This can be used to identify a node in the AMD_NB code and also it is used with umc_normaddr_to_sysaddr().
However, the input used for decode_dram_ecc() is currently the NUMA node of a logical CPU. In the default configuration, the NUMA node and physical die will be equivalent, so this doesn't have an impact.
But the NUMA node configuration can be adjusted with optional memory interleaving modes. This will cause the NUMA node enumeration to not match the physical die enumeration. The mismatch will cause the address translation function to fail or report incorrect results.
Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the physical ID is used.
Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-4-Yazen.Ghannam@amd.com
show more ...
|
#
db970bd2 |
| 09-Nov-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/CPU/AMD: Remove amd_get_nb_id()
The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in
x86/CPU/AMD: Remove amd_get_nb_id()
The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in struct cpuinfo_x86.cpu_die_id.
Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and remove the function.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-3-Yazen.Ghannam@amd.com
show more ...
|
#
8a6c5eec |
| 28-Jun-2021 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
EDAC/mce_amd: Do not load edac_mce_amd module on guests
[ Upstream commit 767f4b620edadac579c9b8b6660761d4285fa6f9 ]
Hypervisors likely do not expose the SMCA feature to the guest and loading this
EDAC/mce_amd: Do not load edac_mce_amd module on guests
[ Upstream commit 767f4b620edadac579c9b8b6660761d4285fa6f9 ]
Hypervisors likely do not expose the SMCA feature to the guest and loading this module leads to false warnings. This module should not be loaded in guests to begin with, but people tend to do so, especially when testing kernels in VMs. And then they complain about those false warnings.
Do the practical thing and do not load this module when running as a guest to avoid all that complaining.
[ bp: Rewrite commit message. ]
Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Tested-by: Kim Phillips <kim.phillips@amd.com> Link: https://lkml.kernel.org/r/20210628172740.245689-1-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
3897b71e |
| 09-Nov-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
[ Upstream commit 8de0c9917cc1297bc5543b61992d5bdee4ce621a ]
The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and late
EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId
[ Upstream commit 8de0c9917cc1297bc5543b61992d5bdee4ce621a ]
The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and later systems. This function is used in amd64_edac_mod to do system-specific decoding for DRAM ECC errors. The function takes a "NodeId" as a parameter.
In AMD documentation, NodeId is used to identify a physical die in a system. This can be used to identify a node in the AMD_NB code and also it is used with umc_normaddr_to_sysaddr().
However, the input used for decode_dram_ecc() is currently the NUMA node of a logical CPU. In the default configuration, the NUMA node and physical die will be equivalent, so this doesn't have an impact.
But the NUMA node configuration can be adjusted with optional memory interleaving modes. This will cause the NUMA node enumeration to not match the physical die enumeration. The mismatch will cause the address translation function to fail or report incorrect results.
Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the physical ID is used.
Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-4-Yazen.Ghannam@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53 |
|
#
368d1887 |
| 20-Jul-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid err
x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid error codes on a system. However, this is unnecessary after a few product releases because the hardware will only report valid error codes. Thus, there's no need for it with future systems.
Remove the xec_bitmap field and all references to it.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200720145353.43924-1-Yazen.Ghannam@amd.com
show more ...
|
Revision tags: v5.4.52, v5.7.9, v5.7.8, v5.4.51 |
|
#
dc7a8476 |
| 08-Jul-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC/mce_amd: Add new error descriptions for existing types
A few existing MCA bank types will have new error types in future SMCA systems.
Add the descriptions for the new error types.
Signed-off
EDAC/mce_amd: Add new error descriptions for existing types
A few existing MCA bank types will have new error types in future SMCA systems.
Add the descriptions for the new error types.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200708153515.1911642-1-Yazen.Ghannam@amd.com
show more ...
|
Revision tags: v5.4.50, v5.7.7, v5.4.49, v5.7.6 |
|
#
bb2de0ad |
| 23-Jun-2020 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
x86/mce, EDAC/mce_amd: Print PPIN in machine check records
Print the Protected Processor Identification Number (PPIN) on processors which support it.
[ bp: Massage. ]
Signed-off-by: Smita Koralah
x86/mce, EDAC/mce_amd: Print PPIN in machine check records
Print the Protected Processor Identification Number (PPIN) on processors which support it.
[ bp: Massage. ]
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200623130059.8870-1-Smita.KoralahalliChannabasappa@amd.com
show more ...
|
Revision tags: v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21 |
|
#
23ba710a |
| 14-Feb-2020 |
Tony Luck <tony.luck@intel.com> |
x86/mce: Fix all mce notifiers to update the mce->kflags bitmask
If the handler took any action to log or deal with the error, set a bit in mce->kflags so that the default handler on the end of the
x86/mce: Fix all mce notifiers to update the mce->kflags bitmask
If the handler took any action to log or deal with the error, set a bit in mce->kflags so that the default handler on the end of the machine check chain can see what has been done.
Get rid of NOTIFY_STOP returns. Make the EDAC and dev-mcelog handlers skip over errors already processed by CEC.
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200214222720.13168-5-tony.luck@intel.com
show more ...
|
#
3e0fdec8 |
| 07-Apr-2020 |
Borislav Petkov <bp@suse.de> |
x86/mce/amd, edac: Remove report_gart_errors
... because no one should be interested in spurious MCEs anyway. Make the filtering unconditional and move it to amd_filter_mce().
Signed-off-by: Borisl
x86/mce/amd, edac: Remove report_gart_errors
... because no one should be interested in spurious MCEs anyway. Make the filtering unconditional and move it to amd_filter_mce().
Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200407163414.18058-2-bp@alien8.de
show more ...
|
#
52cff04a |
| 17-Feb-2020 |
Prarit Bhargava <prarit@redhat.com> |
EDAC/mce_amd: Print !SMCA processor warning only once
This warning is output for every virtual CPU in a guest on an EPYC 2 system because kvm doesn't enable SMCA. Once is enough too.
[ bp: Massage
EDAC/mce_amd: Print !SMCA processor warning only once
This warning is output for every virtual CPU in a guest on an EPYC 2 system because kvm doesn't enable SMCA. Once is enough too.
[ bp: Massage. ]
Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200217134627.19765-1-prarit@redhat.com
show more ...
|
Revision tags: v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13 |
|
#
86e9f9d6 |
| 16-Jan-2020 |
Borislav Petkov <bp@suse.de> |
EDAC/mce_amd: Make fam_ops static global
... and do not kmalloc a three-pointer struct. Which simplifies mce_amd_init() a bit.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de> Li
EDAC/mce_amd: Make fam_ops static global
... and do not kmalloc a three-pointer struct. Which simplifies mce_amd_init() a bit.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200116163403.GF27148@zn.tnic
show more ...
|
Revision tags: v5.4.12, v5.4.11 |
|
#
9f6aef86 |
| 09-Jan-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC/mce_amd: Always load on SMCA systems
MCA error decoding on SMCA systems is not dependent on family. Return success early if the system supports the SMCA feature.
Signed-off-by: Yazen Ghannam <
EDAC/mce_amd: Always load on SMCA systems
MCA error decoding on SMCA systems is not dependent on family. Return success early if the system supports the SMCA feature.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200110015651.14887-3-Yazen.Ghannam@amd.com
show more ...
|
#
89a76171 |
| 09-Jan-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Add new Load Store unit McaType
Add support for a new version of the Load Store unit bank type as indicated by its McaType value, which will be present in future SMCA syst
x86/MCE/AMD, EDAC/mce_amd: Add new Load Store unit McaType
Add support for a new version of the Load Store unit bank type as indicated by its McaType value, which will be present in future SMCA systems.
Add the new (HWID, MCATYPE) tuple. Reuse the same name, since this is logically the same to the user.
Also, add the new error descriptions to edac_mce_amd.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200110015651.14887-2-Yazen.Ghannam@amd.com
show more ...
|
Revision tags: v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6, v5.1.5, v5.1.4 |
|
#
09c434b8 |
| 19-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Add SPDX license identifier for more missed files
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have MODULE_LICENCE("GPL*") inside which
treewide: Add SPDX license identifier for more missed files
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have MODULE_LICENCE("GPL*") inside which was used in the initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5 |
|
#
71a84402 |
| 25-Mar-2019 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models
AMD family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and
x86/MCE/AMD: Don't report L1 BTB MCA errors on some family 17h models
AMD family 17h Models 10h-2Fh may report a high number of L1 BTB MCA errors under certain conditions. The errors are benign and can safely be ignored. However, the high error rate may cause the MCA threshold counter to overflow causing a high rate of thresholding interrupts.
In addition, users may see the errors reported through the AMD MCE decoder module, even with the interrupt disabled, due to MCA polling.
Clear the "Counter Present" bit in the Instruction Fetch bank's MCA_MISC0 register. This will prevent enabling MCA thresholding on this bank which will prevent the high interrupt rate due to this error.
Define an AMD-specific function to filter these errors from the MCE event pool so that they don't get reported during early boot.
Rename filter function in EDAC/mce_amd to avoid a naming conflict, while at it.
[ bp: Move function prototype to the internal header and massage/cleanup, fix typos. ]
Reported-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "clemej@gmail.com" <clemej@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Shirish S <Shirish.S@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Cc: <stable@vger.kernel.org> # 5.0.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models Cc: <stable@vger.kernel.org> # 5.0.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk Cc: <stable@vger.kernel.org> # 5.0.x: 9308fd407455: x86/MCE: Group AMD function prototypes in <asm/mce.h> Cc: <stable@vger.kernel.org> # 5.0.x Link: https://lkml.kernel.org/r/20190325163410.171021-2-Yazen.Ghannam@amd.com
show more ...
|
Revision tags: v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22 |
|
#
a0bcd3c0 |
| 12-Feb-2019 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC/mce_amd: Decode MCA_STATUS in bit definition order
Sort the MCA_STATUS bits in decode output to follow how they are defined in the register.
The order is as follows:
Bit | Decode --------
EDAC/mce_amd: Decode MCA_STATUS in bit definition order
Sort the MCA_STATUS bits in decode output to follow how they are defined in the register.
The order is as follows:
Bit | Decode ------------ 62 | Over 61 | UC 59 | MiscV 58 | AddrV 57 | PCC 55 | TCC 53 | SyndV 46 | CECC 45 | UECC 44 | Deferred 43 | Poison 40 | Scrub
[ bp: Massage a bit. ]
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20190212212417.107049-2-Yazen.Ghannam@amd.com
show more ...
|
#
3f4da372 |
| 12-Feb-2019 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit
Previous AMD systems have had a bit in MCA_STATUS to indicate that an error was detected on a scrub operation. However, this bit was defined differently wi
EDAC/mce_amd: Decode MCA_STATUS[Scrub] bit
Previous AMD systems have had a bit in MCA_STATUS to indicate that an error was detected on a scrub operation. However, this bit was defined differently within different banks and families/models.
Starting with Family 17h, MCA_STATUS[40] is either Reserved/Read-as-Zero or defined as "Scrub", for all MCA banks and CPU models. Therefore, this bit can be defined as the "Scrub" bit.
Define MCA_STATUS[40] as "Scrub" and decode it in the AMD MCE decoding module for Family 17h and newer systems.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morse <james.morse@arm.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190212212417.107049-1-Yazen.Ghannam@amd.com
show more ...
|
Revision tags: v4.19.21, v4.19.20 |
|
#
1c1522d3 |
| 01-Feb-2019 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC, mce_amd: Print ExtErrorCode and description on a single line
Save a log line by printing the extended error code and the description on a single line. This is similar to how errors are printed
EDAC, mce_amd: Print ExtErrorCode and description on a single line
Save a log line by printing the extended error code and the description on a single line. This is similar to how errors are printed in other subsystems, e.g. "#, description". If we don't have a valid description then only the number/code is printed.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20190201225534.8177-6-Yazen.Ghannam@amd.com
show more ...
|
#
e03447ee |
| 01-Feb-2019 |
Yazen Ghannam <yazen.ghannam@amd.com> |
EDAC, mce_amd: Match error descriptions to latest documentation
Update the error descriptions to match the latest documentation for easier searching. In some cases the changes are small and in other
EDAC, mce_amd: Match error descriptions to latest documentation
Update the error descriptions to match the latest documentation for easier searching. In some cases the changes are small and in other cases the changes may be total rewording of the description.
No functional changes.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: x86@kernel.org Link: https://lkml.kernel.org/r/20190201225534.8177-5-Yazen.Ghannam@amd.com
show more ...
|