Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45 |
|
#
670c04ad |
| 09-Aug-2023 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86/apic: Nuke ack_APIC_irq()
Yet another wrapper of a wrapper gone along with the outdated comment that this compiles to a single instruction.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> S
x86/apic: Nuke ack_APIC_irq()
Yet another wrapper of a wrapper gone along with the outdated comment that this compiles to a single instruction.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Tested-by: Juergen Gross <jgross@suse.com> # Xen PV (dom0 and unpriv. guest)
show more ...
|
Revision tags: v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46 |
|
#
3ba2e833 |
| 06-Jun-2022 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
AMD systems from Family 10h to 16h share MCA bank 4 across multiple CPUs. Therefore, the threshold_bank structure for ba
x86/MCE/AMD: Decrement threshold_bank refcount when removing threshold blocks
AMD systems from Family 10h to 16h share MCA bank 4 across multiple CPUs. Therefore, the threshold_bank structure for bank 4, and its threshold_block structures, will be initialized once at boot time. And the kobject for the shared bank will be added to each of the CPUs that share it. Furthermore, the threshold_blocks for the shared bank will be added again to the bank's kobject. These additions will increase the refcount for the bank's kobject.
For example, a shared bank with two blocks and shared across two CPUs will be set up like this:
CPU0 init bank create and add; bank refcount = 1; threshold_create_bank() block 0 init and add; bank refcount = 2; allocate_threshold_blocks() block 1 init and add; bank refcount = 3; allocate_threshold_blocks() CPU1 init bank add; bank refcount = 3; threshold_create_bank() block 0 add; bank refcount = 4; __threshold_add_blocks() block 1 add; bank refcount = 5; __threshold_add_blocks()
Currently in threshold_remove_bank(), if the bank is shared then __threshold_remove_blocks() is called. Here the shared bank's kobject and the bank's blocks' kobjects are deleted. This is done on the first call even while the structures are still shared. Subsequent calls from other CPUs that share the structures will attempt to delete the kobjects.
During kobject_del(), kobject->sd is removed. If the kobject is not part of a kset with default_groups, then subsequent kobject_del() calls seem safe even with kobject->sd == NULL.
Originally, the AMD MCA thresholding structures did not use default_groups. And so the above behavior was not apparent.
However, a recent change implemented default_groups for the thresholding structures. Therefore, kobject_del() will go down the sysfs_remove_groups() code path. In this case, the first kobject_del() may succeed and remove kobject->sd. But subsequent kobject_del() calls will give a WARNing in kernfs_remove_by_name_ns() since kobject->sd == NULL.
Use kobject_put() on the shared bank's kobject when "removing" blocks. This decrements the bank's refcount while keeping kobjects enabled until the bank is no longer shared. At that point, kobject_put() will be called on the blocks which drives their refcount to 0 and deletes them and also decrementing the bank's refcount. And finally kobject_put() will be called on the bank driving its refcount to 0 and deleting it.
The same example above:
CPU1 shutdown bank is shared; bank refcount = 5; threshold_remove_bank() block 0 put parent bank; bank refcount = 4; __threshold_remove_blocks() block 1 put parent bank; bank refcount = 3; __threshold_remove_blocks() CPU0 shutdown bank is no longer shared; bank refcount = 3; threshold_remove_bank() block 0 put block; bank refcount = 2; deallocate_threshold_blocks() block 1 put block; bank refcount = 1; deallocate_threshold_blocks() put bank; bank refcount = 0; threshold_remove_bank()
Fixes: 7f99cb5e6039 ("x86/CPU/AMD: Use default_groups in kobj_type") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/alpine.LRH.2.02.2205301145540.25840@file01.intranet.prod.int.rdu2.redhat.com
show more ...
|
#
c35977b0 |
| 15-May-2023 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Decode UMC_V2 ECC errors
The MI200 (Aldebaran) series of devices introduced a new SMCA bank type for Unified Memory Controllers. The MCE subsystem already has support for
x86/MCE/AMD, EDAC/mce_amd: Decode UMC_V2 ECC errors
The MI200 (Aldebaran) series of devices introduced a new SMCA bank type for Unified Memory Controllers. The MCE subsystem already has support for this new type. The MCE decoder module will decode the common MCA error information for the new bank type, but it will not pass the information to the AMD64 EDAC module for detailed memory error decoding.
Have the MCE decoder module recognize the new bank type as an SMCA UMC memory error and pass the MCA information to AMD64 EDAC.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230515113537.1052146-3-muralimk@amd.com
show more ...
|
#
4c1cdec3 |
| 27-Jan-2023 |
Muralidhara M K <muralimk@amd.com> |
x86/MCE/AMD: Use an u64 for bank_map
Thee maximum number of MCA banks is 64 (MAX_NR_BANKS), see
a0bc32b3cacf ("x86/mce: Increase maximum number of banks to 64").
However, the bank_map which cont
x86/MCE/AMD: Use an u64 for bank_map
Thee maximum number of MCA banks is 64 (MAX_NR_BANKS), see
a0bc32b3cacf ("x86/mce: Increase maximum number of banks to 64").
However, the bank_map which contains a bitfield of which banks to initialize is of type unsigned int and that overflows when those bit numbers are >= 32, leading to UBSAN complaining correctly:
UBSAN: shift-out-of-bounds in arch/x86/kernel/cpu/mce/amd.c:1365:38 shift exponent 32 is too large for 32-bit type 'int'
Change the bank_map to a u64 and use the proper BIT_ULL() macro when modifying bits in there.
[ bp: Rewrite commit message. ]
Fixes: a0bc32b3cacf ("x86/mce: Increase maximum number of banks to 64") Signed-off-by: Muralidhara M K <muralimk@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230127151601.1068324-1-muralimk@amd.com
show more ...
|
#
7214b32b |
| 16-Feb-2023 |
Thomas Weißschuh <linux@weissschuh.net> |
x86/MCE/AMD: Make kobj_type structure constant
Since
ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.
Take advantage of this to
x86/MCE/AMD: Make kobj_type structure constant
Since
ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.
Take advantage of this to constify the structure definition to prevent modification at runtime.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230217-kobj_type-mce-amd-v1-1-40ef94816444@weissschuh.net
show more ...
|
#
fcd343a2 |
| 06-Dec-2022 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
x86/mce: Add support for Extended Physical Address MCA changes
Newer AMD CPUs support more physical address bits.
That is, the MCA_ADDR registers on Scalable MCA systems contain the ErrorAddr in bi
x86/mce: Add support for Extended Physical Address MCA changes
Newer AMD CPUs support more physical address bits.
That is, the MCA_ADDR registers on Scalable MCA systems contain the ErrorAddr in bits [56:0] instead of [55:0]. Hence, the existing LSB field from bits [61:56] in MCA_ADDR must be moved around to accommodate the larger ErrorAddr size.
MCA_CONFIG[McaLsbInStatusSupported] indicates this change. If set, the LSB field will be found in MCA_STATUS rather than MCA_ADDR.
Each logical CPU has unique MCA bank in hardware and is not shared with other logical CPUs. Additionally, on SMCA systems, each feature bit may be different for each bank within same logical CPU.
Check for MCA_CONFIG[McaLsbInStatusSupported] for each MCA bank and for each CPU.
Additionally, all MCA banks do not support maximum ErrorAddr bits in MCA_ADDR. Some banks might support fewer bits but the remaining bits are marked as reserved.
[ Yazen: Rebased and fixed up formatting. bp: Massage comments. ]
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20221206173607.1185907-5-yazen.ghannam@amd.com
show more ...
|
#
2117654e |
| 06-Dec-2022 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
x86/mce: Define a function to extract ErrorAddr from MCA_ADDR
Move MCA_ADDR[ErrorAddr] extraction into a separate helper function. This will be further refactored to support extended ErrorAddr bits
x86/mce: Define a function to extract ErrorAddr from MCA_ADDR
Move MCA_ADDR[ErrorAddr] extraction into a separate helper function. This will be further refactored to support extended ErrorAddr bits in MCA_ADDR in newer AMD CPUs.
[ bp: Massage. ]
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/all/20220225193342.215780-3-Smita.KoralahalliChannabasappa@amd.com/
show more ...
|
#
bc1b705b |
| 21-Jun-2022 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD: Clear DFR errors found in THR handler
AMD's MCA Thresholding feature counts errors of all severity levels, not just correctable errors. If a deferred error causes the threshold limit to
x86/MCE/AMD: Clear DFR errors found in THR handler
AMD's MCA Thresholding feature counts errors of all severity levels, not just correctable errors. If a deferred error causes the threshold limit to be reached (it was the error that caused the overflow), then both a deferred error interrupt and a thresholding interrupt will be triggered.
The order of the interrupts is not guaranteed. If the threshold interrupt handler is executed first, then it will clear MCA_STATUS for the error. It will not check or clear MCA_DESTAT which also holds a copy of the deferred error. When the deferred error interrupt handler runs it will not find an error in MCA_STATUS, but it will find the error in MCA_DESTAT. This will cause two errors to be logged.
Check for deferred errors when handling a threshold interrupt. If a bank contains a deferred error, then clear the bank's MCA_DESTAT register.
Define a new helper function to do the deferred error check and clearing of MCA_DESTAT.
[ bp: Simplify, convert comment to passive voice. ]
Fixes: 37d43acfd79f ("x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220621155943.33623-1-yazen.ghannam@amd.com
show more ...
|
Revision tags: v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
e5f28623 |
| 29-Mar-2022 |
Ammar Faizi <ammarfaizi2@gnuweeb.org> |
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
In mce_threshold_create_device(), if threshold_create_bank() fails, the previously allocated threshold banks array @bp will be leaked
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
In mce_threshold_create_device(), if threshold_create_bank() fails, the previously allocated threshold banks array @bp will be leaked because the call to mce_threshold_remove_device() will not free it.
This happens because mce_threshold_remove_device() fetches the pointer through the threshold_banks per-CPU variable but bp is written there only after the bank creation is successful, and not before, when threshold_create_bank() fails.
Add a helper which unwinds all the bank creation work previously done and pass into it the previously allocated threshold banks array for freeing.
[ bp: Massage. ]
Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-developed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Co-developed-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220329104705.65256-3-ammarfaizi2@gnuweeb.org
show more ...
|
Revision tags: v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16 |
|
#
7f99cb5e |
| 06-Jan-2022 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
x86/CPU/AMD: Use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the AMD mc
x86/CPU/AMD: Use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the AMD mce sysfs code to use default_groups field which has been the preferred way since
aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type")
so that the obsolete default_attrs field can be removed soon.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lore.kernel.org/r/20220106103537.3663852-1-gregkh@linuxfoundation.org
show more ...
|
#
1f52b0ab |
| 17-Jan-2022 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD: Allow thresholding interface updates after init
Changes to the AMD Thresholding sysfs code prevents sysfs writes from updating the underlying registers once CPU init is completed, i.e.
x86/MCE/AMD: Allow thresholding interface updates after init
Changes to the AMD Thresholding sysfs code prevents sysfs writes from updating the underlying registers once CPU init is completed, i.e. "threshold_banks" is set.
Allow the registers to be updated if the thresholding interface is already initialized or if in the init path. Use the "set_lvt_off" value to indicate if running in the init path, since this value is only set during init.
Fixes: a037f3ca0ea0 ("x86/mce/amd: Make threshold bank setting hotplug robust") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220117161328.19148-1-yazen.ghannam@amd.com
show more ...
|
Revision tags: v5.15.10 |
|
#
91f75eb4 |
| 16-Dec-2021 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
AMD systems currently lay out MCA bank types such that the type of bank number "i" is either the same across all CPUs or is R
x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration
AMD systems currently lay out MCA bank types such that the type of bank number "i" is either the same across all CPUs or is Reserved/Read-as-Zero.
For example:
Bank # | CPUx | CPUy 0 LS LS 1 RAZ UMC 2 CS CS 3 SMU RAZ
Future AMD systems will lay out MCA bank types such that the type of bank number "i" may be different across CPUs.
For example:
Bank # | CPUx | CPUy 0 LS LS 1 RAZ UMC 2 CS NBIO 3 SMU RAZ
Change the structures that cache MCA bank types to be per-CPU and update smca_get_bank_type() to handle this change.
Move some SMCA-specific structures to amd.c from mce.h, since they no longer need to be global.
Break out the "count" for bank types from struct smca_hwid, since this should provide a per-CPU count rather than a system-wide count.
Apply the "const" qualifier to the struct smca_hwid_mcatypes array. The values in this array should not change at runtime.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-3-yazen.ghannam@amd.com
show more ...
|
#
5176a93a |
| 16-Dec-2021 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add HWID and McaType values for new SMCA bank types, and add their error descriptions to edac_mce_amd.
The "PHY" bank types all have the same erro
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add HWID and McaType values for new SMCA bank types, and add their error descriptions to edac_mce_amd.
The "PHY" bank types all have the same error descriptions, and the NBIF and SHUB bank types have the same error descriptions. So reuse the same arrays where appropriate.
[ bp: Remove useless comments over hwid types. ]
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20211216162905.4132657-2-yazen.ghannam@amd.com
show more ...
|
Revision tags: v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15 |
|
#
0b746e8c |
| 28-Oct-2021 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/amd64: Move address translation to AMD64 EDAC
The address translation code used for current AMD systems is non-architectural. So move it to EDAC.
Signed-off-by: Yazen Ghannam <yaz
x86/MCE/AMD, EDAC/amd64: Move address translation to AMD64 EDAC
The address translation code used for current AMD systems is non-architectural. So move it to EDAC.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211028175728.121452-2-yazen.ghannam@amd.com
show more ...
|
Revision tags: v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27 |
|
#
f38ce910 |
| 27-Mar-2021 |
Mukul Joshi <mukul.joshi@amd.com> |
x86/MCE/AMD: Export smca_get_bank_type symbol
Export smca_get_bank_type for use in the AMD GPU driver to determine MCA bank while handling correctable and uncorrectable errors in GPU UMC.
Signed-of
x86/MCE/AMD: Export smca_get_bank_type symbol
Export smca_get_bank_type for use in the AMD GPU driver to determine MCA bank while handling correctable and uncorrectable errors in GPU UMC.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
8121b8f9 |
| 02-Sep-2021 |
Borislav Petkov <bp@suse.de> |
x86/mce: Get rid of msr_ops
Avoid having indirect calls and use a normal function which returns the proper MSR address based on ->smca setting.
No functional changes.
Signed-off-by: Borislav Petko
x86/mce: Get rid of msr_ops
Avoid having indirect calls and use a normal function which returns the proper MSR address based on ->smca setting.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20210922165101.18951-4-bp@alien8.de
show more ...
|
#
cc0dd445 |
| 29-Mar-2022 |
Ammar Faizi <ammarfaizi2@gnuweeb.org> |
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
commit e5f28623ceb103e13fc3d7bd45edf9818b227fd0 upstream.
In mce_threshold_create_device(), if threshold_create_bank() fails, the pre
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
commit e5f28623ceb103e13fc3d7bd45edf9818b227fd0 upstream.
In mce_threshold_create_device(), if threshold_create_bank() fails, the previously allocated threshold banks array @bp will be leaked because the call to mce_threshold_remove_device() will not free it.
This happens because mce_threshold_remove_device() fetches the pointer through the threshold_banks per-CPU variable but bp is written there only after the bank creation is successful, and not before, when threshold_create_bank() fails.
Add a helper which unwinds all the bank creation work previously done and pass into it the previously allocated threshold banks array for freeing.
[ bp: Massage. ]
Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-developed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Co-developed-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220329104705.65256-3-ammarfaizi2@gnuweeb.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
cc0dd445 |
| 29-Mar-2022 |
Ammar Faizi <ammarfaizi2@gnuweeb.org> |
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
commit e5f28623ceb103e13fc3d7bd45edf9818b227fd0 upstream.
In mce_threshold_create_device(), if threshold_create_bank() fails, the pre
x86/MCE/AMD: Fix memory leak when threshold_create_bank() fails
commit e5f28623ceb103e13fc3d7bd45edf9818b227fd0 upstream.
In mce_threshold_create_device(), if threshold_create_bank() fails, the previously allocated threshold banks array @bp will be leaked because the call to mce_threshold_remove_device() will not free it.
This happens because mce_threshold_remove_device() fetches the pointer through the threshold_banks per-CPU variable but bp is written there only after the bank creation is successful, and not before, when threshold_create_bank() fails.
Add a helper which unwinds all the bank creation work previously done and pass into it the previously allocated threshold banks array for freeing.
[ bp: Massage. ]
Fixes: 6458de97fc15 ("x86/mce/amd: Straighten CPU hotplug path") Co-developed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Co-developed-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220329104705.65256-3-ammarfaizi2@gnuweeb.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
98ccfec9 |
| 17-Jan-2022 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD: Allow thresholding interface updates after init
commit 1f52b0aba6fd37653416375cb8a1ca673acf8d5f upstream.
Changes to the AMD Thresholding sysfs code prevents sysfs writes from updating
x86/MCE/AMD: Allow thresholding interface updates after init
commit 1f52b0aba6fd37653416375cb8a1ca673acf8d5f upstream.
Changes to the AMD Thresholding sysfs code prevents sysfs writes from updating the underlying registers once CPU init is completed, i.e. "threshold_banks" is set.
Allow the registers to be updated if the thresholding interface is already initialized or if in the init path. Use the "set_lvt_off" value to indicate if running in the init path, since this value is only set during init.
Fixes: a037f3ca0ea0 ("x86/mce/amd: Make threshold bank setting hotplug robust") Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220117161328.19148-1-yazen.ghannam@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
94a311ce |
| 26-May-2021 |
Muralidhara M K <muralimk@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add the (HWID, MCATYPE) tuples and names for new SMCA bank types.
Also, add their respective error descriptions to the MCE decoding module edac_mc
x86/MCE/AMD, EDAC/mce_amd: Add new SMCA bank types
Add the (HWID, MCATYPE) tuples and names for new SMCA bank types.
Also, add their respective error descriptions to the MCE decoding module edac_mce_amd. Also while at it, optimize the string names for some SMCA banks.
[ bp: Drop repeated comments, explain why UMC_V2 is a separate entry. ]
Signed-off-by: Muralidhara M K <muralimk@amd.com> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lkml.kernel.org/r/20210526164601.66228-1-nchatrad@amd.com
show more ...
|
Revision tags: v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10 |
|
#
db970bd2 |
| 09-Nov-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/CPU/AMD: Remove amd_get_nb_id()
The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in
x86/CPU/AMD: Remove amd_get_nb_id()
The Last Level Cache ID is returned by amd_get_nb_id(). In practice, this value is the same as the AMD NodeId for callers of this function. The NodeId is saved in struct cpuinfo_x86.cpu_die_id.
Replace calls to amd_get_nb_id() with the logical CPU's cpu_die_id and remove the function.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20201109210659.754018-3-Yazen.Ghannam@amd.com
show more ...
|
Revision tags: v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53 |
|
#
368d1887 |
| 20-Jul-2020 |
Yazen Ghannam <yazen.ghannam@amd.com> |
x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid err
x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
The Extended Error Code Bitmap (xec_bitmap) for a Scalable MCA bank type was intended to be used by the kernel to filter out invalid error codes on a system. However, this is unnecessary after a few product releases because the hardware will only report valid error codes. Thus, there's no need for it with future systems.
Remove the xec_bitmap field and all references to it.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200720145353.43924-1-Yazen.Ghannam@amd.com
show more ...
|
Revision tags: v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43 |
|
#
720909a7 |
| 21-May-2020 |
Thomas Gleixner <tglx@linutronix.de> |
x86/entry: Convert various system vectors
Convert various system vectors to IDTENTRY_SYSVEC:
- Implement the C entry point with DEFINE_IDTENTRY_SYSVEC - Emit the ASM stub with DECLARE_IDTENTRY_
x86/entry: Convert various system vectors
Convert various system vectors to IDTENTRY_SYSVEC:
- Implement the C entry point with DEFINE_IDTENTRY_SYSVEC - Emit the ASM stub with DECLARE_IDTENTRY_SYSVEC - Remove the ASM idtentries in 64-bit - Remove the BUILD_INTERRUPT entries in 32-bit - Remove the old prototypes
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20200521202119.464812973@linutronix.de
show more ...
|
Revision tags: v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31 |
|
#
3e0fdec8 |
| 07-Apr-2020 |
Borislav Petkov <bp@suse.de> |
x86/mce/amd, edac: Remove report_gart_errors
... because no one should be interested in spurious MCEs anyway. Make the filtering unconditional and move it to amd_filter_mce().
Signed-off-by: Borisl
x86/mce/amd, edac: Remove report_gart_errors
... because no one should be interested in spurious MCEs anyway. Make the filtering unconditional and move it to amd_filter_mce().
Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200407163414.18058-2-bp@alien8.de
show more ...
|
Revision tags: v5.4.30, v5.4.29 |
|
#
a037f3ca |
| 31-Mar-2020 |
Thomas Gleixner <tglx@linutronix.de> |
x86/mce/amd: Make threshold bank setting hotplug robust
Handle the cases when the CPU goes offline before the bank setting/reading happens.
[ bp: Write commit message. ]
Signed-off-by: Thomas Gle
x86/mce/amd: Make threshold bank setting hotplug robust
Handle the cases when the CPU goes offline before the bank setting/reading happens.
[ bp: Write commit message. ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200403161943.1458-8-bp@alien8.de
show more ...
|