Revision tags: v6.6.26, v6.6.25, v6.6.24, v6.6.23 |
|
#
55ed6c47 |
| 25-Mar-2024 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/lbr: Use freeze based on availability
[ Upstream commit 598c2fafc06fe5c56a1a415fb7b544b31453d637 ]
Currently, the LBR code assumes that LBR Freeze is supported on all processors when X
perf/x86/amd/lbr: Use freeze based on availability
[ Upstream commit 598c2fafc06fe5c56a1a415fb7b544b31453d637 ]
Currently, the LBR code assumes that LBR Freeze is supported on all processors when X86_FEATURE_AMD_LBR_V2 is available i.e. CPUID leaf 0x80000022[EAX] bit 1 is set. This is incorrect as the availability of the feature is additionally dependent on CPUID leaf 0x80000022[EAX] bit 2 being set, which may not be set for all Zen 4 processors.
Define a new feature bit for LBR and PMC freeze and set the freeze enable bit (FLBRI) in DebugCtl (MSR 0x1d9) conditionally.
It should still be possible to use LBR without freeze for profile-guided optimization of user programs by using an user-only branch filter during profiling. When the user-only filter is enabled, branches are no longer recorded after the transition to CPL 0 upon PMI arrival. When branch entries are read in the PMI handler, the branch stack does not change.
E.g.
$ perf record -j any,u -e ex_ret_brn_tkn ./workload
Since the feature bit is visible under flags in /proc/cpuinfo, it can be used to determine the feasibility of use-cases which require LBR Freeze to be supported by the hardware such as profile-guided optimization of kernels.
Fixes: ca5b7c0d9621 ("perf/x86/amd/lbr: Add LbrExtV2 branch record support") Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/69a453c97cfd11c6f2584b19f937fe6df741510f.1711091584.git.sandipan.das@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
#
69fe5f17 |
| 25-Mar-2024 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later
[ Upstream commit c7b2edd8377be983442c1344cb940cd2ac21b601 ]
AMD processors based on Zen 2 and later microarchitectures
perf/x86/amd/core: Update and fix stalled-cycles-* events for Zen 2 and later
[ Upstream commit c7b2edd8377be983442c1344cb940cd2ac21b601 ]
AMD processors based on Zen 2 and later microarchitectures do not support PMCx087 (instruction pipe stalls) which is used as the backing event for "stalled-cycles-frontend" and "stalled-cycles-backend".
Use PMCx0A9 (cycles where micro-op queue is empty) instead to count frontend stalls and remove the entry for backend stalls since there is no direct replacement.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 3fe3331bb285 ("perf/x86/amd: Add event map for AMD Family 17h") Link: https://lore.kernel.org/r/03d7fc8fa2a28f9be732116009025bdec1b3ec97.1711352180.git.sandipan.das@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.16, v6.6.15 |
|
#
7be89bd6 |
| 29-Jan-2024 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Avoid register reset when CPU is dead
[ Upstream commit ad8c91282c95f801c37812d59d2d9eba6899b384 ]
When bringing a CPU online, some of the PMC and LBR related registers are reset
perf/x86/amd/core: Avoid register reset when CPU is dead
[ Upstream commit ad8c91282c95f801c37812d59d2d9eba6899b384 ]
When bringing a CPU online, some of the PMC and LBR related registers are reset. The same is done when a CPU is taken offline although that is unnecessary. This currently happens in the "cpu_dead" callback which is also incorrect as the callback runs on a control CPU instead of the one that is being taken offline. This also affects hibernation and suspend to RAM on some platforms as reported in the link below.
Fixes: 21d59e3e2c40 ("perf/x86/amd/core: Detect PerfMonV2 support") Reported-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/550a026764342cf7e5812680e3e2b91fe662b5ac.1706526029.git.sandipan.das@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4 |
|
#
599522d9 |
| 14-Sep-2023 |
Breno Leitao <leitao@debian.org> |
perf/x86/amd: Do not WARN() on every IRQ
Zen 4 systems running buggy microcode can hit a WARN_ON() in the PMI handler, as shown below, several times while perf runs. A simple `perf top` run is enoug
perf/x86/amd: Do not WARN() on every IRQ
Zen 4 systems running buggy microcode can hit a WARN_ON() in the PMI handler, as shown below, several times while perf runs. A simple `perf top` run is enough to render the system unusable:
WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0
This happens because the Performance Counter Global Status Register (PerfCntGlobalStatus) has one or more bits set which are considered reserved according to the "AMD64 Architecture Programmer’s Manual, Volume 2: System Programming, 24593":
https://www.amd.com/system/files/TechDocs/24593.pdf
To make this less intrusive, warn just once if any reserved bit is set and prompt the user to update the microcode. Also sanitize the value to what the code is handling, so that the overflow events continue to be handled for the number of counters that are known to be sane.
Going forward, the following microcode patch levels are recommended for Zen 4 processors in order to avoid such issues with reserved bits:
Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a10113e Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a10123e Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116 Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00212
Commit f2eb058afc57 ("linux-firmware: Update AMD cpu microcode") from the linux-firmware tree has binaries that meet the minimum required patch levels.
[ sandipan: - add message to prompt users to update microcode - rework commit message and call out required microcode levels ]
Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling") Reported-by: Jirka Hladky <jhladky@redhat.com> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/all/3540f985652f41041e54ee82aa53e7dbd55739ae.1694696888.git.sandipan.das@amd.com/
show more ...
|
#
23d2626b |
| 14-Sep-2023 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Fix overflow reset on hotplug
Kernels older than v5.19 do not support PerfMonV2 and the PMI handler does not clear the overflow bits of the PerfCntrGlobalStatus register. Because
perf/x86/amd/core: Fix overflow reset on hotplug
Kernels older than v5.19 do not support PerfMonV2 and the PMI handler does not clear the overflow bits of the PerfCntrGlobalStatus register. Because of this, loading a recent kernel using kexec from an older kernel can result in inconsistent register states on Zen 4 systems.
The PMI handler of the new kernel gets confused and shows a warning when an overflow occurs because some of the overflow bits are set even if the corresponding counters are inactive. These are remnants from overflows that were handled by the older kernel.
During CPU hotplug, the PerfCntrGlobalCtl and PerfCntrGlobalStatus registers should always be cleared for PerfMonV2-capable processors. However, a condition used for NB event constaints applicable only to older processors currently prevents this from happening. Move the reset sequence to an appropriate place and also clear the LBR Freeze bit.
Fixes: 21d59e3e2c40 ("perf/x86/amd/core: Detect PerfMonV2 support") Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/882a87511af40792ba69bb0e9026f19a2e71e8a3.1694696888.git.sandipan.das@amd.com
show more ...
|
Revision tags: v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28 |
|
#
2fad201f |
| 04-May-2023 |
Ravi Bangoria <ravi.bangoria@amd.com> |
perf/ibs: Fix interface via core pmu events
Although, IBS pmus can be invoked via their own interface, indirect IBS invocation via core pmu events is also supported with fixed set of events: cpu-cyc
perf/ibs: Fix interface via core pmu events
Although, IBS pmus can be invoked via their own interface, indirect IBS invocation via core pmu events is also supported with fixed set of events: cpu-cycles:p, r076:p (same as cpu-cycles:p) and r0C1:p (micro-ops) for user convenience.
This indirect IBS invocation is broken since commit 66d258c5b048 ("perf/core: Optimize perf_init_event()"), which added RAW pmu under 'pmu_idr' list and thus if event_init() fails with RAW pmu, it started returning error instead of trying other pmus.
Forward precise events from core pmu to IBS by overwriting 'type' and 'config' in the kernel copy of perf_event_attr. Overwriting will cause perf_init_event() to retry with updated 'type' and 'config', which will automatically forward event to IBS pmu.
Without patch: $ sudo ./perf record -C 0 -e r076:p -- sleep 1 Error: The r076:p event is not supported.
With patch: $ sudo ./perf record -C 0 -e r076:p -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.341 MB perf.data (37 samples) ]
Fixes: 66d258c5b048 ("perf/core: Optimize perf_init_event()") Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230504110003.2548-3-ravi.bangoria@amd.com
show more ...
|
Revision tags: v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21 |
|
#
263f5eca |
| 21-Mar-2023 |
Breno Leitao <leitao@debian.org> |
perf/x86/amd/core: Always clear status for idx
The variable 'status' (which contains the unhandled overflow bits) is not being properly masked in some cases, displaying the following warning:
WAR
perf/x86/amd/core: Always clear status for idx
The variable 'status' (which contains the unhandled overflow bits) is not being properly masked in some cases, displaying the following warning:
WARNING: CPU: 156 PID: 475601 at arch/x86/events/amd/core.c:972 amd_pmu_v2_handle_irq+0x216/0x270
This seems to be happening because the loop is being continued before the status bit being unset, in case x86_perf_event_set_period() returns 0. This is also causing an inconsistency because the "handled" counter is incremented, but the status bit is not cleaned.
Move the bit cleaning together above, together when the "handled" counter is incremented.
Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sandipan Das <sandipan.das@amd.com> Link: https://lore.kernel.org/r/20230321113338.1669660-1-leitao@debian.org
show more ...
|
Revision tags: v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7 |
|
#
eb55b455 |
| 18-Jan-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf/core: Add perf_sample_save_brstack() helper
When we saves the branch stack to the perf sample data, we needs to update the sample flags and the dynamic size. To make sure this is done consiste
perf/core: Add perf_sample_save_brstack() helper
When we saves the branch stack to the perf sample data, we needs to update the sample flags and the dynamic size. To make sure this is done consistently, add the perf_sample_save_brstack() helper and convert all call sites.
Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230118060559.615653-5-namhyung@kernel.org
show more ...
|
Revision tags: v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11 |
|
#
08245672 |
| 02-Dec-2022 |
Colin Ian King <colin.i.king@gmail.com> |
perf/x86/amd: fix potential integer overflow on shift of a int
The left shift of int 32 bit integer constant 1 is evaluated using 32 bit arithmetic and then passed as a 64 bit function argument. In
perf/x86/amd: fix potential integer overflow on shift of a int
The left shift of int 32 bit integer constant 1 is evaluated using 32 bit arithmetic and then passed as a 64 bit function argument. In the case where i is 32 or more this can lead to an overflow. Avoid this by shifting using the BIT_ULL macro instead.
Fixes: 471af006a747 ("perf/x86/amd: Constrain Large Increment per Cycle events") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Kim Phillips <kim.phillips@amd.com> Link: https://lore.kernel.org/r/20221202135149.1797974-1-colin.i.king@gmail.com
show more ...
|
Revision tags: v6.0.10, v5.15.80, v6.0.9, v5.15.79 |
|
#
baa014b9 |
| 13-Nov-2022 |
Ravi Bangoria <ravi.bangoria@amd.com> |
perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling
amd_pmu_enable_all() does:
if (!test_bit(idx, cpuc->active_mask)) continue;
amd_pm
perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling
amd_pmu_enable_all() does:
if (!test_bit(idx, cpuc->active_mask)) continue;
amd_pmu_enable_event(cpuc->events[idx]);
A perf NMI of another event can come between these two steps. Perf NMI handler internally disables and enables _all_ events, including the one which nmi-intercepted amd_pmu_enable_all() was in process of enabling. If that unintentionally enabled event has very low sampling period and causes immediate successive NMI, causing the event to be throttled, cpuc->events[idx] and cpuc->active_mask gets cleared by x86_pmu_stop(). This will result in amd_pmu_enable_event() getting called with event=NULL when amd_pmu_enable_all() resumes after handling the NMIs. This causes a kernel crash:
BUG: kernel NULL pointer dereference, address: 0000000000000198 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page [...] Call Trace: <TASK> amd_pmu_enable_all+0x68/0xb0 ctx_resched+0xd9/0x150 event_function+0xb8/0x130 ? hrtimer_start_range_ns+0x141/0x4a0 ? perf_duration_warn+0x30/0x30 remote_function+0x4d/0x60 __flush_smp_call_function_queue+0xc4/0x500 flush_smp_call_function_queue+0x11d/0x1b0 do_idle+0x18f/0x2d0 cpu_startup_entry+0x19/0x20 start_secondary+0x121/0x160 secondary_startup_64_no_verify+0xe5/0xeb </TASK>
amd_pmu_disable_all()/amd_pmu_enable_all() calls inside perf NMI handler were recently added as part of BRS enablement but I'm not sure whether we really need them. We can just disable BRS in the beginning and enable it back while returning from NMI. This will solve the issue by not enabling those events whose active_masks are set but are not yet enabled in hw pmu.
Fixes: ada543459cab ("perf/x86/amd: Add AMD Fam19h Branch Sampling support") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20221114044029.373-1-ravi.bangoria@amd.com
show more ...
|
Revision tags: v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39 |
|
#
28f0f3c4 |
| 10-May-2022 |
Peter Zijlstra <peterz@infradead.org> |
perf/x86: Change x86_pmu::limit_period signature
In preparation for making it a static_call, change the signature.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.ke
perf/x86: Change x86_pmu::limit_period signature
In preparation for making it a static_call, change the signature.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220829101321.573713839@infradead.org
show more ...
|
#
a9a931e2 |
| 01-Sep-2022 |
Kan Liang <kan.liang@linux.intel.com> |
perf: Use sample_flags for branch stack
Use the new sample_flags to indicate whether the branch stack is filled by the PMU driver.
Remove the br_stack from the perf_sample_data_init() to minimize t
perf: Use sample_flags for branch stack
Use the new sample_flags to indicate whether the branch stack is filled by the PMU driver.
Remove the br_stack from the perf_sample_data_init() to minimize the number of cache lines touched.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220901130959.1285717-4-kan.liang@linux.intel.com
show more ...
|
#
f4f925da |
| 11-Aug-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/lbr: Add LbrExtV2 hardware branch filter support
If AMD Last Branch Record Extension Version 2 (LbrExtV2) is detected, convert the requested branch filter (PERF_SAMPLE_BRANCH_* flags) t
perf/x86/amd/lbr: Add LbrExtV2 hardware branch filter support
If AMD Last Branch Record Extension Version 2 (LbrExtV2) is detected, convert the requested branch filter (PERF_SAMPLE_BRANCH_* flags) to the corresponding hardware filter value and stash it in the event data when a branch stack is requested. The hardware filter value is also saved in per-CPU areas for use during event scheduling.
Hardware filtering is provided by the LBR Branch Select register. It has bits which when set, suppress recording of the following types of branches:
* CPL = 0 (Kernel only) * CPL > 0 (Userspace only) * Conditional Branches * Near Relative Calls * Near Indirect Calls * Near Returns * Near Indirect Jumps (excluding Near Indirect Calls and Near Returns) * Near Relative Jumps (excluding Near Relative Calls) * Far Branches
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/9336af5c9785b8e14c62220fc0e6cfb10ab97de3.1660211399.git.sandipan.das@amd.com
show more ...
|
#
ca5b7c0d |
| 11-Aug-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/lbr: Add LbrExtV2 branch record support
If AMD Last Branch Record Extension Version 2 (LbrExtV2) is detected, enable it alongside LBR Freeze on PMI when an event requests branch stack i
perf/x86/amd/lbr: Add LbrExtV2 branch record support
If AMD Last Branch Record Extension Version 2 (LbrExtV2) is detected, enable it alongside LBR Freeze on PMI when an event requests branch stack i.e. PERF_SAMPLE_BRANCH_STACK.
Each branch record is represented by a pair of registers, LBR From and LBR To. The freeze feature prevents any updates to these registers once a PMC overflows. The contents remain unchanged until the freeze bit is cleared by the PMI handler.
The branch records are read and copied to sample data before unfreezing. However, only valid entries are copied. There is no additional register to denote which of the register pairs represent the top of the stack (TOS) since internal register renaming always ensures that the first pair (i.e. index 0) is the one representing the most recent branch and so on.
The LBR registers are per-thread resources and are cleared explicitly whenever a new task is scheduled in. There are no special implications on the contents of these registers when transitioning to deep C-states.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/d3b8500a3627a0d4d0259b005891ee248f248d91.1660211399.git.sandipan.das@amd.com
show more ...
|
#
703fb765 |
| 11-Aug-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/lbr: Detect LbrExtV2 support
AMD Last Branch Record Extension Version 2 (LbrExtV2) is driven by Core PMC overflows. It records recently taken branches up to the moment when the PMC over
perf/x86/amd/lbr: Detect LbrExtV2 support
AMD Last Branch Record Extension Version 2 (LbrExtV2) is driven by Core PMC overflows. It records recently taken branches up to the moment when the PMC overflow occurs.
Detect the feature during PMU initialization and set the branch stack depth using CPUID leaf 0x80000022 EBX.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/fc6e45378ada258f1bab79b0de6e05c393a8f1dd.1660211399.git.sandipan.das@amd.com
show more ...
|
#
706460a9 |
| 11-Aug-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Add generic branch record interfaces
AMD processors that are capable of recording branches support either Branch Sampling (BRS) or Last Branch Record (LBR). In preparation for add
perf/x86/amd/core: Add generic branch record interfaces
AMD processors that are capable of recording branches support either Branch Sampling (BRS) or Last Branch Record (LBR). In preparation for adding Last Branch Record Extension Version 2 (LbrExtV2) support, introduce new static calls which act as gateways to call into the feature-dependent functions based on what is available on the processor.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/b75dbc32663cb395f0d701167e952c6a6b0445a3.1660211399.git.sandipan.das@amd.com
show more ...
|
#
9603aa79 |
| 11-Aug-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Refactor branch attributes
AMD processors that are capable of recording branches support either Branch Sampling (BRS) or Last Branch Record (LBR). In preparation for adding Last B
perf/x86/amd/core: Refactor branch attributes
AMD processors that are capable of recording branches support either Branch Sampling (BRS) or Last Branch Record (LBR). In preparation for adding Last Branch Record Extension Version 2 (LbrExtV2) support, reuse the "branches" capability to advertise information about both BRS and LBR but make the "branch-brs" event exclusive to Family 19h processors that support BRS.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/ba4a4cde6db79b1c65c49834027bbdb8a915546b.1660211399.git.sandipan.das@amd.com
show more ...
|
#
b40d0156 |
| 11-Aug-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/brs: Move feature-specific functions
Move some of the Branch Sampling (BRS) specific functions out of the Core events sources and into the BRS sources in preparation for adding other me
perf/x86/amd/brs: Move feature-specific functions
Move some of the Branch Sampling (BRS) specific functions out of the Core events sources and into the BRS sources in preparation for adding other mechanisms to record branches.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/b60283b57179475d18ee242d117c335c16733693.1660211399.git.sandipan.das@amd.com
show more ...
|
#
bae19fdd |
| 18-May-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Fix reloading events for SVM
Commit 1018faa6cf23 ("perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled") addresses an issue in which the Host-Only bit in the counter
perf/x86/amd/core: Fix reloading events for SVM
Commit 1018faa6cf23 ("perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled") addresses an issue in which the Host-Only bit in the counter control registers needs to be masked off when SVM is not enabled.
The events need to be reloaded whenever SVM is enabled or disabled for a CPU and this requires the PERF_CTL registers to be reprogrammed using {enable,disable}_all(). However, PerfMonV2 variants of these functions do not reprogram the PERF_CTL registers. Hence, the legacy enable_all() function should also be called.
Fixes: 9622e67e3980 ("perf/x86/amd/core: Add PerfMonV2 counter control") Reported-by: Like Xu <likexu@tencent.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220518084327.464005-1-sandipan.das@amd.com
show more ...
|
#
3c27b0c6 |
| 10-May-2022 |
Peter Zijlstra <peterz@infradead.org> |
perf/x86/amd: Fix AMD BRS period adjustment
There's two problems with the current amd_brs_adjust_period() code:
- it isn't in fact AMD specific and wil always adjust the period;
- it adjusts the
perf/x86/amd: Fix AMD BRS period adjustment
There's two problems with the current amd_brs_adjust_period() code:
- it isn't in fact AMD specific and wil always adjust the period;
- it adjusts the period, while it should only adjust the event count, resulting in repoting a short period.
Fix this by using x86_pmu.limit_period, this makes it specific to the AMD BRS case and ensures only the event count is adjusted while the reported period is unmodified.
Fixes: ba2fe7500845 ("perf/x86/amd: Add AMD branch sampling period adjustment") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
show more ...
|
Revision tags: v5.15.38, v5.15.37, v5.15.36 |
|
#
bc469ddf |
| 21-Apr-2022 |
Zucheng Zheng <zhengzucheng@huawei.com> |
perf/x86/amd: Remove unused variable 'hwc'
'hwc' is never used in amd_pmu_enable_all(), so remove it.
Signed-off-by: Zucheng Zheng <zhengzucheng@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <p
perf/x86/amd: Remove unused variable 'hwc'
'hwc' is never used in amd_pmu_enable_all(), so remove it.
Signed-off-by: Zucheng Zheng <zhengzucheng@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220421111031.174698-1-zhengzucheng@huawei.com
show more ...
|
#
7685665c |
| 21-Apr-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Add PerfMonV2 overflow handling
If AMD Performance Monitoring Version 2 (PerfMonV2) is supported, use a new scheme to process Core PMC overflows in the NMI handler using the new g
perf/x86/amd/core: Add PerfMonV2 overflow handling
If AMD Performance Monitoring Version 2 (PerfMonV2) is supported, use a new scheme to process Core PMC overflows in the NMI handler using the new global control and status registers. This will be bypassed on unsupported hardware (x86_pmu.version < 2).
In x86_pmu_handle_irq(), overflows are detected by testing the contents of the PERF_CTR register for each active PMC in a loop. The new scheme instead inspects the overflow bits of the global status register.
The Performance Counter Global Status (PerfCntrGlobalStatus) register has overflow (PerfCntrOvfl) bits for each PMC. This is, however, a read-only MSR. To acknowledge that overflows have been processed, the NMI handler must clear the bits by writing to the PerfCntrGlobalStatusClr register.
In x86_pmu_handle_irq(), PMCs counting the same event that are started and stopped at the same time record slightly different counts due to delays in between reads from the PERF_CTR registers. This is fixed by stopping and starting the PMCs at the same before and with a single write to the Performance Counter Global Control (PerfCntrGlobalCtl) upon entering and before exiting the NMI handler.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/f20b7e4da0b0a83bdbe05857f354146623bc63ab.1650515382.git.sandipan.das@amd.com
show more ...
|
#
9622e67e |
| 21-Apr-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Add PerfMonV2 counter control
If AMD Performance Monitoring Version 2 (PerfMonV2) is supported, use a new scheme to manage the Core PMCs using the new global control and status re
perf/x86/amd/core: Add PerfMonV2 counter control
If AMD Performance Monitoring Version 2 (PerfMonV2) is supported, use a new scheme to manage the Core PMCs using the new global control and status registers. This will be bypassed on unsupported hardware (x86_pmu.version < 2).
Currently, all PMCs have dedicated control (PERF_CTL) and counter (PERF_CTR) registers. For a given PMC, the enable (En) bit of its PERF_CTL register is used to start or stop counting.
The Performance Counter Global Control (PerfCntrGlobalCtl) register has enable (PerfCntrEn) bits for each PMC. For a PMC to start counting, both PERF_CTL and PerfCntrGlobalCtl enable bits must be set. If either of those are cleared, the PMC stops counting.
In x86_pmu_{en,dis}able_all(), the PERF_CTL registers of all active PMCs are written to in a loop. Ideally, PMCs counting the same event that were started and stopped at the same time should record the same counts. Due to delays in between writes to the PERF_CTL registers across loop iterations, the PMCs cannot be enabled or disabled at the same instant and hence, record slightly different counts. This is fixed by enabling or disabling all active PMCs at the same time with a single write to the PerfCntrGlobalCtl register.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/dfe8e934074aaabc6ba748dfaccd0a77c974bb82.1650515382.git.sandipan.das@amd.com
show more ...
|
#
56e026a7 |
| 21-Apr-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Detect available counters
If AMD Performance Monitoring Version 2 (PerfMonV2) is supported, use CPUID leaf 0x80000022 EBX to detect the number of Core PMCs. This offers more flexi
perf/x86/amd/core: Detect available counters
If AMD Performance Monitoring Version 2 (PerfMonV2) is supported, use CPUID leaf 0x80000022 EBX to detect the number of Core PMCs. This offers more flexibility if the counts change in later processor families.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/68a6d9688df189267db26530378870edd34f7b06.1650515382.git.sandipan.das@amd.com
show more ...
|
#
21d59e3e |
| 21-Apr-2022 |
Sandipan Das <sandipan.das@amd.com> |
perf/x86/amd/core: Detect PerfMonV2 support
AMD Performance Monitoring Version 2 (PerfMonV2) introduces some new Core PMU features such as detection of the number of available PMCs and managing PMCs
perf/x86/amd/core: Detect PerfMonV2 support
AMD Performance Monitoring Version 2 (PerfMonV2) introduces some new Core PMU features such as detection of the number of available PMCs and managing PMCs using global registers namely, PerfCntrGlobalCtl and PerfCntrGlobalStatus.
Clearing PerfCntrGlobalCtl and PerfCntrGlobalStatus ensures that all PMCs are inactive and have no pending overflows when CPUs are onlined or offlined.
The PMU version (x86_pmu.version) now indicates PerfMonV2 support and will be used to bypass the new features on unsupported processors.
Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/dc8672ecbddff394e088ca8abf94b089b8ecc2e7.1650515382.git.sandipan.das@amd.com
show more ...
|