Revision tags: v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30 |
|
#
e5f2b4e1 |
| 06-Apr-2021 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
perf vendor events amd: Use 0x%02x format for event code and umask
Use 0x%02x format for all event codes and umasks as this helps in tracking changes of automatically generated event tables.
Review
perf vendor events amd: Use 0x%02x format for event code and umask
Use 0x%02x format for all event codes and umasks as this helps in tracking changes of automatically generated event tables.
Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20210406215944.113332-4-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
86c2bc3d |
| 06-Apr-2021 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
Commit 08ed77e414ab2342 ("perf vendor events amd: Add recommended events") added the hits event "L2 Cache Hits from L2 HWPF" with
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
Commit 08ed77e414ab2342 ("perf vendor events amd: Add recommended events") added the hits event "L2 Cache Hits from L2 HWPF" with the same metric expression as the accesses event "L2 Cache Accesses from L2 HWPF":
$ perf list --details ... l2_cache_accesses_from_l2_hwpf [L2 Cache Accesses from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] l2_cache_hits_from_l2_hwpf [L2 Cache Hits from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] ...
This was wrong and led to counting hits the same as accesses. Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019", documents the hits event with EventCode 0x70 which is the same as l2_pf_hit_l2.
Fix this, and massage the description for l2_pf_hit_l2 as the hits event is now the duplicate of l2_pf_hit_l2. AMD recommends using the recommended event over other events if the duplicate exists and maintain both for consistency. Hence, l2_cache_hits_from_l2_hwpf should override l2_pf_hit_l2.
Before:
# perf stat -M l2_cache_accesses_from_l2_hwpf,l2_cache_hits_from_l2_hwpf sleep 1
Performance counter stats for 'sleep 1':
1,436 l2_pf_miss_l2_l3 # 11114.00 l2_cache_accesses_from_l2_hwpf # 11114.00 l2_cache_hits_from_l2_hwpf 4,482 l2_pf_hit_l2 5,196 l2_pf_miss_l2_hit_l3
1.001765339 seconds time elapsed
After:
# perf stat -M l2_cache_accesses_from_l2_hwpf sleep 1
Performance counter stats for 'sleep 1':
1,477 l2_pf_miss_l2_l3 # 10442.00 l2_cache_accesses_from_l2_hwpf 3,978 l2_pf_hit_l2 4,987 l2_pf_miss_l2_hit_l3
1.001491186 seconds time elapsed
# perf stat -e l2_cache_hits_from_l2_hwpf sleep 1
Performance counter stats for 'sleep 1':
3,983 l2_cache_hits_from_l2_hwpf
1.001329970 seconds time elapsed
Note the difference in performance counter values for the accesses versus the hits after the fix, and the hits event now counting the same as l2_pf_hit_l2.
Fixes: 08ed77e414ab ("perf vendor events amd: Add recommended events") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> # On a 3900X Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20210406215944.113332-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
b07520a5 |
| 06-Apr-2021 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
[ Upstream commit 86c2bc3da769124e3e856b6e9457be3667c30919 ]
Commit 08ed77e414ab2342 ("perf vendor events amd: Add recommended e
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
[ Upstream commit 86c2bc3da769124e3e856b6e9457be3667c30919 ]
Commit 08ed77e414ab2342 ("perf vendor events amd: Add recommended events") added the hits event "L2 Cache Hits from L2 HWPF" with the same metric expression as the accesses event "L2 Cache Accesses from L2 HWPF":
$ perf list --details ... l2_cache_accesses_from_l2_hwpf [L2 Cache Accesses from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] l2_cache_hits_from_l2_hwpf [L2 Cache Hits from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] ...
This was wrong and led to counting hits the same as accesses. Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019", documents the hits event with EventCode 0x70 which is the same as l2_pf_hit_l2.
Fix this, and massage the description for l2_pf_hit_l2 as the hits event is now the duplicate of l2_pf_hit_l2. AMD recommends using the recommended event over other events if the duplicate exists and maintain both for consistency. Hence, l2_cache_hits_from_l2_hwpf should override l2_pf_hit_l2.
Before:
# perf stat -M l2_cache_accesses_from_l2_hwpf,l2_cache_hits_from_l2_hwpf sleep 1
Performance counter stats for 'sleep 1':
1,436 l2_pf_miss_l2_l3 # 11114.00 l2_cache_accesses_from_l2_hwpf # 11114.00 l2_cache_hits_from_l2_hwpf 4,482 l2_pf_hit_l2 5,196 l2_pf_miss_l2_hit_l3
1.001765339 seconds time elapsed
After:
# perf stat -M l2_cache_accesses_from_l2_hwpf sleep 1
Performance counter stats for 'sleep 1':
1,477 l2_pf_miss_l2_l3 # 10442.00 l2_cache_accesses_from_l2_hwpf 3,978 l2_pf_hit_l2 4,987 l2_pf_miss_l2_hit_l3
1.001491186 seconds time elapsed
# perf stat -e l2_cache_hits_from_l2_hwpf sleep 1
Performance counter stats for 'sleep 1':
3,983 l2_cache_hits_from_l2_hwpf
1.001329970 seconds time elapsed
Note the difference in performance counter values for the accesses versus the hits after the fix, and the hits event now counting the same as l2_pf_hit_l2.
Fixes: 08ed77e414ab ("perf vendor events amd: Add recommended events") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> # On a 3900X Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20210406215944.113332-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62 |
|
#
08ed77e4 |
| 01-Sep-2020 |
Kim Phillips <kim.phillips@amd.com> |
perf vendor events amd: Add recommended events
Add support for events listed in Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019".
perf vendor events amd: Add recommended events
Add support for events listed in Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019".
perf now supports these new events (-e):
all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses
and these metrics (-M):
branch_misprediction_ratio all_l2_cache_accesses all_l2_cache_hits all_l2_cache_misses ic_fetch_miss_ratio l2_cache_accesses_from_l2_hwpf l2_cache_hits_from_l2_hwpf l2_cache_misses_from_l2_hwpf l3_read_miss_latency l1_itlb_misses all_remote_links_outbound nps1_die_to_dram
The nps1_die_to_dram event may need perf stat's --metric-no-group switch if the number of available data fabric counters is less than the number it uses (8).
Committer testing:
On a AMD Ryzen 3900x system:
Before:
# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" #
After:
# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" | grep -v "^recommended:$" all_dc_accesses [All L1 Data Cache Accesses] all_tlbs_flushed [All TLBs Flushed] l1_dtlb_misses [L1 DTLB Misses] l2_cache_accesses_from_dc_misses [L2 Cache Accesses from L1 Data Cache Misses (including prefetch)] l2_cache_accesses_from_ic_misses [L2 Cache Accesses from L1 Instruction Cache Misses (including prefetch)] l2_cache_hits_from_dc_misses [L2 Cache Hits from L1 Data Cache Misses] l2_cache_hits_from_ic_misses [L2 Cache Hits from L1 Instruction Cache Misses] l2_cache_misses_from_dc_misses [L2 Cache Misses from L1 Data Cache Misses] l2_cache_misses_from_ic_miss [L2 Cache Misses from L1 Instruction Cache Misses] l2_dtlb_misses [L2 DTLB Misses & Data page walks] l2_itlb_misses [L2 ITLB Misses & Instruction page walks] sse_avx_stalls [Mixed SSE/AVX Stalls] uops_dispatched [Micro-ops Dispatched] uops_retired [Micro-ops Retired] l3_accesses [L3 Accesses. Unit: amd_l3] l3_misses [L3 Misses (includes Chg2X). Unit: amd_l3] #
# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses,l2_cache_hits_from_dc_misses,l2_cache_hits_from_ic_misses,l2_cache_misses_from_dc_misses,l2_cache_misses_from_ic_miss,l2_dtlb_misses,l2_itlb_misses,sse_avx_stalls,uops_dispatched,uops_retired,l3_accesses,l3_misses sleep 2
Performance counter stats for 'system wide':
433,439,949 all_dc_accesses (35.66%) 443 all_tlbs_flushed (35.66%) 2,985,885 l1_dtlb_misses (35.66%) 18,318,019 l2_cache_accesses_from_dc_misses (35.68%) 50,114,810 l2_cache_accesses_from_ic_misses (35.72%) 12,423,978 l2_cache_hits_from_dc_misses (35.74%) 40,703,103 l2_cache_hits_from_ic_misses (35.74%) 6,698,673 l2_cache_misses_from_dc_misses (35.74%) 12,090,892 l2_cache_misses_from_ic_miss (35.74%) 614,267 l2_dtlb_misses (35.74%) 216,036 l2_itlb_misses (35.74%) 11,977 sse_avx_stalls (35.74%) 999,276,223 uops_dispatched (35.73%) 1,075,311,620 uops_retired (35.69%) 1,420,763 l3_accesses 540,164 l3_misses
2.002344121 seconds time elapsed
# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses sleep 2
Performance counter stats for 'system wide':
175,943,104 all_dc_accesses 310 all_tlbs_flushed 2,280,359 l1_dtlb_misses 11,700,151 l2_cache_accesses_from_dc_misses 25,414,963 l2_cache_accesses_from_ic_misses
2.001957818 seconds time elapsed
#
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Signed-off-by: Kim Phillips <kim.phillips@amd.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Jambor <mjambor@suse.cz> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: William Cohen <wcohen@redhat.com> Cc: Yunfeng Ye <yeyunfeng@huawei.com> Link: http://lore.kernel.org/lkml/20200901220944.277505-3-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
60d80452 |
| 01-Sep-2020 |
Kim Phillips <kim.phillips@amd.com> |
perf vendor events amd: Add L2 Prefetch events for zen1
Later revisions of PPRs that post-date the original Family 17h events submission patch add these events.
Specifically, they were not in this
perf vendor events amd: Add L2 Prefetch events for zen1
Later revisions of PPRs that post-date the original Family 17h events submission patch add these events.
Specifically, they were not in this 2017 revision of the F17h PPR:
Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017
But e.g., are included in this 2019 version of the PPR:
Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019
Fixes: 98c07a8f74f8 ("perf vendor events amd: perf PMU events for AMD Family 17h") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Signed-off-by: Kim Phillips <kim.phillips@amd.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Jambor <mjambor@suse.cz> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Stephane Eranian <eranian@google.com> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: William Cohen <wcohen@redhat.com> Cc: Yunfeng Ye <yeyunfeng@huawei.com> Link: http://lore.kernel.org/lkml/20200901220944.277505-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27 |
|
#
b5b8a7cf |
| 18-Mar-2020 |
Vijay Thakkar <vijaythakkar@me.com> |
perf vendor events amd: Update Zen1 events to V2
This patch updates the PMCs for AMD Zen1 core based processors (Family 17h; Models 0 through 2F) to be in accordance with PMCs as documented in the l
perf vendor events amd: Update Zen1 events to V2
This patch updates the PMCs for AMD Zen1 core based processors (Family 17h; Models 0 through 2F) to be in accordance with PMCs as documented in the latest versions of the AMD Processor Programming Reference [1], [2] and [3]. Note that some events, such as FPU pipe assignment are missing in [1], and therefore [3] is included for full coverage of events.
PMCs added:
fpu_pipe_assignment.dual{0|1|2|3} fpu_pipe_assignment.total{0|1|2|3} ls_mab_alloc.dc_prefetcher ls_mab_alloc.stores ls_mab_alloc.loads bp_dyn_ind_pred bp_de_redirect
PMC removed:
ex_ret_cond_misp
Cumulative counts, fpu_pipe_assignment.total and fpu_pipe_assignment.dual, existed in v1, but did expose port-level counters.
ex_ret_cond_misp has been removed as it has been removed from the latest versions of the PPR, and when tested, always seems to sample zero as tested on a Ryzen 3400G system.
[1]: Processor Programming Reference (PPR) for AMD Family 17h Models 01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.
[2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019.
[3]: OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018
All of the PPRs can be found at: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Vijay Thakkar <vijaythakkar@me.com> Acked-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: vijay thakkar <vijaythakkar@me.com> Link: http://lore.kernel.org/lkml/20200318190002.307290-4-vijaythakkar@me.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
c5f18e9e |
| 18-Mar-2020 |
Vijay Thakkar <vijaythakkar@me.com> |
perf vendor events amd: Restrict model detection for zen1 based processors
This patch changes the previous blanket detection of AMD Family 17h processors to be more specific to Zen1 core based produ
perf vendor events amd: Restrict model detection for zen1 based processors
This patch changes the previous blanket detection of AMD Family 17h processors to be more specific to Zen1 core based products only by replacing model detection regex pattern [[:xdigit:]]+ with ([12][0-9A-F]|[0-9A-F]), restricting to models 0 though 2f only.
This change is required to allow for the addition of separate PMU events for Zen2 core based models in the following patches as those belong to family 17h but have different PMCs. Current PMU events directory has also been renamed to "amdzen1" from "amdfam17h" to reflect this specificity.
Note that although this change does not break PMU counters for existing zen1 based systems, it does disable the current set of counters for zen2 based systems. Counters for zen2 have been added in the following patches in this patchset.
Signed-off-by: Vijay Thakkar <vijaythakkar@me.com> Acked-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200318190002.307290-2-vijaythakkar@me.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
b07520a5 |
| 06-Apr-2021 |
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> |
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric [ Upstream commit 86c2bc3da769124e3e856b6e9457be3667c30919 ] Commit 08ed77e414ab2342 ("perf vendor events amd: A
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric [ Upstream commit 86c2bc3da769124e3e856b6e9457be3667c30919 ] Commit 08ed77e414ab2342 ("perf vendor events amd: Add recommended events") added the hits event "L2 Cache Hits from L2 HWPF" with the same metric expression as the accesses event "L2 Cache Accesses from L2 HWPF": $ perf list --details ... l2_cache_accesses_from_l2_hwpf [L2 Cache Accesses from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] l2_cache_hits_from_l2_hwpf [L2 Cache Hits from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] ... This was wrong and led to counting hits the same as accesses. Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019", documents the hits event with EventCode 0x70 which is the same as l2_pf_hit_l2. Fix this, and massage the description for l2_pf_hit_l2 as the hits event is now the duplicate of l2_pf_hit_l2. AMD recommends using the recommended event over other events if the duplicate exists and maintain both for consistency. Hence, l2_cache_hits_from_l2_hwpf should override l2_pf_hit_l2. Before: # perf stat -M l2_cache_accesses_from_l2_hwpf,l2_cache_hits_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 1,436 l2_pf_miss_l2_l3 # 11114.00 l2_cache_accesses_from_l2_hwpf # 11114.00 l2_cache_hits_from_l2_hwpf 4,482 l2_pf_hit_l2 5,196 l2_pf_miss_l2_hit_l3 1.001765339 seconds time elapsed After: # perf stat -M l2_cache_accesses_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 1,477 l2_pf_miss_l2_l3 # 10442.00 l2_cache_accesses_from_l2_hwpf 3,978 l2_pf_hit_l2 4,987 l2_pf_miss_l2_hit_l3 1.001491186 seconds time elapsed # perf stat -e l2_cache_hits_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 3,983 l2_cache_hits_from_l2_hwpf 1.001329970 seconds time elapsed Note the difference in performance counter values for the accesses versus the hits after the fix, and the hits event now counting the same as l2_pf_hit_l2. Fixes: 08ed77e414ab ("perf vendor events amd: Add recommended events") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> # On a 3900X Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20210406215944.113332-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
show more ...
|
Revision tags: v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62 |
|
#
08ed77e4 |
| 01-Sep-2020 |
Kim Phillips <kim.phillips@amd.com> |
perf vendor events amd: Add recommended events Add support for events listed in Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 -
perf vendor events amd: Add recommended events Add support for events listed in Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019". perf now supports these new events (-e): all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses and these metrics (-M): branch_misprediction_ratio all_l2_cache_accesses all_l2_cache_hits all_l2_cache_misses ic_fetch_miss_ratio l2_cache_accesses_from_l2_hwpf l2_cache_hits_from_l2_hwpf l2_cache_misses_from_l2_hwpf l3_read_miss_latency l1_itlb_misses all_remote_links_outbound nps1_die_to_dram The nps1_die_to_dram event may need perf stat's --metric-no-group switch if the number of available data fabric counters is less than the number it uses (8). Committer testing: On a AMD Ryzen 3900x system: Before: # perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" # After: # perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" | grep -v "^recommended:$" all_dc_accesses [All L1 Data Cache Accesses] all_tlbs_flushed [All TLBs Flushed] l1_dtlb_misses [L1 DTLB Misses] l2_cache_accesses_from_dc_misses [L2 Cache Accesses from L1 Data Cache Misses (including prefetch)] l2_cache_accesses_from_ic_misses [L2 Cache Accesses from L1 Instruction Cache Misses (including prefetch)] l2_cache_hits_from_dc_misses [L2 Cache Hits from L1 Data Cache Misses] l2_cache_hits_from_ic_misses [L2 Cache Hits from L1 Instruction Cache Misses] l2_cache_misses_from_dc_misses [L2 Cache Misses from L1 Data Cache Misses] l2_cache_misses_from_ic_miss [L2 Cache Misses from L1 Instruction Cache Misses] l2_dtlb_misses [L2 DTLB Misses & Data page walks] l2_itlb_misses [L2 ITLB Misses & Instruction page walks] sse_avx_stalls [Mixed SSE/AVX Stalls] uops_dispatched [Micro-ops Dispatched] uops_retired [Micro-ops Retired] l3_accesses [L3 Accesses. Unit: amd_l3] l3_misses [L3 Misses (includes Chg2X). Unit: amd_l3] # # perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses,l2_cache_hits_from_dc_misses,l2_cache_hits_from_ic_misses,l2_cache_misses_from_dc_misses,l2_cache_misses_from_ic_miss,l2_dtlb_misses,l2_itlb_misses,sse_avx_stalls,uops_dispatched,uops_retired,l3_accesses,l3_misses sleep 2 Performance counter stats for 'system wide': 433,439,949 all_dc_accesses (35.66%) 443 all_tlbs_flushed (35.66%) 2,985,885 l1_dtlb_misses (35.66%) 18,318,019 l2_cache_accesses_from_dc_misses (35.68%) 50,114,810 l2_cache_accesses_from_ic_misses (35.72%) 12,423,978 l2_cache_hits_from_dc_misses (35.74%) 40,703,103 l2_cache_hits_from_ic_misses (35.74%) 6,698,673 l2_cache_misses_from_dc_misses (35.74%) 12,090,892 l2_cache_misses_from_ic_miss (35.74%) 614,267 l2_dtlb_misses (35.74%) 216,036 l2_itlb_misses (35.74%) 11,977 sse_avx_stalls (35.74%) 999,276,223 uops_dispatched (35.73%) 1,075,311,620 uops_retired (35.69%) 1,420,763 l3_accesses 540,164 l3_misses 2.002344121 seconds time elapsed # perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses sleep 2 Performance counter stats for 'system wide': 175,943,104 all_dc_accesses 310 all_tlbs_flushed 2,280,359 l1_dtlb_misses 11,700,151 l2_cache_accesses_from_dc_misses 25,414,963 l2_cache_accesses_from_ic_misses 2.001957818 seconds time elapsed # Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Signed-off-by: Kim Phillips <kim.phillips@amd.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Jambor <mjambor@suse.cz> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: William Cohen <wcohen@redhat.com> Cc: Yunfeng Ye <yeyunfeng@huawei.com> Link: http://lore.kernel.org/lkml/20200901220944.277505-3-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
60d80452 |
| 01-Sep-2020 |
Kim Phillips <kim.phillips@amd.com> |
perf vendor events amd: Add L2 Prefetch events for zen1 Later revisions of PPRs that post-date the original Family 17h events submission patch add these events. Specifically, th
perf vendor events amd: Add L2 Prefetch events for zen1 Later revisions of PPRs that post-date the original Family 17h events submission patch add these events. Specifically, they were not in this 2017 revision of the F17h PPR: Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017 But e.g., are included in this 2019 version of the PPR: Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019 Fixes: 98c07a8f74f8 ("perf vendor events amd: perf PMU events for AMD Family 17h") Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Signed-off-by: Kim Phillips <kim.phillips@amd.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Jambor <mjambor@suse.cz> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Stephane Eranian <eranian@google.com> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: William Cohen <wcohen@redhat.com> Cc: Yunfeng Ye <yeyunfeng@huawei.com> Link: http://lore.kernel.org/lkml/20200901220944.277505-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27 |
|
#
b5b8a7cf |
| 18-Mar-2020 |
Vijay Thakkar <vijaythakkar@me.com> |
perf vendor events amd: Update Zen1 events to V2 This patch updates the PMCs for AMD Zen1 core based processors (Family 17h; Models 0 through 2F) to be in accordance with PMCs as doc
perf vendor events amd: Update Zen1 events to V2 This patch updates the PMCs for AMD Zen1 core based processors (Family 17h; Models 0 through 2F) to be in accordance with PMCs as documented in the latest versions of the AMD Processor Programming Reference [1], [2] and [3]. Note that some events, such as FPU pipe assignment are missing in [1], and therefore [3] is included for full coverage of events. PMCs added: fpu_pipe_assignment.dual{0|1|2|3} fpu_pipe_assignment.total{0|1|2|3} ls_mab_alloc.dc_prefetcher ls_mab_alloc.stores ls_mab_alloc.loads bp_dyn_ind_pred bp_de_redirect PMC removed: ex_ret_cond_misp Cumulative counts, fpu_pipe_assignment.total and fpu_pipe_assignment.dual, existed in v1, but did expose port-level counters. ex_ret_cond_misp has been removed as it has been removed from the latest versions of the PPR, and when tested, always seems to sample zero as tested on a Ryzen 3400G system. [1]: Processor Programming Reference (PPR) for AMD Family 17h Models 01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019. [2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019. [3]: OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018 All of the PPRs can be found at: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Signed-off-by: Vijay Thakkar <vijaythakkar@me.com> Acked-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: vijay thakkar <vijaythakkar@me.com> Link: http://lore.kernel.org/lkml/20200318190002.307290-4-vijaythakkar@me.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
c5f18e9e |
| 18-Mar-2020 |
Vijay Thakkar <vijaythakkar@me.com> |
perf vendor events amd: Restrict model detection for zen1 based processors This patch changes the previous blanket detection of AMD Family 17h processors to be more specific to Zen1 core
perf vendor events amd: Restrict model detection for zen1 based processors This patch changes the previous blanket detection of AMD Family 17h processors to be more specific to Zen1 core based products only by replacing model detection regex pattern [[:xdigit:]]+ with ([12][0-9A-F]|[0-9A-F]), restricting to models 0 though 2f only. This change is required to allow for the addition of separate PMU events for Zen2 core based models in the following patches as those belong to family 17h but have different PMCs. Current PMU events directory has also been renamed to "amdzen1" from "amdfam17h" to reflect this specificity. Note that although this change does not break PMU counters for existing zen1 based systems, it does disable the current set of counters for zen2 based systems. Counters for zen2 have been added in the following patches in this patchset. Signed-off-by: Vijay Thakkar <vijaythakkar@me.com> Acked-by: Kim Phillips <kim.phillips@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jon Grimm <jon.grimm@amd.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200318190002.307290-2-vijaythakkar@me.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|