03a0e252 | 20-Nov-2024 |
Namhyung Kim <namhyung@kernel.org> |
perf/arm-cmn: Ensure port and device id bits are set properly
[ Upstream commit dfdf714fed559c09021df1d2a4bb64c0ad5f53bc ]
The portid_bits and deviceid_bits were set only for XP type nodes in arm_cmn_discover(), which confused other nodes trying to find their XP nodes. Copy both fields directly from the XP node when setting up a new node.
Fixes: e79634b53e39 ("perf/arm-cmn: Refactor node ID handling. Again.")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20241121001334.331334-1-namhyung@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
c902e515 | 02-Sep-2024 |
Robin Murphy <robin.murphy@arm.com> |
perf/arm-cmn: Ensure dtm_idx is big enough
[ Upstream commit 359414b33e00bae91e4eabf3e4ef8e76024c7673 ]
While CMN_MAX_DIMENSION was bumped to 12 for CMN-650, that only supports up to a 10x10 mesh, so bumping dtm_idx to 256 bits at the time worked out OK in practice. However CMN-700 did finally support up to 144 XPs, and thus needs a worst-case 288 bits of dtm_idx for an aggregated XP event on a maxed-out config. Oops.
Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/e771b358526a0d7fc06efee2c3a2fdc0c9f51d44.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
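The sizing arithmetic in this message can be checked with a small sketch. It assumes 2 bits of dtm_idx per XP, which is what the quoted figures imply (144 XPs giving 288 bits); the macro and function names are hypothetical and are not taken from the driver.

```c
/* Assumption: each XP needs 2 bits of dtm_idx, as implied by 144 XPs -> 288 bits. */
#define BITS_PER_XP 2u

/* Worst-case number of dtm_idx bits for a fully populated dim x dim mesh. */
static unsigned int dtm_idx_bits(unsigned int dim)
{
	return dim * dim * BITS_PER_XP;
}

/* Number of 64-bit words needed to hold that many bits, rounding up. */
static unsigned int dtm_idx_words(unsigned int dim)
{
	return (dtm_idx_bits(dim) + 63u) / 64u;
}
```

Under that assumption a 10x10 mesh needs 200 bits and fits in four 64-bit words (256 bits), while a 12x12 mesh needs 288 bits and therefore a fifth word.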
|
5418a61e | 02-Sep-2024 |
Robin Murphy <robin.murphy@arm.com> |
perf/arm-cmn: Fix CCLA register offset
[ Upstream commit 88b63a82c84ed9bbcbdefb10cb6f75dd1dd04887 ]
Apparently pmu_event_sel is offset by 8 for all CCLA nodes, not just the CCLA_RNI combination type.
Fixes: 23760a014417 ("perf/arm-cmn: Add CMN-700 support")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/6e7bb06fef6046f83e7647aad0e5be544139763f.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
a687d9d1 | 02-Sep-2024 |
Robin Murphy <robin.murphy@arm.com> |
perf/arm-cmn: Refactor node ID handling. Again.
[ Upstream commit e79634b53e398966c49f803c49701bc74dc3ccf8 ]
The scope of the "extra device ports" configuration is not made clear by the CMN documentation - so far we've assumed it applies globally, based on the sole example which suggests as much. However it transpires that this is incorrect, and the format does in fact vary based on each individual XP's port configuration. As a consequence, we're currently liable to decode the port/device indices from a node ID incorrectly, thus program the wrong event source in the DTM leading to bogus event counts, and also show device topology on the wrong ports in debugfs.
To put this right, rework node IDs yet again to carry around the additional data necessary to decode them properly per-XP. At this point the notion of fully decomposing an ID becomes more impractical than it's worth, so unabstracting the XY mesh coordinates (where 2/3 users were just debug anyway) ends up leaving things a bit simpler overall.
Fixes: 60d1504070c2 ("perf/arm-cmn: Support new IP features")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/5195f990152fc37adba5fbf5929a6b11063d9f09.1725296395.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
a1b25661 | 13-Dec-2023 |
Robin Murphy <robin.murphy@arm.com> |
perf/arm-cmn: Improve debugfs pretty-printing for large configs
[ Upstream commit a1083ee717e9bde012268782e084d343314490a4 ]
The debugfs pretty-printer was written for the CMN-600 assumptions of a maximum 8x8 mesh, but CMN-700 now allows coordinates and ID values up to 12 and 128 respectively, which can overflow the format strings, mess up the alignment of the table and hurt overall readability. This table does prove useful for double-checking that the driver is picking up the topology of new systems correctly and for verifying user expectations, so tweak the formatting to stay nice and readable with wider values.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1d1517eadd1bac5992fab679c9dc531b381944da.1702484646.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: e79634b53e39 ("perf/arm-cmn: Refactor node ID handling. Again.")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
f5c4ec8d | 20-Oct-2023 |
Robin Murphy <robin.murphy@arm.com> |
perf/arm-cmn: Rework DTC counters (again)
[ Upstream commit 7633ec2c262fab3e7c5bf3cd3876b5748f584a57 ]
The bitmap-based scheme for tracking DTC counter usage turns out to be a complete dead-end for its imagined purpose, since by the time we have to keep track of a per-DTC counter index anyway, we already have enough information to make the bitmap itself redundant. Revert the remains of it back to almost the original scheme, but now expanded to track per-DTC indices, in preparation for making use of them in anger.
Note that since cycle count events always use a dedicated counter on a single DTC, we reuse the field to encode their DTC index directly.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/5f6ade76b47f033836d7a36c03555da896dfb4a3.1697824215.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Stable-dep-of: e79634b53e39 ("perf/arm-cmn: Refactor node ID handling. Again.")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
666a46a9 | 29-Aug-2024 |
Yicong Yang <yangyicong@hisilicon.com> |
drivers/perf: hisi_pcie: Fix TLP headers bandwidth counting
[ Upstream commit 17bf68aeb3642221e3e770399b5a52f370747ac1 ]
We set the initial value of the event ctrl register to HISI_PCIE_INIT_SET and then modify it according to the user options. As a result, counting only TLP header bandwidth never takes effect, since HISI_PCIE_INIT_SET configures the counter to count TLP payload bandwidth. Fix this by making the initial value of the event ctrl register 0.
Fixes: 17d573984d4d ("drivers/perf: hisi: Add TLP filter support")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240829090332.28756-3-yangyicong@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
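The effect of the initial value can be illustrated with a minimal sketch. The bit layout below is hypothetical and is not the real register map, and it assumes the user options are OR-ed into the initial value; only the general shape of the problem comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical bit layout, for illustration only; not the real register map. */
#define CTRL_COUNT_PAYLOAD (UINT32_C(1) << 0)
#define CTRL_COUNT_HEADER  (UINT32_C(1) << 1)

/* Build a control value by OR-ing the user's selection into an initial value. */
static uint32_t build_ctrl(uint32_t initial, uint32_t user_opts)
{
	return initial | user_opts;
}
```

With a payload-counting initial value, a header-only request still ends up with the payload bit set; starting from 0 leaves only what the user asked for.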
|
a7678a16 | 25-Apr-2024 |
Hao Chen <chenhao418@huawei.com> |
drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()
[ Upstream commit 582c1aeee0a9e73010cf1c4cef338709860deeb0 ]
pci_alloc_irq_vectors() allocates an irq vector. When devm_add_action() fails, the irq vector is not freed, which leads to a memory leak.
Replace the devm_add_action with devm_add_action_or_reset to ensure the irq vector can be destroyed when it fails.
Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU")
Signed-off-by: Hao Chen <chenhao418@huawei.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240425124627.13764-4-hejunhao3@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
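The difference between the two helpers can be modelled in userspace. This is a sketch only: the mock_* functions, the registration_fails knob and free_vectors() are hypothetical stand-ins, not kernel code. The one behaviour taken from the kernel API is that devm_add_action_or_reset() runs the action itself when registration fails.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's devres machinery, for illustration only. */
typedef void (*action_fn)(void *data);

static bool registration_fails;		/* knob: force registration to fail */
static action_fn registered_action;
static void *registered_data;

/* Models devm_add_action(): on failure nothing is registered or cleaned up. */
static int mock_add_action(action_fn action, void *data)
{
	if (registration_fails)
		return -ENOMEM;
	registered_action = action;
	registered_data = data;
	return 0;
}

/* Models devm_add_action_or_reset(): on failure the action runs immediately. */
static int mock_add_action_or_reset(action_fn action, void *data)
{
	int ret = mock_add_action(action, data);

	if (ret)
		action(data);
	return ret;
}

/* Stand-in for freeing IRQ vectors: clears a flag representing the allocation. */
static void free_vectors(void *data)
{
	*(bool *)data = false;
}
```

When registration fails, the plain variant leaves the resource allocated with nobody responsible for releasing it, whereas the _or_reset variant has already released it by the time the error is returned.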
|
be1fa711 | 25-Apr-2024 |
Junhao He <hejunhao3@huawei.com> |
drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group
[ Upstream commit 81bdd60a3d1d3b05e6cc6674845afb1694dd3a0e ]
The perf tool allows users to create event groups with the following command [1], but the driver does not check whether the array index is out of bounds when writing data to the event_group array. If the number of events in an event_group is greater than HNS3_PMU_MAX_HW_EVENTS, writes overflow the event_group array.
Add an array index check to fix the possible out-of-bounds access, and return directly when a new event would be written beyond the array bounds.
There are 9 different events in an event_group.
[1] perf stat -e '{pmu/event1/, ... ,pmu/event9/}'
Fixes: 66637ab137b4 ("drivers/perf: hisi: add driver for HNS3 PMU")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Signed-off-by: Hao Chen <chenhao418@huawei.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Link: https://lore.kernel.org/r/20240425124627.13764-3-hejunhao3@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
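The kind of index check described here might look like the following sketch. The struct, the function name and the limit of 8 are assumptions chosen for illustration (8 is merely consistent with a 9-event group overflowing the array); the driver's actual code is not reproduced.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: a limit of 8, consistent with 9 events overflowing the array. */
#define MAX_HW_EVENTS 8

struct event_group {
	int events[MAX_HW_EVENTS];
	size_t num;
};

/* Append an event, refusing to write past the end of the array. */
static int group_add_event(struct event_group *grp, int event)
{
	if (grp->num >= MAX_HW_EVENTS)
		return -EINVAL;
	grp->events[grp->num++] = event;
	return 0;
}
```

Checking the count before the store means the ninth event is rejected with an error instead of being written one slot past the array.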
|
e0d17ee8 | 27-Feb-2024 |
Vadim Shakirov <vadim.shakirov@syntacore.com> |
drivers: perf: ctr_get_width function for legacy is not defined
[ Upstream commit 682dc133f83e0194796e6ea72eb642df1c03dfbe ]
With CONFIG_RISCV_PMU_LEGACY=y and CONFIG_RISCV_PMU_SBI=n, the Linux kernel crashes when you run perf record:
$ perf record ls
[ 46.749286] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 46.750199] Oops [#1]
[ 46.750342] Modules linked in:
[ 46.750608] CPU: 0 PID: 107 Comm: perf-exec Not tainted 6.6.0 #2
[ 46.750906] Hardware name: riscv-virtio,qemu (DT)
[ 46.751184] epc : 0x0
[ 46.751430] ra : arch_perf_update_userpage+0x54/0x13e
[ 46.751680] epc : 0000000000000000 ra : ffffffff8072ee52 sp : ff2000000022b8f0
[ 46.751958] gp : ffffffff81505988 tp : ff6000000290d400 t0 : ff2000000022b9c0
[ 46.752229] t1 : 0000000000000001 t2 : 0000000000000003 s0 : ff2000000022b930
[ 46.752451] s1 : ff600000028fb000 a0 : 0000000000000000 a1 : ff600000028fb000
[ 46.752673] a2 : 0000000ae2751268 a3 : 00000000004fb708 a4 : 0000000000000004
[ 46.752895] a5 : 0000000000000000 a6 : 000000000017ffe3 a7 : 00000000000000d2
[ 46.753117] s2 : ff600000028fb000 s3 : 0000000ae2751268 s4 : 0000000000000000
[ 46.753338] s5 : ffffffff8153e290 s6 : ff600000863b9000 s7 : ff60000002961078
[ 46.753562] s8 : ff60000002961048 s9 : ff60000002961058 s10: 0000000000000001
[ 46.753783] s11: 0000000000000018 t3 : ffffffffffffffff t4 : ffffffffffffffff
[ 46.754005] t5 : ff6000000292270c t6 : ff2000000022bb30
[ 46.754179] status: 0000000200000100 badaddr: 0000000000000000 cause: 000000000000000c
[ 46.754653] Code: Unable to access instruction at 0xffffffffffffffec.
[ 46.754939] ---[ end trace 0000000000000000 ]---
[ 46.755131] note: perf-exec[107] exited with irqs disabled
[ 46.755546] note: perf-exec[107] exited with preempt_count 4
This happens because in the legacy case the ctr_get_width function was not defined, but it is used in arch_perf_update_userpage.
Also remove the extra check in riscv_pmu_ctr_get_width_mask().
Signed-off-by: Vadim Shakirov <vadim.shakirov@syntacore.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Fixes: cc4c07c89aad ("drivers: perf: Implement perf event mmap support in the SBI backend")
Link: https://lore.kernel.org/r/20240227170002.188671-3-vadim.shakirov@syntacore.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
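The failure mode is an unconditional call through a callback pointer that one backend never set, which matches the "epc : 0x0" in the trace. A simplified userspace sketch of the fixed arrangement follows; struct mock_pmu, both function names and the returned width of 63 are illustrative assumptions, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical, simplified PMU descriptor; not the kernel's struct riscv_pmu. */
struct mock_pmu {
	int (*ctr_get_width)(int idx);
};

/* A backend-provided callback; the value 63 is chosen only for illustration. */
static int legacy_ctr_get_width(int idx)
{
	(void)idx;
	return 63;
}

/* Mask covering bits 0..width inclusive; calls the callback unconditionally. */
static uint64_t ctr_width_mask(const struct mock_pmu *pmu, int idx)
{
	int width = pmu->ctr_get_width(idx);

	return width >= 63 ? UINT64_MAX : (((uint64_t)1 << (width + 1)) - 1);
}
```

Because callers invoke the callback without a NULL test, every backend has to install one; leaving the pointer unset turns the call into a jump to address 0.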
|