Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3 |
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
535a265d |
| 09-Sep-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "perf tools maintainership:
Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "perf tools maintainership:
- Add git information for perf-tools and perf-tools-next trees and branches to the MAINTAINERS file. That is where development now takes place and myself and Namhyung Kim have write access, more people to come as we emulate other maintainer groups.
perf record:
- Record kernel data maps when 'perf record --data' is used, so that global variables can be resolved and used in tools that do data profiling.
perf trace:
- Remove the old, experimental support for BPF events in which a .c file was passed as an event: "perf trace -e hello.c" to then get compiled and loaded.
The only known usage for that, that shipped with the kernel as an example for such events, augmented the raw_syscalls tracepoints and was converted to a libbpf skeleton, reusing all the user space components and the BPF code connected to the syscalls.
In the end just the way to glue the BPF part and the user space type beautifiers changed, now being performed by libbpf skeletons.
The next step is to use BTF to do pretty printing of all syscall types, as discussed with Alan Maguire and others.
Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all path/filenames/strings, some of the networking data structures, perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5 seconds:
# perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5 0.000 ( 9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3 9.039 ( 0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 ? ( ): gpm/991 ... [continued]: clock_nanosleep()) = 0 10.133 ( ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ... ? ( ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 30.276 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 1230.814 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 2030.886 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 ? ( ): crond/1172 ... [continued]: clock_nanosleep()) = 0 3242.699 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 3728.078 ( ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ... 3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 4031.409 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0
Performance counter stats for 'sleep 5':
2,617,347 cycles 1,855,997 instructions # 0.71 insn per cycle
5.002282128 seconds time elapsed
0.000855000 seconds user 0.000852000 seconds sys
perf annotate:
- Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1) for licensing reasons, and we missed a build test on tools/perf/tests makefile.
Since we now default to NDEBUG=1, we ended up segfaulting when building with BUILD_NONDISTRO=1 because a needed initialization routine was being "error checked" via an assert.
Fix it by explicitly checking the result and aborting instead if it fails.
We better back propagate the error, but at least 'perf annotate' on samples collected for a BPF program is back working when perf is built with BUILD_NONDISTRO=1.
perf report/top:
- Add back TUI hierarchy mode header, that is seen when using 'perf report/top --hierarchy'.
- Fix the number of entries for 'e' key in the TUI that was preventing navigation of lines when expanding an entry.
perf report/script:
- Support cross platform register handling, allowing a perf.data file collected on one architecture to have registers sampled correctly displayed when analysis tools such as 'perf report' and 'perf script' are used on a different architecture.
- Fix handling of event attributes in pipe mode, i.e. when one uses:
perf record -o - | perf report -i -
When no perf.data files are used.
- Handle files generated via pipe mode with a version of perf and then read also via pipe mode with a different version of perf, where the event attr record may have changed, use the record size field to properly support this version mismatch.
perf probe:
- Accessing global variables from uprobes isn't supported, make the error message state that instead of stating that some minimal kernel version is needed to have that feature. This seems just a tool limitation, the kernel probably has all that is needed.
perf tests:
- Fix a reference count related leak in the dlfilter v0 API where the result of a thread__find_symbol_fb() is not matched with an addr_location__exit() to drop the reference counts of the resolved components (machine, thread, map, symbol, etc). Add a dlfilter test to make sure that doesn't regresses.
- Lots of fixes for the 'perf test' written in shell script related to problems found with the shellcheck utility.
- Fixes for 'perf test' shell scripts testing features enabled when perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf counters.
- Add perf record sample filtering test, things like the following example, that gets implemented as a BPF filter attached to the event:
# perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
- Improve the way the task_analyzer test checks if libtraceevent is linked, using 'perf version --build-options' instead of the more expensinve 'perf record -e "sched:sched_switch"'.
- Add support for riscv in the mmap-basic test. (This went as well via the RiscV tree, same contents).
libperf:
- Implement riscv mmap support (This went as well via the RiscV tree, same contents).
perf script:
- New tool that converts perf.data files to the firefox profiler format so that one can use the visualizer at https://profiler.firefox.com/. Done by Anup Sharma as part of this year's Google Summer of Code.
One can generate the output and upload it to the web interface but Anup also automated everything:
perf script gecko -F 99 -a sleep 60
- Support syscall name parsing on arm64.
- Print "cgroup" field on the same line as "comm".
perf bench:
- Add new 'uprobe' benchmark to measure the overhead of uprobes with/without BPF programs attached to it.
- breakpoints are not available on power9, skip that test.
perf stat:
- Add #num_cpus_online literal to be used in 'perf stat' metrics, and add this extra 'perf test' check that exemplifies its purpose:
TEST_ASSERT_VAL("#num_cpus_online", expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0); TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0); TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
Miscellaneous:
- Improve tool startup time by lazily reading PMU, JSON, sysfs data.
- Improve error reporting in the parsing of events, passing YYLTYPE to error routines, so that the output can show were the parsing error was found.
- Add 'perf test' entries to check the parsing of events improvements.
- Fix various leak for things detected by -fsanitize=address, mostly things that would be freed at tool exit, including:
- Free evsel->filter on the destructor.
- Allow tools to register a thread->priv destructor and use it in 'perf trace'.
- Free evsel->priv in 'perf trace'.
- Free string returned by synthesize_perf_probe_point() when the caller fails to do all it needs.
- Adjust various compiler options to not consider errors some warnings when building with broken headers found in things like python, flex, bison, as we otherwise build with -Werror. Some for gcc, some for clang, some for some specific version of those, some for some specific version of flex or bison, or some specific combination of these components, bah.
- Allow customization of clang options for BPF target, this helps building on gentoo where there are other oddities where BPF targets gets passed some compiler options intended for the native build, so building with WERROR=0 helps while these oddities are fixed.
- Dont pass ERR_PTR() values to perf_session__delete() in 'perf top' and 'perf lock', fixing some segfaults when handling some odd failures.
- Add LTO build option.
- Fix format of unordered lists in the perf docs (tools/perf/Documentation)
- Overhaul the bison files, using constructs such as YYNOMEM.
- Remove unused tokens from the bison .y files.
- Add more comments to various structs.
- A few LoongArch enablement patches.
Vendor events (JSON):
- Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
EventName, BriefDescription visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.", visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.", op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.", op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.", op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
- Add AmpereOne metrics (aarch64).
- Update N2 and V2 metrics (aarch64) and events using Arm telemetry repo.
- Update scale units and descriptions of common topdown metrics on aarch64. Things like: - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)", - "BriefDescription": "Frontend bound L1 topdown metric", + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))", + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
- Update events for intel: meteorlake to 1.04, sapphirerapids to 1.15, Icelake+ metric constraints.
- Update files for the power10 platform"
* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits) perf parse-events: Fix driver config term perf parse-events: Fixes relating to no_value terms perf parse-events: Fix propagation of term's no_value when cloning perf parse-events: Name the two term enums perf list: Don't print Unit for "default_core" perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake perf dlfilter: Avoid leak in v0 API test use of resolve_address() perf metric: Add #num_cpus_online literal perf pmu: Remove str from perf_pmu_alias perf parse-events: Make common term list to strbuf helper perf parse-events: Minor help message improvements perf pmu: Avoid uninitialized use of alias->str perf jevents: Use "default_core" for events with no Unit perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test perf test shell stat_bpf_counters: Fix test on Intel perf test shell record_bpf_filter: Skip 6.2 kernel libperf: Get rid of attr.id field perf tools: Convert to perf_record_header_attr_id() libperf: Add perf_record_header_attr_id() perf tools: Handle old data in PERF_RECORD_ATTR ...
show more ...
|
Revision tags: v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48 |
|
#
9823ae6f |
| 23-Aug-2023 |
Kajol Jain <kjain@linux.ibm.com> |
perf bench breakpoint: Skip run if no breakpoints available
Based on commit 7d54a4acd8c1de3e ("perf test: Skip watchpoint tests if no watchpoints available"), hardware breakpoints are not available
perf bench breakpoint: Skip run if no breakpoints available
Based on commit 7d54a4acd8c1de3e ("perf test: Skip watchpoint tests if no watchpoints available"), hardware breakpoints are not available for power9 platform and because of that 'perf bench breakpoint' run fails on power9 platform.
Add code to check for the return value of perf_event_open() in the breakpoint run and skip the 'perf bench breakpoint' run, if hardware breakpoints are not available.
Result on power9 system before patch changes:
[command]# perf bench breakpoint thread perf_event_open: No such device
Result on power9 system after patch changes:
[command]# ./perf bench breakpoint thread Skipping perf bench breakpoint thread: No hardware support
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Naveen N Rao <naveen@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230823075103.190565-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13 |
|
#
4f2c0a4a |
| 13-Dec-2022 |
Nick Terrell <terrelln@fb.com> |
Merge branch 'main' into zstd-linus
|
Revision tags: v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4 |
|
#
14e77332 |
| 21-Oct-2022 |
Nick Terrell <terrelln@fb.com> |
Merge branch 'main' into zstd-next
|
Revision tags: v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72 |
|
#
97acb6a8 |
| 03-Oct-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Daniele needs 84d4333c1e28 ("misc/mei: Add NULL check to component match callback functions") in order to merge the DG2 HuC patches.
Signed-off-by: Tvrtko
Merge drm/drm-next into drm-intel-gt-next
Daniele needs 84d4333c1e28 ("misc/mei: Add NULL check to component match callback functions") in order to merge the DG2 HuC patches.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
show more ...
|
Revision tags: v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60 |
|
#
44627916 |
| 05-Aug-2022 |
Andreas Gruenbacher <agruenba@redhat.com> |
Merge part of branch 'for-next.instantiate' into for-next
|
#
fc30eea1 |
| 04-Aug-2022 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync up. In special to get the drm-intel-gt-next stuff.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
Revision tags: v5.15.59 |
|
#
8bb5e7f4 |
| 02-Aug-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 5.20 (or 6.0) merge window.
|
Revision tags: v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55 |
|
#
f83d9396 |
| 14-Jul-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging from drm/drm-next for the final fixes that will go into v5.20.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v5.15.54 |
|
#
a63f7778 |
| 08-Jul-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v5.19-rc5' into next
Merge with mainline to bring up the latest definition from MFD subsystem needed for Mediatek keypad driver.
|
Revision tags: v5.15.53 |
|
#
dd84cfff |
| 04-Jul-2022 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v5.19-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.19
A collection of fixes for v5.19, quite large but nothing major -
Merge tag 'asoc-fix-v5.19-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.19
A collection of fixes for v5.19, quite large but nothing major - a good chunk of it is more stuff that was identified by mixer-test regarding event generation.
show more ...
|
Revision tags: v5.15.52, v5.15.51, v5.15.50, v5.15.49 |
|
#
2b1333b8 |
| 20-Jun-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get new regmap APIs of v5.19-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v5.15.48 |
|
#
f777316e |
| 15-Jun-2022 |
Takashi Iwai <tiwai@suse.de> |
Merge branch 'topic/ctl-enhancements' into for-next
Pull ALSA control enhancement patches. One is the faster lookup of control elements, and another is to introduce the input data validation.
Signe
Merge branch 'topic/ctl-enhancements' into for-next
Pull ALSA control enhancement patches. One is the faster lookup of control elements, and another is to introduce the input data validation.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
show more ...
|
Revision tags: v5.15.47 |
|
#
66da6500 |
| 09-Jun-2022 |
Paolo Bonzini <pbonzini@redhat.com> |
Merge tag 'kvm-riscv-fixes-5.19-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 5.19, take #1
- Typo fix in arch/riscv/kvm/vmid.c
- Remove broken reference pattern from MAIN
Merge tag 'kvm-riscv-fixes-5.19-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 5.19, take #1
- Typo fix in arch/riscv/kvm/vmid.c
- Remove broken reference pattern from MAINTAINERS entry
show more ...
|
Revision tags: v5.15.46 |
|
#
6e2b347d |
| 08-Jun-2022 |
Maxime Ripard <maxime@cerno.tech> |
Merge v5.19-rc1 into drm-misc-fixes
Let's kick-off the start of the 5.19 fix cycle
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
#
073350da |
| 07-Jun-2022 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.19-rc1' into asoc-5.19
Linux 5.19-rc1
|
Revision tags: v5.15.45, v5.15.44 |
|
#
d223575e |
| 25-May-2022 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v5.19-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo: "Intel PT:
- Allow hardware tracing
Merge tag 'perf-tools-for-v5.19-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo: "Intel PT:
- Allow hardware tracing on KVM test programs. In this case, the VM is not running an OS, but only the functions loaded into it by the hypervisor test program, and conveniently, loaded at the same virtual addresses.
- Improve documentation: - Add link to perf wiki's page
- Cleanups: - Delete now unused perf-with-kcore.sh script - Remove unused machines__find_host()
ARM SPE (Statistical Profile Extensions):
- Add man page entry.
Vendor Events:
- Update various Intel event topics
- Update various microarch events
- Fix various cstate metrics
- Fix Alderlake metric groups
- Add sapphirerapids events
- Add JSON files for ARM Cortex A34, A35, A55, A510, A65, A73, A75, A77, A78, A710, X1, X2 and Neoverse E1
- Update Cortex A57/A72
perf stat:
- Introduce stats for the user and system rusage times
perf c2c:
- Prep work to support ARM systems
perf annotate:
- Add --percent-limit option
perf lock:
- Add -t/--thread option for report
- Do not discard broken lock stats
perf bench:
- Add breakpoint benchmarks
perf test:
- Limit to only run executable scripts in tests
- Add basic perf record tests
- Add stat record+report test
- Add basic stat and topdown group test
- Skip several tests when the user hasn't permission to perform them
- Fix test case 81 ("perf record tests") on s390x
perf version:
- debuginfod support improvements
perf scripting python:
- Expose symbol offset and source information
perf build:
- Error for BPF skeletons without LIBBPF
- Use Python devtools for version autodetection rather than runtime
Miscellaneous:
- Add riscv64 support to 'perf jitdump'
- Various fixes/tidy ups related to cpu_map
- Fixes for handling Intel hybrid systems"
* tag 'perf-tools-for-v5.19-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (122 commits) perf intel-pt: Add guest_code support perf kvm report: Add guest_code support perf script: Add guest_code support perf tools: Add guest_code support perf tools: Factor out thread__set_guest_comm() perf tools: Add machine to machines back pointer perf vendors events arm64: Update Cortex A57/A72 perf vendors events arm64: Arm Neoverse E1 perf vendors events arm64: Arm Cortex-X2 perf vendors events arm64: Arm Cortex-X1 perf vendors events arm64: Arm Cortex-A710 perf vendors events arm64: Arm Cortex-A78 perf vendors events arm64: Arm Cortex-A77 perf vendors events arm64: Arm Cortex-A75 perf vendors events arm64: Arm Cortex-A73 perf vendors events arm64: Arm Cortex-A65 perf vendors events arm64: Arm Cortex-A510 perf vendors events arm64: Arm Cortex-A55 perf vendors events arm64: Arm Cortex-A35 perf vendors events arm64: Arm Cortex-A34 ...
show more ...
|
Revision tags: v5.15.43, v5.15.42, v5.18, v5.15.41 |
|
#
df36d257 |
| 16-May-2022 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf bench breakpoint: Fix build on 32-bit arches
Cast pointers to unsigned long instead of to uint64_t to avoid this problem on 32-bit arches:
31 6.89 debian:experimental-x-mips : FAIL gc
perf bench breakpoint: Fix build on 32-bit arches
Cast pointers to unsigned long instead of to uint64_t to avoid this problem on 32-bit arches:
31 6.89 debian:experimental-x-mips : FAIL gcc version 11.2.0 (Debian 11.2.0-18) bench/breakpoint.c: In function 'breakpoint_setup': bench/breakpoint.c:56:24: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] 56 | attr.bp_addr = (uint64_t)addr; | ^ cc1: all warnings being treated as errors make[3]: *** [/git/perf-5.18.0-rc7/tools/build/Makefile.build:139: bench] Error 2
Fixes: 68a6772f11dbb1ed ("perf bench: Add breakpoint benchmarks") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/YoLq1nHx1doi+VWl@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v5.15.40, v5.15.39, v5.15.38 |
|
#
68a6772f |
| 05-May-2022 |
Dmitry Vyukov <dvyukov@google.com> |
perf bench: Add breakpoint benchmarks
Add 2 benchmarks:
1. Performance of thread creation/exiting in presence of breakpoints. 2. Performance of breakpoint modification in presence of threads.
The
perf bench: Add breakpoint benchmarks
Add 2 benchmarks:
1. Performance of thread creation/exiting in presence of breakpoints. 2. Performance of breakpoint modification in presence of threads.
The benchmarks capture use cases that we are interested in: using inheritable breakpoints in large highly-threaded applications.
The benchmarks show significant slowdown imposed by breakpoints (even when they don't fire).
Testing on Intel 8173M with 112 HW threads show:
perf bench --repeat=56 breakpoint thread --breakpoints=0 --parallelism=56 --threads=20 78.675000 usecs/op perf bench --repeat=56 breakpoint thread --breakpoints=4 --parallelism=56 --threads=20 12967.135714 usecs/op
That's 165x slowdown due to presence of the breakpoints.
perf bench --repeat=20000 breakpoint enable --passive=0 --active=0 1.433250 usecs/op perf bench --repeat=20000 breakpoint enable --passive=224 --active=0 585.318400 usecs/op perf bench --repeat=20000 breakpoint enable --passive=0 --active=111 635.953000 usecs/op
That's 408x and 444x slowdown due to presence of threads.
Profiles show some overhead in toggle_bp_slot, but also very high contention:
90.83% breakpoint-thre [kernel.kallsyms] [k] osq_lock 4.69% breakpoint-thre [kernel.kallsyms] [k] mutex_spin_on_owner 2.06% breakpoint-thre [kernel.kallsyms] [k] __reserve_bp_slot 2.04% breakpoint-thre [kernel.kallsyms] [k] toggle_bp_slot
79.01% breakpoint-enab [kernel.kallsyms] [k] smp_call_function_single 9.94% breakpoint-enab [kernel.kallsyms] [k] llist_add_batch 5.70% breakpoint-enab [kernel.kallsyms] [k] _raw_spin_lock_irq 1.84% breakpoint-enab [kernel.kallsyms] [k] event_function_call 1.12% breakpoint-enab [kernel.kallsyms] [k] send_call_function_single_ipi 0.37% breakpoint-enab [kernel.kallsyms] [k] generic_exec_single 0.24% breakpoint-enab [kernel.kallsyms] [k] __perf_event_disable 0.20% breakpoint-enab [kernel.kallsyms] [k] _perf_event_enable 0.18% breakpoint-enab [kernel.kallsyms] [k] toggle_bp_slot
Committer notes:
Fixup struct init for older compilers:
3 32.90 alpine:3.5 : FAIL clang version 3.8.1 (tags/RELEASE_381/final) bench/breakpoint.c:49:34: error: missing field 'size' initializer [-Werror,-Wmissing-field-initializers] struct perf_event_attr attr = {0}; ^ 1 error generated. 7 37.31 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0) bench/breakpoint.c:49:34: error: missing field 'size' initializer [-Werror,-Wmissing-field-initializers] struct perf_event_attr attr = {0}; ^ 1 error generated.
Signed-off-by: Dmitriy Vyukov <dvyukov@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220505155745.1690906-1-dvyukov@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|