#
bc88ba0c |
| 11-May-2023 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes. No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
Revision tags: v6.1.28 |
|
#
ff32fcca |
| 09-May-2023 |
Maxime Ripard <maxime@cerno.tech> |
Merge drm/drm-next into drm-misc-next
Start the 6.5 release cycle.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
#
f085df1b |
| 07-May-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo: "Third version of perf tool updates, w
Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo: "Third version of perf tool updates, with the build problems with with using a 'vmlinux.h' generated from the main build fixed, and the bpf skeleton build disabled by default.
Build:
- Require libtraceevent to build, one can disable it using NO_LIBTRACEEVENT=1.
It is required for tools like 'perf sched', 'perf kvm', 'perf trace', etc.
libtraceevent is available in most distros so installing 'libtraceevent-devel' should be a one-time event to continue building perf as usual.
Using NO_LIBTRACEEVENT=1 produces tooling that is functional and sufficient for lots of users not interested in those libtraceevent dependent features.
- Allow Python support in 'perf script' when libtraceevent isn't linked, as not all features requires it, for instance Intel PT does not use tracepoints.
- Error if the python interpreter needed for jevents to work isn't available and NO_JEVENTS=1 isn't set, preventing a build without support for JSON vendor events, which is a rare but possible condition. The two check error messages:
$(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.) $(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.)
- Make libbpf 1.0 the minimum required when building with out of tree, distro provided libbpf.
- Use libsdtc++'s and LLVM's libcxx's __cxa_demangle, a portable C++ demangler, add 'perf test' entry for it.
- Make binutils libraries opt in, as distros disable building with it due to licensing, they were used for C++ demangling, for instance.
- Switch libpfm4 to opt-out rather than opt-in, if libpfm-devel (or equivalent) isn't installed, we'll just have a build warning:
Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev
- Add a feature test for scandirat(), that is not implemented so far in musl and uclibc, disabling features that need it, such as scanning for tracepoints in /sys/kernel/tracing/events.
perf BPF filters:
- New feature where BPF can be used to filter samples, for instance:
$ sudo ./perf record -e cycles --filter 'period > 1000' true $ sudo ./perf script perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms]) perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms]) perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms]) true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms]) true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
- In addition to 'period' (PERF_SAMPLE_PERIOD), the other PERF_SAMPLE_ can be used for filtering, and also some other sample accessible values, from tools/perf/Documentation/perf-record.txt:
Essentially the BPF filter expression is:
<term> <operator> <value> (("," | "||") <term> <operator> <value>)*
The <term> can be one of: ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr, code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat, p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock, mem_dtlb, mem_blk, mem_hops
The <operator> can be one of: ==, !=, >, >=, <, <=, &
The <value> can be one of: <number> (for any term) na, load, store, pfetch, exec (for mem_op) l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl) na, none, hit, miss, hitm, fwd, peer (for mem_snoop) remote (for mem_remote) na, locked (for mem_locked) na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb) na, by_data, by_addr (for mem_blk) hops0, hops1, hops2, hops3 (for mem_hops)
perf lock contention:
- Show lock type with address.
- Track and show mmap_lock, siglock and per-cpu rq_lock with address. This is done for mmap_lock by following the current->mm pointer:
$ sudo ./perf lock con -abl -- sleep 10 contended total wait max wait avg wait address symbol ... 16344 312.30 ms 2.22 ms 19.11 us ffff8cc702595640 17686 310.08 ms 1.49 ms 17.53 us ffff8cc7025952c0 3 84.14 ms 45.79 ms 28.05 ms ffff8cc78114c478 mmap_lock 3557 76.80 ms 68.75 us 21.59 us ffff8cc77ca3af58 1 68.27 ms 68.27 ms 68.27 ms ffff8cda745dfd70 9 54.53 ms 7.96 ms 6.06 ms ffff8cc7642a48b8 mmap_lock 14629 44.01 ms 60.00 us 3.01 us ffff8cc7625f9ca0 3481 42.63 ms 140.71 us 12.24 us ffffffff937906ac vmap_area_lock 16194 38.73 ms 42.15 us 2.39 us ffff8cd397cbc560 11 38.44 ms 10.39 ms 3.49 ms ffff8ccd6d12fbb8 mmap_lock 1 5.43 ms 5.43 ms 5.43 ms ffff8cd70018f0d8 1674 5.38 ms 422.93 us 3.21 us ffffffff92e06080 tasklist_lock 581 4.51 ms 130.68 us 7.75 us ffff8cc9b1259058 5 3.52 ms 1.27 ms 703.23 us ffff8cc754510070 112 3.47 ms 56.47 us 31.02 us ffff8ccee38b3120 381 3.31 ms 73.44 us 8.69 us ffffffff93790690 purge_vmap_area_lock 255 3.19 ms 36.35 us 12.49 us ffff8d053ce30c80
- Update default map size to 16384.
- Allocate single letter option -M for --map-nr-entries, as it is proving being frequently used.
- Fix struct rq lock access for older kernels with BPF's CO-RE (Compile once, run everywhere).
- Fix problems found with MSAn.
perf report/top:
- Add inline information when using --call-graph=fp or lbr, as was already done to the --call-graph=dwarf callchain mode.
- Improve the 'srcfile' sort key performance by really using an optimization introduced in 6.2 for the 'srcline' sort key that avoids calling addr2line for comparision with each sample.
perf sched:
- Make 'perf sched latency/map/replay' to use "sched:sched_waking" instead of "sched:sched_waking", consistent with 'perf record' since d566a9c2d482 ("perf sched: Prefer sched_waking event when it exists").
perf ftrace:
- Make system wide the default target for latency subcommand, run the following command then generate some network traffic and press control+C:
# perf ftrace latency -T __kfree_skb ^C DURATION | COUNT | GRAPH | 0 - 1 us | 27 | ############# | 1 - 2 us | 22 | ########### | 2 - 4 us | 8 | #### | 4 - 8 us | 5 | ## | 8 - 16 us | 24 | ############ | 16 - 32 us | 2 | # | 32 - 64 us | 1 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - 2 ms | 0 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 0 | | 256 - 512 ms | 0 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | #
perf top:
- Add --branch-history (LBR: Last Branch Record) option, just like already available for 'perf record'.
- Fix segfault in thread__comm_len() where thread->comm was being used outside thread->comm_lock.
perf annotate:
- Allow configuring objdump and addr2line in ~/.perfconfig., so that you can use alternative binaries, such as llvm's.
perf kvm:
- Add TUI mode for 'perf kvm stat report'.
Reference counting:
- Add reference count checking infrastructure to check for use after free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs, more to come.
To build with it use -DREFCNT_CHECKING=1 in the make command line to build tools/perf. Documented at:
https://perf.wiki.kernel.org/index.php/Reference_Count_Checking
- The above caught, for instance, fix, present in this series:
- Fix maps use after put in 'perf test "Share thread maps"':
'maps' is copied from leader, but the leader is put on line 79 and then 'maps' is used to read the reference count below - so a use after put, with the put of maps happening within thread__put.
Fixed by reversing the order of puts so that the leader is put last.
- Also several fixes were made to places where reference counts were not being held.
- Make this one of the tests in 'make -C tools/perf build-test' to regularly build test it and to make sure no direct access to the reference counted structs are made, doing that via accessors to check the validity of the struct pointer.
ARM64:
- Fix 'perf report' segfault when filtering coresight traces by sparse lists of CPUs.
- Add support for 'simd' as a sort field for 'perf report', to show ARM's NEON SIMD's predicate flags: "partial" and "empty".
arm64 vendor events:
- Add N1 metrics.
Intel vendor events:
- Add graniterapids, grandridge and sierraforrest events.
- Refresh events for: alderlake, aldernaken, broadwell, broadwellde, broadwellx, cascadelakx, haswell, haswellx, icelake, icelakex, jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids, silvermont, skylake, tigerlake and westmereep-dp
- Refresh metrics for alderlake-n, broadwell, broadwellde, broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and skylakex.
perf stat:
- Implement --topdown using JSON metrics.
- Add TopdownL1 JSON metric as a default if present, but disable it for now for some Intel hybrid architectures, a series of patches addressing this is being reviewed and will be submitted for v6.5.
- Use metrics for --smi-cost.
- Update topdown documentation.
Vendor events (JSON) infrastructure:
- Add support for computing and printing metric threshold values. For instance, here is one found in thesapphirerapids json file:
{ "BriefDescription": "Percentage of cycles spent in System Management Interrupts.", "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)", "MetricGroup": "smi", "MetricName": "smi_cycles", "MetricThreshold": "smi_cycles > 0.1", "ScaleUnit": "100%" },
- Test parsing metric thresholds with the fake PMU in 'perf test pmu-events'.
- Support for printing metric thresholds in 'perf list'.
- Add --metric-no-threshold option to 'perf stat'.
- Add rand (reverse and) and has_pmem (optane memory) support to metrics.
- Sort list of input files to avoid depending on the order from readdir() helping in obtaining reproducible builds.
S/390:
- Add common metrics: - CPI (cycles per instruction), prbstate (ratio of instructions executed in problem state compared to total number of instructions), l1mp (Level one instruction and data cache misses per 100 instructions).
- Add cache metrics for z13, z14, z15 and z16.
- Add metric for TLB and cache.
ARM:
- Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE (Memory Tagging Extension) and MOPS (Memory Operations) load/store.
Intel PT hardware tracing:
- Add event type names UINTR (User interrupt delivered) and UIRET (Exiting from user interrupt routine), documented in table 32-50 "CFE Packet Type and Vector Fields Details" in the Intel Processor Trace chapter of The Intel SDM Volume 3 version 078.
- Add support for new branch instructions ERETS and ERETU.
- Fix CYC timestamps after standalone CBR
ARM CoreSight hardware tracing:
- Allow user to override timestamp and contextid settings.
- Fix segfault in dso lookup.
- Fix timeless decode mode detection.
- Add separate decode paths for timeless and per-thread modes.
auxtrace:
- Fix address filter entire kernel size.
Miscellaneous:
- Fix use-after-free and unaligned bugs in the PLT handling routines.
- Use zfree() to reduce chances of use after free.
- Add missing 0x prefix for addresses printed in hexadecimal in 'perf probe'.
- Suppress massive unsupported target platform errors in the unwind code.
- Fix return incorrect build_id size in elf_read_build_id().
- Fix 'perf scripts intel-pt-events.py' IPC output for Python 2 .
- Add missing new parameter in kfree_skb tracepoint to the python scripts using it.
- Add 'perf bench syscall fork' benchmark.
- Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in 'perf mem'.
- Fix wrong size expectation for perf test 'Setup struct perf_event_attr' caused by the patch adding perf_event_attr::config3.
- Fix some spelling mistakes"
* tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits) Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL" Revert "perf build: Warn for BPF skeletons if endian mismatches" perf metrics: Fix SEGV with --for-each-cgroup perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE perf stat: Separate bperf from bpf_profiler perf test record+probe_libc_inet_pton: Fix call chain match on x86_64 perf test record+probe_libc_inet_pton: Fix call chain match on s390 perf tracepoint: Fix memory leak in is_valid_tracepoint() perf cs-etm: Add fix for coresight trace for any range of CPUs perf build: Fix unescaped # in perf build-test perf unwind: Suppress massive unsupported target platform errors perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it perf script: Print raw ip instead of binary offset for callchain perf symbols: Fix return incorrect build_id size in elf_read_build_id() perf list: Modify the warning message about scandirat(3) perf list: Fix memory leaks in print_tracepoint_events() perf lock contention: Rework offset calculation with BPF CO-RE perf lock contention: Fix struct rq lock access perf stat: Disable TopdownL1 on hybrid perf stat: Avoid SEGV on counter->name ...
show more ...
|
Revision tags: v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22 |
|
#
56d9117c |
| 28-Mar-2023 |
Ian Rogers <irogers@google.com> |
perf annotate: Own objdump_path and disassembler_style strings
Make struct annotation_options own the strings objdump_path and disassembler_style, freeing them on exit. Add missing strdup for disass
perf annotate: Own objdump_path and disassembler_style strings
Make struct annotation_options own the strings objdump_path and disassembler_style, freeing them on exit. Add missing strdup for disassembler_style when read from a config file.
Committer notes:
Converted free(obj->member) to zfree(&obj->member) in annotation_options__exit()
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
217b7d41 |
| 28-Mar-2023 |
Ian Rogers <irogers@google.com> |
perf annotate: Add init/exit to annotation_options remove default
The annotation__default_options global variable was used to initialize annotation_options. Switch to the init/exit pattern as later
perf annotate: Add init/exit to annotation_options remove default
The annotation__default_options global variable was used to initialize annotation_options. Switch to the init/exit pattern as later changes will give ownership over strings and this will be necessary to avoid memory leaks.
Committer note:
Fix the GTK2=1 build, hist_entry__gtk_annotate() needs to receive a 'struct annotation_options' pointer.
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13 |
|
#
4f2c0a4a |
| 13-Dec-2022 |
Nick Terrell <terrelln@fb.com> |
Merge branch 'main' into zstd-linus
|
#
cfd1f6c1 |
| 13-Dec-2022 |
Jiri Kosina <jkosina@suse.cz> |
Merge branch 'for-6.2/apple' into for-linus
- new quirks for select Apple keyboards (Kerem Karabay, Aditya Garg)
|
#
e291c116 |
| 12-Dec-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.2 merge window.
|
Revision tags: v6.1 |
|
#
6b2b0d83 |
| 08-Dec-2022 |
Petr Mladek <pmladek@suse.com> |
Merge branch 'rework/console-list-lock' into for-linus
|
Revision tags: v6.0.12, v6.0.11, v6.0.10, v5.15.80 |
|
#
29583dfc |
| 21-Nov-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next-fixes
Backmerging to update drm-misc-next-fixes for the final phase of the release cycle.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.0.9, v5.15.79 |
|
#
002c6ca7 |
| 14-Nov-2022 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Catch up on 6.1-rc cycle in order to solve the intel_backlight conflict on linux-next.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
Revision tags: v6.0.8, v5.15.78 |
|
#
d93618da |
| 04-Nov-2022 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Needed to bring in v6.1-rc1 which contains commit f683b9d61319 ("i915: use the VMA iterator") which is needed for series https://patchwork.freedesktop.org/s
Merge drm/drm-next into drm-intel-gt-next
Needed to bring in v6.1-rc1 which contains commit f683b9d61319 ("i915: use the VMA iterator") which is needed for series https://patchwork.freedesktop.org/series/110083/ .
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
show more ...
|
Revision tags: v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4 |
|
#
14e77332 |
| 21-Oct-2022 |
Nick Terrell <terrelln@fb.com> |
Merge branch 'main' into zstd-next
|
Revision tags: v6.0.3 |
|
#
1aca5ce0 |
| 20-Oct-2022 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Backmerging to get v6.1-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
008f05a7 |
| 19-Oct-2022 |
Mark Brown <broonie@kernel.org> |
ASoC: jz4752b: Capture fixes
Merge series from Siarhei Volkau <lis8215@gmail.com>:
The patchset fixes: - Line In path stays powered off during capturing or bypass to mixer. - incorrectly repre
ASoC: jz4752b: Capture fixes
Merge series from Siarhei Volkau <lis8215@gmail.com>:
The patchset fixes: - Line In path stays powered off during capturing or bypass to mixer. - incorrectly represented dB values in alsamixer, et al. - incorrect represented Capture input selector in alsamixer in Playback tab. - wrong control selected as Capture Master
show more ...
|
#
a140a6a2 |
| 18-Oct-2022 |
Maxime Ripard <maxime@cerno.tech> |
Merge drm/drm-next into drm-misc-next
Let's kick-off this release cycle.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
#
c29a017f |
| 17-Oct-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.1-rc1' into next
Merge with mainline to bring in the latest changes to twl4030 driver.
|
#
8048b835 |
| 16-Oct-2022 |
Andrew Morton <akpm@linux-foundation.org> |
Merge branch 'master' into mm-hotfixes-stable
|
Revision tags: v6.0.2, v5.15.74, v5.15.73, v6.0.1 |
|
#
d465bff1 |
| 11-Oct-2022 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Add support for AMD on 'perf mem'
Merge tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Add support for AMD on 'perf mem' and 'perf c2c', the kernel enablement patches went via tip.
Example:
$ sudo perf mem record -- -c 10000 ^C[ perf record: Woken up 227 times to write data ] [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
$ sudo perf mem report -F mem,sample,snoop Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762 Memory access Samples Snoop N/A 700620 N/A L1 hit 126675 N/A L2 hit 424 N/A L3 hit 664 HitM L3 hit 10 N/A Local RAM hit 2 N/A Remote RAM (1 hop) hit 8558 N/A Remote Cache (1 hop) hit 3 N/A Remote Cache (1 hop) hit 2 HitM Remote Cache (2 hops) hit 10 HitM Remote Cache (2 hops) hit 6 N/A Uncached hit 4 N/A $
- "perf lock" improvements:
- Add -E/--entries option to limit the number of entries to display, say to ask for just the top 5 contended locks.
- Add -q/--quiet option to suppress header and debug messages.
- Add a 'perf test' kernel lock contention entry to test 'perf lock'.
- "perf lock contention" improvements:
- Ask BPF's bpf_get_stackid() to skip some callchain entries.
The ones closer to the tooling are bpf related and not that interesting, the ones calling the locking function are the ones we're interested in, example of a full, unskipped callstack:
- Allow changing the callstack depth and number of entries to skip.
1 10.74 us 10.74 us 10.74 us spinlock __bpf_trace_contention_begin+0xb 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117 0xffffffffbb8b8e75 bpf_trace_run2+0x35 0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb 0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5 0xffffffffbc1c26ff _raw_spin_lock+0x1f 0xffffffffbb841015 tick_do_update_jiffies64+0x25 0xffffffffbb8409ee tick_irq_enter+0x9e
- Show full callstack in verbose mode (-v option), sometimes this is desirable instead of showing just one callstack entry.
- Allow multiple time ranges in 'perf record --delay' to help in reducing the amount of data collected from hardware tracing (Intel PT, etc) when there is a rough idea of periods of time where events of interest take time.
- Add Intel PT to record only decoder debug messages when error happens.
- Improve layout of Intel PT man page.
- Add new branch types: alignment, data and inst faults and arch specific ones, such as fiq, debug_halt, debug_exit, debug_inst and debug_data on arm64.
Kernel enablement went thru the tip tree.
- Fix 'perf probe' error log check in 'perf test' when no debuginfo is available.
- Fix 'perf stat' aggregation mode logic, it should be looking at the CPU not at the core number.
- Fix flags parsing in 'perf trace' filters.
- Introduce compact encoding of CPU range encoding on perf.data, to avoid having a bitmap with all the CPUs.
- Improvements to the 'perf stat' metrics, including adding "core_wide", and computing "smt" from the CPU topology.
- Add support to the new PERF_FORMAT_LOST perf_event_attr.read_format, that allows tooling to ask for the precise number of lost samples for a given event.
- Add 'addr' sort key to see just the address of sampled instructions:
$ perf record -o- true | perf report -i- -s addr [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] # Samples: 12 of event 'cycles:u' # Event count (approx.): 252512 # # Overhead Address # ........ .................. 42.96% 0x7f96f08443d7 29.55% 0x7f96f0859b50 14.76% 0x7f96f0852e02 8.30% 0x7f96f0855028 4.43% 0xffffffff8de01087
perf annotate: Toggle full address <-> offset display
- Add 'f' hotkey to the 'perf annotate' TUI interface when in 'disassembler output' mode ('o' hotkey) to toggle showing full virtual address or just the offset.
- Cache DSO build-ids when synthesizing PERF_RECORD_MMAP records for pre-existing threads, at the start of a 'perf record' session, speeding up that record startup phase.
- Add a command line option to specify build ids in 'perf inject'.
- Update JSON event files for the Intel alderlake, broadwell, broadwellde, broadwellx, cascadelakex, haswell, haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, sandybridge, sapphirerapids, skylake, skylakex, and tigerlake processors.
- Update vendor JSON event files for the ARM Neoverse V1 and E1 platforms.
- Add a 'perf test' entry for 'perf mem' where a struct has false sharing and this gets detected in the 'perf mem' output, tested with Intel, AMD and ARM64 systems.
- Add a 'perf test' entry to test the resolution of java symbols, where an output like this is expected:
8.18% jshell jitted-50116-29.so [.] Interpreter 0.75% Thread-1 jitted-83602-1670.so [.] jdk.internal.jimage.BasicImageReader.getString(int)
- Add tests for the ARM64 CoreSight hardware tracing feature, with specially crafted pureloop, memcpy, thread loop and unroll tread that then gets traced and the output compared with expected output.
Documentation explaining it is also included.
- Add per thread Intel PT 'perf test' entry to check that PERF_RECORD_TEXT_POKE events are recorded per CPU, resulting in a mixture of per thread and per CPU events and mmaps, verify that this gets all recorded correctly.
- Introduce pthread mutex wrappers to allow for building with clang's -Wthread-safety, i.e. using the "guarded_by" "pt_guarded_by" "lockable", "exclusive_lock_function", "exclusive_trylock_function", "exclusive_locks_required", and "no_thread_safety_analysis" compiler function attributes.
- Fix empty version number when building outside of a git repo.
- Improve feature detection display when multiple versions of a feature are present, such as for binutils libbfd, that has a mix of possible ways to detect according to the Linux distribution.
Previously in some cases we had:
Auto-detecting system features <SNIP> ... libbfd: [ on ] ... libbfd-liberty: [ on ] ... libbfd-liberty-z: [ on ] <SNIP>
Now for this case we show just the main feature:
Auto-detecting system features <SNIP> ... libbfd: [ on ] <SNIP>
- Remove some unused structs, variables, macros, function prototypes and includes from various places.
* tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits) perf script: Add missing fields in usage hint perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB perf mem/c2c: Avoid printing empty lines for unsupported events perf mem/c2c: Add load store event mappings for AMD perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO} perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel perf stat: Fix cpu check to use id.cpu.cpu in aggr_printout() perf test coresight: Add relevant documentation about ARM64 CoreSight testing perf test: Add git ignore for tmp and output files of ARM CoreSight tests perf test coresight: Add unroll thread test shell script perf test coresight: Add unroll thread test tool perf test coresight: Add thread loop test shell scripts perf test coresight: Add thread loop test tool perf test coresight: Add memcpy thread test shell script perf test coresight: Add memcpy thread test tool perf test: Add git ignore for perf data generated by the ARM CoreSight tests perf test: Add arm64 asm pureloop test shell script perf test: Add asm pureloop test tool ...
show more ...
|
Revision tags: v5.15.72, v6.0, v5.15.71 |
|
#
7d18a824 |
| 23-Sep-2022 |
Namhyung Kim <namhyung@kernel.org> |
perf annotate: Toggle full address <-> offset display
Handle 'f' key to toggle the display offset and full address. Obviously it only works when users set to see disassembler output ('o' key). It'
perf annotate: Toggle full address <-> offset display
Handle 'f' key to toggle the display offset and full address. Obviously it only works when users set to see disassembler output ('o' key). It'd be useful when users want to see the full virtual address in the TUI annotate browser.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220923173142.805896-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64 |
|
#
9b3726ef |
| 26-Aug-2022 |
Ian Rogers <irogers@google.com> |
perf annotate: Update use of pthread mutex
Switch to the use of mutex wrappers that provide better error checking.
Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.
perf annotate: Update use of pthread mutex
Switch to the use of mutex wrappers that provide better error checking.
Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: André Almeida <andrealmeid@igalia.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Colin Ian King <colin.king@intel.com> Cc: Dario Petrillo <dario.pk1@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Fangrui Song <maskray@google.com> Cc: Hewenliang <hewenliang4@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jason Wang <wangborong@cdjrlc.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Pavithra Gurushankar <gpavithrasha@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Remi Bernon <rbernon@codeweavers.com> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tom Rix <trix@redhat.com> Cc: Weiguo Li <liwg06@foxmail.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: William Cohen <wcohen@redhat.com> Cc: Zechuan Chen <chenzechuan1@huawei.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Cc: yaowenbin <yaowenbin1@huawei.com> Link: https://lore.kernel.org/r/20220826164242.43412-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45 |
|
#
03ab8e62 |
| 31-May-2022 |
Konstantin Komarov <almaz.alexandrovich@paragon-software.com> |
Merge tag 'v5.18'
Linux 5.18
|
Revision tags: v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26 |
|
#
1136fa0c |
| 01-Mar-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v5.17-rc4' into for-linus
Merge with mainline to get the Intel ASoC generic helpers header and other changes.
|
Revision tags: v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16 |
|
#
87a0b2fa |
| 17-Jan-2022 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v5.16' into next
Sync up with mainline to bring in the latest API changes.
|
Revision tags: v5.15.15 |
|
#
8a2094d6 |
| 10-Jan-2022 |
Jiri Kosina <jkosina@suse.cz> |
Merge branch 'for-5.17/core' into for-linus
- support for USI style pens (Tero Kristo, Mika Westerberg) - quirk for devices that need inverted X/Y axes (Alistair Francis) - small core code cleanups
Merge branch 'for-5.17/core' into for-linus
- support for USI style pens (Tero Kristo, Mika Westerberg) - quirk for devices that need inverted X/Y axes (Alistair Francis) - small core code cleanups and deduplication (Benjamin Tissoires)
show more ...
|