Revision tags: v6.6.67, v6.6.66, v6.6.65, v6.6.64, v6.6.63, v6.6.62, v6.6.61, v6.6.60, v6.6.59, v6.6.58, v6.6.57, v6.6.56, v6.6.55, v6.6.54, v6.6.53, v6.6.52, v6.6.51, v6.6.50, v6.6.49, v6.6.48, v6.6.47, v6.6.46, v6.6.45, v6.6.44, v6.6.43, v6.6.42, v6.6.41, v6.6.40, v6.6.39, v6.6.38, v6.6.37, v6.6.36, v6.6.35, v6.6.34, v6.6.33, v6.6.32, v6.6.31, v6.6.30, v6.6.29, v6.6.28, v6.6.27, v6.6.26, v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3 |
|
#
c900529f |
| 12-Sep-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
535a265d |
| 09-Sep-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "perf tools maintainership:
Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "perf tools maintainership:
- Add git information for perf-tools and perf-tools-next trees and branches to the MAINTAINERS file. That is where development now takes place and myself and Namhyung Kim have write access, more people to come as we emulate other maintainer groups.
perf record:
- Record kernel data maps when 'perf record --data' is used, so that global variables can be resolved and used in tools that do data profiling.
perf trace:
- Remove the old, experimental support for BPF events in which a .c file was passed as an event: "perf trace -e hello.c" to then get compiled and loaded.
The only known usage for that, that shipped with the kernel as an example for such events, augmented the raw_syscalls tracepoints and was converted to a libbpf skeleton, reusing all the user space components and the BPF code connected to the syscalls.
In the end just the way to glue the BPF part and the user space type beautifiers changed, now being performed by libbpf skeletons.
The next step is to use BTF to do pretty printing of all syscall types, as discussed with Alan Maguire and others.
Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all path/filenames/strings, some of the networking data structures, perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5 seconds:
# perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5 0.000 ( 9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3 9.039 ( 0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 ? ( ): gpm/991 ... [continued]: clock_nanosleep()) = 0 10.133 ( ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ... ? ( ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 30.276 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 1230.814 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 2030.886 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 ? ( ): crond/1172 ... [continued]: clock_nanosleep()) = 0 3242.699 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 3728.078 ( ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ... 3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 4031.409 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0
Performance counter stats for 'sleep 5':
2,617,347 cycles 1,855,997 instructions # 0.71 insn per cycle
5.002282128 seconds time elapsed
0.000855000 seconds user 0.000852000 seconds sys
perf annotate:
- Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1) for licensing reasons, and we missed a build test on tools/perf/tests makefile.
Since we now default to NDEBUG=1, we ended up segfaulting when building with BUILD_NONDISTRO=1 because a needed initialization routine was being "error checked" via an assert.
Fix it by explicitly checking the result and aborting instead if it fails.
We better back propagate the error, but at least 'perf annotate' on samples collected for a BPF program is back working when perf is built with BUILD_NONDISTRO=1.
perf report/top:
- Add back TUI hierarchy mode header, that is seen when using 'perf report/top --hierarchy'.
- Fix the number of entries for 'e' key in the TUI that was preventing navigation of lines when expanding an entry.
perf report/script:
- Support cross platform register handling, allowing a perf.data file collected on one architecture to have registers sampled correctly displayed when analysis tools such as 'perf report' and 'perf script' are used on a different architecture.
- Fix handling of event attributes in pipe mode, i.e. when one uses:
perf record -o - | perf report -i -
When no perf.data files are used.
- Handle files generated via pipe mode with a version of perf and then read also via pipe mode with a different version of perf, where the event attr record may have changed, use the record size field to properly support this version mismatch.
perf probe:
- Accessing global variables from uprobes isn't supported, make the error message state that instead of stating that some minimal kernel version is needed to have that feature. This seems just a tool limitation, the kernel probably has all that is needed.
perf tests:
- Fix a reference count related leak in the dlfilter v0 API where the result of a thread__find_symbol_fb() is not matched with an addr_location__exit() to drop the reference counts of the resolved components (machine, thread, map, symbol, etc). Add a dlfilter test to make sure that doesn't regresses.
- Lots of fixes for the 'perf test' written in shell script related to problems found with the shellcheck utility.
- Fixes for 'perf test' shell scripts testing features enabled when perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf counters.
- Add perf record sample filtering test, things like the following example, that gets implemented as a BPF filter attached to the event:
# perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
- Improve the way the task_analyzer test checks if libtraceevent is linked, using 'perf version --build-options' instead of the more expensinve 'perf record -e "sched:sched_switch"'.
- Add support for riscv in the mmap-basic test. (This went as well via the RiscV tree, same contents).
libperf:
- Implement riscv mmap support (This went as well via the RiscV tree, same contents).
perf script:
- New tool that converts perf.data files to the firefox profiler format so that one can use the visualizer at https://profiler.firefox.com/. Done by Anup Sharma as part of this year's Google Summer of Code.
One can generate the output and upload it to the web interface but Anup also automated everything:
perf script gecko -F 99 -a sleep 60
- Support syscall name parsing on arm64.
- Print "cgroup" field on the same line as "comm".
perf bench:
- Add new 'uprobe' benchmark to measure the overhead of uprobes with/without BPF programs attached to it.
- breakpoints are not available on power9, skip that test.
perf stat:
- Add #num_cpus_online literal to be used in 'perf stat' metrics, and add this extra 'perf test' check that exemplifies its purpose:
TEST_ASSERT_VAL("#num_cpus_online", expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0); TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0); TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
Miscellaneous:
- Improve tool startup time by lazily reading PMU, JSON, sysfs data.
- Improve error reporting in the parsing of events, passing YYLTYPE to error routines, so that the output can show were the parsing error was found.
- Add 'perf test' entries to check the parsing of events improvements.
- Fix various leak for things detected by -fsanitize=address, mostly things that would be freed at tool exit, including:
- Free evsel->filter on the destructor.
- Allow tools to register a thread->priv destructor and use it in 'perf trace'.
- Free evsel->priv in 'perf trace'.
- Free string returned by synthesize_perf_probe_point() when the caller fails to do all it needs.
- Adjust various compiler options to not consider errors some warnings when building with broken headers found in things like python, flex, bison, as we otherwise build with -Werror. Some for gcc, some for clang, some for some specific version of those, some for some specific version of flex or bison, or some specific combination of these components, bah.
- Allow customization of clang options for BPF target, this helps building on gentoo where there are other oddities where BPF targets gets passed some compiler options intended for the native build, so building with WERROR=0 helps while these oddities are fixed.
- Dont pass ERR_PTR() values to perf_session__delete() in 'perf top' and 'perf lock', fixing some segfaults when handling some odd failures.
- Add LTO build option.
- Fix format of unordered lists in the perf docs (tools/perf/Documentation)
- Overhaul the bison files, using constructs such as YYNOMEM.
- Remove unused tokens from the bison .y files.
- Add more comments to various structs.
- A few LoongArch enablement patches.
Vendor events (JSON):
- Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
EventName, BriefDescription visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.", visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.", op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.", op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.", op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
- Add AmpereOne metrics (aarch64).
- Update N2 and V2 metrics (aarch64) and events using Arm telemetry repo.
- Update scale units and descriptions of common topdown metrics on aarch64. Things like: - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)", - "BriefDescription": "Frontend bound L1 topdown metric", + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))", + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
- Update events for intel: meteorlake to 1.04, sapphirerapids to 1.15, Icelake+ metric constraints.
- Update files for the power10 platform"
* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits) perf parse-events: Fix driver config term perf parse-events: Fixes relating to no_value terms perf parse-events: Fix propagation of term's no_value when cloning perf parse-events: Name the two term enums perf list: Don't print Unit for "default_core" perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake perf dlfilter: Avoid leak in v0 API test use of resolve_address() perf metric: Add #num_cpus_online literal perf pmu: Remove str from perf_pmu_alias perf parse-events: Make common term list to strbuf helper perf parse-events: Minor help message improvements perf pmu: Avoid uninitialized use of alias->str perf jevents: Use "default_core" for events with no Unit perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test perf test shell stat_bpf_counters: Fix test on Intel perf test shell record_bpf_filter: Skip 6.2 kernel libperf: Get rid of attr.id field perf tools: Convert to perf_record_header_attr_id() libperf: Add perf_record_header_attr_id() perf tools: Handle old data in PERF_RECORD_ATTR ...
show more ...
|
Revision tags: v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39 |
|
#
a5f3171b |
| 09-Jul-2023 |
Athira Rajeev <atrajeev@linux.vnet.ibm.com> |
perf tests lib probe_vfs_getname: Fix shellcheck warnings about missing shebang/local variables
Running shellcheck on probe_vfs_getname fails with below warning:
In ./tools/perf/tests/shell/lib/
perf tests lib probe_vfs_getname: Fix shellcheck warnings about missing shebang/local variables
Running shellcheck on probe_vfs_getname fails with below warning:
In ./tools/perf/tests/shell/lib/probe_vfs_getname.sh line 1: # Arnaldo Carvalho de Melo <acme@kernel.org>, 2017 ^-- SC2148 (error): Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
In ./tools/perf/tests/shell/lib/probe_vfs_getname.sh line 14: local verbose=$1 ^-----------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
Fix this: - by adding shebang in the beginning of the file and - rename variable verbose to "add_probe_verbose" after removing local
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230709182800.53002-19-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28 |
|
#
9a87ffc9 |
| 01-May-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.4 merge window.
|
Revision tags: v6.1.27 |
|
#
cdc780f0 |
| 26-Apr-2023 |
Jiri Kosina <jkosina@suse.cz> |
Merge branch 'for-6.4/amd-sfh' into for-linus
- assorted functional fixes for amd-sfh driver (Basavaraj Natikar)
|
Revision tags: v6.1.26, v6.3, v6.1.25, v6.1.24 |
|
#
ea68a3e9 |
| 11-Apr-2023 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Need to pull in commit from drm-next (earlier in drm-intel-next):
1eca0778f4b3 ("drm/i915: add struct i915_dsm to wrap dsm members together")
In order to
Merge drm/drm-next into drm-intel-gt-next
Need to pull in commit from drm-next (earlier in drm-intel-next):
1eca0778f4b3 ("drm/i915: add struct i915_dsm to wrap dsm members together")
In order to merge following patch to drm-intel-gt-next:
https://patchwork.freedesktop.org/patch/530942/?series=114925&rev=6
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
show more ...
|
Revision tags: v6.1.23, v6.1.22 |
|
#
cecdd52a |
| 28-Mar-2023 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Catch up with 6.3-rc cycle...
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
Revision tags: v6.1.21 |
|
#
e752ab11 |
| 20-Mar-2023 |
Rob Clark <robdclark@chromium.org> |
Merge remote-tracking branch 'drm/drm-next' into msm-next
Merge drm-next into msm-next to pick up external clk and PM dependencies for improved a6xx GPU reset sequence.
Signed-off-by: Rob Clark <ro
Merge remote-tracking branch 'drm/drm-next' into msm-next
Merge drm-next into msm-next to pick up external clk and PM dependencies for improved a6xx GPU reset sequence.
Signed-off-by: Rob Clark <robdclark@chromium.org>
show more ...
|
#
d26a3a6c |
| 17-Mar-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.3-rc2' into next
Merge with mainline to get of_property_present() and other newer APIs.
|
Revision tags: v6.1.20, v6.1.19 |
|
#
b3c9a041 |
| 13-Mar-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-fixes into drm-misc-fixes
Backmerging to get latest upstream.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
#
a1eccc57 |
| 13-Mar-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get v6.3-rc1 and sync with the other DRM trees.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14 |
|
#
0df82189 |
| 23-Feb-2023 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.3-1-2023-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo: "Miscellaneous:
- Add Ian Rogers
Merge tag 'perf-tools-for-v6.3-1-2023-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo: "Miscellaneous:
- Add Ian Rogers to MAINTAINERS as a perf tools reviewer.
- Add support for retire latency feature (pipeline stall of a instruction compared to the previous one, in cycles) present on some Intel processors.
- Add 'perf c2c' report option to show false sharing with adjacent cachelines, to be used in machines with cacheline prefetching, where accesses to a cacheline brings the next one too.
- Skip 'perf test bpf' when the required kernel-debuginfo package isn't installed.
- Avoid d3-flame-graph package dependency in 'perf script flamegraph', making this feature more generally available.
- Add JSON metric events to present CPI stall cycles in Power10.
- Assorted improvements/refactorings on the JSON metrics parsing code.
perf lock contention:
- Add -o/--lock-owner option:
$ sudo ./perf lock contention -abo -- ./perf bench sched pipe # Running 'sched/pipe' benchmark: # Executed 1000000 pipe operations between two processes
Total time: 4.766 [sec]
4.766540 usecs/op 209795 ops/sec contended total wait max wait avg wait pid owner
403 565.32 us 26.81 us 1.40 us -1 Unknown 4 27.99 us 8.57 us 7.00 us 1583145 sched-pipe 1 8.25 us 8.25 us 8.25 us 1583144 sched-pipe 1 2.03 us 2.03 us 2.03 us 5068 chrome
The owner is unknown in most cases. Filtering only for the mutex locks, it will more likely get the owners.
- -S/--callstack-filter is to limit display entries having the given string in the callstack:
$ sudo ./perf lock contention -abv -S net sleep 1 ... contended total wait max wait avg wait type caller
5 70.20 us 16.13 us 14.04 us spinlock __dev_queue_xmit+0xb6d 0xffffffffa5dd1c60 _raw_spin_lock+0x30 0xffffffffa5b8f6ed __dev_queue_xmit+0xb6d 0xffffffffa5cd8267 ip6_finish_output2+0x2c7 0xffffffffa5cdac14 ip6_finish_output+0x1d4 0xffffffffa5cdb477 ip6_xmit+0x457 0xffffffffa5d1fd17 inet6_csk_xmit+0xd7 0xffffffffa5c5f4aa __tcp_transmit_skb+0x54a 0xffffffffa5c6467d tcp_keepalive_timer+0x2fd
Please note that to have the -b option (BPF) working above one has to build with BUILD_BPF_SKEL=1.
- Add more 'perf test' entries to test these new features.
perf script:
- Add 'cgroup' field for 'perf script' output:
$ perf record --all-cgroups -- true $ perf script -F comm,pid,cgroup true 337112 /user.slice/user-657345.slice/user@657345.service/... true 337112 /user.slice/user-657345.slice/user@657345.service/... true 337112 /user.slice/user-657345.slice/user@657345.service/... true 337112 /user.slice/user-657345.slice/user@657345.service/...
- Add support for showing branch speculation information in 'perf script' and in the 'perf report' raw dump (-D).
perf record:
- Fix 'perf record' segfault with --overwrite and --max-size.
perf test/bench:
- Switch basic BPF filtering test to use syscall tracepoint to avoid the variable number of probes inserted when using the previous probe point (do_epoll_wait) that happens on different CPU architectures.
- Fix DWARF unwind test by adding non-inline to expected function in a backtrace.
- Use 'grep -c' where the longer form 'grep | wc -l' was being used.
- Add getpid and execve benchmarks to 'perf bench syscall'.
Intel PT:
- Add support for synthesizing "cycle" events from Intel PT traces as we support "instruction" events when Intel PT CYC packets are available. This enables much more accurate profiles than when using the regular 'perf record -e cycles' (the default) when the workload lasts for very short periods (<10ms).
- .plt symbol handling improvements, better handling IBT (in the past MPX) done in the context of decoding Intel PT processor traces, IFUNC symbols on x86_64, static executables, understanding .plt.got symbols on x86_64.
- Add a 'perf test' to test symbol resolution, part of the .plt improvements series, this tests things like symbol size in contexts where only the symbol start is available (kallsyms), etc.
- Better handle auxtrace/Intel PT data when using pipe mode (perf record sleep 1|perf report).
- Fix symbol lookup with kcore with multiple segments match stext, getting the symbol resolution to just show DSOs as unknown.
ARM:
- Timestamp improvements for ARM64 systems with ETMv4 (Embedded Trace Macrocell v4).
- Ensure ARM64 CoreSight timestamps don't go backwards.
- Document that ARM64 SPE (Statistical Profiling Extension) is used with 'perf c2c/mem'.
- Add raw decoding for ARM64 SPEv1.2 previous branch address.
- Update neoverse-n2-v2 ARM vendor events (JSON tables): topdown L1, TLB, cache, branch, PE utilization and instruction mix metrics.
- Update decoder code for OpenCSD version 1.4, on ARM64 systems.
- Fix command line auto-complete of CPU events on aarch64.
Build:
- Fix 'perf probe' and 'perf test' when libtraceevent isn't linked, as several tests use tracepoints, those should be skipped.
- More fallout fixes for the removal of tools/lib/traceevent/.
- Fix build error when linking with libpfm"
* tag 'perf-tools-for-v6.3-1-2023-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (114 commits) perf tests stat_all_metrics: Change true workload to sleep workload for system wide check perf vendor events power10: Add JSON metric events to present CPI stall cycles in powerpc perf intel-pt: Synthesize cycle events perf c2c: Add report option to show false sharing in adjacent cachelines perf record: Fix segfault with --overwrite and --max-size perf stat: Avoid merging/aggregating metric counts twice perf tools: Fix perf tool build error in util/pfm.c perf tools: Fix auto-complete on aarch64 perf lock contention: Support old rw_semaphore type perf lock contention: Add -o/--lock-owner option perf lock contention: Fix to save callstack for the default modified perf test bpf: Skip test if kernel-debuginfo is not present perf probe: Update the exit error codes in function try_to_find_probe_trace_event perf script: Fix missing Retire Latency fields option documentation perf event x86: Add retire_lat when synthesizing PERF_SAMPLE_WEIGHT_STRUCT perf test x86: Support the retire_lat (Retire Latency) sample_type check perf test bpf: Check for libtraceevent support perf script: Support Retire Latency perf report: Support Retire Latency perf lock contention: Support filters for different aggregation ...
show more ...
|
Revision tags: v6.1.13 |
|
#
7ae9fb1b |
| 21-Feb-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.3 merge window.
|
Revision tags: v6.2, v6.1.12, v6.1.11, v6.1.10 |
|
#
766b0bee |
| 01-Feb-2023 |
Athira Rajeev <atrajeev@linux.vnet.ibm.com> |
perf tests shell: Fix check for libtracevent support
Test “Use vfs_getname probe to get syscall args filenames” fails in environment with missing libtraceevent support as below:
82: Use vfs_getna
perf tests shell: Fix check for libtracevent support
Test “Use vfs_getname probe to get syscall args filenames” fails in environment with missing libtraceevent support as below:
82: Use vfs_getname probe to get syscall args filenames : --- start --- test child forked, pid 304726 Recording open file: event syntax error: 'probe:vfs_getname*' \___ unsupported tracepoint
libtraceevent is necessary for tracepoint support Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events test child finished with -1 ---- end ---- Use vfs_getname probe to get syscall args filenames: FAILED!
The environment has debuginfo but is missing the libtraceevent devel.
Hence perf is compiled without libtraceevent support. The test tries to add probe “probe:vfs_getname” and then uses it with “perf record”. This fails at function “parse_events_add_tracepoint" due to missing libtraceevent.
Similarly "probe libc's inet_pton & backtrace it with ping" test slso fails with same reason.
Add a function in 'perf test shell' library to check if perf record with —dry-run reports any error on missing support for libtraceevent. Update both the tests to use this new function “skip_no_probe_record_support” before proceeding With using probe point via perf builtin record.
With the change,
82: Use vfs_getname probe to get syscall args filenames : --- start --- test child forked, pid 305014 Recording open file: libtraceevent is necessary for tracepoint support test child finished with -2 ---- end ---- Use vfs_getname probe to get syscall args filenames: Skip
81: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 305036 libtraceevent is necessary for tracepoint support test child finished with -2 ---- end ---- probe libc's inet_pton & backtrace it with ping: Skip
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kjain@linux.ibm.com, Cc: linuxppc-dev@lists.ozlabs.org Link: http://lore.kernel.org/r/20230201180421.59640-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.1.9, v6.1.8 |
|
#
6f849817 |
| 19-Jan-2023 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging into drm-misc-next to get DRM accelerator infrastructure, which is required by ipuv driver.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v6.1.7 |
|
#
d0e99511 |
| 17-Jan-2023 |
Kalle Valo <kvalo@kernel.org> |
Merge wireless into wireless-next
Due to the two cherry picked commits from wireless to wireless-next we have several conflicts in mt76. To avoid any bugs with conflicts merge wireless into wireless
Merge wireless into wireless-next
Due to the two cherry picked commits from wireless to wireless-next we have several conflicts in mt76. To avoid any bugs with conflicts merge wireless into wireless-next.
96f134dc1964 wifi: mt76: handle possible mt76_rx_token_consume failures fe13dad8992b wifi: mt76: dma: do not increment queue head if mt76_dma_add_buf fails
show more ...
|
Revision tags: v6.1.6, v6.1.5, v6.0.19 |
|
#
407da561 |
| 09-Jan-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.2-rc3' into next
Merge with mainline to bring in timer_shutdown_sync() API.
|
Revision tags: v6.0.18, v6.1.4, v6.1.3, v6.0.17 |
|
#
2c55d703 |
| 03-Jan-2023 |
Maxime Ripard <maxime@cerno.tech> |
Merge drm/drm-fixes into drm-misc-fixes
Let's start the fixes cycle.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
#
0d8eae7b |
| 02-Jan-2023 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync up with v6.2-rc1.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
Revision tags: v6.1.2, v6.0.16 |
|
#
b501d4dc |
| 30-Dec-2022 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Sync after v6.2-rc1 landed in drm-next.
We need to get some dependencies in place before we can merge the fixes series from Gwan-gyeong and Chris.
Referen
Merge drm/drm-next into drm-intel-gt-next
Sync after v6.2-rc1 landed in drm-next.
We need to get some dependencies in place before we can merge the fixes series from Gwan-gyeong and Chris.
References: https://lore.kernel.org/all/Y6x5JCDnh2rvh4lA@intel.com/ Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
show more ...
|
#
6599e683 |
| 28-Dec-2022 |
Mauro Carvalho Chehab <mchehab@kernel.org> |
Merge tag 'v6.2-rc1' into media_tree
Linux 6.2-rc1
* tag 'v6.2-rc1': (14398 commits) Linux 6.2-rc1 treewide: Convert del_timer*() to timer_shutdown*() pstore: Properly assign mem_type propert
Merge tag 'v6.2-rc1' into media_tree
Linux 6.2-rc1
* tag 'v6.2-rc1': (14398 commits) Linux 6.2-rc1 treewide: Convert del_timer*() to timer_shutdown*() pstore: Properly assign mem_type property pstore: Make sure CONFIG_PSTORE_PMSG selects CONFIG_RT_MUTEXES cfi: Fix CFI failure with KASAN perf python: Fix splitting CC into compiler and options afs: Stop implementing ->writepage() afs: remove afs_cache_netfs and afs_zap_permits() declarations afs: remove variable nr_servers afs: Fix lost servers_outstanding count ALSA: usb-audio: Add new quirk FIXED_RATE for JBL Quantum810 Wireless ALSA: azt3328: Remove the unused function snd_azf3328_codec_outl() gcov: add support for checksum field test_maple_tree: add test for mas_spanning_rebalance() on insufficient data maple_tree: fix mas_spanning_rebalance() on insufficient data hugetlb: really allocate vma lock for all sharable vmas kmsan: export kmsan_handle_urb kmsan: include linux/vmalloc.h mm/mempolicy: fix memory leak in set_mempolicy_home_node system call mm, mremap: fix mremap() expanding vma with addr inside vma ...
show more ...
|
#
c183e6c3 |
| 21-Dec-2022 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
Revision tags: v6.1.1, v6.0.15, v6.0.14 |
|
#
563a5423 |
| 16-Dec-2022 |
Andrew Morton <akpm@linux-foundation.org> |
Merge branch 'master' into mm-nonmm-stable
|
#
bcfbff2e |
| 16-Dec-2022 |
Andrew Morton <akpm@linux-foundation.org> |
Merge branch 'master' into mm-stable
|
#
aa4800e3 |
| 16-Dec-2022 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.2-1-2022-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo: "Libraries:
- Drop the old copy o
Merge tag 'perf-tools-for-v6.2-1-2022-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo: "Libraries:
- Drop the old copy of libtraceevent in tools/lib/traceevent/ now that all major distros ship it from its external repository.
This is now just another feature detection, emitting a warning when the libtraceevent-dev[el] package isn't installed, disabling the build of perf features and tools that strictly require parsing things from tracefs while keeping the core functionality present and working with a subset of the events, the most used ones like CPU cycles, hardware cache and also vendor events, etc.
This was tested with lots of containers for Fedora, Debian, OpenSUSE, Alpine Linux, Ubuntu, with cross builds, etc.
Build:
- Update to C standard to gnu11, like was done for the kernel.
- Install the tools/lib/ libraries locally instead of having headers searched directly from the source code directories, to help the cases where we can build either from in-kernel source libraries or from the same library shipped as a distro package, as is the case with libbpf and was the case with libtraceevent.
perf stat:
- Do not delay the workload with --delay, the delay is just for starting to count the events, to skip noise at workload startup.
- When we have events for each cgroup, the metric should be printed for each cgroup separately.
$ perf stat -a --for-each-cgroup system.slice,user.slice --metric-only sleep 1
Performance counter stats for 'system wide':
GHz insn per cycle branch-misses of all branches system.slice 3.792 0.61 3.24% user.slice 3.661 2.32 0.37%
- Fix printing field separator in CSV metrics output.
- Fix --metric-only --json output.
- Fix summary output in CSV with --metric-only.
- Update event group check for support of uncore event.
perf test:
- Stop requiring a C toolchain in shell tests, instead add a workload option that has all the previously C snippets built as part of 'perf test -w' that then get used in the 'perf test' shell scripts.
- Add event group test for events in multiple PMUs
- The "kernel lock contention analysis" test should not print warnings in quiet mode.
- Add attr tests for ARM64's new VG register.
- Fix record test on KVM guests, as using precise flag with the br_inst_retired.near_call event causes the test fail on KVM guests, even when the guests have PMU forwarding enabled and the event itself is supported, so just remove the precise flag from the event.
- Add mechanism for skipping attr tests on specific kernel versions where it is known that these checks will fail.
- Skip watchpoint tests if no watchpoints available.
- Add more Intel PT 'perf test' entries: hybrid CPUs, split the packet decoder into a suite of subtests.
perf script:
- Introduce task analyzer python script, where one first records some events:
Recording can be done in two ways:
$ perf script record tasks-analyzer -- sleep 10 $ perf record -e sched:sched_switch -a -- sleep 10
The script can parse any perf.data files, as long as it has sched:sched_switch events, other events will be ignored.
The most simple report use case is to just call the script without arguments.
Runtime is the time the task was running on the CPU, Time Out-In is the time between the process being scheduled *out* and scheduled back *in*. So the last time span between two executions:
$ perf script report tasks-analyzer Switched-In Switched-Out CPU PID TID Comm Runtime Time Out-In 15576.658891407 15576.659156086 4 2412 2428 gdbus 265 1949 15576.659111320 15576.659455410 0 2412 2412 gnome-shell 344 2267 15576.659491326 15576.659506173 2 74 74 kworker/2:1 15 13145 15576.659506173 15576.659825748 2 2858 2858 gnome-terminal- 320 63263 15576.659871270 15576.659902872 6 20932 20932 kworker/u16:0 32 2314582 15576.659909951 15576.659945501 3 27264 27264 sh 36 -1 15576.659853285 15576.659971052 7 27265 27265 perf 118 5050741 [...]
perf lock:
- Allow concurrent record and report to support live monitoring of kernel lock contention without BPF:
# perf lock record -a -o- sleep 1 | perf lock contention -i- contended total wait max wait avg wait type caller
2 10.27 us 6.17 us 5.13 us spinlock load_balance+0xc03 1 5.29 us 5.29 us 5.29 us rwlock:W ep_scan_ready_list+0x54 1 4.12 us 4.12 us 4.12 us spinlock smpboot_thread_fn+0x116 1 3.28 us 3.28 us 3.28 us mutex pipe_read+0x50
- Implement -t/--threads option when using BPF:
$ sudo ./perf lock contention -abt -E 5 sleep 1 contended total wait max wait avg wait pid comm
1 740.66 ms 740.66 ms 740.66 ms 1950 nv_queue 3 305.50 ms 298.19 ms 101.83 ms 1884 nvidia-modeset/ 1 25.14 us 25.14 us 25.14 us 2725038 EventManager_De 12 23.09 us 9.30 us 1.92 us 0 swapper 1 20.18 us 20.18 us 20.18 us 2725033 EventManager_De
- Add -l/--lock-addr to aggregate per-lock-instance contention:
$ sudo ./perf lock contention -abl sleep 1 contended total wait max wait avg wait address symbol
1 36.28 us 36.28 us 36.28 us ffff92615d6448b8 9 10.91 us 1.84 us 1.21 us ffffffffbaed50c0 rcu_state 1 10.49 us 10.49 us 10.49 us ffff9262ac4f0c80 8 4.68 us 1.67 us 585 ns ffffffffbae07a40 jiffies_lock 3 3.03 us 1.45 us 1.01 us ffff9262277861e0 1 924 ns 924 ns 924 ns ffff926095ba9d20 1 436 ns 436 ns 436 ns ffff9260bfda4f60
perf record:
- Add remaining branch filters: "no_cycles", "no_flags" & "hw_index", to be used with hardware such as Intel's LBR that allows things like stitching stacks of two samples to overcome the limits of the number of LBR registers.
Symbol resolution:
- Handle .debug files created with 'objcopy --only-keep-debug', where program headers are zeroed and thus can't be used for adjustments, use the info in the runtime_ss (runtime ELF) instead.
perf trace:
- Add BPF based augmenter for the 'perf_event_open's 'struct perf_event_attr' argument.
- Add BPF based augmenter for the 'clock_gettime's 'struct timespec' argument.
- In both cases the syscall tracepoint has just the pointer value, we need to hook a BPF program to collect the pointer contents, and then, in userspace, pretty print it in 'perf trace'.
perf list:
- Introduce JSON output of events.
- Streamline how the expression specifying what events should be shown is handled, fixing several corner cases, such as the metric filter that is specified as a glob but was using strstr().
perf probe:
- Fix to avoid crashing if DW_AT_decl_file is NULL, coping with clang generating DWARF5 like that.
- Use dwarf_attr_integrate() as generic DWARF attr accessor as it supersedes dwarf_attr(), supporting abstact origin DIEs.
perf inject:
- Set PERF_RECORD_MISC_BUILD_ID_SIZE in the PERF_RECORD_HEADER_BUILD_ID so that perf.data readers can get the real build-id size and avoid trailing zeroes.
perf data:
- Add tracepoint fields when converting a perf.data file to JSON.
arm64:
- Fix mksyscalltbl, don't lose syscalls due to sort -nu.
- Add Arm Neoverse V2 PMU events.
riscv:
- Add riscv sbi firmware std event files.
- Add Sifive U74 vendor events (JSON) file.
- Add some more events and metrics for Alderlake/Alderlake-N.
Documentation:
- Add data documentation for the PMU structs in the C source code.
Miscellaneous:
- Periodic sanitization of headers, adding missing includes, removing needless ones, creating new ones, etc.
- Use sig_atomic_t for signal handlers to avoid undefined behaviour in all perf tools.
- Fixes for libbpf 1.0+ compatibility (maps, etc) on 'perf trace' BPF examples.
- Remove some old perf bpf examples, leave the best ones that demonstrate how to associate BPF functions to points in the kernel.
- Make quiet mode consistent between tools.
- Use dedicated non-atomic clear/set bit helpers.
- Use "grep -E" instead of "egrep" as recommended by warning emitted by GNU grep since at least version 3.8.
- Complete list of supported subcommands in the 'perf daemon' help message.
- Update John Garry's email address for arm64 perf tooling on the MAINTAINERS file, he moved from Huawei to Oracle"
* tag 'perf-tools-for-v6.2-1-2022-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (239 commits) libperf: Fix install_pkgconfig target perf tools: Use "grep -E" instead of "egrep" perf stat: Do not delay the workload with --delay perf evlist: Remove group option. perf build: Fix python/perf.so library's name perf test arm64: Add attr tests for new VG register perf test: Add mechanism for skipping attr tests on kernel versions perf test: Add mechanism for skipping attr tests on auxiliary vector values perf test: Add ability to test exit code for attr tests perf test: add new task-analyzer tests perf script: task-analyzer add csv support perf script: Introduce task analyzer python script perf cs-etm: Print auxtrace info even if OpenCSD isn't linked perf cs-etm: Cleanup cs_etm__process_auxtrace_info() perf cs-etm: Tidy up auxtrace info header printing perf cs-etm: Remove unused stub methods perf cs-etm: Print unknown header version as an error perf test: Update perf lock contention test perf lock contention: Add -l/--lock-addr option perf lock contention: Implement -t/--threads option for BPF ...
show more ...
|