#
747a9b0a |
| 14-Jan-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar: "Tooling fixes, the biggest patch is one that decouples the kernel's
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar: "Tooling fixes, the biggest patch is one that decouples the kernel's list.h from tooling list.h"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) perf tools: Fallback to srcdir/Documentation/tips.txt perf ui/tui: Print helpline message as is perf tools: Set and pass DOCDIR to builtin-report.c perf tools: Add file_only config option to strlist perf tools: Add more usage tips perf record: Add --buildid-all option tools subcmd: Add missing NORETURN define for parse-options.h tools: Fix formatting of the "make -C tools" help message tools: Make list.h self-sufficient perf tools: Fix mmap2 event allocation in synthesize code perf stat: Fix recort_usage typo perf test: Reset err after using it hold errcode in hist testcases perf test: Fix false TEST_OK result for 'perf test hist' tools build: Add BPF feature check to test-all perf bpf: Fix build breakage due to libbpf tools: Move Makefile.arch from perf/config to tools/scripts perf tools: Fix PowerPC native building perf tools: Fix phony build target for build-test perf tools: Add -lutil in python lib list for broken python-config perf tools: Add missing sources to perf's MANIFEST ...
show more ...
|
#
ddb5388f |
| 12-Jan-2016 |
David S. Miller <davem@davemloft.net> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
|
#
c0c57019 |
| 12-Jan-2016 |
Ingo Molnar <mingo@kernel.org> |
Merge commit 'linus' into x86/urgent, to pick up recent x86 changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
0bd106d2 |
| 12-Jan-2016 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
" - Fix a few clean targets in to
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
" - Fix a few clean targets in tools/ (Jiri Olsa)
- Add missing sources to perf's MANIFEST, fixing the out of tree build with 'make perf-tar*-src-pkg' tarballs (Jiri Olsa)
- Fix bpf related build problems in PowerPC (Naveen N. Rao, Wang Nan)
- 'make -C tools/perf build-test' fixes (Wang Nan)
- Fix 'perf test hist' entry (Wang Nan)
- Add BPF feature check to test-all, as in an environment with all other features enabled, BPF would be considered enabled without doing real feature check. (Wang Nan)"
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
5cb52b5e |
| 11-Jan-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Kernel side changes:
- Intel Knights Landing support. (Harish C
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Kernel side changes:
- Intel Knights Landing support. (Harish Chegondi)
- Intel Broadwell-EP uncore PMU support. (Kan Liang)
- Core code improvements. (Peter Zijlstra.)
- Event filter, LBR and PEBS fixes. (Stephane Eranian)
- Enable cycles:pp on Intel Atom. (Stephane Eranian)
- Add cycles:ppp support for Skylake. (Andi Kleen)
- Various x86 NMI overhead optimizations. (Andi Kleen)
- Intel PT enhancements. (Takao Indoh)
- AMD cache events fix. (Vince Weaver)
Tons of tooling changes:
- Show random perf tool tips in the 'perf report' bottom line (Namhyung Kim)
- perf report now defaults to --group if the perf.data file has grouped events, try it with:
# perf record -e '{cycles,instructions}' -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ] # perf report # Samples: 1K of event 'anon group { cycles, instructions }' # Event count (approx.): 1955219195 # # Overhead Command Shared Object Symbol
2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle 1.05% 0.33% firefox libxul.so [.] js::SetObjectElement 1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno 0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab 0.65% 0.86% firefox libxul.so [.] js::ValueToId<(js::AllowGC)1> 0.64% 0.23% JS Helper libxul.so [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay 0.62% 1.27% firefox libxul.so [.] js::GetIterator 0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty 0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining
- Introduce the 'perf stat record/report' workflow:
Generate perf.data files from 'perf stat', to tap into the scripting capabilities perf has instead of defining a 'perf stat' specific scripting support to calculate event ratios, etc.
Simple example:
$ perf stat record -e cycles usleep 1
Performance counter stats for 'usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$ perf stat report
Performance counter stats for '/home/acme/bin/perf stat record -e cycles usleep 1':
1,134,996 cycles
0.000670644 seconds time elapsed
$
It generates PERF_RECORD_ userspace records to store the details:
$ perf report -D | grep PERF_RECORD 0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637 0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535 0x12a [0x40]: PERF_RECORD_STAT_CONFIG 0x16a [0x30]: PERF_RECORD_STAT -1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text 0x1da [0x18]: PERF_RECORD_STAT_ROUND [acme@ssdandy linux]$
An effort was made to make perf.data files generated like this to not generate cryptic messages when processed by older tools.
The 'perf script' bits need rebasing, will go up later.
- Make command line options always available, even when they depend on some feature being enabled, warning the user about use of such options (Wang Nan)
- Support hw breakpoint events (mem:0xAddress) in the default output mode in 'perf script' (Wang Nan)
- Fixes and improvements for supporting annotating ARM binaries, support ARM call and jump instructions, more work needed to have arch specific stuff separated into tools/perf/arch/*/annotate/ (Russell King)
- Add initial 'perf config' command, for now just with a --list command to the contents of the configuration file in use and a basic man page describing its format, commands for doing edits and detailed documentation are being reviewed and proof-read. (Taeung Song)
- Allows BPF scriptlets specify arguments to be fetched using DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan)
- Allow attaching BPF scriptlets to module symbols (Wang Nan)
- Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan)
- BPF programs now can specify 'perf probe' tunables via its section name, separating key=val values using semicolons (Wang Nan)
Testing some of these new BPF features:
Use case: get callchains when receiving SSL packets, filter then in the kernel, at arbitrary place.
# cat ssl.bpf.c #define SEC(NAME) __attribute__((section(NAME), used))
struct pt_regs;
SEC("func=__inet_lookup_established hnum") int func(struct pt_regs *ctx, int err, unsigned short port) { return err == 0 && port == 443; }
char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; # # perf record -a -g -e ssl.bpf.c ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ] # perf script | head -30 swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux) 856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux) 2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux) 2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux) 96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux) 969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux) 2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux) 95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux) 1163ffa start_kernel ([kernel.vmlinux].init.text) 11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text) 1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)
qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux) 8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux) 430a br_handle_frame_finish ([bridge]) 48bc br_handle_frame ([bridge]) 855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) #
- Use 'perf probe' various options to list functions, see what variables can be collected at any given point, experiment first collecting without a filter, then filter, use it together with 'perf trace', 'perf top', with or without callchains, if it explodes, please tell us!
- Introduce a new callchain mode: "folded", that will list per line representations of all callchains for a give histogram entry, facilitating 'perf report' output processing by other tools, such as Brendan Gregg's flamegraph tools (Namhyung Kim)
E.g:
# perf report | grep -v ^# | head 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry | ---cpu_startup_entry | |--12.07%--start_secondary | --6.30%--rest_init start_kernel x86_64_start_reservations x86_64_start_kernel #
Becomes, in "folded" mode:
# perf report -g folded | grep -v ^# | head -5 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry 12.07% cpu_startup_entry;start_secondary 6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle 11.23% call_cpuidle;cpu_startup_entry;start_secondary 5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter 11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state #
The user can also select one of "count", "period" or "percent" as the first column.
... and lots of infrastructure enhancements, plus fixes and other changes, features I failed to list - see the shortlog and the git log for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (271 commits) perf evlist: Add --trace-fields option to show trace fields perf record: Store data mmaps for dwarf unwind perf libdw: Check for mmaps also in MAP__VARIABLE tree perf unwind: Check for mmaps also in MAP__VARIABLE tree perf unwind: Use find_map function in access_dso_mem perf evlist: Remove perf_evlist__(enable|disable)_event functions perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does) perf report: Show random usage tip on the help line perf hists: Export a couple of hist functions perf diff: Use perf_hpp__register_sort_field interface perf tools: Add overhead/overhead_children keys defaults via string perf tools: Remove list entry from struct sort_entry perf tools: Include all tools/lib directory for tags/cscope/TAGS targets perf script: Align event name properly perf tools: Add missing headers in perf's MANIFEST perf tools: Do not show trace command if it's not compiled in perf report: Change default to use event group view perf top: Decay periods in callchains tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/ tools lib: Sync tools/lib/find_bit.c with the kernel ...
show more ...
|
#
b0500c16 |
| 11-Jan-2016 |
Wang Nan <wangnan0@huawei.com> |
perf test: Reset err after using it hold errcode in hist testcases
All hists test cases forget to reset err after using it to hold an error code. If error occure in setup_fake_machine() it incorrect
perf test: Reset err after using it hold errcode in hist testcases
All hists test cases forget to reset err after using it to hold an error code. If error occure in setup_fake_machine() it incorrectly return TEST_OK.
This patch fixes it.
Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1452520124-2073-13-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v4.4 |
|
#
3eb9ede2 |
| 09-Jan-2016 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Allo
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Allow using trace events fields as sort order keys, making 'perf evlist --trace_fields' show those, and then the user can select a subset and use like:
perf top -e sched:sched_switch -s prev_comm,next_comm
That works as well in 'perf report' when handling files containing tracepoints.
The default when just tracepoint events are found in a perf.data file is to format it like ftrace, using the libtraceevent formatters, plugins, etc (Namhyung Kim)
- Add support in 'perf script' to process 'perf stat record' generated files, culminating in a python perf script that calculates CPI (Cycles per Instruction) (Jiri Olsa)
- Show random perf tool tips in the 'perf report' bottom line (Namhyung Kim)
- perf report now defaults to --group if the perf.data file has grouped events, try it with:
# perf record -e '{cycles,instructions}' -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ] # perf report # Samples: 1K of event 'anon group { cycles, instructions }' # Event count (approx.): 1955219195 # # Overhead Command Shared Object Symbol
2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle 1.05% 0.33% firefox libxul.so [.] js::SetObjectElement 1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno 0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab 0.65% 0.86% firefox libxul.so [.] js::ValueToId<(js::AllowGC)1> 0.64% 0.23% JS Helper libxul.so [.] js::SplayTree<js::jit::LiveRange*, js::jit::LiveRange>::splay 0.62% 1.27% firefox libxul.so [.] js::GetIterator 0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty 0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining
User visible fixes:
- Coect data mmaps so that the DWARF unwinder can handle usecases needing them, like softice (Jiri Olsa)
- Decay callchains in fractal mode, fixing up cases where 'perf top -g' would show entries with more than 100% (Namhyung Kim)
Infrastructure changes:
- Sync tools/lib with the lib/ in the kernel sources for find_bit.c and move bitmap.[ch] from tools/perf/util/ to tools/lib/ (Arnaldo Carvalho de Melo)
- No need to set attr.sample_freq in some 'perf test' entries that only want to deal with PERF_RECORD_ meta-events, improve a bit error output for CQM test (Arnaldo Carvalho de Melo)
- Fix python binding build, adding some missing object files now required due to cpumap using find_bit stuff (Arnaldo Carvalho de Melo)
- tools/build improvemnts (Jiri Olsa)
- Add more files to cscope/ctags databases (Jiri Olsa)
- Do not show 'trace' in 'perf help' if it is not compiled in (Jiri Olsa)
- Make perf_evlist__open() open evsels with their cpus and threads, like perf record does, making them consistent (Adrian Hunter)
- Fix pmu snapshot initialization bug (Stephane Eranian)
- Add missing headers in perf's MANIFEST (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
40184c46 |
| 22-Dec-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf tools: Pass evlist to setup_sorting()
This is a preparation to support dynamic sort keys for tracepoint events. Dynamic sort keys can be created for specific fields in trace events so it needs
perf tools: Pass evlist to setup_sorting()
This is a preparation to support dynamic sort keys for tracepoint events. Dynamic sort keys can be created for specific fields in trace events so it needs the event information.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-5-git-send-email-namhyung@kernel.org [ Moving the evlist creation earlier in top was split to a previous patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: openbmc-20151217-1, openbmc-20151210-1, openbmc-20151202-1 |
|
#
8c2accc8 |
| 23-Nov-2015 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Allows BPF scriptlets specify arguments to be fetched using DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan)
- Allow attaching BPF scriptlets to module symbols (Wang Nan)
- Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan)
- BPF programs now can specify 'perf probe' tunables via its section name, separating key=val values using semicolons (Wang Nan)
Testing some of these new BPF features:
Use case: get callchains when receiving SSL packets, filter then in the kernel, at arbitrary place.
# cat ssl.bpf.c #define SEC(NAME) __attribute__((section(NAME), used))
struct pt_regs;
SEC("func=__inet_lookup_established hnum") int func(struct pt_regs *ctx, int err, unsigned short port) { return err == 0 && port == 443; }
char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; # # perf record -a -g -e ssl.bpf.c ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ] # perf script | head -30 swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux) 856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux) 2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux) 2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux) 96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux) 969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux) 2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux) 95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux) 1163ffa start_kernel ([kernel.vmlinux].init.text) 11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text) 1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)
qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb 8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux) 896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux) 8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux) 855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) 856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux) 8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux) 430a br_handle_frame_finish ([bridge]) 48bc br_handle_frame ([bridge]) 855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux) 8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux) #
Use 'perf probe' various options to list functions, see what variables can be collected at any given point, experiment first collecting without a filter, then filter, use it together with 'perf trace', 'perf top', with or without callchains, if it explodes, please tell us!
- Introduce a new callchain mode: "folded", that will list per line representations of all callchains for a give histogram entry, facilitating 'perf report' output processing by other tools, such as Brendan Gregg's flamegraph tools (Namhyung Kim)
E.g:
# perf report | grep -v ^# | head 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry | ---cpu_startup_entry | |--12.07%--start_secondary | --6.30%--rest_init start_kernel x86_64_start_reservations x86_64_start_kernel #
Becomes, in "folded" mode:
# perf report -g folded | grep -v ^# | head -5 18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry 12.07% cpu_startup_entry;start_secondary 6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle 11.23% call_cpuidle;cpu_startup_entry;start_secondary 5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter 11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel 15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state #
The user can also select one of "count", "period" or "percent" as the first column.
Infrastructure changes:
- Fix multiple leaks found with Valgrind and a refcount debugger (Masami Hiramatsu)
- Add further 'perf test' entries for BPF and LLVM (Wang Nan)
- Improve 'perf test' to suport subtests, so that the series of tests performed in the LLVM and BPF main tests appear in the default 'perf test' output (Wang Nan)
- Move memdup() from tools/perf to tools/lib/string.c (Arnaldo Carvalho de Melo)
- Adopt strtobool() from the kernel into tools/lib/ (Wang Nan)
- Fix selftests_install tools/ Makefile rule (Kevin Hilman)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: openbmc-20151123-1 |
|
#
721a1f53 |
| 19-Nov-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf tests: Pass the subtest index to each test routine
Some tests have sub-tests we want to run, so allow passing this.
Wang tried to avoid having to touch all tests, but then, having the test.fun
perf tests: Pass the subtest index to each test routine
Some tests have sub-tests we want to run, so allow passing this.
Wang tried to avoid having to touch all tests, but then, having the test.func in an anonymous union makes the build fail on older compilers, like the one in RHEL6, where:
test a = { .func = foo, };
fails.
To fix it leave the func pointer in the main structure and pass the subtest index to all tests, end result function is the same, but we have just one function pointer, not two, with and without the subtest index as an argument.
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5genj0ficwdmelpoqlds0u4y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: openbmc-20151118-1, openbmc-20151104-1, v4.3, openbmc-20151102-1, openbmc-20151028-1, v4.3-rc1 |
|
#
01b944fe |
| 03-Sep-2015 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare first round of input updates for 4.3 merge window.
|
Revision tags: v4.2 |
|
#
8d58b66e |
| 25-Aug-2015 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'v4.2-rc8' into x86/mm, before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
Revision tags: v4.2-rc8, v4.2-rc7, v4.2-rc6, v4.2-rc5 |
|
#
527c465a |
| 27-Jul-2015 |
Takashi Iwai <tiwai@suse.de> |
Merge branch 'for-linus' into for-next
... to make easier developing HDA ext code.
|
Revision tags: v4.2-rc4 |
|
#
43cbf02e |
| 24-Jul-2015 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v4.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v4.2
A lot of small fixes here, a few to the core:
- Fix for binding DA
Merge tag 'asoc-fix-v4.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v4.2
A lot of small fixes here, a few to the core:
- Fix for binding DAPM stream widgets on devices with prefixes assigned to them - Minor fixes for the newly added topology interfaces - Locking and memory leak fixes for DAPM - Driver specific fixes
show more ...
|
#
c57d5621 |
| 20-Jul-2015 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v4.2-rc3' into next
Sync up with Linux 4.2-rc3 to bring in infrastructure (OF) pieces.
|
Revision tags: v4.2-rc3 |
|
#
ca6e4405 |
| 15-Jul-2015 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
Merge tag 'drm-intel-fixes-2015-07-15' into drm-intel-next-queued
Backmerge fixes since it's getting out of hand again with the massive split due to atomic between -next and 4.2-rc. All the bugfixes
Merge tag 'drm-intel-fixes-2015-07-15' into drm-intel-next-queued
Backmerge fixes since it's getting out of hand again with the massive split due to atomic between -next and 4.2-rc. All the bugfixes in 4.2-rc are addressed already (by converting more towards atomic instead of minimal duct-tape) so just always pick the version in next for the conflicts in modeset code.
All the other conflicts are just adjacent lines changed.
Conflicts: drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_gem_gtt.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_drv.h drivers/gpu/drm/i915/intel_ringbuffer.h
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
show more ...
|
Revision tags: v4.2-rc2 |
|
#
06be5eef |
| 07-Jul-2015 |
Russell King <rmk+kernel@arm.linux.org.uk> |
Merge branches 'fixes' and 'ioremap' into for-linus
|
#
83dcf400 |
| 06-Jul-2015 |
Brian Norris <computersforpeace@gmail.com> |
Merge 4.2-rc1 into MTD -next
|
#
ae745302 |
| 06-Jul-2015 |
Tony Lindgren <tony@atomide.com> |
Merge branch 'fixes-rc1' into omap-for-v4.2/fixes
|
#
98006636 |
| 06-Jul-2015 |
Mauro Carvalho Chehab <mchehab@osg.samsung.com> |
Merge tag 'v4.2-rc1' into patchwork
Linux 4.2-rc1
* tag 'v4.2-rc1': (12415 commits) Linux 4.2-rc1 bluetooth: fix list handling 9p: cope with bogus responses from server in p9_client_{read,wri
Merge tag 'v4.2-rc1' into patchwork
Linux 4.2-rc1
* tag 'v4.2-rc1': (12415 commits) Linux 4.2-rc1 bluetooth: fix list handling 9p: cope with bogus responses from server in p9_client_{read,write} p9_client_write(): avoid double p9_free_req() 9p: forgetting to cancel request on interrupted zero-copy RPC dax: bdev_direct_access() may sleep block: Add support for DAX reads/writes to block devices dax: Use copy_from_iter_nocache dax: Add block size note to documentation NTB: Add split BAR output for debugfs stats NTB: Change WARN_ON_ONCE to pr_warn_once on unsafe NTB: Print driver name and version in module init NTB: Increase transport MTU to 64k from 16k NTB: Rename Intel code names to platform names NTB: Default to CPU memcpy for performance NTB: Improve performance with write combining NTB: Use NUMA memory in Intel driver NTB: Use NUMA memory and DMA chan in transport NTB: Rate limit ntb_qp_link_work NTB: Add tool test client ...
show more ...
|
Revision tags: v4.2-rc1 |
|
#
c58267e9 |
| 22-Jun-2015 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Kernel side changes mostly consist of work on x86 PMU drivers:
-
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Kernel side changes mostly consist of work on x86 PMU drivers:
- x86 Intel PT (hardware CPU tracer) improvements (Alexander Shishkin)
- x86 Intel CQM (cache quality monitoring) improvements (Thomas Gleixner)
- x86 Intel PEBSv3 support (Peter Zijlstra)
- x86 Intel PEBS interrupt batching support for lower overhead sampling (Zheng Yan, Kan Liang)
- x86 PMU scheduler fixes and improvements (Peter Zijlstra)
There's too many tooling improvements to list them all - here are a few select highlights:
'perf bench':
- Introduce new 'perf bench futex' benchmark: 'wake-parallel', to measure parallel waker threads generating contention for kernel locks (hb->lock). (Davidlohr Bueso)
'perf top', 'perf report':
- Allow disabling/enabling events dynamicaly in 'perf top': a 'perf top' session can instantly become a 'perf report' one, i.e. going from dynamic analysis to a static one, returning to a dynamic one is possible, to toogle the modes, just press 'f' to 'freeze/unfreeze' the sampling. (Arnaldo Carvalho de Melo)
- Make Ctrl-C stop processing on TUI, allowing interrupting the load of big perf.data files (Namhyung Kim)
'perf probe': (Masami Hiramatsu)
- Support glob wildcards for function name - Support $params special probe argument: Collect all function arguments - Make --line checks validate C-style function name. - Add --no-inlines option to avoid searching inline functions - Greatly speed up 'perf probe --list' by caching debuginfo. - Improve --filter support for 'perf probe', allowing using its arguments on other commands, as --add, --del, etc.
'perf sched':
- Add option in 'perf sched' to merge like comms to lat output (Josef Bacik)
Plus tons of infrastructure work - in particular preparation for upcoming threaded perf report support, but also lots of other work - and fixes and other improvements. See (much) more details in the shortlog and in the git log"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (305 commits) perf tools: Configurable per thread proc map processing time out perf tools: Add time out to force stop proc map processing perf report: Fix sort__sym_cmp to also compare end of symbol perf hists browser: React to unassigned hotkey pressing perf top: Tell the user how to unfreeze events after pressing 'f' perf hists browser: Honour the help line provided by builtin-{top,report}.c perf hists browser: Do not exit when 'f' is pressed in 'report' mode perf top: Replace CTRL+z with 'f' as hotkey for enable/disable events perf annotate: Rename source_line_percent to source_line_samples perf annotate: Display total number of samples with --show-total-period perf tools: Ensure thread-stack is flushed perf top: Allow disabling/enabling events dynamicly perf evlist: Add toggle_enable() method perf trace: Fix race condition at the end of started workloads perf probe: Speed up perf probe --list by caching debuginfo perf probe: Show usage even if the last event is skipped perf tools: Move libtraceevent dynamic list to separated LDFLAGS variable perf tools: Fix a problem when opening old perf.data with different byte order perf tools: Ignore .config-detected in .gitignore perf probe: Fix to return error if no probe is added ...
show more ...
|
Revision tags: v4.1, v4.1-rc8, v4.1-rc7, v4.1-rc6 |
|
#
6632c4b4 |
| 27-May-2015 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- A
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Add option in 'perf sched' to merge like comms to lat output (Josef Bacik)
- Improve 'perf probe' error messages when not finding a suitable vmlinux (Masami Hiramatsu)
Infrastructure changes:
- Use atomic.h for various pre-existing reference counts (Arnaldo Carvalho de Melo)
- Leg work for refcounting 'struct map' (Arnaldo Carvalho de Melo)
- Assign default value for some pointers (Martin Liška)
- Improve setting of gcc debug option (Martin Liška)
- Separate the tests and tools in installation (Nam T. Nguyen)
- Reduce number of arguments of hist_entry_iter__add() (Namhyung Kim)
- DSO data cache fixes (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.1-rc5 |
|
#
063bd936 |
| 19-May-2015 |
Namhyung Kim <namhyung@kernel.org> |
perf hists: Reducing arguments of hist_entry_iter__add()
The evsel and sample arguments are to set iter for later use. As it also receives an iter as another argument, just set them before calling
perf hists: Reducing arguments of hist_entry_iter__add()
The evsel and sample arguments are to set iter for later use. As it also receives an iter as another argument, just set them before calling the function.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1432022650-18205-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v4.1-rc4, v4.1-rc3 |
|
#
32b0ed3a |
| 09-May-2015 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- 'perf probe' improvements: (Masami Hiramatsu)
- Support glob wildcards for function name - Support $params special probe argument: Collect all function arguments - Make --line checks validate C-style function name. - Add --no-inlines option to avoid searching inline functions
- Introduce new 'perf bench futex' benchmark: 'wake-parallel', to measure parallel waker threads generating contention for kernel locks (hb->lock). (Davidlohr Bueso)
Bug fixes:
- Improve 'perf top' to survive much longer on high core count machines, more work needed to refcount more data structures besides 'struct thread' and fix more races. (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Move barrier.h mb/rmb/wmb API from tools/perf/ to kernel like tools/arch/ hierarchy. (Arnaldo Carvalho de Melo)
- Borrow atomic.h from the kernel, initially the x86 implementations with a fallback to gcc intrinsics for the other arches, all the kernel like framework in place for doing arch specific implementations, preferrably cloning what is in the kernel to the greater extent possible. (Arnaldo Carvalho de Melo)
- Protect the 'struct thread' lifetime with a reference counter, and protect data structures that contains its instances with a mutex. (Arnaldo Carvalho de Melo
- Disable libdw DWARF unwind when built with NO_DWARF (Naveen N. Rao)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.1-rc2, v4.1-rc1, v4.0 |
|
#
b91fc39f |
| 06-Apr-2015 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf machine: Protect the machine->threads with a rwlock
In addition to using refcounts for the struct thread lifetime management, we need to protect access to machine->threads from concurrent acces
perf machine: Protect the machine->threads with a rwlock
In addition to using refcounts for the struct thread lifetime management, we need to protect access to machine->threads from concurrent access.
That happens in 'perf top', where a thread processes events, inserting and deleting entries from that rb_tree while another thread decays hist_entries, that end up dropping references and ultimately deleting threads from the rb_tree and releasing its resources when no further hist_entry (or other data structures, like in 'perf sched') references it.
So the rule is the same for refcounts + protected trees in the kernel, get the tree lock, find object, bump the refcount, drop the tree lock, return, use object, drop the refcount if no more use of it is needed, keep it if storing it in some other data structure, drop when releasing that data structure.
I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and "perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".
The addr_location__put() one is because as we return references to several data structures, we may end up adding more reference counting for the other data structures and then we'll drop it at addr_location__put() time.
Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|