Revision tags: v4.10.2 |
|
#
7ffe939d |
| 08-Mar-2017 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge drm-next to get at all the good stuff in drm-misc. We need that because:
- drm_connector_list_iter conversion for i915 needs the core patches.
- Maarten's patches to use the new atomic state iterators also need the core patches.
- We need the new link status property to complete the DP retraining work, merging through 2 branches wasn't a good idea and we had to partially backtrack.
- Chris needs reservation_object_trylock and we want to roll out kref_read everywhere.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
#
9fe64e15 |
| 07-Mar-2017 |
Jonathan Corbet <corbet@lwn.net> |
Merge tag 'v4.11-rc1' into docs-next
Linux 4.11-rc1
|
#
84e5b549 |
| 07-Mar-2017 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo-4.11-20170306' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
E.g.:
# perf report -s symbol_size,symbol
Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
Overhead  Symbol size  Symbol
  14.55%          326  [k] flush_tlb_mm_range
   7.20%         1045  [k] filemap_map_pages
   5.82%          124  [k] vma_interval_tree_insert
   5.18%         2430  [k] unmap_page_range
   2.57%          571  [k] vma_interval_tree_remove
   1.94%          494  [k] page_add_file_rmap
   1.82%          740  [k] page_remove_rmap
   1.66%         1017  [k] release_pages
   1.57%         1636  [k] update_blocked_averages
   1.57%           76  [k] unlock_page
- Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
Change in behaviour:
- Make system wide (-a) the default option if no target was specified and one of the following conditions is met:
- No workload specified (current behaviour)
- A workload is specified but all requested events are system wide ones, like uncore ones. (Jiri Olsa)
Fixes:
- Add missing initialization to the instruction decoder used in the intel PT/BTS code, which was causing lots of failures in 'perf test', looking for a value when there was none (Adrian Hunter)
Infrastructure changes:
- Add arch code needed to adopt the kernel's refcount_t to aid in catching bugs when using atomic_t as a reference counter, basically cmpxchg related functions (Arnaldo Carvalho de Melo)
- Convert the code using atomic_t as reference counts to refcount_t (Elena Rashetova)
- Add feature test for sched_getcpu() to more easily check for its presence in the many libc implementations and across different versions of such C libraries (Arnaldo Carvalho de Melo)
- Issue a HW watchdog disable hint in 'perf stat' for when some of the requested events can't get counted because a PMU counter is taken by that watchdog (Borislav Petkov).
- Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
Documentation changes:
- Clarify the term 'convergence' in:
perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
Kernel code changes:
- Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
- Allow return probes with offsets and absolute addresses (Naveen N. Rao)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
e24bce8f |
| 06-Mar-2017 |
Tony Lindgren <tony@atomide.com> |
Merge tag 'v4.11-rc1' into omap-for-v4.11/fixes
Linux 4.11-rc1
|
#
700ea5e0 |
| 06-Mar-2017 |
Mauro Carvalho Chehab <mchehab@s-opensource.com> |
Merge tag 'v4.11-rc1' into patchwork
Linux 4.11-rc1
* tag 'v4.11-rc1': (10730 commits)
  Linux 4.11-rc1
  strparser: destroy workqueue on module exit
  Documentation/sphinx: fix primary_domain configuration
  docs: Fix htmldocs build failure
  doc/ko_KR/memory-barriers: Update control-dependencies section
  pcieaer doc: update the link
  Documentation: Update path to sysrq.txt
  sfc: fix IPID endianness in TSOv2
  sfc: avoid max() in array size
  rds: remove unnecessary returned value check
  rxrpc: Fix potential NULL-pointer exception
  nfp: correct DMA direction in XDP DMA sync
  nfp: don't tell FW about the reserved buffer space
  net: ethernet: bgmac: mac address change bug
  net: ethernet: bgmac: init sequence bug
  xen-netback: don't vfree() queues under spinlock
  xen-netback: keep a local pointer for vif in backend_disconnect()
  netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() fails
  netfilter: nft_set_rbtree: incorrect assumption on lower interval lookups
  netfilter: nf_conntrack_sip: fix wrong memory initialisation
  ...
|
#
a0f213e1 |
| 02-Mar-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
perf bench futex: Fix build on musl + clang
When building with clang on a musl libc system (Alpine Linux), we end up hitting a problem where memset() is used but its prototype is not present. Add it to avoid this:
bench/futex-wake.c:99:3: error: implicitly declaring library function 'memset' with type 'void *(void *, int, unsigned long)' [-Werror,-Wimplicit-function-declaration]
  CPU_ZERO(&cpu);
  ^
/usr/include/sched.h:127:23: note: expanded from macro 'CPU_ZERO'
#define CPU_ZERO(set) CPU_ZERO_S(sizeof(cpu_set_t),set)
                      ^
/usr/include/sched.h:110:30: note: expanded from macro 'CPU_ZERO_S'
#define CPU_ZERO_S(size,set) memset(set,0,size)
                             ^
bench/futex-wake.c:99:3: note: include the header <string.h> or explicitly provide a declaration for 'memset'
Found while updating my test build containers to build perf with clang in more systems.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jh10vaz2r98zl6gm5iau8prr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
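The failure described in this commit can be illustrated with a small C sketch. It assumes, as the error output above shows for musl, that CPU_ZERO() expands to a memset() call, so the file using the macro has to include <string.h> itself. The helper name single_cpu_mask() is hypothetical and is not taken from the perf sources.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>
#include <string.h>     /* declares memset(), which musl's CPU_ZERO() expands to */

/*
 * Hypothetical helper for illustration: build a CPU mask that contains
 * only `cpu`. Returns 0 on success, -1 if `cpu` is out of range.
 */
static int single_cpu_mask(int cpu, cpu_set_t *set)
{
	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return -1;

	CPU_ZERO(set);          /* may expand to memset(set, 0, size) */
	CPU_SET(cpu, set);
	return 0;
}
```

Other C libraries may implement CPU_ZERO() without memset(), which is one plausible reason a missing include like this goes unnoticed until the code is built against musl with a strict compiler.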
|
#
0871d5a6 |
| 01-Mar-2017 |
Ingo Molnar <mingo@kernel.org> |
Merge branch 'linus' into WIP.x86/boot, to fix up conflicts and to pick up updates
Conflicts: arch/x86/xen/setup.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
Revision tags: v4.10.1 |
|
#
e98bdb30 |
| 25-Feb-2017 |
Mike Marshall <hubcap@omnibond.com> |
Merge tag 'v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into for-next
Linux 4.10
|
#
7f4eb0a6 |
| 20-Feb-2017 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "On the kernel side the main changes in this cycle were:
- Add Intel Kaby Lake CPU support (Srinivas Pandruvada)
- AMD uncore driver updates for fam17 (Janakarajan Natarajan)
- Intel/PT updates and core events optimizations and cleanups (Alexander Shishkin)
- cgroups events fixes (David Carrillo-Cisneros)
- kprobes improvements (Masami Hiramatsu)
- ... plus misc fixes and updates.
On the tooling side the main changes were:
- Support clang build in tools/{perf,lib/{bpf,traceevent,api}} with CC=clang, to, for instance, take advantage of better warnings (Arnaldo Carvalho de Melo):
- Introduce the 'delta-abs' 'perf diff' compute method, that orders the histogram entries by the absolute value of the percentage delta for a function in two perf.data files, i.e. the functions that changed the most (increase or decrease in samples) comes first (Namhyung Kim)
- Add support for parsing Intel uncore vendor event files and add uncore vendor events for the Intel server processors (Haswell, Broadwell, IvyBridge), Xeon Phi (Knights Landing) and Broadwell DE (Andi Kleen)
- Introduce 'perf ftrace' a perf front end to the kernel's ftrace function and function_graph tracer, defaulting to the "function_graph" tracer, more work will be done in reviving this effort, forward porting it from its initial patch submission (Namhyung Kim)
- Add 'e' and 'c' hotkeys to expand/collapse call chains for a single hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)
- Account thread wait time (off CPU time) separately: sleep, iowait and preempt, based on the prev_state of the last event, show the breakdown when using "perf sched timehist --state" (Namhyung Kim)
- Add more triggers to switch the output file (perf.data.TIMESTAMP).
Now, in addition to switching to a different output file when receiving a SIGUSR2, one can also specify file size and time based triggers:
perf record -a --switch-output=signal
is equivalent to what we had before:
perf record -a --switch-output
While we can also ask for the file to be "sliced" by size, taking into account that that will happen only when we get woken up by the kernel, i.e. one has to take into account the --mmap-pages (the size of the perf mmap ring buffer):
perf record -a --switch-output=2G
will break the perf.data output into multiple files limited to 2GB of samples, right when generating the output.
For time based samples, alarm() will be used, so to have 1 minute limited perf.data output files:
perf record -a --switch-output=1m
(Jiri Olsa)
- Improve 'perf trace' (Arnaldo Carvalho de Melo)
- 'perf kallsyms' toy tool to look for extended symbol information on the running kernel and demonstrate the machine/thread/symbol APIs for use in other tools, such as 'perf probe' (Arnaldo Carvalho de Melo)
- ... plus tons of other changes, see the shortlog and Git log for details"
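The --switch-output values described above ("signal", a size such as 2G, a time such as 1m) can be told apart with a small parser. The following is a minimal sketch, not perf's implementation: the type and function names are hypothetical, and the unit letters other than G and m (B, K, M for sizes; s, h, d for times) are assumptions that do not come from the text above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only. */
enum trigger_kind { TRIGGER_SIGNAL, TRIGGER_SIZE, TRIGGER_TIME, TRIGGER_INVALID };

struct trigger {
	enum trigger_kind kind;
	unsigned long long value;   /* bytes for TRIGGER_SIZE, seconds for TRIGGER_TIME */
};

/*
 * Classify a --switch-output argument. An empty or missing argument is
 * treated like "signal", matching the plain --switch-output form.
 * Overflow of very large numbers is not checked in this sketch.
 */
static struct trigger parse_switch_output(const char *spec)
{
	struct trigger t = { TRIGGER_INVALID, 0 };
	unsigned long long n;
	char *end;

	if (!spec || !*spec || strcmp(spec, "signal") == 0) {
		t.kind = TRIGGER_SIGNAL;
		return t;
	}
	if (*spec < '0' || *spec > '9')
		return t;

	n = strtoull(spec, &end, 10);
	if (end[0] == '\0' || end[1] != '\0')
		return t;               /* exactly one unit letter is required */

	switch (end[0]) {
	case 'B': t.kind = TRIGGER_SIZE; t.value = n;         break;
	case 'K': t.kind = TRIGGER_SIZE; t.value = n << 10;   break;
	case 'M': t.kind = TRIGGER_SIZE; t.value = n << 20;   break;
	case 'G': t.kind = TRIGGER_SIZE; t.value = n << 30;   break;
	case 's': t.kind = TRIGGER_TIME; t.value = n;         break;
	case 'm': t.kind = TRIGGER_TIME; t.value = n * 60;    break;
	case 'h': t.kind = TRIGGER_TIME; t.value = n * 3600;  break;
	case 'd': t.kind = TRIGGER_TIME; t.value = n * 86400; break;
	default:  break;                /* unknown unit: stays TRIGGER_INVALID */
	}
	return t;
}
```

With this sketch, "2G" yields a size trigger of 2 << 30 bytes and "1m" yields a time trigger of 60 seconds. As the text notes, a size trigger can only fire when the tool is woken up by the kernel, so the actual file size also depends on --mmap-pages.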
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (131 commits)
  perf tools: Add missing parse_events_error() prototype
  perf pmu: Fix check for unset alias->unit array
  perf tools: Be consistent on the type of map->symbols[] interator
  perf intel pt decoder: clang has no -Wno-override-init
  perf evsel: Do not put a variable sized type not at the end of a struct
  perf probe: Avoid accessing uninitialized 'map' variable
  perf tools: Do not put a variable sized type not at the end of a struct
  perf record: Do not put a variable sized type not at the end of a struct
  perf tests: Synthesize struct instead of using field after variable sized type
  perf bench numa: Make sure dprintf() is not defined
  Revert "perf bench futex: Sanitize numeric parameters"
  tools lib subcmd: Make it an error to pass a signed value to OPTION_UINTEGER
  tools: Set the maximum optimization level according to the compiler being used
  tools: Suppress request for warning options not existent in clang
  samples/bpf: Reset global variables
  samples/bpf: Ignore already processed ELF sections
  samples/bpf: Add missing header
  perf symbols: dso->name is an array, no need to check it against NULL
  perf tests record: No need to test an array against NULL
  perf symbols: No need to check if sym->name is NULL
  ...
|
Revision tags: v4.10 |
|
#
e2a3b0df |
| 19-Feb-2017 |
Mark Brown <broonie@kernel.org> |
Merge remote-tracking branches 'spi/topic/rockchip', 'spi/topic/rspi', 'spi/topic/s3c64xx', 'spi/topic/sh-msiof' and 'spi/topic/slave' into spi-next
|
#
389dcb9d |
| 19-Feb-2017 |
Mark Brown <broonie@kernel.org> |
Merge tag 'asoc-fix-v4.10-rc3' into asoc-linus
ASoC: Fixes for v4.10
As well as the usual smattering of driver specific fixes collected since the merge window this has one particularly important fix to the core for handling of aux_devs which was broken during the merge window by some of the componentization refactoring.
# gpg: Signature made Wed 11 Jan 2017 17:26:37 GMT
# gpg: using RSA key ADE668AA675718B59FE29FEA24D68B725D5487D0
# gpg: issuer "broonie@kernel.org"
# gpg: key 0D9EACE2CD7BEEBC: no public key for trusted key - skipped
# gpg: key 0D9EACE2CD7BEEBC marked as ultimately trusted
# gpg: key CCB0A420AF88CD16: no public key for trusted key - skipped
# gpg: key CCB0A420AF88CD16 marked as ultimately trusted
# gpg: key 162614E316005C11: no public key for trusted key - skipped
# gpg: key 162614E316005C11 marked as ultimately trusted
# gpg: key A730C53A5621E907: no public key for trusted key - skipped
# gpg: key A730C53A5621E907 marked as ultimately trusted
# gpg: key 276568D75C6153AD: no public key for trusted key - skipped
# gpg: key 276568D75C6153AD marked as ultimately trusted
# gpg: Good signature from "Mark Brown <broonie@sirena.org.uk>" [ultimate]
# gpg: aka "Mark Brown <broonie@debian.org>" [ultimate]
# gpg: aka "Mark Brown <broonie@kernel.org>" [ultimate]
# gpg: aka "Mark Brown <broonie@tardis.ed.ac.uk>" [ultimate]
# gpg: aka "Mark Brown <broonie@linaro.org>" [ultimate]
# gpg: aka "Mark Brown <Mark.Brown@linaro.org>" [ultimate]
|
#
0c8967c9 |
| 16-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'perf-core-for-mingo-4.11-20170215' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core clang fixes from Arnaldo Carvalho de Melo:
Changes to make tools/{perf,lib/{bpf,traceevent,api}} build with CC=clang, to, for instance, take advantage of warnings (Arnaldo Carvalho de Melo):
- Conditionally request some warning options not available on clang
- Set the maximum optimization level to -O3 when using CC=clang, leave the previous setting of -O6 otherwise.
- Make it an error to pass a signed value to OPTION_UINTEGER, so that we can remove abs(unsigned int) calls in 'perf bench futex'.
- Make sure dprintf() is not defined before using that name in 'perf bench numa'
- Avoid using field after variable sized type, it's a GNU extension, use equivalent code.
- Fix some bugs where some variables could be used uninitialized, something not caught by gcc.
- Fix some spots where we were testing struct->array[] members against NULL, it will always evaluate to 'true'.
- Add missing parse_events_error() prototype in the bison file.
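The item above about testing struct->array[] members against NULL can be sketched in C. An array member is part of the struct itself, so its address is never NULL and a NULL comparison always evaluates to true; the meaningful check is on the array's contents. The struct sym_sketch type and both helpers are hypothetical names used for illustration and are not perf code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical struct for illustration only. */
struct sym_sketch {
	const char *alias;      /* pointer member: may legitimately be NULL */
	char name[16];          /* array member: its address is never NULL */
};

static bool has_name(const struct sym_sketch *s)
{
	/* `s->name != NULL` would always be true, so test the contents. */
	return s->name[0] != '\0';
}

static bool has_alias(const struct sym_sketch *s)
{
	/* For a pointer member the NULL test is meaningful. */
	return s->alias != NULL && s->alias[0] != '\0';
}
```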
There is still one problem when trying to build the python support, but these are the 'size' outputs for 'make -C tools/perf NO_LIBPYTHON' for gcc and clang builds:
DW_AT_producer: clang version 4.0.0 (http://llvm.org/git/clang.git f5be8ba13adc4ba1011a7ccd60c844bd60427c1c) (ht
$ size ~/bin/perf
   text    data      bss      dec     hex filename
3447514  831320 23901696 28180530 1ae0032 /home/acme/bin/perf
DW_AT_producer: GNU C99 6.3.1 20161221 (Red Hat 6.3.1-1) -mtune=generic -march=x86-64 -ggdb3 -O6 -std=gnu99 +-fno-omit-frame-pointer -funwind-tables -fstack-protector-all
$ size ~/bin/perf
   text    data      bss      dec     hex filename
3671662  836480 23902752 28410894 1b1840e /home/acme/bin/perf
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
16cab322 |
| 14-Feb-2017 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
Revert "perf bench futex: Sanitize numeric parameters"
This reverts commit 60758d6668b3e2fa8e5fd143d24d0425203d007e.
Now that libsubcmd makes sure that OPT_UINTEGER options will not return negative values, we can revert this patch while addressing the problem it solved:
# perf bench futex hash -t -4
# Running 'futex/hash' benchmark:
 Error: switch `t' expects an unsigned numerical value
 Usage: perf bench futex hash <options>

    -t, --threads <n>     Specify amount of threads
# perf bench futex hash -t-4
# Running 'futex/hash' benchmark:
 Error: switch `t' expects an unsigned numerical value
 Usage: perf bench futex hash <options>

    -t, --threads <n>     Specify amount of threads
#
IMO it is more reasonable to flat out refuse to process a negative number than to silently turn it into an absolute value.
This also helps in silencing clang's complaint about asking for an absolute value of an unsigned integer:
bench/futex-hash.c:133:10: error: taking the absolute value of unsigned type 'unsigned int' has no effect [-Werror,-Wabsolute-value]
  nsecs = futexbench_sanitize_numeric(nsecs);
          ^
bench/futex.h:104:42: note: expanded from macro 'futexbench_sanitize_numeric'
#define futexbench_sanitize_numeric(__n) abs((__n))
                                         ^
bench/futex-hash.c:133:10: note: remove the call to 'abs' since unsigned values cannot be negative
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2kl68v22or31vw643m2exz8x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
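The approach this commit argues for, refusing a negative number instead of silently taking its absolute value, can be sketched as follows. This is an illustration only and not libsubcmd's code: parse_uint_option() is a hypothetical helper, and the choice to also reject a leading '+' or whitespace is an assumption made to keep the sketch strict.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse `arg` as an unsigned int.
 * Returns 0 on success, -1 for a negative, malformed or
 * out-of-range value. `*out` is only written on success.
 */
static int parse_uint_option(const char *arg, unsigned int *out)
{
	unsigned long v;
	char *end;

	/* Must start with a digit: rejects "-4", "+4", " 4" and "". */
	if (!arg || *arg < '0' || *arg > '9')
		return -1;

	errno = 0;
	v = strtoul(arg, &end, 10);
	if (errno != 0 || *end != '\0' || v > UINT_MAX)
		return -1;

	*out = (unsigned int)v;
	return 0;
}
```

Rejecting the sign up front matters because strtoul() itself accepts a leading minus sign and returns the negated value converted to unsigned, which would otherwise let "-4" through as a very large number on some platforms.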
|
#
858a0d7e |
| 30-Jan-2017 |
Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
Merge back earlier suspend/hibernation changes for v4.11.
|
#
1b62d134 |
| 30-Jan-2017 |
Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
Merge back earlier ACPICA changes for v4.11.
|
#
0cce2845 |
| 24-Jan-2017 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v4.10-rc5' into next
Sync up with mainline to bring up improvements in various subsystems.
|
#
62ed8ced |
| 24-Jan-2017 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v4.10-rc5' into for-linus
Sync up with mainline to apply fixup to a commit that came through power supply tree.
|
#
dbbc21bb |
| 24-Jan-2017 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v4.10-rc1' into asoc-intel
Linux 4.10-rc1
|
#
9c1852b4 |
| 10-Jan-2017 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v4.10-rc1' into asoc-samsung
Linux 4.10-rc1
|
#
5c47e3cf |
| 09-Jan-2017 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v4.10-rc1' into spi-s3c64xx
Linux 4.10-rc1
|
#
a402eae6 |
| 04-Jan-2017 |
Daniel Vetter <daniel.vetter@ffwll.ch> |
Merge tag 'v4.10-rc2' into drm-intel-next-queued
Backmerge Linux 4.10-rc2 to resync with our -fixes cherry-picks. I've done the backmerge directly because Dave is on vacation.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
#
54ab6db0 |
| 27-Dec-2016 |
Jonathan Corbet <corbet@lwn.net> |
Merge tag 'v4.10-rc1' into docs-next
Linux 4.10-rc1
|
#
bd361f5d |
| 26-Dec-2016 |
Mauro Carvalho Chehab <mchehab@s-opensource.com> |
Merge tag 'v4.10-rc1' into patchwork
Linux 4.10-rc1
* tag 'v4.10-rc1': (11427 commits)
  Linux 4.10-rc1
  powerpc: Fix build warning on 32-bit PPC
  avoid spurious "may be used uninitialized" warning
  mm: add PageWaiters indicating tasks are waiting for a page bit
  mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked
  ktime: Get rid of ktime_equal()
  ktime: Cleanup ktime_set() usage
  ktime: Get rid of the union
  clocksource: Use a plain u64 instead of cycle_t
  irqchip/armada-xp: Consolidate hotplug state space
  irqchip/gic: Consolidate hotplug state space
  coresight/etm3/4x: Consolidate hotplug state space
  cpu/hotplug: Cleanup state names
  cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
  staging/lustre/libcfs: Convert to hotplug state machine
  scsi/bnx2i: Convert to hotplug state machine
  scsi/bnx2fc: Convert to hotplug state machine
  cpu/hotplug: Prevent overwriting of callbacks
  x86/msr: Remove bogus cleanup from the error path
  bus: arm-ccn: Prevent hotplug callback leak
  ...
|
#
f26e8817 |
| 16-Dec-2016 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 4.10 merge window.
|
#
bca13ce4 |
| 12-Dec-2016 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "This update is pretty big and almost exclusively includes tooling changes, because v4.9's LTS status forced to completion most of the pending kernel side hardware enablement work and because we tried to freeze core perf work a bit to give a time window for the fuzzing efforts.
The diff is large mostly due to the JSON hardware event tables added for Intel and Power8 CPUs. This was a popular feature request from people working close to hardware and from the HPC community.
Tree size is big because this added the CPU event tables for over a decade of Intel CPUs. Future changes for a CPU vendor already supported should be much smaller, as events for new models are added. The new events are listed in 'perf list', for the CPU model the tool is running on. If you find an interesting event it can be used as-is:
$ perf stat -a -e l2_lines_out.pf_clean sleep 1
Performance counter stats for 'system wide':
7,860,403 l2_lines_out.pf_clean
1.000624918 seconds time elapsed
The event lists can be searched in the usual 'perf list' fashion for (case insensitive) substrings as well:
$ perf list l2_lines_out
List of pre-defined events (to be used in -e):
cache:
  l2_lines_out.demand_clean  [Clean L2 cache lines evicted by demand]
  l2_lines_out.demand_dirty  [Dirty L2 cache lines evicted by demand]
  l2_lines_out.dirty_all     [Dirty L2 cache lines filling the L2]
  l2_lines_out.pf_clean      [Clean L2 cache lines evicted by L2 prefetch]
  l2_lines_out.pf_dirty      [Dirty L2 cache lines evicted by L2 prefetch]
etc.
There's a few high level categories as well that can be listed: 'cache', 'floating point', 'frontend', 'memory', 'pipeline', 'virtual memory'.
Existing generic events and workflows should work as-is.
The only kernel side change is a late breaking fix for an older regression, related to Intel BTS, LBR and PT feature interaction.
On the tooling side there are three new tools / major features:
- The new 'perf c2c' tool provides means for Shared Data C2C/HITM analysis.
This allows you to track down cacheline contention. The tool is based on x86's load latency and precise store facility events provided by Intel CPUs.
It was tested by Joe Mario and has proven to be useful, finding some cacheline contentions. Joe also wrote a blog about c2c tool with examples:
https://joemario.github.io/blog/2016/09/01/c2c-blog/
excerpt of the content on this site:
At a high level, “perf c2c” will show you:
* The cachelines where false sharing was detected.
* The readers and writers to those cachelines, and the offsets where those accesses occurred.
* The pid, tid, instruction addr, function name, binary object name for those readers and writers.
* The source file and line number for each reader and writer.
* The average load latency for the loads to those cachelines.
* Which numa nodes the samples a cacheline came from and which CPUs were involved.
Using perf c2c is similar to using the Linux perf tool today. First collect data with “perf c2c record”, then generate a report output with “perf c2c report”
There one finds extensive details on using the tool, with tips on reducing the volume of samples while still capturing enough to do its job. (Dick Fowles, Joe Mario, Don Zickus, Jiri Olsa)
- The new 'perf sched timehist' tool provides tailored analysis of scheduling events.
Example usage:
perf sched record -- sleep 1
perf sched timehist
By default it shows the individual schedule events, including the wait time (time between sched-out and next sched-in events for the task), the task scheduling delay (time between wakeup and actually running) and run time for the task:
    time    cpu task name        wait time sch delay run time
                [tid/pid]           (msec)    (msec)   (msec)
-------- ------ ---------------- --------- --------- --------
1.874569 [0011] gcc[31949]           0.014     0.000    1.148
1.874591 [0010] gcc[31951]           0.000     0.000    0.024
1.874603 [0010] migration/10[59]     3.350     0.004    0.011
1.874604 [0011] <idle>               1.148     0.000    0.035
1.874723 [0005] <idle>               0.016     0.000    1.383
1.874746 [0005] gcc[31949]           0.153     0.078    0.022
...
Times are in msec.usec. (David Ahern, Namhyung Kim)
- Add CPU vendor hardware event tables:
Add JSON files with vendor event naming for Intel and Power8 processors, allowing users of tools like oprofile to keep using the event names they are used to, as well as people reading vendor documentation, where such naming is used. (Andi Kleen, Sukadev Bhattiprolu)
You should see all the new events with 'perf list' and you should be able to search them, for example 'perf list miss' will list all the myriads of miss events.
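The three 'perf sched timehist' columns described above (wait time, scheduling delay, run time) follow directly from four timestamps per scheduling period. The sketch below shows that arithmetic under stated assumptions: timestamps are nanosecond counters, a wakeup timestamp of 0 means no wakeup was recorded, and all names are hypothetical rather than taken from the perf sources.

```c
#include <stdint.h>

/* Hypothetical result type: one output row, all values in nanoseconds. */
struct timehist_row {
	uint64_t wait_ns;    /* previous sched-out to this sched-in */
	uint64_t delay_ns;   /* wakeup to this sched-in */
	uint64_t run_ns;     /* this sched-in to this sched-out */
};

/*
 * Derive the three columns for one scheduling period of a task.
 * Assumes prev_sched_out_ns <= sched_in_ns <= sched_out_ns and,
 * when wakeup_ns is non-zero, wakeup_ns <= sched_in_ns.
 */
static struct timehist_row timehist_row_from(uint64_t prev_sched_out_ns,
					     uint64_t wakeup_ns,
					     uint64_t sched_in_ns,
					     uint64_t sched_out_ns)
{
	struct timehist_row row;

	row.wait_ns  = sched_in_ns - prev_sched_out_ns;
	row.delay_ns = wakeup_ns ? sched_in_ns - wakeup_ns : 0;
	row.run_ns   = sched_out_ns - sched_in_ns;
	return row;
}
```

Keeping the values as integer nanoseconds and converting to msec.usec only when printing avoids floating-point rounding in the differences.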
Other tooling features added were:
- Cross-arch annotation support:
o Improve ARM support in the annotation code, affecting 'perf annotate', 'perf report' and live annotation in 'perf top' (Kim Phillips)
o Initial support for PowerPC in the annotation code (Ravi Bangoria)
o Support AArch64 in the 'annotate' code, native/local and cross-arch/remote (Kim Phillips)
- Allow considering just events in a given time interval, via the '--time start.s.ms,end.s.ms' command line, added to 'perf kmem', 'perf report', 'perf sched timehist' and 'perf script' (David Ahern)
- Add option to stop printing a callchain at one of a given group of symbol names (David Ahern)
- Track memory freed in 'perf kmem stat' (David Ahern)
- Allow querying and setting .perfconfig variables (Taeung Song)
- Show branch information in callchains (predicted, TSX aborts, loop iterations, etc) (Jin Yao)
- Dynamically change verbosity level by pressing 'V' in the 'perf top/report' hists TUI browser (Alexis Berlemont)
- Implement 'perf trace --delay' in the same fashion as in 'perf record --delay', to skip sampling workload initialization events (Alexis Berlemont)
- Make vendor named events case insensitive in 'perf list', i.e. 'perf list LONGEST_LAT' works just the same as 'perf list longest_lat' (Andi Kleen)
- Add unwinding support for jitdump (Stefano Sanfilippo)
Tooling infrastructure changes:
- Support linking perf with clang and LLVM libraries, initially statically, but this limitation will be lifted and shared libraries, when available, will be preferred to the static build, that should, as with other features, be enabled explicitly (Wang Nan)
- Add initial support (and perf test entry) for tooling hooks, starting with 'record_start' and 'record_end', that will have as its initial user the eBPF infrastructure, where perf_ prefixed functions will be JITed and run when such hooks are called (Wang Nan)
- Implement assorted libbpf improvements (Wang Nan)
... and lots of other changes, features, cleanups and refactorings I did not list, see the shortlog and the git log for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (220 commits)
  perf/x86: Fix exclusion of BTS and LBR for Goldmont
  perf tools: Explicitly document that --children is enabled by default
  perf sched timehist: Cleanup idle_max_cpu handling
  perf sched timehist: Handle zero sample->tid properly
  perf callchain: Introduce callchain_cursor__copy()
  perf sched: Cleanup option processing
  perf sched timehist: Improve error message when analyzing wrong file
  perf tools: Move perf build related variables under non fixdep leg
  perf tools: Force fixdep compilation at the start of the build
  perf tools: Move PERF-VERSION-FILE target into rules area
  perf build: Check LLVM version in feature check
  perf annotate: Show raw form for jump instruction with indirect target
  perf tools: Add non config targets
  perf tools: Cleanup build directory before each test
  perf tools: Move python/perf.so target into rules area
  perf tools: Move install-gtk target into rules area
  tools build: Move tabs to spaces where suitable
  tools build: Make the .cmd file more readable
  perf clang: Compile BPF script using builtin clang support
  perf clang: Support compile IR to BPF object and add testcase
  ...
|