Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4 |
|
#
6aca56c0 |
| 23-Jun-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: remove check on CPU being online
During sampling event initialization, a check is done if that particular CPU the event is to be installed on is actually online. This check is not nece
s390/cpum_sf: remove check on CPU being online
During sampling event initialization, a check is done if that particular CPU the event is to be installed on is actually online. This check is not necessary, as it is also performed in the system call entry point. Therefore remove this check. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
show more ...
|
#
b2534c28 |
| 23-Jun-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: handle casts consistently
The casts are written in two different notations: (cast) expression and (cast)expression Convert statements with the first notation to the second notati
s390/cpum_sf: handle casts consistently
The casts are written in two different notations: (cast) expression and (cast)expression Convert statements with the first notation to the second notation. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
show more ...
|
Revision tags: v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22 |
|
#
c13166bd |
| 27-Mar-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: remove unnecessary debug statement
Remove debug_sprint_event() statement right after an pr_err() statement. No additional debug information is generated. No functional change.
Signed-
s390/cpum_sf: remove unnecessary debug statement
Remove debug_sprint_event() statement right after an pr_err() statement. No additional debug information is generated. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
show more ...
|
#
b2ae4969 |
| 23-Mar-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: remove parameter in call to pr_err
The op argument is hardcoded in the parameter list of function pr_err. Make the op code part of the text printed by pr_err. No functional change.
Si
s390/cpum_sf: remove parameter in call to pr_err
The op argument is hardcoded in the parameter list of function pr_err. Make the op code part of the text printed by pr_err. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
show more ...
|
#
eeeff534 |
| 23-Mar-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: simplify function setup_pmu_cpu
Print the error message when the FAILURE flag is set. This saves on pr_err statement as the text of the error message is identical in both failures. Als
s390/cpum_sf: simplify function setup_pmu_cpu
Print the error message when the FAILURE flag is set. This saves on pr_err statement as the text of the error message is identical in both failures. Also observe reverse Xmas tree variable declarations in this function. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
show more ...
|
#
cada938a |
| 28-Jun-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390: fix various typos
Fix various typos found with codespell.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
#
b378a982 |
| 22-Jun-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390: include linux/io.h instead of asm/io.h
Include linux/io.h instead of asm/io.h everywhere. linux/io.h includes asm/io.h, so this shouldn't cause any problems. Instead this might help for some r
s390: include linux/io.h instead of asm/io.h
Include linux/io.h instead of asm/io.h everywhere. linux/io.h includes asm/io.h, so this shouldn't cause any problems. Instead this might help for some randconfig build errors which were reported due to some undefined io related functions.
Also move the changed include so it stays grouped together with other includes from the same directory.
For ctcm_mpc.c also remove not needed comments (actually questions).
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
show more ...
|
#
51f513fd |
| 11-Jun-2023 |
Baoquan He <bhe@redhat.com> |
s390/mm: do not include <asm-generic/io.h> directly
We should always include <asm/io.h> in ARCH, but not <asm-generic/io.h> directly. Otherwise, macro defined by ARCH won't be seen and could cause b
s390/mm: do not include <asm-generic/io.h> directly
We should always include <asm/io.h> in ARCH, but not <asm-generic/io.h> directly. Otherwise, macro defined by ARCH won't be seen and could cause building error.
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306100105.8GHnoMCP-lkp@intel.com/ Link: https://lore.kernel.org/all/ZIWrtFMUnRfVP5h0@MiWiFi-R3L-srv/ Signed-off-by: Baoquan He <bhe@redhat.com> [agordeev@linux.ibm.com changed patch description] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
show more ...
|
#
497cc42b |
| 31-May-2023 |
Peter Zijlstra <peterz@infradead.org> |
s390/cpum_sf: Convert to cmpxchg128()
Now that there is a cross arch u128 and cmpxchg128(), use those instead of the custom CDSG helper.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
s390/cpum_sf: Convert to cmpxchg128()
Now that there is a cross arch u128 and cmpxchg128(), use those instead of the custom CDSG helper.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132324.058821078@infradead.org
show more ...
|
Revision tags: v6.1.21, v6.1.20 |
|
#
af90d7b6 |
| 16-Mar-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: remove flag PERF_CPUM_SF_FULL_BLOCKS
This flag is used to process only fully populated sampling buffers when an sampling event is stopped on a CPU. By default the last sampling buffer
s390/cpum_sf: remove flag PERF_CPUM_SF_FULL_BLOCKS
This flag is used to process only fully populated sampling buffers when an sampling event is stopped on a CPU. By default the last sampling buffer is also scanned for samples even if the sampling block full indicator is not set in the trailer entry of a sampling buffer page.
This flag can be set via perf_event_attr::config1 field. It was never used and never documented. It is useless now.
With PERF_CPUM_SF_FULL_BLOCKS: When a process is scheduled off the CPU, the sampling is stopped and the samples are copied to the perf ring buffer and marked invalid. When stopped at the last full sample buffer page (which is achieved with the PERF_CPUM_SF_FULL_BLOCKS options), the hardware sampling will resume at the first free sample entry in the current, partially filled sample buffer.
Without PERF_CPUM_SF_FULL_BLOCKS (default behavior): The partially filled last sample buffer is scanned and valid samples are saved to the perf ring buffer. The valid samples are marked invalid. The sampling is resumed when the process is scheduled on this CPU. Again the hardware sampling will resume at the first free sample entry in the current, partially filled sample buffer.
Now the next interrupt handler invocation scans the full sample block and saves the valid samples to the ring buffer. It omits the invalid samples at the top of the buffer. The default behavior is fully sufficient, therefore remove this feature.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
show more ...
|
Revision tags: v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14 |
|
#
5e02c749 |
| 24-Feb-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/cpum_sf: use READ_ONCE_ALIGNED_128() instead of 128-bit cmpxchg
Use READ_ONCE_ALIGNED_128() to read the previous value in front of a 128-bit cmpxchg loop, instead of (mis-)using a 128-bit cmpxc
s390/cpum_sf: use READ_ONCE_ALIGNED_128() instead of 128-bit cmpxchg
Use READ_ONCE_ALIGNED_128() to read the previous value in front of a 128-bit cmpxchg loop, instead of (mis-)using a 128-bit cmpxchg operation to do the same.
This makes the code more readable and is faster.
Link: https://lore.kernel.org/r/20230224100237.3247871-3-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
Revision tags: v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6 |
|
#
d924ecdb |
| 13-Jan-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: diagnostic sampling buffer setup to handle virtual addresses
The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers to save samples collected by hardware. These buffers
s390/cpum_sf: diagnostic sampling buffer setup to handle virtual addresses
The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers to save samples collected by hardware. These buffers are organized as Sample Data Buffer Tables (SDBT) and Sample Data Buffers (SDB). SDBs contain the samples which are extracted and saved in the perf ring buffer. The SDBTs are chained using real addresses and refer to SDBs using real addresses.
The diagnostic sampling setup uses buffers provided by the process which invokes perf_event_open system call. The buffers are memory mapped. The buffers have been allocated by the kernel event subsystem.
Add proper virtual to phyiscal address translation to the buffer chaining. The current constraint which requires virtual equals real address layout is removed.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
#
78157b47 |
| 13-Jan-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: rework macro AUX_SDB_NUM_xxx
Macro AUX_SDB_NUM() has three parameters. The first one is not used. Remove the first parameter. Also convert the macros to inline functions. No functional
s390/cpum_sf: rework macro AUX_SDB_NUM_xxx
Macro AUX_SDB_NUM() has three parameters. The first one is not used. Remove the first parameter. Also convert the macros to inline functions. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
#
1f8e5072 |
| 16-Jan-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: sampling buffer setup to handle virtual addresses
The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers to save samples collected by hardware. These buffers are organi
s390/cpum_sf: sampling buffer setup to handle virtual addresses
The CPU Measurement Sampling Facility (CPUM_SF) installs large buffers to save samples collected by hardware. These buffers are organized as Sample Data Buffer Tables (SDBT) and Sample Data Buffers (SDB). SDBs contain the samples which are extracted and saved in the perf ring buffer. The SDBTs are chained using real addresses and refer to SDBs using real addresses.
Adds proper virtual to phyiscal address translation to the buffer chaining. The current constraint which requires virtual equals real address layout is removed.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
Revision tags: v6.1.5, v6.0.19, v6.0.18, v6.1.4 |
|
#
4012fc20 |
| 05-Jan-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: remove debug statements from function setup_pmc_cpu
Remove debug statements from function setup_pmc_cpu(). The debug statement displays a pointer value to a per cpu variable. This poin
s390/cpum_sf: remove debug statements from function setup_pmc_cpu
Remove debug statements from function setup_pmc_cpu(). The debug statement displays a pointer value to a per cpu variable. This pointer value is printed nowhere else, so it has no use for cross reference. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
#
a64e45c2 |
| 12-Jan-2023 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: move functions from header file to source file
Some inline helper functions are defined in a header file but used in only one source file. Move these functions to the source file. No f
s390/cpum_sf: move functions from header file to source file
Some inline helper functions are defined in a header file but used in only one source file. Move these functions to the source file. No functional change.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
#
f6e70715 |
| 18-Jan-2023 |
Namhyung Kim <namhyung@kernel.org> |
perf/core: Introduce perf_prepare_header()
Factor out perf_prepare_header() so that it can call perf_prepare_sample() without a header if not needed.
Also it checks the filtered_sample_type to avoi
perf/core: Introduce perf_prepare_header()
Factor out perf_prepare_header() so that it can call perf_prepare_sample() without a header if not needed.
Also it checks the filtered_sample_type to avoid duplicate work when perf_prepare_sample() is called twice (or more).
Suggested-by: Peter Zijlstr <peterz@infradead.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230118060559.615653-8-namhyung@kernel.org
show more ...
|
#
82d3edb5 |
| 05-Jan-2023 |
Heiko Carstens <hca@linux.ibm.com> |
s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops
The current cmpxchg_double() loops within the perf hw sampling code do not have READ_ONCE() semantics to read the old value from mem
s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops
The current cmpxchg_double() loops within the perf hw sampling code do not have READ_ONCE() semantics to read the old value from memory. This allows the compiler to generate code which reads the "old" value several times from memory, which again allows for inconsistencies.
For example:
/* Reset trailer (using compare-double-and-swap) */ do { te_flags = te->flags & ~SDB_TE_BUFFER_FULL_MASK; te_flags |= SDB_TE_ALERT_REQ_MASK; } while (!cmpxchg_double(&te->flags, &te->overflow, te->flags, te->overflow, te_flags, 0ULL));
The compiler could generate code where te->flags used within the cmpxchg_double() call may be refetched from memory and which is not necessarily identical to the previous read version which was used to generate te_flags. Which in turn means that an incorrect update could happen.
Fix this by adding READ_ONCE() semantics to all cmpxchg_double() loops. Given that READ_ONCE() cannot generate code on s390 which atomically reads 16 bytes, use a private compare-and-swap-double implementation to achieve that.
Also replace cmpxchg_double() with the private implementation to be able to re-use the old value within the loops.
As a side effect this converts the whole code to only use bit fields to read and modify bits within the hws trailer header.
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.ibm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/linux-s390/Y71QJBhNTIatvxUT@osiris/T/#ma14e2a5f7aa8ed4b94b6f9576799b3ad9c60f333 Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
Revision tags: v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15 |
|
#
745f5d20 |
| 13-Jan-2022 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpumf: Support for CPU Measurement Sampling Facility LS bit
Adds support for the CPU Measurement Sampling Facility limit sampling bit in the sampling device driver. Limited samples have no valu
s390/cpumf: Support for CPU Measurement Sampling Facility LS bit
Adds support for the CPU Measurement Sampling Facility limit sampling bit in the sampling device driver. Limited samples have no valueable information are not collected.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
Revision tags: v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15 |
|
#
f8d8977a |
| 08-Feb-2021 |
Heiko Carstens <hca@linux.ibm.com> |
s390/time: convert tod_clock_base to union
Convert tod_clock_base to union tod_clock. This simplifies quite a bit of code and also fixes a bug in read_persistent_clock64();
void read_persistent_clo
s390/time: convert tod_clock_base to union
Convert tod_clock_base to union tod_clock. This simplifies quite a bit of code and also fixes a bug in read_persistent_clock64();
void read_persistent_clock64(struct timespec64 *ts) { __u64 delta;
delta = initial_leap_seconds + TOD_UNIX_EPOCH; get_tod_clock_ext(clk); *(__u64 *) &clk[1] -= delta; if (*(__u64 *) &clk[1] > delta) clk[0]--; ext_to_timespec64(clk, ts); }
Assume &clk[1] == 3 and delta == 2; then after the substraction the if condition becomes true and the epoch part of the clock is decremented by one because of an assumed overflow, even though there is none.
Fix this by using 128 bit arithmetics and let the compiler do the right thing:
void read_persistent_clock64(struct timespec64 *ts) { union tod_clock clk; u64 delta;
delta = initial_leap_seconds + TOD_UNIX_EPOCH; store_tod_clock_ext(&clk); clk.eitod -= delta; ext_to_timespec64(&clk, ts); }
Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
show more ...
|
Revision tags: v5.10.14, v5.10 |
|
#
78d732e1 |
| 11-Nov-2020 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf.c: fix file permission for cpum_sfb_size
This file is installed by the s390 CPU Measurement sampling facility device driver to export supported minimum and maximum sample buffer sizes.
s390/cpum_sf.c: fix file permission for cpum_sfb_size
This file is installed by the s390 CPU Measurement sampling facility device driver to export supported minimum and maximum sample buffer sizes. This file is read by lscpumf tool to display the details of the device driver capabilities. The lscpumf tool might be invoked by a non-root user. In this case it does not print anything because the file contents can not be read.
Fix this by allowing read access for all users. Reading the file contents is ok, changing the file contents is left to the root user only.
For further reference and details see: [1] https://github.com/ibm-s390-tools/s390-tools/issues/97
Fixes: 69f239ed335a ("s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur") Cc: <stable@vger.kernel.org> # 3.14 Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
show more ...
|
#
267fb273 |
| 30-Oct-2020 |
Peter Zijlstra <peterz@infradead.org> |
perf: Reduce stack usage of perf_output_begin()
__perf_output_begin() has an on-stack struct perf_sample_data in the unlikely case it needs to generate a LOST record. However, every call to perf_out
perf: Reduce stack usage of perf_output_begin()
__perf_output_begin() has an on-stack struct perf_sample_data in the unlikely case it needs to generate a LOST record. However, every call to perf_output_begin() must already have a perf_sample_data on-stack.
Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20201030151954.985416146@infradead.org
show more ...
|
Revision tags: v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7 |
|
#
5aa98879 |
| 26-Jun-2020 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: prohibit callchain data collection
CPU Measurement sampling facility on s390 does not support perf tool collection of callchain data using --call-graph option. The sampling facility co
s390/cpum_sf: prohibit callchain data collection
CPU Measurement sampling facility on s390 does not support perf tool collection of callchain data using --call-graph option. The sampling facility collects samples in a ring buffer which includes only the instruction address the samples were taken. When the ring buffer hits a watermark, a measurement alert interrupt is triggered and handled by the performance measurement unit (PMU) device driver. It collects the samples and feeds each sample to the perf ring buffer in the common code via functions perf_prepare_sample()/perf_output_sample(). When function perf_prepare_sample() is called to collect sample data's callchain, user register values or stack area, invalid data is picked, because the context of the collected information does not match the context when the sample was taken.
There is currently no way to provide the callchain and other information, because the hardware sampler does not collect this information.
Therefore prohibit sampling when the user requests a callchain graph from the hardware sampler. Return -EOPNOTSUPP to the user in this case. If call chains are really wanted, users need to specify software event cpu-clock to get the callchain information from a software event.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
show more ...
|
Revision tags: v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28 |
|
#
4141b6a5 |
| 23-Mar-2020 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: Fix wrong page count in error message
When perf record -e SF_CYCLES_BASIC_DIAG runs with very high frequency, the samples arrive faster than the perf process can save them to file. Eve
s390/cpum_sf: Fix wrong page count in error message
When perf record -e SF_CYCLES_BASIC_DIAG runs with very high frequency, the samples arrive faster than the perf process can save them to file. Eventually, for longer running processes, this leads to the siutation where the trace buffers allocated by perf slowly fills up. At one point the auxiliary trace buffer is full and the CPU Measurement sampling facility is turned off. Furthermore a warning is printed to the kernel log buffer:
cpum_sf: The AUX buffer with 0 pages for the diagnostic-sampling mode is full
The number of allocated pages for the auxiliary trace buffer is shown as zero pages. That is wrong.
Fix this by saving the number of allocated pages before entering the work loop in the interrupt handler. When the interrupt handler processes the samples, it may detect the buffer full condition and stop sampling, reducing the buffer size to zero. Print the correct value in the error message:
cpum_sf: The AUX buffer with 256 pages for the diagnostic-sampling mode is full
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
show more ...
|
Revision tags: v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2 |
|
#
0d6f1693 |
| 04-Dec-2019 |
Thomas Richter <tmricht@linux.ibm.com> |
s390/cpum_sf: Rework sampling buffer allocation
Adjust sampling buffer allocation depending on frequency and correct comments. Investigation on the interrupt handler revealed that almost always one
s390/cpum_sf: Rework sampling buffer allocation
Adjust sampling buffer allocation depending on frequency and correct comments. Investigation on the interrupt handler revealed that almost always one interupt services one SDB, even when running with the maximum frequency of 100000. Very rarely there have been 2 SBD serviced per interrupt.
Therefore reduce the number of SBD per CPU. Each SDB is one page in size. The new formula results in freq:4000 n_sdb:32 new:16 freq:10000 n_sdb:80 new:16 freq:20000 n_sdb:159 new:17 freq:40000 n_sdb:318 new:19 freq:50000 n_sdb:397 new:20 freq:62500 n_sdb:497 new:22 freq:83333 n_sdb:662 new:24 freq:100000 n_sdb:794 new:25
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
show more ...
|