#
12600999 |
| 22-Apr-2015 |
Ingo Molnar <mingo@kernel.org> |
x86/fpu: Improve the comment for the fpu::counter field
This was pretty hard to read, improve it.
Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hans
x86/fpu: Improve the comment for the fpu::counter field
This was pretty hard to read, improve it.
Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
c0c2803d |
| 22-Apr-2015 |
Ingo Molnar <mingo@kernel.org> |
x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter
This field is kept separate from the main FPU state structure for no good reason.
Reviewed-by: Borislav Petkov <bp@alien8.de> Cc
x86/fpu: Move thread_info::fpu_counter into thread_info::fpu.counter
This field is kept separate from the main FPU state structure for no good reason.
Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.0, v4.0-rc7 |
|
#
1a7dc0db |
| 03-Apr-2015 |
Ingo Molnar <mingo@kernel.org> |
x86/fpu: Rename fpu_detect() to fpu__detect()
Use the fpu__*() namespace to organize FPU ops better.
Also document fpu__detect() a bit.
Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomir
x86/fpu: Rename fpu_detect() to fpu__detect()
Use the fpu__*() namespace to organize FPU ops better.
Also document fpu__detect() a bit.
Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
cf9328cc |
| 02-Apr-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry/32: Stop caching MSR_IA32_SYSENTER_ESP in tss.sp1
We write a stack pointer to MSR_IA32_SYSENTER_ESP exactly once, and we unnecessarily cache the value in tss.sp1. We never read the ca
x86/asm/entry/32: Stop caching MSR_IA32_SYSENTER_ESP in tss.sp1
We write a stack pointer to MSR_IA32_SYSENTER_ESP exactly once, and we unnecessarily cache the value in tss.sp1. We never read the cached value.
Remove all of the caching. It serves no purpose.
Suggested-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/05a0163eb33ef5208363f0015496855da7cebadd.1428002830.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.0-rc6, v4.0-rc5 |
|
#
ac9af498 |
| 17-Mar-2015 |
Ingo Molnar <mingo@kernel.org> |
x86/asm/entry/64: Remove thread_struct::usersp
Nothing uses thread_struct::usersp anymore, so remove it.
Originally-from: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Borislav Petkov <bp@alien8.
x86/asm/entry/64: Remove thread_struct::usersp
Nothing uses thread_struct::usersp anymore, so remove it.
Originally-from: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Borislav Petkov <bp@alien8.de> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
9854dd74 |
| 17-Mar-2015 |
Ingo Molnar <mingo@kernel.org> |
x86/asm/entry/64: Simplify 'old_rsp' usage
Remove all manipulations of PER_CPU(old_rsp) in C code:
- it is not used on SYSRET return anymore, and system entries are atomic, so updating it from
x86/asm/entry/64: Simplify 'old_rsp' usage
Remove all manipulations of PER_CPU(old_rsp) in C code:
- it is not used on SYSRET return anymore, and system entries are atomic, so updating it from the fork and context switch paths is pointless.
- Tweak a few related comments as well: we no longer have a "partial stack frame" on entry, ever.
Based on (split out of) patch from Denys Vlasenko.
Originally-from: Denys Vlasenko <dvlasenk@redhat.com> Tested-by: Borislav Petkov <bp@alien8.de> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426599779-8010-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.0-rc4 |
|
#
d828c71f |
| 09-Mar-2015 |
Denys Vlasenko <dvlasenk@redhat.com> |
x86/asm/entry/32: Document the 32-bit SYSENTER "emergency stack" better
Before the patch, the 'tss_struct::stack' field was not referenced anywhere.
It was used only to set SYSENTER's stack to poin
x86/asm/entry/32: Document the 32-bit SYSENTER "emergency stack" better
Before the patch, the 'tss_struct::stack' field was not referenced anywhere.
It was used only to set SYSENTER's stack to point after the last byte of tss_struct, thus the trailing field, stack[64], was used.
But grep would not know it. You can comment it out, compile, and kernel will even run until an unlucky NMI corrupts io_bitmap[] (which is also not easily detectable).
This patch changes code so that the purpose and usage of this field is not mysterious anymore, and can be easily grepped for.
This does change generated code, for a subtle reason: since tss_struct is ____cacheline_aligned, there happens to be 5 longs of padding at the end. Old code was using the padding too; new code will strictly use it only for SYSENTER_stack[].
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1425912738-559-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
5c39403e |
| 13-Mar-2015 |
Denys Vlasenko <dvlasenk@redhat.com> |
x86/asm/entry: Simplify task_pt_regs() macro definition
Before this change, task_pt_regs() was using KSTK_TOP(), and it was the only use of that macro. In turn, KSTK_TOP used THREAD_SIZE_LONGS, and
x86/asm/entry: Simplify task_pt_regs() macro definition
Before this change, task_pt_regs() was using KSTK_TOP(), and it was the only use of that macro. In turn, KSTK_TOP used THREAD_SIZE_LONGS, and it was the only use of that macro too.
Fold these macros into task_pt_regs(). Tweak comment about "- 8" - we now use a symbolic constant, not literal 8.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426255743-5394-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
76e4c490 |
| 10-Mar-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry/32: Document our abuse of x86_hw_tss::ss1 and x86_hw_tss::sp1
This has confused me for a while. Now that I figured it out, document it.
Signed-off-by: Andy Lutomirski <luto@amacapita
x86/asm/entry/32: Document our abuse of x86_hw_tss::ss1 and x86_hw_tss::sp1
This has confused me for a while. Now that I figured it out, document it.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/b7efc1b7364039824776f68e9ddee9ec1500e894.1426009661.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
d9e05cc5 |
| 10-Mar-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry: Unify and fix initial thread_struct::sp0 values
x86_32 and x86_64 need slightly different thread_struct::sp0 values, and x86_32's was incorrect for init.
This never mattered -- the i
x86/asm/entry: Unify and fix initial thread_struct::sp0 values
x86_32 and x86_64 need slightly different thread_struct::sp0 values, and x86_32's was incorrect for init.
This never mattered -- the init thread never runs user code, so we never used thread_struct::sp0 for anything.
Fix it and mostly unify them.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1b810c1d2e797e27bb4a7708c426101161edd1f6.1426009661.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
3ee4298f |
| 10-Mar-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry: Create and use a 'TOP_OF_KERNEL_STACK_PADDING' macro
x86_32, unlike x86_64, pads the top of the kernel stack, because the hardware stack frame formats are variable in size.
Document
x86/asm/entry: Create and use a 'TOP_OF_KERNEL_STACK_PADDING' macro
x86_32, unlike x86_64, pads the top of the kernel stack, because the hardware stack frame formats are variable in size.
Document this padding and give it a name.
This should make no change whatsoever to the compiled kernel image. It also doesn't fix any of the current bugs in this area.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net [ Fixed small details, such as a missed magic constant in entry_32.S pointed out by Denys Vlasenko. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.0-rc3 |
|
#
a7fcf28d |
| 06-Mar-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
I broke 32-bit kernels. The implementation of sp0 was correct as far as I can tell, but sp0 was much weirder o
x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
I broke 32-bit kernels. The implementation of sp0 was correct as far as I can tell, but sp0 was much weirder on x86_32 than I realized. It has the following issues:
- Init's sp0 is inconsistent with everything else's: non-init tasks are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
- vm86 does crazy things to sp0.
Fix it up by replacing this_cpu_sp0() with current_top_of_stack() and using a new percpu variable to track the top of the stack on x86_32.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 75182b1632a8 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()") Link: http://lkml.kernel.org/r/d09dbe270883433776e0cbee3c7079433349e96d.1425692936.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
d0a0de21 |
| 05-Mar-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry: Remove INIT_TSS and fold the definitions into 'cpu_tss'
The INIT_TSS is unnecessary. Just define the initial TSS where 'cpu_tss' is defined.
While we're at it, merge the 32-bit and
x86/asm/entry: Remove INIT_TSS and fold the definitions into 'cpu_tss'
The INIT_TSS is unnecessary. Just define the initial TSS where 'cpu_tss' is defined.
While we're at it, merge the 32-bit and 64-bit definitions. The only syntactic change is that 32-bit kernels were computing sp0 as long, but now they compute it as unsigned long.
Verified by objdump: the contents and relocations of .data..percpu..shared_aligned are unchanged on 32-bit and 64-bit kernels.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8fc39fa3f6c5d635e93afbdd1a0fe0678a6d7913.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
24933b82 |
| 05-Mar-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry: Rename 'init_tss' to 'cpu_tss'
It has nothing to do with init -- there's only one TSS per cpu.
Other names considered include:
- current_tss: Confusing because we never switch the
x86/asm/entry: Rename 'init_tss' to 'cpu_tss'
It has nothing to do with init -- there's only one TSS per cpu.
Other names considered include:
- current_tss: Confusing because we never switch the tss. - singleton_tss: Too long.
This patch was generated with 's/init_tss/cpu_tss/g'. Followup patches will fix INIT_TSS and INIT_TSS_IST by hand.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/da29fb2a793e4f649d93ce2d1ed320ebe8516262.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
8ef46a67 |
| 05-Mar-2015 |
Andy Lutomirski <luto@amacapital.net> |
x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu
We currently store references to the top of the kernel stack in multiple places: kernel_stack (with an offset) and init_tss.x86_tss.
x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu
We currently store references to the top of the kernel stack in multiple places: kernel_stack (with an offset) and init_tss.x86_tss.sp0 (no offset). The latter is defined by hardware and is a clean canonical way to find the top of the stack. Add an accessor so we can start using it.
This needs minor paravirt tweaks. On native, sp0 defines the top of the kernel stack and is therefore always correct. On Xen and lguest, the hypervisor tracks the top of the stack, but we want to start reading sp0 in the kernel. Fixing this is simple: just update our local copy of sp0 as well as the hypervisor's copy on task switches.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8d675581859712bee09a055ed8f785d80dac1eca.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.0-rc2, v4.0-rc1, v3.19, v3.19-rc7, v3.19-rc6 |
|
#
cbc82b17 |
| 23-Jan-2015 |
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> |
x86: Add support for Intel Cache QoS Monitoring (CQM) detection
This patch adds support for the new Cache QoS Monitoring (CQM) feature found in future Intel Xeon processors. It includes the new val
x86: Add support for Intel Cache QoS Monitoring (CQM) detection
This patch adds support for the new Cache QoS Monitoring (CQM) feature found in future Intel Xeon processors. It includes the new values to track CQM resources to the cpuinfo_x86 structure, plus the CPUID detection routines for CQM.
CQM allows a process, or set of processes, to be tracked by the CPU to determine the cache usage of that task group. Using this data from the CPU, software can be written to extract this data and report cache usage and occupancy for a particular process, or group of processes.
More information about Cache QoS Monitoring can be found in the Intel (R) x86 Architecture Software Developer Manual, section 17.14.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Chris Webb <chris@arachsys.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kanaka Juvva <kanaka.d.juvva@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Honeyman <stevenhoneyman@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Link: http://lkml.kernel.org/r/1422038748-21397-5-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
a930dc45 |
| 18-Jan-2015 |
Borislav Petkov <bp@suse.de> |
x86/asm: Cleanup prefetch primitives
This is based on a patch originally by hpa.
With the current improvements to the alternatives, we can simply use %P1 as a mem8 operand constraint and rely on th
x86/asm: Cleanup prefetch primitives
This is based on a patch originally by hpa.
With the current improvements to the alternatives, we can simply use %P1 as a mem8 operand constraint and rely on the toolchain to generate the proper instruction sizes. For example, on 32-bit, where we use an empty old instruction we get:
apply_alternatives: feat: 6*32+8, old: (c104648b, len: 4), repl: (c195566c, len: 4) c104648b: alt_insn: 90 90 90 90 c195566c: rpl_insn: 0f 0d 4b 5c
...
apply_alternatives: feat: 6*32+8, old: (c18e09b4, len: 3), repl: (c1955948, len: 3) c18e09b4: alt_insn: 90 90 90 c1955948: rpl_insn: 0f 0d 08
...
apply_alternatives: feat: 6*32+8, old: (c1190cf9, len: 7), repl: (c1955a79, len: 7) c1190cf9: alt_insn: 90 90 90 90 90 90 90 c1955a79: rpl_insn: 0f 0d 0d a0 d4 85 c1
all with the proper padding done depending on the size of the replacement instruction the compiler generates.
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@linux.intel.com>
show more ...
|
Revision tags: v3.19-rc5, v3.19-rc4, v3.19-rc3, v3.19-rc2, v3.19-rc1, v3.18, v3.18-rc7, v3.18-rc6, v3.18-rc5, v3.18-rc4, v3.18-rc3, v3.18-rc2 |
|
#
375074cc |
| 24-Oct-2014 |
Andy Lutomirski <luto@amacapital.net> |
x86: Clean up cr4 manipulation
CR4 manipulation was split, seemingly at random, between direct (write_cr4) and using a helper (set/clear_in_cr4). Unfortunately, the set_in_cr4 and clear_in_cr4 help
x86: Clean up cr4 manipulation
CR4 manipulation was split, seemingly at random, between direct (write_cr4) and using a helper (set/clear_in_cr4). Unfortunately, the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code, which only a small subset of users actually wanted.
This patch replaces all cr4 access in functions that don't leave cr4 exactly the way they found it with new helpers cr4_set_bits, cr4_clear_bits, and cr4_set_bits_and_update_boot.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Vince Weaver <vince@deater.net> Cc: "hillf.zj" <hillf.zj@alibaba-inc.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
fe3d197f |
| 14-Nov-2014 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86, mpx: On-demand kernel allocation of bounds tables
This is really the meat of the MPX patch set. If there is one patch to review in the entire series, this is the one. There is a new ABI here
x86, mpx: On-demand kernel allocation of bounds tables
This is really the meat of the MPX patch set. If there is one patch to review in the entire series, this is the one. There is a new ABI here and this kernel code also interacts with userspace memory in a relatively unusual manner. (small FAQ below).
Long Description:
This patch adds two prctl() commands to provide enable or disable the management of bounds tables in kernel, including on-demand kernel allocation (See the patch "on-demand kernel allocation of bounds tables") and cleanup (See the patch "cleanup unused bound tables"). Applications do not strictly need the kernel to manage bounds tables and we expect some applications to use MPX without taking advantage of this kernel support. This means the kernel can not simply infer whether an application needs bounds table management from the MPX registers. The prctl() is an explicit signal from userspace.
PR_MPX_ENABLE_MANAGEMENT is meant to be a signal from userspace to require kernel's help in managing bounds tables.
PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace don't want kernel's help any more. With PR_MPX_DISABLE_MANAGEMENT, the kernel won't allocate and free bounds tables even if the CPU supports MPX.
PR_MPX_ENABLE_MANAGEMENT will fetch the base address of the bounds directory out of a userspace register (bndcfgu) and then cache it into a new field (->bd_addr) in the 'mm_struct'. PR_MPX_DISABLE_MANAGEMENT will set "bd_addr" to an invalid address. Using this scheme, we can use "bd_addr" to determine whether the management of bounds tables in kernel is enabled.
Also, the only way to access that bndcfgu register is via an xsaves, which can be expensive. Caching "bd_addr" like this also helps reduce the cost of those xsaves when doing table cleanup at munmap() time. Unfortunately, we can not apply this optimization to #BR fault time because we need an xsave to get the value of BNDSTATUS.
==== Why does the hardware even have these Bounds Tables? ====
MPX only has 4 hardware registers for storing bounds information. If MPX-enabled code needs more than these 4 registers, it needs to spill them somewhere. It has two special instructions for this which allow the bounds to be moved between the bounds registers and some new "bounds tables".
They are similar conceptually to a page fault and will be raised by the MPX hardware during both bounds violations or when the tables are not present. This patch handles those #BR exceptions for not-present tables by carving the space out of the normal processes address space (essentially calling the new mmap() interface indroduced earlier in this patch set.) and then pointing the bounds-directory over to it.
The tables *need* to be accessed and controlled by userspace because the instructions for moving bounds in and out of them are extremely frequent. They potentially happen every time a register pointing to memory is dereferenced. Any direct kernel involvement (like a syscall) to access the tables would obviously destroy performance.
==== Why not do this in userspace? ====
This patch is obviously doing this allocation in the kernel. However, MPX does not strictly *require* anything in the kernel. It can theoretically be done completely from userspace. Here are a few ways this *could* be done. I don't think any of them are practical in the real-world, but here they are.
Q: Can virtual space simply be reserved for the bounds tables so that we never have to allocate them? A: As noted earlier, these tables are *HUGE*. An X-GB virtual area needs 4*X GB of virtual space, plus 2GB for the bounds directory. If we were to preallocate them for the 128TB of user virtual address space, we would need to reserve 512TB+2GB, which is larger than the entire virtual address space today. This means they can not be reserved ahead of time. Also, a single process's pre-popualated bounds directory consumes 2GB of virtual *AND* physical memory. IOW, it's completely infeasible to prepopulate bounds directories.
Q: Can we preallocate bounds table space at the same time memory is allocated which might contain pointers that might eventually need bounds tables? A: This would work if we could hook the site of each and every memory allocation syscall. This can be done for small, constrained applications. But, it isn't practical at a larger scale since a given app has no way of controlling how all the parts of the app might allocate memory (think libraries). The kernel is really the only place to intercept these calls.
Q: Could a bounds fault be handed to userspace and the tables allocated there in a signal handler instead of in the kernel? A: (thanks to tglx) mmap() is not on the list of safe async handler functions and even if mmap() would work it still requires locking or nasty tricks to keep track of the allocation state there.
Having ruled out all of the userspace-only approaches for managing bounds tables that we could think of, we create them on demand in the kernel.
Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-mm@kvack.org Cc: linux-mips@linux-mips.org Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20141114151829.AD4310DE@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
62e7759b |
| 14-Nov-2014 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86, mpx: Rename cfg_reg_u and status_reg
According to Intel SDM extension, MPX configuration and status registers should be BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u and status_reg to bnd
x86, mpx: Rename cfg_reg_u and status_reg
According to Intel SDM extension, MPX configuration and status registers should be BNDCFGU and BNDSTATUS. This patch renames cfg_reg_u and status_reg to bndcfgu and bndstatus.
[ tglx: Renamed 'struct bndscr_struct' to 'struct bndscr' ]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-mm@kvack.org Cc: linux-mips@linux-mips.org Cc: Dave Hansen <dave@sr71.net> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Link: http://lkml.kernel.org/r/20141114151817.031762AC@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
c04e051c |
| 31-Oct-2014 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86: mpx: Give bndX registers actual names
Consider the bndX MPX registers. There 4 registers each containing a 64-bit lower and a 64-bit upper bound. That's 8*64 bits and we declare it thusly:
x86: mpx: Give bndX registers actual names
Consider the bndX MPX registers. There 4 registers each containing a 64-bit lower and a 64-bit upper bound. That's 8*64 bits and we declare it thusly:
struct bndregs_struct { u64 bndregs[8]; } Let's say you want to read the upper bound from the MPX register bnd2 out of the xsave buf. You do:
bndregno = 2; upper_bound = xsave_buf->bndregs.bndregs[2*bndregno+1];
That kinda sucks. Every time you access it, you need to know: 1. Each bndX register is two entries wide in "bndregs" 2. The lower comes first followed by upper. We do the +1 to get upper vs. lower.
This replaces the old definition. You can now access them indexed by the register number directly, and with a meaningful name for the lower and upper bound:
bndregno = 2; xsave_buf->bndreg[bndregno].upper_bound;
It's now *VERY* clear that there are 4 registers. The programmer now doesn't have to care what order the lower and upper bounds are in, and it's harder to get it wrong.
[ tglx: Changed ub/lb to upper_bound/lower_bound and renamed struct bndreg_struct to struct bndreg ]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@linux.intel.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: "Yu, Fenghua" <fenghua.yu@intel.com> Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20141031215820.5EA5E0EC@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
07114f0f |
| 04-Nov-2014 |
Andy Lutomirski <luto@amacapital.net> |
x86_64: Add a comment explaining the TASK_SIZE_MAX guard page
That guard page is absolutely necessary; explain why for posterity.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://l
x86_64: Add a comment explaining the TASK_SIZE_MAX guard page
That guard page is absolutely necessary; explain why for posterity.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/23320cb5017c2da8475ec20fcde8089d82aa2699.1415144745.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
2c773dd3 |
| 04-Nov-2014 |
Jan Beulich <JBeulich@suse.com> |
x86: Convert a few more per-CPU items to read-mostly ones
Both this_cpu_off and cpu_info aren't getting modified post boot, yet are being accessed on enough code paths that grouping them with other
x86: Convert a few more per-CPU items to read-mostly ones
Both this_cpu_off and cpu_info aren't getting modified post boot, yet are being accessed on enough code paths that grouping them with other frequently read items seems desirable. For cpu_info this at the same time implies removing the cache line alignment (which afaict became pointless when it got converted to per-CPU data years ago).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/54589BD20200007800044A84@mail.emea.novell.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
Revision tags: v3.18-rc1, v3.17, v3.17-rc7, v3.17-rc6, v3.17-rc5, v3.17-rc4, v3.17-rc3, v3.17-rc2, v3.17-rc1, v3.16 |
|
#
e9f4e0a9 |
| 31-Jul-2014 |
Dave Hansen <dave.hansen@linux.intel.com> |
x86/mm: Rip out complicated, out-of-date, buggy TLB flushing
I think the flush_tlb_mm_range() code that tries to tune the flush sizes based on the CPU needs to get ripped out for several reasons:
1
x86/mm: Rip out complicated, out-of-date, buggy TLB flushing
I think the flush_tlb_mm_range() code that tries to tune the flush sizes based on the CPU needs to get ripped out for several reasons:
1. It is obviously buggy. It uses mm->total_vm to judge the task's footprint in the TLB. It should certainly be using some measure of RSS, *NOT* ->total_vm since only resident memory can populate the TLB. 2. Haswell, and several other CPUs are missing from the intel_tlb_flushall_shift_set() function. Thus, it has been demonstrated to bitrot quickly in practice. 3. It is plain wrong in my vm: [ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.037444] tlb_flushall_shift: 6 Which leads to it to never use invlpg. 4. The assumptions about TLB refill costs are wrong: http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com (more on this in later patches) 5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59 I believe the sample times were too short. Running the benchmark in a loop yields times that vary quite a bit.
Note that this leaves us with a static ceiling of 1 page. This is a conservative, dumb setting, and will be revised in a later patch.
This also removes the code which attempts to predict whether we are flushing data or instructions. We expect instruction flushes to be relatively rare and not worth tuning for explicitly.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.com Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
show more ...
|
Revision tags: v3.16-rc7, v3.16-rc6, v3.16-rc5, v3.16-rc4 |
|
#
3a6bfbc9 |
| 29-Jun-2014 |
Davidlohr Bueso <davidlohr@hp.com> |
arch, locking: Ciao arch_mutex_cpu_relax()
The arch_mutex_cpu_relax() function, introduced by 34b133f, is hacky and ugly. It was added a few years ago to address the fact that common cpu_relax() cal
arch, locking: Ciao arch_mutex_cpu_relax()
The arch_mutex_cpu_relax() function, introduced by 34b133f, is hacky and ugly. It was added a few years ago to address the fact that common cpu_relax() calls include yielding on s390, and thus impact the optimistic spinning functionality of mutexes. Nowadays we use this function well beyond mutexes: rwsem, qrwlock, mcs and lockref. Since the macro that defines the call is in the mutex header, any users must include mutex.h and the naming is misleading as well.
This patch (i) renames the call to cpu_relax_lowlatency ("relax, but only if you can do it with very low latency") and (ii) defines it in each arch's asm/processor.h local header, just like for regular cpu_relax functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax, and thus we can take it out of mutex.h. While this can seem redundant, I believe it is a good choice as it allows us to move out arch specific logic from generic locking primitives and enables future(?) archs to transparently define it, similarly to System Z.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Blanchard <anton@samba.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bharat Bhushan <r65777@freescale.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Joe Perches <joe@perches.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Joseph Myers <joseph@codesourcery.com> Cc: Kees Cook <keescook@chromium.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Stratos Karafotis <stratosk@semaphore.gr> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Kulikov <segoon@openwall.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Waiman Long <Waiman.Long@hp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: adi-buildroot-devel@lists.sourceforge.net Cc: linux390@de.ibm.com Cc: linux-alpha@vger.kernel.org Cc: linux-am33-list@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-cris-kernel@axis.com Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux@lists.openrisc.net Cc: linux-m32r-ja@ml.linux-m32r.org Cc: linux-m32r@ml.linux-m32r.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-metag@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|