History log of /openbmc/linux/arch/x86/include/asm/spinlock.h (Results 76 – 95 of 95)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v3.1-rc5, v3.1-rc4, v3.1-rc3, v3.1-rc2, v3.1-rc1, v3.0, v3.0-rc7, v3.0-rc6, v3.0-rc5, v3.0-rc4, v3.0-rc3, v3.0-rc2, v3.0-rc1, v2.6.39, v2.6.39-rc7, v2.6.39-rc6, v2.6.39-rc5, v2.6.39-rc4, v2.6.39-rc3, v2.6.39-rc2, v2.6.39-rc1, v2.6.38, v2.6.38-rc8, v2.6.38-rc7, v2.6.38-rc6, v2.6.38-rc5, v2.6.38-rc4, v2.6.38-rc3, v2.6.38-rc2, v2.6.38-rc1, v2.6.37, v2.6.37-rc8, v2.6.37-rc7, v2.6.37-rc6, v2.6.37-rc5, v2.6.37-rc4, v2.6.37-rc3, v2.6.37-rc2, v2.6.37-rc1, v2.6.36, v2.6.36-rc8, v2.6.36-rc7, v2.6.36-rc6, v2.6.36-rc5, v2.6.36-rc4, v2.6.36-rc3, v2.6.36-rc2, v2.6.36-rc1, v2.6.35, v2.6.35-rc6
# 229855d6 13-Jul-2010 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

x86, ticketlock: Make __ticket_spin_trylock common

Make trylock code common regardless of ticket size.

(Also, rename arch_spinlock.slock to head_tail.)

Signed-off-by: Jerem

x86, ticketlock: Make __ticket_spin_trylock common

Make trylock code common regardless of ticket size.

(Also, rename arch_spinlock.slock to head_tail.)

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

show more ...


# 2994488f 13-Jul-2010 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

x86, ticketlock: Convert __ticket_spin_lock to use xadd()

Convert the two variants of __ticket_spin_lock() to use xadd(), which
has the effect of making them identical, so remove the dup

x86, ticketlock: Convert __ticket_spin_lock to use xadd()

Convert the two variants of __ticket_spin_lock() to use xadd(), which
has the effect of making them identical, so remove the duplicate function.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

show more ...


Revision tags: v2.6.35-rc5, v2.6.35-rc4
# c576a3ea 02-Jul-2010 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

x86, ticketlock: Convert spin loop to C

The inner loop of __ticket_spin_lock isn't doing anything very special,
so reimplement it in C.

For the 8 bit ticket lock variant, we use

x86, ticketlock: Convert spin loop to C

The inner loop of __ticket_spin_lock isn't doing anything very special,
so reimplement it in C.

For the 8 bit ticket lock variant, we use a register union to get direct
access to the lower and upper bytes in the tickets, but unfortunately gcc
won't generate a direct comparison between the two halves of the register,
so the generated asm isn't quite as pretty as the hand-coded version.
However benchmarking shows that this is actually a small improvement in
runtime performance on some benchmarks, and never a slowdown.

We also need to make sure there's a barrier at the end of the lock loop
to make sure that the compiler doesn't move any instructions from within
the locked region into the region where we don't yet own the lock.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

show more ...


# 84eb950d 02-Jul-2010 Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

x86, ticketlock: Clean up types and accessors

A few cleanups to the way spinlocks are defined and accessed:
- define __ticket_t which is the size of a spinlock ticket (ie, enough

x86, ticketlock: Clean up types and accessors

A few cleanups to the way spinlocks are defined and accessed:
- define __ticket_t which is the size of a spinlock ticket (ie, enough
bits to hold all the cpus)
- Define struct arch_spinlock as a union containing plain slock and
the head and tail tickets
- Use head and tail to implement some of the spinlock predicates.
- Make all ticket variables unsigned.
- Use TICKET_SHIFT to form constants

Most of this will be used in later patches.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Link: http://lkml.kernel.org/r/4E5BCC40.3030501@goop.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

show more ...


# 60063497 26-Jul-2011 Arun Sharma <asharma@fb.com>

atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.

atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# a750036f 19-Jul-2011 Jan Beulich <JBeulich@novell.com>

x86: Fix write lock scalability 64-bit issue

With the write lock path simply subtracting RW_LOCK_BIAS there
is, on large systems, the theoretical possibility of overflowing
the 32-bi

x86: Fix write lock scalability 64-bit issue

With the write lock path simply subtracting RW_LOCK_BIAS there
is, on large systems, the theoretical possibility of overflowing
the 32-bit value that was used so far (namely if 128 or more
CPUs manage to do the subtraction, but don't get to do the
inverse addition in the failure path quickly enough).

A first measure is to modify RW_LOCK_BIAS itself - with the new
value chosen, it is good for up to 2048 CPUs each allowed to
nest over 2048 times on the read path without causing an issue.
Quite possibly it would even be sufficient to adjust the bias a
little further, assuming that allowing for significantly less
nesting would suffice.

However, as the original value chosen allowed for even more
nesting levels, to support more than 2048 CPUs (possible
currently only for 64-bit kernels) the lock itself gets widened
to 64 bits.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4E258E0D020000780004E3F0@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

show more ...


Revision tags: v2.6.35-rc3, v2.6.35-rc2, v2.6.35-rc1, v2.6.34, v2.6.34-rc7, v2.6.34-rc6, v2.6.34-rc5, v2.6.34-rc4, v2.6.34-rc3, v2.6.34-rc2, v2.6.34-rc1, v2.6.33, v2.6.33-rc8, v2.6.33-rc7, v2.6.33-rc6, v2.6.33-rc5, v2.6.33-rc4, v2.6.33-rc3, v2.6.33-rc2, v2.6.33-rc1
# e5931943 03-Dec-2009 Thomas Gleixner <tglx@linutronix.de>

locking: Convert raw_rwlock functions to arch_rwlock

Name space cleanup for rwlock functions. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by:

locking: Convert raw_rwlock functions to arch_rwlock

Name space cleanup for rwlock functions. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org

show more ...


# fb3a6bbc 03-Dec-2009 Thomas Gleixner <tglx@linutronix.de>

locking: Convert raw_rwlock to arch_rwlock

Not strictly necessary for -rt as -rt does not have non sleeping
rwlocks, but it's odd to not have a consistent naming convention.

No

locking: Convert raw_rwlock to arch_rwlock

Not strictly necessary for -rt as -rt does not have non sleeping
rwlocks, but it's odd to not have a consistent naming convention.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org

show more ...


Revision tags: v2.6.32
# 0199c4e6 02-Dec-2009 Thomas Gleixner <tglx@linutronix.de>

locking: Convert __raw_spin* functions to arch_spin*

Name space cleanup. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <pete

locking: Convert __raw_spin* functions to arch_spin*

Name space cleanup. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org

show more ...


# 445c8951 02-Dec-2009 Thomas Gleixner <tglx@linutronix.de>

locking: Convert raw_spinlock to arch_spinlock

The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for

locking: Convert raw_spinlock to arch_spinlock

The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.

Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org

show more ...


Revision tags: v2.6.32-rc8, v2.6.32-rc7, v2.6.32-rc6, v2.6.32-rc5, v2.6.32-rc4, v2.6.32-rc3, v2.6.32-rc1, v2.6.32-rc2, v2.6.31, v2.6.31-rc9, v2.6.31-rc8, v2.6.31-rc7, v2.6.31-rc6, v2.6.31-rc5, v2.6.31-rc4, v2.6.31-rc3
# ad462769 08-Jul-2009 Jiri Olsa <jolsa@redhat.com>

memory barrier: adding smp_mb__after_lock

Adding smp_mb__after_lock define to be used as a smp_mb call after
a lock.

Making it nop for x86, since {read|write|spin}_lock() on x86

memory barrier: adding smp_mb__after_lock

Adding smp_mb__after_lock define to be used as a smp_mb call after
a lock.

Making it nop for x86, since {read|write|spin}_lock() on x86 are
full memory barriers.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v2.6.31-rc2, v2.6.31-rc1, v2.6.30, v2.6.30-rc8, v2.6.30-rc7, v2.6.30-rc6
# b4ecc126 13-May-2009 Jeremy Fitzhardinge <jeremy@goop.org>

x86: Fix performance regression caused by paravirt_ops on native kernels

Xiaohui Xin and some other folks at Intel have been looking into what's
behind the performance hit of paravirt_op

x86: Fix performance regression caused by paravirt_ops on native kernels

Xiaohui Xin and some other folks at Intel have been looking into what's
behind the performance hit of paravirt_ops when running native.

It appears that the hit is entirely due to the paravirtualized
spinlocks introduced by:

| commit 8efcbab674de2bee45a2e4cdf97de16b8e609ac8
| Date: Mon Jul 7 12:07:51 2008 -0700
|
| paravirt: introduce a "lock-byte" spinlock implementation

The extra call/return in the spinlock path is somehow
causing an increase in the cycles/instruction of somewhere around 2-7%
(seems to vary quite a lot from test to test). The working theory is
that the CPU's pipeline is getting upset about the
call->call->locked-op->return->return, and seems to be failing to
speculate (though I haven't seen anything definitive about the precise
reasons). This doesn't entirely make sense, because the performance
hit is also visible on unlock and other operations which don't involve
locked instructions. But spinlock operations clearly swamp all the
other pvops operations, even though I can't imagine that they're
nearly as common (there's only a .05% increase in instructions
executed).

If I disable just the pv-spinlock calls, my tests show that pvops is
identical to non-pvops performance on native (my measurements show that
it is actually about .1% faster, but Xiaohui shows a .05% slowdown).

Summary of results, averaging 10 runs of the "mmperf" test, using a
no-pvops build as baseline:

nopv Pv-nospin Pv-spin
CPU cycles 100.00% 99.89% 102.18%
instructions 100.00% 100.10% 100.15%
CPI 100.00% 99.79% 102.03%
cache ref 100.00% 100.84% 100.28%
cache miss 100.00% 90.47% 88.56%
cache miss rate 100.00% 89.72% 88.31%
branches 100.00% 99.93% 100.04%
branch miss 100.00% 103.66% 107.72%
branch miss rt 100.00% 103.73% 107.67%
wallclock 100.00% 99.90% 102.20%

The clear effect here is that the 2% increase in CPI is
directly reflected in the final wallclock time.

(The other interesting effect is that the more ops are
out of line calls via pvops, the lower the cache access
and miss rates. Not too surprising, but it suggests that
the non-pvops kernel is over-inlined. On the flipside,
the branch misses go up correspondingly...)

So, what's the fix?

Paravirt patching turns all the pvops calls into direct calls, so
_spin_lock etc do end up having direct calls. For example, the compiler
generated code for paravirtualized _spin_lock is:

<_spin_lock+0>: mov %gs:0xb4c8,%rax
<_spin_lock+9>: incl 0xffffffffffffe044(%rax)
<_spin_lock+15>: callq *0xffffffff805a5b30
<_spin_lock+22>: retq

The indirect call will get patched to:
<_spin_lock+0>: mov %gs:0xb4c8,%rax
<_spin_lock+9>: incl 0xffffffffffffe044(%rax)
<_spin_lock+15>: callq <__ticket_spin_lock>
<_spin_lock+20>: nop; nop /* or whatever 2-byte nop */
<_spin_lock+22>: retq

One possibility is to inline _spin_lock, etc, when building an
optimised kernel (ie, when there's no spinlock/preempt
instrumentation/debugging enabled). That will remove the outer
call/return pair, returning the instruction stream to a single
call/return, which will presumably execute the same as the non-pvops
case. The downsides arel 1) it will replicate the
preempt_disable/enable code at eack lock/unlock callsite; this code is
fairly small, but not nothing; and 2) the spinlock definitions are
already a very heavily tangled mass of #ifdefs and other preprocessor
magic, and making any changes will be non-trivial.

The other obvious answer is to disable pv-spinlocks. Making them a
separate config option is fairly easy, and it would be trivial to
enable them only when Xen is enabled (as the only non-default user).
But it doesn't really address the common case of a distro build which
is going to have Xen support enabled, and leaves the open question of
whether the native performance cost of pv-spinlocks is worth the
performance improvement on a loaded Xen system (10% saving of overall
system CPU when guests block rather than spin). Still it is a
reasonable short-term workaround.

[ Impact: fix pvops performance regression when running native ]

Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com>
Analysed-by: "Li Xin" <xin.li@intel.com>
Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4A0B62F7.5030802@goop.org>
[ fixed the help text ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

show more ...


Revision tags: v2.6.30-rc5, v2.6.30-rc4, v2.6.30-rc3, v2.6.30-rc2, v2.6.30-rc1
# f5f7eac4 02-Apr-2009 Robin Holt <holt@sgi.com>

Allow rwlocks to re-enable interrupts

Pass the original flags to rwlock arch-code, so that it can re-enable
interrupts if implemented for that architecture.

Initially, make __ra

Allow rwlocks to re-enable interrupts

Pass the original flags to rwlock arch-code, so that it can re-enable
interrupts if implemented for that architecture.

Initially, make __raw_read_lock_flags and __raw_write_lock_flags stubs
which just do the same thing as non-flags variants.

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# f8a6b2b9 13-Feb-2009 Ingo Molnar <mingo@elte.hu>

Merge branch 'linus' into x86/apic

Conflicts:
arch/x86/kernel/acpi/boot.c
arch/x86/mm/fault.c


Revision tags: v2.6.29, v2.6.29-rc8, v2.6.29-rc7, v2.6.29-rc6, v2.6.29-rc5
# a5ef7ca0 08-Feb-2009 Kyle McMartin <kyle@redhat.com>

x86: spinlocks: define dummy __raw_spin_is_contended

Architectures other than mips and x86 are not using ticket spinlocks.
Therefore, the contention on the lock is meaningless, since the

x86: spinlocks: define dummy __raw_spin_is_contended

Architectures other than mips and x86 are not using ticket spinlocks.
Therefore, the contention on the lock is meaningless, since there is
nobody known to be waiting on it (arguably /fairly/ unfair locks).

Dummy it out to return 0 on other architectures.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 74b6eb6b 28-Jan-2009 Ingo Molnar <mingo@elte.hu>

Merge branches 'x86/asm', 'x86/cleanups', 'x86/cpudetect', 'x86/debug', 'x86/doc', 'x86/header-fixes', 'x86/mm', 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgen

Merge branches 'x86/asm', 'x86/cleanups', 'x86/cpudetect', 'x86/debug', 'x86/doc', 'x86/header-fixes', 'x86/mm', 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core

show more ...


Revision tags: v2.6.29-rc4, v2.6.29-rc3
# 2d4d57db 25-Jan-2009 Frederic Weisbecker <fweisbec@gmail.com>

x86: micro-optimize __raw_read_trylock()

The current version of __raw_read_trylock starts with decrementing the lock
and read its new value as a separate operation after that.

T

x86: micro-optimize __raw_read_trylock()

The current version of __raw_read_trylock starts with decrementing the lock
and read its new value as a separate operation after that.

That makes 3 dereferences (read, write (after sub), read) whereas
a single atomic_dec_return does only two pointers dereferences (read, write).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

show more ...


Revision tags: v2.6.29-rc2
# afb33f8c 12-Jan-2009 Jiri Kosina <jkosina@suse.cz>

x86: remove byte locks

Impact: cleanup

Remove byte locks implementation, which was introduced by Jeremy in
8efcbab6 ("paravirt: introduce a "lock-byte" spinlock implementation")

x86: remove byte locks

Impact: cleanup

Remove byte locks implementation, which was introduced by Jeremy in
8efcbab6 ("paravirt: introduce a "lock-byte" spinlock implementation"),
but turned out to be dead code that is not used by any in-kernel
virtualization guest (Xen uses its own variant of spinlocks implementation
and KVM is not planning to move to byte locks).

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

show more ...


Revision tags: v2.6.29-rc1, v2.6.28, v2.6.28-rc9, v2.6.28-rc8, v2.6.28-rc7, v2.6.28-rc6, v2.6.28-rc5, v2.6.28-rc4, v2.6.28-rc3, v2.6.28-rc2, v2.6.28-rc1
# 1965aae3 23-Oct-2008 H. Peter Anvin <hpa@zytor.com>

x86: Fix ASM_X86__ header guards

Change header guards named "ASM_X86__*" to "_ASM_X86_*" since:

a. the double underscore is ugly and pointless.
b. no leading underscore violates

x86: Fix ASM_X86__ header guards

Change header guards named "ASM_X86__*" to "_ASM_X86_*" since:

a. the double underscore is ugly and pointless.
b. no leading underscore violates namespace constraints.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>

show more ...


Revision tags: v2.6.27, v2.6.27-rc9, v2.6.27-rc8, v2.6.27-rc7, v2.6.27-rc6, v2.6.27-rc5, v2.6.27-rc4
# bb898558 17-Aug-2008 Al Viro <viro@zeniv.linux.org.uk>

x86, um: ... and asm-x86 move

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>


1234