History log of /openbmc/linux/drivers/md/dm-crypt.c (Results 151 – 175 of 641)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v4.10.3, v4.10.2, v4.10.1, v4.10
# ef43aa38 04-Jan-2017 Milan Broz <gmazyland@gmail.com>

dm crypt: add cryptographic data integrity protection (authenticated encryption)

Allow the use of per-sector metadata, provided by the dm-integrity
module, for integrity protection and persistently

dm crypt: add cryptographic data integrity protection (authenticated encryption)

Allow the use of per-sector metadata, provided by the dm-integrity
module, for integrity protection and persistently stored per-sector
Initialization Vector (IV). The underlying device must support the
"DM-DIF-EXT-TAG" dm-integrity profile.

The per-bio integrity metadata is allocated by dm-crypt for every bio.

Example of low-level mapping table for various types of use:
DEV=/dev/sdb
SIZE=417792

# Additional HMAC with CBC-ESSIV, key is concatenated encryption key + HMAC key
SIZE_INT=389952
dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 32 J 0"
dmsetup create y --table "0 $SIZE_INT crypt aes-cbc-essiv:sha256 \
11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \
00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff \
0 /dev/mapper/x 0 1 integrity:32:hmac(sha256)"

# AEAD (Authenticated Encryption with Additional Data) - GCM with random IVs
# GCM in kernel uses 96bits IV and we store 128bits auth tag (so 28 bytes metadata space)
SIZE_INT=393024
dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 28 J 0"
dmsetup create y --table "0 $SIZE_INT crypt aes-gcm-random \
11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \
0 /dev/mapper/x 0 1 integrity:28:aead"

# Random IV only for XTS mode (no integrity protection but provides atomic random sector change)
SIZE_INT=401272
dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 16 J 0"
dmsetup create y --table "0 $SIZE_INT crypt aes-xts-random \
11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \
0 /dev/mapper/x 0 1 integrity:16:none"

# Random IV with XTS + HMAC integrity protection
SIZE_INT=377656
dmsetup create x --table "0 $SIZE_INT integrity $DEV 0 48 J 0"
dmsetup create y --table "0 $SIZE_INT crypt aes-xts-random \
11ff33c6fb942655efb3e30cf4c0fd95f5ef483afca72166c530ae26151dd83b \
00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff \
0 /dev/mapper/x 0 1 integrity:48:hmac(sha256)"

Both AEAD and HMAC protection authenticates not only data but also
sector metadata.

HMAC protection is implemented through autenc wrapper (so it is
processed the same way as an authenticated mode).

In HMAC mode there are two keys (concatenated in dm-crypt mapping
table). First is the encryption key and the second is the key for
authentication (HMAC). (It is userspace decision if these keys are
independent or somehow derived.)

The sector request for AEAD/HMAC authenticated encryption looks like this:
|----- AAD -------|------ DATA -------|-- AUTH TAG --|
| (authenticated) | (auth+encryption) | |
| sector_LE | IV | sector in/out | tag in/out |

For writes, the integrity fields are calculated during AEAD encryption
of every sector and stored in bio integrity fields and sent to
underlying dm-integrity target for storage.

For reads, the integrity metadata is verified during AEAD decryption of
every sector (they are filled in by dm-integrity, but the integrity
fields are pre-allocated in dm-crypt).

There is also an experimental support in cryptsetup utility for more
friendly configuration (part of LUKS2 format).

Because the integrity fields are not valid on initial creation, the
device must be "formatted". This can be done by direct-io writes to the
device (e.g. dd in direct-io mode). For now, there is available trivial
tool to do this, see: https://github.com/mbroz/dm_int_tools

Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Vashek Matyas <matyas@fi.muni.cz>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 0837e49a 01-Mar-2017 David Howells <dhowells@redhat.com>

KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()

rcu_dereference_key() and user_key_payload() are currently being used in
two different, incompatible ways:

(1) As a wrapper

KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload()

rcu_dereference_key() and user_key_payload() are currently being used in
two different, incompatible ways:

(1) As a wrapper to rcu_dereference() - when only the RCU read lock used
to protect the key.

(2) As a wrapper to rcu_dereference_protected() - when the key semaphor is
used to protect the key and the may be being modified.

Fix this by splitting both of the key wrappers to produce:

(1) RCU accessors for keys when caller has the key semaphore locked:

dereference_key_locked()
user_key_payload_locked()

(2) RCU accessors for keys when caller holds the RCU read lock:

dereference_key_rcu()
user_key_payload_rcu()

This should fix following warning in the NFS idmapper

===============================
[ INFO: suspicious RCU usage. ]
4.10.0 #1 Tainted: G W
-------------------------------
./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 0
1 lock held by mount.nfs/5987:
#0: (rcu_read_lock){......}, at: [<d000000002527abc>] nfs_idmap_get_key+0x15c/0x420 [nfsv4]
stack backtrace:
CPU: 1 PID: 5987 Comm: mount.nfs Tainted: G W 4.10.0 #1
Call Trace:
dump_stack+0xe8/0x154 (unreliable)
lockdep_rcu_suspicious+0x140/0x190
nfs_idmap_get_key+0x380/0x420 [nfsv4]
nfs_map_name_to_uid+0x2a0/0x3b0 [nfsv4]
decode_getfattr_attrs+0xfac/0x16b0 [nfsv4]
decode_getfattr_generic.constprop.106+0xbc/0x150 [nfsv4]
nfs4_xdr_dec_lookup_root+0xac/0xb0 [nfsv4]
rpcauth_unwrap_resp+0xe8/0x140 [sunrpc]
call_decode+0x29c/0x910 [sunrpc]
__rpc_execute+0x140/0x8f0 [sunrpc]
rpc_run_task+0x170/0x200 [sunrpc]
nfs4_call_sync_sequence+0x68/0xa0 [nfsv4]
_nfs4_lookup_root.isra.44+0xd0/0xf0 [nfsv4]
nfs4_lookup_root+0xe0/0x350 [nfsv4]
nfs4_lookup_root_sec+0x70/0xa0 [nfsv4]
nfs4_find_root_sec+0xc4/0x100 [nfsv4]
nfs4_proc_get_rootfh+0x5c/0xf0 [nfsv4]
nfs4_get_rootfh+0x6c/0x190 [nfsv4]
nfs4_server_common_setup+0xc4/0x260 [nfsv4]
nfs4_create_server+0x278/0x3c0 [nfsv4]
nfs4_remote_mount+0x50/0xb0 [nfsv4]
mount_fs+0x74/0x210
vfs_kern_mount+0x78/0x220
nfs_do_root_mount+0xb0/0x140 [nfsv4]
nfs4_try_mount+0x60/0x100 [nfsv4]
nfs_fs_mount+0x5ec/0xda0 [nfs]
mount_fs+0x74/0x210
vfs_kern_mount+0x78/0x220
do_mount+0x254/0xf70
SyS_mount+0x94/0x100
system_call+0x38/0xe0

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>

show more ...


# f5b0cba8 31-Jan-2017 Ondrej Kozina <okozina@redhat.com>

dm crypt: replace RCU read-side section with rwsem

The lockdep splat below hints at a bug in RCU usage in dm-crypt that
was introduced with commit c538f6ec9f56 ("dm crypt: add ability to use
keys fr

dm crypt: replace RCU read-side section with rwsem

The lockdep splat below hints at a bug in RCU usage in dm-crypt that
was introduced with commit c538f6ec9f56 ("dm crypt: add ability to use
keys from the kernel key retention service"). The kernel keyring
function user_key_payload() is in fact a wrapper for
rcu_dereference_protected() which must not be called with only
rcu_read_lock() section mark.

Unfortunately the kernel keyring subsystem doesn't currently provide
an interface that allows the use of an RCU read-side section. So for
now we must drop RCU in favour of rwsem until a proper function is
made available in the kernel keyring subsystem.

===============================
[ INFO: suspicious RCU usage. ]
4.10.0-rc5 #2 Not tainted
-------------------------------
./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by cryptsetup/6464:
#0: (&md->type_lock){+.+.+.}, at: [<ffffffffa02472a2>] dm_lock_md_type+0x12/0x20 [dm_mod]
#1: (rcu_read_lock){......}, at: [<ffffffffa02822f8>] crypt_set_key+0x1d8/0x4b0 [dm_crypt]
stack backtrace:
CPU: 1 PID: 6464 Comm: cryptsetup Not tainted 4.10.0-rc5 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
Call Trace:
dump_stack+0x67/0x92
lockdep_rcu_suspicious+0xc5/0x100
crypt_set_key+0x351/0x4b0 [dm_crypt]
? crypt_set_key+0x1d8/0x4b0 [dm_crypt]
crypt_ctr+0x341/0xa53 [dm_crypt]
dm_table_add_target+0x147/0x330 [dm_mod]
table_load+0x111/0x350 [dm_mod]
? retrieve_status+0x1c0/0x1c0 [dm_mod]
ctl_ioctl+0x1f5/0x510 [dm_mod]
dm_ctl_ioctl+0xe/0x20 [dm_mod]
do_vfs_ioctl+0x8e/0x690
? ____fput+0x9/0x10
? task_work_run+0x7e/0xa0
? trace_hardirqs_on_caller+0x122/0x1b0
SyS_ioctl+0x3c/0x70
entry_SYSCALL_64_fastpath+0x18/0xad
RIP: 0033:0x7f392c9a4ec7
RSP: 002b:00007ffef6383378 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffef63830a0 RCX: 00007f392c9a4ec7
RDX: 000000000124fcc0 RSI: 00000000c138fd09 RDI: 0000000000000005
RBP: 00007ffef6383090 R08: 00000000ffffffff R09: 00000000012482b0
R10: 2a28205d34383336 R11: 0000000000000246 R12: 00007f392d803a08
R13: 00007ffef63831e0 R14: 0000000000000000 R15: 00007f392d803a0b

Fixes: c538f6ec9f56 ("dm crypt: add ability to use keys from the kernel key retention service")
Reported-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 642fa448 03-Jan-2017 Davidlohr Bueso <dave@stgolabs.net>

sched/core: Remove set_task_state()

This is a nasty interface and setting the state of a foreign task must
not be done. As of the following commit:

be628be0956 ("bcache: Make gc wakeup sane, remo

sched/core: Remove set_task_state()

This is a nasty interface and setting the state of a foreign task must
not be done. As of the following commit:

be628be0956 ("bcache: Make gc wakeup sane, remove set_task_state()")

... everyone in the kernel calls set_task_state() with current, allowing
the helper to be removed.

However, as the comment indicates, it is still around for those archs
where computing current is more expensive than using a pointer, at least
in theory. An important arch that is affected is arm64, however this has
been addressed now [1] and performance is up to par making no difference
with either calls.

Of all the callers, if any, it's the locking bits that would care most
about this -- ie: we end up passing a tsk pointer to a lot of the lock
slowpath, and setting ->state on that. The following numbers are based
on two tests: a custom ad-hoc microbenchmark that just measures
latencies (for ~65 million calls) between get_task_state() vs
get_current_state().

Secondly for a higher overview, an unlink microbenchmark was used,
which pounds on a single file with open, close,unlink combos with
increasing thread counts (up to 4x ncpus). While the workload is quite
unrealistic, it does contend a lot on the inode mutex or now rwsem.

[1] https://lkml.kernel.org/r/1483468021-8237-1-git-send-email-mark.rutland@arm.com

== 1. x86-64 ==

Avg runtime set_task_state(): 601 msecs
Avg runtime set_current_state(): 552 msecs

vanilla dirty
Hmean unlink1-processes-2 36089.26 ( 0.00%) 38977.33 ( 8.00%)
Hmean unlink1-processes-5 28555.01 ( 0.00%) 29832.55 ( 4.28%)
Hmean unlink1-processes-8 37323.75 ( 0.00%) 44974.57 ( 20.50%)
Hmean unlink1-processes-12 43571.88 ( 0.00%) 44283.01 ( 1.63%)
Hmean unlink1-processes-21 34431.52 ( 0.00%) 38284.45 ( 11.19%)
Hmean unlink1-processes-30 34813.26 ( 0.00%) 37975.17 ( 9.08%)
Hmean unlink1-processes-48 37048.90 ( 0.00%) 39862.78 ( 7.59%)
Hmean unlink1-processes-79 35630.01 ( 0.00%) 36855.30 ( 3.44%)
Hmean unlink1-processes-110 36115.85 ( 0.00%) 39843.91 ( 10.32%)
Hmean unlink1-processes-141 32546.96 ( 0.00%) 35418.52 ( 8.82%)
Hmean unlink1-processes-172 34674.79 ( 0.00%) 36899.21 ( 6.42%)
Hmean unlink1-processes-203 37303.11 ( 0.00%) 36393.04 ( -2.44%)
Hmean unlink1-processes-224 35712.13 ( 0.00%) 36685.96 ( 2.73%)

== 2. ppc64le ==

Avg runtime set_task_state(): 938 msecs
Avg runtime set_current_state: 940 msecs

vanilla dirty
Hmean unlink1-processes-2 19269.19 ( 0.00%) 30704.50 ( 59.35%)
Hmean unlink1-processes-5 20106.15 ( 0.00%) 21804.15 ( 8.45%)
Hmean unlink1-processes-8 17496.97 ( 0.00%) 17243.28 ( -1.45%)
Hmean unlink1-processes-12 14224.15 ( 0.00%) 17240.21 ( 21.20%)
Hmean unlink1-processes-21 14155.66 ( 0.00%) 15681.23 ( 10.78%)
Hmean unlink1-processes-30 14450.70 ( 0.00%) 15995.83 ( 10.69%)
Hmean unlink1-processes-48 16945.57 ( 0.00%) 16370.42 ( -3.39%)
Hmean unlink1-processes-79 15788.39 ( 0.00%) 14639.27 ( -7.28%)
Hmean unlink1-processes-110 14268.48 ( 0.00%) 14377.40 ( 0.76%)
Hmean unlink1-processes-141 14023.65 ( 0.00%) 16271.69 ( 16.03%)
Hmean unlink1-processes-172 13417.62 ( 0.00%) 16067.55 ( 19.75%)
Hmean unlink1-processes-203 15293.08 ( 0.00%) 15440.40 ( 0.96%)
Hmean unlink1-processes-234 13719.32 ( 0.00%) 16190.74 ( 18.01%)
Hmean unlink1-processes-265 16400.97 ( 0.00%) 16115.22 ( -1.74%)
Hmean unlink1-processes-296 14388.60 ( 0.00%) 16216.13 ( 12.70%)
Hmean unlink1-processes-320 15771.85 ( 0.00%) 15905.96 ( 0.85%)

x86-64 (known to be fast for get_current()/this_cpu_read_stable() caching)
and ppc64 (with paca) show similar improvements in the unlink microbenches.
The small delta for ppc64 (2ms), does not represent the gains on the unlink
runs. In the case of x86, there was a decent amount of variation in the
latency runs, but always within a 20 to 50ms increase), ppc was more constant.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: mark.rutland@arm.com
Link: http://lkml.kernel.org/r/1483479794-14013-5-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


Revision tags: v4.9
# 027c431c 01-Dec-2016 Ondrej Kozina <okozina@redhat.com>

dm crypt: reject key strings containing whitespace chars

Unfortunately key_string may theoretically contain whitespace even after
it's processed by dm_split_args(). The reason for this is DM core
s

dm crypt: reject key strings containing whitespace chars

Unfortunately key_string may theoretically contain whitespace even after
it's processed by dm_split_args(). The reason for this is DM core
supports escaping of almost all chars including any whitespace.

If userspace passes a key to the kernel in format ":32:logon:my_prefix:my\ key"
dm-crypt will look up key "my_prefix:my key" in kernel keyring service.
So far everything's fine.

Unfortunately if userspace later calls DM_TABLE_STATUS ioctl, it will not
receive back expected ":32:logon:my_prefix:my\ key" but the unescaped version
instead. Also userpace (most notably cryptsetup) is not ready to parse
single target argument containing (even escaped) whitespace chars and any
whitespace is simply taken as delimiter of another argument.

This effect is mitigated by the fact libdevmapper curently performs
double escaping of '\' char. Any user input in format "x\ x" is
transformed into "x\\ x" before being passed to the kernel. Nonetheless
dm-crypt may be used without libdevmapper. Therefore the near-term
solution to this is to reject any key string containing whitespace.

Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# c538f6ec 21-Nov-2016 Ondrej Kozina <okozina@redhat.com>

dm crypt: add ability to use keys from the kernel key retention service

The kernel key service is a generic way to store keys for the use of
other subsystems. Currently there is no way to use kernel

dm crypt: add ability to use keys from the kernel key retention service

The kernel key service is a generic way to store keys for the use of
other subsystems. Currently there is no way to use kernel keys in dm-crypt.
This patch aims to fix that. Instead of key userspace may pass a key
description with preceding ':'. So message that constructs encryption
mapping now looks like this:

<cipher> [<key>|:<key_string>] <iv_offset> <dev_path> <start> [<#opt_params> <opt_params>]

where <key_string> is in format: <key_size>:<key_type>:<key_description>

Currently we only support two elementary key types: 'user' and 'logon'.
Keys may be loaded in dm-crypt either via <key_string> or using
classical method and pass the key in hex representation directly.

dm-crypt device initialised with a key passed in hex representation may be
replaced with key passed in key_string format and vice versa.

(Based on original work by Andrey Ryabinin)

Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


Revision tags: openbmc-4.4-20161121-1, v4.4.33, v4.4.32, v4.4.31, v4.4.30, v4.4.29, v4.4.28, v4.4.27, v4.7.10, openbmc-4.4-20161021-1, v4.7.9, v4.4.26, v4.7.8, v4.4.25, v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22, v4.4.21, v4.7.4, v4.7.3, v4.4.20, v4.7.2, v4.4.19, openbmc-4.4-20160819-1, v4.7.1, v4.4.18, v4.4.17, openbmc-4.4-20160804-1, v4.4.16, v4.7, openbmc-4.4-20160722-1, openbmc-20160722-1, openbmc-20160713-1, v4.4.15, v4.6.4, v4.6.3, v4.4.14, v4.6.2, v4.4.13, openbmc-20160606-1, v4.6.1, v4.4.12, openbmc-20160521-1, v4.4.11, openbmc-20160518-1, v4.6, v4.4.10, openbmc-20160511-1, openbmc-20160505-1, v4.4.9, v4.4.8, v4.4.7, openbmc-20160329-2, openbmc-20160329-1, openbmc-20160321-1, v4.4.6, v4.5, v4.4.5, v4.4.4, v4.4.3, openbmc-20160222-1, v4.4.2, openbmc-20160212-1, openbmc-20160210-1, openbmc-20160202-2, openbmc-20160202-1, v4.4.1, openbmc-20160127-1, openbmc-20160120-1, v4.4, openbmc-20151217-1, openbmc-20151210-1, openbmc-20151202-1
# 1b1b58f5 29-Nov-2015 Julia Lawall <Julia.Lawall@lip6.fr>

dm crypt: constify crypt_iv_operations structures

The crypt_iv_operations are never modified, so declare them
as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@

dm crypt: constify crypt_iv_operations structures

The crypt_iv_operations are never modified, so declare them
as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 671ea6b4 25-Aug-2016 Mikulas Patocka <mpatocka@redhat.com>

dm crypt: rename crypt_setkey_allcpus to crypt_setkey

In the past, dm-crypt used per-cpu crypto context. This has been removed
in the kernel 3.15 and the crypto context is shared between all cpus. T

dm crypt: rename crypt_setkey_allcpus to crypt_setkey

In the past, dm-crypt used per-cpu crypto context. This has been removed
in the kernel 3.15 and the crypto context is shared between all cpus. This
patch renames the function crypt_setkey_allcpus to crypt_setkey, because
there is really no activity that is done for all cpus.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 265e9098 02-Nov-2016 Ondrej Kozina <okozina@redhat.com>

dm crypt: mark key as invalid until properly loaded

In crypt_set_key(), if a failure occurs while replacing the old key
(e.g. tfm->setkey() fails) the key must not have DM_CRYPT_KEY_VALID flag
set.

dm crypt: mark key as invalid until properly loaded

In crypt_set_key(), if a failure occurs while replacing the old key
(e.g. tfm->setkey() fails) the key must not have DM_CRYPT_KEY_VALID flag
set. Otherwise, the crypto layer would have an invalid key that still
has DM_CRYPT_KEY_VALID flag set.

Cc: stable@vger.kernel.org
Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 0dae7fe5 29-Oct-2016 Ming Lei <tom.leiming@gmail.com>

dm crypt: use bio_add_page()

Use bio_add_page(), the standard interface for adding a page to a bio,
rather than open-coding the same.

It should be noted that the 'clone' bio that is allocated using

dm crypt: use bio_add_page()

Use bio_add_page(), the standard interface for adding a page to a bio,
rather than open-coding the same.

It should be noted that the 'clone' bio that is allocated using
bio_alloc_bioset(), in crypt_alloc_buffer(), does _not_ set the
bio's BIO_CLONED flag. As such, bio_add_page()'s early return for true
bio clones (those with BIO_CLONED set) isn't applicable.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# ef295ecf 28-Oct-2016 Christoph Hellwig <hch@lst.de>

block: better op and flags encoding

Now that we don't need the common flags to overflow outside the range
of a 32-bit type we can encode them the same way for both the bio and
request fields. This

block: better op and flags encoding

Now that we don't need the common flags to overflow outside the range
of a 32-bit type we can encode them the same way for both the bio and
request fields. This in addition allows us to place the operation
first (and make some room for more ops while we're at it) and to
stop having to shift around the operation values.

In addition this allows passing around only one value in the block layer
instead of two (and eventuall also in the file systems, but we can do
that later) and thus clean up a lot of code.

Last but not least this allows decreasing the size of the cmd_flags
field in struct request to 32-bits. Various functions passing this
value could also be updated, but I'd like to avoid the churn for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

show more ...


# f659b100 21-Sep-2016 Rabin Vincent <rabinv@axis.com>

dm crypt: fix crash on exit

As the documentation for kthread_stop() says, "if threadfn() may call
do_exit() itself, the caller must ensure task_struct can't go away".
dm-crypt does not ensure this a

dm crypt: fix crash on exit

As the documentation for kthread_stop() says, "if threadfn() may call
do_exit() itself, the caller must ensure task_struct can't go away".
dm-crypt does not ensure this and therefore crashes when crypt_dtr()
calls kthread_stop(). The crash is trivially reproducible by adding a
delay before the call to kthread_stop() and just opening and closing a
dm-crypt device.

general protection fault: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 533 Comm: cryptsetup Not tainted 4.8.0-rc7+ #7
task: ffff88003bd0df40 task.stack: ffff8800375b4000
RIP: 0010: kthread_stop+0x52/0x300
Call Trace:
crypt_dtr+0x77/0x120
dm_table_destroy+0x6f/0x120
__dm_destroy+0x130/0x250
dm_destroy+0x13/0x20
dev_remove+0xe6/0x120
? dev_suspend+0x250/0x250
ctl_ioctl+0x1fc/0x530
? __lock_acquire+0x24f/0x1b10
dm_ctl_ioctl+0x13/0x20
do_vfs_ioctl+0x91/0x6a0
? ____fput+0xe/0x10
? entry_SYSCALL_64_fastpath+0x5/0xbd
? trace_hardirqs_on_caller+0x151/0x1e0
SyS_ioctl+0x41/0x70
entry_SYSCALL_64_fastpath+0x1f/0xbd

This problem was introduced by bcbd94ff481e ("dm crypt: fix a possible
hang due to race condition on exit").

Looking at the description of that patch (excerpted below), it seems
like the problem it addresses can be solved by just using
set_current_state instead of __set_current_state, since we obviously
need the memory barrier.

| dm crypt: fix a possible hang due to race condition on exit
|
| A kernel thread executes __set_current_state(TASK_INTERRUPTIBLE),
| __add_wait_queue, spin_unlock_irq and then tests kthread_should_stop().
| It is possible that the processor reorders memory accesses so that
| kthread_should_stop() is executed before __set_current_state(). If
| such reordering happens, there is a possible race on thread
| termination: [...]

So this patch just reverts the aforementioned patch and changes the
__set_current_state(TASK_INTERRUPTIBLE) to set_current_state(...). This
fixes the crash and should also fix the potential hang.

Fixes: bcbd94ff481e ("dm crypt: fix a possible hang due to race condition on exit")
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 4382e33a 14-Sep-2016 Bart Van Assche <bart.vanassche@sandisk.com>

block, dm-crypt, btrfs: Introduce bio_flags()

Introduce the bio_flags() macro. Ensure that the second argument of
bio_set_op_attrs() only contains flags and no operation. This patch
does not change

block, dm-crypt, btrfs: Introduce bio_flags()

Introduce the bio_flags() macro. Ensure that the second argument of
bio_set_op_attrs() only contains flags and no operation. This patch
does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Chris Mason <clm@fb.com> (maintainer:BTRFS FILE SYSTEM)
Cc: Josef Bacik <jbacik@fb.com> (maintainer:BTRFS FILE SYSTEM)
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

show more ...


# 5d0be84e 30-Aug-2016 Eric Biggers <ebiggers@google.com>

dm crypt: fix free of bad values after tfm allocation failure

If crypt_alloc_tfms() had to allocate multiple tfms and it failed before
the last allocation, then it would call crypt_free_tfms() and c

dm crypt: fix free of bad values after tfm allocation failure

If crypt_alloc_tfms() had to allocate multiple tfms and it failed before
the last allocation, then it would call crypt_free_tfms() and could free
pointers from uninitialized memory -- due to the crypt_free_tfms() check
for non-zero cc->tfms[i]. Fix by allocating zeroed memory.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

show more ...


# 4e870e94 30-Aug-2016 Mikulas Patocka <mpatocka@redhat.com>

dm crypt: fix error with too large bios

When dm-crypt processes writes, it allocates a new bio in
crypt_alloc_buffer(). The bio is allocated from a bio set and it can
have at most BIO_MAX_PAGES vec

dm crypt: fix error with too large bios

When dm-crypt processes writes, it allocates a new bio in
crypt_alloc_buffer(). The bio is allocated from a bio set and it can
have at most BIO_MAX_PAGES vector entries, however the incoming bio can be
larger (e.g. if it was allocated by bcache). If the incoming bio is
larger, bio_alloc_bioset() fails and an error is returned.

To avoid the error, we test for a too large bio in the function
crypt_map() and use dm_accept_partial_bio() to split the bio.
dm_accept_partial_bio() trims the current bio to the desired size and
asks DM core to send another bio with the rest of the data.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # v3.16+

show more ...


# 0a83df6c 15-Jul-2016 Mikulas Patocka <mpatocka@redhat.com>

dm crypt: increase mempool reserve to better support swapping

Increase mempool size from 16 to 64 entries. This increase improves
swap on dm-crypt performance.

When swapping to dm-crypt, all avail

dm crypt: increase mempool reserve to better support swapping

Increase mempool size from 16 to 64 entries. This increase improves
swap on dm-crypt performance.

When swapping to dm-crypt, all available memory is temporarily exhausted
and dm-crypt can only use the mempool reserve.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 1eff9d32 05-Aug-2016 Jens Axboe <axboe@fb.com>

block: rename bio bi_rw to bi_opf

Since commit 63a4cc24867d, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually s

block: rename bio bi_rw to bi_opf

Since commit 63a4cc24867d, bio->bi_rw contains flags in the lower
portion and the op code in the higher portions. This means that
old code that relies on manually setting bi_rw is most likely
going to be broken. Instead of letting that brokeness linger,
rename the member, to force old and out-of-tree code to break
at compile time instead of at runtime.

No intended functional changes in this commit.

Signed-off-by: Jens Axboe <axboe@fb.com>

show more ...


# 350b5393 28-Jun-2016 Bart Van Assche <bart.vanassche@sandisk.com>

dm crypt: Fix sparse complaints

Avoid that sparse complains about assigning a __le64 value to a u64
variable. Remove the (u64) casts since these are superfluous. This
patch does not change the beh

dm crypt: Fix sparse complaints

Avoid that sparse complains about assigning a __le64 value to a u64
variable. Remove the (u64) casts since these are superfluous. This
patch does not change the behavior of the source code.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


# 28a8f0d3 05-Jun-2016 Mike Christie <mchristi@redhat.com>

block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH

To avoid confusion between REQ_OP_FLUSH, which is handled by
request_fn drivers, and upper layers requesting the block layer
perform a flush sequ

block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH

To avoid confusion between REQ_OP_FLUSH, which is handled by
request_fn drivers, and upper layers requesting the block layer
perform a flush sequence along with possibly a WRITE, this patch
renames REQ_FLUSH to REQ_PREFLUSH.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

show more ...


# e6047149 05-Jun-2016 Mike Christie <mchristi@redhat.com>

dm: use bio op accessors

Separate the op from the rq_flag_bits and have dm
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Rein

dm: use bio op accessors

Separate the op from the rq_flag_bits and have dm
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

show more ...


# 30187e1d 31-Jan-2016 Mike Snitzer <snitzer@redhat.com>

dm: rename target's per_bio_data_size to per_io_data_size

Request-based DM will also make use of per_bio_data_size.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# bbdb23b5 24-Jan-2016 Herbert Xu <herbert@gondor.apana.org.au>

dm crypt: Use skcipher and ahash

This patch replaces uses of ablkcipher with skcipher, and the long
obsolete hash interface with ahash.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


Revision tags: openbmc-20151123-1
# bcbd94ff 19-Nov-2015 Mikulas Patocka <mpatocka@redhat.com>

dm crypt: fix a possible hang due to race condition on exit

A kernel thread executes __set_current_state(TASK_INTERRUPTIBLE),
__add_wait_queue, spin_unlock_irq and then tests kthread_should_stop().

dm crypt: fix a possible hang due to race condition on exit

A kernel thread executes __set_current_state(TASK_INTERRUPTIBLE),
__add_wait_queue, spin_unlock_irq and then tests kthread_should_stop().
It is possible that the processor reorders memory accesses so that
kthread_should_stop() is executed before __set_current_state(). If such
reordering happens, there is a possible race on thread termination:

CPU 0:
calls kthread_should_stop()
it tests KTHREAD_SHOULD_STOP bit, returns false
CPU 1:
calls kthread_stop(cc->write_thread)
sets the KTHREAD_SHOULD_STOP bit
calls wake_up_process on the kernel thread, that sets the thread
state to TASK_RUNNING
CPU 0:
sets __set_current_state(TASK_INTERRUPTIBLE)
spin_unlock_irq(&cc->write_thread_wait.lock)
schedule() - and the process is stuck and never terminates, because the
state is TASK_INTERRUPTIBLE and wake_up_process on CPU 1 already
terminated

Fix this race condition by using a new flag DM_CRYPT_EXIT_THREAD to
signal that the kernel thread should exit. The flag is set and tested
while holding cc->write_thread_wait.lock, so there is no possibility of
racy access to the flag.

Also, remove the unnecessary set_task_state(current, TASK_RUNNING)
following the schedule() call. When the process was woken up, its state
was already set to TASK_RUNNING. Other kernel code also doesn't set the
state to TASK_RUNNING following schedule() (for example,
do_wait_for_common in completion.c doesn't do it).

Fixes: dc2676210c42 ("dm crypt: offload writes to thread")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


Revision tags: openbmc-20151118-1
# d0164adc 06-Nov-2015 Mel Gorman <mgorman@techsingularity.net>

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve". __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim. __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
__GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
into this category where kswapd will still be woken but atomic reserves
are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
helper gfpflags_allow_blocking() where possible. This is because
checking for __GFP_WAIT as was done historically now can trigger false
positives. Some exceptions like dm-crypt.c exist where the code intent
is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL. They may
now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


Revision tags: openbmc-20151104-1, v4.3, openbmc-20151102-1, openbmc-20151028-1
# 6f65985e 13-Sep-2015 Julia Lawall <Julia.Lawall@lip6.fr>

dm: drop NULL test before kmem_cache_destroy() and mempool_destroy()

Remove DM's unneeded NULL tests before calling these destroy functions,
now that they check for NULL, thanks to these v4.3 commit

dm: drop NULL test before kmem_cache_destroy() and mempool_destroy()

Remove DM's unneeded NULL tests before calling these destroy functions,
now that they check for NULL, thanks to these v4.3 commits:
3942d2991 ("mm/slab_common: allow NULL cache pointer in kmem_cache_destroy()")
4e3ca3e03 ("mm/mempool: allow NULL `pool' pointer in mempool_destroy()")

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ expression x; @@
-if (x != NULL)
\(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

show more ...


12345678910>>...26