History log of /openbmc/linux/net/ceph/osd_client.c (Results 51 – 75 of 767)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 98c4bfe9 17-Oct-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: check reply num_data_items in setup_request_data()

setup_request_data() adds message data items to both request and reply
messages, but only checks request num_data_items before proceeding

libceph: check reply num_data_items in setup_request_data()

setup_request_data() adds message data items to both request and reply
messages, but only checks request num_data_items before proceeding with
the loop. This is wrong because if an op doesn't have any request data
items but has a reply data item (e.g. read), a duplicate data item gets
added to the message on every resend attempt.

This went unnoticed for years but now that message data items are
preallocated, it promptly crashes in ceph_msg_data_add(). Amend the
signature to make it clear that setup_request_data() operates on both
request and reply messages. Also, remove data_len assert -- we have
another one in prepare_write_message().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# 0d9c1ab3 15-Oct-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: preallocate message data items

Currently message data items are allocated with ceph_msg_data_create()
in setup_request_data() inside send_request(). send_request() has never
been allowed t

libceph: preallocate message data items

Currently message data items are allocated with ceph_msg_data_create()
in setup_request_data() inside send_request(). send_request() has never
been allowed to fail, so each allocation is followed by a BUG_ON:

data = ceph_msg_data_create(...);
BUG_ON(!data);

It's been this way since support for multiple message data items was
added in commit 6644ed7b7e04 ("libceph: make message data be a pointer")
in 3.10.

There is no reason to delay the allocation of message data items until
the last possible moment and we certainly don't need a linked list of
them as they are only ever appended to the end and never erased. Make
ceph_msg_new2() take max_data_items and adapt the rest of the code.

Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# 26f887e0 15-Oct-2018 Ilya Dryomov <idryomov@gmail.com>

libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls

The current requirement is that ceph_osdc_alloc_messages() should be
called after oid and oloc are known. In preparation for preallocating

libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls

The current requirement is that ceph_osdc_alloc_messages() should be
called after oid and oloc are known. In preparation for preallocating
message data items, move ceph_osdc_alloc_messages() further down, so
that it is called when OSD op codes are known.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# 39e58c34 15-Oct-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: introduce alloc_watch_request()

ceph_osdc_alloc_messages() call will be moved out of
alloc_linger_request() in the next commit, which means that
ceph_osdc_watch() will need to call ceph_osd

libceph: introduce alloc_watch_request()

ceph_osdc_alloc_messages() call will be moved out of
alloc_linger_request() in the next commit, which means that
ceph_osdc_watch() will need to call ceph_osdc_alloc_messages()
twice. Add a helper for that.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


Revision tags: v4.18.14
# 81c65213 11-Oct-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: assign cookies in linger_submit()

Register lingers directly in linger_submit(). This avoids allocating
memory for notify pagelist while holding osdc->lock and simplifies both
callers of li

libceph: assign cookies in linger_submit()

Register lingers directly in linger_submit(). This avoids allocating
memory for notify pagelist while holding osdc->lock and simplifies both
callers of linger_submit().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# 3b83f60d 11-Oct-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get()

ceph_msgpool_get() can fall back to ceph_msg_new() when it is asked for
a message whose front portion is larger than pool->front_len.

libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get()

ceph_msgpool_get() can fall back to ceph_msg_new() when it is asked for
a message whose front portion is larger than pool->front_len. However
the caller always passes 0, effectively disabling that code path. The
allocation goes to the message pool and returns a message with a front
that is smaller than requested, setting us up for a crash.

One example of this is a directory with a large number of snapshots.
If its snap context doesn't fit, we oops in encode_request_partial().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# 41a264e1 13-Oct-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: no need to call osd_req_opcode_valid() in osd_req_encode_op()

Any uninitialized or unknown ops will be caught by the default clause
anyway.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>


Revision tags: v4.18.13, v4.18.12, v4.18.11
# 89486833 28-Sep-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist()

Because send_mds_reconnect() wants to send a message with a pagelist
and pass the ownership to the messenger, ceph_msg_data_a

libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist()

Because send_mds_reconnect() wants to send a message with a pagelist
and pass the ownership to the messenger, ceph_msg_data_add_pagelist()
consumes a ref which is then put in ceph_msg_data_destroy(). This
makes managing pagelists in the OSD client (where they are wrapped in
ceph_osd_data) unnecessarily hard because the handoff only happens in
ceph_osdc_start_request() instead of when the pagelist is passed to
ceph_osd_data_pagelist_init(). I counted several memory leaks on
various error paths.

Fix up ceph_msg_data_add_pagelist() and carry a pagelist ref in
ceph_osd_data.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# 33165d47 28-Sep-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: introduce ceph_pagelist_alloc()

struct ceph_pagelist cannot be embedded into anything else because it
has its own refcount. Merge allocation and initialization together.

Signed-off-by: Il

libceph: introduce ceph_pagelist_alloc()

struct ceph_pagelist cannot be embedded into anything else because it
has its own refcount. Merge allocation and initialization together.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# 24639ce5 26-Sep-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: osd_req_op_cls_init() doesn't need to take opcode

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>


Revision tags: v4.18.10, v4.18.9, v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11
# 6daca13d 27-Jul-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: add authorizer challenge

When a client authenticates with a service, an authorizer is sent with
a nonce to the service (ceph_x_authorize_[ab]) and the service responds
with a mutation of th

libceph: add authorizer challenge

When a client authenticates with a service, an authorizer is sent with
a nonce to the service (ceph_x_authorize_[ab]) and the service responds
with a mutation of that nonce (ceph_x_authorize_reply). This lets the
client verify the service is who it says it is but it doesn't protect
against a replay: someone can trivially capture the exchange and reuse
the same authorizer to authenticate themselves.

Allow the service to reject an initial authorizer with a random
challenge (ceph_x_authorize_challenge). The client then has to respond
with an updated authorizer proving they are able to decrypt the
service's challenge and that the new authorizer was produced for this
specific connection instance.

The accepting side requires this challenge and response unconditionally
if the client side advertises they have CEPHX_V2 feature bit.

This addresses CVE-2018-1128.

Link: http://tracker.ceph.com/issues/24836
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>

show more ...


Revision tags: v4.17.10, v4.17.9, v4.17.8, v4.17.7
# fac02ddf 13-Jul-2018 Arnd Bergmann <arnd@arndb.de>

libceph: use timespec64 for r_mtime

The request mtime field is used all over ceph, and is currently
represented as a 'timespec' structure in Linux. This changes it to
timespec64 to allow times beyon

libceph: use timespec64 for r_mtime

The request mtime field is used all over ceph, and is currently
represented as a 'timespec' structure in Linux. This changes it to
timespec64 to allow times beyond 2038, modifying all users at the
same time.

[ Remove now redundant ts variable in writepage_nounlock(). ]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


Revision tags: v4.17.6, v4.17.5, v4.17.4, v4.17.3
# 6d54228f 25-Jun-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: make ceph_osdc_notify{,_ack}() payload_len u32

The wire format dictates that payload_len fits into 4 bytes. Take u32
instead of size_t to reflect that.

All callers pass a small integer, s

libceph: make ceph_osdc_notify{,_ack}() payload_len u32

The wire format dictates that payload_len fits into 4 bytes. Take u32
instead of size_t to reflect that.

All callers pass a small integer, so no changes required.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


Revision tags: v4.17.2, v4.17.1, v4.17
# acafe7e3 08-May-2018 Kees Cook <keescook@chromium.org>

treewide: Use struct_size() for kmalloc()-family

One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with me

treewide: Use struct_size() for kmalloc()-family

One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
int stuff;
void *entry[];
};

instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);

This patch makes the changes for kmalloc()-family (and kvmalloc()-family)
uses. It was done via automatic conversion with manual review for the
"CHECKME" non-standard cases noted below, using the following Coccinelle
script:

// pkey_cache = kmalloc(sizeof *pkey_cache + tprops->pkey_tbl_len *
// sizeof *pkey_cache->table, GFP_KERNEL);
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@

- alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT), GFP)
+ alloc(struct_size(VAR, ELEMENT, COUNT), GFP)

// mr = kzalloc(sizeof(*mr) + m * sizeof(mr->map[0]), GFP_KERNEL);
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
identifier VAR, ELEMENT;
expression COUNT;
@@

- alloc(sizeof(*VAR) + COUNT * sizeof(VAR->ELEMENT[0]), GFP)
+ alloc(struct_size(VAR, ELEMENT, COUNT), GFP)

// Same pattern, but can't trivially locate the trailing element name,
// or variable name.
@@
identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
expression GFP;
expression SOMETHING, COUNT, ELEMENT;
@@

- alloc(sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP)
+ alloc(CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP)

Signed-off-by: Kees Cook <keescook@chromium.org>

show more ...


# a86f009f 23-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: allocate the locator string with GFP_NOFAIL

calc_target() isn't supposed to fail with anything but POOL_DNE, in
which case we report that the pool doesn't exist and fail the request
with -E

libceph: allocate the locator string with GFP_NOFAIL

calc_target() isn't supposed to fail with anything but POOL_DNE, in
which case we report that the pool doesn't exist and fail the request
with -ENOENT. Doing this for -ENOMEM is at the very least confusing
and also harmful -- as the preceding requests complete, a short-lived
locator string allocation is likely to succeed after a wait.

(We used to call ceph_object_locator_to_pg() for a pi lookup. In
theory that could fail with -ENOENT, hence the "ret != -ENOENT" warning
being removed.)

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


# c843d13c 30-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: make abort_on_full a per-osdc setting

The intent behind making it a per-request setting was that it would be
set for writes, but not for reads. As it is, the flag is set for all
fs/ceph re

libceph: make abort_on_full a per-osdc setting

The intent behind making it a per-request setting was that it would be
set for writes, but not for reads. As it is, the flag is set for all
fs/ceph requests except for pool perm check stat request (technically
a read).

ceph_osdc_abort_on_full() skips reads since the previous commit and
I don't see a use case for marking individual requests.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

show more ...


# 690f951d 30-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: don't abort reads in ceph_osdc_abort_on_full()

Don't consider reads for aborting and use ->base_oloc instead of
->target_oloc, as done in __submit_request().

Strictly speaking, we shouldn'

libceph: don't abort reads in ceph_osdc_abort_on_full()

Don't consider reads for aborting and use ->base_oloc instead of
->target_oloc, as done in __submit_request().

Strictly speaking, we shouldn't be aborting FULL_TRY/FULL_FORCE writes
either. But, there is an inconsistency in FULL_TRY/FULL_FORCE handling
on the OSD side [1], so given that neither of these is used in the
kernel client, leave it for when the OSD behaviour is sorted out.

[1] http://tracker.ceph.com/issues/24339

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

show more ...


# 6001567c 22-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: avoid a use-after-free during map check

Sending map check after complete_request() was called is not only
useless, but can lead to a use-after-free as req->r_kref decrement in
__complete_re

libceph: avoid a use-after-free during map check

Sending map check after complete_request() was called is not only
useless, but can lead to a use-after-free as req->r_kref decrement in
__complete_request() races with map check code.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

show more ...


# 29e87820 17-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: don't warn if req->r_abort_on_full is set

The "FULL or reached pool quota" warning is there to explain paused
requests. No need to emit it if pausing isn't going to occur.

Signed-off-by:

libceph: don't warn if req->r_abort_on_full is set

The "FULL or reached pool quota" warning is there to explain paused
requests. No need to emit it if pausing isn't going to occur.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

show more ...


# 4eea0fef 16-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: use for_each_request() in ceph_osdc_abort_on_full()

Scanning the trees just to see if there is anything to abort is
unnecessary -- all that is needed here is to update the epoch barrier
fir

libceph: use for_each_request() in ceph_osdc_abort_on_full()

Scanning the trees just to see if there is anything to abort is
unnecessary -- all that is needed here is to update the epoch barrier
first, before we start aborting. Simplify and do the update inside the
loop before calling abort_request() for the first time.

The switch to for_each_request() also fixes a bug: homeless requests
weren't even considered for aborting.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

show more ...


# 88bc1922 21-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: defer __complete_request() to a workqueue

In the common case, req->r_callback is called by handle_reply() on the
ceph-msgr worker thread without any locks. If handle_reply() fails, it
is c

libceph: defer __complete_request() to a workqueue

In the common case, req->r_callback is called by handle_reply() on the
ceph-msgr worker thread without any locks. If handle_reply() fails, it
is called with both osd->lock and osdc->lock. In the map check case,
it is called with just osdc->lock but held for write. Finally, if the
request is aborted because of -ENOSPC or by ceph_osdc_abort_requests(),
it is called directly on the submitter's thread, again with both locks.

req->r_callback on the submitter's thread is relatively new (introduced
in 4.12) and ripe for deadlocks -- e.g. writeback worker thread waiting
on itself:

inode_wait_for_writeback+0x26/0x40
evict+0xb5/0x1a0
iput+0x1d2/0x220
ceph_put_wrbuffer_cap_refs+0xe0/0x2c0 [ceph]
writepages_finish+0x2d3/0x410 [ceph]
__complete_request+0x26/0x60 [libceph]
complete_request+0x2e/0x70 [libceph]
__submit_request+0x256/0x330 [libceph]
submit_request+0x2b/0x30 [libceph]
ceph_osdc_start_request+0x25/0x40 [libceph]
ceph_writepages_start+0xdfe/0x1320 [ceph]
do_writepages+0x1f/0x70
__writeback_single_inode+0x45/0x330
writeback_sb_inodes+0x26a/0x600
__writeback_inodes_wb+0x92/0xc0
wb_writeback+0x274/0x330
wb_workfn+0x2d5/0x3b0

Defer __complete_request() to a workqueue in all failure cases so it's
never on the same thread as ceph_osdc_start_request() and always called
with no locks held.

Link: http://tracker.ceph.com/issues/23978
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

show more ...


# 26df726b 21-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: move more code into __complete_request()

Move req->r_completion wake up and req->r_kref decrement into
__complete_request().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff

libceph: move more code into __complete_request()

Move req->r_completion wake up and req->r_kref decrement into
__complete_request().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

show more ...


# 0d09c57d 18-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: no need to call flush_workqueue() before destruction

destroy_workqueue() drains the workqueue before proceeding with
destruction.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>


# 66850df5 15-May-2018 Ilya Dryomov <idryomov@gmail.com>

libceph: introduce ceph_osdc_abort_requests()

This will be used by the filesystem for "umount -f".

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>


# fe943d50 11-Apr-2018 Chengguang Xu <cgxu519@gmx.com>

libceph, rbd: add error handling for osd_req_op_cls_init()

Add proper error handling for osd_req_op_cls_init() to replace
BUG_ON statement when failing from memory allocation.

Signed-off-by: Chengg

libceph, rbd: add error handling for osd_req_op_cls_init()

Add proper error handling for osd_req_op_cls_init() to replace
BUG_ON statement when failing from memory allocation.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

show more ...


12345678910>>...31