History log of /openbmc/linux/net/ceph/osd_client.c (Results 201 – 225 of 767)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v3.15, v3.15-rc8, v3.15-rc7, v3.15-rc6, v3.15-rc5, v3.15-rc4, v3.15-rc3, v3.15-rc2, v3.15-rc1, v3.14, v3.14-rc8
# 8008ab10 24-Mar-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: return primary from ceph_calc_pg_acting()

In preparation for adding support for primary_temp, stop assuming
primaryness: add a primary out parameter to ceph_calc_pg_acting() and
change call

libceph: return primary from ceph_calc_pg_acting()

In preparation for adding support for primary_temp, stop assuming
primaryness: add a primary out parameter to ceph_calc_pg_acting() and
change call sites accordingly. Primary is now specified separately
from the order of osds in the set.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>

show more ...


Revision tags: v3.14-rc7
# a2505d63 13-Mar-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: split osdmap allocation and decode steps

Split osdmap allocation and initialization into a separate function,
ceph_osdmap_decode().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
R

libceph: split osdmap allocation and decode steps

Split osdmap allocation and initialization into a separate function,
ceph_osdmap_decode().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>

show more ...


Revision tags: v3.14-rc6, v3.14-rc5
# c647b8a8 25-Feb-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: add support for CEPH_OSD_OP_SETALLOCHINT osd op

This is primarily for rbd's benefit and is supposed to combat
fragmentation:

"... knowing that rbd images have a 4m size, librbd can pass a

libceph: add support for CEPH_OSD_OP_SETALLOCHINT osd op

This is primarily for rbd's benefit and is supposed to combat
fragmentation:

"... knowing that rbd images have a 4m size, librbd can pass a hint
that will let the osd do the xfs allocation size ioctl on new files so
that they are allocated in 1m or 4m chunks. We've seen cases where
users with rbd workloads have very high levels of fragmentation in xfs
and this would mitigate that and probably have a pretty nice
performance benefit."

SETALLOCHINT is considered advisory, so our backwards compatibility
mechanism here is to set FAILOK flag for all SETALLOCHINT ops.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>

show more ...


# 7b25bf5f 25-Feb-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: encode CEPH_OSD_OP_FLAG_* op flags

Encode ceph_osd_op::flags field so that it gets sent over the wire.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@i

libceph: encode CEPH_OSD_OP_FLAG_* op flags

Encode ceph_osd_op::flags field so that it gets sent over the wire.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>

show more ...


Revision tags: v3.14-rc4, v3.14-rc3
# 2045ceae 12-Feb-2014 stephen hemminger <stephen@networkplumber.org>

net: remove unnecessary return's

One of my pet coding style peeves is the practice of
adding extra return; at the end of function.
Kill several instances of this in network code.

I suppose some coc

net: remove unnecessary return's

One of my pet coding style peeves is the practice of
adding extra return; at the end of function.
Kill several instances of this in network code.

I suppose some coccinelle wizardy could do this automatically.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v3.14-rc2
# ff513ace 03-Feb-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: take map_sem for read in handle_reply()

Handling redirect replies requires both map_sem and request_mutex.
Taking map_sem unconditionally near the top of handle_reply() avoids
possible race

libceph: take map_sem for read in handle_reply()

Handling redirect replies requires both map_sem and request_mutex.
Taking map_sem unconditionally near the top of handle_reply() avoids
possible race conditions that arise from releasing request_mutex to be
able to acquire map_sem in redirect reply case. (Lock ordering is:
map_sem, request_mutex, crush_mutex.)

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


Revision tags: v3.14-rc1
# 0bbfdfe8 31-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: factor out logic from ceph_osdc_start_request()

Factor out logic from ceph_osdc_start_request() into a new helper,
__ceph_osdc_start_request(). ceph_osdc_start_request() now amounts to
tak

libceph: factor out logic from ceph_osdc_start_request()

Factor out logic from ceph_osdc_start_request() into a new helper,
__ceph_osdc_start_request(). ceph_osdc_start_request() now amounts to
taking locks and calling __ceph_osdc_start_request().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# c172ec5c 31-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: fix error handling in ceph_osdc_init()

msgpool_op_reply message pool isn't destroyed if workqueue construction
fails. Fix it.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Review

libceph: fix error handling in ceph_osdc_init()

msgpool_op_reply message pool isn't destroyed if workqueue construction
fails. Fix it.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 205ee118 27-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: follow redirect replies from osds

Follow redirect replies from osds, for details see ceph.git commit
fbbe3ad1220799b7bb00ea30fce581c5eadaf034.

v1 (current) version of redirect reply consis

libceph: follow redirect replies from osds

Follow redirect replies from osds, for details see ceph.git commit
fbbe3ad1220799b7bb00ea30fce581c5eadaf034.

v1 (current) version of redirect reply consists of oloc and oid, which
expands to pool, key, nspace, hash and oid. However, server-side code
that would populate anything other than pool doesn't exist yet, and
hence this commit adds support for pool redirects only. To make sure
that future server-side updates don't break us, we decode all fields
and, if any of key, nspace, hash or oid have a non-default value, error
out with "corrupt osd_op_reply ..." message.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 3c972c95 27-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid}

Rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid} before
introducing r_target_{oloc,oid} needed for redirects.

Signed-of

libceph: rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid}

Rename ceph_osd_request::r_{oloc,oid} to r_base_{oloc,oid} before
introducing r_target_{oloc,oid} needed for redirects.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 17a13e40 27-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: follow {read,write}_tier fields on osd request submission

Overwrite ceph_osd_request::r_oloc.pool with read_tier for read ops and
write_tier for write and read+write ops (aka basic tiering

libceph: follow {read,write}_tier fields on osd request submission

Overwrite ceph_osd_request::r_oloc.pool with read_tier for read ops and
write_tier for write and read+write ops (aka basic tiering support).
{read,write}_tier are part of pg_pool_t since v9. This commit bumps
our pg_pool_t decode compat version from v7 to v9, all new fields
except for {read,write}_tier are ignored.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 7c13cb64 27-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: replace ceph_calc_ceph_pg() with ceph_oloc_oid_to_pg()

Switch ceph_calc_ceph_pg() to new oloc and oid abstractions and rename
it to ceph_oloc_oid_to_pg() to make its purpose more clear.

Si

libceph: replace ceph_calc_ceph_pg() with ceph_oloc_oid_to_pg()

Switch ceph_calc_ceph_pg() to new oloc and oid abstractions and rename
it to ceph_oloc_oid_to_pg() to make its purpose more clear.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 4295f221 27-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: introduce and start using oid abstraction

In preparation for tiering support, which would require having two
(base and target) object names for each osd request and also copying
those names

libceph: introduce and start using oid abstraction

In preparation for tiering support, which would require having two
(base and target) object names for each osd request and also copying
those names around, introduce struct ceph_object_id (oid) and a couple
helpers to facilitate those copies and encapsulate the fact that object
name is not necessarily a NUL-terminated string.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 2d0ebc5d 27-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: rename MAX_OBJ_NAME_SIZE to CEPH_MAX_OID_NAME_LEN

In preparation for adding oid abstraction, rename MAX_OBJ_NAME_SIZE to
CEPH_MAX_OID_NAME_LEN.

Signed-off-by: Ilya Dryomov <ilya.dryomov@in

libceph: rename MAX_OBJ_NAME_SIZE to CEPH_MAX_OID_NAME_LEN

In preparation for adding oid abstraction, rename MAX_OBJ_NAME_SIZE to
CEPH_MAX_OID_NAME_LEN.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 22116525 27-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: start using oloc abstraction

Instead of relying on pool fields in ceph_file_layout (for mapping) and
ceph_pg (for enconding), start using ceph_object_locator (oloc)
abstraction. Note that

libceph: start using oloc abstraction

Instead of relying on pool fields in ceph_file_layout (for mapping) and
ceph_pg (for enconding), start using ceph_object_locator (oloc)
abstraction. Note that userspace oloc currently consists of pool, key,
nspace and hash fields, while this one contains only a pool. This is
OK, because at this point we only send (i.e. encode) olocs and never
have to receive (i.e. decode) them.

This makes keeping a copy of ceph_file_layout in every osd request
unnecessary, so ceph_osd_request::r_file_layout field is nuked.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


Revision tags: v3.13
# 0b4af2e8 16-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: dout() is missing a newline

Add a missing newline to a dout() in __reset_osd().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>


Revision tags: v3.13-rc8
# f2be82b0 09-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: fix preallocation check in get_reply()

The check that makes sure that we have enough memory allocated to read
in the entire header of the message in question is currently busted.
It compare

libceph: fix preallocation check in get_reply()

The check that makes sure that we have enough memory allocated to read
in the entire header of the message in question is currently busted.
It compares front_len of the incoming message with iov_len field of
ceph_msg::front structure, which is used primarily to indicate the
amount of data already read in, and not the size of the allocated
buffer. Under certain conditions (e.g. a short read from a socket
followed by that socket's shutdown and owning ceph_connection reset)
this results in a warning similar to

[85688.975866] libceph: get_reply front 198 > preallocated 122 (4#0)

and, through another bug, leads to forever hung tasks and forced
reboots. Fix this by comparing front_len with front_alloc_len field of
struct ceph_msg, which stores the actual size of the buffer.

Fixes: http://tracker.ceph.com/issues/5425

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 3f0a4ac5 09-Jan-2014 Ilya Dryomov <ilya.dryomov@inktank.com>

libceph: rename front to front_len in get_reply()

Rename front local variable to front_len in get_reply() to make its
purpose more clear.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Revi

libceph: rename front to front_len in get_reply()

Rename front local variable to front_len in get_reply() to make its
purpose more clear.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


Revision tags: v3.13-rc7, v3.13-rc6, v3.13-rc5, v3.13-rc4
# 9a1ea2db 10-Dec-2013 Josh Durgin <josh.durgin@inktank.com>

libceph: resend all writes after the osdmap loses the full flag

With the current full handling, there is a race between osds and
clients getting the first map marked full. If the osd wins, it will
r

libceph: resend all writes after the osdmap loses the full flag

With the current full handling, there is a race between osds and
clients getting the first map marked full. If the osd wins, it will
return -ENOSPC to any writes, but the client may already have writes
in flight. This results in the client getting the error and
propagating it up the stack. For rbd, the block layer turns this into
EIO, which can cause corruption in filesystems above it.

To avoid this race, osds are being changed to drop writes that came
from clients with an osdmap older than the last osdmap marked full.
In order for this to work, clients must resend all writes after they
encounter a full -> not full transition in the osdmap. osds will wait
for an updated map instead of processing a request from a client with
a newer map, so resent writes will not be dropped by the osd unless
there is another not full -> full transition.

This approach requires both osds and clients to be fixed to avoid the
race. Old clients talking to osds with this fix may hang instead of
returning EIO and potentially corrupting an fs. New clients talking to
old osds have the same behavior as before if they encounter this race.

Fixes: http://tracker.ceph.com/issues/6938

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>

show more ...


Revision tags: v3.13-rc3
# d29adb34 02-Dec-2013 Josh Durgin <josh.durgin@inktank.com>

libceph: block I/O when PAUSE or FULL osd map flags are set

The PAUSEWR and PAUSERD flags are meant to stop the cluster from
processing writes and reads, respectively. The FULL flag is set when
the

libceph: block I/O when PAUSE or FULL osd map flags are set

The PAUSEWR and PAUSERD flags are meant to stop the cluster from
processing writes and reads, respectively. The FULL flag is set when
the cluster determines that it is out of space, and will no longer
process writes. PAUSEWR and PAUSERD are purely client-side settings
already implemented in userspace clients. The osd does nothing special
with these flags.

When the FULL flag is set, however, the osd responds to all writes
with -ENOSPC. For cephfs, this makes sense, but for rbd the block
layer translates this into EIO. If a cluster goes from full to
non-full quickly, a filesystem on top of rbd will not behave well,
since some writes succeed while others get EIO.

Fix this by blocking any writes when the FULL flag is set in the osd
client. This is the same strategy used by userspace, so apply it by
default. A follow-on patch makes this configurable.

__map_request() is called to re-target osd requests in case the
available osds changed. Add a paused field to a ceph_osd_request, and
set it whenever an appropriate osd map flag is set. Avoid queueing
paused requests in __map_request(), but force them to be resent if
they become unpaused.

Also subscribe to the next osd map from the monitor if any of these
flags are set, so paused requests can be unblocked as soon as
possible.

Fixes: http://tracker.ceph.com/issues/6079

Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>

show more ...


Revision tags: v3.13-rc2
# 37c89bde 27-Nov-2013 Li Wang <liwang@ubuntukylin.com>

ceph: Add necessary clean up if invalid reply received in handle_reply()

Wake up possible waiters, invoke the call back if any, unregister the request

Signed-off-by: Li Wang <liwang@ubuntukylin.com

ceph: Add necessary clean up if invalid reply received in handle_reply()

Wake up possible waiters, invoke the call back if any, unregister the request

Signed-off-by: Li Wang <liwang@ubuntukylin.com>
Signed-off-by: Yunchuan Wen <yunchuanwen@ubuntukylin.com>
Signed-off-by: Sage Weil <sage@inktank.com>

show more ...


Revision tags: v3.13-rc1, v3.12, v3.12-rc7, v3.12-rc6, v3.12-rc5, v3.12-rc4, v3.12-rc3, v3.12-rc2, v3.12-rc1, v3.11
# dd935f44 28-Aug-2013 Josh Durgin <josh.durgin@inktank.com>

libceph: add function to ensure notifies are complete

Without a way to flush the osd client's notify workqueue, a watch
event that is unregistered could continue receiving callbacks
indefinitely.

U

libceph: add function to ensure notifies are complete

Without a way to flush the osd client's notify workqueue, a watch
event that is unregistered could continue receiving callbacks
indefinitely.

Unregistering the event simply means no new notifies are added to the
queue, but there may still be events in the queue that will call the
watch callback for the event. If the queue is flushed after the event
is unregistered, the caller can be sure no more watch callbacks will
occur for the canceled watch.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@linaro.org>

show more ...


Revision tags: v3.11-rc7, v3.11-rc6
# dbcae088 15-Aug-2013 Dan Carpenter <dan.carpenter@oracle.com>

libceph: create_singlethread_workqueue() doesn't return ERR_PTRs

create_singlethread_workqueue() returns NULL on error, and it doesn't
return ERR_PTRs.

I tweaked the error handling a little to be c

libceph: create_singlethread_workqueue() doesn't return ERR_PTRs

create_singlethread_workqueue() returns NULL on error, and it doesn't
return ERR_PTRs.

I tweaked the error handling a little to be consistent with earlier in
the function.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# b72e19b9 15-Aug-2013 Dan Carpenter <dan.carpenter@oracle.com>

libceph: potential NULL dereference in ceph_osdc_handle_map()

There are two places where we read "nr_maps" if both of them are set to
zero then we would hit a NULL dereference here.

Signed-off-by:

libceph: potential NULL dereference in ceph_osdc_handle_map()

There are two places where we read "nr_maps" if both of them are set to
zero then we would hit a NULL dereference here.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


# 18741196 15-Aug-2013 Dan Carpenter <dan.carpenter@oracle.com>

libceph: fix error handling in handle_reply()

We've tried to fix the error paths in this function before, but there
is still a hidden goto in the ceph_decode_need() macro which goes to the
wrong pla

libceph: fix error handling in handle_reply()

We've tried to fix the error paths in this function before, but there
is still a hidden goto in the ceph_decode_need() macro which goes to the
wrong place. We need to release the "req" and unlock a mutex before
returning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sage Weil <sage@inktank.com>

show more ...


12345678910>>...31