#
178bd438 |
| 18-Apr-2017 |
Fam Zheng <famz@redhat.com> |
block: Walk bs->children carefully in bdrv_drain_recurse
The recursive bdrv_drain_recurse may run a block job completion BH that drops nodes. The coming changes will make that more likely, and a use-after-free would happen without this patch.
Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to QLIST_FOREACH_SAFE to prevent such a case from happening.
Since bdrv_unref accesses global state that is not protected by the AioContext lock, we cannot use bdrv_ref/bdrv_unref unconditionally. Fortunately the protection is not needed in IOThread because only main loop can modify a graph with the AioContext lock held.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-2-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
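The traversal pattern this commit describes can be illustrated with a small standalone sketch. It is not QEMU code: ToyNode, ToyChild, toy_ref/toy_unref and the leave_parent field are hypothetical stand-ins for BlockDriverState, BdrvChild, bdrv_ref/bdrv_unref and a completion BH that drops a node. The loop stashes the child pointer and the next edge before recursing, and takes a reference only when the caller says it runs in the main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct ToyNode ToyNode;
typedef struct ToyChild ToyChild;

struct ToyChild {              /* edge from a parent to one child node */
    ToyNode *node;
    ToyChild *next;
};

struct ToyNode {
    int refcnt;
    ToyChild *children;
    ToyNode *leave_parent;     /* if set, draining detaches us from it */
};

static int toy_live_nodes;     /* nodes allocated and not yet freed */

static void toy_unref(ToyNode *n);

static ToyNode *toy_node_new(void)
{
    ToyNode *n = calloc(1, sizeof(*n));
    assert(n != NULL);
    n->refcnt = 1;
    toy_live_nodes++;
    return n;
}

static void toy_ref(ToyNode *n)
{
    n->refcnt++;
}

/* Link child under parent; the edge takes over the caller's reference. */
static void toy_attach(ToyNode *parent, ToyNode *child)
{
    ToyChild *edge = malloc(sizeof(*edge));
    assert(edge != NULL);
    edge->node = child;
    edge->next = parent->children;
    parent->children = edge;
}

/* Unlink and free the edge to child, then drop the edge's reference. */
static void toy_detach(ToyNode *parent, ToyNode *child)
{
    ToyChild **pp = &parent->children;

    while (*pp && (*pp)->node != child) {
        pp = &(*pp)->next;
    }
    if (*pp) {
        ToyChild *edge = *pp;
        *pp = edge->next;
        free(edge);
        toy_unref(child);
    }
}

static void toy_unref(ToyNode *n)
{
    if (--n->refcnt > 0) {
        return;
    }
    while (n->children) {
        toy_detach(n, n->children->node);
    }
    toy_live_nodes--;
    free(n);
}

/* Returns the number of nodes visited. */
static int toy_drain(ToyNode *n, bool in_main_loop)
{
    int visited = 1;
    ToyChild *edge, *next;

    if (n->leave_parent) {     /* simulated completion BH drops this node */
        ToyNode *parent = n->leave_parent;
        n->leave_parent = NULL;
        toy_detach(parent, n);
    }
    for (edge = n->children; edge; edge = next) {
        ToyNode *child = edge->node;   /* stash: the edge may be freed */
        next = edge->next;             /* "safe" iteration */
        if (in_main_loop) {
            toy_ref(child);            /* keep child alive across the call */
        }
        visited += toy_drain(child, in_main_loop);
        if (in_main_loop) {
            toy_unref(child);
        }
    }
    return visited;
}
```

In this sketch, passing in_main_loop as false is only safe when nothing changes the graph during the walk, which mirrors the argument in the commit message. As with QLIST_FOREACH_SAFE, stashing the next edge protects against removal of the current edge only, not of the following one.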
|
#
e3e0003a |
| 11-Apr-2017 |
Max Reitz <mreitz@redhat.com> |
block/io: Comment out permission assertions
In case of block migration, there may be writes to BlockBackends that do not have the write permission taken. Before this issue is fixed (which is not going to happen in 2.9), we therefore cannot assert that this is the case.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170411145050.31290-1-mreitz@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
aa388ddc |
| 11-Apr-2017 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging
# gpg: Signature made Tue 11 Apr 2017 13:10:55 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/block-pull-request:
  sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE
  block: Fix bdrv_co_flush early return
  block: Use bdrv_coroutine_enter to start I/O coroutines
  qemu-io-cmds: Use bdrv_coroutine_enter
  blockjob: Use bdrv_coroutine_enter to start coroutine
  block: Introduce bdrv_coroutine_enter
  async: Introduce aio_co_enter
  coroutine: Extract qemu_aio_coroutine_enter
  tests/block-job-txn: Don't start block job before adding to txn
  block: Quiesce old aio context during bdrv_set_aio_context
  block: Make bdrv_parent_drained_begin/end public
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
49ca6259 |
| 10-Apr-2017 |
Fam Zheng <famz@redhat.com> |
block: Fix bdrv_co_flush early return
bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for BDRV_POLL_WHILE to work, even for the shortcut case where flush is unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix the variable declaration position.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
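The point about keeping the in-flight counter paired on the shortcut path can be sketched in a few lines. This is a toy model, not the QEMU function: ToyBds, needs_flush and toy_co_flush are hypothetical names, and the counter updates merely stand in for bdrv_inc_in_flight and bdrv_dec_in_flight.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int in_flight;       /* requests currently being processed */
    bool needs_flush;    /* false: the shortcut case */
    int flushes_done;
} ToyBds;

static int toy_co_flush(ToyBds *bs)
{
    int ret = 0;

    bs->in_flight++;                 /* stands in for bdrv_inc_in_flight() */

    if (!bs->needs_flush) {
        goto early_exit;             /* shortcut still passes the decrement */
    }
    bs->flushes_done++;
    bs->needs_flush = false;

early_exit:
    assert(bs->in_flight > 0);
    bs->in_flight--;                 /* stands in for bdrv_dec_in_flight() */
    return ret;
}
```

Whichever path is taken, the counter is incremented once and decremented once, so a poller that waits for it to reach zero sees a consistent value.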
|
#
e92f0e19 |
| 10-Apr-2017 |
Fam Zheng <famz@redhat.com> |
block: Use bdrv_coroutine_enter to start I/O coroutines
BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling the main context, which relies on the yielded coroutine continuing on bs->ctx before notifying qemu_aio_context with bdrv_wakeup().
Thus, using qemu_coroutine_enter to start I/O is wrong: if the coroutine is entered from the main loop, co->ctx will be qemu_aio_context. As a result of the "release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when both the main thread and the iothread access the same BDS:
      main loop                                iothread
-----------------------------------------------------------------------
blockdev_snapshot
  aio_context_acquire(bs->ctx)
                                        virtio_scsi_data_plane_handle_cmd
  bdrv_drained_begin(bs->ctx)
  bdrv_flush(bs)
    bdrv_co_flush(bs)                     aio_context_acquire(bs->ctx).enter
      ...
      qemu_coroutine_yield(co)
    BDRV_POLL_WHILE()
      aio_context_release(bs->ctx)
                                          aio_context_acquire(bs->ctx).return
                                            ...
                                              aio_co_wake(co)
      aio_poll(qemu_aio_context)              ...
        co_schedule_bh_cb()                   ...
          qemu_coroutine_enter(co)            ...

    /* (A) bdrv_co_flush(bs)              /* (B) I/O on bs */
          continues... */
                                          aio_context_release(bs->ctx)
      aio_context_acquire(bs->ctx)
Note that in the above case, bdrv_drained_begin() doesn't do the "release, poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.
Fix this by using bdrv_coroutine_enter, which enters the coroutine in the right context.
iotests 109 output is updated because the coroutine reenter flow during mirror job complete is different (now through co_queue_wakeup, instead of the unconditional qemu_coroutine_switch before), making the end job len different.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
|
#
14e9559f |
| 07-Apr-2017 |
Fam Zheng <famz@redhat.com> |
block: Make bdrv_parent_drained_begin/end public
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
|
#
5daf9b30 |
| 07-Apr-2017 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer fixes for 2.9.0-rc4
# gpg: Signature made Fri 07 Apr 2017 13:44:17 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
  mirror: Fix aio context of mirror_top_bs
  block: Assert attached child node has right aio context
  block: Fix unpaired aio_disable_external in external snapshot
  block: Don't check permissions for copy on read
  qemu-img: img_create does not support image-opts, fix docs
  iotests: Add mirror tests for orphaned source
  block/mirror: Fix use-after-free
  commit: Set commit_top_bs->total_sectors
  commit: Set commit_top_bs->aio_context
  block: Ignore guest dev permissions during incoming migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
1bf03e66 |
| 07-Apr-2017 |
Kevin Wolf <kwolf@redhat.com> |
block: Don't check permissions for copy on read
The assertion is currently failing. We can't require callers to have write permissions when all they are doing is a read, so comment it out. Add a FIXME comment in the code so that the check is re-enabled when copy on read is refactored into its own filter driver.
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
|
#
5bac3c39 |
| 13-Mar-2017 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer fixes for 2.9.0-rc1
# gpg: Signature made Mon 13 Mar 2017 11:53:16 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
  commit: Implement .bdrv_refresh_filename
  mirror: Implement .bdrv_refresh_filename
  block: Refresh filename after changing backing file
  commit: Implement bdrv_commit_top.bdrv_co_get_block_status
  block: Request block status from *file for BDRV_BLOCK_RAW
  block: Remove check_new_perm from bdrv_replace_child()
  migration: Document handling of bdrv_is_allocated() errors
  vvfat: React to bdrv_is_allocated() errors
  backup: React to bdrv_is_allocated() errors
  block: Drop unmaintained 'archipelago' driver
  file-posix: Consider max_segments for BlockLimits.max_transfer
  backup: allow target without .bdrv_get_info
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
b64aa441 |
| 08-Mar-2017 |
Kevin Wolf <kwolf@redhat.com> |
block: Request block status from *file for BDRV_BLOCK_RAW
This fixes bdrv_co_get_block_status() for the bdrv_mirror_top block driver, which must fall through to bs->backing instead of bs->file.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
|
#
b9fe3139 |
| 01-Mar-2017 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 28 Feb 2017 20:35:32 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (46 commits)
  block: Add Error parameter to bdrv_append()
  block: Add Error parameter to bdrv_set_backing_hd()
  block: Assertions for resize permission
  block: Assertions for write permissions
  block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
  tests: Remove FIXME comments
  nbd/server: Use real permissions for NBD exports
  migration/block: Use real permissions
  hmp: Request permissions in qemu-io
  commit: Add filter-node-name to block-commit
  mirror: Add filter-node-name to blockdev-mirror
  stream: Use real permissions in streaming block job
  mirror: Use real permissions in mirror/active commit block job
  blockjob: Factor out block_job_remove_all_bdrv()
  block: Allow backing file links in change_parent_backing_link()
  block: BdrvChildRole.attach/detach() callbacks
  block: Fix pending requests check in bdrv_append()
  backup: Use real permissions in backup block job
  commit: Use real permissions for HMP 'commit'
  commit: Use real permissions in commit block job
  ...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
c8f6d58e |
| 17-Feb-2017 |
Kevin Wolf <kwolf@redhat.com> |
block: Assertions for resize permission
This adds an assertion that ensures that the necessary resize permission has been granted before bdrv_truncate() is called.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
|
#
afa4b293 |
| 09-Feb-2017 |
Kevin Wolf <kwolf@redhat.com> |
block: Assertions for write permissions
This adds assertions that ensure that the necessary write permissions have been granted before someone attempts to write to a node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
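A permission assertion of the kind described here can be sketched as follows. The names are hypothetical (ToyChild, TOY_PERM_WRITE, TOY_PERM_RESIZE, toy_pwrite) and the bit values are arbitrary; the sketch only shows the idea of checking the permission bits stored on the parent-to-child link before a write is carried out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    TOY_PERM_WRITE  = 1u << 0,   /* arbitrary bit values for the sketch */
    TOY_PERM_RESIZE = 1u << 1,
};

typedef struct {
    unsigned perm;               /* permissions granted on this link */
    size_t bytes_written;
} ToyChild;

static bool toy_has_perm(const ToyChild *c, unsigned perm)
{
    return (c->perm & perm) == perm;
}

static void toy_pwrite(ToyChild *c, size_t bytes)
{
    /* A caller without the write permission is a programming error. */
    assert(toy_has_perm(c, TOY_PERM_WRITE));
    c->bytes_written += bytes;
}
```

Because the check is an assertion rather than an error return, a missing permission aborts the program, which is why a caller that legitimately writes without the permission forces the assertion to be relaxed.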
|
#
85c97ca7 |
| 09-Feb-2017 |
Kevin Wolf <kwolf@redhat.com> |
block: Pass BdrvChild to bdrv_aligned_preadv/pwritev and copy-on-read
This is where we want to check the permissions, so we need to have the BdrvChild around where they are stored.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
|
#
a0775e28 |
| 21-Feb-2017 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
Pull request
v2: * Rebased to resolve scsi conflicts
# gpg: Signature made Tue 21 Feb 2017 11:56:24 GMT
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request: (24 commits)
  coroutine-lock: make CoRwlock thread-safe and fair
  coroutine-lock: add mutex argument to CoQueue APIs
  coroutine-lock: place CoMutex before CoQueue in header
  test-aio-multithread: add performance comparison with thread-based mutexes
  coroutine-lock: add limited spinning to CoMutex
  coroutine-lock: make CoMutex thread-safe
  block: document fields protected by AioContext lock
  async: remove unnecessary inc/dec pairs
  aio-posix: partially inline aio_dispatch into aio_poll
  block: explicitly acquire aiocontext in aio callbacks that need it
  block: explicitly acquire aiocontext in bottom halves that need it
  block: explicitly acquire aiocontext in callbacks that need it
  block: explicitly acquire aiocontext in timers that need it
  aio: push aio_context_acquire/release down to dispatching
  qed: introduce qed_aio_start_io and qed_aio_next_io_cb
  blkdebug: reschedule coroutine on the AioContext it is running on
  coroutine-lock: reschedule coroutine on the AioContext it was running on
  nbd: convert to use qio_channel_yield
  io: make qio_channel_yield aware of AioContexts
  io: add methods to set I/O handlers on AioContext
  ...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
1ace7cea |
| 13-Feb-2017 |
Paolo Bonzini <pbonzini@redhat.com> |
coroutine-lock: add mutex argument to CoQueue APIs
All that CoQueue needs in order to become thread-safe is help from an external mutex. Add this to the API.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
#
b9e413dd |
| 13-Feb-2017 |
Paolo Bonzini <pbonzini@redhat.com> |
block: explicitly acquire aiocontext in aio callbacks that need it
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-16-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
#
1919631e |
| 13-Feb-2017 |
Paolo Bonzini <pbonzini@redhat.com> |
block: explicitly acquire aiocontext in bottom halves that need it
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-15-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
#
2f47da5f |
| 13-Feb-2017 |
Paolo Bonzini <pbonzini@redhat.com> |
block: explicitly acquire aiocontext in timers that need it
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170213135235.12274-13-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
#
c2b38b27 |
| 13-Feb-2017 |
Paolo Bonzini <pbonzini@redhat.com> |
block: move AioContext, QEMUTimer, main-loop to libqemuutil
AioContext is fairly self contained, the only dependency is QEMUTimer but that in turn doesn't need anything else. So move them out of block-obj-y to avoid introducing a dependency from io/ to block-obj-y.
main-loop and its dependency iohandler also need to be moved, because later in this series io/ will call iohandler_get_aio_context.
[Changed copyright "the QEMU team" to "other QEMU contributors" as suggested by Daniel Berrange and agreed by Paolo. --Stefan]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
#
02b351d8 |
| 17-Jan-2017 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Mon 16 Jan 2017 13:38:52 GMT
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
  async: optimize aio_bh_poll
  aio: document locking
  aio-win32: remove walking_handlers, protecting AioHandler list with list_lock
  aio-posix: remove walking_handlers, protecting AioHandler list with list_lock
  aio: tweak walking in dispatch phase
  aio-posix: split aio_dispatch_handlers out of aio_dispatch
  qemu-thread: optimize QemuLockCnt with futexes on Linux
  aio: make ctx->list_lock a QemuLockCnt, subsuming ctx->walking_bh
  qemu-thread: introduce QemuLockCnt
  aio: rename bh_lock to list_lock
  block: get rid of bdrv_io_unplugged_begin/end
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
|
#
8f90b5e9 |
| 29-Nov-2016 |
Paolo Bonzini <pbonzini@redhat.com> |
block: get rid of bdrv_io_unplugged_begin/end
bdrv_io_plug and bdrv_io_unplug are only called (via their BlockBackend equivalents) after starting asynchronous I/O. bdrv_drain is not going to be called while they are running, because---even if a coroutine runs for some reason---it will only drain in the next iteration of the event loop through bdrv_co_yield_to_drain.
So this mechanism is unnecessary; get rid of it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161129113334.605-1-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
#
76989f4f |
| 22-Nov-2016 |
Stefan Hajnoczi <stefanha@redhat.com> |
Merge remote-tracking branch 'kwolf/tags/for-upstream' into staging
Block layer patches for 2.8.0-rc1
# gpg: Signature made Tue 22 Nov 2016 03:55:38 PM GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* kwolf/tags/for-upstream:
  block: Pass unaligned discard requests to drivers
  block: Return -ENOTSUP rather than assert on unaligned discards
  block: Let write zeroes fallback work even with small max_transfer
  qcow2: Inform block layer about discard boundaries
Message-id: 1479830693-26676-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
|
#
3482b9bc |
| 17-Nov-2016 |
Eric Blake <eblake@redhat.com> |
block: Pass unaligned discard requests to drivers
Discard is advisory, so rounding the requests to alignment boundaries is never semantically wrong from the data that the guest sees. But at least the Dell Equallogic iSCSI SAN has an interesting property: its advertised discard alignment is 15M, yet it documents that discarding a sequence of 1M slices will eventually result in the 15M page being marked as discarded, and it is possible to observe which pages have been discarded.
Between commits 9f1963b and b8d0a980, we converted the block layer to a byte-based interface that ultimately ignores any unaligned head or tail based on the driver's advertised discard granularity, which means that qemu 2.7 refuses to pass any discard request smaller than 15M down to the Dell Equallogic hardware. This is a slight regression in behavior compared to earlier qemu, where a guest executing discards in power-of-2 chunks used to be able to get every page discarded, but is now left with various pages still allocated because the guest requests did not align with the hardware's 15M pages.
Since the SCSI specification says nothing about a minimum discard granularity, and only documents the preferred alignment, it is best if the block layer gives the driver every bit of information about discard requests, rather than rounding it to alignment boundaries early.
Rework the block layer discard algorithm to mirror the write zero algorithm: always peel off any unaligned head or tail and manage that in isolation, then do the bulk of the request on an aligned boundary. The fallback when the driver returns -ENOTSUP for an unaligned request is to silently ignore that portion of the discard request; but for devices that can pass the partial request all the way down to hardware, this can result in the hardware coalescing requests and discarding aligned pages after all.
Reported-by: Peter Lieven <pl@kamp.de>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
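The "peel off the unaligned head and tail, then handle the aligned bulk" arithmetic described in this commit can be sketched on its own. This is a simplified illustration, not the block layer's loop: ToySplit and toy_split_request are hypothetical names, and the sketch only computes the three piece sizes rather than issuing requests.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t head;   /* bytes before the first alignment boundary */
    uint64_t body;   /* aligned middle, a multiple of align */
    uint64_t tail;   /* bytes after the last alignment boundary */
} ToySplit;

static ToySplit toy_split_request(uint64_t offset, uint64_t bytes,
                                  uint64_t align)
{
    ToySplit s = { 0, 0, 0 };
    uint64_t misalign;

    assert(align > 0);
    misalign = offset % align;
    if (misalign) {
        s.head = align - misalign;
        if (s.head > bytes) {
            s.head = bytes;      /* request ends before the boundary */
        }
    }
    bytes -= s.head;
    s.tail = bytes % align;
    s.body = bytes - s.tail;
    return s;
}
```

With a 15M alignment, a 1M discard at offset 1M comes out as a head of 1M with no body or tail. Under the approach in this commit that head is still handed to the driver instead of being dropped by the block layer.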
|
#
b2f95fee |
| 17-Nov-2016 |
Eric Blake <eblake@redhat.com> |
block: Let write zeroes fallback work even with small max_transfer
Commit 443668ca rewrote the write_zeroes logic to guarantee that an unaligned request never crosses a cluster boundary. But in the rewrite, the new code assumed that at most one iteration would be needed to get to an alignment boundary.
However, it is easy to trigger an assertion failure: the Linux kernel limits loopback devices to advertise a max_transfer of only 64k. Any operation that requires falling back to writes rather than more efficient zeroing must obey max_transfer during that fallback, which means an unaligned head may require multiple iterations of the write fallbacks before reaching the aligned boundaries, when layering a format with clusters larger than 64k atop the protocol of file access to a loopback device.
Test case:
$ qemu-img create -f qcow2 -o cluster_size=1M file 10M
$ losetup /dev/loop2 /path/to/file
$ qemu-io -f qcow2 /dev/loop2
qemu-io> w 7m 1k
qemu-io> w -z 8003584 2093056
In fairness to Denis (as the original listed author of the culprit commit), the faulty logic for at most one iteration is probably all my fault in reworking his idea. But the solution is to restore what was in place prior to that commit: when dealing with an unaligned head or tail, iterate as many times as necessary while fragmenting the operation at max_transfer boundaries.
Reported-by: Ed Swierk <eswierk@skyportsystems.com>
CC: qemu-stable@nongnu.org
CC: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
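Why a single iteration is not enough can be checked with a little arithmetic. The sketch below is not the QEMU loop: toy_head_iterations is a hypothetical helper that counts how many fallback writes are needed to cover the unaligned head when each write is capped at max_transfer.

```c
#include <assert.h>
#include <stdint.h>

/* Count the writes needed to reach the next cluster boundary. */
static unsigned toy_head_iterations(uint64_t offset, uint64_t bytes,
                                    uint64_t cluster, uint64_t max_transfer)
{
    unsigned iterations = 0;
    uint64_t misalign, head;

    assert(cluster > 0 && max_transfer > 0);
    misalign = offset % cluster;
    head = misalign ? cluster - misalign : 0;
    if (head > bytes) {
        head = bytes;
    }
    while (head > 0) {
        /* Fragment the head at max_transfer boundaries. */
        uint64_t num = head < max_transfer ? head : max_transfer;
        head -= num;
        iterations++;
    }
    return iterations;
}
```

For the test case in this commit (offset 8003584, length 2093056, 1M clusters, 64k max_transfer) the head is 385024 bytes, so the sketch counts six writes before the 8M boundary is reached.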
|