#
60f782b6 |
| 16-May-2023 |
Stefan Hajnoczi <stefanha@redhat.com> |
aio: remove aio_disable_external() API
All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handle
aio: remove aio_disable_external() API
All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code.
Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier().
The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external().
Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch:
@@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)
@@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)
Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
show more ...
|
Revision tags: v8.0.0, v7.2.0, v7.0.0 |
|
#
fc879646 |
| 23-Feb-2022 |
Stefan Hajnoczi <stefanha@redhat.com> |
aio-posix: fix spurious ->poll_ready() callbacks in main loop
When ->poll() succeeds the AioHandler is placed on the ready list with revents set to the magic value 0. This magic value causes aio_dis
aio-posix: fix spurious ->poll_ready() callbacks in main loop
When ->poll() succeeds the AioHandler is placed on the ready list with revents set to the magic value 0. This magic value causes aio_dispatch_handler() to invoke ->poll_ready() instead of ->io_read() for G_IO_IN or ->io_write() for G_IO_OUT.
This magic value 0 hack works for the IOThread where AioHandlers are placed on ->ready_list and processed by aio_dispatch_ready_handlers(). It does not work for the main loop where all AioHandlers are processed by aio_dispatch_handlers(), even those that are not ready and have a revents value of 0.
As a result the main loop invokes ->poll_ready() on AioHandlers that are not ready. These spurious ->poll_ready() calls waste CPU cycles and could lead to crashes if the code assumes ->poll() must have succeeded before ->poll_ready() is called (a reasonable asumption but I haven't seen it in practice).
Stop using revents to track whether ->poll_ready() will be called on an AioHandler. Introduce a separate AioHandler->poll_ready field instead. This eliminates spurious ->poll_ready() calls in the main loop.
Fixes: 826cc32423db2a99d184dbf4f507c737d7e7a4ae ("aio-posix: split poll check from ready handler") Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reported-by: Jason Wang <jasowang@redhat.com> Tested-by: Jason Wang <jasowang@redhat.com> Message-id: 20220223155703.136833-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
show more ...
|
Revision tags: v6.2.0 |
|
#
826cc324 |
| 07-Dec-2021 |
Stefan Hajnoczi <stefanha@redhat.com> |
aio-posix: split poll check from ready handler
Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significa
aio-posix: split poll check from ready handler
Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time.
For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling.
By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring.
The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before:
168k IOPS, IOThread syscalls:
9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32
174k IOPS (+3.6%), IOThread syscalls:
9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32
Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring.
As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com
[Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
show more ...
|
Revision tags: v6.1.0, v5.2.0, v5.0.0 |
|
#
52a96afa |
| 20-Mar-2020 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging update syscall numbers to linux 5.5 (with scripts) add clock_gettime64/clock_settime64
Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging update syscall numbers to linux 5.5 (with scripts) add clock_gettime64/clock_settime64 add AT_EXECFN v4: restore syscall.tbl series but remove vsyscall series v3: remove syscall.tbl series v2: guard copy_to_user_timezone() with TARGET_NR_gettimeofday remove "Support futex_time64" patch guard sys_futex with TARGET_NR_exit # gpg: Signature made Fri 20 Mar 2020 15:23:29 GMT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/linux-user-for-5.0-pull-request: (32 commits) linux-user, openrisc: sync syscall numbers with kernel v5.5 linux-user, nios2: sync syscall numbers with kernel v5.5 linux-user, aarch64: sync syscall numbers with kernel v5.5 scripts: add a script to generate syscall_nr.h linux-user,mips: update syscall-args-o32.c.inc linux-user,mips: move content of mips_syscall_args linux-user: update syscall.tbl from linux 0bf999f9c5e7 linux-user, scripts: add a script to update syscall.tbl linux-user, mips64: add syscall table generation support linux-user, mips: add syscall table generation support linux-user, x86_64: add syscall table generation support linux-user, i386: add syscall table generation support linux-user, x86_64, i386: cleanup TARGET_NR_arch_prctl linux-user, sparc, sparc64: add syscall table generation support linux-user, s390x: add syscall table generation support linux-user, s390x: remove syscall definitions for !TARGET_S390X linux-user, ppc: add syscall table generation support linux-user, arm: add syscall table generation support linux-user, microblaze: add syscall table generation support linux-user, sh4: add syscall table generation support ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
ce73691e |
| 19-Mar-2020 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging Pull request # gpg: Signature made Tue 17 Mar 2020 23:22:33 GMT # gpg: using RSA
Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging Pull request # gpg: Signature made Tue 17 Mar 2020 23:22:33 GMT # gpg: using RSA key F9B7ABDBBCACDF95BE76CBD07DEF8106AAFC390E # gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" [full] # Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB # Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E * remotes/jnsnow/tags/ide-pull-request: hw/ide: Remove unneeded inclusion of hw/ide.h hw/ide: Move MAX_IDE_DEVS define to hw/ide/internal.h hw/ide: Do ide_drive_get() within pci_ide_create_devs() hw/ide/pci.c: Coding style update to fix checkpatch errors hw/ide: Remove now unneded #include "hw/pci/pci.h" from hw/ide.h hw/ide: Get rid of piix4_init function hw/isa/piix4.c: Introduce variable to store devfn hw/ide: Get rid of piix3_init functions hd-geo-test: Clean up use of buf[] in create_qcow2_with_mbr() via-ide: always use legacy IRQ 14/15 routing via-ide: allow guests to write to PCI_CLASS_PROG via-ide: initialise IDE controller in legacy mode via-ide: ensure that PCI_INTERRUPT_LINE is hard-wired to its default value pci: Honour wmask when resetting PCI_INTERRUPT_LINE ide/via: Get rid of via_ide_init() via-ide: move registration of VMStateDescription to DeviceClass cmd646: remove unused pci_cmd646_ide_init() function dp264: use pci_create_simple() to initialise the cmd646 device cmd646: register vmstate_ide_pci VMStateDescription in DeviceClass cmd646: register cmd646_reset() function in DeviceClass Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
49780a58 |
| 12-Mar-2020 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - Relax restrictions for blockdev-snapshot (allows libvirt to do live storage
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches: - Relax restrictions for blockdev-snapshot (allows libvirt to do live storage migration with blockdev-mirror) - luks: Delete created files when block_crypto_co_create_opts_luks fails - Fix memleaks in qmp_object_add # gpg: Signature made Wed 11 Mar 2020 15:38:59 GMT # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: qemu-iotests: adding LUKS cleanup for non-UTF8 secret error crypto.c: cleanup created file when block_crypto_co_create_opts_luks fails block.c: adding bdrv_co_delete_file block: introducing 'bdrv_co_delete_file' interface tests/qemu-iotests: Fix socket_scm_helper build path qapi: Add '@allow-write-only-overlay' feature for 'blockdev-snapshot' iotests: Add iothread cases to 155 block: Fix cross-AioContext blockdev-snapshot iotests: Test mirror with temporarily disabled target backing file iotests: Fix run_job() with use_log=False block: Relax restrictions for blockdev-snapshot block: Make bdrv_get_cumulative_perm() public qom-qmp-cmds: fix two memleaks in qmp_object_add Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
10b11400 |
| 12-Mar-2020 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/vivier/tags/q800-for-5.0-pull-request' into staging Fix Coverity CID 1412799 (Error handling issues) # gpg: Signature made Thu 12 Mar 2020 15:1
Merge remote-tracking branch 'remotes/vivier/tags/q800-for-5.0-pull-request' into staging Fix Coverity CID 1412799 (Error handling issues) # gpg: Signature made Thu 12 Mar 2020 15:15:01 GMT # gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C # gpg: issuer "laurent@vivier.eu" # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full] # gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full] # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full] # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier/tags/q800-for-5.0-pull-request: q800: fix coverity warning CID 1412799 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
474acbe0 |
| 12-Mar-2020 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-03-10' into staging Fix the proxy fsdev so that it honours "readonly" and "writeout". # gpg: Signature made Tue 10 Mar
Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-03-10' into staging Fix the proxy fsdev so that it honours "readonly" and "writeout". # gpg: Signature made Tue 10 Mar 2020 15:16:39 GMT # gpg: using RSA key B4828BAF943140CEF2A3491071D4D5E5822F73D6 # gpg: Good signature from "Greg Kurz <groug@kaod.org>" [full] # gpg: aka "Gregory Kurz <gregory.kurz@free.fr>" [full] # gpg: aka "[jpeg image of size 3330]" [full] # Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6 * remotes/gkurz/tags/9p-next-2020-03-10: 9p/proxy: Fix export_flags Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
5931ed56 |
| 11-Mar-2020 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-03-11' into staging Block patches for the 5.0 softfreeze: - qemu-img measure for LUKS - Improve block-copy's perfo
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-03-11' into staging Block patches for the 5.0 softfreeze: - qemu-img measure for LUKS - Improve block-copy's performance by reducing inter-request dependencies - Make curl's detection of accept-ranges more robust - Memleak fixes - iotest fix # gpg: Signature made Wed 11 Mar 2020 13:19:01 GMT # gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40 # gpg: issuer "mreitz@redhat.com" # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full] # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/maxreitz/tags/pull-block-2020-03-11: block/block-copy: hide structure definitions block/block-copy: reduce intersecting request lock block/block-copy: rename start to offset in interfaces block/block-copy: refactor interfaces to use bytes instead of end block/block-copy: factor out find_conflicting_inflight_req block/block-copy: use block_status block/block-copy: specialcase first copy_range request block/block-copy: fix progress calculation job: refactor progress to separate object block/qcow2-threads: fix qcow2_decompress qemu-img: free memory before re-assign block/qcow2: do free crypto_opts in qcow2_close() iotests: Fix nonportable use of od --endian block/curl: HTTP header field names are case insensitive block/curl: HTTP header fields allow whitespace around values iotests: add 288 luks qemu-img measure test qemu-img: allow qemu-img measure --object without a filename luks: implement .bdrv_measure() luks: extract qcrypto_block_calculate_payload_offset() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
6e8a73e9 |
| 11-Mar-2020 |
Peter Maydell <peter.maydell@linaro.org> |
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging Pull request # gpg: Signature made Wed 11 Mar 2020 12:40:36 GMT # gpg: using
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging Pull request # gpg: Signature made Wed 11 Mar 2020 12:40:36 GMT # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full] # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: aio-posix: remove idle poll handlers to improve scalability aio-posix: support userspace polling of fd monitoring aio-posix: add io_uring fd monitoring implementation aio-posix: simplify FDMonOps->update() prototype aio-posix: extract ppoll(2) and epoll(7) fd monitoring aio-posix: move RCU_READ_LOCK() into run_poll_handlers() aio-posix: completely stop polling when disabled aio-posix: remove confusing QLIST_SAFE_REMOVE() qemu/queue.h: clear linked list pointers on remove Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
show more ...
|
#
d37d0e36 |
| 05-Mar-2020 |
Stefan Hajnoczi <stefanha@redhat.com> |
aio-posix: remove idle poll handlers to improve scalability When there are many poll handlers it's likely that some of them are idle most of the time. Remove handlers that haven't had a
aio-posix: remove idle poll handlers to improve scalability When there are many poll handlers it's likely that some of them are idle most of the time. Remove handlers that haven't had activity recently so that the polling loop scales better for guests with a large number of devices. This feature only takes effect for the Linux io_uring fd monitoring implementation because it is capable of combining fd monitoring with userspace polling. The other implementations can't do that and risk starving fds in favor of poll handlers, so don't try this optimization when they are in use. IOPS improves from 10k to 105k when the guest has 100 virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1 device for rw=randread,iodepth=1,bs=4k,ioengine=libaio on NVMe. [Clarified aio_poll_handlers locking discipline explanation in comment after discussion with Paolo Bonzini <pbonzini@redhat.com>. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20200305170806.1313245-8-stefanha@redhat.com Message-Id: <20200305170806.1313245-8-stefanha@redhat.com>
show more ...
|
#
73fd282e |
| 05-Mar-2020 |
Stefan Hajnoczi <stefanha@redhat.com> |
aio-posix: add io_uring fd monitoring implementation The recent Linux io_uring API has several advantages over ppoll(2) and epoll(2). Details are given in the source code. Add
aio-posix: add io_uring fd monitoring implementation The recent Linux io_uring API has several advantages over ppoll(2) and epoll(2). Details are given in the source code. Add an io_uring implementation and make it the default on Linux. Performance is the same as with epoll(7) but later patches add optimizations that take advantage of io_uring. It is necessary to change how aio_set_fd_handler() deals with deleting AioHandlers since removing monitored file descriptors is asynchronous in io_uring. fdmon_io_uring_remove() marks the AioHandler deleted and aio_set_fd_handler() will let it handle deletion in that case. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20200305170806.1313245-6-stefanha@redhat.com Message-Id: <20200305170806.1313245-6-stefanha@redhat.com>
show more ...
|
#
1f050a46 |
| 05-Mar-2020 |
Stefan Hajnoczi <stefanha@redhat.com> |
aio-posix: extract ppoll(2) and epoll(7) fd monitoring The ppoll(2) and epoll(7) file descriptor monitoring implementations are mixed with the core util/aio-posix.c code. Before adding
aio-posix: extract ppoll(2) and epoll(7) fd monitoring The ppoll(2) and epoll(7) file descriptor monitoring implementations are mixed with the core util/aio-posix.c code. Before adding another implementation for Linux io_uring, extract out the existing ones so there is a clear interface and the core code is simpler. The new interface is AioContext->fdmon_ops, a pointer to a FDMonOps struct. See the patch for details. Semantic changes: 1. ppoll(2) now reflects events from pollfds[] back into AioHandlers while we're still on the clock for adaptive polling. This was already happening for epoll(7), so if it's really an issue then we'll need to fix both in the future. 2. epoll(7)'s fallback to ppoll(2) while external events are disabled was broken when the number of fds exceeded the epoll(7) upgrade threshold. I guess this code path simply wasn't tested and no one noticed the bug. I didn't go out of my way to fix it but the correct code is simpler than preserving the bug. I also took some liberties in removing the unnecessary AioContext->epoll_available (just check AioContext->epollfd != -1 instead) and AioContext->epoll_enabled (it's implicit if our AioContext->fdmon_ops callbacks are being invoked) fields. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20200305170806.1313245-4-stefanha@redhat.com Message-Id: <20200305170806.1313245-4-stefanha@redhat.com>
show more ...
|