History log of /openbmc/qemu/util/aio-posix.h (Results 1 – 13 of 13)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 60f782b6 16-May-2023 Stefan Hajnoczi <stefanha@redhat.com>

aio: remove aio_disable_external() API

All callers now pass is_external=false to aio_set_fd_handler() and
aio_set_event_notifier(). The aio_disable_external() API that
temporarily disables fd handle

aio: remove aio_disable_external() API

All callers now pass is_external=false to aio_set_fd_handler() and
aio_set_event_notifier(). The aio_disable_external() API that
temporarily disables fd handlers that were registered is_external=true
is therefore dead code.

Remove aio_disable_external(), aio_enable_external(), and the
is_external arguments to aio_set_fd_handler() and
aio_set_event_notifier().

The entire test-fdmon-epoll test is removed because its sole purpose was
testing aio_disable_external().

Parts of this patch were generated using the following coccinelle
(https://coccinelle.lip6.fr/) semantic patch:

@@
expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque;
@@
- aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque)
+ aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque)

@@
expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready;
@@
- aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready)
+ aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready)

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-21-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

show more ...


Revision tags: v8.0.0, v7.2.0, v7.0.0
# fc879646 23-Feb-2022 Stefan Hajnoczi <stefanha@redhat.com>

aio-posix: fix spurious ->poll_ready() callbacks in main loop

When ->poll() succeeds the AioHandler is placed on the ready list with
revents set to the magic value 0. This magic value causes
aio_dis

aio-posix: fix spurious ->poll_ready() callbacks in main loop

When ->poll() succeeds the AioHandler is placed on the ready list with
revents set to the magic value 0. This magic value causes
aio_dispatch_handler() to invoke ->poll_ready() instead of ->io_read()
for G_IO_IN or ->io_write() for G_IO_OUT.

This magic value 0 hack works for the IOThread where AioHandlers are
placed on ->ready_list and processed by aio_dispatch_ready_handlers().
It does not work for the main loop where all AioHandlers are processed
by aio_dispatch_handlers(), even those that are not ready and have a
revents value of 0.

As a result the main loop invokes ->poll_ready() on AioHandlers that are
not ready. These spurious ->poll_ready() calls waste CPU cycles and
could lead to crashes if the code assumes ->poll() must have succeeded
before ->poll_ready() is called (a reasonable asumption but I haven't
seen it in practice).

Stop using revents to track whether ->poll_ready() will be called on an
AioHandler. Introduce a separate AioHandler->poll_ready field instead.
This eliminates spurious ->poll_ready() calls in the main loop.

Fixes: 826cc32423db2a99d184dbf4f507c737d7e7a4ae ("aio-posix: split poll check from ready handler")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reported-by: Jason Wang <jasowang@redhat.com>
Tested-by: Jason Wang <jasowang@redhat.com>
Message-id: 20220223155703.136833-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


Revision tags: v6.2.0
# 826cc324 07-Dec-2021 Stefan Hajnoczi <stefanha@redhat.com>

aio-posix: split poll check from ready handler

Adaptive polling measures the execution time of the polling check plus
handlers called when a polled event becomes ready. Handlers can take a
significa

aio-posix: split poll check from ready handler

Adaptive polling measures the execution time of the polling check plus
handlers called when a polled event becomes ready. Handlers can take a
significant amount of time, making it look like polling was running for
a long time when in fact the event handler was running for a long time.

For example, on Linux the io_submit(2) syscall invoked when a virtio-blk
device's virtqueue becomes ready can take 10s of microseconds. This
can exceed the default polling interval (32 microseconds) and cause
adaptive polling to stop polling.

By excluding the handler's execution time from the polling check we make
the adaptive polling calculation more accurate. As a result, the event
loop now stays in polling mode where previously it would have fallen
back to file descriptor monitoring.

The following data was collected with virtio-blk num-queues=2
event_idx=off using an IOThread. Before:

168k IOPS, IOThread syscalls:

9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16
9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8
9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8
9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3
9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8
9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8
9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8
9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32

174k IOPS (+3.6%), IOThread syscalls:

9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32
9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8
9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8
9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32

Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because
the IOThread stays in polling mode instead of falling back to file
descriptor monitoring.

As usual, polling is not implemented on Windows so this patch ignores
the new io_poll_read() callback in aio-win32.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-2-stefanha@redhat.com

[Fixed up aio_set_event_notifier() calls in
tests/unit/test-fdmon-epoll.c added after this series was queued.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

show more ...


Revision tags: v6.1.0, v5.2.0, v5.0.0
# 52a96afa 20-Mar-2020 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging

update syscall numbers to linux 5.5 (with scripts)
add clock_gettime64/clock_settime64

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging

update syscall numbers to linux 5.5 (with scripts)
add clock_gettime64/clock_settime64
add AT_EXECFN

v4: restore syscall.tbl series but remove vsyscall series
v3: remove syscall.tbl series
v2: guard copy_to_user_timezone() with TARGET_NR_gettimeofday
remove "Support futex_time64" patch
guard sys_futex with TARGET_NR_exit

# gpg: Signature made Fri 20 Mar 2020 15:23:29 GMT
# gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg: issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-5.0-pull-request: (32 commits)
linux-user, openrisc: sync syscall numbers with kernel v5.5
linux-user, nios2: sync syscall numbers with kernel v5.5
linux-user, aarch64: sync syscall numbers with kernel v5.5
scripts: add a script to generate syscall_nr.h
linux-user,mips: update syscall-args-o32.c.inc
linux-user,mips: move content of mips_syscall_args
linux-user: update syscall.tbl from linux 0bf999f9c5e7
linux-user, scripts: add a script to update syscall.tbl
linux-user, mips64: add syscall table generation support
linux-user, mips: add syscall table generation support
linux-user, x86_64: add syscall table generation support
linux-user, i386: add syscall table generation support
linux-user, x86_64, i386: cleanup TARGET_NR_arch_prctl
linux-user, sparc, sparc64: add syscall table generation support
linux-user, s390x: add syscall table generation support
linux-user, s390x: remove syscall definitions for !TARGET_S390X
linux-user, ppc: add syscall table generation support
linux-user, arm: add syscall table generation support
linux-user, microblaze: add syscall table generation support
linux-user, sh4: add syscall table generation support
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# ce73691e 19-Mar-2020 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging

Pull request

# gpg: Signature made Tue 17 Mar 2020 23:22:33 GMT
# gpg: using RSA

Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging

Pull request

# gpg: Signature made Tue 17 Mar 2020 23:22:33 GMT
# gpg: using RSA key F9B7ABDBBCACDF95BE76CBD07DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" [full]
# Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB
# Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
hw/ide: Remove unneeded inclusion of hw/ide.h
hw/ide: Move MAX_IDE_DEVS define to hw/ide/internal.h
hw/ide: Do ide_drive_get() within pci_ide_create_devs()
hw/ide/pci.c: Coding style update to fix checkpatch errors
hw/ide: Remove now unneded #include "hw/pci/pci.h" from hw/ide.h
hw/ide: Get rid of piix4_init function
hw/isa/piix4.c: Introduce variable to store devfn
hw/ide: Get rid of piix3_init functions
hd-geo-test: Clean up use of buf[] in create_qcow2_with_mbr()
via-ide: always use legacy IRQ 14/15 routing
via-ide: allow guests to write to PCI_CLASS_PROG
via-ide: initialise IDE controller in legacy mode
via-ide: ensure that PCI_INTERRUPT_LINE is hard-wired to its default value
pci: Honour wmask when resetting PCI_INTERRUPT_LINE
ide/via: Get rid of via_ide_init()
via-ide: move registration of VMStateDescription to DeviceClass
cmd646: remove unused pci_cmd646_ide_init() function
dp264: use pci_create_simple() to initialise the cmd646 device
cmd646: register vmstate_ide_pci VMStateDescription in DeviceClass
cmd646: register cmd646_reset() function in DeviceClass

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 49780a58 12-Mar-2020 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- Relax restrictions for blockdev-snapshot (allows libvirt to do live
storage

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- Relax restrictions for blockdev-snapshot (allows libvirt to do live
storage migration with blockdev-mirror)
- luks: Delete created files when block_crypto_co_create_opts_luks fails
- Fix memleaks in qmp_object_add

# gpg: Signature made Wed 11 Mar 2020 15:38:59 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
qemu-iotests: adding LUKS cleanup for non-UTF8 secret error
crypto.c: cleanup created file when block_crypto_co_create_opts_luks fails
block.c: adding bdrv_co_delete_file
block: introducing 'bdrv_co_delete_file' interface
tests/qemu-iotests: Fix socket_scm_helper build path
qapi: Add '@allow-write-only-overlay' feature for 'blockdev-snapshot'
iotests: Add iothread cases to 155
block: Fix cross-AioContext blockdev-snapshot
iotests: Test mirror with temporarily disabled target backing file
iotests: Fix run_job() with use_log=False
block: Relax restrictions for blockdev-snapshot
block: Make bdrv_get_cumulative_perm() public
qom-qmp-cmds: fix two memleaks in qmp_object_add

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 10b11400 12-Mar-2020 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/vivier/tags/q800-for-5.0-pull-request' into staging

Fix Coverity CID 1412799 (Error handling issues)

# gpg: Signature made Thu 12 Mar 2020 15:1

Merge remote-tracking branch 'remotes/vivier/tags/q800-for-5.0-pull-request' into staging

Fix Coverity CID 1412799 (Error handling issues)

# gpg: Signature made Thu 12 Mar 2020 15:15:01 GMT
# gpg: using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg: issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg: aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/q800-for-5.0-pull-request:
q800: fix coverity warning CID 1412799

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 474acbe0 12-Mar-2020 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-03-10' into staging

Fix the proxy fsdev so that it honours "readonly" and "writeout".

# gpg: Signature made Tue 10 Mar

Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2020-03-10' into staging

Fix the proxy fsdev so that it honours "readonly" and "writeout".

# gpg: Signature made Tue 10 Mar 2020 15:16:39 GMT
# gpg: using RSA key B4828BAF943140CEF2A3491071D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>" [full]
# gpg: aka "Gregory Kurz <gregory.kurz@free.fr>" [full]
# gpg: aka "[jpeg image of size 3330]" [full]
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6

* remotes/gkurz/tags/9p-next-2020-03-10:
9p/proxy: Fix export_flags

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 5931ed56 11-Mar-2020 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-03-11' into staging

Block patches for the 5.0 softfreeze:
- qemu-img measure for LUKS
- Improve block-copy's perfo

Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-03-11' into staging

Block patches for the 5.0 softfreeze:
- qemu-img measure for LUKS
- Improve block-copy's performance by reducing inter-request
dependencies
- Make curl's detection of accept-ranges more robust
- Memleak fixes
- iotest fix

# gpg: Signature made Wed 11 Mar 2020 13:19:01 GMT
# gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg: issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2020-03-11:
block/block-copy: hide structure definitions
block/block-copy: reduce intersecting request lock
block/block-copy: rename start to offset in interfaces
block/block-copy: refactor interfaces to use bytes instead of end
block/block-copy: factor out find_conflicting_inflight_req
block/block-copy: use block_status
block/block-copy: specialcase first copy_range request
block/block-copy: fix progress calculation
job: refactor progress to separate object
block/qcow2-threads: fix qcow2_decompress
qemu-img: free memory before re-assign
block/qcow2: do free crypto_opts in qcow2_close()
iotests: Fix nonportable use of od --endian
block/curl: HTTP header field names are case insensitive
block/curl: HTTP header fields allow whitespace around values
iotests: add 288 luks qemu-img measure test
qemu-img: allow qemu-img measure --object without a filename
luks: implement .bdrv_measure()
luks: extract qcrypto_block_calculate_payload_offset()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# 6e8a73e9 11-Mar-2020 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Wed 11 Mar 2020 12:40:36 GMT
# gpg: using

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Wed 11 Mar 2020 12:40:36 GMT
# gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
aio-posix: remove idle poll handlers to improve scalability
aio-posix: support userspace polling of fd monitoring
aio-posix: add io_uring fd monitoring implementation
aio-posix: simplify FDMonOps->update() prototype
aio-posix: extract ppoll(2) and epoll(7) fd monitoring
aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
aio-posix: completely stop polling when disabled
aio-posix: remove confusing QLIST_SAFE_REMOVE()
qemu/queue.h: clear linked list pointers on remove

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

show more ...


# d37d0e36 05-Mar-2020 Stefan Hajnoczi <stefanha@redhat.com>

aio-posix: remove idle poll handlers to improve scalability

When there are many poll handlers it's likely that some of them are idle
most of the time. Remove handlers that haven't had a

aio-posix: remove idle poll handlers to improve scalability

When there are many poll handlers it's likely that some of them are idle
most of the time. Remove handlers that haven't had activity recently so
that the polling loop scales better for guests with a large number of
devices.

This feature only takes effect for the Linux io_uring fd monitoring
implementation because it is capable of combining fd monitoring with
userspace polling. The other implementations can't do that and risk
starving fds in favor of poll handlers, so don't try this optimization
when they are in use.

IOPS improves from 10k to 105k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device for rw=randread,iodepth=1,bs=4k,ioengine=libaio on NVMe.

[Clarified aio_poll_handlers locking discipline explanation in comment
after discussion with Paolo Bonzini <pbonzini@redhat.com>.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-8-stefanha@redhat.com
Message-Id: <20200305170806.1313245-8-stefanha@redhat.com>

show more ...


# 73fd282e 05-Mar-2020 Stefan Hajnoczi <stefanha@redhat.com>

aio-posix: add io_uring fd monitoring implementation

The recent Linux io_uring API has several advantages over ppoll(2) and
epoll(2). Details are given in the source code.

Add

aio-posix: add io_uring fd monitoring implementation

The recent Linux io_uring API has several advantages over ppoll(2) and
epoll(2). Details are given in the source code.

Add an io_uring implementation and make it the default on Linux.
Performance is the same as with epoll(7) but later patches add
optimizations that take advantage of io_uring.

It is necessary to change how aio_set_fd_handler() deals with deleting
AioHandlers since removing monitored file descriptors is asynchronous in
io_uring. fdmon_io_uring_remove() marks the AioHandler deleted and
aio_set_fd_handler() will let it handle deletion in that case.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-6-stefanha@redhat.com
Message-Id: <20200305170806.1313245-6-stefanha@redhat.com>

show more ...


# 1f050a46 05-Mar-2020 Stefan Hajnoczi <stefanha@redhat.com>

aio-posix: extract ppoll(2) and epoll(7) fd monitoring

The ppoll(2) and epoll(7) file descriptor monitoring implementations are
mixed with the core util/aio-posix.c code. Before adding

aio-posix: extract ppoll(2) and epoll(7) fd monitoring

The ppoll(2) and epoll(7) file descriptor monitoring implementations are
mixed with the core util/aio-posix.c code. Before adding another
implementation for Linux io_uring, extract out the existing
ones so there is a clear interface and the core code is simpler.

The new interface is AioContext->fdmon_ops, a pointer to a FDMonOps
struct. See the patch for details.

Semantic changes:
1. ppoll(2) now reflects events from pollfds[] back into AioHandlers
while we're still on the clock for adaptive polling. This was
already happening for epoll(7), so if it's really an issue then we'll
need to fix both in the future.
2. epoll(7)'s fallback to ppoll(2) while external events are disabled
was broken when the number of fds exceeded the epoll(7) upgrade
threshold. I guess this code path simply wasn't tested and no one
noticed the bug. I didn't go out of my way to fix it but the correct
code is simpler than preserving the bug.

I also took some liberties in removing the unnecessary
AioContext->epoll_available (just check AioContext->epollfd != -1
instead) and AioContext->epoll_enabled (it's implicit if our
AioContext->fdmon_ops callbacks are being invoked) fields.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-4-stefanha@redhat.com
Message-Id: <20200305170806.1313245-4-stefanha@redhat.com>

show more ...