Revision tags: v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7 |
|
#
01fcb1cb |
| 29-May-2020 |
Jason Wang <jasowang@redhat.com> |
vhost: allow device that does not depend on vhost worker
vDPA device currently relays the eventfd via vhost worker. This is inefficient due the latency of wakeup and scheduling, so this patch tries
vhost: allow device that does not depend on vhost worker
vDPA device currently relays the eventfd via vhost worker. This is inefficient due the latency of wakeup and scheduling, so this patch tries to introduce a use_worker attribute for the vhost device. When use_worker is not set with vhost_dev_init(), vhost won't try to allocate a worker thread and the vhost_poll will be processed directly in the wakeup function.
This help for vDPA since it reduces the latency caused by vhost worker.
In my testing, it saves 0.2 ms in pings between VMs on a mutual host.
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200529080303.15449-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
Revision tags: v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37 |
|
#
0b841030 |
| 30-Apr-2020 |
Jia He <justin.he@arm.com> |
vhost: vsock: kick send_pkt worker once device is started
Ning Bo reported an abnormal 2-second gap when booting Kata container [1]. The unconditional timeout was caused by VSOCK_DEFAULT_CONNECT_TIM
vhost: vsock: kick send_pkt worker once device is started
Ning Bo reported an abnormal 2-second gap when booting Kata container [1]. The unconditional timeout was caused by VSOCK_DEFAULT_CONNECT_TIMEOUT of connecting from the client side. The vhost vsock client tries to connect an initializing virtio vsock server.
The abnormal flow looks like: host-userspace vhost vsock guest vsock ============== =========== ============ connect() --------> vhost_transport_send_pkt_work() initializing | vq->private_data==NULL | will not be queued V schedule_timeout(2s) vhost_vsock_start() <--------- device ready set vq->private_data
wait for 2s and failed connect() again vq->private_data!=NULL recv connecting pkt
Details: 1. Host userspace sends a connect pkt, at that time, guest vsock is under initializing, hence the vhost_vsock_start has not been called. So vq->private_data==NULL, and the pkt is not been queued to send to guest 2. Then it sleeps for 2s 3. After guest vsock finishes initializing, vq->private_data is set 4. When host userspace wakes up after 2s, send connecting pkt again, everything is fine.
As suggested by Stefano Garzarella, this fixes it by additional kicking the send_pkt worker in vhost_vsock_start once the virtio device is started. This makes the pending pkt sent again.
After this patch, kata-runtime (with vsock enabled) boot time is reduced from 3s to 1s on a ThunderX2 arm64 server.
[1] https://github.com/kata-containers/runtime/issues/1917
Reported-by: Ning Bo <n.b@live.com> Suggested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jia He <justin.he@arm.com> Link: https://lore.kernel.org/r/20200501043840.186557-1-justin.he@arm.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
show more ...
|
Revision tags: v5.4.36 |
|
#
a78d1639 |
| 24-Apr-2020 |
Stefano Garzarella <sgarzare@redhat.com> |
vsock/virtio: fix multiple packet delivery to monitoring devices
In virtio_transport.c, if the virtqueue is full, the transmitting packet is queued up and it will be sent in the next iteration. This
vsock/virtio: fix multiple packet delivery to monitoring devices
In virtio_transport.c, if the virtqueue is full, the transmitting packet is queued up and it will be sent in the next iteration. This causes the same packet to be delivered multiple times to monitoring devices.
We want to continue to deliver packets to monitoring devices before it is put in the virtqueue, to avoid that replies can appear in the packet capture before the transmitted packet.
This patch fixes the issue, adding a new flag (tap_delivered) in struct virtio_vsock_pkt, to check if the packet is already delivered to monitoring devices.
In vhost/vsock.c, we are splitting packets, so we must set 'tap_delivered' to false when we queue up the same virtio_vsock_pkt to handle the remaining bytes.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
107bc076 |
| 24-Apr-2020 |
Stefano Garzarella <sgarzare@redhat.com> |
vhost/vsock: fix packet delivery order to monitoring devices
We want to deliver packets to monitoring devices before it is put in the virtqueue, to avoid that replies can appear in the packet captur
vhost/vsock: fix packet delivery order to monitoring devices
We want to deliver packets to monitoring devices before it is put in the virtqueue, to avoid that replies can appear in the packet capture before the transmitted packet.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29 |
|
#
247643f8 |
| 31-Mar-2020 |
Eugenio Pérez <eperezma@redhat.com> |
vhost: Create accessors for virtqueues private_data
Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200331192804.6019-2-eperezma@redhat.com Signed-off-by: Michae
vhost: Create accessors for virtqueues private_data
Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200331192804.6019-2-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
Revision tags: v5.6 |
|
#
792a4f2e |
| 26-Mar-2020 |
Jason Wang <jasowang@redhat.com> |
vhost: allow per device message handler
This patch allow device to register its own message handler during vhost_dev_init(). vDPA device will use it to implement its own DMA mapping logic.
Signed-o
vhost: allow per device message handler
This patch allow device to register its own message handler during vhost_dev_init(). vDPA device will use it to implement its own DMA mapping logic.
Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200326140125.19794-3-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
Revision tags: v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3 |
|
#
8a3cc29c |
| 06-Dec-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vhost/vsock: accept only packets with the right dst_cid
When we receive a new packet from the guest, we check if the src_cid is correct, but we forgot to check the dst_cid.
The host should accept o
vhost/vsock: accept only packets with the right dst_cid
When we receive a new packet from the guest, we check if the src_cid is correct, but we forgot to check the dst_cid.
The host should accept only packets where dst_cid is equal to the host CID.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12 |
|
#
ed8640a9 |
| 14-Nov-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vhost/vsock: refuse CID assigned to the guest->host transport
In a nested VM environment, we have to refuse to assign to a nested guest the same CID assigned to our guest->host transport. In this wa
vhost/vsock: refuse CID assigned to the guest->host transport
In a nested VM environment, we have to refuse to assign to a nested guest the same CID assigned to our guest->host transport. In this way, the user can use the local CID for loopback.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
6a2c0962 |
| 14-Nov-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vsock: prevent transport modules unloading
This patch adds 'module' member in the 'struct vsock_transport' in order to get/put the transport module. This prevents the module unloading while sockets
vsock: prevent transport modules unloading
This patch adds 'module' member in the 'struct vsock_transport' in order to get/put the transport module. This prevents the module unloading while sockets are assigned to it.
We increase the module refcnt when a socket is assigned to a transport, and we decrease the module refcnt when the socket is destructed.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
c0cfa2d8 |
| 14-Nov-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vsock: add multi-transports support
This patch adds the support of multiple transports in the VSOCK core.
With the multi-transports support, we can use vsock with nested VMs (using also different h
vsock: add multi-transports support
This patch adds the support of multiple transports in the VSOCK core.
With the multi-transports support, we can use vsock with nested VMs (using also different hypervisors) loading both guest->host and host->guest transports at the same time.
Major changes: - vsock core module can be loaded regardless of the transports - vsock_core_init() and vsock_core_exit() are renamed to vsock_core_register() and vsock_core_unregister() - vsock_core_register() has a feature parameter (H2G, G2H, DGRAM) to identify which directions the transport can handle and if it's support DGRAM (only vmci) - each stream socket is assigned to a transport when the remote CID is set (during the connect() or when we receive a connection request on a listener socket). The remote CID is used to decide which transport to use: - remote CID <= VMADDR_CID_HOST will use guest->host transport; - remote CID == local_cid (guest->host transport) will use guest->host transport for loopback (host->guest transports don't support loopback); - remote CID > VMADDR_CID_HOST will use host->guest transport; - listener sockets are not bound to any transports since no transport operations are done on it. In this way we can create a listener socket, also if the transports are not loaded or with VMADDR_CID_ANY to listen on all transports. - DGRAM sockets are handled as before, since only the vmci_transport provides this feature.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
b9f2b0ff |
| 14-Nov-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vsock: handle buffer_size sockopts in the core
virtio_transport and vmci_transport handle the buffer_size sockopts in a very similar way.
In order to support multiple transports, this patch moves t
vsock: handle buffer_size sockopts in the core
virtio_transport and vmci_transport handle the buffer_size sockopts in a very similar way.
In order to support multiple transports, this patch moves this handling in the core to allow the user to change the options also if the socket is not yet assigned to any transport.
This patch also adds the '.notify_buffer_size' callback in the 'struct virtio_transport' in order to inform the transport, when the buffer_size is changed by the user. It is also useful to limit the 'buffer_size' requested (e.g. virtio transports).
Acked-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
4c7246dc |
| 14-Nov-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock()
We are going to add 'struct vsock_sock *' parameter to virtio_transport_get_ops().
In some cases, like in the virtio_tr
vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock()
We are going to add 'struct vsock_sock *' parameter to virtio_transport_get_ops().
In some cases, like in the virtio_transport_reset_no_sock(), we don't have any socket assigned to the packet received, so we can't use the virtio_transport_get_ops().
In order to allow virtio_transport_reset_no_sock() to use the '.send_pkt' callback from the 'vhost_transport' or 'virtio_transport', we add the 'struct virtio_transport *' to it and to its caller: virtio_transport_recv_pkt().
We moved the 'vhost_transport' and 'virtio_transport' definition, to pass their address to the virtio_transport_recv_pkt().
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6, v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10, v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18, v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9 |
|
#
407e9ef7 |
| 11-Sep-2018 |
Arnd Bergmann <arnd@arndb.de> |
compat_ioctl: move drivers to compat_ptr_ioctl
Each of these drivers has a copy of the same trivial helper function to convert the pointer argument and then call the native ioctl handler.
We now ha
compat_ioctl: move drivers to compat_ptr_ioctl
Each of these drivers has a copy of the same trivial helper function to convert the pointer argument and then call the native ioctl handler.
We now have a generic implementation of that, so use it.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
show more ...
|
#
6dbd3e66 |
| 30-Jul-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vhost/vsock: split packets to send using multiple buffers
If the packets to sent to the guest are bigger than the buffer available, we can split them, using multiple buffers and fixing the length in
vhost/vsock: split packets to send using multiple buffers
If the packets to sent to the guest are bigger than the buffer available, we can split them, using multiple buffers and fixing the length in the packet header. This is safe since virtio-vsock supports only stream sockets.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
473c7391 |
| 30-Jul-2019 |
Stefano Garzarella <sgarzare@redhat.com> |
vsock/virtio: limit the memory used per-socket
Since virtio-vsock was introduced, the buffers filled by the host and pushed to the guest using the vring, are directly queued in a per-socket list. Th
vsock/virtio: limit the memory used per-socket
Since virtio-vsock was introduced, the buffers filled by the host and pushed to the guest using the vring, are directly queued in a per-socket list. These buffers are preallocated by the guest with a fixed size (4 KB).
The maximum amount of memory used by each socket should be controlled by the credit mechanism. The default credit available per-socket is 256 KB, but if we use only 1 byte per packet, the guest can queue up to 262144 of 4 KB buffers, using up to 1 GB of memory per-socket. In addition, the guest will continue to fill the vring with new 4 KB free buffers to avoid starvation of other sockets.
This patch mitigates this issue copying the payload of small packets (< 128 bytes) into the buffer of last packet queued, in order to avoid wasting memory.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7a338472 |
| 04-Jun-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482
Based on 1 normalized pattern(s):
this work is licensed under the terms of the gnu gpl version 2
extracted by the scancode lice
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 482
Based on 1 normalized pattern(s):
this work is licensed under the terms of the gnu gpl version 2
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 48 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081204.624030236@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
e79b431f |
| 16-May-2019 |
Jason Wang <jasowang@redhat.com> |
vhost: vsock: add weight support
This patch will check the weight and exit the loop if we exceeds the weight. This is useful for preventing vsock kthread from hogging cpu which is guest triggerable.
vhost: vsock: add weight support
This patch will check the weight and exit the loop if we exceeds the weight. This is useful for preventing vsock kthread from hogging cpu which is guest triggerable. The weight can help to avoid starving the request from on direction while another direction is being processed.
The value of weight is picked from vhost-net.
This addresses CVE-2019-3900.
Cc: Stefan Hajnoczi <stefanha@redhat.com> Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
e82b9b07 |
| 16-May-2019 |
Jason Wang <jasowang@redhat.com> |
vhost: introduce vhost_exceeds_weight()
We used to have vhost_exceeds_weight() for vhost-net to:
- prevent vhost kthread from hogging the cpu - balance the time spent between TX and RX
This functi
vhost: introduce vhost_exceeds_weight()
We used to have vhost_exceeds_weight() for vhost-net to:
- prevent vhost kthread from hogging the cpu - balance the time spent between TX and RX
This function could be useful for vsock and scsi as well. So move it to vhost.c. Device must specify a weight which counts the number of requests, or it can also specific a byte_weight which counts the number of bytes that has been processed.
Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
b46a0bf7 |
| 28-Jan-2019 |
Jason Wang <jasowang@redhat.com> |
vhost: fix OOB in get_rx_bufs()
After batched used ring updating was introduced in commit e2b3b35eb989 ("vhost_net: batch used ring update in rx"). We tend to batch heads in vq->heads for more than
vhost: fix OOB in get_rx_bufs()
After batched used ring updating was introduced in commit e2b3b35eb989 ("vhost_net: batch used ring update in rx"). We tend to batch heads in vq->heads for more than one packet. But the quota passed to get_rx_bufs() was not correctly limited, which can result a OOB write in vq->heads.
headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx, vhost_len, &in, vq_log, &log, likely(mergeable) ? UIO_MAXIOV : 1);
UIO_MAXIOV was still used which is wrong since we could have batched used in vq->heads, this will cause OOB if the next buffer needs more than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've batched 64 (VHOST_NET_BATCH) heads: Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
============================================================================= BUG kmalloc-8k (Tainted: G B ): Redzone overwritten -----------------------------------------------------------------------------
INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674 kmem_cache_alloc_trace+0xbb/0x140 alloc_pd+0x22/0x60 gen8_ppgtt_create+0x11d/0x5f0 i915_ppgtt_create+0x16/0x80 i915_gem_create_context+0x248/0x390 i915_gem_context_create_ioctl+0x4b/0xe0 drm_ioctl_kernel+0xa5/0xf0 drm_ioctl+0x2ed/0x3a0 do_vfs_ioctl+0x9f/0x620 ksys_ioctl+0x6b/0x80 __x64_sys_ioctl+0x11/0x20 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x (null) flags=0x200000000010201 INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b
Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for vhost-net. This is done through set the limitation through vhost_dev_init(), then set_owner can allocate the number of iov in a per device manner.
This fixes CVE-2018-16880.
Fixes: e2b3b35eb989 ("vhost_net: batch used ring update in rx") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7fbe078c |
| 08-Jan-2019 |
Zha Bin <zhabin@linux.alibaba.com> |
vhost/vsock: fix vhost vsock cid hashing inconsistent
The vsock core only supports 32bit CID, but the Virtio-vsock spec define CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as ze
vhost/vsock: fix vhost vsock cid hashing inconsistent
The vsock core only supports 32bit CID, but the Virtio-vsock spec define CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as zero. This inconsistency causes one bug in vhost vsock driver. The scenarios is:
0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock object. And hash_min() is used to compute the hash key. hash_min() is defined as: (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)). That means the hash algorithm has dependency on the size of macro argument 'val'. 0. In function vhost_vsock_set_cid(), a 64bit CID is passed to hash_min() to compute the hash key when inserting a vsock object into the hash table. 0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min() to compute the hash key when looking up a vsock for an CID.
Because the different size of the CID, hash_min() returns different hash key, thus fails to look up the vsock object for an CID.
To fix this bug, we keep CID as u64 in the IOCTLs and virtio message headers, but explicitly convert u64 to u32 when deal with the hash table and vsock core.
Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack callers") Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.tex Signed-off-by: Zha Bin <zhabin@linux.alibaba.com> Reviewed-by: Liu Jiang <gerry@linux.alibaba.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
6db3d8dc |
| 05-Nov-2018 |
Stefan Hajnoczi <stefanha@redhat.com> |
vhost/vsock: switch to a mutex for vhost_vsock_hash
Now that there are no more data path users of vhost_vsock_lock, it can be turned into a mutex. It's only used by .release() and in the .ioctl() p
vhost/vsock: switch to a mutex for vhost_vsock_hash
Now that there are no more data path users of vhost_vsock_lock, it can be turned into a mutex. It's only used by .release() and in the .ioctl() path.
Depends-on: <20181105103547.22018-1-stefanha@redhat.com> Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
show more ...
|
#
834e772c |
| 05-Nov-2018 |
Stefan Hajnoczi <stefanha@redhat.com> |
vhost/vsock: fix use-after-free in network stack callers
If the network stack calls .send_pkt()/.cancel_pkt() during .release(), a struct vhost_vsock use-after-free is possible. This occurs because
vhost/vsock: fix use-after-free in network stack callers
If the network stack calls .send_pkt()/.cancel_pkt() during .release(), a struct vhost_vsock use-after-free is possible. This occurs because .release() does not wait for other CPUs to stop using struct vhost_vsock.
Switch to an RCU-enabled hashtable (indexed by guest CID) so that .release() can wait for other CPUs by calling synchronize_rcu(). This also eliminates vhost_vsock_lock acquisition in the data path so it could have a positive effect on performance.
This is CVE-2018-14625 "kernel: use-after-free Read in vhost_transport_send_pkt".
Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+bd391451452fb0b93039@syzkaller.appspotmail.com Reported-by: syzbot+e3e074963495f92a89ed@syzkaller.appspotmail.com Reported-by: syzbot+d5a0a170c5069658b141@syzkaller.appspotmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
show more ...
|
#
c38f57da |
| 06-Dec-2018 |
Stefan Hajnoczi <stefanha@redhat.com> |
vhost/vsock: fix reset orphans race with close timeout
If a local process has closed a connected socket and hasn't received a RST packet yet, then the socket remains in the table until a timeout exp
vhost/vsock: fix reset orphans race with close timeout
If a local process has closed a connected socket and hasn't received a RST packet yet, then the socket remains in the table until a timeout expires.
When a vhost_vsock instance is released with the timeout still pending, the socket is never freed because vhost_vsock has already set the SOCK_DONE flag.
Check if the close timer is pending and let it close the socket. This prevents the race which can leak sockets.
Reported-by: Maximilian Riemensberger <riemensberger@cadami.net> Cc: Graham Whaley <graham.whaley@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
Revision tags: v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11, v4.17.10, v4.17.9, v4.17.8, v4.17.7, v4.17.6, v4.17.5, v4.17.4, v4.17.3, v4.17.2, v4.17.1, v4.17, v4.16 |
|
#
dc32bb67 |
| 14-Mar-2018 |
Sonny Rao <sonnyrao@chromium.org> |
vhost: add vsock compat ioctl
This will allow usage of vsock from 32-bit binaries on a 64-bit kernel.
Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.
vhost: add vsock compat ioctl
This will allow usage of vsock from 32-bit binaries on a 64-bit kernel.
Signed-off-by: Sonny Rao <sonnyrao@chromium.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
show more ...
|
#
ff3c1b1a |
| 08-Mar-2018 |
Vaibhav Murkute <vaibhavmurkute88@gmail.com> |
drivers: vhost: vsock: fixed a brace coding style issue
Fixed a coding style issue.
Signed-off-by: Vaibhav Murkute <vaibhavmurkute88@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Si
drivers: vhost: vsock: fixed a brace coding style issue
Fixed a coding style issue.
Signed-off-by: Vaibhav Murkute <vaibhavmurkute88@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|