.. SPDX-License-Identifier: GPL-2.0

======================
Generic vcpu interface
======================

The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
kvm_device_attr as other devices, but targets VCPU-wide settings and controls.

The groups and attributes per virtual cpu, if any, are architecture specific.

1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
==================================

:Architectures: ARM64

1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
---------------------------------------

:Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
             pointer to an int

:Returns:

    =======  ==========================================================
    -EBUSY   The PMU overflow interrupt is already set
    -EFAULT  Error reading interrupt number
    -ENXIO   PMUv3 not supported or the overflow interrupt not set
             when attempting to get it
    -ENODEV  KVM_ARM_VCPU_PMU_V3 feature missing from VCPU
    -EINVAL  Invalid PMU overflow interrupt number supplied or
             trying to set the IRQ number without using an in-kernel
             irqchip.
    =======  ==========================================================

A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
type must be the same for each vcpu. As a PPI, the interrupt number is the same
for all vcpus, while as an SPI it must be a separate number per vcpu.
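
The PPI/SPI constraints above can be checked in userspace before issuing
KVM_SET_DEVICE_ATTR. The following is a minimal sketch, not KVM code. The
helper names are hypothetical. The PPI range is the one this document gives for
the timer interrupts (16 <= intid < 32). The SPI range (32 <= intid < 1020) is
an assumption taken from the GIC architecture, and KVM may reject SPIs beyond
what the configured in-kernel irqchip provides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type, not part of the KVM UAPI. */
enum intid_kind { INTID_INVALID, INTID_PPI, INTID_SPI };

/* Classify an interrupt ID using the GIC architectural ranges (assumed). */
static enum intid_kind classify_intid(int intid)
{
	if (intid >= 16 && intid < 32)
		return INTID_PPI;
	if (intid >= 32 && intid < 1020)
		return INTID_SPI;
	return INTID_INVALID;
}

/*
 * Check a proposed per-vcpu PMU overflow interrupt assignment:
 *  - every vcpu must use the same interrupt type,
 *  - a PPI must be the same number on all vcpus,
 *  - an SPI must be a distinct number on each vcpu.
 */
static bool pmu_irqs_valid(const int *irqs, size_t nr_vcpus)
{
	enum intid_kind kind;
	size_t i, j;

	if (nr_vcpus == 0)
		return true;

	kind = classify_intid(irqs[0]);
	if (kind == INTID_INVALID)
		return false;

	for (i = 1; i < nr_vcpus; i++) {
		if (classify_intid(irqs[i]) != kind)
			return false;	/* invalid number or mixed types */
		if (kind == INTID_PPI && irqs[i] != irqs[0])
			return false;	/* PPI must match on all vcpus */
		for (j = 0; kind == INTID_SPI && j < i; j++)
			if (irqs[j] == irqs[i])
				return false;	/* SPI reused by two vcpus */
	}
	return true;
}
```

A VMM could run such a check over its planned assignment and only then set
KVM_ARM_VCPU_PMU_V3_IRQ on each vcpu. The kernel remains the authority on what
is accepted.
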

1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
---------------------------------------

:Parameters: no additional parameter in kvm_device_attr.addr

:Returns:

    =======  ========================================================
    -EEXIST  Interrupt number already used
    -ENODEV  PMUv3 not supported or GIC not initialized
    -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
             number not set
    -EBUSY   PMUv3 already initialized
    =======  ========================================================

Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
virtual GIC implementation, this must be done after initializing the in-kernel
irqchip.

1.3 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_FILTER
-----------------------------------------

:Parameters: in kvm_device_attr.addr the address for a PMU event filter is a
             pointer to a struct kvm_pmu_event_filter

:Returns:

    =======  ========================================================
    -ENODEV  PMUv3 not supported or GIC not initialized
    -ENXIO   PMUv3 not properly configured or in-kernel irqchip not
             configured as required prior to calling this attribute
    -EBUSY   PMUv3 already initialized or a VCPU has already run
    -EINVAL  Invalid filter range
    =======  ========================================================

Request the installation of a PMU event filter described as follows::

    struct kvm_pmu_event_filter {
	    __u16	base_event;
	    __u16	nevents;

    #define KVM_PMU_EVENT_ALLOW	0
    #define KVM_PMU_EVENT_DENY	1

	    __u8	action;
	    __u8	pad[3];
    };

A filter range is defined as the range [@base_event, @base_event + @nevents),
together with an @action (KVM_PMU_EVENT_ALLOW or KVM_PMU_EVENT_DENY). The
first registered range defines the global policy (global ALLOW if the first
@action is DENY, global DENY if the first @action is ALLOW). Multiple ranges
can be programmed, and must fit within the event space defined by the PMU
architecture (10 bits on ARMv8.0, 16 bits from ARMv8.1 onwards).

Note: "Cancelling" a filter by registering the opposite action for the same
range doesn't change the default action. For example, installing an ALLOW
filter for event range [0:10) as the first filter and then applying a DENY
action for the same range will leave the whole range as disabled.

Restrictions: Event 0 (SW_INCR) is never filtered, as it doesn't count a
hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
isn't strictly speaking an event. Filtering the cycle counter is possible
using event 0x11 (CPU_CYCLES).

1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
------------------------------------------

:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
             identifier.

:Returns:

    =======  ====================================================
    -EBUSY   PMUv3 already initialized, a VCPU has already run or
             an event filter has already been set
    -EFAULT  Error accessing the PMU identifier
    -ENXIO   PMU not found
    -ENODEV  PMUv3 not supported or GIC not initialized
    -ENOMEM  Could not allocate memory
    =======  ====================================================

Request that the VCPU uses the specified hardware PMU when creating guest events
for the purpose of PMU emulation. The PMU identifier can be read from the "type"
file for the desired PMU instance under /sys/devices (or, equivalently,
/sys/bus/event_source). This attribute is particularly useful on heterogeneous
systems where there are at least two CPU PMUs on the system. The PMU that is set
for one VCPU will be used by all the other VCPUs. It isn't possible to set a PMU
if a PMU event filter is already present.

Note that KVM will not make any attempts to run the VCPU on the physical CPUs
associated with the PMU specified by this attribute. This is entirely left to
userspace.

2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
=================================

:Architectures: ARM, ARM64

2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER
-----------------------------------------------------------------------------

:Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
             pointer to an int

:Returns:

    =======  =================================
    -EINVAL  Invalid timer interrupt number
    -EBUSY   One or more VCPUs has already run
    =======  =================================

A value describing the architected timer interrupt number when connected to an
in-kernel virtual GIC.  These must be a PPI (16 <= intid < 32).  Setting the
attribute overrides the default values (see below).

=============================  ==========================================
KVM_ARM_VCPU_TIMER_IRQ_VTIMER  The EL1 virtual timer intid (default: 27)
KVM_ARM_VCPU_TIMER_IRQ_PTIMER  The EL1 physical timer intid (default: 30)
=============================  ==========================================

Setting the same PPI for different timers will prevent the VCPUs from running.
Setting the interrupt number on a VCPU configures all VCPUs created at that
time to use the number provided for a given timer, overwriting any previously
configured values on other VCPUs.  Userspace should configure the interrupt
numbers on at least one VCPU after creating all VCPUs and before running any
VCPUs.

3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
==================================

:Architectures: ARM64

3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
--------------------------------------

:Parameters: 64-bit base address

:Returns:

    =======  ======================================
    -ENXIO   Stolen time not implemented
    -EEXIST  Base address already set for this VCPU
    -EINVAL  Base address not 64 byte aligned
    =======  ======================================

Specifies the base address of the stolen time structure for this VCPU. The
base address must be 64 byte aligned and exist within a valid guest memory
region. See Documentation/virt/kvm/arm/pvtime.rst for more information
including the layout of the stolen time structure.
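
The alignment requirement can be checked before the attribute is set, which
avoids an -EINVAL round trip. The sketch below is illustrative only: the helper
names are hypothetical, and placing the per-VCPU structures 64 bytes apart is
an assumption made here so that every derived address stays aligned. It does
not check that the address lies within a valid guest memory region.

```c
#include <stdbool.h>
#include <stdint.h>

/* Alignment required for the stolen time base address (from the text). */
#define PVTIME_IPA_ALIGN 64

/* Return true if the base address is 64 byte aligned. */
static bool pvtime_ipa_aligned(uint64_t ipa)
{
	return (ipa & (PVTIME_IPA_ALIGN - 1)) == 0;
}

/*
 * Hypothetical layout helper: give each VCPU its own slot starting at
 * region_base, one alignment unit apart (an assumption of this sketch).
 */
static uint64_t pvtime_ipa_for_vcpu(uint64_t region_base, unsigned int vcpu_idx)
{
	return region_base + (uint64_t)vcpu_idx * PVTIME_IPA_ALIGN;
}
```

If region_base is itself 64 byte aligned, every address returned by
pvtime_ipa_for_vcpu() passes pvtime_ipa_aligned().
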

4. GROUP: KVM_VCPU_TSC_CTRL
===========================

:Architectures: x86

4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET
----------------------------------

:Parameters: 64-bit unsigned TSC offset

:Returns:

    =======  ======================================
    -EFAULT  Error reading/writing the provided
             parameter address.
    -ENXIO   Attribute not supported
    =======  ======================================

Specifies the guest's TSC offset relative to the host's TSC. The guest's
TSC is then derived by the following equation::

  guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET

This attribute is useful to adjust the guest's TSC on live migration,
so that the TSC counts the time during which the VM was paused. The
following describes a possible algorithm to use for this purpose.

From the source VMM process:

1. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_src),
   kvmclock nanoseconds (guest_src), and host CLOCK_REALTIME nanoseconds
   (host_src).

2. Read the KVM_VCPU_TSC_OFFSET attribute for every vCPU to record the
   guest TSC offset (ofs_src[i]).

3. Invoke the KVM_GET_TSC_KHZ ioctl to record the frequency of the
   guest's TSC (freq).

From the destination VMM process:

4. Invoke the KVM_SET_CLOCK ioctl, providing the source nanoseconds from
   kvmclock (guest_src) and CLOCK_REALTIME (host_src) in their respective
   fields.  Ensure that the KVM_CLOCK_REALTIME flag is set in the provided
   structure.

   KVM will advance the VM's kvmclock to account for elapsed time since
   recording the clock values.  Note that this will cause problems in
   the guest (e.g., timeouts) unless CLOCK_REALTIME is synchronized
   between the source and destination, and a reasonably short time passes
   between the source pausing the VMs and the destination executing
   steps 4-7.

5. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_dest) and
   kvmclock nanoseconds (guest_dest).

6. Adjust the guest TSC offsets for every vCPU to account for (1) time
   elapsed since recording state and (2) difference in TSCs between the
   source and destination machine::

       ofs_dst[i] = ofs_src[i] -
           (guest_src - guest_dest) * freq +
           (tsc_src - tsc_dest)

   ("ofs[i] + tsc - guest * freq" is the guest TSC value corresponding to
   a time of 0 in kvmclock.  The above formula ensures that it is the
   same on the destination as it was on the source).

7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
   respective value derived in the previous step.
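
The offset adjustment in step 6 can be sketched in C as follows. This is an
illustration under stated assumptions, not code taken from KVM or any VMM: the
function names are hypothetical, and the inputs are assumed to have been
gathered already in steps 1-3 and 5. The formula multiplies nanoseconds by
freq, while KVM_GET_TSC_KHZ reports the frequency in kHz, so the sketch
converts with cycles = ns * kHz / 10^6. TSC values and offsets are treated as
unsigned 64-bit quantities that wrap.

```c
#include <stdint.h>

/*
 * Convert nanoseconds to TSC cycles for a frequency given in kHz.
 * The computation is split so the intermediate products stay within
 * 64 bits for realistic migration downtimes.
 */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t freq_khz)
{
	return (ns / 1000000) * freq_khz +
	       (ns % 1000000) * freq_khz / 1000000;
}

/*
 * Hypothetical helper implementing step 6:
 *
 *   ofs_dst = ofs_src - (guest_src - guest_dest) * freq
 *                     + (tsc_src - tsc_dest)
 *
 * guest_src/guest_dest are kvmclock nanoseconds, tsc_src/tsc_dest are
 * host TSC values, freq_khz is the guest TSC frequency in kHz.
 */
static uint64_t tsc_offset_for_dest(uint64_t ofs_src,
				    uint64_t guest_src, uint64_t guest_dest,
				    uint64_t tsc_src, uint64_t tsc_dest,
				    uint64_t freq_khz)
{
	/* Difference between the host TSCs, modulo 2^64. */
	uint64_t ofs = ofs_src + (tsc_src - tsc_dest);

	/* Subtract (guest_src - guest_dest) * freq, handling either sign. */
	if (guest_dest >= guest_src)
		ofs += ns_to_cycles(guest_dest - guest_src, freq_khz);
	else
		ofs -= ns_to_cycles(guest_src - guest_dest, freq_khz);

	return ofs;
}
```

A VMM would call such a helper once per vCPU with that vCPU's ofs_src[i] and
write the result back through KVM_VCPU_TSC_OFFSET as described in step 7.
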