.. SPDX-License-Identifier: GPL-2.0

======================
Generic vcpu interface
======================

The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
kvm_device_attr as other devices, but targets VCPU-wide settings and controls.

The groups and attributes per virtual cpu, if any, are architecture specific.

1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
==================================

:Architectures: ARM64

1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
---------------------------------------

:Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
             pointer to an int

Returns:

         =======  ========================================================
         -EBUSY   The PMU overflow interrupt is already set
         -EFAULT  Error reading interrupt number
         -ENXIO   PMUv3 not supported or the overflow interrupt not set
                  when attempting to get it
         -ENODEV  KVM_ARM_VCPU_PMU_V3 feature missing from VCPU
         -EINVAL  Invalid PMU overflow interrupt number supplied or
                  trying to set the IRQ number without using an in-kernel
                  irqchip.
         =======  ========================================================

A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
number for this vcpu. This interrupt can be a PPI or an SPI, but the interrupt
type must be the same for each vcpu. As a PPI, the interrupt number is the same
for all vcpus, while as an SPI it must be a separate number per vcpu.
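
The PPI/SPI rule above can be checked in userspace before the attribute is
programmed. The following is a minimal sketch under stated assumptions, not
kernel code: the helper name pmu_irq_layout_valid() is hypothetical, and the
interrupt ID ranges (PPI 16-31, SPI 32-1019) are taken from the GIC
architecture rather than from this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed GIC interrupt ID ranges; they are not defined by this document. */
enum { PPI_FIRST = 16, PPI_LAST = 31, SPI_FIRST = 32, SPI_LAST = 1019 };

static bool is_ppi(int intid)
{
	return intid >= PPI_FIRST && intid <= PPI_LAST;
}

static bool is_spi(int intid)
{
	return intid >= SPI_FIRST && intid <= SPI_LAST;
}

/*
 * Hypothetical helper: irq[i] is the PMU overflow interrupt planned for
 * vcpu i. Returns true if all entries are the same PPI, or if all entries
 * are SPIs and no two vcpus share a number.
 */
bool pmu_irq_layout_valid(const int *irq, size_t nr_vcpus)
{
	size_t i, j;

	assert(irq != NULL || nr_vcpus == 0);

	if (nr_vcpus == 0)
		return true;

	if (is_ppi(irq[0])) {
		/* PPI: every vcpu must use the same interrupt number. */
		for (i = 1; i < nr_vcpus; i++)
			if (irq[i] != irq[0])
				return false;
		return true;
	}

	/* SPI: every entry must be an SPI and unique across vcpus. */
	for (i = 0; i < nr_vcpus; i++) {
		if (!is_spi(irq[i]))
			return false;
		for (j = 0; j < i; j++)
			if (irq[j] == irq[i])
				return false;
	}
	return true;
}
```

A VMM could run such a check over its planned per-vcpu interrupt numbers and
only then issue KVM_SET_DEVICE_ATTR for each vcpu, so that a bad layout is
reported before any attribute is written.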

1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
---------------------------------------

:Parameters: no additional parameter in kvm_device_attr.addr

Returns:

         =======  ======================================================
         -EEXIST  Interrupt number already used
         -ENODEV  PMUv3 not supported or GIC not initialized
         -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
                  number not set
         -EBUSY   PMUv3 already initialized
         =======  ======================================================

Request the initialization of the PMUv3. If using the PMUv3 with an in-kernel
virtual GIC implementation, this must be done after initializing the in-kernel
irqchip.

1.3 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_FILTER
-----------------------------------------

:Parameters: in kvm_device_attr.addr the address for a PMU event filter is a
             pointer to a struct kvm_pmu_event_filter

:Returns:

         =======  ======================================================
         -ENODEV  PMUv3 not supported or GIC not initialized
         -ENXIO   PMUv3 not properly configured or in-kernel irqchip not
                  configured as required prior to calling this attribute
         -EBUSY   PMUv3 already initialized
         -EINVAL  Invalid filter range
         =======  ======================================================

Request the installation of a PMU event filter described as follows::

    struct kvm_pmu_event_filter {
            __u16   base_event;
            __u16   nevents;

    #define KVM_PMU_EVENT_ALLOW     0
    #define KVM_PMU_EVENT_DENY      1

            __u8    action;
            __u8    pad[3];
    };

A filter range is defined as the range [@base_event, @base_event + @nevents),
together with an @action (KVM_PMU_EVENT_ALLOW or KVM_PMU_EVENT_DENY). The
first registered range defines the global policy (global ALLOW if the first
@action is DENY, global DENY if the first @action is ALLOW).
Multiple ranges can be programmed, and must fit within the event space defined
by the PMU architecture (10 bits on ARMv8.0, 16 bits from ARMv8.1 onwards).

Note: "Cancelling" a filter by registering the opposite action for the same
range doesn't change the default action. For example, installing an ALLOW
filter for event range [0, 10) as the first filter and then applying a DENY
action for the same range will leave the whole range disabled.

Restrictions: Event 0 (SW_INCR) is never filtered, as it doesn't count a
hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
isn't strictly speaking an event. Filtering the cycle counter is possible
using event 0x11 (CPU_CYCLES).


2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
=================================

:Architectures: ARM, ARM64

2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER
-----------------------------------------------------------------------------

:Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
             pointer to an int

Returns:

         =======  ==================================
         -EINVAL  Invalid timer interrupt number
         -EBUSY   One or more VCPUs have already run
         =======  ==================================

A value describing the architected timer interrupt number when connected to an
in-kernel virtual GIC. Each must be a PPI (16 <= intid < 32). Setting the
attribute overrides the default values (see below).

=============================  ==========================================
KVM_ARM_VCPU_TIMER_IRQ_VTIMER  The EL1 virtual timer intid (default: 27)
KVM_ARM_VCPU_TIMER_IRQ_PTIMER  The EL1 physical timer intid (default: 30)
=============================  ==========================================

Setting the same PPI for different timers will prevent the VCPUs from running.
Setting the interrupt number on a VCPU configures all VCPUs created at that
time to use the number provided for a given timer, overwriting any previously
configured values on other VCPUs. Userspace should configure the interrupt
numbers on at least one VCPU after creating all VCPUs and before running any
VCPUs.

3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
==================================

:Architectures: ARM64

3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
--------------------------------------

:Parameters: 64-bit base address

Returns:

         =======  ======================================
         -ENXIO   Stolen time not implemented
         -EEXIST  Base address already set for this VCPU
         -EINVAL  Base address not 64 byte aligned
         =======  ======================================

Specifies the base address of the stolen time structure for this VCPU. The
base address must be 64 byte aligned and exist within a valid guest memory
region. See Documentation/virt/kvm/arm/pvtime.rst for more information
including the layout of the stolen time structure.

4. GROUP: KVM_VCPU_TSC_CTRL
===========================

:Architectures: x86

4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET
----------------------------------

:Parameters: 64-bit unsigned TSC offset

Returns:

         ======= ======================================
         -EFAULT Error reading/writing the provided
                 parameter address.
         -ENXIO  Attribute not supported
         ======= ======================================

Specifies the guest's TSC offset relative to the host's TSC. The guest's
TSC is then derived by the following equation:

  guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET

This attribute is useful to adjust the guest's TSC on live migration,
so that the TSC counts the time during which the VM was paused. The
following describes a possible algorithm to use for this purpose.

From the source VMM process:

1. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_src),
   kvmclock nanoseconds (guest_src), and host CLOCK_REALTIME nanoseconds
   (host_src).

2. Read the KVM_VCPU_TSC_OFFSET attribute for every vCPU to record the
   guest TSC offset (ofs_src[i]).

3. Invoke the KVM_GET_TSC_KHZ ioctl to record the frequency of the
   guest's TSC (freq).

From the destination VMM process:

4. Invoke the KVM_SET_CLOCK ioctl, providing the source nanoseconds from
   kvmclock (guest_src) and CLOCK_REALTIME (host_src) in their respective
   fields. Ensure that the KVM_CLOCK_REALTIME flag is set in the provided
   structure.

   KVM will advance the VM's kvmclock to account for elapsed time since
   recording the clock values. Note that this will cause problems in
   the guest (e.g., timeouts) unless CLOCK_REALTIME is synchronized
   between the source and destination, and a reasonably short time passes
   between the source pausing the VMs and the destination executing
   steps 4-7.

5. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_dest) and
   kvmclock nanoseconds (guest_dest).

6. Adjust the guest TSC offsets for every vCPU to account for (1) time
   elapsed since recording state and (2) difference in TSCs between the
   source and destination machine:

   ofs_dst[i] = ofs_src[i] -
     (guest_src - guest_dest) * freq +
     (tsc_src - tsc_dest)

   ("ofs[i] + tsc - guest * freq" is the guest TSC value corresponding to
   a time of 0 in kvmclock. The above formula ensures that it is the
   same on the destination as it was on the source). Because guest_src
   and guest_dest are in nanoseconds, freq must be applied in TSC ticks
   per nanosecond, i.e. the kHz value from step 3 divided by 1,000,000.

7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
   respective value derived in the previous step.
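
The offset calculation in step 6 can be sketched as a small C helper. This is
an illustration under stated assumptions, not a reference implementation: the
function names are hypothetical, freq is assumed to be the kHz value returned
by KVM_GET_TSC_KHZ (so nanoseconds are converted to ticks as
ns * kHz / 1000000), and the kvmclock values are passed as signed 64-bit
integers so that their difference can be negative.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a signed kvmclock delta in nanoseconds to TSC ticks.
 * Assumption: tsc_khz comes from KVM_GET_TSC_KHZ, so
 * ticks = ns * tsc_khz / 1000000. The delta is split into whole
 * milliseconds and a sub-millisecond remainder to keep the intermediate
 * products small.
 */
static int64_t ns_to_tsc_ticks(int64_t delta_ns, int64_t tsc_khz)
{
	int64_t ms = delta_ns / 1000000;
	int64_t rem_ns = delta_ns % 1000000;

	return ms * tsc_khz + rem_ns * tsc_khz / 1000000;
}

/*
 * Hypothetical helper implementing the step 6 formula for one vCPU:
 *
 *   ofs_dst = ofs_src - (guest_src - guest_dest) * freq
 *                     + (tsc_src - tsc_dest)
 */
uint64_t tsc_offset_for_destination(uint64_t ofs_src,
				    int64_t guest_src, int64_t guest_dest,
				    uint64_t tsc_src, uint64_t tsc_dest,
				    int64_t tsc_khz)
{
	int64_t ticks;

	assert(tsc_khz > 0);

	ticks = ns_to_tsc_ticks(guest_src - guest_dest, tsc_khz);

	/* Unsigned arithmetic wraps modulo 2^64, matching a 64-bit offset. */
	return ofs_src - (uint64_t)ticks + (tsc_src - tsc_dest);
}
```

For example, with a 2 GHz guest TSC (tsc_khz = 2000000) and a destination
kvmclock reading taken 2 ms after the source reading, the elapsed-time term
adds 4,000,000 ticks to the offset, in addition to the tsc_src - tsc_dest
difference between the two hosts.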