.. SPDX-License-Identifier: GPL-2.0

===========================
Hypercall Op-codes (hcalls)
===========================

Overview
=========

Virtualization on 64-bit Power Book3S Platforms is based on the PAPR
specification [1]_ which describes the run-time environment for a guest
operating system and how it should interact with the hypervisor for
privileged operations. Currently there are two PAPR compliant hypervisors:

- **IBM PowerVM (PHYP)**: IBM's proprietary hypervisor that supports AIX,
  IBM-i and Linux as guests (termed Logical Partitions or LPARs). It
  supports the full PAPR specification.

- **Qemu/KVM**: Supports PPC64 linux guests running on a PPC64 linux host.
  It implements only a subset of the PAPR specification, called LoPAPR [2]_.

On PPC64 arch a guest kernel running on top of a PAPR hypervisor is called
a *pSeries guest*. A pseries guest runs in supervisor mode (HV=0) and must
issue hypercalls to the hypervisor whenever it needs to perform an action
that is hypervisor privileged [3]_ or for other services managed by the
hypervisor.

Hence a Hypercall (hcall) is essentially a request by the pseries guest
asking the hypervisor to perform a privileged operation on behalf of the
guest. The guest issues the hcall with the necessary input operands.
The hypervisor, after performing the privileged operation, returns a status
code and output operands back to the guest.

HCALL ABI
=========
The ABI specification for a hcall between a pseries guest and PAPR hypervisor
is covered in section 14.5.3 of ref [2]_. The switch to the hypervisor context
is done via the instruction **HVCS**, which expects the opcode for the hcall to
be set in *r3* and any in-arguments for the hcall to be provided in registers
*r4-r12*. If values have to be passed through a memory buffer, the data stored
in that buffer should be in Big-endian byte order.

Once control returns to the guest after the hypervisor has serviced the
'HVCS' instruction, the return value of the hcall is available in *r3* and any
out values are returned in registers *r4-r12*. As with in-arguments, any out
values stored in a memory buffer will be in Big-endian byte order.

Powerpc arch code provides convenient wrappers named **plpar_hcall_xxx**,
defined in an arch-specific header [4]_, to issue hcalls from the linux kernel
running as a pseries guest.

Register Conventions
====================

Any hcall should follow the same register convention as described in section
2.2.1.1 of "64-Bit ELF V2 ABI Specification: Power Architecture" [5]_.
The table below summarizes these conventions:

+----------+----------+-------------------------------------------+
| Register |Volatile  |  Purpose                                  |
| Range    |(Y/N)     |                                           |
+==========+==========+===========================================+
|   r0     |    Y     |  Optional-usage                           |
+----------+----------+-------------------------------------------+
|   r1     |    N     |  Stack Pointer                            |
+----------+----------+-------------------------------------------+
|   r2     |    N     |  TOC                                      |
+----------+----------+-------------------------------------------+
|   r3     |    Y     |  hcall opcode/return value                |
+----------+----------+-------------------------------------------+
|  r4-r10  |    Y     |  in and out values                        |
+----------+----------+-------------------------------------------+
|   r11    |    Y     |  Optional-usage/Environmental pointer     |
+----------+----------+-------------------------------------------+
|   r12    |    Y     |  Optional-usage/Function entry address at |
|          |          |  global entry point                       |
+----------+----------+-------------------------------------------+
|   r13    |    N     |  Thread-Pointer                           |
+----------+----------+-------------------------------------------+
| r14-r31  |    N     |  Local Variables                          |
+----------+----------+-------------------------------------------+
|   LR     |    Y     |  Link Register                            |
+----------+----------+-------------------------------------------+
|   CTR    |    Y     |  Loop Counter                             |
+----------+----------+-------------------------------------------+
|   XER    |    Y     |  Fixed-point exception register.          |
+----------+----------+-------------------------------------------+
|  CR0-1   |    Y     |  Condition register fields.               |
+----------+----------+-------------------------------------------+
|  CR2-4   |    N     |  Condition register fields.               |
+----------+----------+-------------------------------------------+
|  CR5-7   |    Y     |  Condition register fields.               |
+----------+----------+-------------------------------------------+
|  Others  |    N     |                                           |
+----------+----------+-------------------------------------------+

DRC & DRC Indexes
=================
::

     DR1                                  Guest
     +--+        +------------+         +---------+
     |  | <----> |            |         |  User   |
     +--+  DRC1  |            |   DRC   |  Space  |
                 |    PAPR    |  Index  +---------+
     DR2         | Hypervisor |         |         |
     +--+        |            | <-----> |  Kernel |
     |  | <----> |            |  Hcall  |         |
     +--+  DRC2  +------------+         +---------+

The PAPR hypervisor refers to shared hardware resources, such as PCI devices
and NVDIMMs, that are available for use by LPARs as Dynamic Resources (DR).
When a DR is allocated to an LPAR, PHYP creates a data-structure called
Dynamic Resource Connector (DRC) to manage LPAR access. An LPAR refers to a
DRC via an opaque 32-bit number called DRC-Index.
The DRC-index value is provided to the LPAR via the device-tree, where it is
present as an attribute in the device tree node associated with the DR.

HCALL Return-values
===================

After servicing the hcall, the hypervisor sets the return-value in *r3*,
indicating success or failure of the hcall. In case of a failure, an error
code indicates the cause of the error. These codes are defined and documented
in the arch-specific header [4]_.

In some cases a hcall can take a long time and needs to be issued multiple
times in order to be completely serviced. These hcalls will usually accept an
opaque value *continue-token* within their argument list, and a return value
of *H_CONTINUE* indicates that the hypervisor has not yet finished servicing
the hcall.

To make such hcalls the guest needs to set *continue-token == 0* for the
initial call and use the hypervisor-returned value of *continue-token*
for each subsequent hcall until the hypervisor returns a non *H_CONTINUE*
return value.

HCALL Op-codes
==============

Below is a partial list of HCALLs that are supported by PHYP.
For the corresponding opcode values please look into the arch-specific
header [4]_:

**H_SCM_READ_METADATA**

| Input: *drcIndex, offset, buffer-address, numBytesToRead*
| Out: *numBytesRead*
| Return Value: *H_Success, H_Parameter, H_P2, H_P3, H_Hardware*

Given a DRC Index of an NVDIMM, read N-bytes from the metadata area
associated with it, at a specified offset, and copy them to the provided
buffer. The metadata area stores configuration information such as label
information, bad-blocks etc. The metadata area is located out-of-band of the
NVDIMM storage area, hence separate access semantics are provided.

**H_SCM_WRITE_METADATA**

| Input: *drcIndex, offset, data, numBytesToWrite*
| Out: *None*
| Return Value: *H_Success, H_Parameter, H_P2, H_P4, H_Hardware*

Given a DRC Index of an NVDIMM, write N-bytes to the metadata area
associated with it, at the specified offset and from the provided buffer.
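
The Big-endian rule from the HCALL ABI section applies to any multi-byte
value exchanged with the hypervisor through a memory buffer. The following is
a minimal user-space sketch of that byte order, not kernel code: the helper
names ``put_be64`` and ``get_be64`` are hypothetical and exist only for this
illustration.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical helper: store a 64-bit value in Big-endian byte order. */
    static void put_be64(uint8_t *buf, uint64_t val)
    {
        for (int i = 0; i < 8; i++)
            buf[i] = (uint8_t)(val >> (56 - 8 * i));
    }

    /* Hypothetical helper: load a 64-bit Big-endian value from a buffer. */
    static uint64_t get_be64(const uint8_t *buf)
    {
        uint64_t val = 0;

        for (int i = 0; i < 8; i++)
            val = (val << 8) | buf[i];
        return val;
    }

    int main(void)
    {
        uint8_t buf[8];

        put_be64(buf, 0x0102030405060708ULL);
        /* The most significant byte comes first on any host endianness. */
        assert(buf[0] == 0x01 && buf[7] == 0x08);
        assert(get_be64(buf) == 0x0102030405060708ULL);
        printf("first byte %02x, last byte %02x\n",
               (unsigned int)buf[0], (unsigned int)buf[7]);
        return 0;
    }

Because the helpers work byte by byte, they behave the same on Big-endian and
Little-endian guests.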

**H_SCM_BIND_MEM**

| Input: *drcIndex, startingScmBlockIndex, numScmBlocksToBind,*
|        *targetLogicalMemoryAddress, continue-token*
| Out: *continue-token, targetLogicalMemoryAddress, numScmBlocksToBound*
| Return Value: *H_Success, H_Parameter, H_P2, H_P3, H_P4, H_Overlap,*
|               *H_Too_Big, H_P5, H_Busy*

Given a DRC-Index of an NVDIMM, map a contiguous range of SCM blocks
*(startingScmBlockIndex, startingScmBlockIndex+numScmBlocksToBind)* to the guest
at *targetLogicalMemoryAddress* within the guest physical address space. If
*targetLogicalMemoryAddress == 0xFFFFFFFF_FFFFFFFF*, the hypervisor
assigns a target address to the guest. The HCALL can fail if the guest has
an active PTE entry to the SCM block being bound.

**H_SCM_UNBIND_MEM**

| Input: *drcIndex, startingScmLogicalMemoryAddress, numScmBlocksToUnbind*
| Out: *numScmBlocksUnbound*
| Return Value: *H_Success, H_Parameter, H_P2, H_P3, H_In_Use, H_Overlap,*
|               *H_Busy, H_LongBusyOrder1mSec, H_LongBusyOrder10mSec*

Given a DRC-Index of an NVDIMM, unmap *numScmBlocksToUnbind* SCM blocks starting
at *startingScmLogicalMemoryAddress* from the guest physical address space. The
HCALL can fail if the guest has an active PTE entry to the SCM block being
unbound.
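
H_SCM_BIND_MEM takes a *continue-token*, so it follows the reissue protocol
described under HCALL Return-values. The sketch below illustrates that loop
in user space against a mock hypervisor. Everything in it is an assumption
made for illustration: ``mock_bind_mem``, ``guest_bind``, the status
enumerators and the four-blocks-per-call limit are hypothetical, and the real
status values are the ones defined in the arch-specific header.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Placeholder status codes, not the values from the arch header. */
    enum mock_status { MOCK_H_SUCCESS, MOCK_H_CONTINUE };

    /* Toy limit: the mock binds at most this many blocks per call. */
    #define MOCK_BLOCKS_PER_CALL 4

    /*
     * Mock of a long-running hcall. The token is opaque to the guest;
     * this mock happens to keep the count of blocks already bound in it.
     */
    static enum mock_status mock_bind_mem(uint64_t nblocks, uint64_t *token,
                                          uint64_t *bound)
    {
        uint64_t done = *token; /* 0 on the initial call */
        uint64_t left = nblocks - done;
        uint64_t step = left < MOCK_BLOCKS_PER_CALL ? left : MOCK_BLOCKS_PER_CALL;

        done += step;
        *bound = done;
        if (done < nblocks) {
            *token = done;
            return MOCK_H_CONTINUE;
        }
        *token = 0;
        return MOCK_H_SUCCESS;
    }

    /* Guest side: start with token 0, reissue while told to continue. */
    static enum mock_status guest_bind(uint64_t nblocks, uint64_t *bound,
                                       unsigned int *calls)
    {
        uint64_t token = 0;
        enum mock_status rc;

        *calls = 0;
        do {
            rc = mock_bind_mem(nblocks, &token, bound);
            (*calls)++;
        } while (rc == MOCK_H_CONTINUE);
        return rc;
    }

    int main(void)
    {
        uint64_t bound;
        unsigned int calls;
        enum mock_status rc = guest_bind(10, &bound, &calls);

        assert(rc == MOCK_H_SUCCESS && bound == 10);
        printf("bound %llu blocks in %u calls\n",
               (unsigned long long)bound, calls);
        return 0;
    }

The guest never interprets the token; it only passes back whatever the
previous call returned, which is the property the protocol relies on.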

**H_SCM_QUERY_BLOCK_MEM_BINDING**

| Input: *drcIndex, scmBlockIndex*
| Out: *Guest-Physical-Address*
| Return Value: *H_Success, H_Parameter, H_P2, H_NotFound*

Given a DRC-Index and an SCM Block index, return the guest physical address to
which the SCM block is mapped.

**H_SCM_QUERY_LOGICAL_MEM_BINDING**

| Input: *Guest-Physical-Address*
| Out: *drcIndex, scmBlockIndex*
| Return Value: *H_Success, H_Parameter, H_P2, H_NotFound*

Given a guest physical address, return the DRC Index and SCM block that are
mapped to that address.

**H_SCM_UNBIND_ALL**

| Input: *scmTargetScope, drcIndex*
| Out: *None*
| Return Value: *H_Success, H_Parameter, H_P2, H_P3, H_In_Use, H_Busy,*
|               *H_LongBusyOrder1mSec, H_LongBusyOrder10mSec*

Depending on the target scope, unmap from the LPAR memory either all SCM
blocks belonging to all NVDIMMs or all SCM blocks belonging to a single
NVDIMM identified by its drcIndex.
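
The two query hcalls described above answer inverse questions:
H_SCM_QUERY_BLOCK_MEM_BINDING goes from *(drcIndex, scmBlockIndex)* to a guest
physical address, and H_SCM_QUERY_LOGICAL_MEM_BINDING goes back. The
user-space sketch below models that relationship with a small mock table. The
block size, the table contents and the names ``query_block`` and
``query_logical`` are all hypothetical; how the hypervisor tracks bindings
internally is not described in this document.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical block size, chosen only for illustration. */
    #define MOCK_BLOCK_SIZE 0x1000000ULL

    /* Blocks [first_block, first_block + nblocks) mapped at guest_addr. */
    struct mock_binding {
        uint32_t drc_index;
        uint64_t first_block;
        uint64_t nblocks;
        uint64_t guest_addr;
    };

    /* Made-up bindings for two made-up DRC indexes. */
    static const struct mock_binding bindings[] = {
        { 0x44100001, 0, 4, 0x40000000ULL },
        { 0x44100002, 2, 2, 0x80000000ULL },
    };
    #define NBINDINGS (sizeof(bindings) / sizeof(bindings[0]))

    /* (drc, block) -> guest address; 0 on success, -1 if not bound. */
    static int query_block(uint32_t drc, uint64_t block, uint64_t *addr)
    {
        for (size_t i = 0; i < NBINDINGS; i++) {
            const struct mock_binding *b = &bindings[i];

            if (b->drc_index == drc && block >= b->first_block &&
                block - b->first_block < b->nblocks) {
                *addr = b->guest_addr +
                        (block - b->first_block) * MOCK_BLOCK_SIZE;
                return 0;
            }
        }
        return -1;
    }

    /* guest address -> (drc, block); 0 on success, -1 if not bound. */
    static int query_logical(uint64_t addr, uint32_t *drc, uint64_t *block)
    {
        for (size_t i = 0; i < NBINDINGS; i++) {
            const struct mock_binding *b = &bindings[i];

            if (addr >= b->guest_addr &&
                addr - b->guest_addr < b->nblocks * MOCK_BLOCK_SIZE) {
                *drc = b->drc_index;
                *block = b->first_block +
                         (addr - b->guest_addr) / MOCK_BLOCK_SIZE;
                return 0;
            }
        }
        return -1;
    }

    int main(void)
    {
        uint64_t addr, block;
        uint32_t drc;

        /* A forward lookup followed by a reverse lookup round-trips. */
        assert(query_block(0x44100001, 3, &addr) == 0);
        assert(query_logical(addr, &drc, &block) == 0);
        assert(drc == 0x44100001 && block == 3);
        printf("block %llu is mapped at 0x%llx\n",
               (unsigned long long)block, (unsigned long long)addr);
        return 0;
    }

A lookup that finds no binding returns -1 in this sketch, which plays the
role of the *H_NotFound* return value listed for both hcalls.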

**H_SCM_HEALTH**

| Input: *drcIndex*
| Out: *health-bitmap, health-bit-valid-bitmap*
| Return Value: *H_Success, H_Parameter, H_Hardware*

Given a DRC Index, return information on predictive failure and overall
health of the NVDIMM. Each asserted bit in the health-bitmap indicates a
single predictive failure, and the health-bit-valid-bitmap indicates which
bits in the health-bitmap are valid.

**H_SCM_PERFORMANCE_STATS**

| Input: *drcIndex, resultBuffer Addr*
| Out: *None*
| Return Value: *H_Success, H_Parameter, H_Unsupported, H_Hardware, H_Authority, H_Privilege*

Given a DRC Index, collect the performance statistics for the NVDIMM and copy
them to the resultBuffer.

References
==========
.. [1] "Power Architecture Platform Reference"
       https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
.. [2] "Linux on Power Architecture Platform Reference"
       https://members.openpowerfoundation.org/document/dl/469
.. [3] "Definitions and Notation" Book III-Section 14.5.3
       https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
.. [4] arch/powerpc/include/asm/hvcall.h
.. [5] "64-Bit ELF V2 ABI Specification: Power Architecture"
       https://openpowerfoundation.org/?resource_lib=64-bit-elf-v2-abi-specification-power-architecture