Skip to content
api.rst 252 KiB
Newer Older
Scott Wood's avatar
Scott Wood committed
	__u32 num_dirty;
Scott Wood's avatar
Scott Wood committed

This must be called whenever userspace has changed an entry in the shared
TLB, prior to calling KVM_RUN on the associated vcpu.

The "bitmap" field is the userspace address of an array.  This array
consists of a number of bits, equal to the total number of TLB entries as
determined by the last successful call to KVM_CONFIG_TLB, rounded up to the
nearest multiple of 64.

Each bit corresponds to one TLB entry, ordered the same as in the shared TLB
array.

The array is little-endian: the bit 0 is the least significant bit of the
first byte, bit 8 is the least significant bit of the second byte, etc.
This avoids any complications with differing word sizes.

The "num_dirty" field is a performance hint for KVM to determine whether it
should skip processing the bitmap and just invalidate everything.  It must
be set to the number of set bits in the bitmap.

-------------------------
:Capability: KVM_CAP_SPAPR_TCE
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_create_spapr_tce (in)
:Returns: file descriptor for manipulating the created TCE table

This creates a virtual TCE (translation control entry) table, which
is an IOMMU for PAPR-style virtual I/O.  It is used to translate
logical addresses used in virtual I/O into guest physical addresses,
and provides a scatter/gather capability for PAPR virtual I/O.

::

  /* for KVM_CAP_SPAPR_TCE */
  struct kvm_create_spapr_tce {

The liobn field gives the logical IO bus number for which to create a
TCE table.  The window_size field specifies the size of the DMA window
which this TCE table will translate - the table will contain one 64
bit TCE entry for every 4kiB of the DMA window.

When the guest issues an H_PUT_TCE hcall on a liobn for which a TCE
table has been created using this ioctl(), the kernel will handle it
in real mode, updating the TCE table.  H_PUT_TCE calls for other
liobns will cause a vm exit and must be handled by userspace.

The return value is a file descriptor which can be passed to mmap(2)
to map the created TCE table into userspace.  This lets userspace read
the entries written by kernel-handled H_PUT_TCE calls, and also lets
userspace update the TCE table directly which is useful in some
circumstances.

---------------------
:Capability: KVM_CAP_PPC_RMA
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_allocate_rma (out)
:Returns: file descriptor for mapping the allocated RMA

This allocates a Real Mode Area (RMA) from the pool allocated at boot
time by the kernel.  An RMA is a physically-contiguous, aligned region
of memory used on older POWER processors to provide the memory which
will be accessed by real-mode (MMU off) accesses in a KVM guest.
POWER processors support a set of sizes for the RMA that usually
includes 64MB, 128MB, 256MB and some larger powers of two.

::

  /* for KVM_ALLOCATE_RMA */
  struct kvm_allocate_rma {

The return value is a file descriptor which can be passed to mmap(2)
to map the allocated RMA into userspace.  The mapped area can then be
passed to the KVM_SET_USER_MEMORY_REGION ioctl to establish it as the
RMA for a virtual machine.  The size of the RMA in bytes (which is
fixed at host kernel boot time) is returned in the rma_size field of
the argument structure.

The KVM_CAP_PPC_RMA capability is 1 or 2 if the KVM_ALLOCATE_RMA ioctl
is supported; 2 if the processor requires all virtual machines to have
an RMA, or 1 if the processor can use an RMA but doesn't require it,
because it supports the Virtual RMA (VRMA) facility.

Avi Kivity's avatar
Avi Kivity committed
4.64 KVM_NMI
Avi Kivity's avatar
Avi Kivity committed

:Capability: KVM_CAP_USER_NMI
:Architectures: x86
:Type: vcpu ioctl
:Parameters: none
:Returns: 0 on success, -1 on error
Avi Kivity's avatar
Avi Kivity committed

Queues an NMI on the thread's vcpu.  Note this is well defined only
when KVM_CREATE_IRQCHIP has not been called, since this is an interface
between the virtual cpu core and virtual local APIC.  After KVM_CREATE_IRQCHIP
has been called, this interface is completely emulated within the kernel.

To use this to emulate the LINT1 input with KVM_CREATE_IRQCHIP, use the
following algorithm:

  - pause the vcpu
Avi Kivity's avatar
Avi Kivity committed
  - read the local APIC's state (KVM_GET_LAPIC)
  - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1)
  - if so, issue KVM_NMI
  - resume the vcpu

Some guests configure the LINT1 NMI input to cause a panic, aiding in
debugging.

4.65 KVM_S390_UCAS_MAP
----------------------
:Capability: KVM_CAP_S390_UCONTROL
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_ucas_mapping (in)
:Returns: 0 in case of success

The parameter is defined like this::

	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl maps the memory at "user_addr" with the length "length" to
the vcpu's address space starting at "vcpu_addr". All parameters need to
be aligned by 1 megabyte.
4.66 KVM_S390_UCAS_UNMAP
------------------------
:Capability: KVM_CAP_S390_UCONTROL
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_ucas_mapping (in)
:Returns: 0 in case of success

The parameter is defined like this::

	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl unmaps the memory in the vcpu's address space starting at
"vcpu_addr" with the length "length". The field "user_addr" is ignored.
All parameters need to be aligned by 1 megabyte.
4.67 KVM_S390_VCPU_FAULT
------------------------
:Capability: KVM_CAP_S390_UCONTROL
:Architectures: s390
:Type: vcpu ioctl
:Parameters: vcpu absolute address (in)
:Returns: 0 in case of success

This call creates a page table entry on the virtual cpu's address space
(for user controlled virtual machines) or the virtual machine's address
space (for regular virtual machines). This only works for minor faults,
thus it's recommended to access subject memory page via the user page
table upfront. This is useful to handle validity intercepts for user
controlled virtual machines to fault in the virtual cpu's lowcore pages
prior to calling the KVM_RUN ioctl.

4.68 KVM_SET_ONE_REG
--------------------

:Capability: KVM_CAP_ONE_REG
:Architectures: all
:Type: vcpu ioctl
:Parameters: struct kvm_one_reg (in)
:Returns: 0 on success, negative value on failure

  ======   ============================================================
  ENOENT   no such register
  EINVAL   invalid register ID, or no such register or used with VMs in
           protected virtualization mode on s390
  EPERM    (arm64) register access not allowed before vcpu finalization
  ======   ============================================================

(These error codes are indicative only: do not rely on a specific error
code being returned in a specific situation.)

Using this ioctl, a single vcpu register can be set to a specific value
defined by user space with the passed in struct kvm_one_reg, where id
refers to the register identifier as described below and addr is a pointer
to a variable with the respective size. There can be architecture agnostic
and architecture specific registers. Each have their own range of operation
and their own constants and width. To keep track of the implemented
registers, find a list below:

  ======= =============================== ============
  Arch              Register              Width (bits)
  ======= =============================== ============
  PPC     KVM_REG_PPC_HIOR                64
  PPC     KVM_REG_PPC_IAC1                64
  PPC     KVM_REG_PPC_IAC2                64
  PPC     KVM_REG_PPC_IAC3                64
  PPC     KVM_REG_PPC_IAC4                64
  PPC     KVM_REG_PPC_DAC1                64
  PPC     KVM_REG_PPC_DAC2                64
  PPC     KVM_REG_PPC_DABR                64
  PPC     KVM_REG_PPC_DSCR                64
  PPC     KVM_REG_PPC_PURR                64
  PPC     KVM_REG_PPC_SPURR               64
  PPC     KVM_REG_PPC_DAR                 64
  PPC     KVM_REG_PPC_DSISR               32
  PPC     KVM_REG_PPC_AMR                 64
  PPC     KVM_REG_PPC_UAMOR               64
  PPC     KVM_REG_PPC_MMCR0               64
  PPC     KVM_REG_PPC_MMCR1               64
  PPC     KVM_REG_PPC_MMCRA               64
  PPC     KVM_REG_PPC_MMCR2               64
  PPC     KVM_REG_PPC_MMCRS               64
  PPC     KVM_REG_PPC_MMCR3               64
  PPC     KVM_REG_PPC_SIAR                64
  PPC     KVM_REG_PPC_SDAR                64
  PPC     KVM_REG_PPC_SIER                64
  PPC     KVM_REG_PPC_SIER2               64
  PPC     KVM_REG_PPC_SIER3               64
  PPC     KVM_REG_PPC_PMC1                32
  PPC     KVM_REG_PPC_PMC2                32
  PPC     KVM_REG_PPC_PMC3                32
  PPC     KVM_REG_PPC_PMC4                32
  PPC     KVM_REG_PPC_PMC5                32
  PPC     KVM_REG_PPC_PMC6                32
  PPC     KVM_REG_PPC_PMC7                32
  PPC     KVM_REG_PPC_PMC8                32
  PPC     KVM_REG_PPC_FPR0                64
  ...
  PPC     KVM_REG_PPC_FPR31               64
  PPC     KVM_REG_PPC_VR0                 128
  ...
  PPC     KVM_REG_PPC_VR31                128
  PPC     KVM_REG_PPC_VSR0                128
  ...
  PPC     KVM_REG_PPC_VSR31               128
  PPC     KVM_REG_PPC_FPSCR               64
  PPC     KVM_REG_PPC_VSCR                32
  PPC     KVM_REG_PPC_VPA_ADDR            64
  PPC     KVM_REG_PPC_VPA_SLB             128
  PPC     KVM_REG_PPC_VPA_DTL             128
  PPC     KVM_REG_PPC_EPCR                32
  PPC     KVM_REG_PPC_EPR                 32
  PPC     KVM_REG_PPC_TCR                 32
  PPC     KVM_REG_PPC_TSR                 32
  PPC     KVM_REG_PPC_OR_TSR              32
  PPC     KVM_REG_PPC_CLEAR_TSR           32
  PPC     KVM_REG_PPC_MAS0                32
  PPC     KVM_REG_PPC_MAS1                32
  PPC     KVM_REG_PPC_MAS2                64
  PPC     KVM_REG_PPC_MAS7_3              64
  PPC     KVM_REG_PPC_MAS4                32
  PPC     KVM_REG_PPC_MAS6                32
  PPC     KVM_REG_PPC_MMUCFG              32
  PPC     KVM_REG_PPC_TLB0CFG             32
  PPC     KVM_REG_PPC_TLB1CFG             32
  PPC     KVM_REG_PPC_TLB2CFG             32
  PPC     KVM_REG_PPC_TLB3CFG             32
  PPC     KVM_REG_PPC_TLB0PS              32
  PPC     KVM_REG_PPC_TLB1PS              32
  PPC     KVM_REG_PPC_TLB2PS              32
  PPC     KVM_REG_PPC_TLB3PS              32
  PPC     KVM_REG_PPC_EPTCFG              32
  PPC     KVM_REG_PPC_ICP_STATE           64
  PPC     KVM_REG_PPC_VP_STATE            128
  PPC     KVM_REG_PPC_TB_OFFSET           64
  PPC     KVM_REG_PPC_SPMC1               32
  PPC     KVM_REG_PPC_SPMC2               32
  PPC     KVM_REG_PPC_IAMR                64
  PPC     KVM_REG_PPC_TFHAR               64
  PPC     KVM_REG_PPC_TFIAR               64
  PPC     KVM_REG_PPC_TEXASR              64
  PPC     KVM_REG_PPC_FSCR                64
  PPC     KVM_REG_PPC_PSPB                32
  PPC     KVM_REG_PPC_EBBHR               64
  PPC     KVM_REG_PPC_EBBRR               64
  PPC     KVM_REG_PPC_BESCR               64
  PPC     KVM_REG_PPC_TAR                 64
  PPC     KVM_REG_PPC_DPDES               64
  PPC     KVM_REG_PPC_DAWR                64
  PPC     KVM_REG_PPC_DAWRX               64
  PPC     KVM_REG_PPC_CIABR               64
  PPC     KVM_REG_PPC_IC                  64
  PPC     KVM_REG_PPC_VTB                 64
  PPC     KVM_REG_PPC_CSIGR               64
  PPC     KVM_REG_PPC_TACR                64
  PPC     KVM_REG_PPC_TCSCR               64
  PPC     KVM_REG_PPC_PID                 64
  PPC     KVM_REG_PPC_ACOP                64
  PPC     KVM_REG_PPC_VRSAVE              32
  PPC     KVM_REG_PPC_LPCR                32
  PPC     KVM_REG_PPC_LPCR_64             64
  PPC     KVM_REG_PPC_PPR                 64
  PPC     KVM_REG_PPC_ARCH_COMPAT         32
  PPC     KVM_REG_PPC_DABRX               32
  PPC     KVM_REG_PPC_WORT                64
  PPC	  KVM_REG_PPC_SPRG9               64
  PPC	  KVM_REG_PPC_DBSR                32
  PPC     KVM_REG_PPC_TIDR                64
  PPC     KVM_REG_PPC_PSSCR               64
  PPC     KVM_REG_PPC_DEC_EXPIRY          64
  PPC     KVM_REG_PPC_PTCR                64
  PPC     KVM_REG_PPC_DAWR1               64
  PPC     KVM_REG_PPC_DAWRX1              64
  PPC     KVM_REG_PPC_TM_GPR0             64
  ...
  PPC     KVM_REG_PPC_TM_GPR31            64
  PPC     KVM_REG_PPC_TM_VSR0             128
  ...
  PPC     KVM_REG_PPC_TM_VSR63            128
  PPC     KVM_REG_PPC_TM_CR               64
  PPC     KVM_REG_PPC_TM_LR               64
  PPC     KVM_REG_PPC_TM_CTR              64
  PPC     KVM_REG_PPC_TM_FPSCR            64
  PPC     KVM_REG_PPC_TM_AMR              64
  PPC     KVM_REG_PPC_TM_PPR              64
  PPC     KVM_REG_PPC_TM_VRSAVE           64
  PPC     KVM_REG_PPC_TM_VSCR             32
  PPC     KVM_REG_PPC_TM_DSCR             64
  PPC     KVM_REG_PPC_TM_TAR              64
  PPC     KVM_REG_PPC_TM_XER              64

  MIPS    KVM_REG_MIPS_R0                 64
  ...
  MIPS    KVM_REG_MIPS_R31                64
  MIPS    KVM_REG_MIPS_HI                 64
  MIPS    KVM_REG_MIPS_LO                 64
  MIPS    KVM_REG_MIPS_PC                 64
  MIPS    KVM_REG_MIPS_CP0_INDEX          32
  MIPS    KVM_REG_MIPS_CP0_ENTRYLO0       64
  MIPS    KVM_REG_MIPS_CP0_ENTRYLO1       64
  MIPS    KVM_REG_MIPS_CP0_CONTEXT        64
  MIPS    KVM_REG_MIPS_CP0_CONTEXTCONFIG  32
  MIPS    KVM_REG_MIPS_CP0_USERLOCAL      64
  MIPS    KVM_REG_MIPS_CP0_XCONTEXTCONFIG 64
  MIPS    KVM_REG_MIPS_CP0_PAGEMASK       32
  MIPS    KVM_REG_MIPS_CP0_PAGEGRAIN      32
  MIPS    KVM_REG_MIPS_CP0_SEGCTL0        64
  MIPS    KVM_REG_MIPS_CP0_SEGCTL1        64
  MIPS    KVM_REG_MIPS_CP0_SEGCTL2        64
  MIPS    KVM_REG_MIPS_CP0_PWBASE         64
  MIPS    KVM_REG_MIPS_CP0_PWFIELD        64
  MIPS    KVM_REG_MIPS_CP0_PWSIZE         64
  MIPS    KVM_REG_MIPS_CP0_WIRED          32
  MIPS    KVM_REG_MIPS_CP0_PWCTL          32
  MIPS    KVM_REG_MIPS_CP0_HWRENA         32
  MIPS    KVM_REG_MIPS_CP0_BADVADDR       64
  MIPS    KVM_REG_MIPS_CP0_BADINSTR       32
  MIPS    KVM_REG_MIPS_CP0_BADINSTRP      32
  MIPS    KVM_REG_MIPS_CP0_COUNT          32
  MIPS    KVM_REG_MIPS_CP0_ENTRYHI        64
  MIPS    KVM_REG_MIPS_CP0_COMPARE        32
  MIPS    KVM_REG_MIPS_CP0_STATUS         32
  MIPS    KVM_REG_MIPS_CP0_INTCTL         32
  MIPS    KVM_REG_MIPS_CP0_CAUSE          32
  MIPS    KVM_REG_MIPS_CP0_EPC            64
  MIPS    KVM_REG_MIPS_CP0_PRID           32
  MIPS    KVM_REG_MIPS_CP0_EBASE          64
  MIPS    KVM_REG_MIPS_CP0_CONFIG         32
  MIPS    KVM_REG_MIPS_CP0_CONFIG1        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG2        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG3        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG4        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG5        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG7        32
  MIPS    KVM_REG_MIPS_CP0_XCONTEXT       64
  MIPS    KVM_REG_MIPS_CP0_ERROREPC       64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH1      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH2      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH3      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH4      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH5      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH6      64
  MIPS    KVM_REG_MIPS_CP0_MAAR(0..63)    64
  MIPS    KVM_REG_MIPS_COUNT_CTL          64
  MIPS    KVM_REG_MIPS_COUNT_RESUME       64
  MIPS    KVM_REG_MIPS_COUNT_HZ           64
  MIPS    KVM_REG_MIPS_FPR_32(0..31)      32
  MIPS    KVM_REG_MIPS_FPR_64(0..31)      64
  MIPS    KVM_REG_MIPS_VEC_128(0..31)     128
  MIPS    KVM_REG_MIPS_FCR_IR             32
  MIPS    KVM_REG_MIPS_FCR_CSR            32
  MIPS    KVM_REG_MIPS_MSA_IR             32
  MIPS    KVM_REG_MIPS_MSA_CSR            32
  ======= =============================== ============
ARM registers are mapped using the lower 32 bits.  The upper 16 of that
is the register group type, or coprocessor number:

ARM core registers have the following id bit patterns::

  0x4020 0000 0010 <index into the kvm_regs struct:16>
ARM 32-bit CP15 registers have the following id bit patterns::

  0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
ARM 64-bit CP15 registers have the following id bit patterns::

  0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
ARM CCSIDR registers are demultiplexed by CSSELR value::

  0x4020 0000 0011 00 <csselr:8>
ARM 32-bit VFP control registers have the following id bit patterns::

  0x4020 0000 0012 1 <regno:12>
ARM 64-bit FP registers have the following id bit patterns::

  0x4030 0000 0012 0 <regno:12>
ARM firmware pseudo-registers have the following bit pattern::

  0x4030 0000 0014 <regno:16>


arm64 registers are mapped using the lower 32 bits. The upper 16 of
that is the register group type, or coprocessor number:

arm64 core/FP-SIMD registers have the following id bit patterns. Note
that the size of the access is variable, as the kvm_regs structure
contains elements ranging from 32 to 128 bits. The index is a 32bit
value in the kvm_regs structure seen as a 32bit array::

  0x60x0 0000 0010 <index into the kvm_regs struct:16>


======================= ========= ===== =======================================
    Encoding            Register  Bits  kvm_regs member
======================= ========= ===== =======================================
  0x6030 0000 0010 0000 X0          64  regs.regs[0]
  0x6030 0000 0010 0002 X1          64  regs.regs[1]
  0x6030 0000 0010 003c X30         64  regs.regs[30]
  0x6030 0000 0010 003e SP          64  regs.sp
  0x6030 0000 0010 0040 PC          64  regs.pc
  0x6030 0000 0010 0042 PSTATE      64  regs.pstate
  0x6030 0000 0010 0044 SP_EL1      64  sp_el1
  0x6030 0000 0010 0046 ELR_EL1     64  elr_el1
  0x6030 0000 0010 0048 SPSR_EL1    64  spsr[KVM_SPSR_EL1] (alias SPSR_SVC)
  0x6030 0000 0010 004a SPSR_ABT    64  spsr[KVM_SPSR_ABT]
  0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
  0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
  0x6060 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    [1]_
  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    [1]_
  ...
  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   [1]_
  0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
  0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
======================= ========= ===== =======================================
.. [1] These encodings are not accepted for SVE-enabled vcpus.  See
       KVM_ARM_VCPU_INIT.
       The equivalent register content can be accessed via bits [127:0] of
       the corresponding SVE Zn registers instead for vcpus that have SVE
       enabled (see below).

arm64 CCSIDR registers are demultiplexed by CSSELR value::
  0x6020 0000 0011 00 <csselr:8>

arm64 system registers have the following id bit patterns::

  0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>

     Two system register IDs do not follow the specified pattern.  These
     are KVM_REG_ARM_TIMER_CVAL and KVM_REG_ARM_TIMER_CNT, which map to
     system registers CNTV_CVAL_EL0 and CNTVCT_EL0 respectively.  These
     two had their values accidentally swapped, which means TIMER_CVAL is
     derived from the register encoding for CNTVCT_EL0 and TIMER_CNT is
     derived from the register encoding for CNTV_CVAL_EL0.  As this is
     API, it must remain this way.

arm64 firmware pseudo-registers have the following bit pattern::

  0x6030 0000 0014 <regno:16>

arm64 SVE registers have the following bit patterns::

  0x6080 0000 0015 00 <n:5> <slice:5>   Zn bits[2048*slice + 2047 : 2048*slice]
  0x6050 0000 0015 04 <n:4> <slice:5>   Pn bits[256*slice + 255 : 256*slice]
  0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
  0x6060 0000 0015 ffff                 KVM_REG_ARM64_SVE_VLS pseudo-register

Access to register IDs where 2048 * slice >= 128 * max_vq will fail with
ENOENT.  max_vq is the vcpu's maximum supported vector length in 128-bit
quadwords: see [2]_ below.

These registers are only accessible on vcpus for which SVE is enabled.
See KVM_ARM_VCPU_INIT for details.

In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not
accessible until the vcpu's SVE configuration has been finalized
using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).  See KVM_ARM_VCPU_INIT
and KVM_ARM_VCPU_FINALIZE for more information about this procedure.

KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector
lengths supported by the vcpu to be discovered and configured by
userspace.  When transferred to or from user memory via KVM_GET_ONE_REG
or KVM_SET_ONE_REG, the value of this register is of type
__u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as
  __u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];
  if (vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX &&
      ((vector_lengths[(vq - KVM_ARM64_SVE_VQ_MIN) / 64] >>
		((vq - KVM_ARM64_SVE_VQ_MIN) % 64)) & 1))
	/* Vector length vq * 16 bytes supported */
	/* Vector length vq * 16 bytes not supported */

.. [2] The maximum value vq for which the above condition is true is
       max_vq.  This is the maximum vector length available to the guest on
       this vcpu, and determines which register slices are visible through
       this ioctl interface.
(See Documentation/arm64/sve.rst for an explanation of the "vq"
nomenclature.)

KVM_REG_ARM64_SVE_VLS is only accessible after KVM_ARM_VCPU_INIT.
KVM_ARM_VCPU_INIT initialises it to the best set of vector lengths that
the host supports.

Userspace may subsequently modify it if desired until the vcpu's SVE
configuration is finalized using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).

Apart from simply removing all vector lengths from the host set that
exceed some value, support for arbitrarily chosen sets of vector lengths
is hardware-dependent and may not be available.  Attempting to configure
an invalid set of vector lengths via KVM_SET_ONE_REG will fail with
EINVAL.

After the vcpu's SVE configuration is finalized, further attempts to
write this register will fail with EPERM.


MIPS registers are mapped using the lower 32 bits.  The upper 16 of that is
the register group type:

MIPS core registers (see above) have the following id bit patterns::

  0x7030 0000 0000 <reg:16>

MIPS CP0 registers (see KVM_REG_MIPS_CP0_* above) have the following id bit
patterns depending on whether they're 32-bit or 64-bit registers::

  0x7020 0000 0001 00 <reg:5> <sel:3>   (32-bit)
  0x7030 0000 0001 00 <reg:5> <sel:3>   (64-bit)

Note: KVM_REG_MIPS_CP0_ENTRYLO0 and KVM_REG_MIPS_CP0_ENTRYLO1 are the MIPS64
versions of the EntryLo registers regardless of the word size of the host
hardware, host kernel, guest, and whether XPA is present in the guest, i.e.
with the RI and XI bits (if they exist) in bits 63 and 62 respectively, and
the PFNX field starting at bit 30.

MIPS MAARs (see KVM_REG_MIPS_CP0_MAAR(*) above) have the following id bit
  0x7030 0000 0001 01 <reg:8>

MIPS KVM control registers (see above) have the following id bit patterns::

  0x7030 0000 0002 <reg:16>

MIPS FPU registers (see KVM_REG_MIPS_FPR_{32,64}() above) have the following
id bit patterns depending on the size of the register being accessed. They are
always accessed according to the current guest FPU mode (Status.FR and
Config5.FRE), i.e. as the guest would see them, and they become unpredictable
if the guest FPU mode is changed. MIPS SIMD Architecture (MSA) vector
registers (see KVM_REG_MIPS_VEC_128() above) have similar patterns as they
overlap the FPU registers::

  0x7020 0000 0003 00 <0:3> <reg:5> (32-bit FPU registers)
  0x7030 0000 0003 00 <0:3> <reg:5> (64-bit FPU registers)
  0x7040 0000 0003 00 <0:3> <reg:5> (128-bit MSA vector registers)

MIPS FPU control registers (see KVM_REG_MIPS_FCR_{IR,CSR} above) have the
following id bit patterns::

  0x7020 0000 0003 01 <0:3> <reg:5>

MIPS MSA control registers (see KVM_REG_MIPS_MSA_{IR,CSR} above) have the
following id bit patterns::

  0x7020 0000 0003 02 <0:3> <reg:5>

RISC-V registers are mapped using the lower 32 bits. The upper 8 bits of
that is the register group type.

RISC-V config registers are meant for configuring a Guest VCPU and it has
the following id bit patterns::

  0x8020 0000 01 <index into the kvm_riscv_config struct:24> (32bit Host)
  0x8030 0000 01 <index into the kvm_riscv_config struct:24> (64bit Host)

Following are the RISC-V config registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x80x0 0000 0100 0000 isa       ISA feature bitmap of Guest VCPU
======================= ========= =============================================

The isa config register can be read anytime but can only be written before
a Guest VCPU runs. It will have ISA feature bits matching underlying host
set by default.

RISC-V core registers represent the general excution state of a Guest VCPU
and it has the following id bit patterns::

  0x8020 0000 02 <index into the kvm_riscv_core struct:24> (32bit Host)
  0x8030 0000 02 <index into the kvm_riscv_core struct:24> (64bit Host)

Following are the RISC-V core registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x80x0 0000 0200 0000 regs.pc   Program counter
  0x80x0 0000 0200 0001 regs.ra   Return address
  0x80x0 0000 0200 0002 regs.sp   Stack pointer
  0x80x0 0000 0200 0003 regs.gp   Global pointer
  0x80x0 0000 0200 0004 regs.tp   Task pointer
  0x80x0 0000 0200 0005 regs.t0   Caller saved register 0
  0x80x0 0000 0200 0006 regs.t1   Caller saved register 1
  0x80x0 0000 0200 0007 regs.t2   Caller saved register 2
  0x80x0 0000 0200 0008 regs.s0   Callee saved register 0
  0x80x0 0000 0200 0009 regs.s1   Callee saved register 1
  0x80x0 0000 0200 000a regs.a0   Function argument (or return value) 0
  0x80x0 0000 0200 000b regs.a1   Function argument (or return value) 1
  0x80x0 0000 0200 000c regs.a2   Function argument 2
  0x80x0 0000 0200 000d regs.a3   Function argument 3
  0x80x0 0000 0200 000e regs.a4   Function argument 4
  0x80x0 0000 0200 000f regs.a5   Function argument 5
  0x80x0 0000 0200 0010 regs.a6   Function argument 6
  0x80x0 0000 0200 0011 regs.a7   Function argument 7
  0x80x0 0000 0200 0012 regs.s2   Callee saved register 2
  0x80x0 0000 0200 0013 regs.s3   Callee saved register 3
  0x80x0 0000 0200 0014 regs.s4   Callee saved register 4
  0x80x0 0000 0200 0015 regs.s5   Callee saved register 5
  0x80x0 0000 0200 0016 regs.s6   Callee saved register 6
  0x80x0 0000 0200 0017 regs.s7   Callee saved register 7
  0x80x0 0000 0200 0018 regs.s8   Callee saved register 8
  0x80x0 0000 0200 0019 regs.s9   Callee saved register 9
  0x80x0 0000 0200 001a regs.s10  Callee saved register 10
  0x80x0 0000 0200 001b regs.s11  Callee saved register 11
  0x80x0 0000 0200 001c regs.t3   Caller saved register 3
  0x80x0 0000 0200 001d regs.t4   Caller saved register 4
  0x80x0 0000 0200 001e regs.t5   Caller saved register 5
  0x80x0 0000 0200 001f regs.t6   Caller saved register 6
  0x80x0 0000 0200 0020 mode      Privilege mode (1 = S-mode or 0 = U-mode)
======================= ========= =============================================

RISC-V csr registers represent the supervisor mode control/status registers
of a Guest VCPU and it has the following id bit patterns::

  0x8020 0000 03 <index into the kvm_riscv_csr struct:24> (32bit Host)
  0x8030 0000 03 <index into the kvm_riscv_csr struct:24> (64bit Host)

Following are the RISC-V csr registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x80x0 0000 0300 0000 sstatus   Supervisor status
  0x80x0 0000 0300 0001 sie       Supervisor interrupt enable
  0x80x0 0000 0300 0002 stvec     Supervisor trap vector base
  0x80x0 0000 0300 0003 sscratch  Supervisor scratch register
  0x80x0 0000 0300 0004 sepc      Supervisor exception program counter
  0x80x0 0000 0300 0005 scause    Supervisor trap cause
  0x80x0 0000 0300 0006 stval     Supervisor bad address or instruction
  0x80x0 0000 0300 0007 sip       Supervisor interrupt pending
  0x80x0 0000 0300 0008 satp      Supervisor address translation and protection
======================= ========= =============================================

RISC-V timer registers represent the timer state of a Guest VCPU and it has
the following id bit patterns::

  0x8030 0000 04 <index into the kvm_riscv_timer struct:24>

Following are the RISC-V timer registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x8030 0000 0400 0000 frequency Time base frequency (read-only)
  0x8030 0000 0400 0001 time      Time value visible to Guest
  0x8030 0000 0400 0002 compare   Time compare programmed by Guest
  0x8030 0000 0400 0003 state     Time compare state (1 = ON or 0 = OFF)
======================= ========= =============================================

RISC-V F-extension registers represent the single precision floating point
state of a Guest VCPU and it has the following id bit patterns::

  0x8020 0000 05 <index into the __riscv_f_ext_state struct:24>

Following are the RISC-V F-extension registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x8020 0000 0500 0000 f[0]      Floating point register 0
  ...
  0x8020 0000 0500 001f f[31]     Floating point register 31
  0x8020 0000 0500 0020 fcsr      Floating point control and status register
======================= ========= =============================================

RISC-V D-extension registers represent the double precision floating point
state of a Guest VCPU and it has the following id bit patterns::

  0x8020 0000 06 <index into the __riscv_d_ext_state struct:24> (fcsr)
  0x8030 0000 06 <index into the __riscv_d_ext_state struct:24> (non-fcsr)

Following are the RISC-V D-extension registers:

======================= ========= =============================================
    Encoding            Register  Description
======================= ========= =============================================
  0x8030 0000 0600 0000 f[0]      Floating point register 0
  ...
  0x8030 0000 0600 001f f[31]     Floating point register 31
  0x8020 0000 0600 0020 fcsr      Floating point control and status register
======================= ========= =============================================

4.69 KVM_GET_ONE_REG
--------------------

:Capability: KVM_CAP_ONE_REG
:Architectures: all
:Type: vcpu ioctl
:Parameters: struct kvm_one_reg (in and out)
:Returns: 0 on success, negative value on failure

  ======== ============================================================
  ENOENT   no such register
  EINVAL   invalid register ID, or no such register or used with VMs in
           protected virtualization mode on s390
  EPERM    (arm64) register access not allowed before vcpu finalization
  ======== ============================================================

(These error codes are indicative only: do not rely on a specific error
code being returned in a specific situation.)

This ioctl allows to receive the value of a single register implemented
in a vcpu. The register to read is indicated by the "id" field of the
kvm_one_reg struct passed in. On success, the register value can be found
at the memory location pointed to by "addr".

The list of registers accessible using this interface is identical to the
4.70 KVM_KVMCLOCK_CTRL
----------------------
:Capability: KVM_CAP_KVMCLOCK_CTRL
:Architectures: Any that implement pvclocks (currently x86 only)
:Type: vcpu ioctl
:Parameters: None
:Returns: 0 on success, -1 on error
This ioctl sets a flag accessible to the guest indicating that the specified
vCPU has been paused by the host userspace.

The host will set a flag in the pvclock structure that is checked from the
soft lockup watchdog.  The flag is part of the pvclock structure that is
shared between guest and host, specifically the second bit of the flags
field of the pvclock_vcpu_time_info structure.  It will be set exclusively by
the host and read/cleared exclusively by the guest.  The guest operation of
checking and clearing the flag must be an atomic operation so
load-link/store-conditional, or equivalent must be used.  There are two cases
where the guest will clear the flag: when the soft lockup watchdog timer resets
itself or when a soft lockup is detected.  This ioctl can be called any time
after pausing the vcpu, but before it is resumed.

:Capability: KVM_CAP_SIGNAL_MSI
:Architectures: x86 arm arm64
:Type: vm ioctl
:Parameters: struct kvm_msi (in)
:Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error

Directly inject a MSI message. Only valid with in-kernel irqchip that handles
MSI messages.

	__u32 address_lo;
	__u32 address_hi;
	__u32 data;
	__u32 flags;
flags:
  KVM_MSI_VALID_DEVID: devid contains a valid value.  The per-VM
  KVM_CAP_MSI_DEVID capability advertises the requirement to provide
  the device ID.  If this capability is not available, userspace
  should never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.
If KVM_MSI_VALID_DEVID is set, devid contains a unique device identifier
for the device that wrote the MSI message.  For PCI, this is usually a
BFD identifier in the lower 16 bits.
On x86, address_hi is ignored unless the KVM_X2APIC_API_USE_32BIT_IDS
feature of KVM_CAP_X2APIC_API capability is enabled.  If it is enabled,
address_hi bits 31-8 provide bits 31-8 of the destination id.  Bits 7-0 of
address_hi must be zero.
4.71 KVM_CREATE_PIT2
--------------------
:Capability: KVM_CAP_PIT2
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_pit_config (in)
:Returns: 0 on success, -1 on error

Creates an in-kernel device model for the i8254 PIT. This call is only valid
after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
parameters have to be passed::
  struct kvm_pit_config {
	__u32 flags;
	__u32 pad[15];
  #define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */
PIT timer interrupts may use a per-VM kernel thread for injection. If it
exists, this thread will have a name of the following pattern::
  kvm-pit/<owner-process-pid>

When running a guest with elevated priorities, the scheduling parameters of
this thread may have to be adjusted accordingly.

This IOCTL replaces the obsolete KVM_CREATE_PIT.


4.72 KVM_GET_PIT2
:Capability: KVM_CAP_PIT_STATE2
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_pit_state2 (out)
:Returns: 0 on success, -1 on error

Retrieves the state of the in-kernel PIT model. Only valid after
KVM_CREATE_PIT2. The state is returned in the following structure::
  struct kvm_pit_state2 {
	struct kvm_pit_channel_state channels[3];
	__u32 flags;
	__u32 reserved[9];
  /* disable PIT in HPET legacy mode */
  #define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001

This IOCTL replaces the obsolete KVM_GET_PIT.


4.73 KVM_SET_PIT2
:Capability: KVM_CAP_PIT_STATE2
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_pit_state2 (in)
:Returns: 0 on success, -1 on error

Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
See KVM_GET_PIT2 for details on struct kvm_pit_state2.

This IOCTL replaces the obsolete KVM_SET_PIT.


--------------------------
:Capability: KVM_CAP_PPC_GET_SMMU_INFO
:Architectures: powerpc
:Type: vm ioctl
:Parameters: None
:Returns: 0 on success, -1 on error

This populates and returns a structure describing the features of
the "Server" class MMU emulation supported by KVM.
This can in turn be used by userspace to generate the appropriate
device-tree properties for the guest operating system.

The structure contains some global information, followed by an
array of supported segment page sizes::

      struct kvm_ppc_smmu_info {
	     __u64 flags;
	     __u32 slb_size;
	     __u32 pad;
	     struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
      };

The supported flags are:

    - KVM_PPC_PAGE_SIZES_REAL:
        When that flag is set, guest page sizes must "fit" the backing
        store page sizes. When not set, any page size in the list can
        be used regardless of how they are backed by userspace.

    - KVM_PPC_1T_SEGMENTS
        The emulated MMU supports 1T segments in addition to the
        standard 256M ones.

    - KVM_PPC_NO_HASH
	This flag indicates that HPT guests are not supported by KVM,
	thus all guests must use radix MMU mode.

The "slb_size" field indicates how many SLB entries are supported

The "sps" array contains 8 entries indicating the supported base
page sizes for a segment in increasing order. Each entry is defined

   struct kvm_ppc_one_seg_page_size {
	__u32 page_shift;	/* Base page shift of segment (or 0) */
	__u32 slb_enc;		/* SLB encoding for BookS */
	struct kvm_ppc_one_page_size enc[KVM_PPC_PAGE_SIZES_MAX_SZ];
   };

An entry with a "page_shift" of 0 is unused. Because the array is
organized in increasing order, a lookup can stop when encoutering
such an entry.

The "slb_enc" field provides the encoding to use in the SLB for the
page size. The bits are in positions such as the value can directly
be OR'ed into the "vsid" argument of the slbmte instruction.

The "enc" array is a list which for each of those segment base page
size provides the list of supported actual page sizes (which can be
only larger or equal to the base page size), along with the
corresponding encoding in the hash PTE. Similarly, the array is
8 entries sorted by increasing sizes and an entry with a "0" shift
is an empty entry and a terminator::