MIPS MSA control registers (see KVM_REG_MIPS_MSA_{IR,CSR} above) have the
following id bit patterns:
0x7020 0000 0003 02 <0:3> <reg:5>
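The bit pattern above can be turned into a register id with a small
helper. This is only a sketch: the macro and function names are
hypothetical, and it assumes the caller already knows the control
register number that goes into the <reg:5> field.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed upper bits of the id, copied from the pattern above. */
#define MSA_CTRL_ID_BASE UINT64_C(0x7020000000030200)

/* Build the ONE_REG id for MSA control register number `reg` (0..31). */
static uint64_t msa_ctrl_reg_id(unsigned int reg)
{
	assert(reg < 32);              /* the <reg:5> field is 5 bits wide */
	return MSA_CTRL_ID_BASE | reg; /* bits 7..5 stay zero (<0:3>) */
}
```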
4.69 KVM_GET_ONE_REG
Capability: KVM_CAP_ONE_REG
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in and out)
Returns: 0 on success, negative value on failure
This ioctl reads the value of a single register implemented
in a vcpu. The register to read is indicated by the "id" field of the
kvm_one_reg struct passed in. On success, the register value can be found
at the memory location pointed to by "addr".
The list of registers accessible using this interface is identical to the
list in 4.68.
4.70 KVM_KVMCLOCK_CTRL
Capability: KVM_CAP_KVMCLOCK_CTRL
Architectures: Any that implement pvclocks (currently x86 only)
Type: vcpu ioctl
Parameters: None
Returns: 0 on success, -1 on error
This signals to the host kernel that the specified guest is being paused by
userspace. The host will set a flag in the pvclock structure that is checked
from the soft lockup watchdog. The flag is part of the pvclock structure that
is shared between guest and host, specifically the second bit of the flags
field of the pvclock_vcpu_time_info structure. It will be set exclusively by
the host and read/cleared exclusively by the guest. The guest operation of
checking and clearing the flag must be an atomic operation, so
load-link/store-conditional or an equivalent must be used. There are two cases
where the guest will clear the flag: when the soft lockup watchdog timer resets
itself or when a soft lockup is detected. This ioctl can be called any time
after pausing the vcpu, but before it is resumed.
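The guest-side check-and-clear described above could look roughly like
the following. This is a sketch using C11 atomics rather than the
guest kernel's own primitives; the macro and function names are
hypothetical, and the flags field is modelled as a single atomic byte.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* "Second bit" of the flags field, i.e. bit 1. Hypothetical name. */
#define GUEST_STOPPED_BIT ((unsigned char)(1u << 1))

/*
 * Atomically clear the flag and report whether it was set. The
 * read-modify-write is a single atomic operation, so a concurrent
 * update by the host cannot be lost between the test and the clear.
 */
static bool test_and_clear_guest_stopped(_Atomic unsigned char *flags)
{
	assert(flags != NULL);
	unsigned char old =
		atomic_fetch_and(flags, (unsigned char)~GUEST_STOPPED_BIT);
	return (old & GUEST_STOPPED_BIT) != 0;
}
```

A watchdog would call this once per check and skip the lockup report
when it returns true.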
4.71 KVM_SIGNAL_MSI
Capability: KVM_CAP_SIGNAL_MSI
Type: vm ioctl
Parameters: struct kvm_msi (in)
Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
Directly inject an MSI message. Only valid with an in-kernel irqchip that handles
MSI messages.
struct kvm_msi {
	__u32 address_lo;
	__u32 address_hi;
	__u32 data;
	__u32 flags;
	__u32 devid;
	__u8  pad[12];
};
flags: KVM_MSI_VALID_DEVID: devid contains a valid value. The per-VM
KVM_CAP_MSI_DEVID capability advertises the requirement to provide
the device ID. If this capability is not available, userspace
should never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.
If KVM_MSI_VALID_DEVID is set, devid contains a unique device identifier
for the device that wrote the MSI message. For PCI, this is usually a
BDF identifier in the lower 16 bits.
On x86, address_hi is ignored unless the KVM_X2APIC_API_USE_32BIT_IDS
feature of KVM_CAP_X2APIC_API capability is enabled. If it is enabled,
address_hi bits 31-8 provide bits 31-8 of the destination id. Bits 7-0 of
address_hi must be zero.
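As an illustration of the two encodings mentioned above, the helpers
below pack a PCI bus/device/function triple into a 16-bit devid and
derive address_hi from a 32-bit destination id. Both function names
are hypothetical; the BDF layout used is the conventional PCI one
(bus in bits 15-8, device in bits 7-3, function in bits 2-0).

```c
#include <assert.h>
#include <stdint.h>

/* Pack bus/device/function into the lower 16 bits of devid. */
static uint32_t pci_bdf(unsigned int bus, unsigned int dev, unsigned int fn)
{
	assert(bus < 256 && dev < 32 && fn < 8);
	return (bus << 8) | (dev << 3) | fn;
}

/*
 * address_hi for a 32-bit destination id: bits 31-8 carry bits 31-8
 * of the id, bits 7-0 must be zero.
 */
static uint32_t msi_address_hi(uint32_t dest_id)
{
	return dest_id & UINT32_C(0xffffff00);
}
```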
4.71 KVM_CREATE_PIT2
Capability: KVM_CAP_PIT2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_config (in)
Returns: 0 on success, -1 on error
Creates an in-kernel device model for the i8254 PIT. This call is only valid
after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
parameters have to be passed:
struct kvm_pit_config {
	__u32 flags;
	__u32 pad[15];
};
Valid flags are:
#define KVM_PIT_SPEAKER_DUMMY 1 /* emulate speaker port stub */
PIT timer interrupts may use a per-VM kernel thread for injection. If it
exists, this thread will have a name of the following pattern:
kvm-pit/<owner-process-pid>
When running a guest with elevated priorities, the scheduling parameters of
this thread may have to be adjusted accordingly.
This IOCTL replaces the obsolete KVM_CREATE_PIT.
4.72 KVM_GET_PIT2
Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (out)
Returns: 0 on success, -1 on error
Retrieves the state of the in-kernel PIT model. Only valid after
KVM_CREATE_PIT2. The state is returned in the following structure:
struct kvm_pit_state2 {
struct kvm_pit_channel_state channels[3];
__u32 flags;
__u32 reserved[9];
};
Valid flags are:
/* disable PIT in HPET legacy mode */
#define KVM_PIT_FLAGS_HPET_LEGACY 0x00000001
This IOCTL replaces the obsolete KVM_GET_PIT.
4.73 KVM_SET_PIT2
Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (in)
Returns: 0 on success, -1 on error
Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
See KVM_GET_PIT2 for details on struct kvm_pit_state2.
This IOCTL replaces the obsolete KVM_SET_PIT.
4.74 KVM_PPC_GET_SMMU_INFO
Capability: KVM_CAP_PPC_GET_SMMU_INFO
Architectures: powerpc
Type: vm ioctl
Parameters: None
Returns: 0 on success, -1 on error
This populates and returns a structure describing the features of
the "Server" class MMU emulation supported by KVM.
This can in turn be used by userspace to generate the appropriate
device-tree properties for the guest operating system.
The structure contains some global information, followed by an
array of supported segment page sizes:
struct kvm_ppc_smmu_info {
	__u64 flags;
	__u32 slb_size;
	__u32 pad;
	struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
};
The supported flags are:
- KVM_PPC_PAGE_SIZES_REAL:
When that flag is set, guest page sizes must "fit" the backing
store page sizes. When not set, any page size in the list can
be used regardless of how they are backed by userspace.
- KVM_PPC_1T_SEGMENTS
The emulated MMU supports 1T segments in addition to the
standard 256M ones.
The "slb_size" field indicates how many SLB entries are supported.
The "sps" array contains 8 entries indicating the supported base
page sizes for a segment in increasing order. Each entry is defined
as follows:
struct kvm_ppc_one_seg_page_size {
	__u32 page_shift;	/* Base page shift of segment (or 0) */
	__u32 slb_enc;		/* SLB encoding for BookS */
	struct kvm_ppc_one_page_size enc[KVM_PPC_PAGE_SIZES_MAX_SZ];
};
An entry with a "page_shift" of 0 is unused. Because the array is
organized in increasing order, a lookup can stop when encountering
such an entry.
The "slb_enc" field provides the encoding to use in the SLB for the
page size. The bits are positioned such that the value can directly
be OR'ed into the "vsid" argument of the slbmte instruction.
The "enc" array lists, for each of those segment base page sizes,
the supported actual page sizes (which can only be
larger than or equal to the base page size), along with the
corresponding encoding in the hash PTE. Similarly, the array is
8 entries sorted by increasing sizes and an entry with a "0" shift
is an empty entry and a terminator:
struct kvm_ppc_one_page_size {
	__u32 page_shift;	/* Page shift (or 0) */
	__u32 pte_enc;		/* Encoding in the HPTE (>>12) */
};
The "pte_enc" field provides a value that can be OR'ed into the hash
PTE's RPN field (ie, it needs to be shifted left by 12 to OR it
into the hash PTE second double word).
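The lookup rules above (arrays sorted by increasing size, a zero shift
terminating each list, pte_enc shifted left by 12) can be sketched as
follows. The structures are re-declared locally with stdint types so
the example is self-contained; real code would use the definitions
from the kernel headers. The function name and the constant
PAGE_SIZES_MAX are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZES_MAX 8 /* both arrays hold 8 entries per the text */

struct one_page_size {
	uint32_t page_shift;
	uint32_t pte_enc;
};

struct one_seg_page_size {
	uint32_t page_shift;
	uint32_t slb_enc;
	struct one_page_size enc[PAGE_SIZES_MAX];
};

/*
 * Find the HPTE encoding for an actual page size inside a segment with
 * the given base page size. On success returns 0 and stores the bits
 * to OR into the second HPTE doubleword; returns -1 if unsupported.
 */
static int find_hpte_enc(const struct one_seg_page_size *sps,
			 uint32_t base_shift, uint32_t actual_shift,
			 uint64_t *hpte_bits)
{
	assert(sps != NULL && hpte_bits != NULL);
	for (size_t i = 0; i < PAGE_SIZES_MAX && sps[i].page_shift != 0; i++) {
		if (sps[i].page_shift != base_shift)
			continue;
		const struct one_page_size *enc = sps[i].enc;
		for (size_t j = 0; j < PAGE_SIZES_MAX && enc[j].page_shift != 0; j++) {
			if (enc[j].page_shift == actual_shift) {
				*hpte_bits = (uint64_t)enc[j].pte_enc << 12;
				return 0;
			}
		}
		return -1; /* base size found, actual size not listed */
	}
	return -1; /* base size not listed */
}
```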
4.75 KVM_IRQFD
Capability: KVM_CAP_IRQFD
Type: vm ioctl
Parameters: struct kvm_irqfd (in)
Returns: 0 on success, -1 on error
Allows setting an eventfd to directly trigger a guest interrupt.
kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
kvm_irqfd.gsi specifies the irqchip pin toggled by this event. When
an event is triggered on the eventfd, an interrupt is injected into
the guest using the specified gsi pin. The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.
With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
mechanism allowing emulation of level-triggered, irqfd-based
interrupts. When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
additional eventfd in the kvm_irqfd.resamplefd field. When operating
in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
the specified gsi in the irqchip. When the irqchip is resampled, such
as from an EOI, the gsi is de-asserted and the user is notified via
kvm_irqfd.resamplefd. It is the user's responsibility to re-queue
the interrupt if the device making use of it still requires service.
Note that closing the resamplefd is not sufficient to disable the
irqfd. The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
On arm/arm64, gsi routing being supported, the following can happen:
- in case no routing entry is associated to this gsi, injection fails
- in case the gsi is associated to an irqchip routing entry,
irqchip.pin + 32 corresponds to the injected SPI ID.
- in case the gsi is associated to an MSI routing entry, the MSI
message and device ID are translated into an LPI (support restricted
to GICv3 ITS in-kernel emulation).
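The assign and de-assign requests described above differ only in their
flags. The sketch below fills a request for each case. It is written
under several assumptions: the structure is a local stand-in for
struct kvm_irqfd with the fields named in the text, the padding size
and the two flag values reflect my reading of the kernel UAPI header
and should be checked against it, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for struct kvm_irqfd; layout is an assumption. */
struct irqfd_req {
	uint32_t fd;
	uint32_t gsi;
	uint32_t flags;
	uint32_t resamplefd;
	uint8_t  pad[16];
};

/* Assumed flag values; verify against <linux/kvm.h>. */
#define IRQFD_FLAG_DEASSIGN (1u << 0)
#define IRQFD_FLAG_RESAMPLE (1u << 1)

/* Build an assign request; pass resamplefd < 0 for a plain irqfd. */
static struct irqfd_req irqfd_assign(int fd, uint32_t gsi, int resamplefd)
{
	struct irqfd_req req;

	assert(fd >= 0);
	memset(&req, 0, sizeof req);
	req.fd = (uint32_t)fd;
	req.gsi = gsi;
	if (resamplefd >= 0) {
		req.flags |= IRQFD_FLAG_RESAMPLE;
		req.resamplefd = (uint32_t)resamplefd;
	}
	return req;
}

/* Build a de-assign request; the resample flag is not needed here. */
static struct irqfd_req irqfd_deassign(int fd, uint32_t gsi)
{
	struct irqfd_req req;

	assert(fd >= 0);
	memset(&req, 0, sizeof req);
	req.fd = (uint32_t)fd;
	req.gsi = gsi;
	req.flags = IRQFD_FLAG_DEASSIGN;
	return req;
}
```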
4.76 KVM_PPC_ALLOCATE_HTAB
Capability: KVM_CAP_PPC_ALLOC_HTAB
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to u32 containing hash table order (in/out)
Returns: 0 on success, -1 on error
This requests the host kernel to allocate an MMU hash table for a
guest using the PAPR paravirtualization interface. This only does
anything if the kernel is configured to use the Book 3S HV style of
virtualization. Otherwise the capability doesn't exist and the ioctl
returns an ENOTTY error. The rest of this description assumes Book 3S
HV.
There must be no vcpus running when this ioctl is called; if there
are, it will do nothing and return an EBUSY error.
The parameter is a pointer to a 32-bit unsigned integer variable
containing the order (log base 2) of the desired size of the hash
table, which must be between 18 and 46. On successful return from the
ioctl, the value will not be changed by the kernel.
If no hash table has been allocated when any vcpu is asked to run
(with the KVM_RUN ioctl), the host kernel will allocate a
default-sized hash table (16 MB).
If this ioctl is called when a hash table has already been allocated,
with a different order from the existing hash table, the existing hash
table will be freed and a new one allocated. If this ioctl is
called when a hash table has already been allocated of the same order
as specified, the kernel will clear out the existing hash table (zero
all HPTEs). In either case, if the guest is using the virtualized
real-mode area (VRMA) facility, the kernel will re-create the VRMA
HPTEs on the next KVM_RUN of any vcpu.
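Since the parameter is an order rather than a size, userspace usually
has to convert between the two. The helpers below are a sketch of that
arithmetic using the limits stated above (18 to 46); the names are
hypothetical. Note that the 16 MB default corresponds to order 24.

```c
#include <assert.h>
#include <stdint.h>

#define HTAB_ORDER_MIN 18
#define HTAB_ORDER_MAX 46

/* Size in bytes of a hash table of the given order (log base 2). */
static uint64_t htab_size_bytes(uint32_t order)
{
	assert(order >= HTAB_ORDER_MIN && order <= HTAB_ORDER_MAX);
	return UINT64_C(1) << order;
}

/* Smallest valid order that holds at least `bytes`, or 0 if none does. */
static uint32_t htab_order_for(uint64_t bytes)
{
	for (uint32_t order = HTAB_ORDER_MIN; order <= HTAB_ORDER_MAX; order++)
		if ((UINT64_C(1) << order) >= bytes)
			return order;
	return 0;
}
```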
4.77 KVM_S390_INTERRUPT
Capability: basic
Architectures: s390
Type: vm ioctl, vcpu ioctl
Parameters: struct kvm_s390_interrupt (in)
Returns: 0 on success, -1 on error
Injects an interrupt into the guest. Interrupts can be floating
(vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.
Interrupt parameters are passed via kvm_s390_interrupt:
struct kvm_s390_interrupt {
__u32 type;
__u32 parm;
__u64 parm64;
};
type can be one of the following:
KVM_S390_SIGP_STOP (vcpu) - sigp stop; optional flags in parm
KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
KVM_S390_RESTART (vcpu) - restart
KVM_S390_INT_CLOCK_COMP (vcpu) - clock comparator interrupt
KVM_S390_INT_CPU_TIMER (vcpu) - CPU timer interrupt
KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
parameters in parm and parm64
KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
I/O interruption parameters in parm (subchannel) and parm64 (intparm,
interruption subclass)
KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
machine check interrupt code in parm64 (note that
machine checks needing further payload are not
supported by this ioctl)
Note that the vcpu ioctl is asynchronous to vcpu execution.
4.78 KVM_PPC_GET_HTAB_FD
Capability: KVM_CAP_PPC_HTAB_FD
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to struct kvm_get_htab_fd (in)
Returns: file descriptor number (>= 0) on success, -1 on error
This returns a file descriptor that can be used either to read out the
entries in the guest's hashed page table (HPT), or to write entries to
initialize the HPT. The returned fd can only be written to if the
KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
can only be read if that bit is clear. The argument struct looks like
this:
/* For KVM_PPC_GET_HTAB_FD */
struct kvm_get_htab_fd {
__u64 flags;
__u64 start_index;
__u64 reserved[2];
};
/* Values for kvm_get_htab_fd.flags */
#define KVM_GET_HTAB_BOLTED_ONLY ((__u64)0x1)
#define KVM_GET_HTAB_WRITE ((__u64)0x2)
The `start_index' field gives the index in the HPT of the entry at
which to start reading. It is ignored when writing.
Reads on the fd will initially supply information about all
"interesting" HPT entries. Interesting entries are those with the
bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
all entries. When the end of the HPT is reached, the read() will
return. If read() is called again on the fd, it will start again from
the beginning of the HPT, but will only return HPT entries that have
changed since they were last read.
Data read or written is structured as a header (8 bytes) followed by a
series of valid HPT entries (16 bytes) each. The header indicates how
many valid HPT entries there are and how many invalid entries follow
the valid entries. The invalid entries are not represented explicitly
in the stream. The header format is:
struct kvm_get_htab_header {
__u32 index;
__u16 n_valid;
__u16 n_invalid;
};
Writes to the fd create HPT entries starting at the index given in the
header; first `n_valid' valid entries with contents from the data
written, then `n_invalid' invalid entries, invalidating any previously
valid entries found.
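A reader of the stream has to skip n_valid entries of 16 bytes after
each header, while the invalid entries occupy no space. The following
sketch walks a buffer and totals both counts. It assumes that one
buffer may hold several header-plus-entries groups back to back; the
structure is re-declared locally and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct htab_header {
	uint32_t index;
	uint16_t n_valid;
	uint16_t n_invalid;
};

#define HPTE_BYTES 16 /* size of one valid HPT entry in the stream */

struct htab_totals {
	uint64_t valid;
	uint64_t invalid;
};

/* Walk a buffer filled by read(); returns 0, or -1 if it is truncated. */
static int htab_scan(const unsigned char *buf, size_t len,
		     struct htab_totals *out)
{
	size_t pos = 0;

	assert(buf != NULL && out != NULL);
	out->valid = 0;
	out->invalid = 0;
	while (pos < len) {
		struct htab_header hdr;

		if (len - pos < sizeof hdr)
			return -1;
		memcpy(&hdr, buf + pos, sizeof hdr);
		pos += sizeof hdr;

		size_t body = (size_t)hdr.n_valid * HPTE_BYTES;
		if (len - pos < body)
			return -1;
		pos += body; /* invalid entries are not in the stream */

		out->valid += hdr.n_valid;
		out->invalid += hdr.n_invalid;
	}
	return 0;
}
```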
4.79 KVM_CREATE_DEVICE
Capability: KVM_CAP_DEVICE_CTRL
Type: vm ioctl
Parameters: struct kvm_create_device (in/out)
Returns: 0 on success, -1 on error
Errors:
ENODEV: The device type is unknown or unsupported
EEXIST: Device already created, and this type of device may not
be instantiated multiple times
Other error conditions may be defined by individual device types or
have their standard meanings.
Creates an emulated device in the kernel. The file descriptor returned
in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
device type is supported (not necessarily whether it can be created
in the current vm).
Individual devices should not define flags. Attributes should be used
for specifying any behavior that is not implied by the device type
number.
struct kvm_create_device {
__u32 type; /* in: KVM_DEV_TYPE_xxx */
__u32 fd; /* out: device handle */
__u32 flags; /* in: KVM_CREATE_DEVICE_xxx */
};
4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
KVM_CAP_VCPU_ATTRIBUTES for vcpu device
Type: device ioctl, vm ioctl, vcpu ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
or hardware support is missing.
EPERM: The attribute cannot (currently) be accessed this way
(e.g. read-only attribute, or attribute that only makes
sense when the device is in a different state)
Other error conditions may be defined by individual device types.
Gets/sets a specified piece of device configuration and/or state. The
semantics are device-specific. See individual device documentation in
the "devices" directory. As with ONE_REG, the size of the data
transferred is defined by the particular attribute.
struct kvm_device_attr {
__u32 flags; /* no flags currently defined */
__u32 group; /* device-defined */
__u64 attr; /* group-defined */
__u64 addr; /* userspace address of attr data */
};
4.81 KVM_HAS_DEVICE_ATTR
Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
KVM_CAP_VCPU_ATTRIBUTES for vcpu device
Type: device ioctl, vm ioctl, vcpu ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
or hardware support is missing.
Tests whether a device supports a particular attribute. A successful
return indicates the attribute is implemented. It does not necessarily
indicate that the attribute can be read or written in the device's
current state. "addr" is ignored.
4.82 KVM_ARM_VCPU_INIT
Capability: basic
Architectures: arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_vcpu_init (in)
Returns: 0 on success; -1 on error
Errors:
EINVAL: the target is unknown, or the combination of features is invalid.
ENOENT: a features bit specified is unknown.
This tells KVM what type of CPU to present to the guest, and what
optional features it should have. This will cause a reset of the cpu
registers to their initial values. If this is not called, KVM_RUN will
return ENOEXEC for that vcpu.
Note that because some registers reflect machine topology, all vcpus
should be created before this ioctl is invoked.
Userspace can call this function multiple times for a given vcpu, including
after the vcpu has been run. This will reset the vcpu to its initial
state. All calls to this function after the initial call must use the same
target and same set of feature flags, otherwise EINVAL will be returned.
Possible features:
- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
Depends on KVM_CAP_ARM_PSCI. If not set, the CPU will be powered on
and execute guest code when KVM_RUN is called.
- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
Depends on KVM_CAP_ARM_PSCI_0_2.
- KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
Depends on KVM_CAP_ARM_PMU_V3.
4.83 KVM_ARM_PREFERRED_TARGET
Capability: basic
Architectures: arm, arm64
Type: vm ioctl
Parameters: struct kvm_vcpu_init (out)
Returns: 0 on success; -1 on error
Errors:
ENODEV: no preferred target available for the host
This queries KVM for the preferred CPU target type which can be emulated
by KVM on the underlying host.
The ioctl returns a struct kvm_vcpu_init instance containing information
about the preferred CPU target type and recommended features for it. The
kvm_vcpu_init->features bitmap returned will have feature bits set if
the preferred target recommends setting these features, but this is
not mandatory.
The information returned by this ioctl can be used to prepare an instance
of struct kvm_vcpu_init for the KVM_ARM_VCPU_INIT ioctl, which will result
in a VCPU matching the underlying host.
4.84 KVM_GET_REG_LIST
Capability: basic
Architectures: arm, arm64, mips
Type: vcpu ioctl
Parameters: struct kvm_reg_list (in/out)
Returns: 0 on success; -1 on error
Errors:
E2BIG: the reg index list is too big to fit in the array specified by
the user (the number required will be written into n).
struct kvm_reg_list {
__u64 n; /* number of registers in reg[] */
__u64 reg[0];
};
This ioctl returns the guest registers that are supported for the
KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
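Because the kernel writes the required count into n when it returns
E2BIG, a common pattern is to call the ioctl with a small list, resize
to the reported count, and call again. The sketch below covers only
the resizing step. The structure is a local copy of the layout shown
above, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct reg_list {
	uint64_t n;     /* number of registers in reg[] */
	uint64_t reg[]; /* flexible array member */
};

/*
 * Resize *listp so that it can hold n register ids and record n in it.
 * Returns 0 on success, -1 on failure (*listp is then left untouched).
 */
static int reg_list_resize(struct reg_list **listp, uint64_t n)
{
	assert(listp != NULL);
	if (n > (SIZE_MAX - sizeof(struct reg_list)) / sizeof(uint64_t))
		return -1; /* size computation would overflow */

	struct reg_list *list =
		realloc(*listp, sizeof *list + (size_t)n * sizeof(uint64_t));
	if (list == NULL)
		return -1;
	list->n = n;
	*listp = list;
	return 0;
}
```

Starting with a NULL pointer and n = 0 gives a list that is expected
to trigger E2BIG on the first call, after which the reported n can be
passed back to this helper.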
4.85 KVM_ARM_SET_DEVICE_ADDR (deprecated)
Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
Type: vm ioctl
Parameters: struct kvm_arm_device_addr (in)
Returns: 0 on success, -1 on error
Errors:
ENODEV: The device id is unknown
ENXIO: Device not supported on current system
EEXIST: Address already set
E2BIG: Address outside guest physical address space
EBUSY: Address overlaps with other device range
struct kvm_arm_device_addr {
__u64 id;
__u64 addr;
};
Specify a device address in the guest's physical address space where guests
can access emulated or directly exposed devices, which the host kernel needs
to know about. The id field is an architecture specific identifier for a
specific device.
ARM/arm64 divides the id field into two parts, a device id and an
address type id specific to the individual device.
bits: | 63 ... 32 | 31 ... 16 | 15 ... 0 |
field: | 0x00000000 | device id | addr type id |
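The id layout in the table above amounts to one shift and one OR. A
minimal sketch, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Compose the 64-bit id: device id in bits 31-16, addr type in bits 15-0. */
static uint64_t arm_device_addr_id(uint32_t device_id, uint32_t addr_type)
{
	assert(device_id <= 0xffff && addr_type <= 0xffff);
	return ((uint64_t)device_id << 16) | addr_type;
}
```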
ARM/arm64 currently only require this when using the in-kernel GIC
support for the hardware VGIC features, using KVM_ARM_DEVICE_VGIC_V2
as the device id. When setting the base address for the guest's
mapping of the VGIC virtual CPU and distributor interface, the ioctl
must be called after calling KVM_CREATE_IRQCHIP, but before calling
KVM_RUN on any of the VCPUs. Calling this ioctl twice for any of the
base addresses will return -EEXIST.
Note, this IOCTL is deprecated and the more flexible SET/GET_DEVICE_ATTR API
should be used instead.
4.86 KVM_PPC_RTAS_DEFINE_TOKEN
Capability: KVM_CAP_PPC_RTAS
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_rtas_token_args
Returns: 0 on success, -1 on error
Defines a token value for an RTAS (Run Time Abstraction Services)
service in order to allow it to be handled in the kernel. The
argument struct gives the name of the service, which must be the name
of a service that has a kernel-side implementation. If the token
value is non-zero, it will be associated with that service, and
subsequent RTAS calls by the guest specifying that token will be
handled by the kernel. If the token value is 0, then any token
associated with the service will be forgotten, and subsequent RTAS
calls by the guest for that service will be passed to userspace to be
handled.
4.87 KVM_SET_GUEST_DEBUG
Capability: KVM_CAP_SET_GUEST_DEBUG
Architectures: x86, s390, ppc, arm64
Type: vcpu ioctl
Parameters: struct kvm_guest_debug (in)
Returns: 0 on success; -1 on error
struct kvm_guest_debug {
__u32 control;
__u32 pad;
struct kvm_guest_debug_arch arch;
};
Set up the processor specific debug registers and configure vcpu for
handling guest debug events. There are two parts to the structure. The
first, a control bitfield, indicates the type of debug events to handle
when running. Common control bits are:
- KVM_GUESTDBG_ENABLE: guest debugging is enabled
- KVM_GUESTDBG_SINGLESTEP: the next run should single-step
The top 16 bits of the control field are architecture specific control
flags which can include the following:
- KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86, arm64]
- KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390, arm64]
- KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86]
- KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86]
- KVM_GUESTDBG_EXIT_PENDING: trigger an immediate guest exit [s390]
For example, KVM_GUESTDBG_USE_SW_BP indicates that software breakpoints
are enabled in memory, so we need to ensure breakpoint exceptions are
correctly trapped and the KVM run loop exits at the breakpoint rather
than running off into the normal guest vector. For KVM_GUESTDBG_USE_HW_BP
we need to ensure the guest vCPU's architecture specific registers are
updated to the correct (supplied) values.
The second part of the structure is architecture specific and
typically contains a set of debug registers.
For arm64 the number of debug registers is implementation defined and
can be determined by querying the KVM_CAP_GUEST_DEBUG_HW_BPS and
KVM_CAP_GUEST_DEBUG_HW_WPS capabilities which return a positive number
indicating the number of supported registers.
When debug events exit the main run loop, the exit reason is
KVM_EXIT_DEBUG and the kvm_debug_exit_arch part of the kvm_run
structure contains architecture specific debug information.
4.88 KVM_GET_EMULATED_CPUID
Capability: KVM_CAP_EXT_EMUL_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error
struct kvm_cpuid2 {
__u32 nent;
__u32 flags;
struct kvm_cpuid_entry2 entries[0];
};
The member 'flags' is used for passing flags from userspace.
#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX BIT(0)
#define KVM_CPUID_FLAG_STATEFUL_FUNC BIT(1)
#define KVM_CPUID_FLAG_STATE_READ_NEXT BIT(2)
struct kvm_cpuid_entry2 {
__u32 function;
__u32 index;
__u32 flags;
__u32 eax;
__u32 ebx;
__u32 ecx;
__u32 edx;
__u32 padding[3];
};
This ioctl returns x86 cpuid features which are emulated by
kvm. Userspace can use the information returned by this ioctl to query
which features are emulated by kvm instead of being present natively.
Userspace invokes KVM_GET_EMULATED_CPUID by passing a kvm_cpuid2
structure with the 'nent' field indicating the number of entries in
the variable-size array 'entries'. If the number of entries is too low
to describe the cpu capabilities, an error (E2BIG) is returned. If the
number is too high, the 'nent' field is adjusted and an error (ENOMEM)
is returned. If the number is just right, the 'nent' field is adjusted
to the number of valid entries in the 'entries' array, which is then
filled.
The entries returned are the set CPUID bits of the respective features
which kvm emulates, as returned by the CPUID instruction, with unknown
or unsupported feature bits cleared.
Features like x2apic, for example, may not be present in the host cpu
but are exposed by kvm in KVM_GET_SUPPORTED_CPUID because they can be
emulated efficiently, and are thus not included here.
The fields in each entry are defined as follows:
function: the eax value used to obtain the entry
index: the ecx value used to obtain the entry (for entries that are
affected by ecx)
flags: an OR of zero or more of the following:
KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
if the index field is valid
KVM_CPUID_FLAG_STATEFUL_FUNC:
if cpuid for this function returns different values for successive
invocations; there will be several entries with the same function,
all with this flag set
KVM_CPUID_FLAG_STATE_READ_NEXT:
for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
the first entry to be read by a cpu
eax, ebx, ecx, edx: the values returned by the cpuid instruction for
this function/index combination
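Once the entries array has been filled, finding a leaf means matching
on function and, only when KVM_CPUID_FLAG_SIGNIFCANT_INDEX is set, on
index as well. The sketch below shows that matching rule. The entry
structure is re-declared locally with the layout shown above, the flag
uses the BIT(0) value given above, and the function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CPUID_FLAG_SIGNIFCANT_INDEX (1u << 0) /* BIT(0), as above */

struct cpuid_entry2 {
	uint32_t function;
	uint32_t index;
	uint32_t flags;
	uint32_t eax, ebx, ecx, edx;
	uint32_t padding[3];
};

/* Return the first entry matching function (and index, if significant). */
static const struct cpuid_entry2 *
cpuid_find(const struct cpuid_entry2 *entries, uint32_t nent,
	   uint32_t function, uint32_t index)
{
	assert(entries != NULL || nent == 0);
	for (uint32_t i = 0; i < nent; i++) {
		const struct cpuid_entry2 *e = &entries[i];

		if (e->function != function)
			continue;
		if ((e->flags & CPUID_FLAG_SIGNIFCANT_INDEX) &&
		    e->index != index)
			continue;
		return e;
	}
	return NULL;
}
```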
4.89 KVM_S390_MEM_OP
Capability: KVM_CAP_S390_MEM_OP
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_mem_op (in)
Returns: = 0 on success,
< 0 on generic error (e.g. -EFAULT or -ENOMEM),
> 0 if an exception occurred while walking the page tables
Read or write data from/to the logical (virtual) memory of a VCPU.
Parameters are specified via the following structure:
struct kvm_s390_mem_op {
	__u64 gaddr;		/* the guest address */
	__u64 flags;		/* flags */
	__u32 size;		/* amount of bytes */
	__u32 op;		/* type of operation */
	__u64 buf;		/* buffer in userspace */
	__u8 ar;		/* the access register number */
	__u8 reserved[31];	/* should be set to 0 */
};
The type of operation is specified in the "op" field. It is either
KVM_S390_MEMOP_LOGICAL_READ for reading from logical memory space or
KVM_S390_MEMOP_LOGICAL_WRITE for writing to logical memory space. The
KVM_S390_MEMOP_F_CHECK_ONLY flag can be set in the "flags" field to check
whether the corresponding memory access would create an access exception
(without touching the data in the memory at the destination). In case an
access exception occurred while walking the MMU tables of the guest, the
ioctl returns a positive error number to indicate the type of exception.
This exception is also raised directly at the corresponding VCPU if the
flag KVM_S390_MEMOP_F_INJECT_EXCEPTION is set in the "flags" field.
The start address of the memory region has to be specified in the "gaddr"
field, and the length of the region in the "size" field. "buf" is the buffer
supplied by the userspace application where the read data should be written
to for KVM_S390_MEMOP_LOGICAL_READ, or where the data that should be written
is stored for a KVM_S390_MEMOP_LOGICAL_WRITE. "buf" is unused and can be NULL
when KVM_S390_MEMOP_F_CHECK_ONLY is specified. "ar" designates the access
register number to be used.
The "reserved" field is meant for future extensions. It is not used by
KVM with the currently defined set of flags.
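The three-way return convention of this ioctl (zero, negative,
positive) is easy to mishandle with the usual "non-zero means error"
check. A sketch of a helper that separates the cases follows; the enum
and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum mem_op_outcome {
	MEM_OP_OK,               /* return value == 0 */
	MEM_OP_GENERIC_ERROR,    /* return value < 0 */
	MEM_OP_ACCESS_EXCEPTION  /* return value > 0 */
};

/*
 * Classify the ioctl return value. For an access exception, the
 * positive value identifying the exception is stored in
 * *exception_code; otherwise *exception_code is set to 0.
 */
static enum mem_op_outcome mem_op_classify(int ret, int *exception_code)
{
	assert(exception_code != NULL);
	*exception_code = 0;
	if (ret == 0)
		return MEM_OP_OK;
	if (ret < 0)
		return MEM_OP_GENERIC_ERROR;
	*exception_code = ret;
	return MEM_OP_ACCESS_EXCEPTION;
}
```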
4.90 KVM_S390_GET_SKEYS
Capability: KVM_CAP_S390_SKEYS
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_skeys
Returns: 0 on success, KVM_S390_GET_SKEYS_NONE if guest is not using storage
keys, negative value on error
This ioctl is used to get guest storage key values on the s390
architecture. The ioctl takes parameters via the kvm_s390_skeys struct.
struct kvm_s390_skeys {
__u64 start_gfn;
__u64 count;
__u64 skeydata_addr;
__u32 flags;
__u32 reserved[9];
};
The start_gfn field is the number of the first guest frame whose storage keys
you want to get.
The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to get. The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
will cause the ioctl to return -EINVAL.
The skeydata_addr field is the address to a buffer large enough to hold count
bytes. This buffer will be filled with storage key data by the ioctl.
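Preparing a request therefore involves a range check on count and a
buffer of exactly count bytes. The sketch below does both. The
structure is a local copy of the layout above, max_count stands in for
KVM_S390_SKEYS_ALLOC_MAX (whose value is not given here), and the
helper name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct skeys_req {
	uint64_t start_gfn;
	uint64_t count;
	uint64_t skeydata_addr;
	uint32_t flags;
	uint32_t reserved[9];
};

/*
 * Validate count, allocate a zeroed buffer of count bytes and fill in
 * the request. Returns 0 on success or a negative errno value. The
 * caller owns *buf_out and must free() it.
 */
static int skeys_prepare(struct skeys_req *req, uint64_t start_gfn,
			 uint64_t count, uint64_t max_count,
			 uint8_t **buf_out)
{
	assert(req != NULL && buf_out != NULL);
	if (count < 1 || count > max_count || count > SIZE_MAX)
		return -EINVAL;

	uint8_t *buf = calloc((size_t)count, 1);
	if (buf == NULL)
		return -ENOMEM;

	memset(req, 0, sizeof *req);
	req->start_gfn = start_gfn;
	req->count = count;
	req->skeydata_addr = (uint64_t)(uintptr_t)buf;
	*buf_out = buf;
	return 0;
}
```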
4.91 KVM_S390_SET_SKEYS
Capability: KVM_CAP_S390_SKEYS
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_skeys
Returns: 0 on success, negative value on error
This ioctl is used to set guest storage key values on the s390
architecture. The ioctl takes parameters via the kvm_s390_skeys struct.
See section on KVM_S390_GET_SKEYS for struct definition.
The start_gfn field is the number of the first guest frame whose storage keys
you want to set.
The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to set. The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
will cause the ioctl to return -EINVAL.
The skeydata_addr field is the address to a buffer containing count bytes of
storage keys. Each byte in the buffer will be set as the storage key for a
single frame starting at start_gfn for count frames.
Note: If any architecturally invalid key value is found in the given data then
the ioctl will return -EINVAL.
4.92 KVM_S390_IRQ
Capability: KVM_CAP_S390_INJECT_IRQ
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq (in)
Returns: 0 on success, -1 on error
Errors:
  EINVAL: interrupt type is invalid
          type is KVM_S390_SIGP_STOP and the flag parameter has an invalid value
          type is KVM_S390_INT_EXTERNAL_CALL and code is bigger
            than the maximum number of VCPUs
  EBUSY:  type is KVM_S390_SIGP_SET_PREFIX and vcpu is not stopped
          type is KVM_S390_SIGP_STOP and a stop irq is already pending
          type is KVM_S390_INT_EXTERNAL_CALL and an external call interrupt
            is already pending
Injects an interrupt into the guest.
Passing struct kvm_s390_irq as the parameter makes it possible to inject
additional payload, which is not possible via KVM_S390_INTERRUPT.
Interrupt parameters are passed via kvm_s390_irq:
struct kvm_s390_irq {
	__u64 type;
	union {
		struct kvm_s390_io_info io;
		struct kvm_s390_ext_info ext;
		struct kvm_s390_pgm_info pgm;
		struct kvm_s390_emerg_info emerg;
		struct kvm_s390_extcall_info extcall;
		struct kvm_s390_prefix_info prefix;
		struct kvm_s390_stop_info stop;
		struct kvm_s390_mchk_info mchk;
		char reserved[64];
	} u;
};
type can be one of the following:
KVM_S390_SIGP_STOP - sigp stop; parameter in .stop
KVM_S390_PROGRAM_INT - program check; parameters in .pgm
KVM_S390_SIGP_SET_PREFIX - sigp set prefix; parameters in .prefix
KVM_S390_RESTART - restart; no parameters
KVM_S390_INT_CLOCK_COMP - clock comparator interrupt; no parameters
KVM_S390_INT_CPU_TIMER - CPU timer interrupt; no parameters
KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
KVM_S390_MCHK - machine check interrupt; parameters in .mchk
Note that the vcpu ioctl is asynchronous to vcpu execution.
4.94 KVM_S390_GET_IRQ_STATE
Capability: KVM_CAP_S390_IRQ_STATE
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq_state (out)
Returns: number of bytes copied into the buffer (>= 0) on success,
         -EINVAL if buffer size is 0,
         -ENOBUFS if buffer size is too small to fit all pending interrupts,
         -EFAULT if the buffer address was invalid
This ioctl allows userspace to retrieve the complete state of all currently
pending interrupts in a single buffer. Use cases include migration
and introspection. The parameter structure contains the address of a
userspace buffer and its length:
struct kvm_s390_irq_state {
	__u64 buf;
	__u32 flags;
	__u32 len;
	__u32 reserved[4];
};
Userspace passes in the above struct and for each pending interrupt a
struct kvm_s390_irq is copied to the provided buffer.
If -ENOBUFS is returned the buffer provided was too small and userspace
may retry with a bigger buffer.
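The retry on -ENOBUFS can be sketched as a loop that doubles the buffer size
up to a caller-chosen limit. This is only an illustration under stated
assumptions: the callback type example_get_irq_state_fn is hypothetical and
stands in for the real ioctl call, returning the byte count or a negative
errno value directly (a real ioctl() returns -1 and sets errno).
example_fake_get is a stand-in for the kernel side so the sketch can be
exercised without a vcpu.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper around ioctl(vcpu_fd, KVM_S390_GET_IRQ_STATE, ...):
 * returns the number of bytes copied or a negative errno value. */
typedef long (*example_get_irq_state_fn)(void *buf, uint32_t len, void *ctx);

/* Fetch the pending interrupt state, doubling the buffer on -ENOBUFS.
 * On success returns the byte count and stores the malloc'ed buffer in
 * *out (the caller frees it). On failure returns a negative errno value
 * and leaves *out set to NULL. */
static long example_fetch_irq_state(example_get_irq_state_fn get, void *ctx,
                                    uint32_t initial_len, uint32_t max_len,
                                    void **out)
{
    uint32_t len = initial_len;

    assert(get != NULL && out != NULL && initial_len > 0);
    *out = NULL;

    for (;;) {
        void *buf = malloc(len);
        long ret;

        if (buf == NULL)
            return -ENOMEM;

        ret = get(buf, len, ctx);
        if (ret >= 0) {
            *out = buf;
            return ret;
        }
        free(buf);

        /* Give up on any other error, or when doubling would exceed max_len. */
        if (ret != -ENOBUFS || len > max_len / 2)
            return ret;
        len *= 2;
    }
}

/* Demonstration stub only: pretends that *(uint32_t *)ctx bytes of
 * interrupt state are pending. */
static long example_fake_get(void *buf, uint32_t len, void *ctx)
{
    uint32_t needed = *(const uint32_t *)ctx;

    if (len == 0)
        return -EINVAL;
    if (len < needed)
        return -ENOBUFS;
    memset(buf, 0, needed);
    return (long)needed;
}
```

Doubling keeps the number of retries small; the max_len bound is a design
choice of this sketch so that a misbehaving callback cannot make the buffer
grow without limit.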
4.95 KVM_S390_SET_IRQ_STATE
Capability: KVM_CAP_S390_IRQ_STATE
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq_state (in)
Returns: 0 on success,
-EFAULT if the buffer address was invalid,
-EINVAL for an invalid buffer length (see below),
-EBUSY if there were already interrupts pending,
errors occurring when actually injecting the
interrupt. See KVM_S390_IRQ.
This ioctl allows userspace to set the complete state of all cpu-local
interrupts currently pending for the vcpu. It is intended for restoring
interrupt state after a migration. The input parameter is a userspace buffer
containing a struct kvm_s390_irq_state:
struct kvm_s390_irq_state {
	__u64 buf;
	__u32 flags;
	__u32 len;
	__u32 reserved[4];
};
The userspace memory referenced by buf contains a struct kvm_s390_irq
for each interrupt to be injected into the guest.
If one of the interrupts could not be injected for some reason the
ioctl aborts.
len must be a multiple of sizeof(struct kvm_s390_irq). It must be > 0
and it must not exceed (max_vcpus + 32) * sizeof(struct kvm_s390_irq),
which is the maximum number of possibly pending cpu-local interrupts.
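The constraints on len can be checked in userspace before calling the ioctl.
The sketch below is illustrative only. struct example_irq is a hypothetical
stand-in with the same size as struct kvm_s390_irq, assuming no union member
is larger than the 64-byte reserved array, and max_vcpus is assumed to be
supplied by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for struct kvm_s390_irq: an 8-byte type followed by a union
 * whose size is assumed to be set by its 64-byte reserved array. */
struct example_irq {
    uint64_t type;
    union {
        char reserved[64];
    } u;
};

static_assert(sizeof(struct example_irq) == 72,
              "example_irq should be 8 + 64 bytes");

/* Returns 0 if len satisfies the documented constraints, -EINVAL otherwise. */
static int example_check_irq_state_len(uint32_t len, uint32_t max_vcpus)
{
    /* Computed in 64 bits so that a large max_vcpus cannot overflow. */
    uint64_t limit = ((uint64_t)max_vcpus + 32) * sizeof(struct example_irq);

    if (len == 0)
        return -EINVAL;
    if (len % sizeof(struct example_irq) != 0)
        return -EINVAL;
    if (len > limit)
        return -EINVAL;
    return 0;
}
```

With max_vcpus set to 64, for example, the largest accepted len in this
sketch is 96 * 72 bytes.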
4.96 KVM_SMI
Capability: KVM_CAP_X86_SMM
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Queues an SMI on the thread's vcpu.
4.97 KVM_CAP_PPC_MULTITCE
Capability: KVM_CAP_PPC_MULTITCE
Architectures: ppc
Type: vm
This capability means the kernel is capable of handling hypercalls
H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
space. This significantly accelerates DMA operations for PPC KVM guests.
User space should expect that its handlers for these hypercalls
are not going to be called if user space previously registered LIOBN
in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
user space might have to advertise them to the guest. For example,
an IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
present in the "ibm,hypertas-functions" device-tree property.
The hypercalls mentioned above may or may not be processed successfully
in the kernel-based fast path. If they cannot be handled by the kernel,
they will get passed on to user space. So user space still has to have
an implementation for these despite the in-kernel acceleration.
This capability is always enabled.
4.98 KVM_CREATE_SPAPR_TCE_64
Capability: KVM_CAP_SPAPR_TCE_64
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_create_spapr_tce_64 (in)
Returns: file descriptor for manipulating the created TCE table
This is an extension of KVM_CAP_SPAPR_TCE, which only supports 32-bit
windows and is described in 4.62 KVM_CREATE_SPAPR_TCE.
This capability uses an extended struct in the ioctl interface:
/* for KVM_CAP_SPAPR_TCE_64 */
struct kvm_create_spapr_tce_64 {
__u64 liobn;
__u32 page_shift;
__u32 flags;