- Jan 23, 2015
-
-
Dominik Dingel authored
With commit c6c956b8 ("KVM: s390/mm: support gmap page tables with less than 5 levels") we are able to define a limit for the guest memory size. As we round up the guest size in respect to the levels of page tables we get to guest limits of: 2048 MB, 4096 GB, 8192 TB and 16384 PB. We currently limit the guest size to 16 TB, which means we end up creating a page table structure supporting guest sizes up to 8192 TB. This patch introduces an interface that allows userspace to tune this limit. This may bring performance improvements for small guests. Signed-off-by:
Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
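The rounding described in this commit can be sketched in a few lines of C. This is only an illustration of the arithmetic: the helper name `gmap_bits_for_size` and the table are hypothetical and are not taken from the kernel source. The four shifts correspond to the 2048 MB, 4096 GB, 8192 TB and 16384 PB limits quoted above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address bits covered by each possible page-table depth. 2^31 bytes is
 * 2048 MB, 2^42 is 4096 GB, 2^53 is 8192 TB and 2^64 is 16384 PB.
 */
static const unsigned int gmap_level_bits[] = { 31, 42, 53, 64 };

/*
 * Hypothetical helper: return the number of address bits of the smallest
 * page-table structure that can map `size` bytes of guest memory.
 */
unsigned int gmap_bits_for_size(uint64_t size)
{
    assert(size > 0);
    uint64_t last = size - 1;   /* highest guest address that must be mapped */

    for (unsigned int i = 0; i < 3; i++) {
        if ((last >> gmap_level_bits[i]) == 0)
            return gmap_level_bits[i];
    }
    return 64;                  /* anything larger needs the full hierarchy */
}
```

With a 16 TB limit, `gmap_bits_for_size(16ULL << 40)` yields 53, which is the 8192 TB structure mentioned in the message; a limit of 2048 MB or less fits in the shallowest structure.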
-
- Dec 13, 2014
-
-
Christoffer Dall authored
When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, all vcpus of the VM should be turned off, as the PSCI spec suggests; it is also the sane thing to do. Also, clarify the behavior and expectations for exits to user space with the KVM_EXIT_SYSTEM_EVENT case. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
It is not clear that this ioctl can be called multiple times for a given vcpu. Userspace already does this, so clarify the ABI. Also specify that userspace is expected to always make secondary and subsequent calls to the ioctl with the same parameters for the VCPU as the initial call (which userspace also already does). Add code to check that userspace doesn't violate that ABI in the future, and move the kvm_vcpu_set_target() function which is currently duplicated between the 32-bit and 64-bit versions in guest.c to a common static function in arm.c, shared between both architectures. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
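The rule this commit enforces (repeated init calls for one vcpu must carry the same parameters) can be modelled as follows. This is a simplified userspace sketch, not the kernel code: the struct names, `NR_FEATURE_WORDS` and `vcpu_model_init` are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_FEATURE_WORDS 7   /* hypothetical size of the feature bitmap */

/* Hypothetical stand-in for the parameters userspace passes at init time. */
struct vcpu_init_params {
    uint32_t target;
    uint32_t features[NR_FEATURE_WORDS];
};

/* Hypothetical per-vcpu record of the first successful init call. */
struct vcpu_model {
    bool initialized;
    struct vcpu_init_params first;
};

/*
 * The first call records its parameters. Every later call must repeat
 * them exactly; otherwise the model rejects it with -EINVAL.
 */
int vcpu_model_init(struct vcpu_model *vcpu, const struct vcpu_init_params *p)
{
    assert(vcpu != NULL && p != NULL);

    if (vcpu->initialized) {
        if (vcpu->first.target != p->target ||
            memcmp(vcpu->first.features, p->features,
                   sizeof(p->features)) != 0)
            return -EINVAL;
        return 0;
    }

    vcpu->first = *p;
    vcpu->initialized = true;
    return 0;
}
```

A second call with identical parameters succeeds, while changing the target or any feature bit between calls fails.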
-
Christoffer Dall authored
The implementation of KVM_ARM_VCPU_INIT is currently not doing what userspace expects, namely making sure that a vcpu which may have been turned off using PSCI is returned to its initial state, which would be powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag. Implement the expected functionality and clarify the ABI. Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
- Nov 20, 2014
-
-
Tiejun Chen authored
kvm/ia64 is gone, clean up Documentation too. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Nov 07, 2014
-
-
Dominik Dingel authored
Documentation uses incorrect attribute names for some vm device attributes: fix this. Signed-off-by:
Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
- Nov 03, 2014
-
-
Michael S. Tsirkin authored
No kernel ever reported KVM_CAP_DEVICE_MSIX, KVM_CAP_DEVICE_MSI, KVM_CAP_DEVICE_ASSIGNMENT, KVM_CAP_DEVICE_DEASSIGNMENT. This makes the documentation wrong, and no application ever written to use these capabilities has a chance to work correctly. The only way to detect support is to try, and test errno for ENOTTY. That's unfortunate, but we can't fix the past. Document the actual semantics, and drop the definitions from the exported header to make it easier for application developers to note and fix the bug. Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
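The detection scheme described here (attempt the call, then test errno for ENOTTY) might look like the sketch below. The helper names are hypothetical. Note that the probe is a real call, so on a kernel that does support the request it is carried out with whatever argument is passed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/ioctl.h>

/*
 * Interpret the result of a trial ioctl. The request counts as unsupported
 * only when the call failed with ENOTTY; in this sketch any other outcome,
 * including other errors, is taken to mean the request was recognized.
 */
bool ioctl_result_means_supported(int ret, int err)
{
    return !(ret == -1 && err == ENOTTY);
}

/* Hypothetical wrapper: issue the ioctl once and classify the outcome. */
bool probe_ioctl_supported(int fd, unsigned long request, void *arg)
{
    assert(fd >= 0);
    int ret = ioctl(fd, request, arg);

    return ioctl_result_means_supported(ret, errno);
}
```

Keeping the classification separate from the call makes the errno rule easy to check without a KVM file descriptor.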
-
Tiejun Chen authored
When commit 6adba527 (KVM: Let host know whether the guest can handle async PF in non-userspace context.) was introduced, bit 2 remained reserved and should be zero. Instead, bit 1 is set to 1 to indicate that asynchronous page faults can be injected when the vcpu is at cpl == 0; see kvm_para.h: #define KVM_ASYNC_PF_SEND_ALWAYS (1 << 1). Signed-off-by:
Tiejun Chen <tiejun.chen@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Sep 22, 2014
-
-
Bharat Bhushan authored
This was missed in respective one_reg implementation patch. Signed-off-by:
Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Sep 19, 2014
-
-
Marc Zyngier authored
In order to make the number of interrupts configurable, use the new fancy device management API to add KVM_DEV_ARM_VGIC_GRP_NR_IRQS as a VGIC configurable attribute. Userspace can now specify the exact size of the GIC (by increments of 32 interrupts). Reviewed-by:
Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com>
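Since the GIC size is specified in increments of 32 interrupts, userspace may want to round a requested count before setting the attribute. The helpers below are a hypothetical sketch of that arithmetic only; any lower or upper bound the kernel enforces is not stated in this message and is not modelled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define VGIC_IRQ_STEP 32u   /* granularity stated in the commit message */

/* Hypothetical helper: round a wanted interrupt count up to the next step. */
unsigned int vgic_round_up_nr_irqs(unsigned int wanted)
{
    assert(wanted > 0 && wanted <= UINT_MAX - (VGIC_IRQ_STEP - 1));
    return (wanted + VGIC_IRQ_STEP - 1) / VGIC_IRQ_STEP * VGIC_IRQ_STEP;
}

/* Hypothetical helper: true if the count is a non-zero multiple of the step. */
bool vgic_nr_irqs_is_aligned(unsigned int nr_irqs)
{
    return nr_irqs != 0 && nr_irqs % VGIC_IRQ_STEP == 0;
}
```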
-
- Sep 10, 2014
-
-
Alex Bennée authored
It looks like when this was initially merged it got accidentally included in the following section. I've just moved it back in the correct section and re-numbered it as other ioctls have been added since. Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Acked-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Alex Bennée authored
In preparation for working on the ARM implementation I noticed the debug interface was missing from the API document. I've pieced together the expected behaviour from the code and commit messages and written it up as best I can. Signed-off-by:
Alex Bennée <alex.bennee@linaro.org> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Sep 03, 2014
-
-
David Matlack authored
vcpu exits and memslot mutations can run concurrently as long as the vcpu does not acquire the slots mutex. Thus it is theoretically possible for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after synchronize_srcu_expedited(), vcpus can safely cache memslot generation without maintaining a single rcu_dereference through an entire vm exit. And much of the x86/kvm code does not maintain a single rcu_dereference of the current memslots during each exit.

We can prevent the following case:

     vcpu (CPU 0)                           | thread (CPU 1)
--------------------------------------------+--------------------------
1  vm exit                                  |
2  srcu_read_unlock(&kvm->srcu)             |
3  decide to cache something based on       |
     old memslots                           |
4                                           | change memslots
                                            | (increments generation)
5                                           | synchronize_srcu(&kvm->srcu);
6  retrieve generation # from new memslots  |
7  tag cache with new memslot generation    |
8  srcu_read_unlock(&kvm->srcu)             |
...                                         |
   <action based on cache occurs even       |
    though the caching decision was based   |
    on the old memslots>                    |
...                                         |
   <action *continues* to occur until next  |
    memslot generation change, which may    |
    be never>                               |

By incrementing the generation after synchronizing with kvm->srcu readers, we ensure that the generation retrieved in (6) will become invalid soon after (8).

Keeping the existing increment is not strictly necessary, but we do keep it and just move it for consistency from update_memslots to install_new_memslots. It invalidates old cached MMIOs immediately, instead of having to wait for the end of synchronize_srcu_expedited, which makes the code more clearly correct in case CPU 1 is preempted right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the low bit of the generation is zero when reconstructing a generation number out of an SPTE. This effectively disables MMIO caching in SPTEs during the call to synchronize_srcu_expedited.
Using the low bit this way is somewhat like a seqcount---where the protected thing is a cache, and instead of retrying we can simply punt if we observe the low bit to be 1. Cc: stable@vger.kernel.org Signed-off-by:
David Matlack <dmatlack@google.com> Reviewed-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
David Matlack <dmatlack@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
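The double increment and the low-bit convention described in this commit can be modelled in plain C. This is a toy, single-threaded model meant only to show why a tag taken during the update window never validates; the struct and function names are hypothetical and no SRCU synchronization is represented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the memslot generation counter. */
struct memslots_model {
    uint64_t generation;
};

/* Hypothetical cache entry tagged with a generation whose low bit is dropped. */
struct mmio_cache_model {
    bool tagged;
    uint64_t gen;
};

/* First increment: new memslots installed, generation becomes odd. */
void memslots_begin_update(struct memslots_model *m)
{
    assert((m->generation & 1) == 0);
    m->generation++;
}

/* Second increment: readers have been waited for, generation is even again. */
void memslots_end_update(struct memslots_model *m)
{
    assert((m->generation & 1) == 1);
    m->generation++;
}

/* Store the tag without its low bit, as if that bit were presumed zero. */
void cache_tag(struct mmio_cache_model *c, const struct memslots_model *m)
{
    c->gen = m->generation & ~UINT64_C(1);
    c->tagged = true;
}

/* A tag is usable only if it equals the full current generation. */
bool cache_is_current(const struct mmio_cache_model *c,
                      const struct memslots_model *m)
{
    return c->tagged && c->gen == m->generation;
}
```

In this model a tag written while the generation is odd can match neither the odd value nor the even value that follows it, so caching is effectively off during the update window and every older tag is stale once the update ends.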
-
- Aug 25, 2014
-
-
David Hildenbrand authored
This patch clarifies that kvm_dirty_regs are just a hint to the kernel and that the kernel might just ignore some flags and sync the values (like done for acrs and gprs now). Signed-off-by:
David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
- Jul 28, 2014
-
-
Alexander Graf authored
DCR handling was only needed for 440 KVM. Since we removed it, we can also remove handling of DCR accesses. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
The KVM_CHECK_EXTENSION is only available on the kvm fd today. Unfortunately on PPC some of the capabilities change depending on the way a VM was created. So instead we need a way to expose capabilities as VM ioctl, so that we can see which VM type we're using (HV or PR). To enable this, add the KVM_CHECK_EXTENSION ioctl to our vm ioctl portfolio. Signed-off-by:
Alexander Graf <agraf@suse.de> Acked-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Alexey Kardashevskiy authored
Unfortunately, the LPCR got defined as a 32-bit register in the one_reg interface. This is unfortunate because KVM allows userspace to control the DPFD (default prefetch depth) field, which is in the upper 32 bits. The result is that DPFD always gets set to 0, which reduces performance in the guest. We can't just change KVM_REG_PPC_LPCR to be a 64-bit register ID, since that would break existing userspace binaries. Instead we define a new KVM_REG_PPC_LPCR_64 id which is 64-bit. Userspace can still use the old KVM_REG_PPC_LPCR id, but it now only modifies those fields in the bottom 32 bits that userspace can modify (ILE, TC and AIL). If userspace uses the new KVM_REG_PPC_LPCR_64 id, it can modify DPFD as well. Signed-off-by:
Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by:
Paul Mackerras <paulus@samba.org> Cc: stable@vger.kernel.org Signed-off-by:
Alexander Graf <agraf@suse.de>
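The truncation problem is easy to reproduce in isolation. In the sketch below the DPFD bit position (a 3-bit field starting at bit 52) is an assumption for illustration; the message itself only states that the field lies in the upper 32 bits. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LPCR_DPFD_SHIFT 52   /* assumed position of the DPFD field */
#define LPCR_DPFD_MASK  (UINT64_C(7) << LPCR_DPFD_SHIFT)

static_assert(LPCR_DPFD_SHIFT >= 32, "DPFD must lie in the upper word");

/* Extract the DPFD field from a 64-bit LPCR value. */
unsigned int lpcr_dpfd(uint64_t lpcr)
{
    return (unsigned int)((lpcr & LPCR_DPFD_MASK) >> LPCR_DPFD_SHIFT);
}

/*
 * Model of passing the value through a 32-bit register ID: only the low
 * 32 bits survive, so every field in the upper word reads back as zero.
 */
uint64_t lpcr_via_32bit_id(uint64_t wanted)
{
    uint32_t transported = (uint32_t)wanted;

    return transported;
}
```

A value with DPFD set to 4 reads back with DPFD equal to 0 after the 32-bit round trip, while bits in the lower word are preserved.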
-
Paul Mackerras authored
This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, the hcall is actually implemented by the kernel. If not, an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
This provides a way for userspace to control which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls is enabled for in-kernel handling. The set that is enabled is the set that has an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
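The per-VM enable/disable semantics of args[0] and args[1] can be modelled with a bitmap. This is a hedged toy model, not the kernel implementation: `MODEL_MAX_HCALL`, the H_RTAS value, the assumption that hcall numbers are multiples of 4, and all names are illustrative. The check that an hcall is actually implemented and the default-enabled set are left out.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_HCALL 0x400u    /* hypothetical upper bound of the sequence */
#define MODEL_H_RTAS    0xf000u   /* assumed value, outside the sequence */
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define NR_SLOTS        (MODEL_MAX_HCALL / 4 + 1)

/* Hypothetical per-VM state: one bit per hcall number divided by 4. */
struct vm_model {
    unsigned long enabled[(NR_SLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD];
};

/* args[0] is the hcall number, args[1] is 1 to enable or 0 to disable. */
int vm_model_enable_hcall(struct vm_model *vm, uint64_t hcall, uint64_t enable)
{
    assert(vm != NULL);

    if (hcall == MODEL_H_RTAS)          /* H_RTAS cannot be toggled this way */
        return -EINVAL;
    if (hcall > MODEL_MAX_HCALL || (hcall & 3) || enable > 1)
        return -EINVAL;

    size_t slot = hcall / 4;
    unsigned long bit = 1UL << (slot % BITS_PER_WORD);

    if (enable)
        vm->enabled[slot / BITS_PER_WORD] |= bit;
    else
        vm->enabled[slot / BITS_PER_WORD] &= ~bit;
    return 0;
}

/* True if the model would handle this hcall in the kernel. */
bool vm_model_hcall_in_kernel(const struct vm_model *vm, uint64_t hcall)
{
    if (hcall > MODEL_MAX_HCALL || (hcall & 3))
        return false;

    size_t slot = hcall / 4;

    return (vm->enabled[slot / BITS_PER_WORD] >> (slot % BITS_PER_WORD)) & 1UL;
}
```

Because the bitmap lives in the VM structure rather than in a vcpu, a change made through one call applies to every vcpu, which matches the "effective across the whole VM" wording above.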
-
- Jul 21, 2014
-
-
Cornelia Huck authored
Let's document that this is a capability that may be enabled per-vm. Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by:
David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
Cornelia Huck authored
Capabilities can be enabled on a vcpu or (since recently) on a vm. Document this and note for the existing capabilities whether they are per-vcpu or per-vm. Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by:
David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
- Jul 10, 2014
-
-
David Hildenbrand authored
This patch - adds s390 specific MP states to linux headers and documents them - implements the KVM_{SET,GET}_MP_STATE ioctls - enables KVM_CAP_MP_STATE - allows user space to control the VCPU state on s390. If user space sets the VCPU state using the ioctl KVM_SET_MP_STATE, we can disable manual changing of the VCPU state and trust user space to do the right thing. Signed-off-by:
David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
David Hildenbrand authored
Highlight the aspects of the ioctls that are actually specific to x86 and ia64. As defined restrictions (irqchip) and mp states may not apply to other architectures, these parts are flagged to belong to x86 and ia64. In preparation for the use of KVM_(S|G)ET_MP_STATE by s390. Fix a spelling error (KVM_SET_MP_STATE vs. KVM_SET_MPSTATE) on the way. Signed-off-by:
David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
- Jul 09, 2014
-
-
James Hogan authored
Document the MIPS specific parts of the KVM API, including: - The layout of the kvm_regs structure. - The interrupt number passed to KVM_INTERRUPT. - The registers supported by the KVM_{GET,SET}_ONE_REG interface, and the encoding of those register ids. - That KVM_INTERRUPT and KVM_GET_REG_LIST are supported on MIPS. Signed-off-by:
James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
James Hogan authored
Some of the MIPS registers that can be accessed with the KVM_{GET,SET}_ONE_REG interface have fairly long names, so widen the Register column of the table in the KVM_SET_ONE_REG documentation to allow them to fit. Tabs in the table are replaced with spaces at the same time for consistency. Signed-off-by:
James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
James Hogan authored
KVM_SET_SIGNAL_MASK is implemented in generic code and isn't x86 specific, so document it as being applicable for all architectures. Signed-off-by:
James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: kvm@vger.kernel.org Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-doc@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- May 30, 2014
-
-
Paul Mackerras authored
Commit b005255e ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs") added a definition of KVM_REG_PPC_WORT with the same register number as the existing KVM_REG_PPC_VRSAVE (though in fact the definitions are not identical because of the different register sizes.) For clarity, this moves KVM_REG_PPC_WORT to the next unused number, and also adds it to api.txt. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
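The remark that two definitions with the same register number are still not identical follows from how a one_reg ID packs a size field next to the number. The sketch below uses local macros that mirror the generic layout as an assumption (architecture in the top byte, log2 of the size in bytes at bit 52); the register number 0xb4 and the 16-bit number mask are purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirrors of the ID layout; the values are assumptions for this sketch. */
#define REG_ARCH_PPC   UINT64_C(0x1000000000000000)
#define REG_SIZE_SHIFT 52
#define REG_SIZE_MASK  (UINT64_C(0xf) << REG_SIZE_SHIFT)
#define REG_SIZE_U32   (UINT64_C(2) << REG_SIZE_SHIFT)   /* 2^2 = 4 bytes */
#define REG_SIZE_U64   (UINT64_C(3) << REG_SIZE_SHIFT)   /* 2^3 = 8 bytes */

static_assert((REG_SIZE_U32 & ~REG_SIZE_MASK) == 0, "size stays in its field");
static_assert((REG_SIZE_U64 & ~REG_SIZE_MASK) == 0, "size stays in its field");

/* Hypothetical helper: build an ID from a size field and a register number. */
uint64_t ppc_reg_id(uint64_t size_field, uint16_t number)
{
    return REG_ARCH_PPC | size_field | number;
}

/* Size in bytes encoded in an ID. */
unsigned int reg_size_bytes(uint64_t id)
{
    return 1u << ((id & REG_SIZE_MASK) >> REG_SIZE_SHIFT);
}

/* Register number part of an ID (illustrative 16-bit mask). */
uint16_t reg_number(uint64_t id)
{
    return (uint16_t)(id & 0xffff);
}
```

Two IDs built from the same number but different size fields compare unequal, yet share the same number, which is the situation the commit resolves by moving one of them to the next unused number.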
-
Paul Mackerras authored
Commit 3b783474 ("KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg") added definitions for several KVM_REG_PPC_* symbols but missed adding some to api.txt. This adds them. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
Old guests try to use the magic page, but map their trampoline code inside of an NX region. Since we can't fix those old kernels, try to detect whether the guest is sane or not. If not, just disable NX functionality in KVM so that old guests at least work at all. For newer guests, add a bit that we can set to keep NX functionality available. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- May 15, 2014
-
-
Cornelia Huck authored
s390 has acquired irqfd support with commit "KVM: s390: irq routing for adapter interrupts" (84223598) but failed to announce it. Let's fix that. Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
- May 06, 2014
-
-
Thomas Huth authored
Add an interface to inject clock comparator and CPU timer interrupts into the guest. This is needed for handling the external interrupt interception. Signed-off-by:
Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by:
David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
- May 05, 2014
-
-
Carlos Garcia authored
Fixed multiple spelling errors. Acked-by:
Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Carlos E. Garcia <carlos@cgarcia.org> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- Apr 30, 2014
-
-
Anup Patel authored
Currently, we don't have an exit reason to notify user space about a system-level event (e.g. system reset or shutdown) triggered by the VCPU. This patch adds exit reason KVM_EXIT_SYSTEM_EVENT for this purpose. We can also inform user space about the 'type' and architecture specific 'flags' of a system-level event using the kvm_run structure. This newly added KVM_EXIT_SYSTEM_EVENT will be used by KVM ARM/ARM64 in-kernel PSCI v0.2 support to reset/shutdown VMs. Signed-off-by:
Anup Patel <anup.patel@linaro.org> Signed-off-by:
Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Reviewed-by:
Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Anup Patel authored
We have in-kernel emulation of PSCI v0.2 in KVM ARM/ARM64. To provide PSCI v0.2 interface to VCPUs, we have to enable KVM_ARM_VCPU_PSCI_0_2 feature when doing KVM_ARM_VCPU_INIT ioctl. The patch updates documentation of KVM_ARM_VCPU_INIT ioctl to provide info regarding KVM_ARM_VCPU_PSCI_0_2 feature. Signed-off-by:
Anup Patel <anup.patel@linaro.org> Signed-off-by:
Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Acked-by:
Christoffer Dall <christoffer.dall@linaro.org> Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
- Apr 22, 2014
-
-
David Hildenbrand authored
Added documentation for diag 501, stating that no subfunctions are provided and no parameters are used. Signed-off-by:
David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
Dominik Dingel authored
To enable CMMA and to reset its state we use the vm kvm_device ioctls, encapsulating attributes within the KVM_S390_VM_MEM_CTRL group. Signed-off-by:
Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
Dominik Dingel authored
We sometimes need to get/set attributes specific to a virtual machine and so need something else than ONE_REG. Let's copy the KVM_DEVICE approach, and define the respective ioctls for the vm file descriptor. Signed-off-by:
Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com>
-
- Mar 27, 2014
-
-
Christoffer Dall authored
The KVM API documentation is not clear about the semantics of the data field on the mmio struct on the kvm_run struct. This has become problematic when supporting ARM guests on big-endian host systems with guests of both endianness types, because it is unclear how the data should be exported to user space. This should not break with existing implementations as all supported existing implementations of known user space applications (QEMU and kvmtools for virtio) only support default endianness of the architectures on the host side. Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Alexander Graf <agraf@suse.de> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Mar 21, 2014
-
-
Cornelia Huck authored
Introduce a new interrupt class for s390 adapter interrupts and enable irqfds for s390. This is depending on a new s390 specific vm capability, KVM_CAP_S390_IRQCHIP, that needs to be enabled by userspace. Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com>
-
Cornelia Huck authored
Add a new interface to register/deregister sources of adapter interrupts identified by a unique id via the flic. Adapters may also be maskable and carry a list of pinned pages. These adapters will be used by irq routing later. Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com>
-