- Apr 02, 2022
-
-
Paolo Bonzini authored
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220313140522.1307751-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
MSR filtering requires an exit to userspace that is hard to implement and would be very slow in the case of nested VMX vmexit and vmentry MSR accesses. Document the limitation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
David Woodhouse authored
It isn't OK to cache the dirty status of a page in internal structures for an indefinite period of time. Any time a vCPU exits the run loop to userspace might be its last; the VMM might do its final check of the dirty log, flush the last remaining dirty pages to the destination and complete a live migration. If we have internal 'dirty' state which doesn't get flushed until the vCPU is finally destroyed on the source after migration is complete, then we have lost data because that will escape the final copy. This problem already exists with the use of kvm_vcpu_unmap() to mark pages dirty in e.g. VMX nesting. Note that the actual Linux MM already considers the page to be dirty since we have a writeable mapping of it. This is just about the KVM dirty logging. For the nesting-style use cases (KVM_GUEST_USES_PFN) we will need to track which gfn_to_pfn_caches have been used and explicitly mark the corresponding pages dirty before returning to userspace. But we would have needed external tracking of that anyway, rather than walking the full list of GPCs to find those belonging to this vCPU which are dirty. So let's rely *solely* on that external tracking, and keep it simple rather than laying a tempting trap for callers to fall into.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220303154127.202856-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Don't actually set a request bit in vcpu->requests when making a request purely to force a vCPU to exit the guest. Logging a request but not actually consuming it would cause the vCPU to get stuck in an infinite loop during KVM_RUN because KVM would see the pending request and bail from VM-Enter to service the request. Note, it's currently impossible for KVM to set KVM_REQ_GPC_INVALIDATE as nothing in KVM is wired up to set guest_uses_pa=true. But, it'd be all too easy for arch code to introduce use of kvm_gfn_to_pfn_cache_init() without implementing handling of the request, especially since getting test coverage of MMU notifier interaction with specific KVM features usually requires a directed test. Opportunistically rename gfn_to_pfn_cache_invalidate_start()'s wake_vcpus to evict_vcpus. The purpose of the request is to get vCPUs out of guest mode; it's supposed to _avoid_ waking vCPUs that are blocking. Opportunistically rename KVM_REQ_GPC_INVALIDATE to be more specific as to what it wants to accomplish, and to genericize the name so that it can be used for similar but unrelated scenarios, should they arise in the future. Add a comment and documentation to explain why the "no action" request exists. Add compile-time assertions to help detect improper usage. Use the inner assertless helper in the one s390 path that makes requests without a hardcoded request.
Cc: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220223165302.3205276-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
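The request machinery described above is easy to misuse, so a minimal in-kernel sketch may help; it assumes the renamed request landed as KVM_REQ_OUTSIDE_GUEST_MODE, per the final series, and is illustrative only:

    /*
     * Hypothetical caller: force all vCPUs out of guest mode. Because the
     * request is tagged as "no action", nothing is recorded in
     * vcpu->requests, so KVM_RUN has no pending request to consume and
     * cannot get stuck bailing from VM-Enter in a loop.
     */
    kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);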
-
- Mar 29, 2022
-
-
Paolo Bonzini authored
Add a section to document all the different ways in which the KVM API sucks. I am sure there are way more; this gives people a place to vent so that userspace authors are aware.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220322110712.222449-4-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Add a file to document all the different ways in which the virtual CPU emulation is imperfect. Include an example to show how to document such errata.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Message-Id: <20220322110712.222449-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
ARM already has an arm/ subdirectory, but s390 and x86 do not even though they have a relatively large number of files specific to them. Create new directories in Documentation/virt/kvm for these two architectures as well. While at it, group the API documentation and the developer documentation in the table of contents.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220322110712.222449-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
kvm->mn_invalidate_lock and kvm->slots_arch_lock were not included in the documentation; add them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220322110720.222499-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Separate the various locks clearly, and include the new names of blocked_vcpu_on_cpu_lock and blocked_vcpu_on_cpu.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220322110720.222499-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Mar 21, 2022
-
-
Oliver Upton authored
KVM_CAP_DISABLE_QUIRKS is irrevocably broken. The capability does not advertise the set of quirks which may be disabled to userspace, so it is impossible to predict the behavior of KVM. Worse yet, KVM_CAP_DISABLE_QUIRKS will tolerate any value for cap->args[0], meaning it fails to reject attempts to set invalid quirk bits. The only valid workaround for the quirky quirks API is to add a new CAP. Actually advertise the set of quirks that can be disabled to userspace so it can predict KVM's behavior. Reject values for cap->args[0] that contain invalid bits. Finally, add documentation for the new capability and describe the existing quirks.
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20220301060351.442881-5-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
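A hedged userspace sketch of the replacement capability, assuming it landed as KVM_CAP_DISABLE_QUIRKS2 as in the merged series; vm_fd is an open VM descriptor with <sys/ioctl.h> and <linux/kvm.h> included, and the quirk bit shown is just one example:

    /* Ask KVM which quirks may be disabled, then disable a subset.
     * The defined quirk bits currently fit in the int ioctl return;
     * bits outside the advertised mask are rejected by KVM_ENABLE_CAP. */
    uint64_t supported = ioctl(vm_fd, KVM_CHECK_EXTENSION,
                               KVM_CAP_DISABLE_QUIRKS2);
    struct kvm_enable_cap cap = {
        .cap = KVM_CAP_DISABLE_QUIRKS2,
        .args[0] = supported & KVM_X86_QUIRK_LINT0_REENABLED,
    };
    ioctl(vm_fd, KVM_ENABLE_CAP, &cap);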
-
- Mar 09, 2022
-
-
Oliver Upton authored
KVM support for 32-bit ARM hosts (KVM/arm) has been removed from the kernel since commit 541ad015 ("arm: Remove 32bit KVM host support"). There still exist some remnants of the old architecture in the KVM documentation. Remove all traces of 32-bit host support from the documentation. Note that AArch32 guests are still supported.
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220308172856.2997250-1-oupton@google.com
-
- Mar 01, 2022
-
-
Sean Christopherson authored
Remove the now unused KVM_REQ_MMU_RELOAD, shift KVM_REQ_VM_DEAD into the unoccupied space, and update vcpu-requests.rst, which was missing an entry for KVM_REQ_VM_DEAD. Switching KVM_REQ_VM_DEAD to entry '1' also fixes the stale comment about bits 4-7 being reserved.
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <20220225182248.3812651-7-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Feb 25, 2022
-
-
David Dunn authored
Add a new capability, KVM_CAP_PMU_CAPABILITY, that takes a bitmask of settings/features to allow userspace to configure PMU virtualization on a per-VM basis. For now, support a single flag, KVM_PMU_CAP_DISABLE, to allow disabling PMU virtualization for a VM even when KVM is configured with enable_pmu=true at the module level. To keep KVM simple, disallow changing a VM's PMU configuration after vCPUs have been created.
Signed-off-by: David Dunn <daviddunn@google.com>
Message-Id: <20220223225743.2703915-2-daviddunn@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
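A minimal sketch of the new per-VM knob (vm_fd is an open VM descriptor; <linux/kvm.h> assumed included); the ordering matters, since the setting is rejected once vCPUs exist:

    /* Disable PMU virtualization for this VM, before creating any vCPU. */
    struct kvm_enable_cap cap = {
        .cap = KVM_CAP_PMU_CAPABILITY,
        .args[0] = KVM_PMU_CAP_DISABLE,
    };
    ioctl(vm_fd, KVM_ENABLE_CAP, &cap);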
-
- Feb 22, 2022
-
-
Nicholas Piggin authored
Add KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL resource mode to 3 with the H_SET_MODE hypercall. This capability differs between processor types and KVM types (PR, HV, Nested HV), and affects guest-visible behaviour. QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and use the KVM CAP if available to determine KVM support[2].
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
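A short probe sketch, checked on the VM descriptor since the answer depends on the KVM type; illustrative only:

    /* A VMM such as QEMU can gate its cap-ail-mode-3 setting on this. */
    if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_AIL_MODE_3) > 0) {
        /* H_SET_MODE with AIL resource mode 3 is supported here */
    }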
-
Janis Schoetterl-Glausch authored
Clarify that the key argument represents the access key, not the whole storage key.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/r/20220221143657.3712481-1-scgl@linux.ibm.com
Fixes: 5e35d0eb ("KVM: s390: Update api documentation for memop ioctl")
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
-
- Feb 21, 2022
-
-
Will Deacon authored
When handling reset and power-off PSCI calls from the guest, we initialise X0 to PSCI_RET_INTERNAL_FAILURE in case the VMM tries to re-run the vCPU after issuing the call. Unfortunately, this also means that the VMM cannot see which PSCI call was issued and therefore cannot distinguish between PSCI SYSTEM_RESET and SYSTEM_RESET2 calls, which is necessary in order to determine the validity of the "reset_type" in X1. Allocate bit 0 of the previously unused 'flags' field of the system_event structure so that we can indicate the PSCI call used to initiate the reset.
Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220221153524.15397-4-will@kernel.org
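A sketch of how a VMM might consume the new flag after KVM_RUN returns; the flag name follows the final series (KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2) and 'run' is the mmap'ed struct kvm_run, so treat the details as illustrative:

    if (run->exit_reason == KVM_EXIT_SYSTEM_EVENT &&
        run->system_event.type == KVM_SYSTEM_EVENT_RESET) {
        if (run->system_event.flags & KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2) {
            /* SYSTEM_RESET2: the guest's X1 held a "reset_type" to validate */
        } else {
            /* plain PSCI SYSTEM_RESET */
        }
    }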
-
- Feb 17, 2022
-
-
Aaron Lewis authored
Follow the precedent set by other architectures that support the VCPU ioctl, KVM_ENABLE_CAP, and advertise the VM extension, KVM_CAP_ENABLE_CAP. This way, userspace can ensure that KVM_ENABLE_CAP is available on a vcpu before using it.
Fixes: 5c919412 ("kvm/x86: Hyper-V synthetic interrupt controller")
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220214212950.1776943-1-aaronlewis@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
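The resulting userspace pattern, sketched with the Hyper-V SynIC capability that motivated the fix (vm_fd/vcpu_fd are open descriptors; error handling elided):

    /* Probe the VM-level extension before using the vCPU ioctl. */
    if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ENABLE_CAP) > 0) {
        struct kvm_enable_cap cap = { .cap = KVM_CAP_HYPERV_SYNIC };
        ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
    }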
-
- Feb 14, 2022
-
-
Janis Schoetterl-Glausch authored
Document all currently existing operations and flags, and explain under which circumstances they are available. Document the recently introduced absolute operations and the storage key protection flag, as well as the existing SIDA operations.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20220211182215.2730017-10-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
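A hedged sketch of one documented combination, an absolute read checked against an access key (vm_fd open; guest_absolute_addr, length, buffer and access_key are hypothetical values):

    struct kvm_s390_mem_op op = {
        .gaddr = guest_absolute_addr,
        .size  = length,
        .buf   = (uint64_t)(uintptr_t)buffer,
        .op    = KVM_S390_MEMOP_ABSOLUTE_READ,
        .flags = KVM_S390_MEMOP_F_SKEY_PROTECTION,
        .key   = access_key,   /* the 4-bit access key, not the full storage key */
    };
    ioctl(vm_fd, KVM_S390_MEM_OP, &op);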
-
- Feb 08, 2022
-
-
Alexandru Elisei authored
Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU device ioctl. If the VCPU is scheduled on a physical CPU which has a different PMU, the perf events needed to emulate a guest PMU won't be scheduled in and the guest performance counters will stop counting. Treat it as a userspace error and refuse to run the VCPU in this situation.
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-7-alexandru.elisei@arm.com
-
Alexandru Elisei authored
When KVM creates an event and there is more than one PMU present on the system, perf_init_event() will go through the list of available PMUs and will choose the first one that can create the event. The order of the PMUs in this list depends on the probe order, which can change under various circumstances, for example if the order of the PMU nodes changes in the DTB or if asynchronous driver probing is enabled on the kernel command line (with the driver_async_probe=armv8-pmu option). Another consequence of this approach is that on heterogeneous systems all virtual machines that KVM creates will use the same PMU. This might cause unexpected behaviour for userspace: when a VCPU is executing on the physical CPU that uses this default PMU, PMU events in the guest work correctly; but when the same VCPU executes on another CPU, PMU events in the guest will suddenly stop counting. Fortunately, perf core allows the user to specify on which PMU to create an event by using the perf_event_attr->type field, which is used by perf_init_event() as an index in the radix tree of available PMUs. Add the KVM_ARM_VCPU_PMU_V3_CTRL(KVM_ARM_VCPU_PMU_V3_SET_PMU) VCPU attribute to allow userspace to specify the arm_pmu that KVM will use when creating events for that VCPU. KVM will make no attempt to run the VCPU on the physical CPUs that share the PMU, leaving it up to userspace to manage the VCPU threads' affinity accordingly. To ensure that KVM doesn't expose an asymmetric system to the guest, the PMU set for one VCPU will be used by all other VCPUs. Once a VCPU has run, the PMU cannot be changed in order to avoid changing the list of available events for a VCPU, or to change the semantics of existing events.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-6-alexandru.elisei@arm.com
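A sketch of the new attribute in use (vcpu_fd open; the PMU id shown is a hypothetical value that would be read from sysfs):

    /* Read /sys/bus/event_source/devices/<pmu>/type into pmu_type first. */
    int pmu_type = 8;   /* hypothetical id for the desired host PMU */
    struct kvm_device_attr attr = {
        .group = KVM_ARM_VCPU_PMU_V3_CTRL,
        .attr  = KVM_ARM_VCPU_PMU_V3_SET_PMU,
        .addr  = (uint64_t)(uintptr_t)&pmu_type,
    };
    ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);
    /* Userspace must also pin the vCPU threads to CPUs sharing this PMU. */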
-
Marc Zyngier authored
Userspace can specify which events a guest is allowed to use with the KVM_ARM_VCPU_PMU_V3_FILTER attribute. The list of allowed events can be identified by a guest from reading the PMCEID{0,1}_EL0 registers. Changing the PMU event filter after a VCPU has run can cause reads of the registers performed before the filter is changed to return different values than reads performed with the new event filter in place. The architecture defines the two registers as read-only, and this behaviour contradicts that. Keep track when the first VCPU has run and deny changes to the PMU event filter to prevent this from happening.
Signed-off-by: Marc Zyngier <maz@kernel.org>
[ Alexandru E: Added commit message, updated ioctl documentation ]
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220127161759.53553-2-alexandru.elisei@arm.com
-
- Jan 28, 2022
-
-
Paolo Bonzini authored
Because KVM_GET_SUPPORTED_CPUID is meant to be passed (by simple-minded VMMs) to KVM_SET_CPUID2, it cannot include any dynamic xsave states that have not been enabled. Probing those, for example so that they can be passed to ARCH_REQ_XCOMP_GUEST_PERM, requires a new ioctl or arch_prctl. The latter is in fact worse, even though that is what the rest of the API uses, because it would require supported_xcr0 to be moved from the KVM module to the kernel just for this use. In addition, the value would be nonsensical (or an error would have to be returned) until the KVM module is loaded. Therefore, to limit the growth of system ioctls, add a /dev/kvm variant of KVM_{GET,HAS}_DEVICE_ATTR, and implement it in x86 with just one group (0) and attribute (KVM_X86_XCOMP_GUEST_SUPP).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
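A sketch of the new system-level attribute (kvm_fd is the open /dev/kvm descriptor; error handling elided):

    /* Probe dynamic-xstate support before any VM or vCPU exists. */
    uint64_t xcomp_guest_supp = 0;
    struct kvm_device_attr attr = {
        .group = 0,
        .attr  = KVM_X86_XCOMP_GUEST_SUPP,
        .addr  = (uint64_t)(uintptr_t)&xcomp_guest_supp,
    };
    if (ioctl(kvm_fd, KVM_HAS_DEVICE_ATTR, &attr) == 0 &&
        ioctl(kvm_fd, KVM_GET_DEVICE_ATTR, &attr) == 0) {
        /* xcomp_guest_supp now holds the supported guest xcomp bits */
    }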
-
- Jan 20, 2022
-
-
Wei Wang authored
Use the API number 134 for KVM_GET_XSAVE2, instead of 42, which has been used by KVM_GET_XSAVE. Also, fix the WARNINGs about the underlines being too short.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Message-Id: <20220120045003.315177-1-wei.w.wang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Jan 14, 2022
-
-
Guang Zeng authored
With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set xstate data from/to KVM. This doesn't work when dynamic xfeatures (e.g. AMX) are exposed to the guest as they require a larger buffer size. Introduce a new capability (KVM_CAP_XSAVE2). Userspace VMM gets the required xstate buffer size via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). KVM_SET_XSAVE is extended to work with both legacy and new capabilities by doing properly-sized memdup_user() based on the guest fpu container. KVM_GET_XSAVE is kept for backward-compatibility reasons. Instead, KVM_GET_XSAVE2 is introduced under KVM_CAP_XSAVE2 as the preferred interface for getting xstate buffer (4KB or larger size) from KVM (Link: https://lkml.org/lkml/2021/12/15/510 ). Also, update the API doc with the new KVM_GET_XSAVE2 ioctl.
Signed-off-by: Guang Zeng <guang.zeng@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-19-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
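The intended userspace flow, sketched (vm_fd/vcpu_fd open; error handling elided):

    /* The extension's return value is the required buffer size, >= 4096. */
    int size = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_XSAVE2);
    if (size < 4096)
        size = 4096;
    struct kvm_xsave *xsave = calloc(1, size);
    ioctl(vcpu_fd, KVM_GET_XSAVE2, xsave);  /* fills up to 'size' bytes */
    /* ... transfer the buffer to the destination vCPU ... */
    ioctl(vcpu_fd, KVM_SET_XSAVE, xsave);   /* sized per the guest fpu container */
    free(xsave);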
-
- Jan 07, 2022
-
-
Jing Liu authored
KVM_GET_SUPPORTED_CPUID should not include any dynamic xstates in CPUID[0xD] if they have not been requested with prctl. Otherwise a process which directly passes KVM_GET_SUPPORTED_CPUID to KVM_SET_CPUID2 would now fail even if it doesn't intend to use a dynamically enabled feature. Userspace must know that prctl is required and allocate a >4K xstate buffer before setting any dynamic bit.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20220105123532.12586-5-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
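A hedged sketch of the required prctl step on x86_64, using AMX tile data (xfeature bit 18) as the dynamic feature; the constants mirror the kernel uapi in asm/prctl.h:

    #include <sys/syscall.h>
    #include <unistd.h>

    #define ARCH_REQ_XCOMP_GUEST_PERM 0x1025  /* from asm/prctl.h */
    #define XFEATURE_XTILEDATA        18

    /* Without this request, KVM_GET_SUPPORTED_CPUID omits the dynamic state
     * from CPUID[0xD], so a plain GET->SET_CPUID2 round trip keeps working. */
    syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_GUEST_PERM, XFEATURE_XTILEDATA);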
-
David Woodhouse authored
This adds basic support for delivering 2 level event channels to a guest. Initially, it only supports delivery via the IRQ routing table, triggered by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast() function which will use the pre-mapped shared_info page if it already exists and is still valid, while the slow path through the irqfd_inject workqueue will remap the shared_info page if necessary. It sets the bits in the shared_info page but not the vcpu_info; that is deferred to __kvm_xen_has_interrupt() which raises the vector to the appropriate vCPU. Add a 'verbose' mode to xen_shinfo_test while adding test cases for this.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20211210163625.2886-5-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
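A hedged sketch of wiring one such event channel through the routing table; the type and field names follow the merged series (KVM_IRQ_ROUTING_XEN_EVTCHN), while gsi, port and vcpu_id are illustrative:

    struct kvm_irq_routing_entry entry = {
        .gsi  = gsi,
        .type = KVM_IRQ_ROUTING_XEN_EVTCHN,
        .u.xen_evtchn = {
            .port     = port,       /* guest event channel port */
            .vcpu     = vcpu_id,    /* vCPU whose pending bits get set */
            .priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL,
        },
    };
    /* Install with KVM_SET_GSI_ROUTING, then bind an eventfd to the GSI
     * via KVM_IRQFD so writes to the eventfd deliver the channel. */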
-
David Woodhouse authored
Use the newly reinstated gfn_to_pfn_cache to maintain a kernel mapping of the Xen shared_info page so that it can be accessed in atomic context. Note that we do not participate in dirty tracking for the shared info page and we do not explicitly mark it dirty every single time we deliver an event channel interrupt. We wouldn't want to do that even if we *did* have a valid vCPU context with which to do so.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20211210163625.2886-4-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Dec 17, 2021
-
-
David Rientjes authored
Add a new module parameter to allow users to use SEV_INIT_EX instead of SEV_INIT. This helps users who lock their SPI bus to use the PSP for SEV functionality. The 'init_ex_path' parameter defaults to NULL, which means the kernel will use SEV_INIT; if a path is specified, SEV_INIT_EX will be used with the data found at the path. On certain PSP commands this file is written to as the PSP updates the NV memory region. Depending on file system initialization this file open may fail during module init, but the CCP driver for SEV already has sufficient retries for platform initialization. During normal operation of PSP system and SEV commands, if the PSP has not been initialized it is initialized at run time. If the file at 'init_ex_path' does not exist the PSP will not be initialized. The user must create the file prior to use with 32Kb of 0xFFs per spec.
Signed-off-by: David Rientjes <rientjes@google.com>
Co-developed-by: Peter Gonda <pgonda@google.com>
Signed-off-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Rientjes <rientjes@google.com>
Cc: John Allen <john.allen@amd.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- Dec 08, 2021
-
-
Lai Jiangshan authored
This bit is very close to meaning "role.quadrant is not in use", except that it is false also when the MMU is mapping guest physical addresses directly. In that case, role.quadrant is indeed not in use, but there are no guest PTEs at all. Changing the name and direction of the bit removes the special case, since a guest with paging disabled, or not considering guest paging structures as is the case for two-dimensional paging, does not have to deal with 4-byte guest PTEs.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-10-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Dec 07, 2021
-
-
Janis Schoetterl-Glausch authored
They are defined in include/uapi/linux/kvm.h as KVM_S390_GET_SKEYS_NONE and KVM_S390_SKEYS_MAX, but the API documentation talks of KVM_S390_GET_KEYS_NONE and KVM_S390_SKEYS_ALLOC_MAX respectively.
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20211118102522.569660-1-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
-
- Nov 11, 2021
-
-
Peter Gonda authored
For SEV to work with intra host migration, contents of the SEV info struct such as the ASID (used to index the encryption key in the AMD SP) and the list of memory regions need to be transferred to the target VM. This change adds a command for a target VMM to get a source SEV VM's SEV info.
Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-3-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Oct 18, 2021
-
-
Oliver Upton authored
Introduce a KVM selftest to verify that userspace manipulation of the TSC (via the new vCPU attribute) results in the correct behavior within the guest.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210916181555.973085-6-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Oliver Upton authored
To date, VMM-directed TSC synchronization and migration has been a bit messy. KVM has some baked-in heuristics around TSC writes to infer if the VMM is attempting to synchronize. This is problematic, as it depends on host userspace writing to the guest's TSC within 1 second of the last write. A much cleaner approach to configuring the guest's views of the TSC is to simply migrate the TSC offset for every vCPU. Offsets are idempotent, and thus not subject to change depending on when the VMM actually reads/writes values from/to KVM. The VMM can then read the TSC once with KVM_GET_CLOCK to capture a (realtime, host_tsc) pair at the instant when the guest is paused.
Cc: David Matlack <dmatlack@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210916181538.968978-8-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
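A sketch of the per-vCPU offset migration this enables, assuming the attribute pair landed as KVM_VCPU_TSC_CTRL / KVM_VCPU_TSC_OFFSET, as in the merged series:

    int64_t tsc_offset;
    struct kvm_device_attr attr = {
        .group = KVM_VCPU_TSC_CTRL,
        .attr  = KVM_VCPU_TSC_OFFSET,
        .addr  = (uint64_t)(uintptr_t)&tsc_offset,
    };
    ioctl(src_vcpu_fd, KVM_GET_DEVICE_ATTR, &attr);  /* capture on the source */
    ioctl(dst_vcpu_fd, KVM_SET_DEVICE_ATTR, &attr);  /* replay on the target */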
-
Oliver Upton authored
Handling the migration of TSCs correctly is difficult, in part because Linux does not provide userspace with the ability to retrieve a (TSC, realtime) clock pair for a single instant in time. In lieu of a more convenient facility, KVM can report similar information in the kvm_clock structure. Provide userspace with a host TSC & realtime pair iff the realtime clock is based on the TSC. If userspace provides KVM_SET_CLOCK with a valid realtime value, advance the KVM clock by the amount of elapsed time. Do not step the KVM clock backwards, though, as it is a monotonic oscillator.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210916181538.968978-5-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
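The capture step, sketched (vm_fd open; field names per the extended kvm_clock_data):

    struct kvm_clock_data data = { 0 };
    ioctl(vm_fd, KVM_GET_CLOCK, &data);
    if (data.flags & KVM_CLOCK_REALTIME) {
        /* data.realtime was read at the same instant as data.clock, so the
         * destination can advance the KVM clock by the elapsed realtime
         * rather than guessing -- and never step it backwards. */
    }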
-
- Oct 12, 2021
-
-
Randy Dunlap authored
Fix various typos, command syntax, punctuation, capitalization, and whitespace.
Fixes: 04301bf5 ("docs: replace the old User Mode Linux HowTo with a new one")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: linux-um@lists.infradead.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Acked-By: anton ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20211010064827.3405-1-rdunlap@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
-
- Oct 04, 2021
-
-
Anup Patel authored
Document RISC-V specific parts of the KVM API, such as:
- The interrupt numbers passed to the KVM_INTERRUPT ioctl.
- The states supported by the KVM_{GET,SET}_MP_STATE ioctls.
- The registers supported by the KVM_{GET,SET}_ONE_REG interface and the encoding of those register ids.
- The exit reason KVM_EXIT_RISCV_SBI for SBI calls forwarded to a userspace tool.
CC: Jonathan Corbet <corbet@lwn.net>
CC: linux-doc@vger.kernel.org
Signed-off-by: Anup Patel <anup.patel@wdc.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
-
- Sep 30, 2021
-
-
Juergen Gross authored
KVM_MAX_VCPU_ID is not specifying the highest allowed vcpu-id, but the number of allowed vcpu-ids. This has already led to confusion, so rename KVM_MAX_VCPU_ID to KVM_MAX_VCPU_IDS to make its semantics more clear.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210913135745.13944-3-jgross@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- Sep 14, 2021
-
-
Andra Paraschiv authored
Add references for hugepages and booting steps for Arm64. Include info about the currently supported architectures for the NE kernel driver.
Reviewed-by: George-Aurelian Popescu <popegeo@amazon.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20210827154930.40608-3-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Aug 20, 2021
-
-
Maxim Levitsky authored
KVM_GUESTDBG_BLOCKIRQ will allow KVM to block all interrupts while running. This change is mostly intended for more robust single stepping of the guest and it has the following benefits when enabled:
* Resuming from a breakpoint is much more reliable. When resuming execution from a breakpoint, with interrupts enabled, more often than not, KVM would inject an interrupt and make the CPU jump immediately to the interrupt handler and eventually return to the breakpoint, to trigger it again. From the user point of view it looks like the CPU never executed a single instruction and in some cases that can even prevent forward progress, for example, when the breakpoint is placed by an automated script (e.g. lx-symbols), which does something in response to the breakpoint and then continues the guest automatically. If the script execution takes enough time for another interrupt to arrive, the guest will be stuck on the same breakpoint RIP forever.
* Normal single stepping is much more predictable, since it won't land the debugger into an interrupt handler.
* RFLAGS.TF has less chance to be leaked to the guest: We set that flag behind the guest's back to do single stepping but if single step lands us into an interrupt/exception handler it will be leaked to the guest in the form of being pushed to the stack. This doesn't completely eliminate this problem as exceptions can still happen, but at least this reduces the chances of this happening.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210811122927.900604-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
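The knob slots into the existing guest-debug control word; a minimal sketch (vcpu_fd open, <linux/kvm.h> included):

    /* Single-step without being diverted into interrupt handlers. */
    struct kvm_guest_debug dbg = {
        .control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP |
                   KVM_GUESTDBG_BLOCKIRQ,
    };
    ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &dbg);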
-
Jing Zhang authored
Add documentation for linear and logarithmic histogram statistics.
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210802165633.1866976-3-jingzhangos@google.com>
[Small changes to the phrasing. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-