- Oct 30, 2013
-
-
Borislav Petkov authored
Add a kvm ioctl which states which system functionality kvm emulates. The format used is that of CPUID and we return the corresponding CPUID bits set for which we do emulate functionality. Make sure ->padding is being passed on clean from userspace so that we can use it for something in the future, after the ioctl gets cast in stone. s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at it. Signed-off-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Oct 17, 2013
-
-
Paul Mackerras authored
This enables us to use the Processor Compatibility Register (PCR) on POWER7 to put the processor into architecture 2.05 compatibility mode when running a guest. In this mode the new instructions and registers that were introduced on POWER7 are disabled in user mode. This includes all the VSX facilities plus several other instructions such as ldbrx, stdbrx, popcntw, popcntd, etc. To select this mode, we have a new register accessible through the set/get_one_reg interface, called KVM_REG_PPC_ARCH_COMPAT. Setting this to zero gives the full set of capabilities of the processor. Setting it to one of the "logical" PVR values defined in PAPR puts the vcpu into the compatibility mode for the corresponding architecture level. The supported values are: 0x0f000002 Architecture 2.05 (POWER6) 0x0f000003 Architecture 2.06 (POWER7) 0x0f100003 Architecture 2.06+ (POWER7+) Since the PCR is per-core, the architecture compatibility level and the corresponding PCR value are stored in the struct kvmppc_vcore, and are therefore shared between all vcpus in a virtual core. Signed-off-by:
Paul Mackerras <paulus@samba.org> [agraf: squash in fix to add missing break statements and documentation] Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
POWER7 and later IBM server processors have a register called the Program Priority Register (PPR), which controls the priority of each hardware CPU SMT thread, and affects how fast it runs compared to other SMT threads. This priority can be controlled by writing to the PPR or by use of a set of instructions of the form or rN,rN,rN which are otherwise no-ops but have been defined to set the priority to particular levels. This adds code to context switch the PPR when entering and exiting guests and to make the PPR value accessible through the SET/GET_ONE_REG interface. When entering the guest, we set the PPR as late as possible, because if we are setting a low thread priority it will make the code run slowly from that point on. Similarly, the first-level interrupt handlers save the PPR value in the PACA very early on, and set the thread priority to the medium level, so that the interrupt handling code runs at a reasonable speed. Acked-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
This adds the ability to have a separate LPCR (Logical Partitioning Control Register) value relating to a guest for each virtual core, rather than only having a single value for the whole VM. This corresponds to what real POWER hardware does, where there is a LPCR per CPU thread but most of the fields are required to have the same value on all active threads in a core. The per-virtual-core LPCR can be read and written using the GET/SET_ONE_REG interface. Userspace can can only modify the following fields of the LPCR value: DPFD Default prefetch depth ILE Interrupt little-endian TC Translation control (secondary HPT hash group search disable) We still maintain a per-VM default LPCR value in kvm->arch.lpcr, which contains bits relating to memory management, i.e. the Virtualized Partition Memory (VPM) bits and the bits relating to guest real mode. When this default value is updated, the update needs to be propagated to the per-vcore values, so we add a kvmppc_update_lpcr() helper to do that. Signed-off-by:
Paul Mackerras <paulus@samba.org> [agraf: fix whitespace] Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
The VRSAVE register value for a vcpu is accessible through the GET/SET_SREGS interface for Book E processors, but not for Book 3S processors. In order to make this accessible for Book 3S processors, this adds a new register identifier for GET/SET_ONE_REG, and adds the code to implement it. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
This allows guests to have a different timebase origin from the host. This is needed for migration, where a guest can migrate from one host to another and the two hosts might have a different timebase origin. However, the timebase seen by the guest must not go backwards, and should go forwards only by a small amount corresponding to the time taken for the migration. Therefore this provides a new per-vcpu value accessed via the one_reg interface using the new KVM_REG_PPC_TB_OFFSET identifier. This value defaults to 0 and is not modified by KVM. On entering the guest, this value is added onto the timebase, and on exiting the guest, it is subtracted from the timebase. This is only supported for recent POWER hardware which has the TBU40 (timebase upper 40 bits) register. Writing to the TBU40 register only alters the upper 40 bits of the timebase, leaving the lower 24 bits unchanged. This provides a way to modify the timebase for guest migration without disturbing the synchronization of the timebase registers across CPU cores. The kernel rounds up the value given to a multiple of 2^24. Timebase values stored in KVM structures (struct kvm_vcpu, struct kvmppc_vcore, etc.) are stored as host timebase values. The timebase values in the dispatch trace log need to be guest timebase values, however, since that is read directly by the guest. This moves the setting of vcpu->arch.dec_expires on guest exit to a point after we have restored the host timebase so that vcpu->arch.dec_expires is a host timebase value. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Michael Neuling authored
This reserves space in get/set_one_reg ioctl for the extra guest state needed for POWER8. It doesn't implement these at all, it just reserves them so that the ABI is defined now. A few things to note here: - This add *a lot* state for transactional memory. TM suspend mode, this is unavoidable, you can't simply roll back all transactions and store only the checkpointed state. I've added this all to get/set_one_reg (including GPRs) rather than creating a new ioctl which returns a struct kvm_regs like KVM_GET_REGS does. This means we if we need to extract the TM state, we are going to need a bucket load of IOCTLs. Hopefully most of the time this will not be needed as we can look at the MSR to see if TM is active and only grab them when needed. If this becomes a bottle neck in future we can add another ioctl to grab all this state in one go. - The TM state is offset by 0x80000000. - For TM, I've done away with VMX and FP and created a single 64x128 bit VSX register space. - I've left a space of 1 (at 0x9c) since Paulus needs to add a value which applies to POWER7 as well. Signed-off-by:
Michael Neuling <mikey@neuling.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Oct 16, 2013
-
-
Christoffer Dall authored
Some strange character leaped into the documentation, which makes git-send-email behave quite strangely. Get rid of this before it bites anyone else. Cc: Anup Patel <anup.patel@linaro.org> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
- Oct 02, 2013
-
-
Anup Patel authored
To implement CPU=Host we have added KVM_ARM_PREFERRED_TARGET vm ioctl which provides information to user space required for creating VCPU matching underlying Host. This patch adds info related to this new KVM_ARM_PREFERRED_TARGET vm ioctl in the KVM API documentation. Signed-off-by:
Anup Patel <anup.patel@linaro.org> Signed-off-by:
Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
- Jul 25, 2013
-
-
Masanari Iida authored
Correct typo (double words) in documentations. Signed-off-by:
Masanari Iida <standby24x7@gmail.com> Acked-by:
Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- Jun 19, 2013
-
-
Alexey Kardashevskiy authored
Signed-off-by:
Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Jun 12, 2013
-
-
Marc Zyngier authored
Unsurprisingly, the arm64 userspace API is extremely similar to the 32bit one, the only significant difference being the ONE_REG register mapping. Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com>
-
- Jun 05, 2013
-
-
Stefan Huber authored
Corrected the word appropariate to appropriate. Signed-off-by:
Stefan Huber <steffhip@googlemail.com> Acked-by:
Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- May 28, 2013
-
-
Anatol Pomozov authored
Signed-off-by:
Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- May 02, 2013
-
-
Paul Mackerras authored
This adds the API for userspace to instantiate an XICS device in a VM and connect VCPUs to it. The API consists of a new device type for the KVM_CREATE_DEVICE ioctl, a new capability KVM_CAP_IRQ_XICS, which functions similarly to KVM_CAP_IRQ_MPIC, and the KVM_IRQ_LINE ioctl, which is used to assert and deassert interrupt inputs of the XICS. The XICS device has one attribute group, KVM_DEV_XICS_GRP_SOURCES. Each attribute within this group corresponds to the state of one interrupt source. The attribute number is the same as the interrupt source number. This does not support irq routing or irqfd yet. Signed-off-by:
Paul Mackerras <paulus@samba.org> Acked-by:
David Gibson <david@gibson.dropbear.id.au> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Apr 29, 2013
-
-
Christoffer Dall authored
Unless I'm mistaken, the size field was encoded 4 bits off and a wrong value was used for 64-bit FP registers. Signed-off-by:
Christoffer Dall <cdall@cs.columbia.edu>
-
- Apr 26, 2013
-
-
Paul Mackerras authored
This adds the ability for userspace to save and restore the state of the XICS interrupt presentation controllers (ICPs) via the KVM_GET/SET_ONE_REG interface. Since there is one ICP per vcpu, we simply define a new 64-bit register in the ONE_REG space for the ICP state. The state includes the CPU priority setting, the pending IPI priority, and the priority and source number of any pending external interrupt. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Michael Ellerman authored
For pseries machine emulation, in order to move the interrupt controller code to the kernel, we need to intercept some RTAS calls in the kernel itself. This adds an infrastructure to allow in-kernel handlers to be registered for RTAS services by name. A new ioctl, KVM_PPC_RTAS_DEFINE_TOKEN, then allows userspace to associate token values with those service names. Then, when the guest requests an RTAS service with one of those token values, it will be handled by the relevant in-kernel handler rather than being passed up to userspace as at present. Signed-off-by:
Michael Ellerman <michael@ellerman.id.au> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Paul Mackerras <paulus@samba.org> [agraf: fix warning] Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Scott Wood authored
Enabling this capability connects the vcpu to the designated in-kernel MPIC. Using explicit connections between vcpus and irqchips allows for flexibility, but the main benefit at the moment is that it simplifies the code -- KVM doesn't need vm-global state to remember which MPIC object is associated with this vm, and it doesn't need to care about ordering between irqchip creation and vcpu creation. Signed-off-by:
Scott Wood <scottwood@freescale.com> [agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu] Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Scott Wood authored
Currently, devices that are emulated inside KVM are configured in a hardcoded manner based on an assumption that any given architecture only has one way to do it. If there's any need to access device state, it is done through inflexible one-purpose-only IOCTLs (e.g. KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is cumbersome and depletes a limited numberspace. This API provides a mechanism to instantiate a device of a certain type, returning an ID that can be used to set/get attributes of the device. Attributes may include configuration parameters (e.g. register base address), device state, operational commands, etc. It is similar to the ONE_REG API, except that it acts on devices rather than vcpus. Both device types and individual attributes can be tested without having to create the device or get/set the attribute, without the need for separately managing enumerated capabilities. Signed-off-by:
Scott Wood <scottwood@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Mihai Caraman authored
EPTCFG register defined by E.PT is accessed unconditionally by Linux guests in the presence of MAV 2.0. Emulate it now. Signed-off-by:
Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Mihai Caraman authored
Add support for TLBnPS registers available in MMU Architecture Version (MAV) 2.0. Signed-off-by:
Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Mihai Caraman authored
MMU registers were exposed to user-space using sregs interface. Add them to ONE_REG interface using kvmppc_get_one_reg/kvmppc_set_one_reg delegation mechanism. Signed-off-by:
Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Mar 22, 2013
-
-
Bharat Bhushan authored
If userspace wants to change some specific bits of TSR (timer status register) then it uses GET/SET_SREGS ioctl interface. So the steps will be: i) user-space will make get ioctl, ii) change TSR in userspace iii) then make set ioctl. It can happen that TSR gets changed by kernel after step i) and before step iii). To avoid this we have added below one_reg ioctls for oring and clearing specific bits in TSR. This patch adds one registerface for: 1) setting specific bit in TSR (timer status register) 2) clearing specific bit in TSR (timer status register) 3) setting/getting the TCR register. There are cases where we want to only change TCR and not TSR. Although we can uses SREGS without KVM_SREGS_E_UPDATE_TSR flag but I think one reg is better. I am open if someone feels we should use SREGS only here. 4) getting/setting TSR register Signed-off-by:
Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Mar 05, 2013
-
-
Cornelia Huck authored
Enhance KVM_IOEVENTFD with a new flag that allows to attach to virtio-ccw devices on s390 via the KVM_VIRTIO_CCW_NOTIFY_BUS. Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
- Feb 11, 2013
-
-
Christoffer Dall authored
User space defines the model to emulate to a guest and should therefore decide which addresses are used for both the virtual CPU interface directly mapped in the guest physical address space and for the emulated distributor interface, which is mapped in software by the in-kernel VGIC support. Reviewed-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com>
-
Christoffer Dall authored
On ARM some bits are specific to the model being emulated for the guest and user space needs a way to tell the kernel about those bits. An example is mmio device base addresses, where KVM must know the base address for a given device to properly emulate mmio accesses within a certain address range or directly map a device with virtualiation extensions into the guest address space. We make this API ARM-specific as we haven't yet reached a consensus for a generic API for all KVM architectures that will allow us to do something like this. Reviewed-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com>
-
- Feb 06, 2013
-
-
Geoff Levand authored
Signed-off-by:
Geoff Levand <geoff@infradead.org> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
- Feb 05, 2013
-
-
Takuya Yoshikawa authored
As Xiao pointed out, there are a few problems with it: - kvm_arch_commit_memory_region() write protects the memory slot only for GET_DIRTY_LOG when modifying the flags. - FNAME(sync_page) uses the old spte value to set a new one without checking KVM_MEM_READONLY flag. Since we flush all shadow pages when creating a new slot, the simplest fix is to disallow such problematic flag changes: this is safe because no one is doing such things. Reviewed-by:
Gleb Natapov <gleb@redhat.com> Signed-off-by:
Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Alex Williamson <alex.williamson@redhat.com> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
- Jan 23, 2013
-
-
Marc Zyngier authored
Implement the PSCI specification (ARM DEN 0022A) to control virtual CPUs being "powered" on or off. PSCI/KVM is detected using the KVM_CAP_ARM_PSCI capability. A virtual CPU can now be initialized in a "powered off" state, using the KVM_ARM_VCPU_POWER_OFF feature flag. The guest can use either SMC or HVC to execute a PSCI function. Reviewed-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com>
-
Rusty Russell authored
We use space #18 for floating point regs. Reviewed-by:
Will Deacon <will.deacon@arm.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com>
-
Christoffer Dall authored
The Cache Size Selection Register (CSSELR) selects the current Cache Size ID Register (CCSIDR). You write which cache you are interested in to CSSELR, and read the information out of CCSIDR. Which cache numbers are valid is known by reading the Cache Level ID Register (CLIDR). To export this state to userspace, we add a KVM_REG_ARM_DEMUX numberspace (17), which uses 8 bits to represent which register is being demultiplexed (0 for CCSIDR), and the lower 8 bits to represent this demultiplexing (in our case, the CSSELR value, which is 4 bits). Reviewed-by:
Will Deacon <will.deacon@arm.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com>
-
Christoffer Dall authored
The following three ioctls are implemented: - KVM_GET_REG_LIST - KVM_GET_ONE_REG - KVM_SET_ONE_REG Now we have a table for all the cp15 registers, we can drive a generic API. The register IDs carry the following encoding: ARM registers are mapped using the lower 32 bits. The upper 16 of that is the register group type, or coprocessor number: ARM 32-bit CP15 registers have the following id bit patterns: 0x4002 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3> ARM 64-bit CP15 registers have the following id bit patterns: 0x4003 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3> For futureproofing, we need to tell QEMU about the CP15 registers the host lets the guest access. It will need this information to restore a current guest on a future CPU or perhaps a future KVM which allow some of these to be changed. We use a separate table for these, as they're only for the userspace API. Reviewed-by:
Will Deacon <will.deacon@arm.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com>
-
Christoffer Dall authored
All interrupt injection is now based on the VM ioctl KVM_IRQ_LINE. This works semantically well for the GIC as we in fact raise/lower a line on a machine component (the gic). The IOCTL uses the follwing struct. struct kvm_irq_level { union { __u32 irq; /* GSI */ __s32 status; /* not used for KVM_IRQ_LEVEL */ }; __u32 level; /* 0 or 1 */ }; ARM can signal an interrupt either at the CPU level, or at the in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to use PPIs designated for specific cpus. The irq field is interpreted like this: bits: | 31 ... 24 | 23 ... 16 | 15 ... 0 | field: | irq_type | vcpu_index | irq_number | The irq_type field has the following values: - irq_type[0]: out-of-kernel GIC: irq_number 0 is IRQ, irq_number 1 is FIQ - irq_type[1]: in-kernel GIC: SPI, irq_number between 32 and 1019 (incl.) (the vcpu_index field is ignored) - irq_type[2]: in-kernel GIC: PPI, irq_number between 16 and 31 (incl.) The irq_number thus corresponds to the irq ID in as in the GICv2 specs. This is documented in Documentation/kvm/api.txt. Reviewed-by:
Will Deacon <will.deacon@arm.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com>
-
Christoffer Dall authored
Targets KVM support for Cortex A-15 processors. Contains all the framework components, make files, header files, some tracing functionality, and basic user space API. Only supported core is Cortex-A15 for now. Most functionality is in arch/arm/kvm/* or arch/arm/include/asm/kvm_*.h. Reviewed-by:
Will Deacon <will.deacon@arm.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <c.dall@virtualopensystems.com>
-
- Jan 10, 2013
-
-
Alexander Graf authored
We need to be able to read and write the contents of the EPR register from user space. This patch implements that logic through the ONE_REG API and declares its (never implemented) SREGS counterpart as deprecated. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Alexander Graf authored
The External Proxy Facility in FSL BookE chips allows the interrupt controller to automatically acknowledge an interrupt as soon as a core gets its pending external interrupt delivered. Today, user space implements the interrupt controller, so we need to check on it during such a cycle. This patch implements logic for user space to enable EPR exiting, disable EPR exiting and EPR exiting itself, so that user space can acknowledge an interrupt when an external interrupt has successfully been delivered into the guest vcpu. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Mihai Caraman authored
Reflect the uapi folder change in SREGS API documentation. Signed-off-by:
Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by:
Amos Kong <kongjianjun@gmail.com> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Jan 07, 2013
-
-
Cornelia Huck authored
Add a new capability, KVM_CAP_S390_CSS_SUPPORT, which will pass intercepts for channel I/O instructions to userspace. Only I/O instructions interacting with I/O interrupts need to be handled in-kernel: - TEST PENDING INTERRUPTION (tpi) dequeues and stores pending interrupts entirely in-kernel. - TEST SUBCHANNEL (tsch) dequeues pending interrupts in-kernel and exits via KVM_EXIT_S390_TSCH to userspace for subchannel- related processing. Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-
Cornelia Huck authored
Make s390 support KVM_ENABLE_CAP. Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Acked-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com>
-