- Jun 30, 2013
-
-
Alexander Graf authored
While technically it's legal to write to PIR and have the identifier changed, we don't implement logic to do so because we simply expose vcpu_id to the guest. So instead, let's ignore writes to PIR. This ensures that we don't inject faults into the guest for something the guest is allowed to do. While at it, we cross our fingers hoping that it also doesn't mind that we broke its PIR read values. Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
At present, if the guest creates a valid SLB (segment lookaside buffer) entry with the slbmte instruction, then invalidates it with the slbie instruction, then reads the entry with the slbmfee/slbmfev instructions, the result of the slbmfee will have the valid bit set, even though the entry is not actually considered valid by the host. This is confusing, if not worse. This fixes it by zeroing out the orige and origv fields of the SLB entry structure when the entry is invalidated. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
With this, the guest can use 1TB segments as well as 256MB segments. Since we now have the situation where a single emulated guest segment could correspond to multiple shadow segments (as the shadow segments are still 256MB segments), this adds a new kvmppc_mmu_flush_segment() to scan for all shadow segments that need to be removed. This restructures the guest HPT (hashed page table) lookup code to use the correct hashing and matching functions for HPTEs within a 1TB segment. We use the standard hpt_hash() function instead of open-coding the hash calculation, and we use HPTE_V_COMPARE() with an AVPN value that has the B (segment size) field included. The calculation of avpn is done a little earlier since it doesn't change in the loop starting at the do_second label. The computation in kvmppc_mmu_book3s_64_esid_to_vsid() changes so that it returns a 256MB VSID even if the guest SLB entry is a 1TB entry. This is because the users of this function are creating 256MB SLB entries. We set a new VSID_1T flag so that entries created from 1T segments don't collide with entries from 256MB segments. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
The loop in kvmppc_mmu_book3s_64_xlate() that looks up a translation in the guest hashed page table (HPT) keeps going if it finds an HPTE that matches but doesn't allow access. This is incorrect; it is different from what the hardware does, and there should never be more than one matching HPTE anyway. This fixes it to stop when any matching HPTE is found. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
On entering a PR KVM guest, we invalidate the whole SLB before loading up the guest entries. We do this using an slbia instruction, which invalidates all entries except entry 0, followed by an slbie to invalidate entry 0. However, the slbie turns out to be ineffective in some circumstances (specifically when the host linear mapping uses 64k pages) because of errors in computing the parameter to the slbie. The result is that the guest kernel hangs very early in boot because it takes a DSI the first time it tries to access kernel data using a linear mapping address in real mode. Currently we construct bits 36 - 43 (big-endian numbering) of the slbie parameter by taking bits 56 - 63 of the SLB VSID doubleword. These bits for the tlbie are C (class, 1 bit), B (segment size, 2 bits) and 5 reserved bits. For the SLB VSID doubleword these are C (class, 1 bit), reserved (1 bit), LP (large page size, 2 bits), and 4 reserved bits. Thus we are not setting the B field correctly, and when LP = 01 as it is for 64k pages, we are setting a reserved bit. Rather than add more instructions to calculate the slbie parameter correctly, this takes a simpler approach, which is to set entry 0 to zeroes explicitly. Normally slbmte should not be used to invalidate an entry, since it doesn't invalidate the ERATs, but it is OK to use it to invalidate an entry if it is immediately followed by slbia, which does invalidate the ERATs. (This has been confirmed with the Power architects.) This approach takes fewer instructions and will work whatever the contents of entry 0. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Paul Mackerras authored
This makes sure the calculation of the proto-VSIDs used by PR KVM is done with 64-bit arithmetic. Since vcpu3s->context_id[] is int, when we do vcpu3s->context_id[0] << ESID_BITS the shift will be done with 32-bit instructions, possibly leading to significant bits getting lost, as the context id can be up to 524283 and ESID_BITS is 18. To fix this we cast the context id to u64 before shifting. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Alexander Graf <agraf@suse.de>
-
Tiejun Chen authored
Availablity of the doorbell_exception function is guarded by CONFIG_PPC_DOORBELL. Use the same define to guard our caller of it. Signed-off-by:
Tiejun Chen <tiejun.chen@windriver.com> [agraf: improve patch description] Signed-off-by:
Alexander Graf <agraf@suse.de>
-
- Jun 27, 2013
-
-
git://git.linaro.org/people/cdall/linux-kvm-arm.gitGleb Natapov authored
KVM/ARM pull request for 3.11 merge window * tag 'kvm-arm-3.11' of git://git.linaro.org/people/cdall/linux-kvm-arm.git: ARM: kvm: don't include drivers/virtio/Kconfig Update MAINTAINERS: KVM/ARM work now funded by Linaro arm/kvm: Cleanup KVM_ARM_MAX_VCPUS logic ARM: KVM: clear exclusive monitor on all exception returns ARM: KVM: add missing dsb before invalidating Stage-2 TLBs ARM: KVM: perform save/restore of PAR ARM: KVM: get rid of S2_PGD_SIZE ARM: KVM: don't special case PC when doing an MMIO ARM: KVM: use phys_addr_t instead of unsigned long long for HYP PGDs ARM: KVM: remove dead prototype for __kvm_tlb_flush_vmid ARM: KVM: Don't handle PSCI calls via SMC ARM: KVM: Allow host virt timer irq to be different from guest timer virt irq
-
Gleb Natapov authored
This reverts most of the f1ed0450. After the commit kvm_apic_set_irq() no longer returns accurate information about interrupt injection status if injection is done into disabled APIC. RTC interrupt coalescing tracking relies on the information to be accurate and cannot recover if it is not. Signed-off-by:
Gleb Natapov <gleb@redhat.com>
-
Yoshihiro YUNOMAE authored
Add a tracepoint write_tsc_offset for tracing TSC offset change. We want to merge ftrace's trace data of guest OSs and the host OS using TSC for timestamp in chronological order. We need "TSC offset" values for each guest when merge those because the TSC value on a guest is always the host TSC plus guest's TSC offset. If we get the TSC offset values, we can calculate the host TSC value for each guest events from the TSC offset and the event TSC value. The host TSC values of the guest events are used when we want to merge trace data of guests and the host in chronological order. (Note: the trace_clock of both the host and the guest must be set x86-tsc in this case) This tracepoint also records vcpu_id which can be used to merge trace data for SMP guests. A merge tool will read TSC offset for each vcpu, then the tool converts guest TSC values to host TSC values for each vcpu. TSC offset is stored in the VMCS by vmx_write_tsc_offset() or vmx_adjust_tsc_offset(). KVM executes the former function when a guest boots. The latter function is executed when kvm clock is updated. Only host can read TSC offset value from VMCS, so a host needs to output TSC offset value when TSC offset is changed. Since the TSC offset is not often changed, it could be overwritten by other frequent events while tracing. To avoid that, I recommend to use a special instance for getting this event: 1. set a instance before booting a guest # cd /sys/kernel/debug/tracing/instances # mkdir tsc_offset # cd tsc_offset # echo x86-tsc > trace_clock # echo 1 > events/kvm/kvm_write_tsc_offset/enable 2. boot a guest Signed-off-by:
Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Gleb Natapov <gleb@redhat.com>
-
Takuya Yoshikawa authored
Without this information, users will just see unexpected performance problems and there is little chance we will get good reports from them: note that mmio generation is increased even when we just start, or stop, dirty logging for some memory slot, in which case users cannot expect all shadow pages to be zapped. printk_ratelimited() is used for this taking into account the problems that we can see the information many times when we start multiple VMs and guests can trigger this by reading ROM in a loop for example. Signed-off-by:
Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Document it to Documentation/virtual/kvm/mmu.txt Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Document it to Documentation/virtual/kvm/mmu.txt Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Document fast page fault to Documentation/virtual/kvm/mmu.txt Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Document it to Documentation/virtual/kvm/mmu.txt Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Document write_flooding_count to Documentation/virtual/kvm/mmu.txt Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Document it to Documentation/virtual/kvm/mmu.txt Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Drop kvm_mmu_zap_mmio_sptes and use kvm_mmu_invalidate_zap_all_pages instead to handle mmio generation number overflow Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
Gleb Natapov <gleb@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Then it has the chance to trigger mmio generation number wrap-around Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
Gleb Natapov <gleb@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> [Change from MMIO_MAX_GEN - 13 to MMIO_MAX_GEN - 150, because 13 is very close to the number of calls to KVM_SET_USER_MEMORY_REGION before the guest is started and there is any chance to create any spte. - Paolo] Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
It is useful for debug mmio spte invalidation Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
Gleb Natapov <gleb@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
This patch tries to introduce a very simple and scale way to invalidate all mmio sptes - it need not walk any shadow pages and hold mmu-lock KVM maintains a global mmio valid generation-number which is stored in kvm->memslots.generation and every mmio spte stores the current global generation-number into his available bits when it is created When KVM need zap all mmio sptes, it just simply increase the global generation-number. When guests do mmio access, KVM intercepts a MMIO #PF then it walks the shadow page table and get the mmio spte. If the generation-number on the spte does not equal the global generation-number, it will go to the normal #PF handler to update the mmio spte Since 19 bits are used to store generation-number on mmio spte, we zap all mmio sptes when the number is round Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
Gleb Natapov <gleb@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Define some meaningful names instead of raw code Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
Gleb Natapov <gleb@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Xiao Guangrong authored
Store the generation-number into bit3 ~ bit11 and bit52 ~ bit61, totally 19 bits can be used, it should be enough for nearly all most common cases In this patch, the generation-number is always 0, it will be changed in the later patch [Gleb: masking generation bits from spte in get_mmio_spte_gfn() and get_mmio_spte_access()] Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
Gleb Natapov <gleb@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Jun 26, 2013
-
-
Arnd Bergmann authored
The virtio configuration has recently moved and is now visible everywhere. Including the file again from KVM as we used to need earlier now causes dependency problems: warning: (CAIF_VIRTIO && VIRTIO_PCI && VIRTIO_MMIO && REMOTEPROC && RPMSG) selects VIRTIO which has unmet direct dependencies (VIRTUALIZATION) Cc: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Christoffer Dall authored
Change my e-mail address to the Linaro one and change status from Maintained to Supported. Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Geoff Levand authored
Commit d21a1c83 (ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally) changed the Kconfig logic for KVM_ARM_MAX_VCPUS to work around a build error arising from the use of KVM_ARM_MAX_VCPUS when CONFIG_KVM=n. The resulting Kconfig logic is a bit awkward and leaves a KVM_ARM_MAX_VCPUS always defined in the kernel config file. This change reverts the Kconfig logic back and adds a simple preprocessor conditional in kvm_host.h to handle when CONFIG_KVM_ARM_MAX_VCPUS is undefined. Signed-off-by:
Geoff Levand <geoff@infradead.org> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Make sure we clear the exclusive monitor on all exception returns, which otherwise could lead to lock corruptions. Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
When performing a Stage-2 TLB invalidation, it is necessary to make sure the write to the page tables is observable by all CPUs. For this purpose, add a dsb instruction to __kvm_tlb_flush_vmid_ipa before doing the TLB invalidation itself. Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
Not saving PAR is an unfortunate oversight. If the guest performs an AT* operation and gets scheduled out before reading the result of the translation from PAR, it could become corrupted by another guest or the host. Saving this register is made slightly more complicated as KVM also uses it on the permission fault handling path, leading to an ugly "stash and restore" sequence. Fortunately, this is already a slow path so we don't really care. Also, Linux doesn't do any AT* operation, so Linux guests are not impacted by this bug. [ Slightly tweaked to use an even register as first operand to ldrd and strd operations in interrupts_head.S - Christoffer ] Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <christoffer.dall@linaro.org>
-
Marc Zyngier authored
S2_PGD_SIZE defines the number of pages used by a stage-2 PGD and is unused, except for a VM_BUG_ON check that missuses the define. As the check is very unlikely to ever triggered except in circumstances where KVM is the least of our worries, just kill both the define and the VM_BUG_ON check. Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <cdall@cs.columbia.edu>
-
Marc Zyngier authored
Admitedly, reading a MMIO register to load PC is very weird. Writing PC to a MMIO register is probably even worse. But the architecture doesn't forbid any of these, and injecting a Prefetch Abort is the wrong thing to do anyway. Remove this check altogether, and let the adventurous guest wander into LaLaLand if they feel compelled to do so. Reported-by:
Catalin Marinas <catalin.marinas@arm.com> Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <cdall@cs.columbia.edu>
-
Marc Zyngier authored
HYP PGDs are passed around as phys_addr_t, except just before calling into the hypervisor init code, where they are cast to a rather weird unsigned long long. Just keep them around as phys_addr_t, which is what makes the most sense. Reported-by:
Catalin Marinas <catalin.marinas@arm.com> Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <cdall@cs.columbia.edu>
-
Marc Zyngier authored
__kvm_tlb_flush_vmid has been renamed to __kvm_tlb_flush_vmid_ipa, and the old prototype should have been removed when the code was modified. Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <cdall@cs.columbia.edu>
-
Dave P Martin authored
Currently, kvmtool unconditionally declares that HVC should be used to call PSCI, so the function numbers in the DT tell the guest nothing about the function ID namespace or calling convention for SMC. We already assume that the guest will examine and honour the DT, since there is no way it could possibly guess the KVM-specific PSCI function IDs otherwise. So let's not encourage guests to violate what's specified in the DT by using SMC to make the call. [ Modified to apply to top of kvm/arm tree - Christoffer ] Signed-off-by:
Dave P Martin <Dave.Martin@arm.com> Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Christoffer Dall <cdall@cs.columbia.edu>
-
Anup Patel authored
The arch_timer irq numbers (or PPI numbers) are implementation dependent, so the host virtual timer irq number can be different from guest virtual timer irq number. This patch ensures that host virtual timer irq number is read from DTB and guest virtual timer irq is determined based on vcpu target type. Signed-off-by:
Anup Patel <anup.patel@linaro.org> Signed-off-by:
Pranavkumar Sawargaonkar <pranavkumar@linaro.org> Signed-off-by:
Christoffer Dall <cdall@cs.columbia.edu>
-
- Jun 20, 2013
-
-
Xiao Guangrong authored
Let mmio spte only use bit62 and bit63 on upper 32 bits, then bit 52 ~ bit 61 can be used for other purposes Signed-off-by:
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by:
Gleb Natapov <gleb@redhat.com> Reviewed-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
Added some missing validity checks for the operands and fixed the priority of exceptions for some function codes according to the "Principles of Operation" document. Signed-off-by:
Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
LCTL and LCTLG are also privileged instructions, thus there is no need for treating them separately from the other instructions in priv.c. So this patch moves these two instructions to priv.c, adds a check for supervisor state and simplifies the "handle_eb" instruction decoding by merging the two eb_handlers jump tables from intercept.c and priv.c into one table only. Signed-off-by:
Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
When a guest calls the TPI instruction, the second operand address could point to an invalid location. In this case the problem should be signaled to the guest by throwing an access exception. Signed-off-by:
Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
DIAGNOSE is a privileged instruction and thus we must make sure that we are in supervisor mode before taking any other actions. Signed-off-by:
Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-