  1. May 24, 2019
    • KVM: x86/pmu: mask the result of rdpmc according to the width of the counters · 0e6f467e
      Paolo Bonzini authored
      
      
      This patch simplifies the changes in the next one by enforcing the
      masking of the counters on both RDPMC and RDMSR.
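
A minimal sketch of what masking a counter read to the counter width can look like. The helper names `toy_counter_bitmask` and `toy_read_counter_masked` are hypothetical and are not taken from the patch; the sketch only assumes that a counter has a width in bits and that bits above that width must not be returned.

```c
#include <stdint.h>

/* Hypothetical helper: mask covering the low `width` bits of a counter. */
static uint64_t toy_counter_bitmask(unsigned int width)
{
    /* Shifting a 64-bit value by 64 is undefined in C, so handle it first. */
    if (width >= 64)
        return UINT64_MAX;
    return (UINT64_C(1) << width) - 1;
}

/* Hypothetical helper: value a guest would see for a counter of this width. */
static uint64_t toy_read_counter_masked(uint64_t raw, unsigned int width)
{
    return raw & toy_counter_bitmask(width);
}
```

With a 48-bit counter, for example, a raw value with all 64 bits set is reduced to its low 48 bits.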
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm/pmu: Set AMD's virt PMU version to 1 · a80c4ec1
      Borislav Petkov authored
      
      
      After commit:
      
        672ff6cf ("KVM: x86: Raise #GP when guest vCPU do not support PMU")
      
      my AMD guests started #GPing like this:
      
        general protection fault: 0000 [#1] PREEMPT SMP
        CPU: 1 PID: 4355 Comm: bash Not tainted 5.1.0-rc6+ #3
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
        RIP: 0010:x86_perf_event_update+0x3b/0xa0
      
      with Code: pointing to RDPMC. It is RDPMC because the guest has the
      hardware watchdog CONFIG_HARDLOCKUP_DETECTOR_PERF enabled, which uses
      perf. Instrumenting kvm_pmu_rdpmc() showed that it fails due to:
      
        if (!pmu->version)
        	return 1;
      
      which the above commit added. Since AMD's PMU leaves the version at 0,
      that causes the #GP injection into the guest.
      
      Set pmu->version arbitrarily to 1 and move it above the non-applicable
      struct kvm_pmu members.
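
The failure mode can be modeled outside the kernel with a toy structure. This is only a sketch: `struct toy_pmu` and `toy_rdpmc` are hypothetical stand-ins, and the only behavior taken from the message above is that a zero version makes the read return 1, which leads to #GP injection.

```c
#include <stdint.h>

/* Toy stand-in for the PMU state; only the fields needed for the sketch. */
struct toy_pmu {
    unsigned int version;   /* 0 is treated as "no PMU exposed to the guest" */
    uint64_t counter;
};

/* Returns 0 on success, 1 when the caller would inject #GP into the guest. */
static int toy_rdpmc(const struct toy_pmu *pmu, uint64_t *data)
{
    if (!pmu->version)
        return 1;
    *data = pmu->counter;
    return 0;
}
```

A PMU whose version was left at 0 fails every read here, while the same PMU with the version set to 1 returns the counter value.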
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: kvm@vger.kernel.org
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: Mihai Carabas <mihai.carabas@oracle.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: x86@kernel.org
      Cc: stable@vger.kernel.org
      Fixes: 672ff6cf ("KVM: x86: Raise #GP when guest vCPU do not support PMU")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: do not spam dmesg with VMCS/VMCB dumps · 6f2f8453
      Paolo Bonzini authored
      
      
      Userspace can easily set up invalid processor state in such a way that
      dmesg will be filled with VMCS or VMCB dumps.  Disable this by default
      using a module parameter.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: Check irqchip mode before assign irqfd · 654f1f13
      Peter Xu authored
      
      
      KVM_IRQFD currently succeeds in every irqchip mode because the mode is
      not checked when an irqfd is assigned.  However, it does not make much
      sense to create an irqfd without the in-kernel chips.  Let's provide
      an arch-dependent helper to check whether a specific irqfd is allowed
      by the arch.  At least for x86, it should make sense to check:
      
      - when irqchip mode is NONE, all irqfds should be disallowed, and,
      
      - when irqchip mode is SPLIT, irqfds that are with resamplefd should
        be disallowed.
      
      In either case, we previously ignored the irq or the irq ack event
      silently if the irqchip mode was incorrect.  However, that can cause
      mysterious guest behavior that is hard to triage.  Let's fail
      KVM_IRQFD earlier to detect these incorrect configurations.
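
The two x86 rules listed above can be sketched as a small predicate. The enum and the function name `toy_irqfd_allowed` are hypothetical and chosen for illustration; only the two conditions come from the message.

```c
#include <stdbool.h>

/* Hypothetical model of the three irqchip modes mentioned in the text. */
enum toy_irqchip_mode {
    TOY_IRQCHIP_NONE,
    TOY_IRQCHIP_SPLIT,
    TOY_IRQCHIP_KERNEL
};

/* Returns true when creating the irqfd makes sense for this mode. */
static bool toy_irqfd_allowed(enum toy_irqchip_mode mode, bool has_resamplefd)
{
    /* No in-kernel irqchip at all: no irqfd can work. */
    if (mode == TOY_IRQCHIP_NONE)
        return false;
    /* Split irqchip: irqfds that use a resamplefd are disallowed. */
    if (mode == TOY_IRQCHIP_SPLIT && has_resamplefd)
        return false;
    return true;
}
```

Failing the request up front, instead of accepting it and dropping events later, is what makes the misconfiguration visible to userspace.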
      
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Radim Krčmář <rkrcmar@redhat.com>
      CC: Alex Williamson <alex.williamson@redhat.com>
      CC: Eduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: svm/avic: fix off-by-one in checking host APIC ID · c9bcd3e3
      Suthikulpanit, Suravee authored
      
      
      The current logic does not allow a VCPU to be loaded onto a CPU with
      APIC ID 255. This should be allowed, since the host physical APIC ID
      field in the AVIC Physical APIC table entry is an 8-bit value,
      and APIC ID 255 is valid in a system with x2APIC enabled.
      Instead, disallow the VCPU load only if the host APIC ID cannot be
      represented by an 8-bit value.
      
      Also, use the more appropriate AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK
      instead of AVIC_MAX_PHYSICAL_ID_COUNT.
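
A sketch of the off-by-one, under stated assumptions: the constants and function names below are hypothetical, the mask value 0xFF follows from the 8-bit field described above, and the "old" check is only modeled on the description (it rejects ID 255), not copied from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_PHYSICAL_ID_COUNT 255u   /* assumed limit used by the old check */
#define TOY_HOST_PHYSICAL_ID_MASK 0xFFu  /* 8-bit host physical APIC ID field */

/* Modeled old behavior: APIC ID 255 is rejected although it fits in 8 bits. */
static bool toy_old_host_apic_id_ok(uint32_t id)
{
    return id < TOY_MAX_PHYSICAL_ID_COUNT;
}

/* Modeled new behavior: accept every ID representable in the 8-bit field. */
static bool toy_new_host_apic_id_ok(uint32_t id)
{
    return id <= TOY_HOST_PHYSICAL_ID_MASK;
}
```

The two predicates differ only for ID 255, which is exactly the case the patch title calls an off-by-one.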
      
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: LAPIC: Expose per-vCPU timer_advance_ns to userspace · 16ba3ab4
      Wanpeng Li authored
      
      
      Expose per-vCPU timer_advance_ns to userspace, so it is able to
      query the auto-adjusted value.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: LAPIC: Fix lapic_timer_advance_ns parameter overflow · 0e6edceb
      Wanpeng Li authored
      
      
      After commit c3941d9e (KVM: lapic: Allow user to disable adaptive tuning of
      timer advancement), '-1' enables adaptive tuning starting from the default
      advancement of 1000ns. However, the module parameter should be exposed as
      an int rather than as a uint, in which -1 wraps around.
      
      Before patch:
      
      /sys/module/kvm/parameters/lapic_timer_advance_ns:4294967295
      
      After patch:
      
      /sys/module/kvm/parameters/lapic_timer_advance_ns:-1
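
The before and after values are the same 32-bit pattern printed with different signedness. The following sketch reproduces that in userspace; the function names are hypothetical, and fixed-width 32-bit types are used so the result does not depend on the platform's `int` size.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats the value the way an unsigned 32-bit parameter would show it. */
static void toy_show_as_uint(int32_t value, char *buf, size_t len)
{
    snprintf(buf, len, "%" PRIu32, (uint32_t)value);
}

/* Formats the value the way a signed 32-bit parameter would show it. */
static void toy_show_as_int(int32_t value, char *buf, size_t len)
{
    snprintf(buf, len, "%" PRId32, value);
}
```

Passing -1 to the first function yields 4294967295 and to the second yields -1, matching the two sysfs readings above.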
      
      Fixes: c3941d9e (KVM: lapic: Allow user to disable adaptive tuning of timer advancement)
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: vmx: Fix -Wmissing-prototypes warnings · 4d259965
      Yi Wang authored
      
      
      We get a warning when building the kernel with W=1:
      arch/x86/kvm/vmx/vmx.c:6365:6: warning: no previous prototype for ‘vmx_update_host_rsp’ [-Wmissing-prototypes]
       void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
      
      Add the missing declaration to fix this.
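
As a generic illustration of this class of warning: a non-static function defined without a previously visible prototype triggers -Wmissing-prototypes, and declaring it first (normally in a shared header) silences it. The names `toy_vcpu` and `toy_update_host_rsp` are hypothetical and only echo the shape of the function named in the warning.

```c
/* Toy stand-in for the per-vCPU structure. */
struct toy_vcpu {
    unsigned long host_rsp;
};

/* The declaration that would normally live in a header included by the .c file. */
void toy_update_host_rsp(struct toy_vcpu *vcpu, unsigned long host_rsp);

/* With the prototype above in scope, this definition no longer warns. */
void toy_update_host_rsp(struct toy_vcpu *vcpu, unsigned long host_rsp)
{
    vcpu->host_rsp = host_rsp;
}
```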
      
      Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: Fix using __this_cpu_read() in preemptible context · 541e886f
      Wanpeng Li authored
      
      
       BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/4590
        caller is nested_vmx_enter_non_root_mode+0xebd/0x1790 [kvm_intel]
        CPU: 4 PID: 4590 Comm: qemu-system-x86 Tainted: G           OE     5.1.0-rc4+ #1
        Call Trace:
         dump_stack+0x67/0x95
         __this_cpu_preempt_check+0xd2/0xe0
         nested_vmx_enter_non_root_mode+0xebd/0x1790 [kvm_intel]
         nested_vmx_run+0xda/0x2b0 [kvm_intel]
         handle_vmlaunch+0x13/0x20 [kvm_intel]
         vmx_handle_exit+0xbd/0x660 [kvm_intel]
         kvm_arch_vcpu_ioctl_run+0xa2c/0x1e50 [kvm]
         kvm_vcpu_ioctl+0x3ad/0x6d0 [kvm]
         do_vfs_ioctl+0xa5/0x6e0
         ksys_ioctl+0x6d/0x80
         __x64_sys_ioctl+0x1a/0x20
         do_syscall_64+0x6f/0x6c0
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Accessing a per-cpu variable requires preemption to be disabled. This patch
      extends the preemption-disabled region to cover __this_cpu_read().
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Fixes: 52017608 ("KVM: nVMX: add option to perform early consistency checks via H/W")
      Cc: stable@vger.kernel.org
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: Include CPUID leaf 0x8000001e in kvm's supported CPUID · 382409b4
      Jim Mattson authored
      
      
      KVM now supports extended CPUID functions through 0x8000001f.  CPUID
      leaf 0x8000001e is AMD's Processor Topology Information leaf. This
      contains similar information to CPUID leaf 0xb (Intel's Extended
      Topology Enumeration leaf), and should be included in the output of
      KVM_GET_SUPPORTED_CPUID, even though userspace is likely to override
      some of this information based upon the configuration of the
      particular VM.
      
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Borislav Petkov <bp@suse.de>
      Fixes: 8765d753 ("KVM: X86: Extend CPUID range to include new leaf")
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Marc Orr <marcorr@google.com>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: Include multiple indices with CPUID leaf 0x8000001d · 32a243df
      Jim Mattson authored
      
      
      Per the APM, "CPUID Fn8000_001D_E[D,C,B,A]X reports cache topology
      information for the cache enumerated by the value passed to the
      instruction in ECX, referred to as Cache n in the following
      description. To gather information for all cache levels, software must
      repeatedly execute CPUID with 8000_001Dh in EAX and ECX set to
      increasing values beginning with 0 until a value of 00h is returned in
      the field CacheType (EAX[4:0]) indicating no more cache descriptions
      are available for this processor."
      
      The termination condition is the same as leaf 4, so we can reuse that
      code block for leaf 0x8000001d.
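
The enumeration loop described in the APM quote can be sketched as follows. Everything here is illustrative: the register struct, the callback type, and the mock CPUID function are hypothetical, and the mock's cache layout is invented test data rather than the output of any real processor. Only the termination rule, CacheType (EAX[4:0]) equal to 0, comes from the text.

```c
#include <stdint.h>

struct toy_cpuid_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Callback standing in for executing CPUID with EAX=leaf, ECX=index. */
typedef struct toy_cpuid_regs (*toy_cpuid_fn)(uint32_t leaf, uint32_t index);

/* Counts subleaves until CacheType (EAX[4:0]) reads 0 or max_entries is hit. */
static unsigned int toy_count_cache_subleaves(toy_cpuid_fn cpuid, uint32_t leaf,
                                              unsigned int max_entries)
{
    unsigned int n = 0;

    for (uint32_t index = 0; n < max_entries; index++) {
        struct toy_cpuid_regs r = cpuid(leaf, index);

        if ((r.eax & 0x1f) == 0)
            break;      /* no more cache descriptions */
        n++;
    }
    return n;
}

/* Mock with invented data: four caches, then a terminating entry. */
static struct toy_cpuid_regs toy_mock_cpuid(uint32_t leaf, uint32_t index)
{
    static const uint32_t cache_type[] = { 1, 2, 3, 3, 0 };
    struct toy_cpuid_regs r = { 0, 0, 0, 0 };

    (void)leaf;
    if (index < sizeof(cache_type) / sizeof(cache_type[0]))
        r.eax = cache_type[index];
    return r;
}
```

Because the loop depends only on the CacheType field, the same routine would serve leaf 4 and leaf 0x8000001d, which is the reuse the message describes. The `max_entries` bound is an addition of this sketch to keep the loop finite when the output buffer is limited.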
      
      Fixes: 8765d753 ("KVM: X86: Extend CPUID range to include new leaf")
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Borislav Petkov <bp@suse.de>
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Marc Orr <marcorr@google.com>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: Clear nested_run_pending if setting nested state fails · 21be4ca1
      Sean Christopherson authored
      
      
      VMX's nested_run_pending flag is subtly consumed when stuffing state to
      enter guest mode, i.e. needs to be set accordingly before KVM knows if
      setting guest state is successful.  If setting guest state fails, clear
      the flag as a nested run is obviously not pending.
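
The set-early, clear-on-failure pattern can be sketched like this. The struct, the callback, and the function names are hypothetical; the only behavior drawn from the message is that the flag must be set before the state is stuffed and cleared again if that step fails.

```c
#include <stdbool.h>

/* Toy stand-in for the nested VMX state. */
struct toy_nested {
    bool nested_run_pending;
};

/* Sets the flag first, since entering guest mode consumes it, then undoes it on error. */
static int toy_set_nested_state(struct toy_nested *n, bool run_pending,
                                int (*enter_guest)(struct toy_nested *))
{
    int ret;

    n->nested_run_pending = run_pending;
    ret = enter_guest(n);
    if (ret)
        n->nested_run_pending = false;  /* a failed entry leaves nothing pending */
    return ret;
}

/* Hypothetical callbacks used to exercise both outcomes. */
static int toy_enter_ok(struct toy_nested *n)   { (void)n; return 0; }
static int toy_enter_fail(struct toy_nested *n) { (void)n; return -22; }
```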
      
      Reported-by: Aaron Lewis <aaronlewis@google.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: really fix the size checks on KVM_SET_NESTED_STATE · db80927e
      Paolo Bonzini authored
      
      
      The offset for reading the shadow VMCS is sizeof(*kvm_state)+VMCS12_SIZE,
      so the correct size must be that plus sizeof(*vmcs12).  This could lead
      to KVM reading garbage data from userspace and not reporting an error,
      but is otherwise not sensitive.
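
The size arithmetic can be written out as a small check. This is a sketch with hypothetical names: the three size parameters stand for sizeof(*kvm_state), VMCS12_SIZE, and sizeof(*vmcs12) from the message, and their concrete values are left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when a user buffer of user_size bytes is large enough to hold
 * the shadow VMCS, which starts at header_size + vmcs12_region_size and is
 * vmcs12_struct_size bytes long.
 */
static bool toy_shadow_vmcs_fits(size_t user_size, size_t header_size,
                                 size_t vmcs12_region_size,
                                 size_t vmcs12_struct_size)
{
    size_t offset = header_size + vmcs12_region_size;

    /* Compare by subtraction so the check cannot overflow. */
    return user_size >= offset && user_size - offset >= vmcs12_struct_size;
}
```

A buffer that only reaches the offset, or falls short of offset plus struct size by even one byte, is rejected.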
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: arm/arm64: Move cc/it checks under hyp's Makefile to avoid instrumentation · 623e1528
      James Morse authored
      
      
      KVM has helpers to handle the condition codes of trapped aarch32
      instructions. These are marked __hyp_text and used from HYP, but they
      aren't built by the 'hyp' Makefile, which has all the runes to avoid ASAN
      and KCOV instrumentation.
      
      Move this code to a new hyp/aarch32.c to avoid a hyp-panic when starting
      an aarch32 guest on a host built with the ASAN/KCOV debug options.
      
      Fixes: 021234ef ("KVM: arm64: Make kvm_condition_valid32() accessible from EL2")
      Fixes: 8cebe750 ("arm64: KVM: Make kvm_skip_instr32 available to HYP")
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
    • KVM: arm64: Move pmu hyp code under hyp's Makefile to avoid instrumentation · b7c50fab
      James Morse authored
      
      
      KVM's pmu.c contains the __hyp_text needed to switch the pmu registers
      between host and guest. Because this isn't covered by the 'hyp' Makefile,
      it can be built with kasan and friends when these are enabled in Kconfig.
      
      When starting a guest, this results in:
      | Kernel panic - not syncing: HYP panic:
      | PS:a00003c9 PC:000083000028ada0 ESR:86000007
      | FAR:000083000028ada0 HPFAR:0000000029df5300 PAR:0000000000000000
      | VCPU:000000004e10b7d6
      | CPU: 0 PID: 3088 Comm: qemu-system-aar Not tainted 5.2.0-rc1 #11026
      | Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Plat
      | Call trace:
      |  dump_backtrace+0x0/0x200
      |  show_stack+0x20/0x30
      |  dump_stack+0xec/0x158
      |  panic+0x1ec/0x420
      |  panic+0x0/0x420
      | SMP: stopping secondary CPUs
      | Kernel Offset: disabled
      | CPU features: 0x002,25006082
      | Memory Limit: none
      | ---[ end Kernel panic - not syncing: HYP panic:
      
      This is caused by functions in pmu.c calling the instrumented
      code, which isn't mapped to hyp. From objdump -r:
      | RELOCATION RECORDS FOR [.hyp.text]:
      | OFFSET           TYPE              VALUE
      | 0000000000000010 R_AARCH64_CALL26  __sanitizer_cov_trace_pc
      | 0000000000000018 R_AARCH64_CALL26  __asan_load4_noabort
      | 0000000000000024 R_AARCH64_CALL26  __asan_load4_noabort
      
      Move the affected code to a new file under 'hyp's Makefile.
      
      Fixes: 3d91befb ("arm64: KVM: Enable !VHE support for :G/:H perf event modifiers")
      Cc: Andrew Murray <Andrew.Murray@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  2. May 20, 2019
  3. May 18, 2019
  4. May 17, 2019