  1. Jan 16, 2021
    • RISC-V: Fix maximum allowed physical memory for RV32 · e5577937
      Atish Patra authored
      
      
      The Linux kernel can only map 1GB of address space on RV32 because the
      page offset is set to 0xC0000000. The current Kconfig description is
      confusing, as it indicates that RV32 can support 2GB of physical memory,
      which is simply not true for the current kernel. In the future, 2GB split
      support can be added to allow a 2GB physical address space.
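
      The 1GB figure follows directly from that page offset. A minimal
      stand-alone illustration of the arithmetic (plain userspace C, not kernel
      code; the constants are the values quoted above):

      #include <stdio.h>
      #include <stdint.h>

      /* Illustration only: with the RV32 PAGE_OFFSET quoted above, the linear
       * map runs from 0xC0000000 to the top of the 32-bit address space, so at
       * most 1GB of RAM can be direct-mapped. */
      int main(void)
      {
              const uint64_t page_offset = 0xC0000000ULL;  /* RV32 PAGE_OFFSET */
              const uint64_t top_of_va   = 0x100000000ULL; /* end of 32-bit space */

              printf("direct-mappable memory: %llu MiB\n",
                     (unsigned long long)((top_of_va - page_offset) >> 20));
              return 0;
      }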
      
      Reviewed-by: Anup Patel <anup@brainfault.org>
      Signed-off-by: Atish Patra <atish.patra@wdc.com>
      Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
    • RISC-V: Set current memblock limit · abb8e86b
      Atish Patra authored
      
      
      Currently, the Linux kernel cannot use the last 4K bytes of the
      addressable space because the IS_ERR_VALUE macro treats those values as
      errors. This is an issue for RV32, as the memblock allocator may hand out
      a chunk of memory from the end of DRAM (2GB), leading to a bad-address
      error even though the address is technically valid.
      
      Fix this issue by limiting memblock when the available memory spans the
      entire address space.
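
      A stand-alone illustration of the clash (the MAX_ERRNO value and the
      comparison mirror include/linux/err.h; everything else is a userspace
      demo, not kernel code):

      #include <stdio.h>
      #include <stdint.h>

      /* Mirrors the comparison in include/linux/err.h on a 32-bit kernel:
       * the top MAX_ERRNO values of the address space encode error codes. */
      #define MAX_ERRNO          4095
      #define IS_ERR_VALUE32(x)  ((uint32_t)(x) >= (uint32_t)-MAX_ERRNO)

      int main(void)
      {
              uint32_t ok_addr  = 0x80000000u;  /* an ordinary address */
              uint32_t top_addr = 0xFFFFF800u;  /* inside the last 4K of the 32-bit space */

              printf("0x%08x: %s\n", ok_addr,
                     IS_ERR_VALUE32(ok_addr) ? "looks like an error" : "valid");
              printf("0x%08x: %s\n", top_addr,
                     IS_ERR_VALUE32(top_addr) ? "looks like an error" : "valid");
              return 0;
      }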
      
      Reviewed-by: Anup Patel <anup@brainfault.org>
      Signed-off-by: Atish Patra <atish.patra@wdc.com>
      Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
    • RISC-V: Do not allocate memblock while iterating reserved memblocks · 797f0375
      Atish Patra authored
      
      
      Currently, the resource tree code allocates memory blocks while iterating
      over the list. This leads to the following kernel warning, because
      memblock allocation also invokes the memblock reservation API.
      
      [    0.000000] ------------[ cut here ]------------
      [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/resource.c:795
      __insert_resource+0x8e/0xd0
      [    0.000000] Modules linked in:
      [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted
      5.10.0-00022-ge20097fb37e2-dirty #549
      [    0.000000] epc: c00125c2 ra : c001262c sp : c1c01f50
      [    0.000000]  gp : c1d456e0 tp : c1c0a980 t0 : ffffcf20
      [    0.000000]  t1 : 00000000 t2 : 00000000 s0 : c1c01f60
      [    0.000000]  s1 : ffffcf00 a0 : ffffff00 a1 : c1c0c0c4
      [    0.000000]  a2 : 80c12b15 a3 : 80402000 a4 : 80402000
      [    0.000000]  a5 : c1c0c0c4 a6 : 80c12b15 a7 : f5faf600
      [    0.000000]  s2 : c1c0c0c4 s3 : c1c0e000 s4 : c1009a80
      [    0.000000]  s5 : c1c0c000 s6 : c1d48000 s7 : c1613b4c
      [    0.000000]  s8 : 00000fff s9 : 80000200 s10: c1613b40
      [    0.000000]  s11: 00000000 t3 : c1d4a000 t4 : ffffffff
      
      This is also unnecessary, as we can pre-compute the total number of
      memblocks required for each memory region and allocate them before the
      loop. That saves precious boot time by not going through the memblock
      allocation code every time.
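
      A rough sketch of the pre-allocation pattern described above (illustrative
      only; the function name is hypothetical and the actual patch differs in
      detail):

      #include <linux/init.h>
      #include <linux/ioport.h>
      #include <linux/memblock.h>

      /*
       * Illustrative only: count the regions first, do a single memblock
       * allocation, then fill and insert the pre-allocated entries so no
       * reservation happens while the region list is being walked.
       */
      static void __init add_reserved_regions_to_resource_tree(void)
      {
              struct resource *res;
              unsigned long i, count = memblock.reserved.cnt;

              /* One allocation up front, instead of one per region in the loop. */
              res = memblock_alloc(count * sizeof(*res), SMP_CACHE_BYTES);
              if (!res)
                      return;

              for (i = 0; i < count; i++, res++) {
                      struct memblock_region *region = &memblock.reserved.regions[i];

                      res->name  = "Reserved";
                      res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
                      res->start = region->base;
                      res->end   = region->base + region->size - 1;
                      insert_resource(&iomem_resource, res);
              }
      }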
      
      Fixes: 00ab027a3b82 ("RISC-V: Add kernel image sections to the resource tree")
      
      Reviewed-by: Anup Patel <anup@brainfault.org>
      Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Atish Patra <atish.patra@wdc.com>
      Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
  2. Jan 12, 2021
    • arm64: Remove arm64_dma32_phys_limit and its uses · d78050ee
      Catalin Marinas authored
      
      
      With the introduction of a dynamic ZONE_DMA range based on DT or IORT
      information, there's no need for CMA allocations from the wider
      ZONE_DMA32 since on most platforms ZONE_DMA will cover the 32-bit
      addressable range. Remove the arm64_dma32_phys_limit and set
      arm64_dma_phys_limit to cover the smallest DMA range required on the
      platform. CMA allocation and crashkernel reservation now go in the
      dynamically sized ZONE_DMA, allowing correct functionality on RPi4.
      
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Zhou <chenzhou10@huawei.com>
      Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
      Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> # On RPi4B
    • powerpc: Fix alignment bug within the init sections · 2225a8dd
      Ariel Marcovitch authored
      
      
      This fixes a bug that causes early crashes in builds with an .exit.text
      section smaller than a page and an .init.text section that ends at the
      beginning of a physical page (this is fairly random, which might explain
      why it was not encountered before).
      
      The init sections are ordered like this:
        .init.text
        .exit.text
        .init.data
      
      Currently, these sections aren't page aligned.
      
      Because the init code might become read-only at runtime and because
      the .init.text section can potentially reside on the same physical
      page as .init.data, the beginning of .init.data might be mapped
      read-only along with .init.text.
      
      Then when the kernel tries to modify a variable in .init.data (like
      kthreadd_done, used in kernel_init()) the kernel panics.
      
      To avoid this, make _einittext page aligned and also align .exit.text to
      make sure .init.data is always separated from the text segments.
      
      Fixes: 060ef9d8 ("powerpc32: PAGE_EXEC required for inittext")
      Signed-off-by: Ariel Marcovitch <ariel.marcovitch@gmail.com>
      Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210102201156.10805-1-ariel.marcovitch@gmail.com
  3. Jan 07, 2021
    • KVM: SVM: Add support for booting APs in an SEV-ES guest · 647daca2
      Tom Lendacky authored
      
      
      Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence,
      where the guest vCPU register state is updated and then the vCPU is
      VMRUN'd to begin execution of the AP. For an SEV-ES guest, this won't
      work because the guest register state is encrypted.
      
      Following the GHCB specification, the hypervisor must not alter the guest
      register state, so KVM must track an AP/vCPU boot. Should the guest want
      to park the AP, it must use the AP Reset Hold exit event in place of, for
      example, a HLT loop.
      
      First AP boot (first INIT-SIPI-SIPI sequence):
        Execute the AP (vCPU) as it was initialized and measured by the SEV-ES
        support. It is up to the guest to transfer control of the AP to the
        proper location.
      
      Subsequent AP boot:
        KVM will expect to receive an AP Reset Hold exit event indicating that
        the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to
        awaken it. When the AP Reset Hold exit event is received, KVM will place
        the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI
        sequence, KVM will make the vCPU runnable. It is again up to the guest
        to then transfer control of the AP to the proper location.
      
        To differentiate between an actual HLT and an AP Reset Hold, a new MP
        state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is
        placed in upon receiving the AP Reset Hold exit event. Additionally, to
        communicate the AP Reset Hold exit event up to userspace (if needed), a
        new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD.
      
      A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order
      to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the
      original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds
      a new function that, for non-SEV-ES guests, invokes the original SIPI
      delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests,
      implements the logic above.
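
      A sketch of where the new exit reason would surface in a userspace VMM's
      exit handler (the UAPI names are the ones introduced above; the
      surrounding function is hypothetical, and a VMM typically needs no
      special action here since KVM itself parks and later wakes the vCPU):

      #include <linux/kvm.h>          /* KVM_EXIT_AP_RESET_HOLD, struct kvm_run */

      /* Hypothetical fragment of a VMM's vCPU exit handler, illustration only. */
      static int handle_vcpu_exit(struct kvm_run *run)
      {
              switch (run->exit_reason) {
              case KVM_EXIT_AP_RESET_HOLD:
                      /*
                       * The AP issued an AP Reset Hold instead of HLT; KVM has
                       * placed it in KVM_MP_STATE_AP_RESET_HOLD and will make it
                       * runnable again when the guest sends INIT-SIPI-SIPI.
                       */
                      return 0;
              case KVM_EXIT_HLT:
                      return 0;
              default:
                      return -1;      /* unhandled exit */
              }
      }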
      
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit · f2c7ef3b
      Maxim Levitsky authored
      
      
      It is possible to exit nested guest mode, entered by svm_set_nested_state,
      prior to the first VM entry into it (e.g. due to a pending event), if a
      nested run was not pending during the migration.
      
      In this case we must not switch to the nested MSR permission bitmap.
      Also add a warning to catch similar cases in the future.
      
      Fixes: a7d5c7ce ("KVM: nSVM: delay MSR permission processing to first nested VM run")
      
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210107093854.882483-2-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: mark vmcb as dirty when forcibly leaving the guest mode · 56fe28de
      Maxim Levitsky authored
      
      
      When forcibly leaving guest mode we overwrite most of the VMCB fields, so
      we must mark it as dirty.
      
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210107093854.882483-5-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nSVM: correctly restore nested_run_pending on migration · 81f76ada
      Maxim Levitsky authored
      
      
      The code to save nested_run_pending on migration exists, but there was no
      code restoring it.
      
      One of the side effects of fixing this is that L1->L2 injected events
      are no longer lost when migration happens with nested run pending.
      
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210107093854.882483-3-mlevitsk@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Clarify TDP MMU page list invariants · c0dba6e4
      Ben Gardon authored
      
      
      The tdp_mmu_roots and tdp_mmu_pages in struct kvm_arch should only contain
      pages with tdp_mmu_page set to true. tdp_mmu_pages should not contain any
      pages with a non-zero root_count and tdp_mmu_roots should only contain
      pages with a positive root_count, unless a thread holds the MMU lock and
      is in the process of modifying the list. Various functions expect these
      invariants to be maintained, but they are not explicitly documented. Add
      to the comments on both fields to document the above invariants.
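
      Paraphrasing those invariants as field comments, with a stand-in struct
      rather than the real struct kvm_arch (illustrative wording, not the exact
      upstream comment):

      #include <linux/list.h>

      /* Stand-in for the two struct kvm_arch fields discussed above. */
      struct tdp_mmu_lists_example {
              /*
               * Roots of the TDP MMU: every page on this list has tdp_mmu_page
               * set and a positive root_count, except transiently while a
               * thread holding the MMU lock is modifying the list.
               */
              struct list_head tdp_mmu_roots;

              /*
               * Non-root TDP MMU pages: every page on this list has
               * tdp_mmu_page set and a root_count of zero, with the same
               * MMU-lock exception.
               */
              struct list_head tdp_mmu_pages;
      };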
      
      Signed-off-by: Ben Gardon <bgardon@google.com>
      Message-Id: <20210107001935.3732070-2-bgardon@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>