  1. Jan 19, 2020
    • KVM: arm/arm64: Correct AArch32 SPSR on exception entry · 1cfbb484
      Mark Rutland authored
      
      
      Confusingly, there are three SPSR layouts that a kernel may need to deal
      with:
      
      (1) An AArch64 SPSR_ELx view of an AArch64 pstate
      (2) An AArch64 SPSR_ELx view of an AArch32 pstate
      (3) An AArch32 SPSR_* view of an AArch32 pstate
      
      When the KVM AArch32 support code deals with SPSR_{EL2,HYP}, it's either
      dealing with #2 or #3 consistently. On arm64 the PSR_AA32_* definitions
      match the AArch64 SPSR_ELx view, and on arm the PSR_AA32_* definitions
      match the AArch32 SPSR_* view.
      
      However, when we inject an exception into an AArch32 guest, we have to
      synthesize the AArch32 SPSR_* that the guest will see. Thus, an AArch64
      host needs to synthesize layout #3 from layout #2.
      
      This patch adds a new host_spsr_to_spsr32() helper for this, and makes
      use of it in the KVM AArch32 support code. For arm64 we need to shuffle
      the DIT bit around, and remove the SS bit, while for arm we can use the
      value as-is.
      
      I've open-coded the bit manipulation for now to avoid having to rework
      the existing PSR_* definitions into PSR64_AA32_* and PSR32_AA32_*
      definitions. I hope to perform a more thorough refactoring in future so
      that we can handle pstate view manipulation more consistently across the
      kernel tree.
      
       Signed-off-by: Mark Rutland <mark.rutland@arm.com>
       Signed-off-by: Marc Zyngier <maz@kernel.org>
       Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20200108134324.46500-4-mark.rutland@arm.com
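
       For reference, the arm64 half of such a helper might look like the
       sketch below. It assumes the SPSR_ELx bit positions (DIT at bit 24,
       SS at bit 21) and the AArch32 DIT position (bit 21); on arm the value
       can simply be returned unchanged.

         /* Sketch: layout #2 (SPSR_ELx view) -> layout #3 (AArch32 view). */
         static inline unsigned long host_spsr_to_spsr32(unsigned long spsr)
         {
                 /* Bits 24 (DIT) and 21 (SS) mean something else in AArch32. */
                 const unsigned long overlap = BIT(24) | BIT(21);
                 unsigned long dit = !!(spsr & BIT(24));

                 spsr &= ~overlap;       /* drop SS; clear the old DIT slot */
                 spsr |= dit << 21;      /* AArch32 DIT lives at bit 21 */

                 return spsr;
         }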
    • KVM: arm/arm64: Correct CPSR on exception entry · 3c2483f1
      Mark Rutland authored
      
      
      When KVM injects an exception into a guest, it generates the CPSR value
      from scratch, configuring CPSR.{M,A,I,T,E}, and setting all other
      bits to zero.
      
      This isn't correct, as the architecture specifies that some CPSR bits
      are (conditionally) cleared or set upon an exception, and others are
      unchanged from the original context.
      
      This patch adds logic to match the architectural behaviour. To make this
      simple to follow/audit/extend, documentation references are provided,
      and bits are configured in order of their layout in SPSR_EL2. This
      layout can be seen in the diagram on ARM DDI 0487E.a page C5-426.
      
       Note that this code is used by both arm and arm64, and is intended to
       function with the SPSR_EL2 and SPSR_HYP layouts.
      
       Signed-off-by: Mark Rutland <mark.rutland@arm.com>
       Signed-off-by: Marc Zyngier <maz@kernel.org>
       Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20200108134324.46500-3-mark.rutland@arm.com
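
       The shape of that logic, heavily abridged, might look like the sketch
       below (PSR_AA32_* names taken from the arm64 headers; conditional bits
       such as SSBS and the IT/GE handling are omitted here):

         /* Sketch: build the AArch32 CPSR seen after exception entry. */
         unsigned long old = *vcpu_cpsr(vcpu);
         unsigned long new = 0;

         new |= (old & PSR_AA32_N_BIT);   /* N,Z,C,V,Q: unchanged */
         new |= (old & PSR_AA32_Z_BIT);
         new |= (old & PSR_AA32_C_BIT);
         new |= (old & PSR_AA32_V_BIT);
         new |= (old & PSR_AA32_Q_BIT);
         new |= (old & PSR_AA32_DIT_BIT); /* DIT: unchanged */
         new |= PSR_AA32_I_BIT;           /* IRQs masked on entry */
         new |= mode;                     /* target mode, e.g. ABT/UND */

         *vcpu_cpsr(vcpu) = new;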
    • KVM: arm64: Correct PSTATE on exception entry · a425372e
      Mark Rutland authored
      
      
      When KVM injects an exception into a guest, it generates the PSTATE
      value from scratch, configuring PSTATE.{M[4:0],DAIF}, and setting all
      other bits to zero.
      
      This isn't correct, as the architecture specifies that some PSTATE bits
      are (conditionally) cleared or set upon an exception, and others are
      unchanged from the original context.
      
      This patch adds logic to match the architectural behaviour. To make this
      simple to follow/audit/extend, documentation references are provided,
      and bits are configured in order of their layout in SPSR_EL2. This
      layout can be seen in the diagram on ARM DDI 0487E.a page C5-429.
      
       Signed-off-by: Mark Rutland <mark.rutland@arm.com>
       Signed-off-by: Marc Zyngier <maz@kernel.org>
       Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20200108134324.46500-2-mark.rutland@arm.com
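
       The AArch64 analogue is similar; a heavily abridged sketch (again
       omitting conditional bits such as PAN, SSBS and UAO):

         /* Sketch: build the PSTATE seen after an exception to EL1. */
         unsigned long old = *vcpu_cpsr(vcpu);
         unsigned long new = 0;

         new |= (old & PSR_N_BIT);        /* N,Z,C,V: unchanged */
         new |= (old & PSR_Z_BIT);
         new |= (old & PSR_C_BIT);
         new |= (old & PSR_V_BIT);
         new |= PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT; /* DAIF set */
         new |= PSR_MODE_EL1h;            /* target mode */

         *vcpu_cpsr(vcpu) = new;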
    • arm64: kvm: Fix IDMAP overlap with HYP VA · f5523423
      Russell King authored
      
      
      Booting 5.4 on LX2160A reveals that KVM is non-functional:
      
      kvm: Limiting the IPA size due to kernel Virtual Address limit
      kvm [1]: IPA Size Limit: 43bits
      kvm [1]: IDMAP intersecting with HYP VA, unable to continue
      kvm [1]: error initializing Hyp mode: -22
      
      Debugging shows:
      
      kvm [1]: IDMAP page: 81a26000
      kvm [1]: HYP VA range: 0:22ffffffff
      
      as RAM is located at:
      
      80000000-fbdfffff : System RAM
      2080000000-237fffffff : System RAM
      
      Comparing this with the same kernel on Armada 8040 shows:
      
      kvm: Limiting the IPA size due to kernel Virtual Address limit
      kvm [1]: IPA Size Limit: 43bits
      kvm [1]: IDMAP page: 2a26000
      kvm [1]: HYP VA range: 4800000000:493fffffff
      ...
      kvm [1]: Hyp mode initialized successfully
      
       which indicates that hyp_va_msb is set, and is always set to the
       opposite of the idmap page's top VA bit so as to avoid the overlap.
       This does not happen on the LX2160A.
      
      Further debugging shows vabits_actual = 39, kva_msb = 38 on LX2160A and
      kva_msb = 33 on Armada 8040. Looking at the bit layout of the HYP VA,
      there is still one bit available for hyp_va_msb. Set this bit
      appropriately. This allows KVM to be functional on the LX2160A, but
      without any HYP VA randomisation:
      
      kvm: Limiting the IPA size due to kernel Virtual Address limit
      kvm [1]: IPA Size Limit: 43bits
      kvm [1]: IDMAP page: 81a24000
      kvm [1]: HYP VA range: 4000000000:62ffffffff
      ...
      kvm [1]: Hyp mode initialized successfully
      
      Fixes: ed57cac8 ("arm64: KVM: Introduce EL2 VA randomisation")
       Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
       [maz: small additional cleanups, preserved case where the tag
        is legitimately 0 and we can just use the mask, Fixes tag]
       Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/E1ilAiY-0000MA-RG@rmk-PC.armlinux.org.uk
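
       The core of the fix might be sketched as follows (illustrative; the
       actual patch also preserves the case where the tag is legitimately 0):

         /*
          * Give HYP the top VA bit that the idmap page does *not* use,
          * so the two ranges can never intersect, even when kva_msb is
          * the top usable bit.
          */
         hyp_va_msb  = idmap_addr & BIT(vabits_actual - 1);
         hyp_va_msb ^= BIT(vabits_actual - 1);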
    • KVM: arm64: Only sign-extend MMIO up to register width · b6ae256a
      Christoffer Dall authored
      
      
       On AArch64 you can do a sign-extended load to either a 32-bit or 64-bit
       register, and we should only sign-extend up to the width of the
       destination register specified in the operation (via the 32-bit Wn or
       64-bit Xn register specifier).
      
      As it turns out, the architecture provides this decoding information in
      the SF ("Sixty-Four" -- how cute...) bit.
      
      Let's take advantage of this with the usual 32-bit/64-bit header file
      dance and do the right thing on AArch64 hosts.
      
       Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
       Signed-off-by: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20191212195055.5541-1-christoffer.dall@arm.com
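
       The MMIO return path then does something like the sketch below:
       sign-extend within the access width as before, but truncate the result
       when ESR_ELx.SF says the destination was a 32-bit Wn register:

         if (vcpu->arch.mmio_decode.sign_extend &&
             len < sizeof(unsigned long)) {
                 mask = 1U << ((len * 8) - 1);
                 data = (data ^ mask) - mask;     /* sign-extend */
         }

         if (!vcpu->arch.mmio_decode.sixty_four)  /* SF == 0: Wn register */
                 data = data & 0xffffffff;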
  2. Dec 04, 2019
    • arm64: mm: Fix column alignment for UXN in kernel_page_tables · cba779d8
      Mark Brown authored
      
      
       UXN is the only individual PTE bit, other than the PTE_ATTRINDX_MASK
       ones, that doesn't have both a set and a clear value provided, meaning
       that the columns in the table won't all be aligned. The
       PTE_ATTRINDX_MASK values are mutually exclusive and longer, so they are
       listed last to form a single final column. Ensure everything is aligned
       by providing a clear value for UXN.
      
       Acked-by: Mark Rutland <mark.rutland@arm.com>
       Signed-off-by: Mark Brown <broonie@kernel.org>
       Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
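
       Concretely, the fix amounts to giving the UXN entry in the prot_bits
       table an explicit (blank) .clear string of the same width, along the
       lines of:

         {
                 .mask   = PTE_UXN,
                 .val    = PTE_UXN,
                 .set    = "UXN",
                 .clear  = "   ",        /* blank, same width as "UXN" */
         },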
    • arm64: insn: consistently handle exit text · ca2ef4ff
      Mark Rutland authored
      
      
       A kernel built with KASAN && FTRACE_WITH_REGS && !MODULES produces a
       boot-time splat in the bowels of ftrace:
      
      | [    0.000000] ftrace: allocating 32281 entries in 127 pages
      | [    0.000000] ------------[ cut here ]------------
      | [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2019 ftrace_bug+0x27c/0x328
      | [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-rc3-00008-g7f08ae53a7e3 #13
      | [    0.000000] Hardware name: linux,dummy-virt (DT)
      | [    0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO)
      | [    0.000000] pc : ftrace_bug+0x27c/0x328
      | [    0.000000] lr : ftrace_init+0x640/0x6cc
      | [    0.000000] sp : ffffa000120e7e00
      | [    0.000000] x29: ffffa000120e7e00 x28: ffff00006ac01b10
      | [    0.000000] x27: ffff00006ac898c0 x26: dfffa00000000000
      | [    0.000000] x25: ffffa000120ef290 x24: ffffa0001216df40
      | [    0.000000] x23: 000000000000018d x22: ffffa0001244c700
      | [    0.000000] x21: ffffa00011bf393c x20: ffff00006ac898c0
      | [    0.000000] x19: 00000000ffffffff x18: 0000000000001584
      | [    0.000000] x17: 0000000000001540 x16: 0000000000000007
      | [    0.000000] x15: 0000000000000000 x14: ffffa00010432770
      | [    0.000000] x13: ffff940002483519 x12: 1ffff40002483518
      | [    0.000000] x11: 1ffff40002483518 x10: ffff940002483518
      | [    0.000000] x9 : dfffa00000000000 x8 : 0000000000000001
      | [    0.000000] x7 : ffff940002483519 x6 : ffffa0001241a8c0
      | [    0.000000] x5 : ffff940002483519 x4 : ffff940002483519
      | [    0.000000] x3 : ffffa00011780870 x2 : 0000000000000001
      | [    0.000000] x1 : 1fffe0000d591318 x0 : 0000000000000000
      | [    0.000000] Call trace:
      | [    0.000000]  ftrace_bug+0x27c/0x328
      | [    0.000000]  ftrace_init+0x640/0x6cc
      | [    0.000000]  start_kernel+0x27c/0x654
      | [    0.000000] random: get_random_bytes called from print_oops_end_marker+0x30/0x60 with crng_init=0
      | [    0.000000] ---[ end trace 0000000000000000 ]---
      | [    0.000000] ftrace faulted on writing
      | [    0.000000] [<ffffa00011bf393c>] _GLOBAL__sub_D_65535_0___tracepoint_initcall_level+0x4/0x28
      | [    0.000000] Initializing ftrace call sites
      | [    0.000000] ftrace record flags: 0
      | [    0.000000]  (0)
      | [    0.000000]  expected tramp: ffffa000100b3344
      
      This is due to an unfortunate combination of several factors.
      
      Building with KASAN results in the compiler generating anonymous
      functions to register/unregister global variables against the shadow
      memory. These functions are placed in .text.startup/.text.exit, and
      given mangled names like _GLOBAL__sub_{I,D}_65535_0_$OTHER_SYMBOL. The
      kernel linker script places these in .init.text and .exit.text
      respectively, which are both discarded at runtime as part of initmem.
      
      Building with FTRACE_WITH_REGS uses -fpatchable-function-entry=2, which
      also instruments KASAN's anonymous functions. When these are discarded
      with the rest of initmem, ftrace removes dangling references to these
      call sites.
      
      Building without MODULES implicitly disables STRICT_MODULE_RWX, and
      causes arm64's patch_map() function to treat any !core_kernel_text()
      symbol as something that can be modified in-place. As core_kernel_text()
      is only true for .text and .init.text, with the latter depending on
      system_state < SYSTEM_RUNNING, we'll treat .exit.text as something that
      can be patched in-place. However, .exit.text is mapped read-only.
      
      Hence in this configuration the ftrace init code blows up while trying
      to patch one of the functions generated by KASAN.
      
      We could try to filter out the call sites in .exit.text rather than
      initializing them, but this would be inconsistent with how we handle
      .init.text, and requires hooking into core bits of ftrace. The behaviour
      of patch_map() is also inconsistent today, so instead let's clean that
      up and have it consistently handle .exit.text.
      
      This patch teaches patch_map() to handle .exit.text at init time,
      preventing the boot-time splat above. The flow of patch_map() is
      reworked to make the logic clearer and minimize redundant
      conditionality.
      
      Fixes: 3b23e499 ("arm64: implement ftrace with regs")
       Signed-off-by: Mark Rutland <mark.rutland@arm.com>
       Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
       Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
       Cc: Torsten Duwe <duwe@suse.de>
       Cc: Will Deacon <will@kernel.org>
       Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
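
       The key helper might be sketched as below: .exit.text is treated like
       image text only while it still exists, i.e. before initmem is freed
       (__exittext_begin/__exittext_end are assumed section markers):

         static bool is_exit_text(unsigned long addr)
         {
                 /* discarded with init text, do not check at runtime */
                 return system_state < SYSTEM_RUNNING &&
                         addr >= (unsigned long)__exittext_begin &&
                         addr <  (unsigned long)__exittext_end;
         }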
    • arm64: mm: Fix initialisation of DMA zones on non-NUMA systems · 93b90414
      Will Deacon authored
      John reports that the recently merged commit 1a8e1cef ("arm64: use
      both ZONE_DMA and ZONE_DMA32") breaks the boot on his DB845C board:
      
        | Booting Linux on physical CPU 0x0000000000 [0x517f803c]
        | Linux version 5.4.0-mainline-10675-g957a03b9e38f
        | Machine model: Thundercomm Dragonboard 845c
        | [...]
        | Built 1 zonelists, mobility grouping on.  Total pages: -188245
        | Kernel command line: earlycon
        | firmware_class.path=/vendor/firmware/ androidboot.hardware=db845c
        | init=/init androidboot.boot_devices=soc/1d84000.ufshc
        | printk.devkmsg=on buildvariant=userdebug root=/dev/sda2
        | androidboot.bootdevice=1d84000.ufshc androidboot.serialno=c4e1189c
        | androidboot.baseband=sda
        | msm_drm.dsi_display0=dsi_lt9611_1080_video_display:
        | androidboot.slot_suffix=_a skip_initramfs rootwait ro init=/init
        |
        | <hangs indefinitely here>
      
      This is because, when CONFIG_NUMA=n, zone_sizes_init() fails to handle
      memblocks that fall entirely within the ZONE_DMA region and erroneously ends up
      trying to add a negatively-sized region into the following ZONE_DMA32, which is
      later interpreted as a large unsigned region by the core MM code.
      
      Rework the non-NUMA implementation of zone_sizes_init() so that the start
      address of the memblock being processed is adjusted according to the end of the
      previous zone, which is then range-checked before updating the hole information
      of subsequent zones.
      
      Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
       Link: https://lore.kernel.org/lkml/CALAqxLVVcsmFrDKLRGRq7GewcW405yTOxG=KR3csVzQ6bXutkA@mail.gmail.com
       
       Fixes: 1a8e1cef ("arm64: use both ZONE_DMA and ZONE_DMA32")
       Reported-by: John Stultz <john.stultz@linaro.org>
       Tested-by: John Stultz <john.stultz@linaro.org>
       Signed-off-by: Will Deacon <will@kernel.org>
       Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
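
       The reworked loop can be sketched as below (names illustrative): clamp
       each memblock to the zone being processed and range-check before
       accounting, so a block sitting entirely below a zone boundary can
       never produce a negative size:

         unsigned long start = max(memblock_start, prev_zone_end);
         unsigned long end   = min(memblock_end, zone_end);

         if (start < end)                 /* block overlaps this zone */
                 zhole_size[zone] -= end - start;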
  3. Nov 28, 2019
    • Revert "arm64: dts: juno: add dma-ranges property" · 54fb3fe0
      Sudeep Holla authored
      
      
      This reverts commit 193d00a2.
      
      Commit 951d4885 ("of: Make of_dma_get_range() work on bus nodes")
      reworked the logic such that of_dma_get_range() works correctly
      starting from a bus node containing "dma-ranges".
      
      Since on Juno we don't have a SoC level bus node and "dma-ranges" is
      present only in the root node, we get the following error:
      
      OF: translation of DMA address(0) to CPU address failed node(/sram@2e000000)
      OF: translation of DMA address(0) to CPU address failed node(/uart@7ff80000)
      ...
      OF: translation of DMA address(0) to CPU address failed node(/mhu@2b1f0000)
      OF: translation of DMA address(0) to CPU address failed node(/iommu@2b600000)
      OF: translation of DMA address(0) to CPU address failed node(/iommu@2b600000)
      OF: translation of DMA address(0) to CPU address failed node(/iommu@2b600000)
      
      So let's fix it by dropping the "dma-ranges" property for now. This
      should be fine since it doesn't represent any kind of device-visible
      restriction; it was only there for completeness, and we've since given
      in to the assumption that missing "dma-ranges" implies a 1:1 mapping
      anyway.
      
      We can add it later with a proper SoC bus node and moving all the
      devices that belong there along with the "dma-ranges" if required.
      
      Fixes: 193d00a2 ("arm64: dts: juno: add dma-ranges property")
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Liviu Dudau <liviu.dudau@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
       Acked-by: Robin Murphy <robin.murphy@arm.com>
       Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
  4. Nov 20, 2019
    • dma-mapping: drop the dev argument to arch_sync_dma_for_* · 56e35f9c
      Christoph Hellwig authored
      
      
       These are pure cache maintenance routines, so drop the unused
       struct device argument.
      
       Signed-off-by: Christoph Hellwig <hch@lst.de>
       Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
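
       In other words, the prototypes lose their first parameter; assuming
       the declarations in <linux/dma-noncoherent.h> at the time:

         /* old */
         void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
                         size_t size, enum dma_data_direction dir);
         /* new */
         void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
                         enum dma_data_direction dir);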
    • arm64: uaccess: Remove uaccess_*_not_uao asm macros · e50be648
      Pavel Tatashin authored
      
      
      It is safer and simpler to drop the uaccess assembly macros in favour of
      inline C functions. Although this bloats the Image size slightly, it
      aligns our user copy routines with '{get,put}_user()' and generally
      makes the code a lot easier to reason about.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
       Reviewed-by: Mark Rutland <mark.rutland@arm.com>
       Tested-by: Mark Rutland <mark.rutland@arm.com>
       Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
      [will: tweaked commit message and changed temporary variable names]
       Signed-off-by: Will Deacon <will@kernel.org>
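
       After the change, a user copy routine is wrapped in C along these
       lines (sketched here as a function; the actual definitions may be
       macros):

         static inline unsigned long __must_check
         raw_copy_from_user(void *to, const void __user *from, unsigned long n)
         {
                 unsigned long ret;

                 uaccess_enable_not_uao();
                 ret = __arch_copy_from_user(to, __uaccess_mask_ptr(from), n);
                 uaccess_disable_not_uao();

                 return ret;
         }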
    • arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault · 94bb804e
      Pavel Tatashin authored
      
      
      A number of our uaccess routines ('__arch_clear_user()' and
      '__arch_copy_{in,from,to}_user()') fail to re-enable PAN if they
      encounter an unhandled fault whilst accessing userspace.
      
      For CPUs implementing both hardware PAN and UAO, this bug has no effect
      when both extensions are in use by the kernel.
      
      For CPUs implementing hardware PAN but not UAO, this means that a kernel
      using hardware PAN may execute portions of code with PAN inadvertently
      disabled, opening us up to potential security vulnerabilities that rely
      on userspace access from within the kernel which would usually be
      prevented by this mechanism. In other words, parts of the kernel run the
      same way as they would on a CPU without PAN implemented/emulated at all.
      
      For CPUs not implementing hardware PAN and instead relying on software
      emulation via 'CONFIG_ARM64_SW_TTBR0_PAN=y', the impact is unfortunately
      much worse. Calling 'schedule()' with software PAN disabled means that
      the next task will execute in the kernel using the page-table and ASID
      of the previous process even after 'switch_mm()', since the actual
      hardware switch is deferred until return to userspace. At this point, or
       if there is an intermediate call to 'uaccess_enable()', the page-table
      and ASID of the new process are installed. Sadly, due to the changes
      introduced by KPTI, this is not an atomic operation and there is a very
      small window (two instructions) where the CPU is configured with the
      page-table of the old task and the ASID of the new task; a speculative
      access in this state is disastrous because it would corrupt the TLB
      entries for the new task with mappings from the previous address space.
      
      As Pavel explains:
      
        | I was able to reproduce memory corruption problem on Broadcom's SoC
        | ARMv8-A like this:
        |
        | Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
        | stack is accessed and copied.
        |
        | The test program performed the following on every CPU and forking
        | many processes:
        |
        |	unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
        |				  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        |	map[0] = getpid();
        |	sched_yield();
        |	if (map[0] != getpid()) {
        |		fprintf(stderr, "Corruption detected!");
        |	}
        |	munmap(map, PAGE_SIZE);
        |
        | From time to time I was getting map[0] to contain pid for a
        | different process.
      
      Ensure that PAN is re-enabled when returning after an unhandled user
      fault from our uaccess routines.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
       Reviewed-by: Mark Rutland <mark.rutland@arm.com>
       Tested-by: Mark Rutland <mark.rutland@arm.com>
      Cc: <stable@vger.kernel.org>
      Fixes: 338d4f49 ("arm64: kernel: Add support for Privileged Access Never")
       Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
       [will: rewrote commit message]
       Signed-off-by: Will Deacon <will@kernel.org>