- Dec 02, 2020
-
-
Mark Rutland authored
The SDEI support code is split across arch/arm64/ and drivers/firmware/, largley this is split so that the arch-specific portions are under arch/arm64, and the management logic is under drivers/firmware/. However, exception entry fixups are currently under drivers/firmware. Let's move the exception entry fixups under arch/arm64/. This de-clutters the management logic, and puts all the arch-specific portions in one place. Doing this also allows the fixups to be applied earlier, so things like PAN and UAO will be in a known good state before we run other logic. This will also make subsequent refactoring easier. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Reviewed-by:
James Morse <james.morse@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-2-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
As with SCTLR_ELx and other control registers, some PSTATE bits are UNKNOWN out-of-reset, and we may not be able to rely on hardware or firmware to initialize them to our liking prior to entry to the kernel, e.g. in the primary/secondary boot paths and return from idle/suspend. It would be more robust (and easier to reason about) if we consistently initialized PSTATE to a default value, as we do with control registers. This will ensure that the kernel is not adversely affected by bits it is not aware of, e.g. when support for a feature such as PAN/UAO is disabled. This patch ensures that PSTATE is consistently initialized at boot time via an ERET. This is not intended to relax the existing requirements (e.g. DAIF bits must still be set prior to entering the kernel). For features detected dynamically (which may require system-wide support), it is still necessary to subsequently modify PSTATE. As ERET is not always a Context Synchronization Event, an ISB is placed before each exception return to ensure updates to control registers have taken effect. This handles the kernel being entered with SCTLR_ELx.EOS clear (or any future control bits being in an UNKNOWN state). Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-6-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
Let's make SCTLR_ELx initialization a bit clearer by using meaningful names for the initialization values, following the same scheme for SCTLR_EL1 and SCTLR_EL2. These definitions will be used more widely in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-5-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
For a while now el2_setup has performed some basic initialization of EL1 even when the kernel is booted at EL1, so the name is a little misleading. Further, some comments are stale as with VHE it doesn't drop the CPU to EL1. To clarify things, rename el2_setup to init_kernel_el, and update comments to be clearer as to the function's purpose. There should be no functional change as a result of this patch. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-4-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
To make callsites easier to read, add trivial C wrappers for the SET_PSTATE_*() helpers, and convert trivial uses over to these. The new wrappers will be used further in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-3-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
For consistency, all tasks have a pt_regs reserved at the highest portion of their task stack. Among other things, this ensures that a task's SP is always pointing within its stack rather than pointing immediately past the end. While it is never legitimate to ERET from a kthread, we take pains to initialize pt_regs for kthreads as if this were legitimate. As this is never legitimate, the effects of an erroneous return are rarely tested. Let's simplify things by initializing a kthread's pt_regs such that an ERET is caught as an illegal exception return, and removing the explicit initialization of other exception context. Note that as spectre_v4_enable_task_mitigation() only manipulates the PSTATE within the unused regs this is safe to remove. As user tasks will have their exception context initialized via start_thread() or start_compat_thread(), this should only impact cases where something has gone very wrong and we'd like that to be clearly indicated. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-2-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
- Nov 09, 2020
-
-
Will Deacon authored
When building with LTO, there is an increased risk of the compiler converting an address dependency headed by a READ_ONCE() invocation into a control dependency and consequently allowing for harmful reordering by the CPU. Ensure that such transformations are harmless by overriding the generic READ_ONCE() definition with one that provides acquire semantics when building with LTO. Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Will Deacon authored
In preparation for patching the internals of READ_ONCE() itself, replace its usage on the alternatives patching patch with a volatile variable instead. Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Will Deacon authored
Armv8.3 introduced the LDAPR instruction, which provides weaker memory ordering semantics than LDARi (RCpc vs RCsc). Generally, we provide an RCsc implementation when implementing the Linux memory model, but LDAPR can be used as a useful alternative to dependency ordering, particularly when the compiler is capable of breaking the dependencies. Since LDAPR is not available on all CPUs, add a cpufeature to detect it at runtime and allow the instruction to be used with alternative code patching. Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Will Deacon authored
asm/alternative.h contains both the macros needed to use alternatives, as well the type definitions and function prototypes for applying them. Split the header in two, so that alternatives can be used from core header files such as linux/compiler.h without the risk of circular includes Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Mark Rutland authored
The uao_* alternative asm macros are only used by the uaccess assembly routines in arch/arm64/lib/, where they are included indirectly via asm-uaccess.h. Since they're specific to the uaccess assembly (and will lose the alternatives in subsequent patches), let's move them into asm-uaccess.h. There should be no functional change as a result of this patch. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> [will: update #include in mte.S to pull in uao asm macros] Signed-off-by:
Will Deacon <will@kernel.org>
-
- Nov 07, 2020
-
-
Mike Travis authored
Testing shows a problem in that UV5 hubless systems were not being recognized. Add them to the list of OEM IDs checked. Fixes: 6c779442 ("Add UV5 direct references") Signed-off-by:
Mike Travis <mike.travis@hpe.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201105222741.157029-4-mike.travis@hpe.com
-
Mike Travis authored
Testing shows that trailing spaces caused problems with the OEM_ID and the OEM_TABLE_ID. One being that the OEM_ID would not string compare correctly. Another the OEM_ID and OEM_TABLE_ID would be concatenated in the printout. Remove any trailing spaces. Fixes: 1e61f5a9 ("Add and decode Arch Type in UVsystab") Signed-off-by:
Mike Travis <mike.travis@hpe.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201105222741.157029-3-mike.travis@hpe.com
-
Mike Travis authored
Testing shows a problem in that the OEM_TABLE_ID was missing for hubless systems. This is used to determine the APIC type (legacy or extended). Add the OEM_TABLE_ID to the early hubless processing. Fixes: 1e61f5a9 ("Add and decode Arch Type in UVsystab") Signed-off-by:
Mike Travis <mike.travis@hpe.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20201105222741.157029-2-mike.travis@hpe.com
-
- Nov 06, 2020
-
-
Palmer Dabbelt authored
We were relying on GNU ld's ability to re-link executable files in order to extract our VDSO symbols. This behavior was deemed a bug as of binutils-2.35 (specifically the binutils-gdb commit a87e1817a4 ("Have the linker fail if any attempt to link in an executable is made."), but as that has been backported to at least Debian's binutils-2.34 in may manifest in other places. The previous version of this was a bit of a mess: we were linking a static executable version of the VDSO, containing only a subset of the input symbols, which we then linked into the kernel. This worked, but certainly wasn't a supported path through the toolchain. Instead this new version parses the textual output of nm to produce a symbol table. Both rely on near-zero addresses being linkable, but as we rely on weak undefined symbols being linkable elsewhere I don't view this as a major issue. Fixes: e2c0cdfb ("RISC-V: User-facing API") Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Anup Patel authored
Currently, we use PGD mappings for early DTB mapping in early_pgd but this breaks Linux kernel on SiFive Unleashed because on SiFive Unleashed PMP checks don't work correctly for PGD mappings. To fix early DTB mappings on SiFive Unleashed, we use non-PGD mappings (i.e. PMD) for early DTB access. Fixes: 8f3a2b4a ("RISC-V: Move DT mapping outof fixmap") Signed-off-by:
Anup Patel <anup.patel@wdc.com> Reviewed-by:
Atish Patra <atish.patra@wdc.com> Tested-by:
Atish Patra <atish.patra@wdc.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Changbin Du authored
The copy_from_kernel_nofault() is broken on riscv because the 'dst' and 'src' are mistakenly reversed in __put_kernel_nofault() macro. copy_to_kernel_nofault: ... 0xffffffe0003159b8 <+30>: sd a4,0(a1) # a1 aka 'src' Fixes: d464118c ("riscv: implement __get_kernel_nofault and __put_user_nofault") Signed-off-by:
Changbin Du <changbin.du@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Anup Patel <anup@brainfault.org> Tested-by:
Anup Patel <anup@brainfault.org> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Liu Shaohua authored
The argument to pfn_to_virt() should be pfn not the value of CSR_SATP. Reviewed-by:
Palmer Dabbelt <palmerdabbelt@google.com> Reviewed-by:
Anup Patel <anup@brainfault.org> Signed-off-by:
liush <liush@allwinnertech.com> Reviewed-by:
Pekka Enberg <penberg@kernel.org> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Scott Cheloha authored
Add a non-NUMA definition for of_drconf_to_nid_single() to topology.h so we have one even if powerpc/mm/numa.c is not compiled. On a non-NUMA kernel the appropriate node id is always first_online_node. Fixes: 72cdd117 ("pseries/hotplug-memory: hot-add: skip redundant LMB lookup") Reported-by:
kernel test robot <lkp@intel.com> Signed-off-by:
Scott Cheloha <cheloha@linux.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201105223040.3612663-1-cheloha@linux.ibm.com
-
Sean Anderson authored
M-Mode Linux is loaded at the start of RAM, not 2MB later. Perhaps this should be calculated based on PAGE_OFFSET somehow? Even better would be to deprecate text_offset and instead introduce something absolute. Signed-off-by:
Sean Anderson <seanga2@gmail.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
- Nov 05, 2020
-
-
Benjamin Gwin authored
It's possible that the first region picked for the new kernel will make it impossible to fit the other segments in the required 32GB window, especially if we have a very large initrd. Instead of giving up, we can keep testing other regions for the kernel until we find one that works. Suggested-by:
Ryan O'Leary <ryanoleary@google.com> Signed-off-by:
Benjamin Gwin <bgwin@google.com> Link: https://lore.kernel.org/r/20201103201106.2397844-1-bgwin@google.com Signed-off-by:
Will Deacon <will@kernel.org>
-
Anand K Mistry authored
On AMD CPUs which have the feature X86_FEATURE_AMD_STIBP_ALWAYS_ON, STIBP is set to on and spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED At the same time, IBPB can be set to conditional. However, this leads to the case where it's impossible to turn on IBPB for a process because in the PR_SPEC_DISABLE case in ib_prctl_set() the spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED condition leads to a return before the task flag is set. Similarly, ib_prctl_get() will return PR_SPEC_DISABLE even though IBPB is set to conditional. More generally, the following cases are possible: 1. STIBP = conditional && IBPB = on for spectre_v2_user=seccomp,ibpb 2. STIBP = on && IBPB = conditional for AMD CPUs with X86_FEATURE_AMD_STIBP_ALWAYS_ON The first case functions correctly today, but only because spectre_v2_user_ibpb isn't updated to reflect the IBPB mode. At a high level, this change does one thing. If either STIBP or IBPB is set to conditional, allow the prctl to change the task flag. Also, reflect that capability when querying the state. This isn't perfect since it doesn't take into account if only STIBP or IBPB is unconditionally on. But it allows the conditional feature to work as expected, without affecting the unconditional one. [ bp: Massage commit message and comment; space out statements for better readability. ] Fixes: 21998a35 ("x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.") Signed-off-by:
Anand K Mistry <amistry@google.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Tom Lendacky <thomas.lendacky@amd.com> Link: https://lkml.kernel.org/r/20201105163246.v2.1.Ifd7243cd3e2c2206a893ad0a5b9a4f19549e22c6@changeid
-
Atish Patra authored
RISC-V limits the physical memory size by -PAGE_OFFSET. Any memory beyond that size from DRAM start is unusable. Just remove any memblock pointing to those memory region without worrying about computing the maximum size. Signed-off-by:
Atish Patra <atish.patra@wdc.com> Reviewed-by:
Mike Rapoport <rppt@linux.ibm.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Christophe Leroy authored
When _PAGE_ACCESSED is not set, a minor fault is expected. To do this, TLB miss exception ANDs _PAGE_PRESENT and _PAGE_ACCESSED into the L2 entry valid bit. To simplify the processing and reduce the number of instructions in TLB miss exceptions, manage it as an APG bit and get it next to _PAGE_GUARDED bit to allow a copy in one go. Then declare the corresponding groups as handling all accesses as user accesses. As the PP bits always define user as No Access, it will generate a fault. Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/80f488db230c6b0e7b3b990d72bd94a8a069e93e.1602492856.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
The kernel expects pte_young() to work regardless of CONFIG_SWAP. Make sure a minor fault is taken to set _PAGE_ACCESSED when it is not already set, regardless of the selection of CONFIG_SWAP. This adds at least 3 instructions to the TLB miss exception handlers fast path. Following patch will reduce this overhead. Also update the rotation instruction to the correct number of bits to reflect all changes done to _PAGE_ACCESSED over time. Fixes: d069cb43 ("powerpc/8xx: Don't touch ACCESSED when no SWAP.") Fixes: 5f356497 ("powerpc/8xx: remove unused _PAGE_WRITETHRU") Fixes: e0a8e0d9 ("powerpc/8xx: Handle PAGE_USER via APG bits") Fixes: 5b2753fc ("powerpc/8xx: Implementation of PAGE_EXEC") Fixes: a891c43b ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.") Cc: stable@vger.kernel.org Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.1602492856.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
The kernel expects pte_young() to work regardless of CONFIG_SWAP. Make sure a minor fault is taken to set _PAGE_ACCESSED when it is not already set, regardless of the selection of CONFIG_SWAP. Fixes: 2c74e258 ("powerpc/40x: Rework 40x PTE access and TLB miss") Cc: stable@vger.kernel.org Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b02ca2ed2d3676a096219b48c0f69ec982a75bcf.1602342801.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
The kernel expects pte_young() to work regardless of CONFIG_SWAP. Make sure a minor fault is taken to set _PAGE_ACCESSED when it is not already set, regardless of the selection of CONFIG_SWAP. Fixes: 84de6ab0 ("powerpc/603: don't handle PAGE_ACCESSED in TLB miss handlers.") Cc: stable@vger.kernel.org Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a44367744de54e2315b2f1a8cbbd7f88488072e0.1602342806.git.christophe.leroy@csgroup.eu
-
- Nov 04, 2020
-
-
Michael Ellerman authored
Andreas reported that commit ee0a49a6 ("powerpc/uaccess: Switch __put_user_size_allowed() to __put_user_asm_goto()") broke CLONE_CHILD_SETTID. Further inspection showed that the put_user() in schedule_tail() was missing entirely, the store not emitted by the compiler. <.schedule_tail>: mflr r0 std r0,16(r1) stdu r1,-112(r1) bl <.finish_task_switch> ld r9,2496(r3) cmpdi cr7,r9,0 bne cr7,<.schedule_tail+0x60> ld r3,392(r13) ld r9,1392(r3) cmpdi cr7,r9,0 beq cr7,<.schedule_tail+0x3c> li r4,0 li r5,0 bl <.__task_pid_nr_ns> nop bl <.calculate_sigpending> nop addi r1,r1,112 ld r0,16(r1) mtlr r0 blr nop nop nop bl <.__balance_callback> b <.schedule_tail+0x1c> Notice there are no stores other than to the stack. There should be a stw in there for the store to current->set_child_tid. This is only seen with GCC 4.9 era compilers (tested with 4.9.3 and 4.9.4), and only when CONFIG_PPC_KUAP is disabled. When CONFIG_PPC_KUAP=y, the inline asm that's part of the isync() and mtspr() inlined via allow_user_access() seems to be enough to avoid the bug. We already have a macro to work around this (or a similar bug), called asm_volatile_goto which includes an empty asm block to tickle the compiler into generating the right code. So use that. With this applied the code generation looks more like it will work: <.schedule_tail>: mflr r0 std r31,-8(r1) std r0,16(r1) stdu r1,-144(r1) std r3,112(r1) bl <._mcount> nop ld r3,112(r1) bl <.finish_task_switch> ld r9,2624(r3) cmpdi cr7,r9,0 bne cr7,<.schedule_tail+0xa0> ld r3,2408(r13) ld r31,1856(r3) cmpdi cr7,r31,0 beq cr7,<.schedule_tail+0x80> li r4,0 li r5,0 bl <.__task_pid_nr_ns> nop li r9,-1 clrldi r9,r9,12 cmpld cr7,r31,r9 bgt cr7,<.schedule_tail+0x80> lis r9,16 rldicr r9,r9,32,31 subf r9,r31,r9 cmpldi cr7,r9,3 ble cr7,<.schedule_tail+0x80> li r9,0 stw r3,0(r31) <-- stw nop bl <.calculate_sigpending> nop addi r1,r1,144 ld r0,16(r1) ld r31,-8(r1) mtlr r0 blr nop bl <.__balance_callback> b <.schedule_tail+0x30> Fixes: ee0a49a6 ("powerpc/uaccess: Switch __put_user_size_allowed() to __put_user_asm_goto()") Reported-by:
Andreas Schwab <schwab@linux-m68k.org> Tested-by:
Andreas Schwab <schwab@linux-m68k.org> Suggested-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201104111742.672142-1-mpe@ellerman.id.au
-
Ryan Kosta authored
Signed-off-by:
Ryan Kosta <ryanpkosta@gmail.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Fangrui Song authored
Commit 393f203f ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") added .weak directives to arch/x86/lib/mem*_64.S instead of changing the existing ENTRY macros to WEAK. This can lead to the assembly snippet .weak memcpy ... .globl memcpy which will produce a STB_WEAK memcpy with GNU as but STB_GLOBAL memcpy with LLVM's integrated assembler before LLVM 12. LLVM 12 (since https://reviews.llvm.org/D90108 ) will error on such an overridden symbol binding. Commit ef1e0315 ("x86/asm: Make some functions local") changed ENTRY in arch/x86/lib/memcpy_64.S to SYM_FUNC_START_LOCAL, which was ineffective due to the preceding .weak directive. Use the appropriate SYM_FUNC_START_WEAK instead. Fixes: 393f203f ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions") Fixes: ef1e0315 ("x86/asm: Make some functions local") Reported-by:
Sami Tolvanen <samitolvanen@google.com> Signed-off-by:
Fangrui Song <maskray@google.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Tested-by:
Nathan Chancellor <natechancellor@gmail.com> Tested-by:
Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201103012358.168682-1-maskray@google.com
-
Ard Biesheuvel authored
free_highpages() iterates over the free memblock regions in high memory, and marks each page as available for the memory management system. Until commit cddb5ddf ("arm, xtensa: simplify initialization of high memory pages") it rounded beginning of each region upwards and end of each region downwards. However, after that commit free_highmem() rounds the beginning and end of each region downwards, and we may end up freeing a page that is memblock_reserve()d, resulting in memory corruption. Restore the original rounding of the region boundaries to avoid freeing reserved pages. Fixes: cddb5ddf ("arm, xtensa: simplify initialization of high memory pages") Link: https://lore.kernel.org/r/20201029110334.4118-1-ardb@kernel.org/ Link: https://lore.kernel.org/r/20201031094345.6984-1-rppt@kernel.org Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Co-developed-by:
Mike Rapoport <rppt@linux.ibm.com> Signed-off-by:
Mike Rapoport <rppt@linux.ibm.com> Acked-by:
Max Filippov <jcmvbkbc@gmail.com>
-
- Nov 03, 2020
-
-
Niklas Schnelle authored
Under some circumstances in particular with "Reconfigure I/O Path" a zPCI function may first appear in Standby through a PCI event with PEC 0x0302 which initially makes it visible to the zPCI subsystem, Only after that is it configured with a zPCI event with PEC 0x0301. If the zbus is still missing a PCI function zero (devfn == 0) when the PCI event 0x0301 is handled zdev->zbus->bus is still NULL and gets dereferenced in common code. Check for this case and enable but don't scan the zPCI function. This matches what would happen if we immediately got the 0x0301 configuration request or the function was included in CLP List PCI. In all cases the PCI functions with devfn != 0 will be scanned once function 0 appears. Fixes: 3047766b ("s390/pci: fix enabling a reserved PCI function") Cc: <stable@vger.kernel.org> # 5.8 Signed-off-by:
Niklas Schnelle <schnelle@linux.ibm.com> Acked-by:
Pierre Morel <pmorel@linux.ibm.com> Signed-off-by:
Heiko Carstens <hca@linux.ibm.com>
-
Qian Cai authored
The call to rcu_cpu_starting() in smp_init_secondary() is not early enough in the CPU-hotplug onlining process, which results in lockdep splats as follows: WARNING: suspicious RCU usage ----------------------------- kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0. Call Trace: show_stack+0x158/0x1f0 dump_stack+0x1f2/0x238 __lock_acquire+0x2640/0x4dd0 lock_acquire+0x3a8/0xd08 _raw_spin_lock_irqsave+0xc0/0xf0 clockevents_register_device+0xa8/0x528 init_cpu_timer+0x33e/0x468 smp_init_secondary+0x11a/0x328 smp_start_secondary+0x82/0x88 This is avoided by moving the call to rcu_cpu_starting up near the beginning of the smp_init_secondary() function. Note that the raw_smp_processor_id() is required in order to avoid calling into lockdep before RCU has declared the CPU to be watched for readers. Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/ Signed-off-by:
Qian Cai <cai@redhat.com> Acked-by:
Paul E. McKenney <paulmck@kernel.org> Signed-off-by:
Heiko Carstens <hca@linux.ibm.com>
-
Heiko Carstens authored
Signed-off-by:
Heiko Carstens <hca@linux.ibm.com>
-
Heiko Carstens authored
Signed-off-by:
Heiko Carstens <hca@linux.ibm.com>
-
Heiko Carstens authored
Signed-off-by:
Heiko Carstens <hca@linux.ibm.com>
-
Gerald Schaefer authored
pmd/pud_deref() assume that they will never operate on large pmd/pud entries, and therefore only use the non-large _xxx_ENTRY_ORIGIN mask. With commit 9ec8fa8d ("s390/vmemmap: extend modify_pagetable() to handle vmemmap"), that assumption is no longer true, at least for pmd_deref(). In theory, we could end up with wrong addresses because some of the non-address bits of a large entry would not be masked out. In practice, this does not (yet) show any impact, because vmemmap_free() is currently never used for s390. Fix pmd/pud_deref() to check for the entry type and use the _xxx_ENTRY_ORIGIN_LARGE mask for large entries. While at it, also move pmd/pud_pfn() around, in order to avoid code duplication, because they do the same thing. Fixes: 9ec8fa8d ("s390/vmemmap: extend modify_pagetable() to handle vmemmap") Cc: <stable@vger.kernel.org> # 5.9 Signed-off-by:
Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by:
Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by:
Heiko Carstens <hca@linux.ibm.com>
-
Jean-Philippe Brucker authored
Commit 36dadef2 ("kprobes: Init kprobes in early_initcall") enabled using kprobes from early_initcall. Unfortunately at this point the hardware debug infrastructure is not operational. The OS lock may still be locked, and the hardware watchpoints may have unknown values when kprobe enables debug monitors to single-step instructions. Rather than using hardware single-step, append a BRK instruction after the instruction to be executed out-of-line. Fixes: 36dadef2 ("kprobes: Init kprobes in early_initcall") Suggested-by:
Will Deacon <will@kernel.org> Signed-off-by:
Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by:
Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20201103134900.337243-1-jean-philippe@linaro.org Signed-off-by:
Will Deacon <will@kernel.org>
-
Vanshidhar Konda authored
The current arm64 default config limits max NUMA nodes available on system to 4 (NODES_SHIFT = 2). Today's arm64 systems can reach or exceed 16 NUMA nodes. To accomodate current hardware and to fit NODES_SHIFT within page flags on arm64, increase NODES_SHIFT to 4. Signed-off-by:
Vanshidhar Konda <vanshikonda@os.amperecomputing.com> Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20201020173409.1266576-1-vanshikonda@os.amperecomputing.com/ Link: https://lore.kernel.org/r/20201030173050.1182876-1-vanshikonda@os.amperecomputing.com Signed-off-by:
Will Deacon <will@kernel.org>
-
- Nov 02, 2020
-
-
Vineet Gupta authored
ARC HSDK platform stopped booting on released v5.10-rc1, getting stuck in startup of non master SMP cores. This was bisected to upstream commit 7fef431b "(mm/page_alloc: place pages to tail in __free_pages_core())" That commit itself is harmless, it just exposed a subtle assumption in our platform code (hence CC'ing linux-mm just as FYI in case some other arches / platforms trip on it). The upstream commit is semantically disruptive as it reverses the order of page allocations (actually it can be good test for hardware verification to exercise different memory patterns altogether). For ARC HSDK platform that meant a remapped memory region (pertaining to unused Closely Coupled Memory) started getting used early for dynamice allocations, while not effectively remapped on all the cores, triggering memory error exception on those cores. The fix is to move the CCM remapping from early platform code to to early core boot code. And while it is undesirable to riddle common boot code with platform quirks, there is no other way to do this since the faltering code involves setting up stack itself so even function calls are not allowed at that point. If anyone is interested, all the gory details can be found at Link below. Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/32 Cc: David Hildenbrand <david@redhat.com> Cc: linux-mm@kvack.org Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-