- Aug 09, 2019
-
-
Steve Capper authored
In a later patch we will need to have a slightly larger VMEMMAP region to accommodate boot time selection between 48/52-bit kernel VAs. This patch modifies the formula for computing VMEMMAP_SIZE to depend explicitly on the PAGE_OFFSET and start of kernel addressable memory. (This allows for a slightly larger direct linear map in future). Signed-off-by:
Steve Capper <steve.capper@arm.com> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
vmemmap is a preprocessor definition that depends on a variable, memstart_addr. In a later patch we will need to expand the size of the VMEMMAP region and optionally modify vmemmap depending upon whether or not hardware support is available for 52-bit virtual addresses. This patch changes vmemmap to be a variable. As the old definition depended on a variable load, this should not affect performance noticeably. Signed-off-by:
Steve Capper <steve.capper@arm.com> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
When running with a 52-bit userspace VA and a 48-bit kernel VA we offset ttbr1_el1 to allow the kernel pagetables with a 52-bit PTRS_PER_PGD to be used for both userspace and kernel. Moving on to a 52-bit kernel VA we no longer require this offset to ttbr1_el1 should we be running on a system with HW support for 52-bit VAs. This patch introduces conditional logic to offset_ttbr1 to query SYS_ID_AA64MMFR2_EL1 whenever 52-bit VAs are selected. If there is HW support for 52-bit VAs then the ttbr1 offset is skipped. We choose to read a system register rather than vabits_actual because offset_ttbr1 can be called in places where the kernel data is not actually mapped. Calls to offset_ttbr1 appear to be made from rarely called code paths so this extra logic is not expected to adversely affect performance. Signed-off-by:
Steve Capper <steve.capper@arm.com> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
In order to support 52-bit kernel addresses detectable at boot time, one needs to know the actual VA_BITS detected. A new variable vabits_actual is introduced in this commit and employed for the KVM hypervisor layout, KASAN, fault handling and phys-to/from-virt translation where there would normally be compile time constants. In order to maintain performance in phys_to_virt, another variable physvirt_offset is introduced. Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Steve Capper <steve.capper@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
In order to support 52-bit kernel addresses detectable at boot time, the kernel needs to know the most conservative VA_BITS possible should it need to fall back to this quantity due to lack of hardware support. A new compile time constant VA_BITS_MIN is introduced in this patch and it is employed in the KASAN end address, KASLR, and EFI stub. For Arm, if 52-bit VA support is unavailable the fallback is to 48-bits. In other words: VA_BITS_MIN = min (48, VA_BITS) Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Steve Capper <steve.capper@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
The kernel page table dumper assumes that the placement of VA regions is constant and determined at compile time. As we are about to introduce variable VA logic, we need to be able to determine certain regions at boot time. Specifically the VA_START and KASAN_SHADOW_START will depend on whether or not the system is booted with 52-bit kernel VAs. This patch adds logic to the kernel page table dumper s.t. these regions can be computed at boot time. Signed-off-by:
Steve Capper <steve.capper@arm.com> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
KASAN_SHADOW_OFFSET is a constant that is supplied to gcc as a command line argument and affects the codegen of the inline address sanetiser. Essentially, for an example memory access: *ptr1 = val; The compiler will insert logic similar to the below: shadowValue = *(ptr1 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET) if (somethingWrong(shadowValue)) flagAnError(); This code sequence is inserted into many places, thus KASAN_SHADOW_OFFSET is essentially baked into many places in the kernel text. If we want to run a single kernel binary with multiple address spaces, then we need to do this with KASAN_SHADOW_OFFSET fixed. Thankfully, due to the way the KASAN_SHADOW_OFFSET is used to provide shadow addresses we know that the end of the shadow region is constant w.r.t. VA space size: KASAN_SHADOW_END = ~0 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET This means that if we increase the size of the VA space, the start of the KASAN region expands into lower addresses whilst the end of the KASAN region is fixed. Currently the arm64 code computes KASAN_SHADOW_OFFSET at build time via build scripts with the VA size used as a parameter. (There are build time checks in the C code too to ensure that expected values are being derived). It is sufficient, and indeed is a simplification, to remove the build scripts (and build time checks) entirely and instead provide KASAN_SHADOW_OFFSET values. This patch removes the logic to compute the KASAN_SHADOW_OFFSET in the arm64 Makefile, and instead we adopt the approach used by x86 to supply offset values in kConfig. To help debug/develop future VA space changes, the Makefile logic has been preserved in a script file in the arm64 Documentation folder. Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Steve Capper <steve.capper@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
In order to allow for a KASAN shadow that changes size at boot time, one must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the start address. Also, it is highly desirable to maintain the same function addresses in the kernel .text between VA sizes. Both of these requirements necessitate us to flip the kernel address space halves s.t. the direct linear map occupies the lower addresses. This patch puts the direct linear map in the lower addresses of the kernel VA range and everything else in the higher ranges. We need to adjust: *) KASAN shadow region placement logic, *) KASAN_SHADOW_OFFSET computation logic, *) virt_to_phys, phys_to_virt checks, *) page table dumper. These are all small changes, that need to take place atomically, so they are bundled into this commit. As part of the re-arrangement, a guard region of 2MB (to preserve alignment for fixed map) is added after the vmemmap. Otherwise the vmemmap could intersect with IS_ERR pointers. Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Steve Capper <steve.capper@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
Steve Capper authored
Currently there are assumptions about the alignment of VMEMMAP_START and PAGE_OFFSET that won't be valid after this series is applied. These assumptions are in the form of bitwise operators being used instead of addition and subtraction when calculating addresses. This patch replaces these bitwise operators with addition/subtraction. Signed-off-by:
Steve Capper <steve.capper@arm.com> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Will Deacon <will@kernel.org>
-
- Aug 03, 2019
-
-
Arnd Bergmann authored
ARM64 randdconfig builds regularly run into a build error, especially when NUMA_BALANCING and SPARSEMEM are enabled but not SPARSEMEM_VMEMMAP: #error "KASAN: not enough bits in page flags for tag" The last-cpuid bits are already contitional on the available space, so the result of the calculation is a bit random on whether they were already left out or not. Adding the kasan tag bits before last-cpuid makes it much more likely to end up with a successful build here, and should be reliable for randconfig at least, as long as that does not randomize NR_CPUS or NODES_SHIFT but uses the defaults. In order for the modified check to not trigger in the x86 vdso32 code where all constants are wrong (building with -m32), enclose all the definitions with an #ifdef. [arnd@arndb.de: build fix] Link: http://lkml.kernel.org/r/CAK8P3a3Mno1SWTcuAOT0Wa9VS15pdU6EfnkxLbDpyS55yO04+g@mail.gmail.com Link: http://lkml.kernel.org/r/20190722115520.3743282-1-arnd@arndb.de Link: https://lore.kernel.org/lkml/20190618095347.3850490-1-arnd@arndb.de/ Fixes: 2813b9c0 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Reviewed-by:
Andrey Konovalov <andreyknvl@google.com> Reviewed-by:
Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Aug 02, 2019
-
-
Masami Hiramatsu authored
Make debug exceptions visible from RCU so that synchronize_rcu() correctly track the debug exception handler. This also introduces sanity checks for user-mode exceptions as same as x86's ist_enter()/ist_exit(). The debug exception can interrupt in idle task. For example, it warns if we put a kprobe on a function called from idle task as below. The warning message showed that the rcu_read_lock() caused this problem. But actually, this means the RCU is lost the context which is already in NMI/IRQ. /sys/kernel/debug/tracing # echo p default_idle_call >> kprobe_events /sys/kernel/debug/tracing # echo 1 > events/kprobes/enable /sys/kernel/debug/tracing # [ 135.122237] [ 135.125035] ============================= [ 135.125310] WARNING: suspicious RCU usage [ 135.125581] 5.2.0-08445-g9187c508bdc7 #20 Not tainted [ 135.125904] ----------------------------- [ 135.126205] include/linux/rcupdate.h:594 rcu_read_lock() used illegally while idle! [ 135.126839] [ 135.126839] other info that might help us debug this: [ 135.126839] [ 135.127410] [ 135.127410] RCU used illegally from idle CPU! [ 135.127410] rcu_scheduler_active = 2, debug_locks = 1 [ 135.128114] RCU used illegally from extended quiescent state! [ 135.128555] 1 lock held by swapper/0/0: [ 135.128944] #0: (____ptrval____) (rcu_read_lock){....}, at: call_break_hook+0x0/0x178 [ 135.130499] [ 135.130499] stack backtrace: [ 135.131192] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0-08445-g9187c508bdc7 #20 [ 135.131841] Hardware name: linux,dummy-virt (DT) [ 135.132224] Call trace: [ 135.132491] dump_backtrace+0x0/0x140 [ 135.132806] show_stack+0x24/0x30 [ 135.133133] dump_stack+0xc4/0x10c [ 135.133726] lockdep_rcu_suspicious+0xf8/0x108 [ 135.134171] call_break_hook+0x170/0x178 [ 135.134486] brk_handler+0x28/0x68 [ 135.134792] do_debug_exception+0x90/0x150 [ 135.135051] el1_dbg+0x18/0x8c [ 135.135260] default_idle_call+0x0/0x44 [ 135.135516] cpu_startup_entry+0x2c/0x30 [ 135.135815] rest_init+0x1b0/0x280 [ 135.136044] arch_call_rest_init+0x14/0x1c [ 135.136305] start_kernel+0x4d4/0x500 [ 135.136597] So make debug exception visible to RCU can fix this warning. Reported-by:
Naresh Kamboju <naresh.kamboju@linaro.org> Acked-by:
Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Will Deacon <will@kernel.org>
-
Masami Hiramatsu authored
kprobes manipulates the interrupted PSTATE for single step, and doesn't restore it. Thus, if we put a kprobe where the pstate.D (debug) masked, the mask will be cleared after the kprobe hits. Moreover, in the most complicated case, this can lead a kernel crash with below message when a nested kprobe hits. [ 152.118921] Unexpected kernel single-step exception at EL1 When the 1st kprobe hits, do_debug_exception() will be called. At this point, debug exception (= pstate.D) must be masked (=1). But if another kprobes hits before single-step of the first kprobe (e.g. inside user pre_handler), it unmask the debug exception (pstate.D = 0) and return. Then, when the 1st kprobe setting up single-step, it saves current DAIF, mask DAIF, enable single-step, and restore DAIF. However, since "D" flag in DAIF is cleared by the 2nd kprobe, the single-step exception happens soon after restoring DAIF. This has been introduced by commit 7419333f ("arm64: kprobe: Always clear pstate.D in breakpoint exception handler") To solve this issue, this stores all DAIF bits and restore it after single stepping. Reported-by:
Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: 7419333f ("arm64: kprobe: Always clear pstate.D in breakpoint exception handler") Reviewed-by:
James Morse <james.morse@arm.com> Tested-by:
James Morse <james.morse@arm.com> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Will Deacon <will@kernel.org>
-
- Aug 01, 2019
-
-
Qian Cai authored
When CONFIG_KASAN_SW_TAGS=n, set_tag() is compiled away. GCC throws a warning, mm/kasan/common.c: In function '__kasan_kmalloc': mm/kasan/common.c:464:5: warning: variable 'tag' set but not used [-Wunused-but-set-variable] u8 tag = 0xff; ^~~ Fix it by making __tag_set() a static inline function the same as arch_kasan_set_tag() in mm/kasan/kasan.h for consistency because there is a macro in arch/arm64/include/asm/kasan.h, #define arch_kasan_set_tag(addr, tag) __tag_set(addr, tag) However, when CONFIG_DEBUG_VIRTUAL=n and CONFIG_SPARSEMEM_VMEMMAP=y, page_to_virt() will call __tag_set() with incorrect type of a parameter, so fix that as well. Also, still let page_to_virt() return "void *" instead of "const void *", so will not need to add a similar cast in lowmem_page_address(). Signed-off-by:
Qian Cai <cai@lca.pw> Signed-off-by:
Will Deacon <will@kernel.org>
-
Qian Cai authored
GCC throws a warning, arch/arm64/mm/mmu.c: In function 'pud_free_pmd_page': arch/arm64/mm/mmu.c:1033:8: warning: variable 'pud' set but not used [-Wunused-but-set-variable] pud_t pud; ^~~ because pud_table() is a macro and compiled away. Fix it by making it a static inline function and for pud_sect() as well. Signed-off-by:
Qian Cai <cai@lca.pw> Signed-off-by:
Will Deacon <will@kernel.org>
-
Masami Hiramatsu authored
Remove rcu_read_lock()/rcu_read_unlock() from debug exception handlers since we are sure those are not preemptible and interrupts are off. Acked-by:
Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Will Deacon <will@kernel.org>
-
Masami Hiramatsu authored
Prohibit probing on return_address() and subroutines which is called from return_address(), since the it is invoked from trace_hardirqs_off() which is also kprobe blacklisted. Reported-by:
Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Will Deacon <will@kernel.org>
-
Julien Thierry authored
On a system with two security states, if SCR_EL3.FIQ is cleared, non-secure IRQ priorities get shifted to fit the secure view but priority masks aren't. On such system, it turns out that GIC_PRIO_IRQON masks the priority of normal interrupts, which obviously ends up in a hang. Increase GIC_PRIO_IRQON value (i.e. lower priority) to make sure interrupts are not blocked by it. Cc: Oleg Nesterov <oleg@redhat.com> Fixes: bd82d4bd ("arm64: Fix incorrect irqflag restore for priority masking") Acked-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Julien Thierry <julien.thierry.kdev@gmail.com> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> [will: fixed Fixes: tag] Signed-off-by:
Will Deacon <will@kernel.org>
-
James Bottomley authored
Apparently we don't have an archclean target in our arch/parisc/Makefile, so files in there never get cleaned out by make mrproper. This, in turn means that the sizes.h file in arch/parisc/boot/compressed never gets removed and worse, when you transition to an O=build/parisc[64] build model it overrides the generated file. The upshot being my bzImage was building with a SZ_end that was too small. I fixed it by making mrproper clean everything. Signed-off-by:
James Bottomley <James.Bottomley@HansenPartnership.com> Cc: stable@vger.kernel.org # v4.20+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
Helge Deller authored
Same as on x86-64, strip the .comment, .note and debug sections from the Linux kernel before creating the compressed image for the boot loader. Reported-by:
James Bottomley <James.Bottomley@HansenPartnership.com> Reported-by:
Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org # v4.20+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
Helge Deller authored
With debug info enabled (CONFIG_DEBUG_INFO=y) the resulting vmlinux may get that huge that we need to increase the start addresss for the decompression text section otherwise one will face a linker error. Reported-by:
Sven Schnelle <svens@stackframe.org> Tested-by:
Sven Schnelle <svens@stackframe.org> Cc: stable@vger.kernel.org # v4.14+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
- Jul 31, 2019
-
-
Paul Walmsley authored
Align the RV64 defconfig to the output of "make savedefconfig" to avoid unnecessary deltas for future defconfig patches. This patch should have no runtime functional impact. Signed-off-by:
Paul Walmsley <paul.walmsley@sifive.com> Reviewed-by:
Bin Meng <bmeng.cn@gmail.com>
-
Paul Walmsley authored
On FU540-based systems, the "timebase-frequency" (RTCCLK) is sourced from an external crystal located on the PCB. Thus the timebase-frequency DT property should be defined by the board that uses the SoC, not the SoC itself. Drop the superfluous timebase-frequency property from the SoC DT data. (It's already present in the board DT data.) Signed-off-by:
Paul Walmsley <paul.walmsley@sifive.com> Reviewed-by:
Bin Meng <bmeng.cn@gmail.com>
-
Mao Han authored
This patch fix following perf record error by linking vdso.so with build id. perf.data perf.data.old [ perf record: Woken up 1 times to write data ] free(): double free detected in tcache 2 Aborted perf record use filename__read_build_id(util/symbol-minimal.c) to get build id when libelf is not supported. When vdso.so is linked without build id, the section size of PT_NOTE will be zero, buf size will realloc to zero and cause memory corruption. Signed-off-by:
Mao Han <han_mao@c-sky.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Signed-off-by:
Paul Walmsley <paul.walmsley@sifive.com>
-
Qian Cai authored
GCC throws out this warning on arm64. drivers/firmware/efi/libstub/arm-stub.c: In function 'efi_entry': drivers/firmware/efi/libstub/arm-stub.c:132:22: warning: variable 'si' set but not used [-Wunused-but-set-variable] Fix it by making free_screen_info() a static inline function. Acked-by:
Will Deacon <will@kernel.org> Signed-off-by:
Qian Cai <cai@lca.pw> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Will Deacon authored
If CTR_EL0.{CWG,ERG} are 0b0000 then they must be interpreted to have their architecturally maximum values, which defeats the use of FTR_HIGHER_SAFE when sanitising CPU ID registers on heterogeneous machines. Introduce FTR_HIGHER_OR_ZERO_SAFE so that these fields effectively saturate at zero. Fixes: 3c739b57 ("arm64: Keep track of CPU feature registers") Cc: <stable@vger.kernel.org> # 4.4.x- Reviewed-by:
Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Will Deacon <will@kernel.org> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Vincenzo Frascino authored
Using an old .config in combination with "make oldconfig" can cause an incorrect detection of the compat compiler: $ grep CROSS_COMPILE_COMPAT .config CONFIG_CROSS_COMPILE_COMPAT_VDSO="" $ make oldconfig && make arch/arm64/Makefile:58: gcc not found, check CROSS_COMPILE_COMPAT. Stop. Accordingly to the section 7.2 of the GNU Make manual "Syntax of Conditionals", "When the value results from complex expansions of variables and functions, expansions you would consider empty may actually contain whitespace characters and thus are not seen as empty. However, you can use the strip function to avoid interpreting whitespace as a non-empty value." Fix the issue adding strip to the CROSS_COMPILE_COMPAT string evaluation. Reported-by:
Matteo Croce <mcroce@redhat.com> Tested-by:
Matteo Croce <mcroce@redhat.com> Acked-by:
Will Deacon <will@kernel.org> Signed-off-by:
Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Sven Schnelle authored
Assume the following ftrace code sequence that was patched in earlier by ftrace_make_call(): PAGE A: ffc: addr of ftrace_caller() PAGE B: 000: 0x6fc10080 /* stw,ma r1,40(sp) */ 004: 0x48213fd1 /* ldw -18(r1),r1 */ 008: 0xe820c002 /* bv,n r0(r1) */ 00c: 0xe83f1fdf /* b,l,n .-c,r1 */ When a Code sequences that is to be patched spans a page break, we might have already cleared the part on the PAGE A. If an interrupt is coming in during the remap of the fixed mapping to PAGE B, it might execute the patched function with only parts of the FTRACE code cleared. To prevent this, clear the jump to our mini trampoline first, and clear the remaining parts after this. This might also happen when patch_text() patches a function that it calls during remap. Signed-off-by:
Sven Schnelle <svens@stackframe.org> Cc: <stable@vger.kernel.org> # 5.2+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
Masahiro Yamada authored
'default_defconfig' is an awkward name since 'defconfig' is the default. Let's simply say 'defconfig' like other architectures. You can drop the KBUILD_DEFCONFIG define by following the standard naming. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by:
Helge Deller <deller@gmx.de>
-
Helge Deller authored
In fpudispatch.c we see a lot of fall-through warnings, but for this file we prefer to not mark the switches and instead keep it in it's original state as it's copied from HP-UX. Fixes: a035d552 ("Makefile: Globally enable fall-through warning") Signed-off-by:
Helge Deller <deller@gmx.de>
-
Helge Deller authored
Fix a fall-through warning in fault.c. Fixes: a035d552 ("Makefile: Globally enable fall-through warning") Signed-off-by:
Helge Deller <deller@gmx.de>
-
Christophe Leroy authored
Due to commit 4a6d8cf9 ("powerpc/mm: don't use pte_alloc_kernel() until slab is available on PPC32"), pte_alloc_kernel() cannot be used during early KASAN init. Fix it by using memblock_alloc() instead. Fixes: 2edb16ef ("powerpc/32: Add KASAN support") Cc: stable@vger.kernel.org # v5.2+ Reported-by:
Erhard F. <erhard_f@mailbox.org> Signed-off-by:
Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/da89670093651437f27d2975224712e0a130b055.1564552796.git.christophe.leroy@c-s.fr
-
- Jul 30, 2019
-
-
Thomas Gleixner authored
The generic VDSO implementation uses the Y2038 safe clock_gettime64() and clock_getres_time64() syscalls as fallback for 32bit VDSO. This breaks seccomp setups because these syscalls might be not (yet) allowed. Implement the 32bit variants which use the legacy syscalls and select the variant in the core library. The 64bit time variants are not removed because they are required for the time64 based vdso accessors. Fixes: 00b26474 ("lib/vdso: Provide generic VDSO implementation") Reported-by:
Sean Christopherson <sean.j.christopherson@intel.com> Reported-by:
Paul Bolle <pebolle@tiscali.nl> Suggested-by:
Andy Lutomirski <luto@kernel.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by:
Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lkml.kernel.org/r/20190728131648.971361611@linutronix.de
-
Thomas Gleixner authored
The generic VDSO implementation uses the Y2038 safe clock_gettime64() and clock_getres_time64() syscalls as fallback for 32bit VDSO. This breaks seccomp setups because these syscalls might be not (yet) allowed. Implement the 32bit variants which use the legacy syscalls and select the variant in the core library. The 64bit time variants are not removed because they are required for the time64 based vdso accessors. Fixes: 7ac87074 ("x86/vdso: Switch to generic vDSO implementation") Reported-by:
Sean Christopherson <sean.j.christopherson@intel.com> Reported-by:
Paul Bolle <pebolle@tiscali.nl> Suggested-by:
Andy Lutomirski <luto@kernel.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by:
Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20190728131648.879156507@linutronix.de
-
Michael Ellerman authored
Mark switch cases where we are expecting to fall through. Fixes errors such as below, seen with mpc85xx_defconfig: arch/powerpc/kernel/align.c: In function 'emulate_spe': arch/powerpc/kernel/align.c:178:8: error: this statement may fall through ret |= __get_user_inatomic(temp.v[3], p++); ^~ Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190730141917.21817-1-mpe@ellerman.id.au
-
Aneesh Kumar K.V authored
Currently, nvdimm subsystem expects the device numa node for SCM device to be an online node. It also doesn't try to bring the device numa node online. Hence if we use a non-online numa node as device node we hit crashes like below. This is because we try to access uninitialized NODE_DATA in different code paths. cpu 0x0: Vector: 300 (Data Access) at [c0000000fac53170] pc: c0000000004bbc50: ___slab_alloc+0x120/0xca0 lr: c0000000004bc834: __slab_alloc+0x64/0xc0 sp: c0000000fac53400 msr: 8000000002009033 dar: 73e8 dsisr: 80000 current = 0xc0000000fabb6d80 paca = 0xc000000003870000 irqmask: 0x03 irq_happened: 0x01 pid = 7, comm = kworker/u16:0 Linux version 5.2.0-06234-g76bd729b2644 (kvaneesh@ltc-boston123) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #135 SMP Thu Jul 11 05:36:30 CDT 2019 enter ? for help [link register ] c0000000004bc834 __slab_alloc+0x64/0xc0 [c0000000fac53400] c0000000fac53480 (unreliable) [c0000000fac53500] c0000000004bc818 __slab_alloc+0x48/0xc0 [c0000000fac53560] c0000000004c30a0 __kmalloc_node_track_caller+0x3c0/0x6b0 [c0000000fac535d0] c000000000cfafe4 devm_kmalloc+0x74/0xc0 [c0000000fac53600] c000000000d69434 nd_region_activate+0x144/0x560 [c0000000fac536d0] c000000000d6b19c nd_region_probe+0x17c/0x370 [c0000000fac537b0] c000000000d6349c nvdimm_bus_probe+0x10c/0x230 [c0000000fac53840] c000000000cf3cc4 really_probe+0x254/0x4e0 [c0000000fac538d0] c000000000cf429c driver_probe_device+0x16c/0x1e0 [c0000000fac53950] c000000000cf0b44 bus_for_each_drv+0x94/0x130 [c0000000fac539b0] c000000000cf392c __device_attach+0xdc/0x200 [c0000000fac53a50] c000000000cf231c bus_probe_device+0x4c/0xf0 [c0000000fac53a90] c000000000ced268 device_add+0x528/0x810 [c0000000fac53b60] c000000000d62a58 nd_async_device_register+0x28/0xa0 [c0000000fac53bd0] c0000000001ccb8c async_run_entry_fn+0xcc/0x1f0 [c0000000fac53c50] c0000000001bcd9c process_one_work+0x46c/0x860 [c0000000fac53d20] c0000000001bd4f4 worker_thread+0x364/0x5f0 [c0000000fac53db0] c0000000001c7260 kthread+0x1b0/0x1c0 [c0000000fac53e20] c00000000000b954 ret_from_kernel_thread+0x5c/0x68 The patch tries to fix this by picking the nearest online node as the SCM node. This does have a problem of us losing the information that SCM node is equidistant from two other online nodes. If applications need to understand these fine-grained details we should express then like x86 does via /sys/devices/system/node/nodeX/accessY/initiators/ With the patch we get # numactl -H available: 2 nodes (0-1) node 0 cpus: node 0 size: 0 MB node 0 free: 0 MB node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 node 1 size: 130865 MB node 1 free: 129130 MB node distances: node 0 1 0: 10 20 1: 20 10 # cat /sys/bus/nd/devices/region0/numa_node 0 # dmesg | grep papr_scm [ 91.332305] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Region registered with target node 2 and online node 0 Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190729095128.23707-1-aneesh.kumar@linux.ibm.com
-
- Jul 29, 2019
-
-
Heiko Carstens authored
Commit a035d552 ("Makefile: Globally enable fall-through warning") enables fall-through warnings globally. Add missing annotations. Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com>
-
Vasily Gorbik authored
Since gmap_test_and_clear_dirty_pmd is not exported and has no reason to be globally visible make it static to avoid the following sparse warning: arch/s390/mm/gmap.c:2427:6: warning: symbol 'gmap_test_and_clear_dirty_pmd' was not declared. Should it be static? Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com>
-
Vasily Gorbik authored
Include <asm/kexec.h> into machine_kexec_reloc.c to expose arch_kexec_do_relocs declaration and avoid the following sparse warnings: arch/s390/kernel/machine_kexec_reloc.c:4:5: warning: symbol 'arch_kexec_do_relocs' was not declared. Should it be static? arch/s390/boot/../kernel/machine_kexec_reloc.c:4:5: warning: symbol 'arch_kexec_do_relocs' was not declared. Should it be static? Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com>
-
Vasily Gorbik authored
Since there is really no reason for cf_diag_csd per cpu variable to be globally visible make it static to avoid the following sparse warning: arch/s390/kernel/perf_cpum_cf_diag.c:37:1: warning: symbol 'cf_diag_csd' was not declared. Should it be static? Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com>
-
Vasily Gorbik authored
Include <asm/xor.h> into arch/s390/lib/xor.c to expose xor_block_xc declaration and avoid the following sparse warning: arch/s390/lib/xor.c:128:27: warning: symbol 'xor_block_xc' was not declared. Should it be static? Signed-off-by:
Vasily Gorbik <gor@linux.ibm.com>
-