- Feb 08, 2021
-
-
Suzuki K Poulose authored
The erratum 1024718 affects Cortex-A55 r0p0 to r2p0. However we apply the work around for r0p0 - r1p0. Unfortunately this won't be fixed for the future revisions for the CPU. Thus extend the work around for all versions of A55, to cover for r2p0 and any future revisions. Cc: stable@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by:
Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20210203230057.3961239-1-suzuki.poulose@arm.com [will: Update Kconfig help text] Signed-off-by:
Will Deacon <will@kernel.org>
-
- Jan 16, 2021
-
-
Atish Patra authored
Linux kernel can only map 1GB of address space for RV32 as the page offset is set to 0xC0000000. The current description in the Kconfig is confusing as it indicates that RV32 can support 2GB of physical memory. That is simply not true for current kernel. In future, a 2GB split support can be added to allow 2GB physical address space. Reviewed-by:
Anup Patel <anup@brainfault.org> Signed-off-by:
Atish Patra <atish.patra@wdc.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Atish Patra authored
Currently, linux kernel can not use last 4k bytes of addressable space because IS_ERR_VALUE macro treats those as an error. This will be an issue for RV32 as any memblock allocator potentially allocate chunk of memory from the end of DRAM (2GB) leading bad address error even though the address was technically valid. Fix this issue by limiting the memblock if available memory spans the entire address space. Reviewed-by:
Anup Patel <anup@brainfault.org> Signed-off-by:
Atish Patra <atish.patra@wdc.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Atish Patra authored
Currently, resource tree allocates memory blocks while iterating on the list. It leads to following kernel warning because memblock allocation also invokes memory block reservation API. [ 0.000000] ------------[ cut here ]------------ [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/resource.c:795 __insert_resource+0x8e/0xd0 [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0-00022-ge20097fb37e2-dirty #549 [ 0.000000] epc: c00125c2 ra : c001262c sp : c1c01f50 [ 0.000000] gp : c1d456e0 tp : c1c0a980 t0 : ffffcf20 [ 0.000000] t1 : 00000000 t2 : 00000000 s0 : c1c01f60 [ 0.000000] s1 : ffffcf00 a0 : ffffff00 a1 : c1c0c0c4 [ 0.000000] a2 : 80c12b15 a3 : 80402000 a4 : 80402000 [ 0.000000] a5 : c1c0c0c4 a6 : 80c12b15 a7 : f5faf600 [ 0.000000] s2 : c1c0c0c4 s3 : c1c0e000 s4 : c1009a80 [ 0.000000] s5 : c1c0c000 s6 : c1d48000 s7 : c1613b4c [ 0.000000] s8 : 00000fff s9 : 80000200 s10: c1613b40 [ 0.000000] s11: 00000000 t3 : c1d4a000 t4 : ffffffff This is also unnecessary as we can pre-compute the total memblocks required for each memory region and allocate it before the loop. It save precious boot time not going through memblock allocation code every time. Fixes: 00ab027a3b82 ("RISC-V: Add kernel image sections to the resource tree") Reviewed-by:
Anup Patel <anup@brainfault.org> Tested-by:
Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Atish Patra <atish.patra@wdc.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
- Jan 15, 2021
-
-
Mark Rutland authored
The kbuild test robot reports that when building with W=1, GCC will warn for a couple of missing prototypes in syscall.c: | arch/arm64/kernel/syscall.c:157:6: warning: no previous prototype for 'do_el0_svc' [-Wmissing-prototypes] | 157 | void do_el0_svc(struct pt_regs *regs) | | ^~~~~~~~~~ | arch/arm64/kernel/syscall.c:164:6: warning: no previous prototype for 'do_el0_svc_compat' [-Wmissing-prototypes] | 164 | void do_el0_svc_compat(struct pt_regs *regs) | | ^~~~~~~~~~~~~~~~~ While this isn't a functional problem, as a general policy we should include the prototype for functions wherever possible to catch any accidental divergence between the prototype and implementation. Here we can easily include <asm/exception.h>, so let's do so. While there are a number of warnings elsewhere and some warnings enabled under W=1 are of questionable benefit, this change helps to make the code more robust as it evolved and reduces the noise somewhat, so it seems worthwhile. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Reported-by:
kernel test robot <lkp@intel.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/202101141046.n8iPO3mw-lkp@intel.com Link: https://lore.kernel.org/r/20210114124812.17754-1-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Kefeng Wang authored
Using global sp_in_global directly to fix the following warning, arch/riscv/kernel/stacktrace.c:31:3: warning: ‘register’ is not at beginning of declaration [-Wold-style-declaration] 31 | const register unsigned long current_sp = sp_in_global; | ^~~~~ Signed-off-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
- Jan 14, 2021
-
-
Sagar Shrikant Kadam authored
Ethernet phy VSC8541-01 on HiFive Unleashed has its reset line connected to a gpio, so enable GPIO driver's required to reset the phy. Signed-off-by:
Sagar Shrikant Kadam <sagar.kadam@sifive.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Sagar Shrikant Kadam authored
The GEMGXL_RST line on HiFive Unleashed is pulled low and is using GPIO number 12. Add these reset-gpio details to dt-node using which the linux phylib can reset the phy. Signed-off-by:
Sagar Shrikant Kadam <sagar.kadam@sifive.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Sagar Shrikant Kadam authored
HiFive unleashed A00 board has VSC8541-01 ethernet phy, this device is identified as a Revision B device as described in device identification registers. In order to use this phy in the unmanaged mode, it requires a specific reset sequence of logical 0-1-0-1 transition on the NRESET pin as documented here [1]. Currently, the bootloader (fsbl or u-boot-spl) takes care of the phy reset. If due to some reason the phy device hasn't received the reset by the prior stages before the linux macb driver comes into the picture, the MACB mii bus gets probed but the mdio scan fails and is not even able to read the phy ID registers. It gives an error message: "libphy: MACB_mii_bus: probed mdio_bus 10090000.ethernet-ffffffff: MDIO device at address 0 is missing." Thus adding the device OUI (Organizationally Unique Identifier) to the phy device node helps to probe the phy device. [1]: VSC8541-01 datasheet: https://www.mouser.com/ds/2/523/Microsemi_VSC8541-01_Datasheet_10496_V40-1148034.pdf Signed-off-by:
Sagar Shrikant Kadam <sagar.kadam@sifive.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Andreas Schwab authored
The second argument of __kernel_clock_gettime64 points to a struct __kernel_timespec, with 64-bit time_t, so use the clock_gettime64 syscall in the fallback function for the 32-bit VDSO. Similarly, clock_getres_fallback should use the clock_getres_time64 syscall, though it isn't yet called from the 32-bit VDSO. Fixes: d0e3fc69 ("powerpc/vdso: Provide __kernel_clock_gettime64() on vdso32") Signed-off-by:
Andreas Schwab <schwab@linux-m68k.org> [chleroy: Moved into a single #ifdef __powerpc64__ block] Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0c0ab0eb3cc80687c326f76ff0dd5762b8812ecc.1610452505.git.christophe.leroy@csgroup.eu
-
Nick Hu authored
Use virtual address instead of physical address when translating the address to shadow memory by kasan_mem_to_shadow(). Signed-off-by:
Nick Hu <nickhu@andestech.com> Signed-off-by:
Nylon Chen <nylon7@andestech.com> Fixes: b10d6bca ("arch, drivers: replace for_each_membock() with for_each_mem_range()") Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
- Jan 13, 2021
-
-
David Woodhouse authored
Only the IPI-related functions in the smp_ops should be conditional on the vector callback being available. The rest should still happen: • xen_hvm_smp_prepare_boot_cpu() This function does two things, both of which should still happen if there is no vector callback support. The call to xen_vcpu_setup() for vCPU0 should still happen as it just sets up the vcpu_info for CPU0. That does happen for the secondary vCPUs too, from xen_cpu_up_prepare_hvm(). The second thing it does is call xen_init_spinlocks(), which perhaps counter-intuitively should *also* still be happening in the case without vector callbacks, so that it can clear its local xen_pvspin flag and disable the virt_spin_lock_key accordingly. Checking xen_have_vector_callback in xen_init_spinlocks() itself would affect PV guests, so set the global nopvspin flag in xen_hvm_smp_init() instead, when vector callbacks aren't available. • xen_hvm_smp_prepare_cpus() This does some IPI-related setup by calling xen_smp_intr_init() and xen_init_lock_cpu(), which can be made conditional. And it sets the xen_vcpu_id to XEN_VCPU_ID_INVALID for all possible CPUS, which does need to happen. • xen_smp_cpus_done() This offlines any vCPUs which doesn't fit in the global shared_info page, if separate vcpu_info placement isn't available. That part also needs to happen regardless of vector callback support. • xen_hvm_cpu_die() This doesn't actually do anything other than commin_cpu_die() right right now in the !vector_callback case; all three teardown functions it calls should be no-ops. But to guard against future regressions it's useful to call it anyway, and for it to explicitly check for xen_have_vector_callback before calling those additional functions. Signed-off-by:
David Woodhouse <dwmw@amazon.co.uk> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-6-dwmw2@infradead.org Signed-off-by:
Juergen Gross <jgross@suse.com>
-
David Woodhouse authored
In the case where xen_have_vector_callback is false, we still register the IPI vectors in xen_smp_intr_init() for the secondary CPUs even though they aren't going to be used. Stop doing that. Signed-off-by:
David Woodhouse <dwmw@amazon.co.uk> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-5-dwmw2@infradead.org Signed-off-by:
Juergen Gross <jgross@suse.com>
-
David Woodhouse authored
It's useful to be able to test non-vector event channel delivery, to make sure Linux will work properly on older Xen which doesn't have it. It's also useful for those working on Xen and Xen-compatible hypervisors, because there are guest kernels still in active use which use PCI INTX even when vector delivery is available. Signed-off-by:
David Woodhouse <dwmw@amazon.co.uk> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210106153958.584169-4-dwmw2@infradead.org Signed-off-by:
Juergen Gross <jgross@suse.com>
-
David Woodhouse authored
For a while, event channel notification via the PCI platform device has been broken, because we attempt to communicate with xenstore before we even have notifications working, with the xs_reset_watches() call in xs_init(). We tend to get away with this on Xen versions below 4.0 because we avoid calling xs_reset_watches() anyway, because xenstore might not cope with reading a non-existent key. And newer Xen *does* have the vector callback support, so we rarely fall back to INTX/GSI delivery. To fix it, clean up a bit of the mess of xs_init() and xenbus_probe() startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM case, deferring it to be called from xenbus_probe() in the XS_HVM case instead. Then fix up the invocation of xenbus_probe() to happen either from its device_initcall if the callback is available early enough, or when the callback is finally set up. This means that the hack of calling xenbus_probe() from a workqueue after the first interrupt, or directly from the PCI platform device setup, is no longer needed. Signed-off-by:
David Woodhouse <dwmw@amazon.co.uk> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210113132606.422794-2-dwmw2@infradead.org Signed-off-by:
Juergen Gross <jgross@suse.com>
-
Arnd Bergmann authored
With UBSAN enabled and building with clang, there are occasionally warnings like WARNING: modpost: vmlinux.o(.text+0xc533ec): Section mismatch in reference from the function arch_atomic64_or() to the variable .init.data:numa_nodes_parsed The function arch_atomic64_or() references the variable __initdata numa_nodes_parsed. This is often because arch_atomic64_or lacks a __initdata annotation or the annotation of numa_nodes_parsed is wrong. for functions that end up not being inlined as intended but operating on __initdata variables. Mark these as __always_inline, along with the corresponding asm-generic wrappers. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210108092024.4034860-1-arnd@kernel.org Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Jianlin Lv authored
S_FRAME_SIZE is the size of the pt_regs structure, no longer the size of the kernel stack frame, the name is misleading. In keeping with arm32, rename S_FRAME_SIZE to PT_REGS_SIZE. Signed-off-by:
Jianlin Lv <Jianlin.Lv@arm.com> Acked-by:
Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210112015813.2340969-1-Jianlin.Lv@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Will Deacon authored
This reverts commit 367c820e. lockup_detector_init() makes heavy use of per-cpu variables and must be called with preemption disabled. Usually, it's handled early during boot in kernel_init_freeable(), before SMP has been initialised. Since we do not know whether or not our PMU interrupt can be signalled as an NMI until considerably later in the boot process, the Arm PMU driver attempts to re-initialise the lockup detector off the back of a device_initcall(). Unfortunately, this is called from preemptible context and results in the following splat: | BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 | caller is debug_smp_processor_id+0x20/0x2c | CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x3c0 | show_stack+0x20/0x6c | dump_stack+0x2f0/0x42c | check_preemption_disabled+0x1cc/0x1dc | debug_smp_processor_id+0x20/0x2c | hardlockup_detector_event_create+0x34/0x18c | hardlockup_detector_perf_init+0x2c/0x134 | watchdog_nmi_probe+0x18/0x24 | lockup_detector_init+0x44/0xa8 | armv8_pmu_driver_init+0x54/0x78 | do_one_initcall+0x184/0x43c | kernel_init_freeable+0x368/0x380 | kernel_init+0x1c/0x1cc | ret_from_fork+0x10/0x30 Rather than bodge this with raw_smp_processor_id() or randomly disabling preemption, simply revert the culprit for now until we figure out how to do this properly. Reported-by:
Lecopzer Chen <lecopzer.chen@mediatek.com> Signed-off-by:
Will Deacon <will@kernel.org> Acked-by:
Mark Rutland <mark.rutland@arm.com> Cc: Sumit Garg <sumit.garg@linaro.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
All EL0 returns go via ret_to_user(), which masks IRQs and notifies lockdep and tracing before calling into do_notify_resume(). Therefore, there's no need for do_notify_resume() to call trace_hardirqs_off(), and the comment is stale. The call is simply redundant. In ret_to_user() we call exit_to_user_mode(), which notifies lockdep and tracing the IRQs will be enabled in userspace, so there's no need for el0_svc_common() to call trace_hardirqs_on() before returning. Further, at the start of ret_to_user() we call trace_hardirqs_off(), so not only is this redundant, but it is immediately undone. In addition to being redundant, the trace_hardirqs_on() in el0_svc_common() leaves lockdep inconsistent with the hardware state, and is liable to cause issues for any C code or instrumentation between this and the call to trace_hardirqs_off() which undoes it in ret_to_user(). This patch removes the redundant tracing calls and associated stale comments. Fixes: 23529049 ("arm64: entry: fix non-NMI user<->kernel transitions") Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Acked-by:
Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210107145310.44616-1-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Menglong Dong authored
The type of 'r' in octeon_irq_init_ciu is 'unsigned int', so 'r < 0' can't be true. Fix this by change the type of 'r' and 'i' from 'unsigned int' to 'int'. As 'i' won't be negative, this change works. Fixes: 99fbc70f ("MIPS: Octeon: irq: Alloc desc before configuring IRQ") Signed-off-by:
Menglong Dong <dong.menglong@zte.com.cn> Reviewed-by:
Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
Alexander Lobakin authored
LLVM-built Linux triggered a boot hangup with KASLR enabled. arch/mips/kernel/relocate.c:get_random_boot() uses linux_banner, which is a string constant, as a random seed, but accesses it as an array of unsigned long (in rotate_xor()). When the address of linux_banner is not aligned to sizeof(long), such access emits unaligned access exception and hangs the kernel. Use PTR_ALIGN() to align input address to sizeof(long) and also align down the input length to prevent possible access-beyond-end. Fixes: 405bc8fd ("MIPS: Kernel: Implement KASLR using CONFIG_RELOCATABLE") Cc: stable@vger.kernel.org # 4.7+ Signed-off-by:
Alexander Lobakin <alobakin@pm.me> Tested-by:
Nathan Chancellor <natechancellor@gmail.com> Reviewed-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
Guo Ren authored
The patch fix commit: ad5d1122 ("riscv: use vDSO common flow to reduce the latency of the time-related functions"). The GENERIC_TIME_VSYSCALL should be CONFIG_GENERIC_TIME_VSYSCALL or vgettimeofday won't work. Signed-off-by:
Guo Ren <guoren@linux.alibaba.com> Reviewed-by:
Pekka Enberg <penberg@kernel.org> Fixes: ad5d1122 ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Kefeng Wang authored
Use raw_smp_processor_id instead of smp_processor_id() to fix warning, BUG: using smp_processor_id() in preemptible [00000000] code: init/1 caller is debug_smp_processor_id+0x1c/0x26 CPU: 0 PID: 1 Comm: init Not tainted 5.10.0-rc4 #211 Call Trace: walk_stackframe+0x0/0xaa show_stack+0x32/0x3e dump_stack+0x76/0x90 check_preemption_disabled+0xaa/0xac debug_smp_processor_id+0x1c/0x26 get_cache_size+0x18/0x68 load_elf_binary+0x868/0xece bprm_execve+0x224/0x498 kernel_execve+0xdc/0x142 run_init_process+0x90/0x9e try_to_run_init_process+0x12/0x3c kernel_init+0xb4/0xf8 ret_from_exception+0x0/0xc The issue is found when CONFIG_DEBUG_PREEMPT enabled. Reviewed-by:
Atish Patra <atish.patra@wdc.com> Tested-by:
Atish Patra <atish.patra@wdc.com> Signed-off-by:
Kefeng Wang <wangkefeng.wang@huawei.com> [Palmer: Added a comment.] Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Atish Patra authored
We should call irq trace only if interrupt is going to be enabled during excecption handling. Otherwise, it results in following warning during boot with lock debugging enabled. [ 0.000000] ------------[ cut here ]------------ [ 0.000000] DEBUG_LOCKS_WARN_ON(early_boot_irqs_disabled) [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:4085 lockdep_hardirqs_on_prepare+0x22a/0x22e [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0-00022-ge20097fb37e2-dirty #548 [ 0.000000] epc: c005d5d4 ra : c005d5d4 sp : c1c01e80 [ 0.000000] gp : c1d456e0 tp : c1c0a980 t0 : 00000000 [ 0.000000] t1 : ffffffff t2 : 00000000 s0 : c1c01ea0 [ 0.000000] s1 : c100f360 a0 : 0000002d a1 : c00666ee [ 0.000000] a2 : 00000000 a3 : 00000000 a4 : 00000000 [ 0.000000] a5 : 00000000 a6 : c1c6b390 a7 : 3ffff00e [ 0.000000] s2 : c2384fe8 s3 : 00000000 s4 : 00000001 [ 0.000000] s5 : c1c0a980 s6 : c1d48000 s7 : c1613b4c [ 0.000000] s8 : 00000fff s9 : 80000200 s10: c1613b40 [ 0.000000] s11: 00000000 t3 : 00000000 t4 : 00000000 [ 0.000000] t5 : 00000001 t6 : 00000000 Fixes: 3c469798 ("riscv:Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") Signed-off-by:
Atish Patra <atish.patra@wdc.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
- Jan 12, 2021
-
-
Catalin Marinas authored
With the introduction of a dynamic ZONE_DMA range based on DT or IORT information, there's no need for CMA allocations from the wider ZONE_DMA32 since on most platforms ZONE_DMA will cover the 32-bit addressable range. Remove the arm64_dma32_phys_limit and set arm64_dma_phys_limit to cover the smallest DMA range required on the platform. CMA allocation and crashkernel reservation now go in the dynamically sized ZONE_DMA, allowing correct functionality on RPi4. Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Zhou <chenzhou10@huawei.com> Reviewed-by:
Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> # On RPi4B
-
Ariel Marcovitch authored
This is a bug that causes early crashes in builds with an .exit.text section smaller than a page and an .init.text section that ends in the beginning of a physical page (this is kinda random, which might explain why this wasn't really encountered before). The init sections are ordered like this: .init.text .exit.text .init.data Currently, these sections aren't page aligned. Because the init code might become read-only at runtime and because the .init.text section can potentially reside on the same physical page as .init.data, the beginning of .init.data might be mapped read-only along with .init.text. Then when the kernel tries to modify a variable in .init.data (like kthreadd_done, used in kernel_init()) the kernel panics. To avoid this, make _einittext page aligned and also align .exit.text to make sure .init.data is always seperated from the text segments. Fixes: 060ef9d8 ("powerpc32: PAGE_EXEC required for inittext") Signed-off-by:
Ariel Marcovitch <ariel.marcovitch@gmail.com> Reviewed-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210102201156.10805-1-ariel.marcovitch@gmail.com
-
- Jan 09, 2021
-
-
Kefeng Wang authored
commit b91540d5 ("RISC-V: Add EFI runtime services") add a duplicated PAGE_KERNEL_EXEC, kill it. Signed-off-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by:
Pekka Enberg <penberg@kernel.org> Reviewed-by:
Atish Patra <atish.patra@wdc.com> Fixes: b91540d5 ("RISC-V: Add EFI runtime services") Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
- Jan 08, 2021
-
-
Vineet Gupta authored
HSDK has hardware floating point and the common use case is with glibc+hf so enable that as default. Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Arnd Bergmann authored
dtc points out that the interrupts for some devices are not parsable: picoxcell-pc3x2.dtsi:45.19-49.5: Warning (interrupts_property): /paxi/gem@30000: Missing interrupt-parent picoxcell-pc3x2.dtsi:51.21-55.5: Warning (interrupts_property): /paxi/dmac@40000: Missing interrupt-parent picoxcell-pc3x2.dtsi:57.21-61.5: Warning (interrupts_property): /paxi/dmac@50000: Missing interrupt-parent picoxcell-pc3x2.dtsi:233.21-237.5: Warning (interrupts_property): /rwid-axi/axi2pico@c0000000: Missing interrupt-parent There are two VIC instances, so it's not clear which one needs to be used. I found the BSP sources that reference VIC0, so use that: https://github.com/r1mikey/meta-picoxcell/blob/master/recipes-kernel/linux/linux-picochip-3.0/0001-picoxcell-support-for-Picochip-picoXcell-SoC.patch Acked-by:
Jamie Iles <jamie@jamieiles.com> Link: https://lore.kernel.org/r/20201230152010.3914962-1-arnd@kernel.org ' Signed-off-by:
Arnd Bergmann <arnd@arndb.de>
-
Paolo Bonzini authored
Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Fenghua Yu authored
Shakeel Butt reported in [1] that a user can request a task to be moved to a resource group even if the task is already in the group. It just wastes time to do the move operation which could be costly to send IPI to a different CPU. Add a sanity check to ensure that the move operation only happens when the task is not already in the resource group. [1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/ Fixes: e02737d5 ("x86/intel_rdt: Add tasks files") Reported-by:
Shakeel Butt <shakeelb@google.com> Signed-off-by:
Fenghua Yu <fenghua.yu@intel.com> Signed-off-by:
Reinette Chatre <reinette.chatre@intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Tony Luck <tony.luck@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/962ede65d8e95be793cb61102cca37f7bb018e66.1608243147.git.reinette.chatre@intel.com
-
Fenghua Yu authored
Currently, when moving a task to a resource group the PQR_ASSOC MSR is updated with the new closid and rmid in an added task callback. If the task is running, the work is run as soon as possible. If the task is not running, the work is executed later in the kernel exit path when the kernel returns to the task again. Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task is running is the right thing to do. Queueing work for a task that is not running is unnecessary (the PQR_ASSOC MSR is already updated when the task is scheduled in) and causing system resource waste with the way in which it is implemented: Work to update the PQR_ASSOC register is queued every time the user writes a task id to the "tasks" file, even if the task already belongs to the resource group. This could result in multiple pending work items associated with a single task even if they are all identical and even though only a single update with most recent values is needed. Specifically, even if a task is moved between different resource groups while it is sleeping then it is only the last move that is relevant but yet a work item is queued during each move. This unnecessary queueing of work items could result in significant system resource waste, especially on tasks sleeping for a long time. For example, as demonstrated by Shakeel Butt in [1] writing the same task id to the "tasks" file can quickly consume significant memory. The same problem (wasted system resources) occurs when moving a task between different resource groups. As pointed out by Valentin Schneider in [2] there is an additional issue with the way in which the queueing of work is done in that the task_struct update is currently done after the work is queued, resulting in a race with the register update possibly done before the data needed by the update is available. To solve these issues, update the PQR_ASSOC MSR in a synchronous way right after the new closid and rmid are ready during the task movement, only if the task is running. If a moved task is not running nothing is done since the PQR_ASSOC MSR will be updated next time the task is scheduled. This is the same way used to update the register when tasks are moved as part of resource group removal. [1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/ [2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com [ bp: Massage commit message and drop the two update_task_closid_rmid() variants. ] Fixes: e02737d5 ("x86/intel_rdt: Add tasks files") Reported-by:
Shakeel Butt <shakeelb@google.com> Reported-by:
Valentin Schneider <valentin.schneider@arm.com> Signed-off-by:
Fenghua Yu <fenghua.yu@intel.com> Signed-off-by:
Reinette Chatre <reinette.chatre@intel.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Tony Luck <tony.luck@intel.com> Reviewed-by:
James Morse <james.morse@arm.com> Reviewed-by:
Valentin Schneider <valentin.schneider@arm.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com
-
Damien Le Moal authored
When running is M-Mode (no MMU config), MPIE does not get set. This results in all syscalls being executed with interrupts disabled as handle_exception never sets SR_IE as it always sees SR_PIE being cleared. Fix this by always force enabling interrupts in handle_syscall when CONFIG_RISCV_M_MODE is enabled. Signed-off-by:
Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by:
Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Damien Le Moal authored
If of_clk_init() is not called in time_init(), clock providers defined in the system device tree are not initialized, resulting in failures for other devices to initialize due to missing clocks. Similarly to other architectures and to the default kernel time_init() implementation, call of_clk_init() before executing timer_probe() in time_init(). Signed-off-by:
Damien Le Moal <damien.lemoal@wdc.com> Acked-by:
Stephen Boyd <sboyd@kernel.org> Reviewed-by:
Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
Andreas Schwab authored
Properly return -ENOSYS for syscall -1 instead of leaving the return value uninitialized. This fixes the strace teststuite. Fixes: 5340627e ("riscv: add support for SECCOMP and SECCOMP_FILTER") Cc: stable@vger.kernel.org Signed-off-by:
Andreas Schwab <schwab@suse.de> Reviewed-by:
Tycho Andersen <tycho@tycho.pizza> Signed-off-by:
Palmer Dabbelt <palmerdabbelt@google.com>
-
- Jan 07, 2021
-
-
Tom Lendacky authored
Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence, where the guest vCPU register state is updated and then the vCPU is VMRUN to begin execution of the AP. For an SEV-ES guest, this won't work because the guest register state is encrypted. Following the GHCB specification, the hypervisor must not alter the guest register state, so KVM must track an AP/vCPU boot. Should the guest want to park the AP, it must use the AP Reset Hold exit event in place of, for example, a HLT loop. First AP boot (first INIT-SIPI-SIPI sequence): Execute the AP (vCPU) as it was initialized and measured by the SEV-ES support. It is up to the guest to transfer control of the AP to the proper location. Subsequent AP boot: KVM will expect to receive an AP Reset Hold exit event indicating that the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to awaken it. When the AP Reset Hold exit event is received, KVM will place the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI sequence, KVM will make the vCPU runnable. It is again up to the guest to then transfer control of the AP to the proper location. To differentiate between an actual HLT and an AP Reset Hold, a new MP state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is placed in upon receiving the AP Reset Hold exit event. Additionally, to communicate the AP Reset Hold exit event up to userspace (if needed), a new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD. A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds a new function that, for non SEV-ES guests, invokes the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests, implements the logic above. Signed-off-by:
Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Maxim Levitsky authored
It is possible to exit the nested guest mode, entered by svm_set_nested_state prior to first vm entry to it (e.g due to pending event) if the nested run was not pending during the migration. In this case we must not switch to the nested msr permission bitmap. Also add a warning to catch similar cases in the future. Fixes: a7d5c7ce ("KVM: nSVM: delay MSR permission processing to first nested VM run") Signed-off-by:
Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210107093854.882483-2-mlevitsk@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Maxim Levitsky authored
We overwrite most of vmcb fields while doing so, so we must mark it as dirty. Signed-off-by:
Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210107093854.882483-5-mlevitsk@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Maxim Levitsky authored
The code to store it on the migration exists, but no code was restoring it. One of the side effects of fixing this is that L1->L2 injected events are no longer lost when migration happens with nested run pending. Signed-off-by:
Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210107093854.882483-3-mlevitsk@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Ben Gardon authored
The tdp_mmu_roots and tdp_mmu_pages in struct kvm_arch should only contain pages with tdp_mmu_page set to true. tdp_mmu_pages should not contain any pages with a non-zero root_count and tdp_mmu_roots should only contain pages with a positive root_count, unless a thread holds the MMU lock and is in the process of modifying the list. Various functions expect these invariants to be maintained, but they are not explictily documented. Add to the comments on both fields to document the above invariants. Signed-off-by:
Ben Gardon <bgardon@google.com> Message-Id: <20210107001935.3732070-2-bgardon@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-