- Feb 25, 2023
-
-
Changbin Du authored
Replace the obsolete and ambiguos macro in_irq() with new macro in_hardirq(). Signed-off-by:
Changbin Du <changbin.du@gmail.com> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Al Viro authored
On each context switch we save the FPU registers on stack of old process and restore FPU registers from the stack of new one. That allows us to avoid doing that each time we enter/leave the kernel mode; however, that can get suboptimal in some cases. For one thing, we don't need to bother saving anything for kernel threads. For another, if between entering and leaving the kernel a thread gives CPU up more than once, it will do useless work, saving the same values every time, only to discard the saved copy as soon as it returns from switch_to(). Alternative solution: * move the array we save into from switch_stack to thread_info * have a (thread-synchronous) flag set when we save them * have another flag set when they should be restored on return to userland. * do *NOT* save/restore them in do_switch_stack()/undo_switch_stack(). * restore on the exit to user mode if the restore flag had been set. Clear both flags. * on context switch, entry to fork/clone/vfork, before entry into do_signal() and on entry into straced syscall save the registers and set the 'saved' flag unless it had been already set. * on context switch set the 'restore' flag as well. * have copy_thread() set both flags for child, so the registers would be restored once the child returns to userland. * use the saved data in setup_sigcontext(); have restore_sigcontext() set both flags and copy from sigframe to save area. * teach ptrace to look for FPU registers in thread_info instead of switch_stack. * teach isolated accesses to FPU registers (rdfpcr, wrfpcr, etc.) to check the 'saved' flag (under preempt_disable()) and work with the save area if it's been set; if 'saved' flag is found upon write access, set 'restore' flag as well. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Al Viro authored
gzip_mark() and gzip_release() are gone; there used to be two forward declarations of each and the patch removing those suckers had left one of each behind... Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Al Viro authored
Just memcmp() with ELFMAG - that's the normal way to do it in userland code, which that thing is. Besides, that has the benefit of actually building - str_has_prefix() is *NOT* present in <string.h>. Fixes: 5f14596e "alpha: Replace strncmp with str_has_prefix" Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Al Viro authored
Type 3 instruction fault (FPU insn with FPU disabled) is handled by quietly enabling FPU and returning. Which is fine, except that we need to do that both for fault in userland and in the kernel; the latter *can* legitimately happen - all it takes is this: .global _start _start: call_pal 0xae lda $0, 0 ldq $0, 0($0) - call_pal CLRFEN to clear "FPU enabled" flag and arrange for a signal delivery (SIGSEGV in this case). Fixed by moving the handling of type 3 into the common part of do_entIF(), before we check for kernel vs. user mode. Incidentally, check for kernel mode is unidiomatic; the normal way to do that is !user_mode(regs). The difference is that the open-coded variant treats any of bits 63..3 of regs->ps being set as "it's user mode" while the normal approach is to check just the bit 3. PS is a 4-bit register and regs->ps always will have bits 63..4 clear, so the open-code variant here is actually equivalent to !user_mode(regs). Harder to follow, though... Reproducer above will crash any box where CLRFEN is not ignored by PAL (== any actual hardware, AFAICS; PAL used in qemu doesn't bother implementing that crap). Cc: stable@vger.kernel.org # all way back... Reviewed-by:
Richard Henderson <rth@twiddle.net> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
Joe Perches authored
Use semicolons and braces. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
rj1 authored
Signed-off-by:
Matt Turner <mattst88@gmail.com>
-
- Feb 22, 2023
-
-
Randy Dunlap authored
modpost reports section mismatch errors/warnings: WARNING: modpost: vmlinux.o: section mismatch in reference: svm_hv_hardware_setup (section: .text) -> (unknown) (section: .init.data) WARNING: modpost: vmlinux.o: section mismatch in reference: svm_hv_hardware_setup (section: .text) -> (unknown) (section: .init.data) WARNING: modpost: vmlinux.o: section mismatch in reference: svm_hv_hardware_setup (section: .text) -> (unknown) (section: .init.data) This "(unknown) (section: .init.data)" all refer to svm_x86_ops. Tag svm_hv_hardware_setup() with __init to fix a modpost warning as the non-stub implementation accesses __initdata (svm_x86_ops), i.e. would generate a use-after-free if svm_hv_hardware_setup() were actually invoked post-init. The helper is only called from svm_hardware_setup(), which is also __init, i.e. lack of __init is benign other than the modpost warning. Fixes: 1e0c7d40 ("KVM: SVM: hyper-v: Remote TLB flush for SVM") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Vineeth Pillai <viremana@linux.microsoft.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Reviewed-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20230222073315.9081-1-rdunlap@infradead.org> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Conor Dooley authored
The patchwork automation reported a sparse complaint that spin_shadow_stack was not declared and should be static: ../arch/riscv/kernel/traps.c:335:15: warning: symbol 'spin_shadow_stack' was not declared. Should it be static? However, this is used in entry.S and therefore shouldn't be static. The same applies to the shadow_stack that this pseudo spinlock is trying to protect, so do like its charge and add a declaration to thread_info.h Signed-off-by:
Conor Dooley <conor.dooley@microchip.com> Fixes: 7e186433 ("riscv: fix race when vmap stack overflow") Reviewed-by:
Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230210185945.915806-1-conor@kernel.org Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Russell Currey authored
plpks_is_available() can be called on any platform via kexec but calls _plpks_get_config() which makes a hcall, which will only work on pseries. Fix this by returning early in plpks_is_available() if hcalls aren't possible. Fixes: 119da30d ("powerpc/pseries: Expose PLPKS config values, support additional fields") Reported-by:
Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by:
Russell Currey <ruscur@russell.cc> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230222021708.146257-1-ruscur@russell.cc
-
Guo Ren authored
Add HVO support for RISC-V; see commit 6be24bed ("mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP"). This patch is similar to commit 1e63ac08 ("arm64: mm: hugetlb: enable HUGETLB_PAGE_FREE_VMEMMAP for arm64"), and riscv's motivation is the same as arm64. The current riscv was ready to enable HVO after fixup, ref commit d33deda0 ("riscv/mm: hugepage's PG_dcache_clean flag is only set in head page"). See Documentation/mm/vmemmap_dedup.rst for more details. The HugeTLB VmemmapvOptimization (HVO) defaults to off in Kconfig. Here is the riscv test log: cat /proc/sys/vm/hugetlb_optimize_vmemmap echo 8 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages mount -t hugetlbfs none test/ -o pagesize=2048k <Try some simple hugetlb test in test dir, no problem found.> Signed-off-by:
Guo Ren <guoren@linux.alibaba.com> Signed-off-by:
Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/linux-riscv/1F5AF29D-708A-483B-A29F-CAEE6F554866@linux.dev/ Acked-by:
Muchun Song <songmuchun@bytedance.com> Reviewed-by:
Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230201015259.3222524-1-guoren@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Liao Chang authored
Add header include guards to insn.h to prevent repeating declaration of any identifiers in insn.h. Fixes: edde5584 ("riscv: Add SW single-step support for KDB") Signed-off-by:
Liao Chang <liaochang1@huawei.com> Reviewed-by:
Andrew Jones <ajones@ventanamicro.com> Fixes: c9c1af3f ("RISC-V: rename parse_asm.h to insn.h") Reviewed-by:
Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230129094242.282620-1-liaochang1@huawei.com Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Jisheng Zhang authored
If we patched auipc + jalr pair, we'd better proceed one more instruction. Andrew pointed out "There's not a problem now, since we're only adding a fixup for jal, not jalr, but we should future-proof this and there's no reason to revisit an already fixed-up instruction anyway." Signed-off-by:
Jisheng Zhang <jszhang@kernel.org> Suggested-by:
Andrew Jones <ajones@ventanamicro.com> Reviewed-by:
Andrew Jones <ajones@ventanamicro.com> Reviewed-by:
Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20230115162811.3146-1-jszhang@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Mattias Nissler authored
While working on something else, I noticed that the kernel would start accepting interrupts again after crashing in an interrupt handler. Since the kernel is already in inconsistent state, enabling interrupts is dangerous and opens up risk of kernel state deteriorating further. Interrupts do get enabled via what looks like an unintended side effect of spin_unlock_irq, so switch to the more cautious spin_lock_irqsave/spin_unlock_irqrestore instead. Fixes: 76d2a049 ("RISC-V: Init and Halt Code") Signed-off-by:
Mattias Nissler <mnissler@rivosinc.com> Reviewed-by:
Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20230215144828.3370316-1-mnissler@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Björn Töpel authored
Commit 21855cac ("riscv/mm: Prevent kernel module to access user memory without uaccess routines") added early exits/deaths for page faults stemming from accesses to user-space without using proper uaccess routines (where sstatus.SUM is set). Unfortunatly, this is too strict for some BPF programs, which relies on BPF exhandler fixups. These BPF programs loads "BTF pointers". A BTF pointers could either be a valid kernel pointer or NULL, but not a userspace address. Resolve the problem by calling the fixup handler in the early exit path. Fixes: 21855cac ("riscv/mm: Prevent kernel module to access user memory without uaccess routines") Signed-off-by:
Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20230214162515.184827-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Conor Dooley authored
Guenter reported a splat during boot, that Samuel pointed out was the lockdep assertion failing in patch_insn_write(): WARNING: CPU: 0 PID: 0 at arch/riscv/kernel/patch.c:63 patch_insn_write+0x222/0x2f6 epc : patch_insn_write+0x222/0x2f6 ra : patch_insn_write+0x21e/0x2f6 epc : ffffffff800068c6 ra : ffffffff800068c2 sp : ffffffff81803df0 gp : ffffffff81a1ab78 tp : ffffffff81814f80 t0 : ffffffffffffe000 t1 : 0000000000000001 t2 : 4c45203a76637369 s0 : ffffffff81803e40 s1 : 0000000000000004 a0 : 0000000000000000 a1 : ffffffffffffffff a2 : 0000000000000004 a3 : 0000000000000000 a4 : 0000000000000001 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000052464e43 s2 : ffffffff80b4889c s3 : 000000000000082c s4 : ffffffff80b48828 s5 : 0000000000000828 s6 : ffffffff8131a0a0 s7 : 0000000000000fff s8 : 0000000008000200 s9 : ffffffff8131a520 s10: 0000000000000018 s11: 000000000000000b t3 : 0000000000000001 t4 : 000000000000000d t5 : ffffffffd8180000 t6 : ffffffff81803bc8 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800068c6>] patch_insn_write+0x222/0x2f6 [<ffffffff80006a36>] patch_text_nosync+0xc/0x2a [<ffffffff80003b86>] riscv_cpufeature_patch_func+0x52/0x98 [<ffffffff80003348>] _apply_alternatives+0x46/0x86 [<ffffffff80c02d36>] apply_boot_alternatives+0x3c/0xfa [<ffffffff80c03ad8>] setup_arch+0x584/0x5b8 [<ffffffff80c0075a>] start_kernel+0xa2/0x8f8 This issue was exposed by 702e6455 ("riscv: fpu: switch has_fpu() to riscv_has_extension_likely()"), as it is the patching in has_fpu() that triggers the splats in Guenter's report. Take the text_mutex before doing any code patching to satisfy lockdep. Fixes: ff689fd2 ("riscv: add RISC-V Svpbmt extension support") Fixes: a35707c3 ("riscv: add memory-type errata for T-Head") Fixes: 1a0e5dbd ("riscv: sifive: Add SiFive alternative ports") Reported-by:
Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/all/20230212154333.GA3760469@roeck-us.net/ Signed-off-by:
Conor Dooley <conor.dooley@microchip.com> Reviewed-by:
Samuel Holland <samuel@sholland.org> Tested-by:
Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230212194735.491785-1-conor@kernel.org Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Andrew Jones authored
While the comment above the ISA extension ID definitions says "Entries are sorted alphabetically.", this stopped being good advice with commit d8a3d8a7 ("riscv: hwcap: make ISA extension ids can be used in asm"), as we now use macros instead of enums. Reshuffling defines is error-prone, so, since they don't need to be in any particular order, change the advice to just adding new extensions at the bottom. Also, take the opportunity to change spaces to tabs, merge three comments into one, and move the base and max defines into more logical locations wrt the ID definitions. Signed-off-by:
Andrew Jones <ajones@ventanamicro.com> Reviewed-by:
Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230209123636.123537-1-ajones@ventanamicro.com Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Heiko Stuebner authored
As Andrew reported, Zb* comes after Zi* according 27.11 "Subset Naming Convention" so fix the ordering accordingly. Reported-by:
Andrew Jones <ajones@ventanamicro.com> Signed-off-by:
Heiko Stuebner <heiko.stuebner@vrull.eu> Reviewed-by:
Conor Dooley <conor.dooley@microchip.com> Reviewed-by:
Andrew Jones <ajones@ventanamicro.com> Tested-by:
Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230208225328.1636017-2-heiko@sntech.de Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Andy Chiu authored
Runtime code patching must be done at a naturally aligned address, or we may execute on a partial instruction. We have encountered problems traced back to static jump functions during the test. We switched the tracer randomly for every 1~5 seconds on a dual-core QEMU setup and found the kernel sucking at a static branch where it jumps to itself. The reason is that the static branch was 2-byte but not 4-byte aligned. Then, the kernel would patch the instruction, either J or NOP, with two half-word stores if the machine does not have efficient unaligned accesses. Thus, moments exist where half of the NOP mixes with the other half of the J when transitioning the branch. In our particular case, on a little-endian machine, the upper half of the NOP was mixed with the lower part of the J when enabling the branch, resulting in a jump that jumped to itself. Conversely, it would result in a HINT instruction when disabling the branch, but it might not be observable. ARM64 does not have this problem since all instructions must be 4-byte aligned. Fixes: ebc00dde ("riscv: Add jump-label implementation") Link: https://lore.kernel.org/linux-riscv/20220913094252.3555240-6-andy.chiu@sifive.com/ Reviewed-by:
Greentime Hu <greentime.hu@sifive.com> Signed-off-by:
Andy Chiu <andy.chiu@sifive.com> Signed-off-by:
Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20230206090440.1255001-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Palmer Dabbelt authored
The recent refactoring led to us leaking some HWCAP bits to userspace that didn't make much sense. With any luck we'll have a better scheme soon, but for now just mask off those bits to avoid polluting userspace. Acked-by:
Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230202233832.11036-1-palmer@rivosinc.com Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Sergey Matyukevich authored
This is a partial revert of the commit 4bd1d80e ("riscv: mm: notify remote harts about mmu cache updates"). Original commit included two loosely related changes serving the same purpose of fixing stale TLB entries causing user-space application crash: - introduce deferred per-ASID TLB flush for CPUs not running the task - switch to per-ASID TLB flush on all CPUs running the task in update_mmu_cache According to report and discussion in [1], the second part caused a regression on Renesas RZ/Five SoC. For now restore the old behavior of the update_mmu_cache. [1] https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/ Fixes: 4bd1d80e ("riscv: mm: notify remote harts about mmu cache updates") Reported-by:
"Lad, Prabhakar" <prabhakar.csengg@gmail.com> Signed-off-by:
Sergey Matyukevich <sergey.matyukevich@syntacore.com> Link: trailer, so that it can be parsed with git's trailer functionality? Reviewed-by:
Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230129211818.686557-1-geomatsi@gmail.com Cc: stable@vger.kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Björn Töpel authored
Add instruction dump (Code:) output to RISC-V splats. Dump 16b parcels. An example: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Oops [#1] Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc3-00302-g840ff44c571d-dirty #27 Hardware name: riscv-virtio,qemu (DT) epc : kernel_init+0xc8/0x10e ra : kernel_init+0x70/0x10e epc : ffffffff80bd9a40 ra : ffffffff80bd99e8 sp : ff2000000060bec0 gp : ffffffff81730b28 tp : ff6000007ff00000 t0 : 7974697275636573 t1 : 0000000000000000 t2 : 3030303270393d6e s0 : ff2000000060bee0 s1 : ffffffff81732028 a0 : 0000000000000000 a1 : ff60000080dd1780 a2 : 0000000000000002 a3 : ffffffff8176a470 a4 : 0000000000000000 a5 : 000000000000000a a6 : 0000000000000081 a7 : ff60000080dd1780 s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000 s5 : 0000000000000000 s6 : 0000000000000000 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000000000000 s10: 0000000000000000 s11: 0000000000000000 t3 : ffffffff81186018 t4 : 0000000000000022 t5 : 000000000000003d t6 : 0000000000000000 status: 0000000200000120 badaddr: 0000000000000000 cause: 000000000000000f [<ffffffff80003528>] ret_from_exception+0x0/0x16 Code: 862a d179 608c a517 0069 0513 2be5 d0ef db2e 47a9 (c11c) a517 ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b SMP: stopping secondary CPUs ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- Signed-off-by:
Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20230119074738.708301-2-bjorn@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Jisheng Zhang authored
Now, after that all the sections are explicitly described and declared in vmlinux.lds.S, we can enable ld orphan warnings for !XIP_KERNEL to prevent from missing any new sections in future. Signed-off-by:
Jisheng Zhang <jszhang@kernel.org> Reviewed-by:
Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230119155417.2600-6-jszhang@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Jisheng Zhang authored
When enabling linker orphan section warning, I got warnings similar as below: ld.lld: warning: ./drivers/firmware/efi/libstub/lib.a(efi-stub-helper.stub.o):(.init.bss) is being placed in '.init.bss' Catch the sections so that we can enable linker orphan section warning. Signed-off-by:
Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20230119155417.2600-5-jszhang@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Jisheng Zhang authored
When enabling linker orphan section warning, I got warnings similar as below: riscv64-linux-gnu-ld: warning: orphan section `.riscv.attributes' from `init/main.o' being placed in section `.riscv.attributes' While I don't see any usage of .riscv.attributes sections' in kernel now, just catch the sections so that we can enable linker orphan section warning. Signed-off-by:
Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20230119155417.2600-4-jszhang@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Jisheng Zhang authored
When enabling linker orphan section warning, I got warnings similar as below: riscv64-linux-gnu-ld: warning: orphan section `.rela.text' from `init/main.o' being placed in section `.rela.dyn' Use the approach similar as ARM64 does and declare it in vmlinux.lds.S Signed-off-by:
Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20230119155417.2600-3-jszhang@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
Jisheng Zhang authored
riscv discards .exit.* sections at run-time but doesn't define RUNTIME_DISCARD_EXIT. However, the .exit.* sections are still allocated and kept even if the generic DISCARDS would discard the sections due to missing RUNTIME_DISCARD_EXIT, because the DISCARD sits at the end of the linker script. Add the missing RUNTIME_DISCARD_EXIT define so that it still works if we move DISCARD up or even at the beginning of the linker script. Signed-off-by:
Jisheng Zhang <jszhang@kernel.org> Suggested-by:
Masahiro Yamada <masahiroy@kernel.org> Reviewed-by:
Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230119155417.2600-2-jszhang@kernel.org Signed-off-by:
Palmer Dabbelt <palmer@rivosinc.com>
-
- Feb 21, 2023
-
-
Pali Rohár authored
Due to CPLD firmware bugs, set CPLD syscon-reboot priority level to 64 (between rstcr and watchdog) to ensure that rstcr's global-utilities reset method which is preferred stay as default one, and to ensure that CPLD syscon-reboot is more preferred than watchdog reset method. Fixes: 0531a4ab ("powerpc: dts: turris1x.dts: Add CPLD reboot node") Depends-on: e6333293 ("power: reset: syscon-reboot: Add support for specifying priority") Signed-off-by:
Pali Rohár <pali@kernel.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230220080435.4237-1-pali@kernel.org
-
- Feb 20, 2023
-
-
Yang Jihong authored
When arch_prepare_optimized_kprobe calculating jump destination address, it copies original instructions from jmp-optimized kprobe (see __recover_optprobed_insn), and calculated based on length of original instruction. arch_check_optimized_kprobe does not check KPROBE_FLAG_OPTIMATED when checking whether jmp-optimized kprobe exists. As a result, setup_detour_execution may jump to a range that has been overwritten by jump destination address, resulting in an inval opcode error. For example, assume that register two kprobes whose addresses are <func+9> and <func+11> in "func" function. The original code of "func" function is as follows: 0xffffffff816cb5e9 <+9>: push %r12 0xffffffff816cb5eb <+11>: xor %r12d,%r12d 0xffffffff816cb5ee <+14>: test %rdi,%rdi 0xffffffff816cb5f1 <+17>: setne %r12b 0xffffffff816cb5f5 <+21>: push %rbp 1.Register the kprobe for <func+11>, assume that is kp1, corresponding optimized_kprobe is op1. After the optimization, "func" code changes to: 0xffffffff816cc079 <+9>: push %r12 0xffffffff816cc07b <+11>: jmp 0xffffffffa0210000 0xffffffff816cc080 <+16>: incl 0xf(%rcx) 0xffffffff816cc083 <+19>: xchg %eax,%ebp 0xffffffff816cc084 <+20>: (bad) 0xffffffff816cc085 <+21>: push %rbp Now op1->flags == KPROBE_FLAG_OPTIMATED; 2. Register the kprobe for <func+9>, assume that is kp2, corresponding optimized_kprobe is op2. register_kprobe(kp2) register_aggr_kprobe alloc_aggr_kprobe __prepare_optimized_kprobe arch_prepare_optimized_kprobe __recover_optprobed_insn // copy original bytes from kp1->optinsn.copied_insn, // jump address = <func+14> 3. disable kp1: disable_kprobe(kp1) __disable_kprobe ... if (p == orig_p || aggr_kprobe_disabled(orig_p)) { ret = disarm_kprobe(orig_p, true) // add op1 in unoptimizing_list, not unoptimized orig_p->flags |= KPROBE_FLAG_DISABLED; // op1->flags == KPROBE_FLAG_OPTIMATED | KPROBE_FLAG_DISABLED ... 4. unregister kp2 __unregister_kprobe_top ... if (!kprobe_disabled(ap) && !kprobes_all_disarmed) { optimize_kprobe(op) ... if (arch_check_optimized_kprobe(op) < 0) // because op1 has KPROBE_FLAG_DISABLED, here not return return; p->kp.flags |= KPROBE_FLAG_OPTIMIZED; // now op2 has KPROBE_FLAG_OPTIMIZED } "func" code now is: 0xffffffff816cc079 <+9>: int3 0xffffffff816cc07a <+10>: push %rsp 0xffffffff816cc07b <+11>: jmp 0xffffffffa0210000 0xffffffff816cc080 <+16>: incl 0xf(%rcx) 0xffffffff816cc083 <+19>: xchg %eax,%ebp 0xffffffff816cc084 <+20>: (bad) 0xffffffff816cc085 <+21>: push %rbp 5. if call "func", int3 handler call setup_detour_execution: if (p->flags & KPROBE_FLAG_OPTIMIZED) { ... regs->ip = (unsigned long)op->optinsn.insn + TMPL_END_IDX; ... } The code for the destination address is 0xffffffffa021072c: push %r12 0xffffffffa021072e: xor %r12d,%r12d 0xffffffffa0210731: jmp 0xffffffff816cb5ee <func+14> However, <func+14> is not a valid start instruction address. As a result, an error occurs. Link: https://lore.kernel.org/all/20230216034247.32348-3-yangjihong1@huawei.com/ Fixes: f66c0447 ("kprobes: Set unoptimized flag after unoptimizing code") Signed-off-by:
Yang Jihong <yangjihong1@huawei.com> Cc: stable@vger.kernel.org Acked-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org>
-
Yang Jihong authored
Since the following commit: commit f66c0447 ("kprobes: Set unoptimized flag after unoptimizing code") modified the update timing of the KPROBE_FLAG_OPTIMIZED, a optimized_kprobe may be in the optimizing or unoptimizing state when op.kp->flags has KPROBE_FLAG_OPTIMIZED and op->list is not empty. The __recover_optprobed_insn check logic is incorrect, a kprobe in the unoptimizing state may be incorrectly determined as unoptimizing. As a result, incorrect instructions are copied. The optprobe_queued_unopt function needs to be exported for invoking in arch directory. Link: https://lore.kernel.org/all/20230216034247.32348-2-yangjihong1@huawei.com/ Fixes: f66c0447 ("kprobes: Set unoptimized flag after unoptimizing code") Cc: stable@vger.kernel.org Signed-off-by:
Yang Jihong <yangjihong1@huawei.com> Acked-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org>
-
Mark Rutland authored
When building a kernel with many debug options enabled (which happens in test configurations use by myself and syzbot), the kernel can become large enough that portions of .text can be more than 128M away from .idmap.text (which is placed inside the .rodata section). Where idmap code branches into .text, the linker will place veneers in the .idmap.text section to make those branches possible. Unfortunately, as Ard reports, GNU LD has bseen observed to add 4K of padding when adding such veneers, e.g. | .idmap.text 0xffffffc01e48e5c0 0x32c arch/arm64/mm/proc.o | 0xffffffc01e48e5c0 idmap_cpu_replace_ttbr1 | 0xffffffc01e48e600 idmap_kpti_install_ng_mappings | 0xffffffc01e48e800 __cpu_setup | *fill* 0xffffffc01e48e8ec 0x4 | .idmap.text.stub | 0xffffffc01e48e8f0 0x18 linker stubs | 0xffffffc01e48f8f0 __idmap_text_end = . | 0xffffffc01e48f000 . = ALIGN (0x1000) | *fill* 0xffffffc01e48f8f0 0x710 | 0xffffffc01e490000 idmap_pg_dir = . This makes the __idmap_text_start .. __idmap_text_end region bigger than the 4K we require it to fit within, and triggers an assertion in arm64's vmlinux.lds.S, which breaks the build: | LD .tmp_vmlinux.kallsyms1 | aarch64-linux-gnu-ld: ID map text too big or misaligned | make[1]: *** [scripts/Makefile.vmlinux:35: vmlinux] Error 1 | make: *** [Makefile:1264: vmlinux] Error 2 Avoid this by using an `ADRP+ADD+BLR` sequence for branches out of .idmap.text, which avoids the need for veneers. These branches are only executed once per boot, and only when the MMU is on, so there should be no noticeable performance penalty in replacing `BL` with `ADRP+ADD+BLR`. At the same time, remove the "x" and "w" attributes when placing code in .idmap.text, as these are not necessary, and this will prevent the linker from assuming that it is safe to place PLTs into .idmap.text, causing it to warn if and when there are out-of-range branches within .idmap.text, e.g. | LD .tmp_vmlinux.kallsyms1 | arch/arm64/kernel/head.o: in function `primary_entry': | (.idmap.text+0x1c): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o | arch/arm64/kernel/head.o: in function `init_el2': | (.idmap.text+0x88): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o | make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1 | make: *** [Makefile:1252: vmlinux] Error 2 Thus, if future changes add out-of-range branches in .idmap.text, it should be easy enough to identify those from the resulting linker errors. Reported-by:
<syzbot+f8ac312e31226e23302b@syzkaller.appspotmail.com> Link: https://lore.kernel.org/linux-arm-kernel/00000000000028ea4105f4e2ef54@google.com/ Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Will Deacon <will@kernel.org> Tested-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230220162317.1581208-1-mark.rutland@arm.com Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
Randy Dunlap authored
When neither LANTIQ nor MIPS_MALTA is set, 'physical_memsize' is not declared. This causes the build to fail with: mips-linux-ld: arch/mips/kernel/vpe-mt.o: in function `vpe_run': arch/mips/kernel/vpe-mt.c:(.text.vpe_run+0x280): undefined reference to `physical_memsize' LANTIQ is not using 'physical_memsize' and MIPS_MALTA's use of it is self-contained in mti-malta/malta-dtshim.c. Use of physical_memsize in vpe-mt.c appears to be unused, so eliminate this loader mode completely and require VPE programs to be compiled with DFLT_STACK_SIZE and DFLT_HEAP_SIZE defined. Fixes: 9050d50e ("MIPS: lantiq: Set physical_memsize") Fixes: 1a2a6d7e ("MIPS: APRP: Split VPE loader into separate files.") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Reported-by:
kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/all/202302030625.2g3E98sY-lkp@intel.com/ Cc: Dengcheng Zhu <dzhu@wavecomp.com> Cc: John Crispin <john@phrozen.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Philippe Mathieu-Daudé <philmd@linaro.org> Cc: "Steven J. Hill" <Steven.Hill@imgtec.com> Cc: Qais Yousef <Qais.Yousef@imgtec.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
Christophe Leroy authored
Kernel test robot reports: arch/powerpc/mm/nohash/e500.c:314:21: warning: no previous prototype for 'relocate_init' [-Wmissing-prototypes] 314 | notrace void __init relocate_init(u64 dt_ptr, phys_addr_t start) | ^~~~~~~~~~~~~ Add it in mm/mmu_decl.h, close to associated is_second_reloc variable declaration. Reported-by:
kernel test robot <lkp@intel.com> Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/oe-kbuild-all/202302181136.wgyCKUcs-lkp@intel.com/ Link: https://lore.kernel.org/r/ac9107acf24135e1a07e8f84d2090572d43e3fe4.1676712510.git.christophe.leroy@csgroup.eu
-
- Feb 19, 2023
-
-
Pierre Gondois authored
Running a rt-kernel base on 6.2.0-rc3-rt1 on an Ampere Altra outputs the following: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9, name: kworker/u320:0 preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 3 locks held by kworker/u320:0/9: #0: ffff3fff8c27d128 ((wq_completion)efi_rts_wq){+.+.}-{0:0}, at: process_one_work (./include/linux/atomic/atomic-long.h:41) #1: ffff80000861bdd0 ((work_completion)(&efi_rts_work.work)){+.+.}-{0:0}, at: process_one_work (./include/linux/atomic/atomic-long.h:41) #2: ffffdf7e1ed3e460 (efi_rt_lock){+.+.}-{3:3}, at: efi_call_rts (drivers/firmware/efi/runtime-wrappers.c:101) Preemption disabled at: efi_virtmap_load (./arch/arm64/include/asm/mmu_context.h:248) CPU: 0 PID: 9 Comm: kworker/u320:0 Tainted: G W 6.2.0-rc3-rt1 Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18 Workqueue: efi_rts_wq efi_call_rts Call trace: dump_backtrace (arch/arm64/kernel/stacktrace.c:158) show_stack (arch/arm64/kernel/stacktrace.c:165) dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) dump_stack (lib/dump_stack.c:114) __might_resched (kernel/sched/core.c:10134) rt_spin_lock (kernel/locking/rtmutex.c:1769 (discriminator 4)) efi_call_rts (drivers/firmware/efi/runtime-wrappers.c:101) [...] This seems to come from commit ff7a1679 ("arm64: efi: Execute runtime services from a dedicated stack") which adds a spinlock. This spinlock is taken through: efi_call_rts() \-efi_call_virt() \-efi_call_virt_pointer() \-arch_efi_call_virt_setup() Make 'efi_rt_lock' a raw_spinlock to avoid being preempted. [ardb: The EFI runtime services are called with a different set of translation tables, and are permitted to use the SIMD registers. The context switch code preserves/restores neither, and so EFI calls must be made with preemption disabled, rather than only disabling migration.] Fixes: ff7a1679 ("arm64: efi: Execute runtime services from a dedicated stack") Signed-off-by:
Pierre Gondois <pierre.gondois@arm.com> Cc: <stable@vger.kernel.org> # v6.1+ Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Elvira Khabirova authored
The implementation of syscall_get_nr on mips used to ignore the task argument and return the syscall number of the calling thread instead of the target thread. The bug was exposed to user space by commit 201766a2 ("ptrace: add PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite. Link: https://github.com/strace/strace/issues/235 Fixes: c2d9f177 ("MIPS: Fix syscall_get_nr for the syscall exit tracing.") Cc: <stable@vger.kernel.org> # v3.19+ Co-developed-by:
Dmitry V. Levin <ldv@strace.io> Signed-off-by:
Dmitry V. Levin <ldv@strace.io> Signed-off-by:
Elvira Khabirova <lineprinter0@gmail.com> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
Randy Dunlap authored
When MIPS_CPS=y, MIPS_CPS_PM is not set, HOTPLUG_CPU is not set, and KEXEC=y, cps_shutdown_this_cpu() attempts to call cps_pm_enter_state(), which is not built when MIPS_CPS_PM is not set. Conditionally execute the else branch based on CONFIG_HOTPLUG_CPU to remove the build error. This build failure is from a randconfig file. mips-linux-ld: arch/mips/kernel/smp-cps.o: in function `$L162': smp-cps.c:(.text.cps_kexec_nonboot_cpu+0x31c): undefined reference to `cps_pm_enter_state' Fixes: 1447864b ("MIPS: kexec: CPS systems to halt nonboot CPUs") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Dengcheng Zhu <dzhu@wavecomp.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Cc: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
H. Nikolaus Schaller authored
This makes the driver present the clk32k signal if requested. It is needed to clock the PMU of the BCM4330 WiFi and Bluetooth module of the CI20 board. Signed-off-by:
H. Nikolaus Schaller <hns@goldelico.com> Reviewed-by:
Paul Cercueil <paul@crapouillou.net> Signed-off-by:
Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-
- Feb 18, 2023
-
-
Jan Beulich authored
Both the 4Gb-segments and the PAE-extended-CR3 one are applicable to 32-bit guests only. The PAE-extended-CR3 use, furthermore, was redundant with the PAE_MODE ELF note anyway. Signed-off-by:
Jan Beulich <jbeulich@suse.com> Reviewed-by:
Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/215515af-cfb9-3237-03ba-3312e3fa0d34@suse.com Signed-off-by:
Juergen Gross <jgross@suse.com>
-
- Feb 17, 2023
-
-
Pu Lehui authored
BPF trampoline is the critical infrastructure of the BPF subsystem, acting as a mediator between kernel functions and BPF programs. Numerous important features, such as using BPF program for zero overhead kernel introspection, rely on this key component. We can't wait to support bpf trampoline on RV64. The related tests have passed, as well as the test_verifier with no new failure ceses. Signed-off-by:
Pu Lehui <pulehui@huawei.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Tested-by:
Björn Töpel <bjorn@rivosinc.com> Acked-by:
Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/bpf/20230215135205.1411105-5-pulehui@huaweicloud.com
-