- Nov 22, 2022
-
-
Alexandre Belloni authored
When using dash as /bin/sh, the CC_HAS_ASM_GOTO_TIED_OUTPUT test fails with a syntax error which is not the one we are looking for: <stdin>: In function ‘foo’: <stdin>:1:29: warning: missing terminating " character <stdin>:1:29: error: missing terminating " character <stdin>:2:5: error: expected ‘:’ before ‘+’ token <stdin>:2:7: warning: missing terminating " character <stdin>:2:7: error: missing terminating " character <stdin>:2:5: error: expected declaration or statement at end of input Removing '\n' solves this. Fixes: 1aa0e8b1 ("Kconfig: Add option for asm goto w/ tied outputs to workaround clang-13 bug") Signed-off-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org>
-
- Oct 21, 2022
-
-
Colin Ian King authored
There is a spelling mistake in a Kconfig description. Fix it. Link: https://lkml.kernel.org/r/20221007204339.2757753-1-colin.i.king@gmail.com Signed-off-by:
Colin Ian King <colin.i.king@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Oct 12, 2022
-
-
Ren Zhijie authored
Commit 3c07bfce92a5 ("proc: make config PROC_CHILDREN depend on PROC_FS") make config PROC_CHILDREN depend on PROC_FS. When CONFIG_PROC_FS is not set and CONFIG_CHECKPOINT_RESTORE=y, make menuconfig screams like this: WARNING: unmet direct dependencies detected for PROC_CHILDREN Depends on [n]: PROC_FS [=n] Selected by [y]: - CHECKPOINT_RESTORE [=y] CHECKPOINT_RESTORE would select PROC_CHILDREN which depends on PROC_FS, so add depends on PROC_FS to CHECKPOINT_RESTORE to fix this. Link: https://lkml.kernel.org/r/20220929070057.59044-1-renzhijie2@huawei.com Fixes: 3c07bfce92a5 ("proc: make config PROC_CHILDREN depend on PROC_FS") Signed-off-by:
Ren Zhijie <renzhijie2@huawei.com> Reviewed-by:
Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Oct 03, 2022
-
-
Zhou jie authored
The void pointer object can be directly assigned to different structure objects, it does not need to be cast. Link: https://lkml.kernel.org/r/20220928014539.11046-1-zhoujie@nfschina.com Signed-off-by:
Zhou jie <zhoujie@nfschina.com> Reviewed-by:
Andrew Halaney <ahalaney@redhat.com> Acked-by:
Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Johannes Weiner authored
Since 2d1c4980 ("mm: memcontrol: make swap tracking an integral part of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP. Update the sites accordingly and drop the symbol. [ While touching the docs, remove two references to CONFIG_MEMCG_KMEM, which hasn't been a user-visible symbol for over half a decade. ] Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Acked-by:
Shakeel Butt <shakeelb@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Alexander Potapenko authored
kmsan_init_shadow() scans the mappings created at boot time and creates metadata pages for those mappings. When the memblock allocator returns pages to pagealloc, we reserve 2/3 of those pages and use them as metadata for the remaining 1/3. Once KMSAN starts, every page allocated by pagealloc has its associated shadow and origin pages. kmsan_initialize() initializes the bookkeeping for init_task and enables KMSAN. Link: https://lkml.kernel.org/r/20220915150417.722975-18-glider@google.com Signed-off-by:
Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Sep 29, 2022
-
-
Jason A. Donenfeld authored
The full RNG initialization relies on some timestamps, made possible with initialization functions like time_init() and timekeeping_init(). However, these are only available rather late in initialization. Meanwhile, other things, such as memory allocator functions, make use of the RNG much earlier. So split RNG initialization into two phases. We can provide arch randomness very early on, and then later, after timekeeping and such are available, initialize the rest. This ensures that, for example, slabs are properly randomized if RDRAND is available. Without this, CONFIG_SLAB_FREELIST_RANDOM=y loses a degree of its security, because its random seed is potentially deterministic, since it hasn't yet incorporated RDRAND. It also makes it possible to use a better seed in kfence, which currently relies on only the cycle counter. Another positive consequence is that on systems with RDRAND, running with CONFIG_WARN_ALL_UNSEEDED_RANDOM=y results in no warnings at all. One subtle side effect of this change is that on systems with no RDRAND, RDTSC is now only queried by random_init() once, committing the moment of the function call, instead of multiple times as before. This is intentional, as the multiple RDTSCs in a loop before weren't accomplishing very much, with jitter being better provided by try_to_generate_entropy(). Plus, filling blocks with RDTSC is still being done in extract_entropy(), which is necessarily called before random bytes are served anyway. Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by:
Kees Cook <keescook@chromium.org> Reviewed-by:
Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by:
Jason A. Donenfeld <Jason@zx2c4.com>
-
- Sep 28, 2022
-
-
Masahiro Yamada authored
Now that UTS_VERSION was separated out, this header can be generated much earlier, and probably the top Makefile is a better place to do it than init/Makefile. Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org>
-
Masahiro Yamada authored
Kbuild builds init/built-in.a twice; first during the ordinary directory descending, second from scripts/link-vmlinux.sh. We do this because UTS_VERSION contains the build version and the timestamp. We cannot update it during the normal directory traversal since we do not yet know if we need to update vmlinux. UTS_VERSION is temporarily calculated, but omitted from the update check. Otherwise, vmlinux would be rebuilt every time. When Kbuild results in running link-vmlinux.sh, it increments the version number in the .version file and takes the timestamp at that time to really fix UTS_VERSION. However, updating the same file twice is a footgun. To avoid nasty timestamp issues, all build artifacts that depend on init/built-in.a are atomically generated in link-vmlinux.sh, where some of them do not need rebuilding. To fix this issue, this commit changes as follows: [1] Split UTS_VERSION out to include/generated/utsversion.h from include/generated/compile.h include/generated/utsversion.h is generated just before the vmlinux link. It is generated under include/generated/ because some decompressors (s390, x86) use UTS_VERSION. [2] Split init_uts_ns and linux_banner out to init/version-timestamp.c from init/version.c init_uts_ns and linux_banner contain UTS_VERSION. During the ordinary directory descending, they are compiled with __weak and used to determine if vmlinux needs relinking. Just before the vmlinux link, they are compiled without __weak to embed the real version and timestamp. Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org>
-
Masahiro Yamada authored
This is unneeded since commit 073a9ecb ("init/version.c: remove Version_<LINUX_VERSION_CODE> symbol"). Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org>
-
Miguel Ojeda authored
Having most of the new files in place, we now enable Rust support in the build system, including `Kconfig` entries related to Rust, the Rust configuration printer and a few other bits. Reviewed-by:
Kees Cook <keescook@chromium.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Tested-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Co-developed-by:
Alex Gaynor <alex.gaynor@gmail.com> Signed-off-by:
Alex Gaynor <alex.gaynor@gmail.com> Co-developed-by:
Finn Behrens <me@kloenk.de> Signed-off-by:
Finn Behrens <me@kloenk.de> Co-developed-by:
Adam Bratschi-Kaye <ark.email@gmail.com> Signed-off-by:
Adam Bratschi-Kaye <ark.email@gmail.com> Co-developed-by:
Wedson Almeida Filho <wedsonaf@google.com> Signed-off-by:
Wedson Almeida Filho <wedsonaf@google.com> Co-developed-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Co-developed-by:
Sven Van Asbroeck <thesven73@gmail.com> Signed-off-by:
Sven Van Asbroeck <thesven73@gmail.com> Co-developed-by:
Gary Guo <gary@garyguo.net> Signed-off-by:
Gary Guo <gary@garyguo.net> Co-developed-by:
Boris-Chengbiao Zhou <bobo1239@web.de> Signed-off-by:
Boris-Chengbiao Zhou <bobo1239@web.de> Co-developed-by:
Boqun Feng <boqun.feng@gmail.com> Signed-off-by:
Boqun Feng <boqun.feng@gmail.com> Co-developed-by:
Douglas Su <d0u9.su@outlook.com> Signed-off-by:
Douglas Su <d0u9.su@outlook.com> Co-developed-by:
Dariusz Sosnowski <dsosnowski@dsosnowski.pl> Signed-off-by:
Dariusz Sosnowski <dsosnowski@dsosnowski.pl> Co-developed-by:
Antonio Terceiro <antonio.terceiro@linaro.org> Signed-off-by:
Antonio Terceiro <antonio.terceiro@linaro.org> Co-developed-by:
Daniel Xu <dxu@dxuuu.xyz> Signed-off-by:
Daniel Xu <dxu@dxuuu.xyz> Co-developed-by:
Björn Roy Baron <bjorn3_gh@protonmail.com> Signed-off-by:
Björn Roy Baron <bjorn3_gh@protonmail.com> Co-developed-by:
Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by:
Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by:
Miguel Ojeda <ojeda@kernel.org>
-
- Sep 27, 2022
-
-
Liam R. Howlett authored
Patch series "Introducing the Maple Tree" The maple tree is an RCU-safe range based B-tree designed to use modern processor cache efficiently. There are a number of places in the kernel that a non-overlapping range-based tree would be beneficial, especially one with a simple interface. If you use an rbtree with other data structures to improve performance or an interval tree to track non-overlapping ranges, then this is for you. The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf nodes. With the increased branching factor, it is significantly shorter than the rbtree so it has fewer cache misses. The removal of the linked list between subsequent entries also reduces the cache misses and the need to pull in the previous and next VMA during many tree alterations. The first user that is covered in this patch set is the vm_area_struct, where three data structures are replaced by the maple tree: the augmented rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The long term goal is to reduce or remove the mmap_lock contention. The plan is to get to the point where we use the maple tree in RCU mode. Readers will not block for writers. A single write operation will be allowed at a time. A reader re-walks if stale data is encountered. VMAs would be RCU enabled and this mode would be entered once multiple tasks are using the mm_struct. Davidlor said : Yes I like the maple tree, and at this stage I don't think we can ask for : more from this series wrt the MM - albeit there seems to still be some : folks reporting breakage. Fundamentally I see Liam's work to (re)move : complexity out of the MM (not to say that the actual maple tree is not : complex) by consolidating the three complimentary data structures very : much worth it considering performance does not take a hit. This was very : much a turn off with the range locking approach, which worst case scenario : incurred in prohibitive overhead. Also as Liam and Matthew have : mentioned, RCU opens up a lot of nice performance opportunities, and in : addition academia[1] has shown outstanding scalability of address spaces : with the foundation of replacing the locked rbtree with RCU aware trees. A similar work has been discovered in the academic press https://pdos.csail.mit.edu/papers/rcuvm:asplos12.pdf Sheer coincidence. We designed our tree with the intention of solving the hardest problem first. Upon settling on a b-tree variant and a rough outline, we researched ranged based b-trees and RCU b-trees and did find that article. So it was nice to find reassurances that we were on the right path, but our design choice of using ranges made that paper unusable for us. This patch (of 70): The maple tree is an RCU-safe range based B-tree designed to use modern processor cache efficiently. There are a number of places in the kernel that a non-overlapping range-based tree would be beneficial, especially one with a simple interface. If you use an rbtree with other data structures to improve performance or an interval tree to track non-overlapping ranges, then this is for you. The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf nodes. With the increased branching factor, it is significantly shorter than the rbtree so it has fewer cache misses. The removal of the linked list between subsequent entries also reduces the cache misses and the need to pull in the previous and next VMA during many tree alterations. The first user that is covered in this patch set is the vm_area_struct, where three data structures are replaced by the maple tree: the augmented rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The long term goal is to reduce or remove the mmap_lock contention. The plan is to get to the point where we use the maple tree in RCU mode. Readers will not block for writers. A single write operation will be allowed at a time. A reader re-walks if stale data is encountered. VMAs would be RCU enabled and this mode would be entered once multiple tasks are using the mm_struct. There is additional BUG_ON() calls added within the tree, most of which are in debug code. These will be replaced with a WARN_ON() call in the future. There is also additional BUG_ON() calls within the code which will also be reduced in number at a later date. These exist to catch things such as out-of-range accesses which would crash anyways. Link: https://lkml.kernel.org/r/20220906194824.2110408-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20220906194824.2110408-2-Liam.Howlett@oracle.com Signed-off-by:
Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by:
David Howells <dhowells@redhat.com> Tested-by:
Sven Schnelle <svens@linux.ibm.com> Tested-by:
Yu Zhao <yuzhao@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: SeongJae Park <sj@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Sep 12, 2022
-
-
wuchi authored
As my_inptr is only used in __init function unpack_to_rootfs(), mark it as __initdata to allow it be freed after boot. Link: https://lkml.kernel.org/r/20220827071116.83078-1-wuchi.zero@gmail.com Signed-off-by:
wuchi <wuchi.zero@gmail.com> Reviewed-by:
David Disseldorp <ddiss@suse.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Martin Wilck <mwilck@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Wolfram Sang authored
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Link: https://lkml.kernel.org/r/20220818210200.8203-1-wsa+renesas@sang-engineering.com Signed-off-by:
Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Li Zhe authored
In commit 2f1ee091 ("Revert "mm: use early_pfn_to_nid in page_ext_init""), we call page_ext_init() after page_alloc_init_late() to avoid some panic problem. It seems that we cannot track early page allocations in current kernel even if page structure has been initialized early. This patch introduces a new boot parameter 'early_page_ext' to resolve this problem. If we pass it to the kernel, page_ext_init() will be moved up and the feature 'deferred initialization of struct pages' will be disabled to initialize the page allocator early and prevent the panic problem above. It can help us to catch early page allocations. This is useful especially when we find that the free memory value is not the same right after different kernel booting. [akpm@linux-foundation.org: fix section issue by removing __meminitdata] Link: https://lkml.kernel.org/r/20220825102714.669-1-lizhe.67@bytedance.com Signed-off-by:
Li Zhe <lizhe.67@bytedance.com> Suggested-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Vlastimil Babka <vbabka@suse.cz> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Sep 07, 2022
-
-
Peter Zijlstra authored
handle_initrd() marks itself as PF_FREEZER_SKIP in order to ensure that the UMH, which is going to freeze the system, doesn't indefinitely wait for it's caller. Rework things by adding UMH_FREEZABLE to indicate the completion is freezable. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20220822114648.791019324@infradead.org
-
- Sep 05, 2022
-
-
Nicholas Piggin authored
Distro kernels tend to be moving to VIRT_CPU_ACCOUNTING_GEN, and there is not much reason why PPC64 should be special here. Remove the special case and make the ppc64 and pseries defconfigs use GEN accounting (others will use TICK, as-per Kconfig defaults). VIRT_CPU_ACCOUNTING_NATIVE does provide scaled vtime and stolen time apportioned between system and user time, and vtime accounting is not unconditionally enabled, and possibly other things. But it would be better at this point to extend GEN to cover important missing features rather than directing users back to a less used option. Signed-off-by:
Nicholas Piggin <npiggin@gmail.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220902085316.2071519-4-npiggin@gmail.com
-
- Aug 23, 2022
-
-
Mark Rutland authored
On arm64, "rodata=full" has been suppored (but not documented) since commit: c55191e9 ("arm64: mm: apply r/o permissions of VM areas to its linear alias as well") As it's necessary to determine the rodata configuration early during boot, arm64 has an early_param() handler for this, whereas init/main.c has a __setup() handler which is run later. Unfortunately, this split meant that since commit: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") ... passing "rodata=full" would result in a spurious warning from the __setup() handler (though RO permissions would be configured appropriately). Further, "rodata=full" has been broken since commit: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") ... which caused strtobool() to parse "full" as false (in addition to many other values not documented for the "rodata=" kernel parameter. This patch fixes this breakage by: * Moving the core parameter parser to an __early_param(), such that it is available early. * Adding an (optional) arch hook which arm64 can use to parse "full". * Updating the documentation to mention that "full" is valid for arm64. * Having the core parameter parser handle "on" and "off" explicitly, such that any undocumented values (e.g. typos such as "ful") are reported as errors rather than being silently accepted. Note that __setup() and early_param() have opposite conventions for their return values, where __setup() uses 1 to indicate a parameter was handled and early_param() uses 0 to indicate a parameter was handled. Fixes: f9a40b08 ("init/main.c: return 1 from handled __setup() functions") Fixes: 0d6ea3ac ("lib/kstrtox.c: add "false"/"true" support to kstrtobool()") Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jagdish Gediya <jvgediya@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Reviewed-by:
Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220817154022.3974645-1-mark.rutland@arm.com Signed-off-by:
Will Deacon <will@kernel.org>
-
- Aug 21, 2022
-
-
Nick Desaulniers authored
GCC has supported asm goto since 4.5, and Clang has since version 9.0.0. The minimum supported versions of these tools for the build according to Documentation/process/changes.rst are 5.1 and 11.0.0 respectively. Remove the feature detection script, Kconfig option, and clean up some fallback code that is no longer supported. The removed script was also testing for a GCC specific bug that was fixed in the 4.7 release. Also remove workarounds for bpftrace using clang older than 9.0.0, since other BPF backend fixes are required at this point. Link: https://lore.kernel.org/lkml/CAK7LNATSr=BXKfkdW8f-H5VT_w=xBpT2ZQcZ7rm6JfkdE+QnmA@mail.gmail.com/ Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48637 Acked-by:
Borislav Petkov <bp@suse.de> Suggested-by:
Masahiro Yamada <masahiroy@kernel.org> Suggested-by:
Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Ingo Molnar <mingo@kernel.org> Reviewed-by:
Nathan Chancellor <nathan@kernel.org> Reviewed-by:
Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jul 27, 2022
-
-
Baruch Siach authored
CONFIG_KALLSYMS_ALL is required for kernel live patching which is a common use case that is enabled in some major distros. Update the Kconfig help text to reflect that. While at it, s/e.g./i.e./ to match the text intention. Signed-off-by:
Baruch Siach <baruch@tkos.co.il> Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org>
-
Nick Desaulniers authored
The difference in most compilers between `-O3` and `-O2` is mostly down to whether loops with statically determinable trip counts are fully unrolled vs unrolled to a multiple of SIMD width. This patch is effectively a revert of commit 15f5db60 ("kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC") without re-adding ARCH_CFLAGS Ever since commit cfdbc2e1 ("ARC: Build system: Makefiles, Kconfig, Linker script") ARC has been built with -O3, though the reason for doing so was not specified in inline comments or the commit message. This commit does not re-add -O3 to arch/arc/Makefile. Folks looking to experiment with `-O3` (or any compiler flag for that matter) may pass them along to the command line invocation of make: $ make KCFLAGS=-O3 Code that looks to re-add an explicit Kconfig option for `-O3` should provide: 1. A rigorous and reproducible performance profile of a reasonable userspace workload that demonstrates a hot loop in the kernel that would benefit from `-O3` over `-O2`. 2. Disassembly of said loop body before and after. 3. Provides stats on terms of increase in file size. Link: https://lore.kernel.org/linux-kbuild/CA+55aFz2sNBbZyg-_i8_Ldr2e8o9dfvdSfHHuRzVtP2VMAUWPg@mail.gmail.com/ Signed-off-by:
Nick Desaulniers <ndesaulniers@google.com> Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org>
-
- Jul 23, 2022
-
-
Tejun Heo authored
3942a9bd ("locking, rcu, cgroup: Avoid synchronize_sched() in __cgroup_procs_write()") disabled percpu operations on threadgroup_rwsem because the impiled synchronize_rcu() on write locking was pushing up the latencies too much for android which constantly moves processes between cgroups. This makes the hotter paths - fork and exit - slower as they're always forced into the slow path. There is no reason to force this on everyone especially given that more common static usage pattern can now completely avoid write-locking the rwsem. Write-locking is elided when turning on and off controllers on empty sub-trees and CLONE_INTO_CGROUP enables seeding a cgroup without grabbing the rwsem. Restore the default percpu operations and introduce the mount option "favordynmods" and config option CGROUP_FAVOR_DYNMODS for users who need lower latencies for the dynamic operations. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Michal Koutn� <mkoutny@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: Oleg Nesterov <oleg@redhat.com>
-
- Jul 18, 2022
-
-
Dan Moulding authored
The gethostname system call returns the hostname for the current machine. However, the kernel has no mechanism to initially set the current machine's name in such a way as to guarantee that the first userspace process to call gethostname will receive a meaningful result. It relies on some unspecified userspace process to first call sethostname before gethostname can produce a meaningful name. Traditionally the machine's hostname is set from userspace by the init system. The init system, in turn, often relies on a configuration file (say, /etc/hostname) to provide the value that it will supply in the call to sethostname. Consequently, the file system containing /etc/hostname usually must be available before the hostname will be set. There may, however, be earlier userspace processes that could call gethostname before the file system containing /etc/hostname is mounted. Such a process will get some other, likely meaningless, name from gethostname (such as "(none)", "localhost", or "darkstar"). A real-world example where this can happen, and lead to undesirable results, is with mdadm. When assembling arrays, mdadm distinguishes between "local" arrays and "foreign" arrays. A local array is one that properly belongs to the current machine, and a foreign array is one that is (possibly temporarily) attached to the current machine, but properly belongs to some other machine. To determine if an array is local or foreign, mdadm may compare the "homehost" recorded on the array with the current hostname. If mdadm is run before the root file system is mounted, perhaps because the root file system itself resides on an md-raid array, then /etc/hostname isn't yet available and the init system will not yet have called sethostname, causing mdadm to incorrectly conclude that all of the local arrays are foreign. Solving this problem *could* be delegated to the init system. It could be left up to the init system (including any init system that starts within an initramfs, if one is in use) to ensure that sethostname is called before any other userspace process could possibly call gethostname. However, it may not always be obvious which processes could call gethostname (for example, udev itself might not call gethostname, but it could via udev rules invoke processes that do). Additionally, the init system has to ensure that the hostname configuration value is stored in some place where it will be readily accessible during early boot. Unfortunately, every init system will attempt to (or has already attempted to) solve this problem in a different, possibly incorrect, way. This makes getting consistently working configurations harder for users. I believe it is better for the kernel to provide the means by which the hostname may be set early, rather than making this a problem for the init system to solve. The option to set the hostname during early startup, via a kernel parameter, provides a simple, reliable way to solve this problem. It also could make system configuration easier for some embedded systems. [dmoulding@me.com: v2] Link: https://lkml.kernel.org/r/20220506060310.7495-2-dmoulding@me.com Link: https://lkml.kernel.org/r/20220505180651.22849-2-dmoulding@me.com Signed-off-by:
Dan Moulding <dmoulding@me.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- Jul 15, 2022
-
-
Eric Biggers authored
Since the Linux RNG no longer uses sha1_transform(), the SHA-1 library is no longer needed unconditionally. Make it possible to build the Linux kernel without the SHA-1 library by putting it behind a kconfig option, and selecting this new option from the kconfig options that gate the remaining users: CRYPTO_SHA1 for crypto/sha1_generic.c, BPF for kernel/bpf/core.c, and IPV6 for net/ipv6/addrconf.c. Unfortunately, since BPF is selected by NET, for now this can only make a difference for kernels built without networking support. Signed-off-by:
Eric Biggers <ebiggers@google.com> Reviewed-by:
Jason A. Donenfeld <Jason@zx2c4.com> Acked-by:
Jakub Kicinski <kuba@kernel.org> Acked-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au>
-
- Jul 12, 2022
-
-
Christophe Leroy authored
In init/Kconfig, the part dedicated to modules is quite large. Move it into a dedicated Kconfig in kernel/module/ MODULES_TREE_LOOKUP was outside of the 'if MODULES', but as it is only used when MODULES are set, move it in with everything else to avoid confusion. MODULE_SIG_FORMAT is left in init/Kconfig because this configuration item is not used in kernel/modules/ but in kernel/ and can be selected independently from CONFIG_MODULES. It is for instance selected from security/integrity/ima/Kconfig. Signed-off-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by:
Luis Chamberlain <mcgrof@kernel.org>
-
- Jul 07, 2022
-
-
Mauro Carvalho Chehab authored
Changeset f5461124 ("Documentation: move watch_queue to core-api") renamed: Documentation/watch_queue.rst to: Documentation/core-api/watch_queue.rst. Update the cross-references accordingly. Fixes: f5461124 ("Documentation: move watch_queue to core-api") Reviewed-by:
Randy Dunlap <rdunlap@infradead.org> Signed-off-by:
Mauro Carvalho Chehab <mchehab@kernel.org> Link: https://lore.kernel.org/r/1c220de9c58f35e815a3df9458ac2bea323c8bfb.1656234456.git.mchehab@kernel.org Signed-off-by:
Jonathan Corbet <corbet@lwn.net>
-
- Jul 02, 2022
-
-
GONG, Ruiqi authored
Fix the following Sparse warnings that got noticed when the PPC-dev patchwork was checking another patch (see the link below): init/main.c:862:1: warning: symbol 'randomize_kstack_offset' was not declared. Should it be static? init/main.c:864:1: warning: symbol 'kstack_offset' was not declared. Should it be static? Which in fact are triggered on all architectures that have HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET support (for instances x86, arm64 etc). Link: https://lore.kernel.org/lkml/e7b0d68b-914d-7283-827c-101988923929@huawei.com/T/#m49b2d4490121445ce4bf7653500aba59eefcb67f Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
GONG, Ruiqi <gongruiqi1@huawei.com> Reviewed-by:
Christophe Leroy <christophe.leroy@csgroup.eu> Fixes: 39218ff4 ("stack: Optionally randomize kernel stack offset each syscall") Signed-off-by:
Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220629060423.2515693-1-gongruiqi1@huawei.com
-
- Jun 30, 2022
-
-
Frederic Weisbecker authored
Context tracking is going to be used not only to track user transitions but also idle/IRQs/NMIs. The user tracking part will then become a separate feature. Prepare Kconfig for that. [ frederic: Apply Max Filippov feedback. ] Signed-off-by:
Frederic Weisbecker <frederic@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Uladzislau Rezki <uladzislau.rezki@sony.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Nicolas Saenz Julienne <nsaenz@kernel.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Yu Liao <liaoyu15@huawei.com> Cc: Phil Auld <pauld@redhat.com> Cc: Paul Gortmaker<paul.gortmaker@windriver.com> Cc: Alex Belits <abelits@marvell.com> Signed-off-by:
Paul E. McKenney <paulmck@kernel.org> Reviewed-by:
Nicolas Saenz Julienne <nsaenzju@redhat.com> Tested-by:
Nicolas Saenz Julienne <nsaenzju@redhat.com>
-
- Jun 20, 2022
-
-
Paul E. McKenney authored
This commit adds fields to task_struct and to rcu_tasks_percpu that will be used to avoid the task-list scan for RCU Tasks Trace grace periods, and also initializes these fields. Signed-off-by:
Paul E. McKenney <paulmck@kernel.org> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: KP Singh <kpsingh@kernel.org>
-
- Jun 09, 2022
-
-
Linus Torvalds authored
In commit 8b202ee2 ("s390: disable -Warray-bounds") the s390 people disabled the '-Warray-bounds' warning for gcc-12, because the new logic in gcc would cause warnings for their use of the S390_lowcore macro, which accesses absolute pointers. It turns out gcc-12 has many other issues in this area, so this takes that s390 warning disable logic, and turns it into a kernel build config entry instead. Part of the intent is that we can make this all much more targeted, and use this conflig flag to disable it in only particular configurations that cause problems, with the s390 case as an example: select GCC12_NO_ARRAY_BOUNDS and we could do that for other configuration cases that cause issues. Or we could possibly use the CONFIG_CC_NO_ARRAY_BOUNDS thing in a more targeted way, and disable the warning only for particular uses: again the s390 case as an example: KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds) but this ends up just doing it globally in the top-level Makefile, since the current issues are spread fairly widely all over: KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds We'll try to limit this later, since the gcc-12 problems are rare enough that *much* of the kernel can be built with it without disabling this warning. Cc: Kees Cook <keescook@chromium.org> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- May 27, 2022
-
-
Vlastimil Babka authored
After commits 7b42f104 ("mm: Kconfig: move swap and slab config options to the MM section") and 519bcb79 ("mm: Kconfig: group swap, slab, hotplug and thp options into submenus") we now have nicely organized mm related config options. I have noticed some that were still misplaced, so this moves them from various places into the new structure: VM_EVENT_COUNTERS, COMPAT_BRK, MMAP_ALLOW_UNINITIALIZED to mm/Kconfig and general MM section. SLUB_STATS to mm/Kconfig and the slab submenu. DEBUG_SLAB, SLUB_DEBUG, SLUB_DEBUG_ON to mm/Kconfig.debug and the Kernel hacking / Memory Debugging submenu. Link: https://lkml.kernel.org/r/20220525112559.1139-1-vbabka@suse.cz Signed-off-by:
Vlastimil Babka <vbabka@suse.cz> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- May 24, 2022
-
-
Masahiro Yamada authored
include/{linux,asm-generic}/export.h defines a weak symbol, __crc_* as a placeholder. Genksyms writes the version CRCs into the linker script, which will be used for filling the __crc_* symbols. The linker script format depends on CONFIG_MODULE_REL_CRCS. If it is enabled, __crc_* holds the offset to the reference of CRC. It is time to get rid of this complexity. Now that modpost parses text files (.*.cmd) to collect all the CRCs, it can generate C code that will be linked to the vmlinux or modules. Generate a new C file, .vmlinux.export.c, which contains the CRCs of symbols exported by vmlinux. It is compiled and linked to vmlinux in scripts/link-vmlinux.sh. Put the CRCs of symbols exported by modules into the existing *.mod.c files. No additional build step is needed for modules. As before, *.mod.c are compiled and linked to *.ko in scripts/Makefile.modfinal. No linker magic is used here. The new C implementation works in the same way, whether CONFIG_RELOCATABLE is enabled or not. CONFIG_MODULE_REL_CRCS is no longer needed. Previously, Kbuild invoked additional $(LD) to update the CRCs in objects, but this step is unneeded too. Signed-off-by:
Masahiro Yamada <masahiroy@kernel.org> Tested-by:
Nathan Chancellor <nathan@kernel.org> Tested-by:
Nicolas Schier <nicolas@fjasle.eu> Reviewed-by:
Nicolas Schier <nicolas@fjasle.eu> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
-
- May 19, 2022
-
-
Johannes Weiner authored
These are currently under General Setup. MM seems like a better fit. Link: https://lkml.kernel.org/r/20220510152847.230957-3-hannes@cmpxchg.org Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Seth Jennings <sjenning@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- May 18, 2022
-
-
Jason A. Donenfeld authored
Currently, start_kernel() adds latent entropy and the command line to the entropy bool *after* the RNG has been initialized, deferring when it's actually used by things like stack canaries until the next time the pool is seeded. This surely is not intended. Rather than splitting up which entropy gets added where and when between start_kernel() and random_init(), just do everything in random_init(), which should eliminate these kinds of bugs in the future. While we're at it, rename the awkwardly titled "rand_initialize()" to the more standard "random_init()" nomenclature. Reviewed-by:
Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by:
Jason A. Donenfeld <Jason@zx2c4.com>
-
- May 13, 2022
-
-
Jason A. Donenfeld authored
Currently time_init() is called after rand_initialize(), but rand_initialize() makes use of the timer on various platforms, and sometimes this timer needs to be initialized by time_init() first. In order for random_get_entropy() to not return zero during early boot when it's potentially used as an entropy source, reverse the order of these two calls. The block doing random initialization was right before time_init() before, so changing the order shouldn't have any complicated effects. Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by:
Stafford Horne <shorne@gmail.com> Signed-off-by:
Jason A. Donenfeld <Jason@zx2c4.com>
-
Peter Xu authored
We used to have USERFAULTFD configs stored in init/. It makes sense as a start because that's the default place for storing syscall related configs. However userfaultfd evolved a bit in the past few years and some more config options were added. They're no longer related to syscalls and start to be not suitable to be kept in the init/ directory anymore, because they're pure mm concepts. But it's not ideal either to keep the userfaultfd configs separate from each other. Hence this patch moves the userfaultfd configs under init/ to be under mm/ so that we'll start to group all userfaultfd configs together. We do have quite a few examples of syscall related configs that are not put under init/Kconfig: FTRACE_SYSCALLS, SWAP, FILE_LOCKING, MEMFD_CREATE.. They all reside in the dir where they're more suitable for the concept. So it seems there's no restriction to keep the role of having syscall related CONFIG_* under init/ only. Link: https://lkml.kernel.org/r/20220420144823.35277-1-peterx@redhat.com Signed-off-by:
Peter Xu <peterx@redhat.com> Suggested-by:
Johannes Weiner <hannes@cmpxchg.org> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Reviewed-by:
Axel Rasmussen <axelrasmussen@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
- May 12, 2022
-
-
Aaron Tomlin authored
Currently, only the initial module that tainted the kernel is recorded e.g. when an out-of-tree module is loaded. The purpose of this patch is to allow the kernel to maintain a record of each unloaded module that taints the kernel. So, in addition to displaying a list of linked modules (see print_modules()) e.g. in the event of a detected bad page, unloaded modules that carried a taint/or taints are displayed too. A tainted module unload count is maintained. The number of tracked modules is not fixed. This feature is disabled by default. Signed-off-by:
Aaron Tomlin <atomlin@redhat.com> Signed-off-by:
Luis Chamberlain <mcgrof@kernel.org>
-
- May 10, 2022
-
-
David Disseldorp authored
Add support for extraction of checksum-enabled "070702" cpio archives, specified in Documentation/driver-api/early-userspace/buffer-format.rst. Fail extraction if the calculated file data checksum doesn't match the value carried in the header. Link: https://lkml.kernel.org/r/20220404093429.27570-7-ddiss@suse.de Signed-off-by:
David Disseldorp <ddiss@suse.de> Suggested-by:
Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Martin Wilck <mwilck@suse.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
initramfs cpio mtime preservation, as implemented in commit 889d51a1 ("initramfs: add option to preserve mtime from initramfs cpio images"), uses a linked list to defer directory mtime processing until after all other items in the cpio archive have been processed. This is done to ensure that parent directory mtimes aren't overwritten via subsequent child creation. The lkml link below indicates that the mtime retention use case was for embedded devices with applications running exclusively out of initramfs, where the 32-bit mtime value provided a rough file version identifier. Linux distributions which discard an extracted initramfs immediately after the root filesystem has been mounted may want to avoid the unnecessary overhead. This change adds a new INITRAMFS_PRESERVE_MTIME Kconfig option, which can be used to disable on-by-default mtime retention and in turn speed up initramfs extraction, particularly for cpio archives with large directory counts. Benchmarks with a one million directory cpio archive extracted 20 times demonstrated: mean extraction time (s) std dev INITRAMFS_PRESERVE_MTIME=y 3.808 0.006 INITRAMFS_PRESERVE_MTIME unset 3.056 0.004 The above extraction times were measured using ftrace (initcall_finish - initcall_start) values for populate_rootfs() with initramfs_async disabled. [ddiss@suse.de: rebase atop dir_entry.name flexible array member and drop separate initramfs_mtime.h header] Link: https://lkml.org/lkml/2008/9/3/424 Link: https://lkml.kernel.org/r/20220404093429.27570-4-ddiss@suse.de Signed-off-by:
David Disseldorp <ddiss@suse.de> Reviewed-by:
Martin Wilck <mwilck@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
dir_entry.name is currently allocated via a separate kstrdup(). Change it to a flexible array member and allocate it along with struct dir_entry. Link: https://lkml.kernel.org/r/20220404093429.27570-3-ddiss@suse.de Signed-off-by:
David Disseldorp <ddiss@suse.de> Acked-by:
Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Wilck <mwilck@suse.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-