- Aug 24, 2018
-
-
Marc Zyngier authored
A number of the Rockchip-specific drivers (IOMMU, display controllers) are now assuming that CONFIG_PM is set, and may completely misbehave if that's not the case. Since there is hardly any reason for this configuration option not to be selected anyway, let's require it (in the same way Tegra already does). Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Olof Johansson <olof@lixom.net>
-
Marc Zyngier authored
A number of the Rockchip-specific drivers (IOMMU, display controllers) are now assuming that CONFIG_PM is set, and may completely misbehave if that's not the case. Since there is hardly any reason for this configuration option not to be selected anyway, let's require it (in the same way Tegra already does). Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Olof Johansson <olof@lixom.net>
-
Amit Kucheria authored
The idle-states binding documentation[1] mentions that the 'entry-method' property is required on 64-bit platforms and must be set to "psci". commit a13f18f5 ("Documentation: arm: Fix typo in the idle-states bindings examples") attempted to fix this earlier but clearly more is needed. Fix the cpu-capacity.txt documentation that uses the incorrect value so we don't get copy-paste errors like these. Clarify the language in idle-states.txt by removing the reference to the psci bindings that might be causing this confusion. Finally, fix devicetrees of various boards to reflect current documentation. [1] Documentation/devicetree/bindings/arm/idle-states.txt (see idle-states node) Signed-off-by:
Amit Kucheria <amit.kucheria@linaro.org> Acked-by:
Sudeep Holla <sudeep.holla@arm.com> Acked-by:
Li Yang <leoyang.li@nxp.com> Signed-off-by:
Olof Johansson <olof@lixom.net>
-
Vlastimil Babka authored
Two users have reported [1] that they have an "extremely unlikely" system with more than MAX_PA/2 memory and L1TF mitigation is not effective. Make the warning more helpful by suggesting the proper mem=X kernel boot parameter to make it effective and a link to the L1TF document to help decide if the mitigation is worth the unusable RAM. [1] https://bugzilla.suse.com/show_bug.cgi?id=1105536 Suggested-by:
Michal Hocko <mhocko@suse.com> Signed-off-by:
Vlastimil Babka <vbabka@suse.cz> Acked-by:
Michal Hocko <mhocko@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/966571f0-9d7f-43dc-92c6-a10eec7a1254@suse.cz
-
Vlastimil Babka authored
Two users have reported [1] that they have an "extremely unlikely" system with more than MAX_PA/2 memory and L1TF mitigation is not effective. In fact it's a CPU with 36bits phys limit (64GB) and 32GB memory, but due to holes in the e820 map, the main region is almost 500MB over the 32GB limit: [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000081effffff] usable Suggestions to use 'mem=32G' to enable the L1TF mitigation while losing the 500MB revealed, that there's an off-by-one error in the check in l1tf_select_mitigation(). l1tf_pfn_limit() returns the last usable pfn (inclusive) and the range check in the mitigation path does not take this into account. Instead of amending the range check, make l1tf_pfn_limit() return the first PFN which is over the limit which is less error prone. Adjust the other users accordingly. [1] https://bugzilla.suse.com/show_bug.cgi?id=1105536 Fixes: 17dbca11 ("x86/speculation/l1tf: Add sysfs reporting for l1tf") Reported-by:
George Anchev <studio@anchev.net> Reported-by:
Christopher Snowhill <kode54@gmail.com> Signed-off-by:
Vlastimil Babka <vbabka@suse.cz> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180823134418.17008-1-vbabka@suse.cz
-
Arnd Bergmann authored
The ebcdic.c file contains tables for converting between ebcdic and PC codepage 437. I could however not identify which encoding was used for the comments. This seems to be some variation of ISO_8859-1 with non-UTF-8 escape characters. I have converted this to UTF-8 by manually removing the escape characters and then running it through recode, to get the same encoding that we use for the rest of the kernel. Link: http://lkml.kernel.org/r/20180724111600.4158975-2-arnd@arndb.de Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Arnd Bergmann authored
Almost all files in the kernel are either plain text or UTF-8 encoded. A couple however are ISO_8859-1, usually just a few characters in a C comments, for historic reasons. This converts them all to UTF-8 for consistency. Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.de Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by: Simon Horman <horms@verge.net.au> [IPVS portion] Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [IIO] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by:
Rob Herring <robh@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Finn Thain authored
Also add these typos to spelling.txt so checkpatch.pl will look for them. Link: http://lkml.kernel.org/r/88af06b9de34d870cb0afc46cfd24e0458be2575.1529471371.git.fthain@telegraphics.com.au Signed-off-by:
Finn Thain <fthain@telegraphics.com.au> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Joe Perches <joe@perches.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Will Deacon authored
As of commit fd1102f0 ("mm: mmu_notifier fix for tlb_end_vma"), asm-generic/tlb.h now calls tlb_flush() from a static inline function, so we need to make sure that it's declared before #including the asm-generic header in the arch header. Signed-off-by:
Will Deacon <will.deacon@arm.com> Acked-by:
Nicholas Piggin <npiggin@gmail.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Aug 23, 2018
-
-
Masahiro Yamada authored
Commit a0f97e06 ("kbuild: enable 'make CFLAGS=...' to add additional options to CC") renamed CFLAGS to KBUILD_CFLAGS. Commit 222d394d ("kbuild: enable 'make AFLAGS=...' to add additional options to AS") renamed AFLAGS to KBUILD_AFLAGS. Commit 06c5040c ("kbuild: enable 'make CPPFLAGS=...' to add additional options to CPP") renamed CPPFLAGS to KBUILD_CPPFLAGS. For some reason, LDFLAGS was not renamed. Using a well-known variable like LDFLAGS may result in accidental override of the variable. Kbuild generally uses KBUILD_ prefixed variables for the internally appended options, so here is one more conversion to sanitize the naming convention. I did not touch Makefiles under tools/ since the tools build system is a different world. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by:
Palmer Dabbelt <palmer@sifive.com>
-
Peter Zijlstra authored
If we don't use paravirt; don't play unnecessary and complicated games to free page-tables. Suggested-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Rik van Riel <riel@surriel.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
Jann reported that x86 was missing required TLB invalidates when he hit the !*batch slow path in tlb_remove_table(). This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache) invalidates, the PowerPC-hash where this code originated and the Sparc-hash where this was subsequently used did not need that. ARM which later used this put an explicit TLB invalidate in their __p*_free_tlb() functions, and PowerPC-radix followed that example. But when we hooked up x86 we failed to consider this. Fix this by (optionally) hooking tlb_remove_table() into the TLB invalidate code. NOTE: s390 was also needing something like this and might now be able to use the generic code again. [ Modified to be on top of Nick's cleanups, which simplified this patch now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ] Fixes: 9e52fc2b ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)") Reported-by:
Jann Horn <jannh@google.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Rik van Riel <riel@surriel.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Will Deacon <will.deacon@arm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: stable@kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Mahesh Salgaonkar authored
The commit e7e81847 ("powerpc/64s: move machine check SLB flushing to mm/slb.c") introduced a bug in reloading bolted SLB entries. Unused bolted entries are stored with .esid=0 in the slb_shadow area, and that value is now used directly as the RB input to slbmte, which means the RB[52:63] index field is set to 0, which causes SLB entry 0 to be cleared. Fix this by storing the index bits in the unused bolted entries, which directs the slbmte to the right place. The SLB shadow area is also used by the hypervisor, but PAPR is okay with that, from LoPAPR v1.1, 14.11.1.3 SLB Shadow Buffer: Note: SLB is filled sequentially starting at index 0 from the shadow buffer ignoring the contents of RB field bits 52-63 Fixes: e7e81847 ("powerpc/64s: move machine check SLB flushing to mm/slb.c") Signed-off-by:
Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by:
Nicholas Piggin <npiggin@gmail.com> Reviewed-by:
Nicholas Piggin <npiggin@gmail.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Paul Mackerras authored
Commit 76fa4975 ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page", 2018-07-17) added some checks to ensure that guest DMA mappings don't attempt to map more than the guest is entitled to access. However, errors in the logic mean that legitimate guest requests to map pages for DMA are being denied in some situations. Specifically, if the first page of the range passed to mm_iommu_get() is mapped with a normal page, and subsequent pages are mapped with transparent huge pages, we end up with mem->pageshift == 0. That means that the page size checks in mm_iommu_ua_to_hpa() and mm_iommu_up_to_hpa_rm() will always fail for every page in that region, and thus the guest can never map any memory in that region for DMA, typically leading to a flood of error messages like this: qemu-system-ppc64: VFIO_MAP_DMA: -22 qemu-system-ppc64: vfio_dma_map(0x10005f47780, 0x800000000000000, 0x10000, 0x7fff63ff0000) = -22 (Invalid argument) The logic errors in mm_iommu_get() are: (a) use of 'ua' not 'ua + (i << PAGE_SHIFT)' in the find_linux_pte() call (meaning that find_linux_pte() returns the pte for the first address in the range, not the address we are currently up to); (b) use of 'pageshift' as the variable to receive the hugepage shift returned by find_linux_pte() - for a normal page this gets set to 0, leading to us setting mem->pageshift to 0 when we conclude that the pte returned by find_linux_pte() didn't match the page we were looking at; (c) comparing 'compshift', which is a page order, i.e. log base 2 of the number of pages, with 'pageshift', which is a log base 2 of the number of bytes. To fix these problems, this patch introduces 'cur_ua' to hold the current user address and uses that in the find_linux_pte() call; introduces 'pteshift' to hold the hugepage shift found by find_linux_pte(); and compares 'pteshift' with 'compshift + PAGE_SHIFT' rather than 'compshift'. The patch also moves the local_irq_restore to the point after the PTE pointer returned by find_linux_pte() has been dereferenced because otherwise the PTE could change underneath us, and adds a check to avoid doing the find_linux_pte() call once mem->pageshift has been reduced to PAGE_SHIFT, as an optimization. Fixes: 76fa4975 ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by:
Paul Mackerras <paulus@ozlabs.org> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
The Nest MMU workaround is only needed for RW upgrades. Avoid doing that for other PTE updates. We also avoid clearing the PTE while marking it invalid. This is because other page table walkers will find this PTE none and can result in unexpected behaviour due to that. Instead we clear _PAGE_PRESENT and set the software PTE bit _PAGE_INVALID. pte_present() is already updated to check for both bits. This makes sure page table walkers will find the PTE present and things like pte_pfn(pte) returns the right value. Based on an original patch from Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by:
Nicholas Piggin <npiggin@gmail.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
When splitting a huge pmd pte, we need to mark the pmd entry invalid. We can do that by clearing _PAGE_PRESENT bit. But then that will be taken as a swap pte. In order to differentiate between the two use a software pte bit when invalidating. For regular pte, due to bd5050e3 ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang") we need to mark the pte entry invalid when relaxing access permission. Instead of marking pte_none which can result in different page table walk routines possibly skipping this pte entry, invalidate it but still keep it marked present. Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
Commit 5769beaf ("powerpc/mm: Add proper pte access check helper for other platforms") replaced generic pte_access_permitted() by an arch specific one. The generic one is defined as (pte_present(pte) && (!(write) || pte_write(pte))) The arch specific one is open coded checking that _PAGE_USER and _PAGE_WRITE (_PAGE_RW) flags are set, but lacking to check that _PAGE_RO and _PAGE_PRIVILEGED are unset, leading to a useless test on targets like the 8xx which defines _PAGE_RW and _PAGE_USER as 0. Commit 5fa5b16b ("powerpc/mm/hugetlb: Use pte_access_permitted for hugetlb access check") replaced some tests performed with pte helpers by a call to pte_access_permitted(), leading to the same issue. This patch rewrites powerpc/nohash pte_access_permitted() using pte helpers. Fixes: 5769beaf ("powerpc/mm: Add proper pte access check helper for other platforms") Fixes: 5fa5b16b ("powerpc/mm/hugetlb: Use pte_access_permitted for hugetlb access check") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by:
Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by:
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Peter Zijlstra authored
Revert commits: 95b0e635 x86/mm/tlb: Always use lazy TLB mode 64482aaf x86/mm/tlb: Only send page table free TLB flush to lazy TLB CPUs ac031589 x86/mm/tlb: Make lazy TLB mode lazier 61d0beb5 x86/mm/tlb: Restructure switch_mm_irqs_off() 2ff6ddf1 x86/mm/tlb: Leave lazy TLB mode at page table free time In order to simplify the TLB invalidate fixes for x86 and unify the parts that need backporting. We'll try again later. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by:
Rik van Riel <riel@surriel.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Nick Desaulniers authored
Commit cafa0010 ("Raise the minimum required gcc version to 4.6") recently exposed a brittle part of the build for supporting non-gcc compilers. Both Clang and ICC define __GNUC__, __GNUC_MINOR__, and __GNUC_PATCHLEVEL__ for quick compatibility with code bases that haven't added compiler specific checks for __clang__ or __INTEL_COMPILER. This is brittle, as they happened to get compatibility by posing as a certain version of GCC. This broke when upgrading the minimal version of GCC required to build the kernel, to a version above what ICC and Clang claim to be. Rather than always including compiler-gcc.h then undefining or redefining macros in compiler-intel.h or compiler-clang.h, let's separate out the compiler specific macro definitions into mutually exclusive headers, do more proper compiler detection, and keep shared definitions in compiler_types.h. Fixes: cafa0010 ("Raise the minimum required gcc version to 4.6") Reported-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Suggested-by:
Eli Friedman <efriedma@codeaurora.org> Suggested-by:
Joe Perches <joe@perches.com> Signed-off-by:
Nick Desaulniers <ndesaulniers@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Aug 22, 2018
-
-
Tony Luck authored
This has been broken for an embarassingly long time (since v4.4). Just needs a couple of __init tags on functions to make the sections match up. Signed-off-by:
Tony Luck <tony.luck@intel.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Ard Biesheuvel authored
An ordinary arm64 defconfig build has ~64 KB worth of __ksymtab entries, each consisting of two 64-bit fields containing absolute references, to the symbol itself and to a char array containing its name, respectively. When we build the same configuration with KASLR enabled, we end up with an additional ~192 KB of relocations in the .init section, i.e., one 24 byte entry for each absolute reference, which all need to be processed at boot time. Given how the struct kernel_symbol that describes each entry is completely local to module.c (except for the references emitted by EXPORT_SYMBOL() itself), we can easily modify it to contain two 32-bit relative references instead. This reduces the size of the __ksymtab section by 50% for all 64-bit architectures, and gets rid of the runtime relocations entirely for architectures implementing KASLR, either via standard PIE linking (arm64) or using custom host tools (x86). Note that the binary search involving __ksymtab contents relies on each section being sorted by symbol name. This is implemented based on the input section names, not the names in the ksymtab entries, so this patch does not interfere with that. Given that the use of place-relative relocations requires support both in the toolchain and in the module loader, we cannot enable this feature for all architectures. So make it dependent on whether CONFIG_HAVE_ARCH_PREL32_RELOCATIONS is defined. Link: http://lkml.kernel.org/r/20180704083651.24360-4-ard.biesheuvel@linaro.org Signed-off-by:
Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by:
Jessica Yu <jeyu@kernel.org> Acked-by:
Michael Ellerman <mpe@ellerman.id.au> Reviewed-by:
Will Deacon <will.deacon@arm.com> Acked-by:
Ingo Molnar <mingo@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morris <james.morris@microsoft.com> Cc: James Morris <jmorris@namei.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Ard Biesheuvel authored
To allow existing C code to be incorporated into the decompressor or the UEFI stub, introduce a CPP macro that turns all EXPORT_SYMBOL_xxx declarations into nops, and #define it in places where such exports are undesirable. Note that this gets rid of a rather dodgy redefine of linux/export.h's header guard. Link: http://lkml.kernel.org/r/20180704083651.24360-3-ard.biesheuvel@linaro.org Signed-off-by:
Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by:
Nicolas Pitre <nico@linaro.org> Acked-by:
Michael Ellerman <mpe@ellerman.id.au> Reviewed-by:
Will Deacon <will.deacon@arm.com> Acked-by:
Ingo Molnar <mingo@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morris <james.morris@microsoft.com> Cc: James Morris <jmorris@namei.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Ard Biesheuvel authored
Patch series "add support for relative references in special sections", v10. This adds support for emitting special sections such as initcall arrays, PCI fixups and tracepoints as relative references rather than absolute references. This reduces the size by 50% on 64-bit architectures, but more importantly, it removes the need for carrying relocation metadata for these sections in relocatable kernels (e.g., for KASLR) that needs to be fixed up at boot time. On arm64, this reduces the vmlinux footprint of such a reference by 8x (8 byte absolute reference + 24 byte RELA entry vs 4 byte relative reference) Patch #3 was sent out before as a single patch. This series supersedes the previous submission. This version makes relative ksymtab entries dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather than trying to infer from kbuild test robot replies for which architectures it should be blacklisted. Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS, and sets it for the main architectures that are expected to benefit the most from this feature, i.e., 64-bit architectures or ones that use runtime relocations. Patch #2 add support for #define'ing __DISABLE_EXPORTS to get rid of ksymtab/kcrctab sections in decompressor and EFI stub objects when rebuilding existing C files to run in a different context. Patches #4 - #6 implement relative references for initcalls, PCI fixups and tracepoints, respectively, all of which produce sections with order ~1000 entries on an arm64 defconfig kernel with tracing enabled. This means we save about 28 KB of vmlinux space for each of these patches. [From the v7 series blurb, which included the jump_label patches as well]: For the arm64 kernel, all patches combined reduce the memory footprint of vmlinux by about 1.3 MB (using a config copied from Ubuntu that has KASLR enabled), of which ~1 MB is the size reduction of the RELA section in .init, and the remaining 300 KB is reduction of .text/.data. This patch (of 6): Before updating certain subsystems to use place relative 32-bit relocations in special sections, to save space and reduce the number of absolute relocations that need to be processed at runtime by relocatable kernels, introduce the Kconfig symbol and define it for some architectures that should be able to support and benefit from it. Link: http://lkml.kernel.org/r/20180704083651.24360-2-ard.biesheuvel@linaro.org Signed-off-by:
Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by:
Michael Ellerman <mpe@ellerman.id.au> Reviewed-by:
Will Deacon <will.deacon@arm.com> Acked-by:
Ingo Molnar <mingo@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Petr Mladek <pmladek@suse.com> Cc: James Morris <jmorris@namei.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Cc: James Morris <james.morris@microsoft.com> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
Rather than in vm_area_alloc(). To ensure that the various oddball stack-based vmas are in a good state. Some of the callers were zeroing them out, others were not. Acked-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
There are several blockable mmu notifiers which might sleep in mmu_notifier_invalidate_range_start and that is a problem for the oom_reaper because it needs to guarantee a forward progress so it cannot depend on any sleepable locks. Currently we simply back off and mark an oom victim with blockable mmu notifiers as done after a short sleep. That can result in selecting a new oom victim prematurely because the previous one still hasn't torn its memory down yet. We can do much better though. Even if mmu notifiers use sleepable locks there is no reason to automatically assume those locks are held. Moreover majority of notifiers only care about a portion of the address space and there is absolutely zero reason to fail when we are unmapping an unrelated range. Many notifiers do really block and wait for HW which is harder to handle and we have to bail out though. This patch handles the low hanging fruit. __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks are not allowed to sleep if the flag is set to false. This is achieved by using trylock instead of the sleepable lock for most callbacks and continue as long as we do not block down the call chain. I think we can improve that even further because there is a common pattern to do a range lookup first and then do something about that. The first part can be done without a sleeping lock in most cases AFAICS. The oom_reaper end then simply retries if there is at least one notifier which couldn't make any progress in !blockable mode. A retry loop is already implemented to wait for the mmap_sem and this is basically the same thing. The simplest way for driver developers to test this code path is to wrap userspace code which uses these notifiers into a memcg and set the hard limit to hit the oom. This can be done e.g. after the test faults in all the mmu notifier managed memory and set the hard limit to something really small. Then we are looking for a proper process tear down. [akpm@linux-foundation.org: coding style fixes] [akpm@linux-foundation.org: minor code simplification] Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org Signed-off-by:
Michal Hocko <mhocko@suse.com> Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp Reported-by:
David Rientjes <rientjes@google.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Dimitri Sivanich <sivanich@sgi.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Paolo Bonzini authored
Two bug fixes: 1) missing entries in the l1d_param array; this can cause a host crash if an access attempts to reach the missing entry. Future-proof the get function against any overflows as well. However, the two entries VMENTER_L1D_FLUSH_EPT_DISABLED and VMENTER_L1D_FLUSH_NOT_REQUIRED must not be accepted by the parse function, so disable them there. 2) invalid values must be rejected even if the CPU does not have the bug, so test for them before checking boot_cpu_has(X86_BUG_L1TF) ... and a small refactoring, since the .cmd field is redundant with the index in the array. Reported-by:
Bandan Das <bsd@redhat.com> Cc: stable@vger.kernel.org Fixes: a7b9020b Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Gleixner authored
Mikhail reported the following lockdep splat: WARNING: possible irq lock inversion dependency detected CPU 0/KVM/10284 just changed the state of lock: 000000000d538a88 (&st->lock){+...}, at: speculative_store_bypass_update+0x10b/0x170 but this lock was taken by another, HARDIRQ-safe lock in the past: (&(&sighand->siglock)->rlock){-.-.} and interrupts could create inverse lock ordering between them. Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&st->lock); local_irq_disable(); lock(&(&sighand->siglock)->rlock); lock(&st->lock); <Interrupt> lock(&(&sighand->siglock)->rlock); *** DEADLOCK *** The code path which connects those locks is: speculative_store_bypass_update() ssb_prctl_set() do_seccomp() do_syscall_64() In svm_vcpu_run() speculative_store_bypass_update() is called with interupts enabled via x86_virt_spec_ctrl_set_guest/host(). This is actually a false positive, because GIF=0 so interrupts are disabled even if IF=1; however, we can easily move the invocations of x86_virt_spec_ctrl_set_guest/host() into the interrupt disabled region to cure it, and it's a good idea to keep the GIF=0/IF=1 area as small and self-contained as possible. Fixes: 1f50ddb4 ("x86/speculation: Handle HT correctly on AMD") Reported-by:
Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: x86@kernel.org Cc: stable@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Virtualization of Intel SGX depends on Enclave Page Cache (EPC) management that is not yet available in the kernel, i.e. KVM support for exposing SGX to a guest cannot be added until basic support for SGX is upstreamed, which is a WIP[1]. Until SGX is properly supported in KVM, ensure a guest sees expected behavior for ENCLS, i.e. all ENCLS #UD. Because SGX does not have a true software enable bit, e.g. there is no CR4.SGXE bit, the ENCLS instruction can be executed[1] by the guest if SGX is supported by the system. Intercept all ENCLS leafs (via the ENCLS- exiting control and field) and unconditionally inject #UD. [1] https://www.spinics.net/lists/kvm/msg171333.html or https://lkml.org/lkml/2018/7/3/879 [2] A guest can execute ENCLS in the sense that ENCLS will not take an immediate #UD, but no ENCLS will ever succeed in a guest without explicit support from KVM (map EPC memory into the guest), unless KVM has a *very* egregious bug, e.g. accidentally mapped EPC memory into the guest SPTEs. In other words this patch is needed only to prevent the guest from seeing inconsistent behavior, e.g. #GP (SGX not enabled in Feature Control MSR) or #PF (leaf operand(s) does not point at EPC memory) instead of #UD on ENCLS. Intercepting ENCLS is not required to prevent the guest from truly utilizing SGX. Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20180814163334.25724-3-sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Hardware support for basic SGX virtualization adds a new execution control (ENCLS_EXITING), VMCS field (ENCLS_EXITING_BITMAP) and exit reason (ENCLS), that enables a VMM to intercept specific ENCLS leaf functions, e.g. to inject faults when the VMM isn't exposing SGX to a VM. When ENCLS_EXITING is enabled, the VMM can set/clear bits in the bitmap to intercept/allow ENCLS leaf functions in non-root, e.g. setting bit 2 in the ENCLS_EXITING_BITMAP will cause ENCLS[EINIT] to VMExit(ENCLS). Note: EXIT_REASON_ENCLS was previously added by commit 1f519992 ("KVM: VMX: add missing exit reasons"). Signed-off-by:
Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20180814163334.25724-2-sean.j.christopherson@intel.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Yi Wang authored
Substitute spaces with tab. No functional changes. Signed-off-by:
Yi Wang <wang.yi59@zte.com.cn> Reviewed-by:
Jiang Biao <jiang.biao2@zte.com.cn> Message-Id: <1534398159-48509-1-git-send-email-wang.yi59@zte.com.cn> Cc: stable@vger.kernel.org # L1TF Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Arnd Bergmann authored
Removing one of the two accesses of the maxphyaddr variable led to a harmless warning: arch/x86/kvm/x86.c: In function 'kvm_set_mmio_spte_mask': arch/x86/kvm/x86.c:6563:6: error: unused variable 'maxphyaddr' [-Werror=unused-variable] Removing the #ifdef seems to be the nicest workaround, as it makes the code look cleaner than adding another #ifdef. Fixes: 28a1f3ac ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs") Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org # L1TF Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Yoshinori Sato authored
Old timer handler use 24 (Compare match A). But current timer handler use 26 (Overflow). Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Randy Dunlap authored
Make the "defconfig" target valid for arch/h8300. Currently "make ARCH=h8300 defconfig" produces: *** Can't find default configuration "arch/h8300/defconfig"! ../scripts/kconfig/Makefile:87: recipe for target 'defconfig' failed By adding a value for KBUILD_DEFCONFIG, "make ARCH=h8300 defconfig" successfully produces a kernel .config file: *** Default configuration is based on 'edosk2674_defconfig' This is useful for Kconfig editing/testing. Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp (moderated for non-subscribers) Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Randy Dunlap authored
Drop the "const" qualifier from arch_kgdb_ops to eliminate the gcc warning (gcc version is 8.1.0). arch/h8300/kernel/kgdb.c:132:24: error: conflicting type qualifiers for 'arch_kgdb_ops' const struct kgdb_arch arch_kgdb_ops = { In file included from ../arch/h8300/kernel/kgdb.c:12: ../include/linux/kgdb.h:284:26: note: previous declaration of 'arch_kgdb_ops' was here extern struct kgdb_arch arch_kgdb_ops; Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Randy Dunlap authored
Add a "struct task_struct;" stub to arch/h8300's ptrace.h header to eliminate gcc warnings (gcc version is 8.1.0). ../arch/h8300/include/asm/ptrace.h:32:34: warning: 'struct task_struct' declared inside parameter list will not be visible outside of this definition or declaration extern long h8300_get_reg(struct task_struct *task, int regno); ../arch/h8300/include/asm/ptrace.h:33:33: warning: 'struct task_struct' declared inside parameter list will not be visible outside of this definition or declaration extern int h8300_put_reg(struct task_struct *task, int regno, Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Luc Van Oostenryck authored
All 64bit archs use unsigned long for size_t and most 32bit archs use 'unsigned int'. By default, this is what is assumed by sparse. However, on h8300 (a 32bit arch) size_t is unsigned long which can led sparse to emit wrong warnings. Fix this by passing to sparse the flag -msize-long, telling it that size_t is unsigned long. Signed-off-by:
Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Will Deacon authored
linux/kernel.h isn't needed by asm/atomic.h and will result in circular dependencies when the asm-generic atomic bitops are built around the tomic_long_t interface. Remove the broad include and replace it with linux/compiler.h for READ_ONCE etc and asm/irqflags.h for arch_local_irq_save etc. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Rob Herring authored
The DT core will call of_platform_populate, so it is not necessary for arch specific code to call it unless there are custom match entries, auxdata or parent device. Neither of those apply here, so remove the call. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by:
Rob Herring <robh@kernel.org> Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Geert Uytterhoeven authored
mm/filemap.c: In function 'clear_bit_unlock_is_negative_byte': mm/filemap.c:1181:30: warning: passing argument 2 of 'test_bit' discards 'volatile' qualifier from pointer target type [-Wdiscarded-qualifiers] return test_bit(PG_waiters, mem); ^~~ In file included from include/linux/bitops.h:38, from include/linux/kernel.h:11, from include/linux/list.h:9, from include/linux/wait.h:7, from include/linux/wait_bit.h:8, from include/linux/fs.h:6, from include/linux/dax.h:5, from mm/filemap.c:14: arch/h8300/include/asm/bitops.h:69:57: note: expected 'const long unsigned int *' but argument is of type 'volatile void *' static inline int test_bit(int nr, const unsigned long *addr) ~~~~~~~~~~~~~~~~~~~~~^~~~ Make the bitmask pointed to by the "addr" parameter volatile to fix this, like is done on other architectures. Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-
Rob Herring authored
Commit 0fa1c579 ("of/fdt: use memblock_virt_alloc for early alloc") inadvertently switched the DT unflattening allocations from memblock to bootmem which doesn't work because the unflattening happens before bootmem is initialized. Swapping the order of bootmem init and unflattening could also fix this, but removing bootmem is desired. So enable NO_BOOTMEM on h8300 like other architectures have done. Fixes: 0fa1c579 ("of/fdt: use memblock_virt_alloc for early alloc") Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp Signed-off-by:
Rob Herring <robh@kernel.org> Signed-off-by:
Yoshinori Sato <ysato@users.sourceforge.jp>
-