- Nov 06, 2017
-
-
Mark Rutland authored
When CONFIG_DEBUG_USER is enabled, it's possible for a user to deliberately trigger dump_instr() with a chosen kernel address. Let's avoid problems resulting from this by using get_user() rather than __get_user(), ensuring that we don't erroneously access kernel memory. So that we can use the same code to dump user instructions and kernel instructions, the common dumping code is factored out to __dump_instr(), with the fs manipulated appropriately in dump_instr() around calls to this. Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
- Nov 02, 2017
-
-
Russell King authored
Add an additional symbol to the decompressor image, which will allow future debugging of non-bootable problems similar to the one encountered with the EFI stub. Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
- Nov 01, 2017
-
-
Luc Van Oostenryck authored
ARM depends on the macros '__ARMEL__' & '__ARMEB__' being defined or not to correctly select or define endian-specific macros, structures or pieces of code. These macros are predefined by the compiler but sparse knows nothing about them and thus may pre-process files differently from what gcc would. Fix this by passing '-D__ARMEL__' or '-D__ARMEB__' to sparse, depending on the endianness of the kernel, like defined by GCC. Note: In most case it won't change anything since most ARMs use little-endian (but an allyesconfig would use big-endian!). To: Russell King <linux@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by:
Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
- Oct 24, 2017
-
-
Arnd Bergmann authored
The asm-generic/unaligned.h header provides two different implementations for accessing unaligned variables: the access_ok.h version used when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is set pretends that all pointers are in fact aligned, while the le_struct.h version convinces gcc that the alignment of a pointer is '1', to make it issue the correct load/store instructions depending on the architecture flags. On ARMv5 and older, we always use the second version, to let the compiler use byte accesses. On ARMv6 and newer, we currently use the access_ok.h version, so the compiler can use any instruction including stm/ldm and ldrd/strd that will cause an alignment trap. This trap can significantly impact performance when we have to do a lot of fixups and, worse, has led to crashes in the LZ4 decompressor code that does not have a trap handler. This adds an ARM specific version of asm/unaligned.h that uses the le_struct.h/be_struct.h implementation unconditionally. This should lead to essentially the same code on ARMv6+ as before, with the exception of using regular load/store instructions instead of the trapping instructions multi-register variants. The crash in the LZ4 decompressor code was probably introduced by the patch replacing the LZ4 implementation, commit 4e1a33b1 ("lib: update LZ4 compressor module"), so linux-4.11 and higher would be affected most. However, we probably want to have this backported to all older stable kernels as well, to help with the performance issues. There are two follow-ups that I think we should also work on, but not backport to stable kernels, first to change the asm-generic version of the header to remove the ARM special case, and second to review all other uses of CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to see if they might be affected by the same problem on ARM. Cc: stable@vger.kernel.org Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
- Oct 12, 2017
-
-
Nicolas Pitre authored
The svc instruction doesn't exist on v7m processors. Semihosting ops are invoked with the bkpt instruction instead. Signed-off-by:
Nicolas Pitre <nico@linaro.org> Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
Luc Van Oostenryck II authored
By default sparse uses the characteristics of the build machine to infer things like the wordsize. This is fine when doing native builds but for ARM it's, I suspect, very rarely the case and if the build are done on a 64bit machine we get a bunch of warnings like: 'cast truncates bits from constant value (... becomes ...)' Fix this by adding the -m32 flags for sparse. Reported-by:
Stephen Boyd <sboyd@codeaurora.org> Signed-off-by:
Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
Nicolas Pitre authored
Some nommu systems have RAM at address 0. When vectors are not located there, the very beginning of memory remains available for dynamic allocations. The memblock allocator explicitly skips the first page but the standard page allocator does not, and while it correctly returns a non-null struct page pointer for that page, page_address() gives 0 which gets confused with NULL (out of memory) by callers despite having plenty of free memory left. Signed-off-by:
Nicolas Pitre <nico@linaro.org> Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
- Sep 15, 2017
-
-
Jim Mattson authored
When emulating a nested VM-entry from L1 to L2, several control field validation checks are deferred to the hardware. Should one of these validation checks fail, vcpu_vmx_run will set the vmx->fail flag. When this happens, the L2 guest state is not loaded (even in part), and execution should continue in L1 with the next instruction after the VMLAUNCH/VMRESUME. The VMCS12 is not modified (except for the VM-instruction error field), the VMCS12 MSR save/load lists are not processed, and the CPU state is not loaded from the VMCS12 host area. Moreover, the vmcs02 exit reason is stale, so it should not be consulted for any reason. Signed-off-by:
Jim Mattson <jmattson@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Jim Mattson authored
On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the VM-instruction error field of the current VMCS), the launch state of the current VMCS is not set to "launched," and the VM-exit information fields of the current VMCS (including IDT-vectoring information and exit reason) are stale. On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit of the exit reason field), the launch state of the current VMCS is not set to "launched," and only two of the VM-exit information fields of the current VMCS are modified (exit reason and exit qualification). The remaining VM-exit information fields of the current VMCS (including IDT-vectoring information, in particular) are stale. Signed-off-by:
Jim Mattson <jmattson@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Jim Mattson authored
After a successful VM-entry, RFLAGS is cleared, with the exception of bit 1, which is always set. This is handled by load_vmcs12_host_state. Signed-off-by:
Jim Mattson <jmattson@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Davidlohr Bueso authored
For example, the following could occur, making us miss a wakeup: CPU0 CPU1 kvm_vcpu_block kvm_mips_comparecount_func [L] swait_active(&vcpu->wq) [S] prepare_to_swait(&vcpu->wq) [L] if (!kvm_vcpu_has_pending_timer(vcpu)) schedule() [S] queue_timer_int(vcpu) Ensure that the swait_active() check is not hoisted over the interrupt. Signed-off-by:
Davidlohr Bueso <dbueso@suse.de> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Davidlohr Bueso authored
Particularly because kvmppc_fast_vcpu_kick_hv() is a callback, ensure that we properly serialize wq active checks in order to avoid potentially missing a wakeup due to racing with the waiter side. Signed-off-by:
Davidlohr Bueso <dbueso@suse.de> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Davidlohr Bueso authored
During code inspection, the following potential race was seen: CPU0 CPU1 kvm_async_pf_task_wait apf_task_wake_one [L] swait_active(&n->wq) [S] prepare_to_swait(&n.wq) [L] if (!hlist_unhahed(&n.link)) schedule() [S] hlist_del_init(&n->link); Properly serialize swait_active() checks such that a wakeup is not missed. Signed-off-by:
Davidlohr Bueso <dbueso@suse.de> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Davidlohr Bueso authored
A comment might serve future readers. Signed-off-by:
Davidlohr Bueso <dbueso@suse.de> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Jan H. Schönherr authored
The value of the guest_irq argument to vmx_update_pi_irte() is ultimately coming from a KVM_IRQFD API call. Do not BUG() in vmx_update_pi_irte() if the value is out-of bounds. (Especially, since KVM as a whole seems to hang after that.) Instead, print a message only once if we find that we don't have a route for a certain IRQ (which can be out-of-bounds or within the array). This fixes CVE-2017-1000252. Fixes: efc64404 ("KVM: x86: Update IRTE for posted-interrupts") Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Guenter Roeck authored
Mainline crashes as follows when running nios2 images. On node 0 totalpages: 65536 free_area_init_node: node 0, pgdat c8408fa0, node_mem_map c8726000 Normal zone: 512 pages used for memmap Normal zone: 0 pages reserved Normal zone: 65536 pages, LIFO batch:15 Unable to handle kernel NULL pointer dereference at virtual address 00000000 ea = c8003cb0, ra = c81cbf40, cause = 15 Kernel panic - not syncing: Oops Problem is seen because get_cycles() is called before the timer it depends on is initialized. Returning 0 in that situation fixes the problem. Fixes: 33d72f38 ("init/main.c: extract early boot entropy from the ..") Cc: Laura Abbott <labbott@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by:
Guenter Roeck <linux@roeck-us.net>
-
Tobias Klauser authored
Allow earlycon to be used on the JTAG UART present in the 3c120 GHRD. Signed-off-by:
Tobias Klauser <tklauser@distanz.ch>
-
Jim Mattson authored
If L1 does not specify the "use TPR shadow" VM-execution control in vmcs12, then L0 must specify the "CR8-load exiting" and "CR8-store exiting" VM-execution controls in vmcs02. Failure to do so will give the L2 VM unrestricted read/write access to the hardware CR8. This fixes CVE-2017-12154. Signed-off-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
David Hildenbrand <david@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- Sep 14, 2017
-
-
Paul Mackerras authored
This fixes the emulation of the dcbz instruction in the alignment interrupt handler. The error was that we were comparing just the instruction type field of op.type rather than the whole thing, and therefore the comparison "type != CACHEOP + DCBZ" was always true. Fixes: 31bfdb03 ("powerpc: Use instruction emulation infrastructure to handle alignment faults") Signed-off-by:
Paul Mackerras <paulus@ozlabs.org> Tested-by:
Michal Sojka <sojkam1@fel.cvut.cz> Tested-by:
Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Wanpeng Li authored
qemu-system-x86-8600 [004] d..1 7205.687530: kvm_entry: vcpu 2 qemu-system-x86-8600 [004] .... 7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e qemu-system-x86-8600 [004] .... 7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0 qemu-system-x86-8600 [004] .... 7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e qemu-system-x86-8600 [004] .N.. 7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018 kworker/4:2-7814 [004] .... 7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000 qemu-system-x86-8600 [004] .... 7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018 qemu-system-x86-8600 [004] d..1 7205.687711: kvm_entry: vcpu 2 After running some memory intensive workload in guest, I catch the kworker which completes the GUP too quickly, and queues an "Page Ready" #PF exception after the "Page not Present" exception before the next vmentry as the above trace which will result in #DF injected to guest. This patch fixes it by clearing the queue for "Page not Present" if "Page Ready" occurs before the next vmentry since the GUP has already got the required page and shadow page table has already been fixed by "Page Ready" handler. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by:
Wanpeng Li <wanpeng.li@hotmail.com> Fixes: 7c90705b ("KVM: Inject asynchronous page fault into a PV guest if page is swapped out.") [Changed indentation and added clearing of injected. - Radim] Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Wanpeng Li authored
Don't block vCPU if there is pending exception. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by:
Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Suravee Suthikulpanit authored
SVM AVIC hardware accelerates guest write to APIC_EOI register (for edge-trigger interrupt), which means it does not trap to KVM. So, only enable SVM AVIC only in split irqchip mode. (e.g. launching qemu w/ option '-machine kernel_irqchip=split'). Suggested-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Fixes: 44a95dae ("KVM: x86: Detect and Initialize AVIC support") [Removed pr_debug - Radim.] Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Christoph Hellwig authored
... and __initconst if applicable. Based on similar work for an older kernel in the Grsecurity patch. [JD: fix toshiba-wmi build] [JD: add htcpen] [JD: move __initconst where checkscript wants it] Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jean Delvare <jdelvare@suse.de>
-
Prakash Gupta authored
The stacktraces always begin as follows: [<c00117b4>] save_stack_trace_tsk+0x0/0x98 [<c0011870>] save_stack_trace+0x24/0x28 ... This is because the stack trace code includes the stack frames for itself. This is incorrect behaviour, and also leads to "skip" doing the wrong thing (which is the number of stack frames to avoid recording.) Perversely, it does the right thing when passed a non-current thread. Fix this by ensuring that we have a known constant number of frames above the main stack trace function, and always skip these. This was fixed for arch arm by commit 3683f44c ("ARM: stacktrace: avoid listing stacktrace functions in stacktrace") Link: http://lkml.kernel.org/r/1504078343-28754-1-git-send-email-guptap@codeaurora.org Signed-off-by:
Prakash Gupta <guptap@codeaurora.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's primary motivation was to allow users to tell that an allocation is short lived and so the allocator can try to place such allocations close together and prevent long term fragmentation. As much as this sounds like a reasonable semantic it becomes much less clear when to use the highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the context holding that memory sleep? Can it take locks? It seems there is no good answer for those questions. The current implementation of GFP_TEMPORARY is basically GFP_KERNEL | __GFP_RECLAIMABLE which in itself is tricky because basically none of the existing caller provide a way to reclaim the allocated memory. So this is rather misleading and hard to evaluate for any benefits. I have checked some random users and none of them has added the flag with a specific justification. I suspect most of them just copied from other existing users and others just thought it might be a good idea to use without any measuring. This suggests that GFP_TEMPORARY just motivates for cargo cult usage without any reasoning. I believe that our gfp flags are quite complex already and especially those with highlevel semantic should be clearly defined to prevent from confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and replace all existing users to simply use GFP_KERNEL. Please note that SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and so they will be placed properly for memory fragmentation prevention. I can see reasons we might want some gfp flag to reflect shorterm allocations but I propose starting from a clear semantic definition and only then add users with proper justification. This was been brought up before LSF this year by Matthew [1] and it turned out that GFP_TEMPORARY really doesn't have a clear semantic. It seems to be a heuristic without any measured advantage for most (if not all) its current users. The follow up discussion has revealed that opinions on what might be temporary allocation differ a lot between developers. So rather than trying to tweak existing users into a semantic which they haven't expected I propose to simply remove the flag and start from scratch if we really need a semantic for short term allocations. [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org [akpm@linux-foundation.org: fix typo] [akpm@linux-foundation.org: coding-style fixes] [sfr@canb.auug.org.au: drm/i915: fix up] Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org Signed-off-by:
Michal Hocko <mhocko@suse.com> Signed-off-by:
Stephen Rothwell <sfr@canb.auug.org.au> Acked-by:
Mel Gorman <mgorman@suse.de> Acked-by:
Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Neil Brown <neilb@suse.de> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Sep 13, 2017
-
-
Dan Carpenter authored
The intention is to return negative error codes. "pid" is already negative but we accidentally negate it again back to positive. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
Dan Carpenter authored
Static checkers would urge us to add curly braces to this code, but actually the code works correctly. It just isn't indented right. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
Thomas Meyer authored
When building a dynamic kernel image use relative symbols with MODVERSIONS. Signed-off-by:
Thomas Meyer <thomas@m3y3r.de> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
Thomas Meyer authored
Debian's gcc defaults to pie. The global Makefile already defines the -fno-pie option. Link UML dynamic kernel image also with -no-pie to fix the build. Signed-off-by:
Thomas Meyer <thomas@m3y3r.de> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
Thomas Meyer authored
Explicitly export symbols so modpost doesn't complain. Signed-off-by:
Thomas Meyer <thomas@m3y3r.de> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
James Pack authored
Signed-off-by:
James Pack <jpack61108@gmail.com> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
Krzysztof Kozlowski authored
Remove old, dead Kconfig option INET_LRO. It is gone since commit 7bbf3cae ("ipv4: Remove inet_lro library"). Signed-off-by:
Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
Thomas Meyer authored
Hard code max size. Taken from https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/common/x86-xstate.h Signed-off-by:
Thomas Meyer <thomas@m3y3r.de> Signed-off-by:
Richard Weinberger <richard@nod.at>
-
Suravee Suthikulpanit authored
Modify struct kvm_x86_ops.arch.apicv_active() to take struct kvm_vcpu pointer as parameter in preparation to subsequent changes. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Suravee Suthikulpanit authored
Preparing the base code for subsequent changes. This does not change existing logic. Signed-off-by:
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Radim Krčmář authored
Clang resolves __builtin_constant_p() to false even if the expression is constant in the end. The only purpose of that expression was to differentiate a case where the following expression couldn't be checked at compile-time, so we can just remove the check. Clang handles the following two correctly. Turn it into BUG_ON if there are any more problems with this. Fixes: d6321d49 ("KVM: x86: generalize guest_cpuid_has_ helpers") Reported-by:
Dmitry Vyukov <dvyukov@google.com> Reviewed-by:
David Hildenbrand <david@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Jan H. Schönherr authored
When user space sets kvm_run->immediate_exit, KVM is supposed to return quickly. However, when a vCPU is in KVM_MP_STATE_UNINITIALIZED, the value is not considered and the vCPU blocks. Fix that oversight. Fixes: 460df4c1 ("KVM: race-free exit from KVM_RUN without POSIX signals") Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Jan H. Schönherr authored
KVM API says that KVM_RUN will return with -EINTR when a signal is pending. However, if a vCPU is in KVM_MP_STATE_UNINITIALIZED, then the return value is unconditionally -EAGAIN. Copy over some code from vcpu_run(), so that the case of a pending signal results in the expected return value. Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Jan H. Schönherr authored
Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Fixes: f6511935 ("KVM: SVM: Add checks for IO instructions") Reviewed-by:
David Hildenbrand <david@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Joerg Roedel authored
The commit 9dd21e104bc ('KVM: x86: simplify handling of PKRU') removed all users and providers of that call-back, but didn't remove it. Remove it now. Signed-off-by:
Joerg Roedel <jroedel@suse.de> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-