- Dec 19, 2016
-
-
Vineet Gupta authored
An ARC700 customer reported linux boot crashes when upgrading to bigger L1 dcache (64K from 32K). Turns out they had an aliasing VIPT config and current code only assumed 2 colours, while theirs had 4. So default to 4 colours and complain if there are fewer. Ideally this needs to be a Kconfig option, but heck that's too much of hassle for a single user. Cc: stable@vger.kernel.org Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Historical MMU revisions have been paired with Cache revision updates which are captured in MMU and Cache Build Configuration Registers respectively. This was used in boot code to check for configurations mismatches, speically in simulations (such as running with non existent caches, non pairing MMU and Cache version etc). This can instead be inferred from other cache params such as line size. So remove @ver from post processed @cpuinfo which could be used later to save soem other interesting info. Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vladimir Kondratiev authored
Signed-off-by:
Vladimir Kondratiev <vladimir.kondratiev@intel.com> Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
- Dec 14, 2016
-
-
Vineet Gupta authored
ARC HS Cores support configurable multiple interrupt priorities of upto 16 levels. In commit dec2b284 ("ARCv2: intc: Allow interruption by lowest priority interrupt") we switched to 15 which seems a bit excessive given that there would be rare hardware implementing so many preemption levels AND running Linux. It would seem that 2 levels will be more common so switch to 1 as the default priority level. This will be the "lower" priority level saving 0 for implementing NMI style support. This scheme also works in systems with more than 2 prioity levels as well. Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
And while at it - use the proper assembler macro which includes the optional irq tracing already - de-uglify'ing the code a bit Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
- Dec 13, 2016
-
-
Vineet Gupta authored
Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
- Nov 30, 2016
-
-
Alexey Brodkin authored
Up until now we had ARC PGU not enabled in axs10x defconfigs trying to not bloat kernel image again with yet another drivers and subsystems. This change configures ARC PGU (as well as DRM bits it depends on) to be built as a module and so those who need LCD screen to work on axs10x may bundle built .ko files in their target's file-system with help of the following command on host: ------------->8------------- make INSTALL_MOD_PATH=_path_to_target_fs_ modules_install ------------->8------------- and later on target with commands as simple as: ------------->8------------- modprobe adv7511.ko modprobe arcpgu.ko ------------->8------------- get LCD working. Signed-off-by:
Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
There are more ARC Linux HAPS users than Zebu ones. Same kernel would work fine on both, even with embedded DT, assuming the FPGA bitfile configuration is same Suggested-by:
Francois Bedard <fbedard@ynopsys.com> Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Noam Camus authored
This new header file is for NPS400 SoC (part of ARC architecture). The header file includes macros for save/restore of HW scheduling. The control of HW scheduling is achieved by writing core registers. This code was moved from arc/plat-eznps so it can be used from drivers/clocksource/, available only for CONFIG_EZNPS_MTM_EXT. Signed-off-by:
Noam Camus <noamca@mellanox.com> Acked-by:
Daniel Lezcano <daniel.lezcano@linaro.org>
-
Vineet Gupta authored
This adds support for - CONFIG_ARC_TIMERS : legacy 32-bit TIMER0 and TIMER1 which count UP from @CNT to @LIMIT, before optionally triggering an interrupt. These are programmed using ARC auxiliary register interface. These are present in all ARC cores (ARC700 and ARC HS38) TIMER0 serves as clockevent for all ARC linux builds. TIMER1 is used for clocksource in arc700 builds. - CONFIG_ARC_TIMERS_64BIT: 64-bit counters, RTC and GFRC found in ARC HS38 cores. These are independnet IP blocks with different programming model respectively. Link: http://lkml.kernel.org/r/20161111231132.GA4186@mai Acked-by:
Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
... which allows for use in drivers/clocksource later Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Also remove the dependency on ARCv2, to increase compile coverage for !ARCV2 builds Acked-by:
Daniel Lezcano <daniel.lezcnao@linaro.org> Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
ARC timers use aux registers for programming and this paves way for moving ARC timer drivers into drivers/clocksource Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
to allow future git mv of the driver into drivers/clocksource Acked-by:
Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
The original distinction was done as they were developed at different times and primarily because they are specific to UP (RTC) and SMP (GFRC). But given that driver handles that at runtime, (i.e. not allowing RTC as clocksource in SMP), we can simplify things a bit. Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
... don't rely on cpuinfo populated in arc boot code. This paves way for moving this code in drivers/clocksource/ And while at it, convert the WARN() to pr_warn() as sugested by Daniel Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
A standard "C" shift will be handled appropriately by the compiler depending on the endian for the build. So we don't need the explicit distinction in code Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
- Nov 29, 2016
-
-
Yuriy Kolerov authored
commit 1c3c9093 broke PAE40. Macro pfn_pte(pfn, prot) creates paddr from pfn, but the page shift was getting truncated to 32 bits since we lost the proper cast to 64 bits (for PAE400 Instead of reverting that commit, use a better helper which is 32/64 bits safe just like ARM implementation. Fixes: 1c3c9093 ("ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS") Cc: <stable@vger.kernel.org> #4.4+ Signed-off-by:
Yuriy Kolerov <yuriy.kolerov@synopsys.com> [vgupta: massaged changelog] Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
- Nov 28, 2016
-
-
Vineet Gupta authored
Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Apparenty this is coming in the way of gcc fix which inhibits the usage of LP_COUNT as a gpr. Cc: stable@vger.kernel.org Signed-off-by:
Vineet Gupta <vgupta@synopsys.com>
-
- Nov 25, 2016
-
-
John David Anglin authored
This is the second issue I noticed in reviewing the parisc TLB code. The fic instruction may use either the instruction or data TLB in flushing the instruction cache. Thus, on machines with a split TLB, we should also flush the data TLB after setting up the temporary alias registers. Although this has no functional impact, I changed the pdtlb and pitlb instructions to consistently use the index register %r0. These instructions do not support integer displacements. Tested on rp3440 and c8000. Signed-off-by:
John David Anglin <dave.anglin@bell.net> Cc: <stable@vger.kernel.org> # v3.16+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
John David Anglin authored
We are still troubled by occasional random segmentation faults and memory memory corruption on SMP machines. The causes quite a few package builds to fail on the Debian buildd machines for parisc. When gcc-6 failed to build three times in a row, I looked again at the TLB related code. I found a couple of issues. This is the first. In general, we need to ensure page table updates and corresponding TLB purges are atomic. The attached patch fixes an instance in pci-dma.c where the page table update was not guarded by the TLB lock. Tested on rp3440 and c8000. So far, no further random segmentation faults have been observed. Signed-off-by:
John David Anglin <dave.anglin@bell.net> Cc: <stable@vger.kernel.org> # v3.16+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
Helge Deller authored
Drop the open-coded sched_clock() function and replace it by the provided GENERIC_SCHED_CLOCK implementation. We have seen quite some hung tasks in the past, which seem to be fixed by this patch. Signed-off-by:
Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.7+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
John David Anglin authored
Helge reported to me the following startup crash: [ 0.000000] Linux version 4.8.0-1-parisc64-smp (debian-kernel@lists.debian.org) (gcc version 5.4.1 20161019 (GCC) ) #1 SMP Debian 4.8.7-1 (2016-11-13) [ 0.000000] The 64-bit Kernel has started... [ 0.000000] Kernel default page size is 4 KB. Huge pages enabled with 1 MB physical and 2 MB virtual size. [ 0.000000] Determining PDC firmware type: System Map. [ 0.000000] model 9000/785/J5000 [ 0.000000] Total Memory: 2048 MB [ 0.000000] Memory: 2018528K/2097152K available (9272K kernel code, 3053K rwdata, 1319K rodata, 1024K init, 840K bss, 78624K reserved, 0K cma-reserved) [ 0.000000] virtual kernel memory layout: [ 0.000000] vmalloc : 0x0000000000008000 - 0x000000003f000000 (1007 MB) [ 0.000000] memory : 0x0000000040000000 - 0x00000000c0000000 (2048 MB) [ 0.000000] .init : 0x0000000040100000 - 0x0000000040200000 (1024 kB) [ 0.000000] .data : 0x0000000040b0e000 - 0x0000000040f533e0 (4372 kB) [ 0.000000] .text : 0x0000000040200000 - 0x0000000040b0e000 (9272 kB) [ 0.768910] Brought up 1 CPUs [ 0.992465] NET: Registered protocol family 16 [ 2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000 [ 2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online [ 2.726692] Setting cache flush threshold to 1024 kB [ 2.729932] Not-handled unaligned insn 0x43ffff80 [ 2.798114] Setting TLB flush threshold to 140 kB [ 2.928039] Unaligned handler failed, ret = -1 [ 3.000419] _______________________________ [ 3.000419] < Your System ate a SPARC! Gah! > [ 3.000419] ------------------------------- [ 3.000419] \ ^__^ [ 3.000419] (__)\ )\/\ [ 3.000419] U ||----w | [ 3.000419] || || [ 9.340055] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1 [ 9.448082] task: 00000000bfd48060 task.stack: 00000000bfd50000 [ 9.528040] [ 10.760029] IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004025d154 000000004025d158 [ 10.868052] IIR: 43ffff80 ISR: 0000000000340000 IOR: 000001ff54150960 [ 10.960029] CPU: 1 CR30: 00000000bfd50000 CR31: 0000000011111111 [ 11.052057] ORIG_R28: 000000004021e3b4 [ 11.100045] IAOQ[0]: irq_exit+0x94/0x120 [ 11.152062] IAOQ[1]: irq_exit+0x98/0x120 [ 11.208031] RP(r2): irq_exit+0xb8/0x120 [ 11.256074] Backtrace: [ 11.288067] [<00000000402cd944>] cpu_startup_entry+0x1e4/0x598 [ 11.368058] [<0000000040109528>] smp_callin+0x2c0/0x2f0 [ 11.436308] [<00000000402b53fc>] update_curr+0x18c/0x2d0 [ 11.508055] [<00000000402b73b8>] dequeue_entity+0x2c0/0x1030 [ 11.584040] [<00000000402b3cc0>] set_next_entity+0x80/0xd30 [ 11.660069] [<00000000402c1594>] pick_next_task_fair+0x614/0x720 [ 11.740085] [<000000004020dd34>] __schedule+0x394/0xa60 [ 11.808054] [<000000004020e488>] schedule+0x88/0x118 [ 11.876039] [<0000000040283d3c>] rescuer_thread+0x4d4/0x5b0 [ 11.948090] [<000000004028fc4c>] kthread+0x1ec/0x248 [ 12.016053] [<0000000040205020>] end_fault_vector+0x20/0xc0 [ 12.092239] [<00000000402050c0>] _switch_to_ret+0x0/0xf40 [ 12.164044] [ 12.184036] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1 [ 12.244040] Backtrace: [ 12.244040] [<000000004021c480>] show_stack+0x68/0x80 [ 12.244040] [<00000000406f332c>] dump_stack+0xec/0x168 [ 12.244040] [<000000004021c74c>] die_if_kernel+0x25c/0x430 [ 12.244040] [<000000004022d320>] handle_unaligned+0xb48/0xb50 [ 12.244040] [ 12.632066] ---[ end trace 9ca05a7215c7bbb2 ]--- [ 12.692036] Kernel panic - not syncing: Attempted to kill the idle task! We have the insn 0x43ffff80 in IIR but from IAOQ we should have: 4025d150: 0f f3 20 df ldd,s r19(r31),r31 4025d154: 0f 9f 00 9c ldw r31(ret0),ret0 4025d158: bf 80 20 58 cmpb,*<> r0,ret0,4025d18c <irq_exit+0xcc> Cpu0 has just completed running parisc_setup_cache_timing: [ 2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000 [ 2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online [ 2.726692] Setting cache flush threshold to 1024 kB [ 2.729932] Not-handled unaligned insn 0x43ffff80 [ 2.798114] Setting TLB flush threshold to 140 kB [ 2.928039] Unaligned handler failed, ret = -1 From the backtrace, cpu1 is in smp_callin: void __init smp_callin(void) { int slave_id = cpu_now_booting; smp_cpu_init(slave_id); preempt_disable(); flush_cache_all_local(); /* start with known state */ flush_tlb_all_local(NULL); local_irq_enable(); /* Interrupts have been off until now */ cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); So, it has just flushed its caches and the TLB. It would seem either the flushes in parisc_setup_cache_timing or smp_callin have corrupted kernel memory. The attached patch reworks parisc_setup_cache_timing to remove the races in setting the cache and TLB flush thresholds. It also corrects the number of bytes flushed in the TLB calculation. The patch flushes the cache and TLB on cpu0 before starting the secondary processors so that they are started from a known state. Tested with a few reboots on c8000. Signed-off-by:
John David Anglin <dave.anglin@bell.net> Cc: <stable@vger.kernel.org> # v3.18+ Signed-off-by:
Helge Deller <deller@gmx.de>
-
Matt Redfearn authored
Since commit 4bcc595c ("printk: reinstate KERN_CONT for printing continuation lines") the output from __do_page_fault on MIPS has been pretty unreadable due to the lack of KERN_CONT markers. Use pr_cont to provide the appropriate markers & restore the expected output. Signed-off-by:
Matt Redfearn <matt.redfearn@imgtec.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/14544/ Signed-off-by:
Ralf Baechle <ralf@linux-mips.org>
-
Aneesh Kumar K.V authored
With commit e58e87ad ("powerpc/mm: Update _PAGE_KERNEL_RO") we started using the ppp value 0b110 to map kernel readonly. But that facility was only added as part of ISA 2.04. For earlier ISA version only supported ppp bit value for readonly mapping is 0b011. (This implies both user and kernel get mapped using the same ppp bit value for readonly mapping.). Update the code such that for earlier architecture version we use ppp value 0b011 for readonly mapping. We don't differentiate between power5+ and power5 here and apply the new ppp bits only from power6 (ISA 2.05). This keep the changes minimal. This fixes issue with PS3 spu usage reported at https://lkml.kernel.org/r/rep.1421449714.geoff@infradead.org Fixes: e58e87ad ("powerpc/mm: Update _PAGE_KERNEL_RO") Cc: stable@vger.kernel.org # v4.7+ Tested-by:
Geoff Levand <geoff@infradead.org> Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
- Nov 24, 2016
-
-
Radim Krčmář authored
Split irqchip allows pic and ioapic routes to be used without them being created, which results in NULL access. Check for NULL and avoid it. (The setup is too racy for a nicer solutions.) Found by syzkaller: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 3 PID: 11923 Comm: kworker/3:2 Not tainted 4.9.0-rc5+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: events irqfd_inject task: ffff88006a06c7c0 task.stack: ffff880068638000 RIP: 0010:[...] [...] __lock_acquire+0xb35/0x3380 kernel/locking/lockdep.c:3221 RSP: 0000:ffff88006863ea20 EFLAGS: 00010006 RAX: dffffc0000000000 RBX: dffffc0000000000 RCX: 0000000000000000 RDX: 0000000000000039 RSI: 0000000000000000 RDI: 1ffff1000d0c7d9e RBP: ffff88006863ef58 R08: 0000000000000001 R09: 0000000000000000 R10: 00000000000001c8 R11: 0000000000000000 R12: ffff88006a06c7c0 R13: 0000000000000001 R14: ffffffff8baab1a0 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88006d100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004abdd0 CR3: 000000003e2f2000 CR4: 00000000000026e0 Stack: ffffffff894d0098 1ffff1000d0c7d56 ffff88006863ecd0 dffffc0000000000 ffff88006a06c7c0 0000000000000000 ffff88006863ecf8 0000000000000082 0000000000000000 ffffffff815dd7c1 ffffffff00000000 ffffffff00000000 Call Trace: [...] lock_acquire+0x2a2/0x790 kernel/locking/lockdep.c:3746 [...] __raw_spin_lock include/linux/spinlock_api_smp.h:144 [...] _raw_spin_lock+0x38/0x50 kernel/locking/spinlock.c:151 [...] spin_lock include/linux/spinlock.h:302 [...] kvm_ioapic_set_irq+0x4c/0x100 arch/x86/kvm/ioapic.c:379 [...] kvm_set_ioapic_irq+0x8f/0xc0 arch/x86/kvm/irq_comm.c:52 [...] kvm_set_irq+0x239/0x640 arch/x86/kvm/../../../virt/kvm/irqchip.c:101 [...] irqfd_inject+0xb4/0x150 arch/x86/kvm/../../../virt/kvm/eventfd.c:60 [...] process_one_work+0xb40/0x1ba0 kernel/workqueue.c:2096 [...] worker_thread+0x214/0x18a0 kernel/workqueue.c:2230 [...] kthread+0x328/0x3e0 kernel/kthread.c:209 [...] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433 Reported-by:
Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: 49df6397 ("KVM: x86: Split the APIC from the rest of IRQCHIP.") Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Radim Krčmář authored
KVM was using arrays of size KVM_MAX_VCPUS with vcpu_id, but ID can be bigger that the maximal number of VCPUs, resulting in out-of-bounds access. Found by syzkaller: BUG: KASAN: slab-out-of-bounds in __apic_accept_irq+0xb33/0xb50 at addr [...] Write of size 1 by task a.out/27101 CPU: 1 PID: 27101 Comm: a.out Not tainted 4.9.0-rc5+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [...] Call Trace: [...] __apic_accept_irq+0xb33/0xb50 arch/x86/kvm/lapic.c:905 [...] kvm_apic_set_irq+0x10e/0x180 arch/x86/kvm/lapic.c:495 [...] kvm_irq_delivery_to_apic+0x732/0xc10 arch/x86/kvm/irq_comm.c:86 [...] ioapic_service+0x41d/0x760 arch/x86/kvm/ioapic.c:360 [...] ioapic_set_irq+0x275/0x6c0 arch/x86/kvm/ioapic.c:222 [...] kvm_ioapic_inject_all arch/x86/kvm/ioapic.c:235 [...] kvm_set_ioapic+0x223/0x310 arch/x86/kvm/ioapic.c:670 [...] kvm_vm_ioctl_set_irqchip arch/x86/kvm/x86.c:3668 [...] kvm_arch_vm_ioctl+0x1a08/0x23c0 arch/x86/kvm/x86.c:3999 [...] kvm_vm_ioctl+0x1fa/0x1a70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099 Reported-by:
Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: af1bae54 ("KVM: x86: bump KVM_MAX_VCPU_ID to 1023") Reviewed-by:
Paolo Bonzini <pbonzini@redhat.com> Reviewed-by:
David Hildenbrand <david@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Radim Krčmář authored
em_jmp_far and em_ret_far assumed that setting IP can only fail in 64 bit mode, but syzkaller proved otherwise (and SDM agrees). Code segment was restored upon failure, but it was left uninitialized outside of long mode, which could lead to a leak of host kernel stack. We could have fixed that by always saving and restoring the CS, but we take a simpler approach and just break any guest that manages to fail as the error recovery is error-prone and modern CPUs don't need emulator for this. Found by syzkaller: WARNING: CPU: 2 PID: 3668 at arch/x86/kvm/emulate.c:2217 em_ret_far+0x428/0x480 Kernel panic - not syncing: panic_on_warn set ... CPU: 2 PID: 3668 Comm: syz-executor Not tainted 4.9.0-rc4+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [...] Call Trace: [...] __dump_stack lib/dump_stack.c:15 [...] dump_stack+0xb3/0x118 lib/dump_stack.c:51 [...] panic+0x1b7/0x3a3 kernel/panic.c:179 [...] __warn+0x1c4/0x1e0 kernel/panic.c:542 [...] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585 [...] em_ret_far+0x428/0x480 arch/x86/kvm/emulate.c:2217 [...] em_ret_far_imm+0x17/0x70 arch/x86/kvm/emulate.c:2227 [...] x86_emulate_insn+0x87a/0x3730 arch/x86/kvm/emulate.c:5294 [...] x86_emulate_instruction+0x520/0x1ba0 arch/x86/kvm/x86.c:5545 [...] emulate_instruction arch/x86/include/asm/kvm_host.h:1116 [...] complete_emulated_io arch/x86/kvm/x86.c:6870 [...] complete_emulated_mmio+0x4e9/0x710 arch/x86/kvm/x86.c:6934 [...] kvm_arch_vcpu_ioctl_run+0x3b7a/0x5a90 arch/x86/kvm/x86.c:6978 [...] kvm_vcpu_ioctl+0x61e/0xdd0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2557 [...] vfs_ioctl fs/ioctl.c:43 [...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679 [...] SYSC_ioctl fs/ioctl.c:694 [...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685 [...] entry_SYSCALL_64_fastpath+0x1f/0xc2 Reported-by:
Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: d1442d85 ("KVM: x86: Handle errors when RIP is set during far jumps") Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Radim Krčmář authored
Cluster xAPIC delivery incorrectly assumed that dest_id <= 0xff. With enabled KVM_X2APIC_API_USE_32BIT_IDS in KVM_CAP_X2APIC_API, a userspace can send an interrupt with dest_id that results in out-of-bounds access. Found by syzkaller: BUG: KASAN: slab-out-of-bounds in kvm_irq_delivery_to_apic_fast+0x11fa/0x1210 at addr ffff88003d9ca750 Read of size 8 by task syz-executor/22923 CPU: 0 PID: 22923 Comm: syz-executor Not tainted 4.9.0-rc4+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [...] Call Trace: [...] __dump_stack lib/dump_stack.c:15 [...] dump_stack+0xb3/0x118 lib/dump_stack.c:51 [...] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156 [...] print_address_description mm/kasan/report.c:194 [...] kasan_report_error mm/kasan/report.c:283 [...] kasan_report+0x231/0x500 mm/kasan/report.c:303 [...] __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:329 [...] kvm_irq_delivery_to_apic_fast+0x11fa/0x1210 arch/x86/kvm/lapic.c:824 [...] kvm_irq_delivery_to_apic+0x132/0x9a0 arch/x86/kvm/irq_comm.c:72 [...] kvm_set_msi+0x111/0x160 arch/x86/kvm/irq_comm.c:157 [...] kvm_send_userspace_msi+0x201/0x280 arch/x86/kvm/../../../virt/kvm/irqchip.c:74 [...] kvm_vm_ioctl+0xba5/0x1670 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3015 [...] vfs_ioctl fs/ioctl.c:43 [...] do_vfs_ioctl+0x18c/0x1040 fs/ioctl.c:679 [...] SYSC_ioctl fs/ioctl.c:694 [...] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:685 [...] entry_SYSCALL_64_fastpath+0x1f/0xc2 Reported-by:
Dmitry Vyukov <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: e45115b6 ("KVM: x86: use physical LAPIC array for logical x2APIC") Reviewed-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com>
-
Paul Burton authored
Since MIPSr6 the Wired register is split into 2 fields, with the upper 16 bits of the register indicating a limit on the value that the wired entry count in the bottom 16 bits of the register can take. This means that simply reading the wired register doesn't get us a valid TLB entry index any longer, and we instead need to retrieve only the lower 16 bits of the register. Introduce a new num_wired_entries() function which does this on MIPSr6 or higher and simply returns the value of the wired register on older architecture revisions, and make use of it when reading the number of wired entries. Since commit e710d666 ("MIPS: tlb-r4k: If there are wired entries, don't use TLBINVF") we have been using a non-zero number of wired entries to determine whether we should avoid use of the tlbinvf instruction (which would invalidate wired entries) and instead loop over TLB entries in local_flush_tlb_all(). This loop begins with the number of wired entries, or before this patch some large bogus TLB index on MIPSr6 systems. Thus since the aforementioned commit some MIPSr6 systems with FTLBs have been prone to leaving stale address translations in the FTLB & crashing in various weird & wonderful ways when we later observe the wrong memory. Signed-off-by:
Paul Burton <paul.burton@imgtec.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/14557/ Signed-off-by:
Ralf Baechle <ralf@linux-mips.org>
-
Oliver O'Halloran authored
When configured with CONFIG_PPC_EARLY_DEBUG_OPAL=y the kernel expects the OPAL entry and base addresses to be passed in r8 and r9 respectively. Currently the wrapper does not attempt to restore these values before entering the decompressed kernel which causes the kernel to branch into whatever happens to be in r9 when doing a write to the OPAL console in early boot. This patch adds a platform_ops hook that can be used to branch into the new kernel. The OPAL console driver patches this at runtime so that if the console is used it will be restored just prior to entering the kernel. Fixes: 656ad58e ("powerpc/boot: Add OPAL console to epapr wrappers") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by:
Oliver O'Halloran <oohall@gmail.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
- Nov 23, 2016
-
-
Chris Metcalf authored
For large values of "mult" and long uptimes, the intermediate result of "cycles * mult" can overflow 64 bits. For example, the tile platform calls clocksource_cyc2ns with a 1.2 GHz clock; we have mult = 853, and after 208.5 days, we overflow 64 bits. Since clocksource_cyc2ns() is intended to be used for relative cycle counts, not absolute cycle counts, performance is more importance than accepting a wider range of cycle values. So, just use mult_frac() directly in tile's sched_clock(). Commit 4cecf6d4 ("sched, x86: Avoid unnecessary overflow in sched_clock") by Salman Qazi results in essentially the same generated code for x86 as this change does for tile. In fact, a follow-on change by Salman introduced mult_frac() and switched to using it, so the C code was largely identical at that point too. Peter Zijlstra then added mul_u64_u32_shr() and switched x86 to use it. This is, in principle, better; by optimizing the 64x64->64 multiplies to be 32x32->64 multiplies we can potentially save some time. However, the compiler piplines the 64x64->64 multiplies pretty well, and the conditional branch in the generic mul_u64_u32_shr() causes some bubbles in execution, with the result that it's pretty much a wash. If tilegx provided its own implementation of mul_u64_u32_shr() without the conditional branch, we could potentially save 3 cycles, but that seems like small gain for a fair amount of additional build scaffolding; no other platform currently provides a mul_u64_u32_shr() override, and tile doesn't currently have an <asm/div64.h> header to put the override in. Additionally, gcc currently has an optimization bug that prevents it from recognizing the opportunity to use a 32x32->64 multiply, and so the result would be no better than the existing mult_frac() until such time as the compiler is fixed. For now, just using mult_frac() seems like the right answer. Cc: stable@kernel.org [v3.4+] Signed-off-by:
Chris Metcalf <cmetcalf@mellanox.com>
-
Russell King authored
This reverts commit 4dd1837d. Moving the exports for assembly code into the assembly files breaks KSYM trimming, but also breaks modversions. While fixing the KSYM trimming is trivial, fixing modversions brings us to a technically worse position that we had prior to the above change: - We end up with the prototype definitions divorsed from everything else, which means that adding or removing assembly level ksyms become more fragile: * if adding a new assembly ksyms export, a missed prototype in asm-prototypes.h results in a successful build if no module in the selected configuration makes use of the symbol. * when removing a ksyms export, asm-prototypes.h will get forgotten, with armksyms.c, you'll get a build error if you forget to touch the file. - We end up with the same amount of include files and prototypes, they're just in a header file instead of a .c file with their exports. As for lines of code, we don't get much of a size reduction: (original commit) 47 files changed, 131 insertions(+), 208 deletions(-) (fix for ksyms trimming) 7 files changed, 18 insertions(+), 5 deletions(-) (two fixes for modversions) 1 file changed, 34 insertions(+) 3 files changed, 7 insertions(+), 2 deletions(-) which results in a net total of only 25 lines deleted. As there does not seem to be much benefit from this change of approach, revert the change. Signed-off-by:
Russell King <rmk+kernel@armlinux.org.uk>
-
- Nov 22, 2016
-
-
Helge Deller authored
Signed-off-by:
Helge Deller <deller@gmx.de>
-
Peter Zijlstra authored
Group validation expects all events to be of the same PMU; however is_uncore_pmu() is too wide, it matches _all_ uncore events, even across PMUs. This triggers failure when we group different events from different uncore PMUs, like: perf stat -vv -e '{uncore_cbox_0/config=0x0334/,uncore_qpi_0/event=1/}' -a sleep 1 Fix is_uncore_pmu() by only matching events to the box at hand. Note that generic code; ran after this step; will disallow this mixture of PMU events. Reported-by:
Jiri Olsa <jolsa@redhat.com> Tested-by:
Jiri Olsa <jolsa@redhat.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vince@deater.net> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20161118125354.GQ3117@twins.programming.kicks-ass.net Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event unwinds sometimes do 'weird' things. In particular, we seemed to be ending up unwinding from random places on the NMI stack. While it was somewhat expected that the event record BP,SP would not match the interrupt BP,SP in that the interrupt is strictly later than the record event, it was overlooked that it could be on an already overwritten stack. Therefore, don't copy the recorded BP,SP over the interrupted BP,SP when we need stack unwinds. Note that its still possible the unwind doesn't full match the actual event, as its entirely possible to have done an (I)RET between record and interrupt, but on average it should still point in the general direction of where the event came from. Also, it's the best we can do, considering. The particular scenario that triggered the bogus NMI stack unwind was a PEBS event with very short period, upon enabling the event at the tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly triggers a record (while still on the NMI stack) which in turn triggers the next PMI. This then causes back-to-back NMIs and we'll try and unwind the stack-frame from the last NMI, which obviously is now overwritten by our own. Analyzed-by:
Josh Poimboeuf <jpoimboe@redhat.com> Reported-by:
Vince Weaver <vincent.weaver@maine.edu> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk> Cc: dvyukov@google.com <dvyukov@google.com> Cc: stable@vger.kernel.org Fixes: ca037701 ("perf, x86: Add PEBS infrastructure") Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.net Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Johannes Weiner authored
The following commit: 75925e1a ("perf/x86: Optimize stack walk user accesses") ... switched from copy_from_user_nmi() to __copy_from_user_nmi() with a manual access_ok() check. Unfortunately, copy_from_user_nmi() does an explicit check against TASK_SIZE, whereas the access_ok() uses whatever the current address limit of the task is. We are getting NMIs when __probe_kernel_read() has switched to KERNEL_DS, and then see vmalloc faults when we access what looks like pointers into vmalloc space: [] WARNING: CPU: 3 PID: 3685731 at arch/x86/mm/fault.c:435 vmalloc_fault+0x289/0x290 [] CPU: 3 PID: 3685731 Comm: sh Tainted: G W 4.6.0-5_fbk1_223_gdbf0f40 #1 [] Call Trace: [] <NMI> [<ffffffff814717d1>] dump_stack+0x4d/0x6c [] [<ffffffff81076e43>] __warn+0xd3/0xf0 [] [<ffffffff81076f2d>] warn_slowpath_null+0x1d/0x20 [] [<ffffffff8104a899>] vmalloc_fault+0x289/0x290 [] [<ffffffff8104b5a0>] __do_page_fault+0x330/0x490 [] [<ffffffff8104b70c>] do_page_fault+0xc/0x10 [] [<ffffffff81794e82>] page_fault+0x22/0x30 [] [<ffffffff81006280>] ? perf_callchain_user+0x100/0x2a0 [] [<ffffffff8115124f>] get_perf_callchain+0x17f/0x190 [] [<ffffffff811512c7>] perf_callchain+0x67/0x80 [] [<ffffffff8114e750>] perf_prepare_sample+0x2a0/0x370 [] [<ffffffff8114e840>] perf_event_output+0x20/0x60 [] [<ffffffff8114aee7>] ? perf_event_update_userpage+0xc7/0x130 [] [<ffffffff8114ea01>] __perf_event_overflow+0x181/0x1d0 [] [<ffffffff8114f484>] perf_event_overflow+0x14/0x20 [] [<ffffffff8100a6e3>] intel_pmu_handle_irq+0x1d3/0x490 [] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10 [] [<ffffffff81197191>] ? vunmap_page_range+0x1a1/0x2f0 [] [<ffffffff811972f1>] ? unmap_kernel_range_noflush+0x11/0x20 [] [<ffffffff814f2056>] ? ghes_copy_tofrom_phys+0x116/0x1f0 [] [<ffffffff81040d1d>] ? x2apic_send_IPI_self+0x1d/0x20 [] [<ffffffff8100411d>] perf_event_nmi_handler+0x2d/0x50 [] [<ffffffff8101ea31>] nmi_handle+0x61/0x110 [] [<ffffffff8101ef94>] default_do_nmi+0x44/0x110 [] [<ffffffff8101f13b>] do_nmi+0xdb/0x150 [] [<ffffffff81795187>] end_repeat_nmi+0x1a/0x1e [] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10 [] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10 [] [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10 [] <<EOE>> <IRQ> [<ffffffff8115d05e>] ? __probe_kernel_read+0x3e/0xa0 Fix this by moving the valid_user_frame() check to before the uaccess that loads the return address and the pointer to the next frame. Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Fixes: 75925e1a ("perf/x86: Optimize stack walk user accesses") Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Nicholas Piggin authored
After patch 4efca4ed ("kbuild: modversions for EXPORT_SYMBOL() for asm"), asm exports can get modversions CRCs generated if they have C definitions in asm-prototypes.h. This patch adds missing definitions for 32 and 64 bit allmodconfig builds. Fixes: 9445aa1a ("ppc: move exports to definitions") Signed-off-by:
Nicholas Piggin <npiggin@gmail.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Benjamin Herrenschmidt authored
There is a new bit, LPCR_PECE_HVEE (Hypervisor Virtualization Exit Enable), which controls wakeup from STOP states on Hypervisor Virtualization Interrupts (which happen to also be all external interrupts in host or bare metal mode). It needs to be set or we will miss wakeups. Fixes: 9baaef0a ("powerpc/irq: Add support for HV virtualization interrupts") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Rename it to HVEE to match the name in the ISA] Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-