  1. Jul 19, 2016
  2. Jun 19, 2016
    • ARM: dts: STi: stih407-family: Disable reserved-memory co-processor nodes · 0e289e53
      Lee Jones authored
      
      
      This patch fixes a non-booting issue in Mainline.
      
      When booting with a compressed kernel, we need to be careful how we
      populate memory close to DDR start.  AUTO_ZRELADDR is enabled by default
      in multi-arch-enabled configurations, which places some restrictions on
      where the kernel is placed and where it will be uncompressed to on boot.
      
      AUTO_ZRELADDR takes the decompressor code's start address and masks out
      the bottom 28 bits to obtain an address to uncompress the kernel to
      (thus a load address of 0x42000000 means that the kernel will be
      uncompressed to 0x40000000 i.e. DDR START on this platform).
      
      Even changing the load address to after the co-processor's shared memory
      won't result in a booting platform, since the AUTO_ZRELADDR algorithm still
      ensures the kernel is uncompressed into memory shared with the first
      co-processor (0x40000000).
      
      Another option would be to move loading to 0x4A000000, since this will
      mean the decompressor will decompress the kernel to 0x48000000. However,
      this would mean a large chunk (0x44000000 => 0x48000000 (64MB)) of
      memory would essentially be wasted for no good reason.
      
      Until we can work with ST to find a suitable memory location to
      relocate co-processor shared memory, let's disable the shared memory
      nodes.  This will ensure a working platform in the meantime.
      
      NB: The more observant of you will notice that we're leaving the DMU
      shared memory node enabled; this is because a) it is the only one in
      active use at the time of this writing and b) it is not affected by
      the current default behaviour which is causing issues.
      
      Fixes: fe135c63 (ARM: dts: STiH407: Move over to using the 'reserved-memory' API for obtaining DMA memory)
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
      Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
      0e289e53
  3. Jun 18, 2016
    • isa: Allow ISA-style drivers on modern systems · 3a495511
      William Breathitt Gray authored
      
      
      Several modern devices, such as PC/104 cards, are expected to run on
      modern systems via an ISA bus interface. Since ISA is a legacy interface
      for most modern architectures, ISA support should remain disabled in
      general. Support for ISA-style drivers should be enabled on a per-driver
      basis.
      
      To allow ISA-style drivers on modern systems, this patch introduces the
      ISA_BUS_API and ISA_BUS Kconfig options. The ISA bus driver will now
      build conditionally on the ISA_BUS_API Kconfig option, which defaults to
      the legacy ISA Kconfig option. The ISA_BUS Kconfig option allows the
      ISA_BUS_API Kconfig option to be selected on architectures which do not
      enable ISA (e.g. X86_64).
      
      The ISA_BUS Kconfig option is currently only implemented for X86
      architectures. Other architectures may have their own ISA_BUS Kconfig
      options added as required.
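
      As a rough sketch of what an ISA-style driver using this API can look
      like (driver name, device count and callbacks here are hypothetical; the
      driver's Kconfig entry would additionally select ISA_BUS_API):

        #include <linux/module.h>
        #include <linux/isa.h>

        #define FOO_MAX_DEVS 4                          /* hypothetical instance count */

        static int foo_probe(struct device *dev, unsigned int id)
        {
                dev_info(dev, "probing ISA-style device %u\n", id);
                return 0;
        }

        static struct isa_driver foo_isa_driver = {
                .probe  = foo_probe,
                .driver = {
                        .name = "foo",
                },
        };

        static int __init foo_init(void)
        {
                return isa_register_driver(&foo_isa_driver, FOO_MAX_DEVS);
        }
        module_init(foo_init);

        static void __exit foo_exit(void)
        {
                isa_unregister_driver(&foo_isa_driver);
        }
        module_exit(foo_exit);

        MODULE_LICENSE("GPL");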
      
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a495511
  4. Jun 17, 2016
  5. Jun 16, 2016
  6. Jun 15, 2016
    • kvm: svm: Do not support AVIC if not CONFIG_X86_LOCAL_APIC · 5b8abf1f
      Suravee Suthikulpanit authored
      
      
      Add logic to disable AVIC #ifndef CONFIG_X86_LOCAL_APIC.
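
      A sketch of the shape of that check during SVM hardware setup (hedged;
      not necessarily the exact hunk):

        /* refuse AVIC when the kernel has no local APIC support built in */
        if (avic && !IS_ENABLED(CONFIG_X86_LOCAL_APIC))
                avic = false;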
      
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5b8abf1f
    • kvm: svm: Fix implicit declaration for __default_cpu_present_to_apicid() · 7d669f50
      Suravee Suthikulpanit authored
      
      
      Commit 8221c137 ("svm: Manage vcpu load/unload when enable AVIC")
      introduces a build error due to an implicit function declaration
      when CONFIG_X86_32 is defined and CONFIG_X86_LOCAL_APIC is not
      (as reported by the kbuild test robot, i386-randconfig-x0-06121009).
      
      So, this patch introduces kvm_cpu_get_apicid() wrapper
      around __default_cpu_present_to_apicid() with additional
      handling if CONFIG_X86_LOCAL_APIC is not defined.
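
      The wrapper is plausibly of roughly this shape (a sketch; the fallback
      return value is an assumption):

        static inline int kvm_cpu_get_apicid(int mps_cpu)
        {
        #ifdef CONFIG_X86_LOCAL_APIC
                return __default_cpu_present_to_apicid(mps_cpu);
        #else
                WARN_ON_ONCE(1);        /* should never be reached without a local APIC */
                return BAD_APICID;
        #endif
        }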
      
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Fixes: 8221c137 ("svm: Manage vcpu load/unload when enable AVIC")
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7d669f50
    • arm64: spinlock: Ensure forward-progress in spin_unlock_wait · c56bdcac
      Will Deacon authored
      
      
      Rather than wait until we observe the lock being free (which might never
      happen), we can also return from spin_unlock_wait if we observe that the
      lock is now held by somebody else, which implies that it was unlocked
      but we just missed seeing it in that state.
      
      Furthermore, in such a scenario there is no longer a need to write back
      the value that we loaded, since we know that there has been a lock
      hand-off, which is sufficient to publish any stores prior to the
      unlock_wait because the ARM architecture ensures that a Store-Release
      instruction is multi-copy atomic when observed by a Load-Acquire
      instruction.
      
      The litmus test is something like:
      
      AArch64
      {
      0:X1=x; 0:X3=y;
      1:X1=y;
      2:X1=y; 2:X3=x;
      }
       P0          | P1           | P2           ;
       MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
       STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
       DMB SY      |              |              ;
       LDR W2,[X3] |              |              ;
      exists
      (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
      
      where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
      doing spin_lock.
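
      In C-like pseudocode (a sketch of the idea only; the real implementation
      is inline assembly and the ticket-lock field layout varies by
      configuration), the wait loop becomes:

        static inline void unlock_wait_sketch(arch_spinlock_t *lock)
        {
                u16 owner = READ_ONCE(lock->owner);     /* ticket currently being served */
                arch_spinlock_t v;

                for (;;) {
                        v = READ_ONCE(*lock);
                        if (v.owner == v.next)          /* observed unlocked: done */
                                break;
                        if (v.owner != owner)           /* owner changed: a hand-off happened */
                                break;
                        cpu_relax();
                }
        }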
      
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c56bdcac
    • arm64: spinlock: fix spin_unlock_wait for LSE atomics · 3a5facd0
      Will Deacon authored
      
      
      Commit d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against
      concurrent lockers") fixed spin_unlock_wait for LL/SC-based atomics under
      the premise that the LSE atomics (in particular, the LDADDA instruction)
      are indivisible.
      
      Unfortunately, these instructions are only indivisible when used with the
      -AL (full ordering) suffix and, consequently, the same issue can
      theoretically be observed with LSE atomics, where a later (in program
      order) load can be speculated before the write portion of the atomic
      operation.
      
      This patch fixes the issue by performing a CAS of the lock once we've
      established that it's unlocked, in much the same way as the LL/SC code.
      
      Fixes: d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      3a5facd0
    • arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks · 38b850a7
      Will Deacon authored
      
      
      spin_is_locked has grown two very different use-cases:
      
      (1) [The sane case] API functions may require a certain lock to be held
          by the caller and can therefore use spin_is_locked as part of an
          assert statement in order to verify that the lock is indeed held.
          For example, usage of assert_spin_locked.
      
      (2) [The insane case] There are two locks, where a CPU takes one of the
          locks and then checks whether or not the other one is held before
          accessing some shared state. For example, the "optimized locking" in
          ipc/sem.c.
      
      In the latter case, the sequence looks like:
      
        spin_lock(&sem->lock);
        if (!spin_is_locked(&sma->sem_perm.lock))
          /* Access shared state */
      
      and requires that the spin_is_locked check is ordered after taking the
      sem->lock. Unfortunately, since our spinlocks are implemented using a
      LDAXR/STXR sequence, the read of &sma->sem_perm.lock can be speculated
      before the STXR and consequently return a stale value.
      
      Whilst this hasn't been seen to cause issues in practice, PowerPC fixed
      the same issue in 51d7d520 ("powerpc: Add smp_mb() to
      arch_spin_is_locked()") and, although we did something similar for
      spin_unlock_wait in d86b8da0 ("arm64: spinlock: serialise
      spin_unlock_wait against concurrent lockers") that doesn't actually take
      care of ordering against local acquisition of a different lock.
      
      This patch adds an smp_mb() to the start of our arch_spin_is_locked and
      arch_spin_unlock_wait routines to ensure that the lock value is always
      loaded after any other locks have been taken by the current CPU.
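
      In sketch form (close to, though not necessarily identical to, the final
      code), the is_locked side becomes:

        static inline int arch_spin_is_locked(arch_spinlock_t *lock)
        {
                /*
                 * Order the load of the lock word after any locks already
                 * taken by this CPU (see the ipc/sem.c pattern above).
                 */
                smp_mb();
                return !arch_spin_value_unlocked(READ_ONCE(*lock));
        }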
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      38b850a7
  7. Jun 14, 2016
    • arm: Use _rcuidle for smp_cross_call() tracepoints · 7c64cc05
      Paul E. McKenney authored
      
      
      Further testing with false negatives suppressed by commit 293e2421
      ("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
      identified another unprotected use of RCU from the idle loop.  Because RCU
      actively ignores idle-loop code (for energy-efficiency reasons, among
      other things), using RCU from the idle loop can result in too-short
      grace periods, in turn resulting in arbitrary misbehavior.
      
      The resulting lockdep-RCU splat is as follows:
      
      ------------------------------------------------------------------------
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      4.6.0-rc5-next-20160426+ #1112 Not tainted
      -------------------------------
      include/trace/events/ipi.h:35 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      RCU used illegally from idle CPU!
      rcu_scheduler_active = 1, debug_locks = 0
      RCU used illegally from extended quiescent state!
      no locks held by swapper/0/0.
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
      Hardware name: Generic OMAP4 (Flattened Device Tree)
      [<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
      [<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
      [<c047fec8>] (dump_stack) from [<c010dcfc>] (smp_cross_call+0xbc/0x188)
      [<c010dcfc>] (smp_cross_call) from [<c01c9e28>] (generic_exec_single+0x9c/0x15c)
      [<c01c9e28>] (generic_exec_single) from [<c01ca0a0>] (smp_call_function_single_async+0x38/0x9c)
      [<c01ca0a0>] (smp_call_function_single_async) from [<c0603728>] (cpuidle_coupled_poke_others+0x8c/0xa8)
      [<c0603728>] (cpuidle_coupled_poke_others) from [<c0603c10>] (cpuidle_enter_state_coupled+0x26c/0x390)
      [<c0603c10>] (cpuidle_enter_state_coupled) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
      [<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
      [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
      
      ------------------------------------------------------------------------
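
      The fix amounts to switching the tracepoint call in arm's
      smp_cross_call() to its _rcuidle variant, roughly (hedged sketch):

        static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
        {
                /* the _rcuidle variant is safe even when RCU is not watching (idle loop) */
                trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
                __smp_cross_call(target, ipinr);
        }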
      
      Reported-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Tony Lindgren <tony@atomide.com>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: <linux-omap@vger.kernel.org>
      Cc: <linux-arm-kernel@lists.infradead.org>
      7c64cc05
    • arm64: mm: mark fault_info table const · bbb1681e
      Mark Rutland authored
      
      
      Unlike the debug_fault_info table, we never intentionally alter the
      fault_info table at runtime, and all derived pointers are treated as
      const currently.
      
      Make the table const so that it can be placed in .rodata and protected
      from unintentional writes, as we do for the syscall tables.
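
      Schematically (the entry shown is a placeholder, not the real table
      contents), the change is just the added const so the array can live in
      .rodata:

        static const struct fault_info fault_info[] = {
                { do_bad, SIGBUS, 0, "unknown fault" },  /* placeholder entry */
                /* ... remaining entries unchanged ... */
        };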
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      bbb1681e
    • arm64: fix dump_instr when PAN and UAO are in use · c5cea06b
      Mark Rutland authored
      
      
      If the kernel is set to show unhandled signals, and a user task does not
      handle a SIGILL as a result of an instruction abort, we will attempt to
      log the offending instruction with dump_instr before killing the task.
      
      We use dump_instr to log the encoding of the offending userspace
      instruction. However, dump_instr is also used to dump instructions from
      kernel space, and internally always switches to KERNEL_DS before dumping
      the instruction with get_user. When both PAN and UAO are in use, reading
      a user instruction via get_user while in KERNEL_DS will result in a
      permission fault, which leads to an Oops.
      
      As we have regs corresponding to the context of the original instruction
      abort, we can inspect this and only flip to KERNEL_DS if the original
      abort was taken from the kernel, avoiding this issue. At the same time,
      remove the redundant (and incorrect) comments regarding the order
      dump_mem and dump_instr are called in.
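
      The resulting logic is roughly as follows (a sketch; __dump_instr stands
      in for the existing instruction-dumping body):

        static void dump_instr(const char *lvl, struct pt_regs *regs)
        {
                if (!user_mode(regs)) {
                        /* fault came from the kernel: safe to read via KERNEL_DS */
                        mm_segment_t fs = get_fs();

                        set_fs(KERNEL_DS);
                        __dump_instr(lvl, regs);
                        set_fs(fs);
                } else {
                        /* fault came from userspace: read through the user mapping */
                        __dump_instr(lvl, regs);
                }
        }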
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: <stable@vger.kernel.org> #4.6+
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Fixes: 57f4959b ("arm64: kernel: Add support for User Access Override")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c5cea06b
    • MIPS: KVM: Fix CACHE triggered exception emulation · 6df82a7b
      James Hogan authored
      
      
      When emulating TLB miss / invalid exceptions during CACHE instruction
      emulation, be sure to set up the correct PC and host_cp0_badvaddr state
      for the kvm_mips_emulate_tlb*_ld() functions to pick up for guest EPC
      and BadVAddr.
      
      PC needs to be rewound otherwise the guest EPC will end up pointing at
      the next instruction after the faulting CACHE instruction.
      
      host_cp0_badvaddr must be set because guest CACHE instructions trap with
      a Coprocessor Unusable exception, which doesn't update the host BadVAddr
      as a TLB exception would.
      
      This doesn't tend to get hit when dynamic translation of emulated
      instructions is enabled, since only the first execution of each CACHE
      instruction actually goes through this code path, with subsequent
      executions hitting the SYNCI instruction that it gets replaced with.
      
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6df82a7b
    • MIPS: KVM: Don't unwind PC when emulating CACHE · cc81e948
      James Hogan authored
      
      
      When a CACHE instruction is emulated by kvm_mips_emulate_cache(), the PC
      is first updated to point to the next instruction, and afterwards it
      falls through the "dont_update_pc" label, which rewinds the PC back to
      its original address.
      
      This works when dynamic translation of emulated instructions is enabled,
      since the CACHE instruction is replaced with a SYNCI which works without
      trapping, however when dynamic translation is disabled the guest hangs
      on CACHE instructions as they always trap and are never stepped over.
      
      Roughly swap the meanings of the "done" and "dont_update_pc" labels to match
      kvm_mips_emulate_CP0(), so that "done" will roll back the PC on failure,
      and "dont_update_pc" won't change PC at all (for the sake of exceptions
      that have already modified the PC).
      
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      cc81e948
    • MIPS: KVM: Include bit 31 in segment matches · 7f5a1ddc
      James Hogan authored
      
      
      When faulting guest addresses are matched against guest segments with
      the KVM_GUEST_KSEGX() macro, change the mask to 0xe0000000 so as to
      include bit 31.
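
      Concretely, the segment match ends up using a three-bit mask, along the
      lines of (sketch based on the mask quoted above):

        /* compare bits 31..29 of the guest virtual address */
        #define KVM_GUEST_KSEGX(gva)    ((gva) & 0xe0000000)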
      
      This is mainly for safety's sake, as it prevents a rogue BadVAddr in the
      host kseg2/kseg3 segments (e.g. 0xC*******) after a TLB exception from
      matching the guest kseg0 segment (e.g. 0x4*******), triggering an
      internal KVM error instead of allowing the corresponding guest kseg0
      page to be mapped into the host vmalloc space.
      
      Such a rogue BadVAddr was observed to happen with the host MIPS kernel
      running under QEMU with KVM built as a module, due to a not entirely
      transparent optimisation in the QEMU TLB handling. This has already been
      worked around properly in a previous commit.
      
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7f5a1ddc
    • MIPS: KVM: Fix modular KVM under QEMU · 797179bc
      James Hogan authored
      
      
      Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
      get a TLB refill exception in it when KVM is built as a module.
      
      This was observed to happen with the host MIPS kernel running under
      QEMU, due to a not entirely transparent optimisation in the QEMU TLB
      handling where TLB entries replaced with TLBWR are copied to a separate
      part of the TLB array. Code in those pages continues to be executable,
      but those mappings persist only until the next ASID switch, even if they
      are marked global.
      
      An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
      switching to the guest exception base. Subsequent TLB mapped kernel
      instructions just prior to switching to the guest trigger a TLB refill
      exception, which enters the guest exception handlers without updating
      EPC. This appears as a guest triggered TLB refill on a host kernel
      mapped (host KSeg2) address, which is not handled correctly as user
      (guest) mode accesses to kernel (host) segments always generate address
      error exceptions.
      
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      797179bc
  8. Jun 13, 2016
  9. Jun 10, 2016
  10. Jun 09, 2016
  11. Jun 08, 2016
    • ARM: dts: socfpga: Add missing PHY phandle · c106c21c
      Marek Vasut authored
      
      
      Add the missing PHY phandle to the DT; otherwise the stmmac code won't
      detect the PHY correctly anymore.
      
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Dinh Nguyen <dinguyen@opensource.altera.com>
      Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
      c106c21c
    • x86/cpu/AMD: Extend X86_FEATURE_TOPOEXT workaround to newer models · 96685a55
      Borislav Petkov authored
      
      
      We need to re-enable the topology extensions CPUID leaves on newer models
      too, if the BIOS has disabled them, as we rely on them to get proper compute
      unit topology.
      
      While at it, make the printk happen only once.
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rui Huang <ray.huang@amd.com>
      Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-hwmon@vger.kernel.org
      Link: http://lkml.kernel.org/r/1464775468-23355-1-git-send-email-bp@alien8.de
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      96685a55
    • x86/cpu/intel: Introduce macros for Intel family numbers · 970442c5
      Dave Hansen authored
      
      
      Problem:
      
      We have a boatload of open-coded family-6 model numbers.  Half of
      them have these model numbers in hex and the other half in
      decimal.  This makes grepping for them tons of fun, if you were
      to try.
      
      Solution:
      
      Consolidate all the magic numbers.  Put all the definitions in
      one header.
      
      The names here are closely derived from the comments describing
      the models from arch/x86/events/intel/core.c.  We could easily
      make them shorter by doing things like s/SANDYBRIDGE/SNB/, but
      they seemed fine even with the longer versions to me.
      
      Do not take any of these names too literally, like "DESKTOP"
      or "MOBILE".  These are all colloquial names and not precise
      descriptions of everywhere a given model will show up.
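
      For a feel of the result, the header entries look roughly like this (a
      representative sketch, not the full list), and call sites then compare
      against a named constant instead of a bare number:

        /* arch/x86/include/asm/intel-family.h, excerpt-style sketch */
        #define INTEL_FAM6_SANDYBRIDGE          0x2A
        #define INTEL_FAM6_IVYBRIDGE            0x3A
        #define INTEL_FAM6_HASWELL_CORE         0x3C
        #define INTEL_FAM6_SKYLAKE_MOBILE       0x4E

        /* usage sketch; apply_snb_quirk() is a hypothetical helper */
        if (c->x86 == 6 && c->x86_model == INTEL_FAM6_SANDYBRIDGE)
                apply_snb_quirk(c);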
      
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Cc: Eduardo Valentin <edubezval@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
      Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Ulf Hansson <ulf.hansson@linaro.org>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Vishwanath Somayaji <vishwanath.somayaji@intel.com>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: jacob.jun.pan@intel.com
      Cc: linux-acpi@vger.kernel.org
      Cc: linux-edac@vger.kernel.org
      Cc: linux-mmc@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: platform-driver-x86@vger.kernel.org
      Link: http://lkml.kernel.org/r/20160603001927.F2A7D828@viggo.jf.intel.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      970442c5
    • arm64: mm: always take dirty state from new pte in ptep_set_access_flags · 0106d456
      Will Deacon authored
      
      
      Commit 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for
      hardware AF/DBM") ensured that pte flags are updated atomically in the
      face of potential concurrent, hardware-assisted updates. However, Alex
      reports that:
      
       | This patch breaks swapping for me.
       | In the broken case, you'll see either systemd cpu time spike (because
       | it's stuck in a page fault loop) or the system hang (because the
       | application owning the screen is stuck in a page fault loop).
      
      It turns out that this is because the 'dirty' argument to
      ptep_set_access_flags is always 0 for read faults, and so we can't use
      it to set PTE_RDONLY. The failing sequence is:
      
        1. We put down a PTE_WRITE | PTE_DIRTY | PTE_AF pte
        2. Memory pressure -> pte_mkold(pte) -> clear PTE_AF
        3. A read faults due to the missing access flag
        4. ptep_set_access_flags is called with dirty = 0, due to the read fault
        5. pte is then made PTE_WRITE | PTE_DIRTY | PTE_AF | PTE_RDONLY (!)
        6. A write faults, but pte_write is true so we get stuck
      
      The solution is to check the new page table entry (as would be done by
      the generic, non-atomic definition of ptep_set_access_flags that just
      calls set_pte_at) to establish the dirty state.
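
      The gist, in sketch form (simplified relative to the actual atomic
      update sequence in the patch), is to derive the read-only bit from the
      new entry rather than from the dirty argument:

        /* inside ptep_set_access_flags(), before the pte is written back */
        if (pte_write(entry) && pte_dirty(entry))
                pte_val(entry) &= ~PTE_RDONLY;  /* writable and dirty: allow writes */
        else
                pte_val(entry) |= PTE_RDONLY;   /* clean or read-only: keep RDONLY set */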
      
      Cc: <stable@vger.kernel.org> # 4.3+
      Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Alexander Graf <agraf@suse.de>
      Tested-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      0106d456