Skip to content
  1. May 23, 2014
  2. Apr 18, 2014
  3. Apr 17, 2014
    • Masami Hiramatsu's avatar
      kprobes/x86: Fix page-fault handling logic · 6381c24c
      Masami Hiramatsu authored
      
      
      Current kprobes in-kernel page fault handler doesn't
      expect that its single-stepping can be interrupted by
      an NMI handler which may cause a page fault(e.g. perf
      with callback tracing).
      
      In that case, the page-fault handled by kprobes and it
      misunderstands the page-fault has been caused by the
      single-stepping code and tries to recover IP address
      to probed address.
      
      But the truth is the page-fault has been caused by the
      NMI handler, and do_page_fault failes to handle real
      page fault because the IP address is modified and
      causes Kernel BUGs like below.
      
       ----
       [ 2264.726905] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
       [ 2264.727190] IP: [<ffffffff813c46e0>] copy_user_generic_string+0x0/0x40
      
      To handle this correctly, I fixed the kprobes fault
      handler to ensure the faulted ip address is its own
      single-step buffer instead of checking current kprobe
      state.
      
      Signed-off-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Sandeepa Prabhu <sandeepa.prabhu@linaro.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fche@redhat.com
      Cc: systemtap@sourceware.org
      Link: http://lkml.kernel.org/r/20140417081644.26341.52351.stgit@ltc230.yrl.intra.hitachi.co.jp
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6381c24c
    • Ingo Molnar's avatar
      x86/mce: Fix CMCI preemption bugs · ea431643
      Ingo Molnar authored
      
      
      The following commit:
      
        27f6c573 ("x86, CMCI: Add proper detection of end of CMCI storms")
      
      Added two preemption bugs:
      
       - machine_check_poll() does a get_cpu_var() without a matching
         put_cpu_var(), which causes preemption imbalance and crashes upon
         bootup.
      
       - it does percpu ops without disabling preemption. Preemption is not
         disabled due to the mistaken use of a raw spinlock.
      
      To fix these bugs fix the imbalance and change
      cmci_discover_lock to a regular spinlock.
      
      Reported-by: default avatarOwen Kibel <qmewlo@gmail.com>
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Chen, Gong <gong.chen@linux.intel.com>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Todorov <atodorov@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/n/tip-jtjptvgigpfkpvtQxpEk1at2@git.kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      --
       arch/x86/kernel/cpu/mcheck/mce.c       |    4 +---
       arch/x86/kernel/cpu/mcheck/mce_intel.c |   18 +++++++++---------
       2 files changed, 10 insertions(+), 12 deletions(-)
      ea431643
  4. Apr 16, 2014
    • Tony Luck's avatar
      [IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237) · c0b5a64d
      Tony Luck authored
      April 2014 Itanium processor specification update:
      
      http://www.intel.com/content/www/us/en/processors/itanium/itanium-specification-update.html
      
      
      
      describes this erratum:
      
      =========================================================================
      237. Under a complex set of conditions, store to load forwarding for a
      sub 8-byte load may complete incorrectly
      
      Problem: A load instruction may complete incorrectly when a code sequence
      using 4-byte or smaller load and store operations to the same address
      is executed in combination with specific timing of all the following
      concurrent conditions: store to load forwarding, alignment checking
      enabled, a mis-predicted branch, and complex cache utilization activity.
      
      Implication: The affected sub 8-byte instruction may complete
      incorrectly resulting in unpredictable system behavior. There is an
      extremely low probability of exposure due to the significant number of
      complex microarchitectural concurrent conditions required to encounter
      the erratum.
      
      Workaround: Set PSR.ac = 0 to completely avoid the erratum. Disabling
      Hyper-Threading will significantly reduce exposure to the conditions
      that contribute to encountering the erratum.
      
      Status: See the Summary Table of Changes for the affected steppings.
      =========================================================================
      
      [Table of changes essentially lists all models from McKinley to Tukwila]
      
      The PSR.ac bit controls whether the processor will always generate
      an unaligned reference trap (0x5a00) for a misaligned data access
      (when PSR.ac=1) or if it will let the access succeed when running
      on a cpu that implements logic to handle some unaligned accesses.
      
      Way back in 2008 in commit b704882e
        [IA64] Rationalize kernel mode alignment checking
      we made the decision to always enable strict checking. We were
      already doing so in trap/interrupt context because the common
      preamble code set this bit - but the rest of supervisor code
      (and by inheritance user code) ran with PSR.ac=0.
      
      We now reverse that decision and set PSR.ac=0 everywhere in the
      kernel (also inherited by user processes). This will avoid the
      erratum using the method described in the Itanium specification
      update.  Net effect for users is that the processor will handle
      unaligned access when it can (typically with a tiny performance
      bubble in the pipeline ... but much less invasive than taking a
      trap and having the OS perform the access).
      
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      c0b5a64d
    • Ingo Molnar's avatar
      x86: Remove the PCI reboot method from the default chain · 5be44a6f
      Ingo Molnar authored
      
      
      Steve reported a reboot hang and bisected it back to this commit:
      
        a4f1987e x86, reboot: Add EFI and CF9 reboot methods into the default list
      
      He heroically tested all reboot methods and found the following:
      
        reboot=t       # triple fault                  ok
        reboot=k       # keyboard ctrl                 FAIL
        reboot=b       # BIOS                          ok
        reboot=a       # ACPI                          FAIL
        reboot=e       # EFI                           FAIL   [system has no EFI]
        reboot=p       # PCI 0xcf9                     FAIL
      
      And I think it's pretty obvious that we should only try PCI 0xcf9 as a
      last resort - if at all.
      
      The other observation is that (on this box) we should never try
      the PCI reboot method, but close with either the 'triple fault'
      or the 'BIOS' (terminal!) reboot methods.
      
      Thirdly, CF9_COND is a total misnomer - it should be something like
      CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ...
      
      So this patch fixes the worst problems:
      
       - it orders the actual reboot logic to follow the reboot ordering
         pattern - it was in a pretty random order before for no good
         reason.
      
       - it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and
         BOOT_CF9_SAFE flags to make the code more obvious.
      
       - it tries the BIOS reboot method before the PCI reboot method.
         (Since 'BIOS' is a terminal reboot method resulting in a hang
          if it does not work, this is essentially equivalent to removing
          the PCI reboot method from the default reboot chain.)
      
       - just for the miraculous possibility of terminal (resulting
         in hang) reboot methods of triple fault or BIOS returning
         without having done their job, there's an ordering between
         them as well.
      
      Reported-and-bisected-and-tested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Li Aubrey <aubrey.li@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Link: http://lkml.kernel.org/r/20140404064120.GB11877@gmail.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      5be44a6f
  5. Apr 15, 2014
  6. Apr 14, 2014
  7. Apr 13, 2014
  8. Apr 12, 2014
  9. Apr 11, 2014
Loading