  1. Jan 26, 2012
  2. Jan 19, 2012
  3. Jan 18, 2012
  4. Jan 17, 2012
    • x86, tsc: Fix SMI induced variation in quick_pit_calibrate() · 68f30fbe
      Linus Torvalds authored
      
      
      pit_expect_msb() returns success wrongly in the below SMI scenario:
      
      a. pit_verify_msb() has not yet seen the MSB transition.
      
      b. we are close to the MSB transition though and got a SMI immediately after
         returning from pit_verify_msb() which didn't see the MSB transition. PIT MSB
         transition has happened somewhere during SMI execution.
      
      c. returned from SMI and we noted down the 'tsc', saw the pit MSB change now and
         exited the loop to calculate 'deltatsc'. Instead of noting the TSC at the MSB
         transition, we are way off because of the SMI.  And as the SMI happened
         between the pit_verify_msb() and before the 'tsc' is recorded in the
         for loop, 'deltatsc' (d1/d2 in quick_pit_calibrate()) will be small and
         quick_pit_calibrate() will not notice this error.
      
      Depending on whether SMI disturbance happens while computing d1 or d2, we will
      see the TSC calibrated value smaller or bigger than the expected value. As a
      result, in a cluster we were seeing a variation of approximately +/- 20MHz in
      the calibrated values, resulting in NTP failures.
      
        [ As far as the SMI source is concerned, this is a periodic SMI that gets
          disabled after ACPI is enabled by the OS. But the TSC calibration happens
          before the ACPI is enabled. ]
      
      To address this, change pit_expect_msb() so that
      
       - the 'tsc' is the TSC in between the two reads that read the MSB
      change from the PIT (same as before)
      
       - the 'delta' is the difference in TSC from *before* the MSB changed
      to *after* the MSB changed.
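
      The effect of the new 'delta' definition can be sketched in a small userspace
      simulation. Everything below is a mock: struct sim, the sim_* helpers, the
      1000-cycle cost per PIT read and the SMI model are assumptions made for
      illustration, and only the loop shape is meant to mirror the description
      above (the iteration limit and the "count > 5" result are likewise assumed).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cost of one PIT port read, in simulated TSC cycles. */
#define PIT_ACCESS_CYCLES 1000ULL

/* Simulated machine state; nothing here touches real hardware. */
struct sim {
	uint64_t now;           /* simulated TSC value */
	uint64_t msb_change_at; /* time at which the PIT MSB changes */
	uint64_t smi_at;        /* an SMI hits the first TSC read at/after this */
	uint64_t smi_cycles;    /* SMI duration; 0 means no SMI */
	bool smi_fired;
};

/* Stand-in for pit_verify_msb(): two port reads, true while MSB is unchanged. */
bool sim_verify_msb(struct sim *s)
{
	s->now += 2 * PIT_ACCESS_CYCLES;
	return s->now < s->msb_change_at;
}

/* Stand-in for reading the TSC: a pending SMI delays the read. */
uint64_t sim_get_cycles(struct sim *s)
{
	if (s->smi_cycles && !s->smi_fired && s->now >= s->smi_at) {
		s->smi_fired = true;
		s->now += s->smi_cycles;
	}
	return s->now;
}

/* Sketch of the reworked loop: delta spans from before to after the change. */
int sim_expect_msb(struct sim *s, uint64_t *tscp, uint64_t *deltap)
{
	uint64_t tsc = 0, prev_tsc = 0;
	int count;

	for (count = 0; count < 50000; count++) {
		if (!sim_verify_msb(s))
			break;
		prev_tsc = tsc;          /* last TSC known to precede the change */
		tsc = sim_get_cycles(s);
	}
	*deltap = sim_get_cycles(s) - prev_tsc;
	*tscp = tsc;
	return count > 5;
}
```

      In this model an undisturbed run yields a delta of 4000 cycles (four simulated
      PIT reads), while a 100000-cycle SMI landing right before the TSC read yields
      104000, so the disturbance becomes visible to the caller instead of hiding in
      a small delta.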
      
      Now the delta is twice as big as before (it covers four PIT accesses,
      roughly 4us) and quick_pit_calibrate() will loop a bit longer to get
      the calibrated value within the 500ppm precision. As the delta (d1/d2)
      covers four PIT accesses, actual calibrated result might be closer to
      250ppm precision.
      
      As the loop now takes longer to stabilize, double MAX_QUICK_PIT_MS to 50.
      
      SMI disturbance will show up as much larger deltas, and the loop will take
      longer than usual for the result to be within the accepted precision, or it
      will fall back to slow PIT calibration if it takes more than 50msec.
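
      Why a larger delta lengthens the loop can be seen from a precision check of
      the following shape. This is a sketch under assumptions: the function name is
      hypothetical, and using a shift by 11 (1/2048, roughly 488ppm) as a cheap
      stand-in for 500ppm is an assumption about how such a check could be written.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * d1, d2:  error bars (in TSC cycles) of the first and the latest MSB read.
 * elapsed: TSC cycles between those two reads.
 * Accept the measurement once the combined error is below elapsed / 2048.
 */
bool within_500ppm(uint64_t d1, uint64_t d2, uint64_t elapsed)
{
	return d1 + d2 < (elapsed >> 11);
}
```

      With d1 = d2 = 4000 cycles, the check only passes once more than
      8000 * 2048 cycles have elapsed. An SMI-inflated delta raises that bar
      proportionally, which is what pushes the loop toward the time limit.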
      
      Also while we are at this, remove the calibration correction that aims to
      get the result to the middle of the error bars. We really don't know which
      direction to correct into, so remove it.
      
      Reported-and-tested-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1326843337.5291.4.camel@sbsiddha-mobl2
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • audit: inline audit_syscall_entry to reduce burden on archs · b05d8447
      Eric Paris authored
      
      
      Every arch calls:
      
      if (unlikely(current->audit_context))
      	audit_syscall_entry()
      
      which requires knowledge about audit (the existence of audit_context) in
      the arch code.  Just do it all in a static inline in audit.h so that arches
      can remain blissfully ignorant.
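
      A minimal userspace sketch of that pattern follows. The types, the
      current_task pointer, the reduced two-argument signature and the name
      slow_audit_syscall_entry() are all mock stand-ins chosen for illustration,
      and the unlikely() definition assumes a GCC or Clang style compiler.

```c
#include <stddef.h>

/* Branch hint; assumes a compiler that provides __builtin_expect. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Mock types standing in for the kernel's task and audit context. */
struct audit_context { int in_syscall; };
struct task { struct audit_context *audit_context; };

/* Mock of the kernel's "current" task pointer. */
struct task *current_task;

/* Counts slow-path calls so the behaviour can be observed. */
int slow_path_calls;

/* Out-of-line slow path (hypothetical name); reached only when auditing. */
void slow_audit_syscall_entry(int arch, int major)
{
	(void)arch;
	(void)major;
	current_task->audit_context->in_syscall = 1;
	slow_path_calls++;
}

/* Header-level wrapper: arch code calls this and never sees audit_context. */
static inline void audit_syscall_entry(int arch, int major)
{
	if (unlikely(current_task->audit_context != NULL))
		slow_audit_syscall_entry(arch, major);
}
```

      Arch code then calls audit_syscall_entry() unconditionally, and the check for
      an active audit context lives in one place.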
      
      Signed-off-by: Eric Paris <eparis@redhat.com>
    • audit: ia32entry.S sign extend error codes when calling 64 bit code · f031cd25
      Eric Paris authored
      
      
      In the ia32entry syscall exit audit fastpath we have assembly code which calls
      __audit_syscall_exit directly.  This code, however, zeroes the upper 32
      bits of the return code and then calls code which expects longs to be 64 bits
      long.  To handle code which expects 64-bit longs, we sign extend the return
      code if that code is an error.  Thus the __audit_syscall_exit function can
      correctly handle using the values in snprintf("%ld").  This fixes the
      regression introduced in 5cbf1565.
      
      Old record:
      type=SYSCALL msg=audit(1306197182.256:281): arch=40000003 syscall=192 success=no exit=4294967283
      New record:
      type=SYSCALL msg=audit(1306197182.256:281): arch=40000003 syscall=192 success=no exit=-13
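
      The difference between the two records can be reproduced in plain C. This is
      an illustrative sketch only: the real fix is in assembly, and widen_retcode()
      and format_exit() are hypothetical helper names. MAX_ERRNO is taken to be
      4095.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ERRNO 4095

/* Widen a 32-bit syscall return value for a callee that expects 64 bits. */
int64_t widen_retcode(uint32_t eax)
{
	/* Error codes occupy the top MAX_ERRNO values of the 32-bit range. */
	if (eax >= (uint32_t)-MAX_ERRNO)
		return (int64_t)(int32_t)eax; /* sign extend errors */
	return (int64_t)eax;                  /* zero extend everything else */
}

/* Format the value as a signed 64-bit consumer would print it. */
int format_exit(char *buf, size_t len, int64_t ret)
{
	return snprintf(buf, len, "exit=%" PRId64, ret);
}
```

      Passing the zero-extended value of -13 to format_exit() gives
      "exit=4294967283", while passing widen_retcode((uint32_t)-13) gives
      "exit=-13". Non-error values such as user addresses are left zero-extended.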
      
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com>
    • Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Eric Paris authored
      
      
      The audit system previously expected arches calling audit_syscall_exit to
      supply as arguments whether the syscall was a success and what the return code
      was.  Audit also provides a helper, AUDITSC_RESULT, which was supposed to
      simplify things by converting from negative retcodes to an audit internal
      magic value stating success or failure.  This helper was wrong and could
      indicate that a valid pointer returned to userspace was a failed syscall.  The
      fix is to fix the layering foolishness.  We now pass audit_syscall_exit a
      struct pt_regs and it in turn calls back into arch code to collect the return
      value and to determine if the syscall was a success or failure.  We also
      define a generic is_syscall_success() macro which treats a return value as a
      failure only if it falls in the errno range (-MAX_ERRNO to -1).  This works for
      arches like x86 which do not use a separate mechanism to indicate syscall
      failure.
      
      We make both the is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is that the audit function must take a void*
      for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void* we need to use static inlines to cast it back to the
      arch correct structure to dereference it.
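
      A userspace sketch of this layering is shown below. struct mock_pt_regs, its
      ax field, struct exit_info and mock_audit_syscall_exit() are assumptions made
      for illustration; only the names is_syscall_success() and regs_return_value()
      come from the text above, and MAX_ERRNO is taken to be 4095.

```c
#include <stdbool.h>

#define MAX_ERRNO 4095

/* Mock register layout; each real arch defines its own struct and field. */
struct mock_pt_regs {
	unsigned long ax; /* syscall return value register */
};

/* Typed static inlines: a void * argument converts to the arch type here. */
static inline long regs_return_value(struct mock_pt_regs *regs)
{
	return (long)regs->ax;
}

/* Failure means the value lies in the errno range [-MAX_ERRNO, -1]. */
static inline bool is_syscall_success(struct mock_pt_regs *regs)
{
	return regs->ax < (unsigned long)-MAX_ERRNO;
}

struct exit_info {
	bool success;
	long code;
};

/* Audit-side caller: it only ever sees an opaque pointer. */
struct exit_info mock_audit_syscall_exit(void *pt_regs)
{
	struct exit_info info = {
		.success = is_syscall_success(pt_regs),
		.code = regs_return_value(pt_regs),
	};
	return info;
}
```

      A return value such as (unsigned long)-8192 is negative when viewed as a long
      but lies outside the errno range, so this check reports success for it, which
      is the valid-pointer case the old helper got wrong.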
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      The only other user of this macro, kretprobe_example.c, won't notice and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
      always positive, so regs_return_value(), much like on ia64, makes it negative
      before calling the audit code when appropriate.
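
      The sign handling can be sketched as follows. This is a mock, not the powerpc
      code: the struct, the ppc_-prefixed function names and the field names are
      hypothetical, and the condition-register bit used to flag failure is an
      assumption.

```c
#include <stdbool.h>

/* Mock of the two registers involved; not the real struct pt_regs. */
struct mock_ppc_regs {
	unsigned long gpr3; /* return value, or a positive errno on failure */
	unsigned long ccr;  /* condition register */
};

/* Assumed position of the condition-register bit that flags failure. */
#define MOCK_CR0_SO 0x10000000UL

static inline bool ppc_is_syscall_success(const struct mock_ppc_regs *regs)
{
	return !(regs->ccr & MOCK_CR0_SO);
}

/* Hand audit a negative errno on failure, the raw value otherwise. */
static inline long ppc_regs_return_value(const struct mock_ppc_regs *regs)
{
	if (ppc_is_syscall_success(regs))
		return (long)regs->gpr3;
	return -(long)regs->gpr3;
}
```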
      
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
    • x86, opcode: ANDN and Group 17 in x86-opcode-map.txt · ce79dac8
      Ulrich Drepper authored
      The Intel documentation at
      
      http://software.intel.com/file/36945
      
      shows the ANDN opcode and Group 17 with encodings f2 and f3,
      respectively.  The current version of x86-opcode-map.txt shows them
      with f3 and f4.  Unless someone can point to documentation which shows
      the currently used encoding, the following patch should be applied.
      
      Signed-off-by: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/r/CAOPLpQdq5SuVo9=023CYhbFLAX9rONyjmYq7jJkqc5xwctW5eA@mail.gmail.com
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86/kconfig: Move the ZONE_DMA entry under a menu · 5ee71535
      Randy Dunlap authored
      
      
      Move the ZONE_DMA kconfig symbol under a menu item instead
      of having it listed before everything else in
      "make {xconfig | gconfig | nconfig | menuconfig}".
      
      This drops the first line of the top-level kernel config menu
      (in 3.2) below and moves it under "Processor type and features".
      
                [*] DMA memory allocation support
                    General setup  --->
                [*] Enable loadable module support  --->
                [*] Enable the block layer  --->
                    Processor type and features  --->
                    Power management and ACPI options  --->
                    Bus options (PCI etc.)  --->
                    Executable file formats / Emulations  --->
      
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-mm@kvack.org <linux-mm@kvack.org>
      Link: http://lkml.kernel.org/r/4F14811E.6090107@xenotime.net
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: David Rientjes <rientjes@google.com>
    • ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64) · cd298f60
      Kurt Garloff authored
      
      
      In SRAT v1, we had 8bit proximity domain (PXM) fields; SRAT v2 provides
      32bits for these. The new fields were reserved before.
      According to the ACPI spec, the OS must disregard reserved fields.
      
      x86/x86-64 was rather inconsistent prior to this patch; it used 8 bits
      for the pxm field in cpu_affinity, but 32 bits in mem_affinity.
      This patch makes it consistent: Either use 8 bits consistently (SRAT
      rev 1 or lower) or 32 bits (SRAT rev 2 or higher).
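
      The revision-dependent decoding can be sketched like this. The struct is a
      cut-down mock of an SRAT CPU affinity entry, and srat_cpu_pxm() is a
      hypothetical helper name; the split into one low byte and three high bytes is
      assumed from the 8-bit versus 32-bit description above.

```c
#include <stdint.h>

/* Mock of the relevant part of an SRAT CPU affinity entry. */
struct mock_cpu_affinity {
	uint8_t proximity_domain_lo;    /* bits 7:0, valid in every revision */
	uint8_t proximity_domain_hi[3]; /* bits 31:8, reserved before rev 2 */
};

/* Decode the proximity domain according to the SRAT table revision. */
uint32_t srat_cpu_pxm(const struct mock_cpu_affinity *pa, int srat_rev)
{
	uint32_t pxm = pa->proximity_domain_lo;

	/* Only trust the upper bytes when the table revision defines them. */
	if (srat_rev >= 2) {
		pxm |= (uint32_t)pa->proximity_domain_hi[0] << 8;
		pxm |= (uint32_t)pa->proximity_domain_hi[1] << 16;
		pxm |= (uint32_t)pa->proximity_domain_hi[2] << 24;
	}
	return pxm;
}
```

      For an entry with a low byte of 0x34 and a first high byte of 0x12, this
      returns 0x34 for a rev 1 table and 0x1234 for a rev 2 table.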
      
      cc: x86@kernel.org
      Signed-off-by: Kurt Garloff <kurt@garloff.de>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, Record ACPI NVS regions · b54ac6d2
      Huang Ying authored
      
      
      Some firmware will access memory in the ACPI NVS region via APEI.  That
      is, instructions in the APEI ERST/EINJ tables will read/write the ACPI NVS
      region.  The original resource conflict checking in the APEI code checks
      memory/ioport accessed by APEI via the general resource management
      mechanism.  But the ACPI NVS region is already marked as busy, so the
      false resource conflict prevents APEI ERST/EINJ from working.

      To fix this, this patch records ACPI NVS regions, so that we can avoid
      requesting resources for memory regions inside them.
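
      The idea can be sketched with a small region table. All names below
      (nvs_record(), nvs_contains(), needs_resource_request(), the fixed-size
      array) are hypothetical, and the sketch only handles ranges that lie entirely
      inside one recorded region; partial overlaps are treated as not covered.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NVS_REGIONS 8

struct nvs_region {
	uint64_t start;
	uint64_t size;
};

struct nvs_region nvs_regions[MAX_NVS_REGIONS];
int nvs_region_count;

/* Record one NVS range; returns false if the table is full. */
bool nvs_record(uint64_t start, uint64_t size)
{
	if (nvs_region_count >= MAX_NVS_REGIONS)
		return false;
	nvs_regions[nvs_region_count].start = start;
	nvs_regions[nvs_region_count].size = size;
	nvs_region_count++;
	return true;
}

/* True if [start, start + size) lies entirely inside a recorded region. */
bool nvs_contains(uint64_t start, uint64_t size)
{
	for (int i = 0; i < nvs_region_count; i++) {
		const struct nvs_region *r = &nvs_regions[i];

		/* Written without start + size to avoid overflow. */
		if (start >= r->start && size <= r->size &&
		    start - r->start <= r->size - size)
			return true;
	}
	return false;
}

/* A caller would skip the resource request for ranges inside NVS. */
bool needs_resource_request(uint64_t start, uint64_t size)
{
	return !nvs_contains(start, size);
}
```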
      
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • x86/UV2: Add accounting for BAU strong nacks · b54bd9be
      Cliff Wickman authored
      
      
      This patch adds separate accounting of UV2 message "strong
      nacks" in the BAU statistics.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120116212238.GF5767@sgi.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86/UV2: Ack BAU interrupt earlier · 88ed9dd7
      Cliff Wickman authored
      
      
      This patch moves the ack of the BAU interrupt to the beginning
      of the interrupt handler so that there is less possibility of a
      lost interrupt and slower response to a shootdown message.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120116212146.GE5767@sgi.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86/UV2: Remove stale no-resources test for UV2 BAU · 478c6e52
      Cliff Wickman authored
      
      
      This patch removes an unnecessary test for a
      no-destination-resources-available condition that looks like a
      destination timeout in UV1, but is separately distinguishable in
      UV2.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120116212050.GD5767@sgi.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86/UV2: Work around BAU bug · c5d35d39
      Cliff Wickman authored
      
      
      This patch implements a workaround for a UV2 hardware bug.
      The bug is a non-atomic update of a memory-mapped register. When
      hardware message delivery and software message acknowledge occur
      simultaneously the pending message acknowledge for the arriving
      message may be lost.  This causes the sender's message status to
      stay busy.
      
      Part of the workaround is to not acknowledge a completed message
      until it is verified that no other message is actually using the
      resource that is mistakenly recorded in the completed message.
      
      Part of the workaround is to test for long elapsed time in such
      a busy condition, then handle it by using a spare sending
      descriptor. The stay-busy condition is eventually timed out by
      hardware, and then the original sending descriptor can be
      re-used. Most of that logic change is in keeping track of the
      current descriptor and the state of the spares.
      
      The occurrences of the workaround are added to the BAU
      statistics.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120116211947.GC5767@sgi.com
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86/UV2: Fix BAU destination timeout initialization · d059f9fa
      Cliff Wickman authored
      
      
      Move the call to enable_timeouts() forward so that
      BAU_MISC_CONTROL is initialized before using it in
      calculate_destination_timeout().
      
      Fix the calculation of a BAU destination timeout
      for UV2 (in calculate_destination_timeout()).
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120116211848.GB5767@sgi.com
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86/UV2: Fix new UV2 hardware by using native UV2 broadcast mode · da87c937
      Cliff Wickman authored
      
      
      Update the Broadcast Assist Unit code on SGI Altix UV2 to use
      native UV2 mode on new hardware (not the legacy mode).

      UV2 native mode has a different format for a broadcast message.
      We also need quick differentiation between UV1 and UV2.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120116211750.GA5767@sgi.com
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • mce: fix warning messages about static struct mce_device · e032d807
      Greg Kroah-Hartman authored
      
      
      When suspending, there was a large list of warnings going something like:
      
      	Device 'machinecheck1' does not have a release() function, it is broken and must be fixed
      
      This patch turns the static mce_devices into dynamically allocated ones, and
      properly frees them when they are removed from the system.  It solves
      the warning messages on my laptop here.
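
      The static-to-dynamic change can be illustrated with a userspace mock of a
      refcounted device that carries a release() callback. Nothing here is the
      driver-core API: struct mock_device, mock_put_device(), mce_device_create()
      and released_count are hypothetical stand-ins, with calloc()/free() taking the
      place of kernel allocation.

```c
#include <stdlib.h>

/* Mock of a refcounted device: release() runs when the last ref is dropped. */
struct mock_device {
	int id;
	int refcount;
	void (*release)(struct mock_device *dev);
};

/* Lets a caller observe that release() actually ran. */
int released_count;

/* With dynamic allocation, release() has something meaningful to do. */
void mce_device_release(struct mock_device *dev)
{
	released_count++;
	free(dev);
}

/* Allocate one device per CPU instead of using a static object. */
struct mock_device *mce_device_create(int cpu)
{
	struct mock_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->id = cpu;
	dev->refcount = 1;
	dev->release = mce_device_release;
	return dev;
}

/* Mock of dropping a reference; frees the device via release() at zero. */
void mock_put_device(struct mock_device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}
```

      A statically allocated device has no sensible release() to provide, which is
      the situation the warning complains about.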
      
      Reported-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Tested-by: Djalal Harouni <tixxdz@opendz.org>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Borislav Petkov <bp@amd64.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. Jan 16, 2012
  6. Jan 14, 2012
  7. Jan 13, 2012
  8. Jan 12, 2012
  9. Jan 11, 2012
  10. Jan 10, 2012
  11. Jan 09, 2012
  12. Jan 08, 2012
  13. Jan 07, 2012