  1. May 23, 2011
  2. Apr 28, 2011
  3. Mar 23, 2011
  4. Feb 10, 2011
  5. Jan 31, 2011
  6. Jan 03, 2011
    • watchdog: Improve initialisation error message and documentation · 55142374
      Ben Hutchings authored
      
      The error message 'NMI watchdog failed to create perf event...'
      does not make it clear that this is a fatal error for the
      watchdog.  It also currently prints the error value as a
      pointer, rather than extracting the error code with PTR_ERR().
      Fix that.
      
      Add a note to the description of the 'nowatchdog' kernel
      parameter to associate it with this message.
      
      Reported-by: Cesare Leonardi <celeonar@gmail.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Cc: 599368@bugs.debian.org
      Cc: 608138@bugs.debian.org
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: <stable@kernel.org> # .37.x and later
      LKML-Reference: <1294009362.3167.126.camel@localhost>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
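The difference between printing an error pointer and printing its error code can be sketched in userspace. This is a minimal sketch, not the kernel code: `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` are simplified stand-ins for the helpers in `<linux/err.h>`, while `model_create_event()`, `report_failure()` and the message wording are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	/* Only small negative errno values can be encoded in a pointer. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical creation function that always fails with -ENOENT. */
static void *model_create_event(void)
{
	return ERR_PTR(-ENOENT);
}

/*
 * Hypothetical reporting helper: writes the numeric error code into buf
 * instead of the raw pointer value. Returns 1 if a failure was reported.
 */
static int report_failure(char *buf, size_t len)
{
	void *event = model_create_event();

	if (!IS_ERR(event))
		return 0;
	snprintf(buf, len,
		 "watchdog disabled: unable to create perf event: %ld",
		 PTR_ERR(event));
	return 1;
}
```

Formatting the same value with `%p` would show a large address-like number, which gives the reader no hint that it encodes an errno value.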
  7. Dec 17, 2010
  8. Dec 09, 2010
  9. Nov 26, 2010
  10. Nov 06, 2010
  11. Oct 23, 2010
  12. Sep 15, 2010
    • perf events: Clean up pid passing · 38a81da2
      Matt Helsley authored
      The kernel perf event creation path shouldn't use find_task_by_vpid()
      because a vpid exists in a specific namespace. find_task_by_vpid() uses
      current's pid namespace which isn't always the correct namespace to use
      for the vpid in all the places perf_event_create_kernel_counter() (and
      thus find_get_context()) is called.
      
      The goal is to clean up pid namespace handling and prevent bugs like:
      
      	https://bugzilla.kernel.org/show_bug.cgi?id=17281
      
      
      
      Instead of using pids, switch find_get_context() to use task struct
      pointers directly. The syscall is responsible for resolving the pid to
      a task struct. This moves the pid namespace resolution into the syscall
      much like every other syscall that takes pid parameters.
      
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robin Green <greenrd@greenrd.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      LKML-Reference: <a134e5e392ab0204961fd1a62c84a222bf5874a9.1284407763.git.matthltc@us.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • watchdog: Avoid kernel crash when disabling watchdog · d9ca07a0
      Stephane Eranian authored
      
      If you boot with the watchdog disabled (i.e., with nowatchdog) and
      then try to disable it via /proc/sys/kernel/watchdog, you get
      a kernel crash. The reason is that you are trying to cancel an hrtimer
      which has never been initialized.
      
      This patch fixes this by skipping execution of
      watchdog_disable_all_cpus() when the watchdog is marked
      disabled from boot.
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4c8f7a23.cae9d80a.2c11.0bb4@mx.google.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
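The guard described in this commit can be illustrated with a small state model. This is a sketch under stated assumptions rather than the kernel implementation: `struct watchdog_model`, `model_cancel_timer()` and `model_disable()` are hypothetical names, and the `assert()` stands in for the crash that cancelling an uninitialized hrtimer would cause.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state described in the commit message. */
struct watchdog_model {
	bool disabled_at_boot;  /* set when booting with "nowatchdog" */
	bool timer_initialized; /* stands in for the hrtimer setup */
	bool running;
};

static void model_cancel_timer(struct watchdog_model *w)
{
	/* Cancelling a timer that was never set up models the crash. */
	assert(w->timer_initialized);
	w->running = false;
}

/* Returns true if the timer was actually cancelled. */
static bool model_disable(struct watchdog_model *w)
{
	if (w->disabled_at_boot)
		return false; /* nothing was ever started, so skip the cancel */
	model_cancel_timer(w);
	return true;
}
```

With the early return in place, a disable request on a system booted with the watchdog off becomes a no-op instead of reaching the cancel path.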
  13. Sep 01, 2010
  14. Aug 23, 2010
  15. Aug 20, 2010
  16. Jul 07, 2010
    • kernel/watchdog: Initialize 'result' · eb703f98
      Kulikov Vasiliy authored
      
      The on-stack variable 'result' is not initialized to zero;
      initialize it explicitly.
      
      This bug was found by a compiler warning:
      
       kernel/watchdog.c:463: warning: 'result' may be used uninitialized in this function
      
      Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1278316854-28442-1-git-send-email-segooon@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. May 19, 2010
  18. May 16, 2010
    • lockup_detector: Cross arch compile fixes · cafcd80d
      Don Zickus authored
      
      Combining the softlockup and hardlockup code causes watchdog.c
      to build even without the hardlockup detection support.
      
      So if an arch, that has the previous and the new nmi watchdog
      implementations cohabiting, wants to know if the generic one
      is in use, CONFIG_LOCKUP_DETECTOR is not a reliable check.
      We need to use CONFIG_HARDLOCKUP_DETECTOR instead.
      
      Fixes:
      	kernel/built-in.o: In function `touch_nmi_watchdog':
      	(.text+0x449bc): multiple definition of `touch_nmi_watchdog'
      	arch/sparc/kernel/built-in.o:(.text+0x11b28): first defined here
      
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      LKML-Reference: <20100514151121.GR15159@redhat.com>
      [ use CONFIG_HARDLOCKUP_DETECTOR instead of CONFIG_PERF_EVENTS_NMI]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  19. May 15, 2010
    • lockup_detector: Introduce CONFIG_HARDLOCKUP_DETECTOR · 23637d47
      Frederic Weisbecker authored
      
      This new config option is meant to further simplify the lockup
      detector dependencies and to make it easier to distinguish the
      archs that support the new generic lockup detector from those
      that still have their own, especially those that are in the
      middle of this migration.
      
      Instead of checking whether we have CONFIG_LOCKUP_DETECTOR +
      CONFIG_PERF_EVENTS_NMI each time an arch wants to know if it needs
      to build its own lockup detector, take a shortcut with this new
      config. It is enabled only if the hardlockup detection part of
      the whole lockup detector is on.
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
  20. May 13, 2010
    • watchdog: Export touch_softlockup_watchdog · 0167c781
      Ingo Molnar authored
      
      There are modules that rely on it:
      
        ERROR: "touch_softlockup_watchdog" [drivers/video/nvidia/nvidiafb.ko] undefined!
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      LKML-Reference: <1273713674-8434-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  21. May 12, 2010
    • lockup_detector: Separate touch_nmi_watchdog code path from touch_watchdog · d7c54733
      Don Zickus authored
      
      When I combined the nmi_watchdog (hardlockup) and softlockup code, I
      also combined the paths the touch_watchdog and touch_nmi_watchdog took.
      As Frederic W. pointed out, this may not be the best idea: the
      touch_watchdog case probably should not reset the hardlockup count.
      
      Therefore the patch below falls back to the previous idea of keeping
      the touch_nmi_watchdog a superset of the touch_watchdog case.
      
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-9-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
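The "superset" relationship between the two touch paths can be sketched as follows. This is an illustrative model only: `struct cpu_touch_model` and both `model_touch_*()` functions are hypothetical names, and the convention that a zero timestamp means "touched" is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the two detectors. */
struct cpu_touch_model {
	unsigned long soft_touch_ts; /* 0 is assumed to mean "just touched" */
	bool hard_touched;           /* tells the hardlockup check to skip once */
};

/* Softlockup touch: resets only the softlockup state. */
static void model_touch_softlockup(struct cpu_touch_model *c)
{
	assert(c);
	c->soft_touch_ts = 0;
}

/* NMI touch: resets the hardlockup state and then the softlockup state. */
static void model_touch_nmi(struct cpu_touch_model *c)
{
	assert(c);
	c->hard_touched = true;
	model_touch_softlockup(c);
}
```

In this arrangement a caller that only wants to quiet the softlockup detector leaves the hardlockup state alone, which is the behaviour the commit message argues for.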
    • lockup_detector: Touch_softlockup cleanups and softlockup_tick removal · 332fbdbc
      Don Zickus authored
      
      Just some code cleanup to make touch_softlockup clearer and remove the
      softlockup_tick function as it is no longer needed.
      
      Also remove the /proc softlockup_thres call as it has been changed to
      watchdog_thres.
      
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • lockup_detector: Combine nmi_watchdog and softlockup detector · 58687acb
      Don Zickus authored
      
      The new nmi_watchdog (which uses the perf event subsystem) is very
      similar in structure to the softlockup detector.  Using Ingo's
      suggestion, I combined the two functionalities into one file:
      kernel/watchdog.c.
      
      Now both the nmi_watchdog (or hardlockup detector) and softlockup
      detector sit on top of the perf event subsystem, which is run every
      60 seconds or so to see if there are any lockups.
      
      To detect hardlockups, cpus not responding to interrupts, I
      implemented an hrtimer that runs 5 times for every perf event
      overflow event.  If that stops counting on a cpu, then the cpu is
      most likely in trouble.
      
      To detect softlockups, tasks not yielding to the scheduler, I used the
      previous kthread idea that now gets kicked every time the hrtimer fires.
      If the kthread isn't being scheduled, neither is anyone else, and the
      warning is printed to the console.
      
      I tested this on x86_64 and both the softlockup and hardlockup paths
      work.
      
      V2:
      - cleaned up the Kconfig and softlockup combination
      - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
      - separated out the softlockup case from perf event subsystem
      - re-arranged the enabling/disabling nmi watchdog from proc space
      - added cpumasks for hardlockup failure cases
      - removed fallback to soft events if no PMU exists for hard events
      
      V3:
      - comment cleanups
      - drop support for older softlockup code
      - per_cpu cleanups
      - completely remove software clock base hardlockup detector
      - use per_cpu masking on hard/soft lockup detection
      - #ifdef cleanups
      - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
      - documentation additions
      
      V4:
      - documentation fixes
      - convert per_cpu to __get_cpu_var
      - powerpc compile fixes
      
      V5:
      - split apart warn flags for hard and soft lockups
      
      TODO:
      - figure out how to make an arch-agnostic clock2cycles call
        (if possible) to feed into perf events as a sample period
      
      [fweisbec: merged conflict patch]
      
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
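The two detection ideas in this commit message can be modelled in a few lines of userspace C. This is a rough sketch, not the code in kernel/watchdog.c: `struct cpu_state` and the `model_*()` functions are hypothetical, time is a plain integer in seconds, and the threshold is passed in by the caller rather than read from any real tunable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU bookkeeping for both detectors. */
struct cpu_state {
	unsigned long timer_ticks;      /* bumped on every modelled hrtimer tick */
	unsigned long timer_ticks_seen; /* value observed at the last overflow */
	unsigned long kthread_touch_ts; /* last time the kthread ran, in seconds */
};

/* Modelled hrtimer callback: proves the CPU still takes interrupts. */
static void model_timer_tick(struct cpu_state *c)
{
	c->timer_ticks++;
}

/*
 * Modelled perf overflow handler check: if the tick count has not moved
 * since the previous overflow, the CPU is assumed to be hard locked up.
 */
static bool model_is_hardlockup(struct cpu_state *c)
{
	if (c->timer_ticks == c->timer_ticks_seen)
		return true;
	c->timer_ticks_seen = c->timer_ticks;
	return false;
}

/*
 * Modelled softlockup check: the kthread has not been scheduled for
 * longer than thresh seconds.
 */
static bool model_is_softlockup(const struct cpu_state *c,
				unsigned long now, unsigned long thresh)
{
	assert(now >= c->kthread_touch_ts);
	return now - c->kthread_touch_ts > thresh;
}
```

Because the timer is described as firing several times per overflow period, a healthy CPU always shows a changed tick count by the time the overflow check runs; only a CPU that has stopped servicing interrupts leaves the count unchanged.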