Skip to content
  1. Feb 06, 2016
    • Eric Dumazet's avatar
      dump_stack: avoid potential deadlocks · d7ce3692
      Eric Dumazet authored
      
      
      Some servers experienced fatal deadlocks because of a combination of
      bugs, leading to multiple cpus calling dump_stack().
      
      The checksumming bug was fixed in commit 34ae6a1a ("ipv6: update
      skb->csum when CE mark is propagated").
      
      The second problem is a faulty locking in dump_stack()
      
      CPU1 runs in process context and calls dump_stack(), grabs dump_lock.
      
         CPU2 receives a TCP packet under softirq, grabs socket spinlock, and
         call dump_stack() from netdev_rx_csum_fault().
      
         dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
         dump_lock is owned by CPU1
      
      While dumping its stack, CPU1 is interrupted by a softirq, and happens
      to process a packet for the TCP socket locked by CPU2.
      
      CPU1 spins forever in spin_lock() : deadlock
      
      Stack trace on CPU1 looked like :
      
          NMI backtrace for cpu 1
          RIP: _raw_spin_lock+0x25/0x30
          ...
          Call Trace:
            <IRQ>
            tcp_v6_rcv+0x243/0x620
            ip6_input_finish+0x11f/0x330
            ip6_input+0x38/0x40
            ip6_rcv_finish+0x3c/0x90
            ipv6_rcv+0x2a9/0x500
            process_backlog+0x461/0xaa0
            net_rx_action+0x147/0x430
            __do_softirq+0x167/0x2d0
            call_softirq+0x1c/0x30
            do_softirq+0x3f/0x80
            irq_exit+0x6e/0xc0
            smp_call_function_single_interrupt+0x35/0x40
            call_function_single_interrupt+0x6a/0x70
            <EOI>
            printk+0x4d/0x4f
            printk_address+0x31/0x33
            print_trace_address+0x33/0x3c
            print_context_stack+0x7f/0x119
            dump_trace+0x26b/0x28e
            show_trace_log_lvl+0x4f/0x5c
            show_stack_log_lvl+0x104/0x113
            show_stack+0x42/0x44
            dump_stack+0x46/0x58
            netdev_rx_csum_fault+0x38/0x3c
            __skb_checksum_complete_head+0x6e/0x80
            __skb_checksum_complete+0x11/0x20
            tcp_rcv_established+0x2bd5/0x2fd0
            tcp_v6_do_rcv+0x13c/0x620
            sk_backlog_rcv+0x15/0x30
            release_sock+0xd2/0x150
            tcp_recvmsg+0x1c1/0xfc0
            inet_recvmsg+0x7d/0x90
            sock_recvmsg+0xaf/0xe0
            ___sys_recvmsg+0x111/0x3b0
            SyS_recvmsg+0x5c/0xb0
            system_call_fastpath+0x16/0x1b
      
      Fixes: b58d9774 ("dump_stack: serialize the output from dump_stack()")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d7ce3692
  2. Feb 03, 2016
    • Matthew Wilcox's avatar
      radix-tree: fix race in gang lookup · 46437f9a
      Matthew Wilcox authored
      
      
      If the indirect_ptr bit is set on a slot, that indicates we need to redo
      the lookup.  Introduce a new function radix_tree_iter_retry() which
      forces the loop to retry the lookup by setting 'slot' to NULL and
      turning the iterator back to point at the problematic entry.
      
      This is a pretty rare problem to hit at the moment; the lookup has to
      race with a grow of the radix tree from a height of 0.  The consequences
      of hitting this race are that gang lookup could return a pointer to a
      radix_tree_node instead of a pointer to whatever the user had inserted
      in the tree.
      
      Fixes: cebbd29e ("radix-tree: rewrite gang lookup using iterator")
      Signed-off-by: default avatarMatthew Wilcox <willy@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ohad Ben-Cohen <ohad@wizery.com>
      Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      46437f9a
    • Vitaly Kuznetsov's avatar
      lib/test-string_helpers.c: fix and improve string_get_size() tests · 72676bb5
      Vitaly Kuznetsov authored
      
      
      Recently added commit 564b026f ("string_helpers: fix precision loss
      for some inputs") fixed precision issues for string_get_size() and broke
      tests.
      
      Fix and improve them: test both STRING_UNITS_2 and STRING_UNITS_10 at a
      time, better failure reporting, test small an huge values.
      
      Fixes: 564b026f ("string_helpers: fix precision loss for some inputs")
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: James Bottomley <JBottomley@Odin.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      72676bb5
  3. Jan 27, 2016
  4. Jan 22, 2016
    • Jaewon Kim's avatar
      ratelimit: fix bug in time interval by resetting right begin time · c2594bc3
      Jaewon Kim authored
      
      
      rs->begin in ratelimit is set in two cases.
       1) when rs->begin was not initialized
       2) when rs->interval was passed
      
      For case #2, current ratelimit sets the begin to 0.  This incurrs
      improper suppression.  The begin value will be set in the next ratelimit
      call by 1).  Then the time interval check will be always false, and
      rs->printed will not be initialized.  Although enough time passed,
      ratelimit may return 0 if rs->printed is not less than rs->burst.  To
      reset interval properly, begin should be jiffies rather than 0.
      
      For an example code below:
      
          static DEFINE_RATELIMIT_STATE(mylimit, 1, 1);
          for (i = 1; i <= 10; i++) {
              if (__ratelimit(&mylimit))
                  printk("ratelimit test count %d\n", i);
              msleep(3000);
          }
      
      test result in the current code shows suppression even there is 3 seconds sleep.
      
        [  78.391148] ratelimit test count 1
        [  81.295988] ratelimit test count 2
        [  87.315981] ratelimit test count 4
        [  93.336267] ratelimit test count 6
        [  99.356031] ratelimit test count 8
        [ 105.376367] ratelimit test count 10
      
      Signed-off-by: default avatarJaewon Kim <jaewon31.kim@samsung.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c2594bc3
  5. Jan 21, 2016
  6. Jan 19, 2016
  7. Jan 18, 2016
    • Arnd Bergmann's avatar
      lib: sw842: select crc32 · 5b571677
      Arnd Bergmann authored
      
      
      The sw842 library code was merged in linux-4.1 and causes a very rare randconfig
      failure when CONFIG_CRC32 is not set:
      
          lib/built-in.o: In function `sw842_compress':
          oid_registry.c:(.text+0x12ddc): undefined reference to `crc32_be'
          lib/built-in.o: In function `sw842_decompress':
          oid_registry.c:(.text+0x137e4): undefined reference to `crc32_be'
      
      This adds an explict 'select CRC32' statement, similar to what the other users
      of the crc32 code have. In practice, CRC32 is always enabled anyway because
      over 100 other symbols select it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Fixes: 2da572c9 ("lib: add software 842 compression/decompression")
      Acked-by: default avatarDan Streetman <ddstreet@ieee.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      5b571677
  8. Jan 16, 2016
Loading