  1. Oct 10, 2008
  2. Oct 07, 2008
    • sunrpc: fix oops in rpc_create when the mount namespace is unshared · 63ffc23d
      Cedric Le Goater authored
      
      
      On a system with nfs mounts, if a task unshares its mount namespace,
      an oops can occur when the system is rebooted if the task is the last
      to unreference the nfs mount. It will try to create an rpc request
      using utsname(), which has been invalidated by free_nsproxy().
      
      The patch fixes the issue by using the global init_utsname(), which is
      always valid. The capability of identifying rpc clients per uts namespace
      still needs some extra work, so this should not be a problem.
      
      BUG: unable to handle kernel NULL pointer dereference at 00000004
      IP: [<c024c9ab>] rpc_create+0x332/0x42f
      Oops: 0000 [#1] DEBUG_PAGEALLOC
      
      Pid: 1857, comm: uts-oops Not tainted (2.6.27-rc5-00319-g7686ad5 #4)
      EIP: 0060:[<c024c9ab>] EFLAGS: 00210287 CPU: 0
      EIP is at rpc_create+0x332/0x42f
      EAX: 00000000 EBX: df26adf0 ECX: c0251887 EDX: 00000001
      ESI: df26ae58 EDI: c02f293c EBP: dda0fc9c ESP: dda0fc2c
       DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
      Process uts-oops (pid: 1857, ti=dda0e000 task=dd9a0778 task.ti=dda0e000)
      Stack: c0104532 dda0fffc dda0fcac dda0e000 dda0e000 dd93b7f0 00000009 c02f2880
             df26aefc dda0fc68 c01096b7 00000000 c0266ee0 c039a070 c039a070 dda0fc74
             c012ca67 c039a064 dda0fc8c c012cb20 c03daf74 00000011 00000000 c0275c90
      Call Trace:
       [<c0104532>] ? dump_trace+0xc2/0xe2
       [<c01096b7>] ? save_stack_trace+0x1c/0x3a
       [<c012ca67>] ? save_trace+0x37/0x8c
       [<c012cb20>] ? add_lock_to_list+0x64/0x96
       [<c0256fc4>] ? rpcb_register_call+0x62/0xbb
       [<c02570c8>] ? rpcb_register+0xab/0xb3
       [<c0252f4d>] ? svc_register+0xb4/0x128
       [<c0253114>] ? svc_destroy+0xec/0x103
       [<c02531b2>] ? svc_exit_thread+0x87/0x8d
       [<c01a75cd>] ? lockd_down+0x61/0x81
       [<c01a577b>] ? nlmclnt_done+0xd/0xf
       [<c01941fe>] ? nfs_destroy_server+0x14/0x16
       [<c0194328>] ? nfs_free_server+0x4c/0xaa
       [<c019a066>] ? nfs_kill_super+0x23/0x27
       [<c0158585>] ? deactivate_super+0x3f/0x51
       [<c01695d1>] ? mntput_no_expire+0x95/0xb4
       [<c016965b>] ? release_mounts+0x6b/0x7a
       [<c01696cc>] ? __put_mnt_ns+0x62/0x70
       [<c0127501>] ? free_nsproxy+0x25/0x80
       [<c012759a>] ? switch_task_namespaces+0x3e/0x43
       [<c01275a9>] ? exit_task_namespaces+0xa/0xc
       [<c0117fed>] ? do_exit+0x4fd/0x666
       [<c01181b3>] ? do_group_exit+0x5d/0x83
       [<c011fa8c>] ? get_signal_to_deliver+0x2c8/0x2e0
       [<c0102630>] ? do_notify_resume+0x69/0x700
       [<c011d85a>] ? do_sigaction+0x134/0x145
       [<c0127205>] ? hrtimer_nanosleep+0x8f/0xce
       [<c0126d1a>] ? hrtimer_wakeup+0x0/0x1c
       [<c0103488>] ? work_notifysig+0x13/0x1b
       =======================
      Code: 70 20 68 cb c1 2c c0 e8 75 4e 01 00 8b 83 ac 00 00 00 59 3d 00 f0 ff ff 5f 77 63 eb 57 a1 00 80 2d c0 8b 80 a8 02 00 00 8d 73 68 <8b> 40 04 83 c0 45 e8 41 46 f7 ff ba 20 00 00 00 83 f8 21 0f 4c
      EIP: [<c024c9ab>] rpc_create+0x332/0x42f SS:ESP 0068:dda0fc2c
      
      Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "Serge E. Hallyn" <serue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      63ffc23d
    • SUNRPC: Fix autobind on cloned rpc clients · 9a4bd29f
      Trond Myklebust authored
      
      
      Despite the fact that cloned rpc clients won't have the cl_autobind flag
      set, they may still find themselves calling rpcb_getport_async(). For this
      to happen, it suffices for a _parent_ rpc_clnt to use autobinding, in which
      case any clone may find itself triggering the !xprt_bound() case in
      call_bind().
      
      The correct fix for this is to walk back up the tree of cloned rpc clients,
      in order to find the parent that 'owns' the transport, either because it
      has clnt->cl_autobind set, or because it originally created the
      transport...
      
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      9a4bd29f
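The walk up the clone tree described in this commit can be sketched in userspace C. This is a minimal model, not the kernel code: `model_xprt`, `model_clnt`, and `find_transport_owner` are hypothetical names, and the convention that a root client is its own parent is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's transport and client. */
struct model_xprt {
	int bound;
};

struct model_clnt {
	struct model_clnt *parent;   /* assumption: a root client points to itself */
	struct model_xprt *xprt;     /* transport, shared with clones */
	int autobind;                /* models clnt->cl_autobind */
};

/*
 * Walk up the tree of cloned clients and return the one that owns the
 * transport: either the first client with autobind set, or the last
 * client that still shares the same transport as its descendants.
 */
static struct model_clnt *find_transport_owner(struct model_clnt *clnt)
{
	struct model_clnt *parent;

	assert(clnt != NULL);
	parent = clnt->parent;

	while (parent != clnt) {
		if (parent->xprt != clnt->xprt)
			break;      /* clnt created this transport itself */
		if (clnt->autobind)
			break;      /* clnt uses autobinding, so it owns the transport */
		clnt = parent;
		parent = parent->parent;
	}
	return clnt;
}
```

In this model a clone that triggers the unbound-transport case would issue the port lookup on behalf of the returned owner rather than on its own behalf.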
    • sunrpc: do not pin sunrpc module in the memory · c9f6cde6
      Denis V. Lunev authored
      
      
      Basically, the try_module_get() calls here are pretty useless. Any other
      module using this API will pin sunrpc in memory because it uses exported
      symbols.
      
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      c9f6cde6
  3. Sep 01, 2008
    • sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports · 27df6f25
      Cyrill Gorcunov authored
      
      
      Vegard Nossum reported
      ----------------------
      > I noticed that something weird is going on with /proc/sys/sunrpc/transports.
      > This file is generated in net/sunrpc/sysctl.c, function proc_do_xprt(). When
      > I "cat" this file, I get the expected output:
      >    $ cat /proc/sys/sunrpc/transports
      >    tcp 1048576
      >    udp 32768
      
      > But I think that it does not check the length of the buffer supplied by
      > userspace to read(). With my original program, I found that the stack was
      > being overwritten by the characters above, even when the length given to
      > read() was just 1.
      
      David Wagner added (among other things) that copy_to_user() could
      probably be used here.
      
      Ingo Oeser suggested using simple_read_from_buffer() here.
      
      The conclusion is that proc_do_xprt() indeed does not check the size of
      the userspace buffer, so fix this by using Ingo's suggestion.
      
      Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      CC: Ingo Oeser <ioe-lkml@rameria.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Greg Banks <gnb@sgi.com>
      Cc: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      27df6f25
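The bounded copy that this fix relies on can be modelled in userspace C. This is a sketch of the clamping behaviour only: `bounded_read` is a hypothetical name, `memcpy()` stands in for `copy_to_user()`, and the `-1` error return is a placeholder for the kernel's `-EINVAL`.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/*
 * Copy at most `count` bytes from `from` (holding `available` bytes) into
 * `to`, starting at offset *ppos, and advance *ppos by the amount copied.
 * The caller's buffer size is always respected, even if more data exists.
 */
static ssize_t bounded_read(void *to, size_t count, off_t *ppos,
			    const void *from, size_t available)
{
	off_t pos;

	assert(ppos != NULL);
	pos = *ppos;

	if (pos < 0)
		return -1;                      /* placeholder for -EINVAL */
	if ((size_t)pos >= available || count == 0)
		return 0;                       /* nothing left to read */
	if (count > available - (size_t)pos)
		count = available - (size_t)pos; /* clamp to remaining data */

	memcpy(to, (const char *)from + pos, count);
	*ppos = pos + (off_t)count;
	return (ssize_t)count;
}
```

With this shape, a read() of length 1 copies exactly one byte, which is the case the original report found to overwrite the caller's stack.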
  4. Aug 13, 2008
    • svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet · 24b8b447
      Tom Tucker authored
      
      
      RDMA_READ completions are kept on a separate queue from the general
      I/O request queue. Since a separate lock is used to protect the RDMA_READ
      completion queue, a race exists between the dto_tasklet and the
      svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA
      bit and adds I/O to the read-completion queue. Concurrently, the
      recvfrom thread checks the generic queue, finds it empty and resets
      the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue
      the transport for I/O and cause the transport to "stall".
      
      The fix is to protect both lists with the same lock and set the XPT_DATA
      bit with this lock held.
      
      Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      24b8b447
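The locking rule in this fix (one lock for both queues, with the data-ready flag changed only while that lock is held) can be sketched with a userspace mutex. This is a simplified model under stated assumptions: `model_transport`, `complete_read`, and `take_work` are hypothetical names, the queues are reduced to counters, and `data_ready` stands in for the XPT_DATA bit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct model_transport {
	pthread_mutex_t lock;        /* one lock for both queues and the flag */
	int dto_pending;             /* entries on the general I/O queue */
	int read_complete_pending;   /* entries on the read-completion queue */
	bool data_ready;             /* stands in for XPT_DATA */
};

/* Completion side: queue a read completion and mark data as ready. */
static void complete_read(struct model_transport *t)
{
	assert(t != NULL);
	pthread_mutex_lock(&t->lock);
	t->read_complete_pending++;
	t->data_ready = true;        /* set while the queue lock is held */
	pthread_mutex_unlock(&t->lock);
}

/*
 * Receive side: take one item from either queue. The flag is cleared only
 * when both queues are seen empty under the same lock, so a concurrent
 * complete_read() cannot slip in between the check and the clear.
 */
static bool take_work(struct model_transport *t)
{
	bool got = false;

	assert(t != NULL);
	pthread_mutex_lock(&t->lock);
	if (t->read_complete_pending > 0) {
		t->read_complete_pending--;
		got = true;
	} else if (t->dto_pending > 0) {
		t->dto_pending--;
		got = true;
	} else {
		t->data_ready = false;
	}
	pthread_mutex_unlock(&t->lock);
	return got;
}
```

With two separate locks, the check of the general queue and the clearing of the flag could interleave with an enqueue on the other queue, which is the stall the commit describes.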
  5. Jul 26, 2008
    • SL*B: drop kmem cache argument from constructor · 51cc5068
      Alexey Dobriyan authored
      
      
      Kmem cache passed to constructor is only needed for constructors that are
      themselves multiplexers.  Nobody uses this "feature", nor does anybody use
      the passed kmem cache in a non-trivial way, so pass only a pointer to the
      object.
      
      Non-trivial places are:
      	arch/powerpc/mm/init_64.c
      	arch/powerpc/mm/hugetlbpage.c
      
      This is a flag day, yes.
      
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Acked-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Matt Mackall <mpm@selenic.com>
      [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
      [akpm@linux-foundation.org: fix mm/slab.c]
      [akpm@linux-foundation.org: fix ubifs]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      51cc5068
    • dma-mapping: add the device argument to dma_mapping_error() · 8d8bb39b
      FUJITA Tomonori authored
      
      Add per-device dma_mapping_ops support for CONFIG_X86_64, as the POWER
      architecture does:
      
      This enables us to cleanly fix the Calgary IOMMU issue that some devices
      are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).
      
      I think that per-device dma_mapping_ops support would also be helpful for
      KVM people to support PCI passthrough, but Andi thinks that this makes it
      difficult to support PCI passthrough (see the above thread).  So I
      CC'ed this to the KVM camp.  Comments are appreciated.
      
      A pointer to dma_mapping_ops is added to struct dev_archdata.  If the
      pointer is non-NULL, DMA operations in asm/dma-mapping.h use it.  If it's
      NULL, the system-wide dma_ops pointer is used as before.
      
      If it's useful for KVM people, I plan to implement a mechanism to register
      a hook called when a new pci (or dma capable) device is created (it works
      with hot plugging).  It enables IOMMUs to set up an appropriate
      dma_mapping_ops per device.
      
      The major obstacle is that dma_mapping_error doesn't take a pointer to the
      device unlike other DMA operations.  So x86 can't have dma_mapping_ops per
      device.  Note all the POWER IOMMUs use the same dma_mapping_error function
      so this is not a problem for POWER but x86 IOMMUs use different
      dma_mapping_error functions.
      
      The first patch adds the device argument to dma_mapping_error.  The patch
      is trivial but large since it touches lots of drivers and dma-mapping.h in
      all the architectures.
      
      This patch:
      
      dma_mapping_error() doesn't take a pointer to the device unlike other DMA
      operations.  So we can't have dma_mapping_ops per device.
      
      Note that POWER already has dma_mapping_ops per device but all the POWER
      IOMMUs use the same dma_mapping_error function.  x86 IOMMUs use device
      argument.
      
      [akpm@linux-foundation.org: fix sge]
      [akpm@linux-foundation.org: fix svc_rdma]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix bnx2x]
      [akpm@linux-foundation.org: fix s2io]
      [akpm@linux-foundation.org: fix pasemi_mac]
      [akpm@linux-foundation.org: fix sdhci]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix sparc]
      [akpm@linux-foundation.org: fix ibmvscsi]
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8d8bb39b
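The per-device lookup with a system-wide fallback, and the reason dma_mapping_error() needs the device argument, can be sketched in userspace C. Everything here is a hypothetical model: the `model_*` types and functions, the two example error checks, and the sentinel addresses (`0` and `~0UL`) are assumptions chosen for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct model_device;

/* Simplified ops table: only the error check is modelled. */
struct model_dma_ops {
	int (*mapping_error)(struct model_device *dev, unsigned long addr);
};

struct model_device {
	const struct model_dma_ops *dma_ops;   /* NULL: use the system-wide ops */
};

/* Two implementations that disagree on what a bad address looks like. */
static int default_mapping_error(struct model_device *dev, unsigned long addr)
{
	(void)dev;
	return addr == 0;            /* assumed sentinel for this sketch */
}

static int iommu_mapping_error(struct model_device *dev, unsigned long addr)
{
	(void)dev;
	return addr == ~0UL;         /* assumed sentinel for this sketch */
}

static const struct model_dma_ops default_ops = { default_mapping_error };
static const struct model_dma_ops iommu_ops = { iommu_mapping_error };
static const struct model_dma_ops *system_dma_ops = &default_ops;

/* Prefer the per-device ops; fall back to the system-wide pointer. */
static const struct model_dma_ops *model_get_dma_ops(struct model_device *dev)
{
	if (dev != NULL && dev->dma_ops != NULL)
		return dev->dma_ops;
	return system_dma_ops;
}

/* Without `dev`, there would be no way to pick the right error check. */
static int model_dma_mapping_error(struct model_device *dev, unsigned long addr)
{
	const struct model_dma_ops *ops = model_get_dma_ops(dev);

	assert(ops != NULL && ops->mapping_error != NULL);
	return ops->mapping_error(dev, addr);
}
```

In this model a device with its own ops table is checked by `iommu_mapping_error`, while a device without one (or a NULL device) falls through to the system-wide check.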
    • cpumask: change cpumask_of_cpu_ptr to use new cpumask_of_cpu · 0bc3cc03
      Mike Travis authored
      
      
        * Replace previous instances of the cpumask_of_cpu_ptr* macros
          with the new (lvalue capable) generic cpumask_of_cpu().
      
      Signed-off-by: Mike Travis <travis@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0bc3cc03
  6. Jul 18, 2008
    • cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr · 65c01184
      Mike Travis authored
      
      
        * This patch replaces the dangerous lvalue version of cpumask_of_cpu
          with new cpumask_of_cpu_ptr macros.  These are patterned after the
          node_to_cpumask_ptr macros.
      
          In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
          the cpumask_of_cpu_map[cpu] entry is used.  The cpumask_of_cpu_map
          is provided when there is a large NR_CPUS count, reducing
          greatly the amount of code generated and stack space used for
          cpumask_of_cpu().  The pointer to the cpumask_t value is needed for
          calling set_cpus_allowed_ptr() to reduce the amount of stack space
          needed to pass the cpumask_t value.
      
          If there isn't a cpumask_of_cpu_map[], then a temporary variable is
          declared and filled in with value from cpumask_of_cpu(cpu) as well as
          a pointer variable pointing to this temporary variable.  Afterwards,
          the pointer is used to reference the cpumask value.  The compiler
          will optimize out the extra dereference through the pointer as well
          as the stack space used for the pointer, resulting in identical code.
      
          A good example of the orthogonal usages is in net/sunrpc/svc.c:
      
      	case SVC_POOL_PERCPU:
      	{
      		unsigned int cpu = m->pool_to[pidx];
      		cpumask_of_cpu_ptr(cpumask, cpu);
      
      		*oldmask = current->cpus_allowed;
      		set_cpus_allowed_ptr(current, cpumask);
      		return 1;
      	}
      	case SVC_POOL_PERNODE:
      	{
      		unsigned int node = m->pool_to[pidx];
      		node_to_cpumask_ptr(nodecpumask, node);
      
      		*oldmask = current->cpus_allowed;
      		set_cpus_allowed_ptr(current, nodecpumask);
      		return 1;
      	}
      
      Signed-off-by: Mike Travis <travis@sgi.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      65c01184
  7. Jul 15, 2008
  8. Jul 09, 2008