  1. Sep 06, 2016
    • xprtrdma: Revert 3d4cf35b ("xprtrdma: Reply buffer exhaustion...") · 78d506e1
      Chuck Lever authored
      
      
      Receive buffer exhaustion, if it were to actually occur, would be
      catastrophic. However, when there are no reply buffers to post, that
      means all of them have already been posted and are waiting for
      incoming replies. By design, there can never be more RPCs in flight
      than there are available receive buffers.
      
      A receive buffer can be left posted after an RPC exits without a
      received reply; say, due to a credential problem or a soft timeout.
      This does not result in fewer posted receive buffers than there are
      pending RPCs, and there is already logic in xprtrdma to deal
      appropriately with this case.
      
      It also looks like the "+ 2" that was removed was accidentally
      accommodating the number of extra receive buffers needed for
      receiving backchannel requests. That will need to be addressed by
      another patch.
      
      Fixes: 3d4cf35b ("xprtrdma: Reply buffer exhaustion can be...")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      78d506e1
  2. Sep 03, 2016
  3. Aug 25, 2016
  4. Aug 05, 2016
  5. Aug 02, 2016
  6. Aug 01, 2016
    • SUNRPC: Detect immediate closure of accepted sockets · c7995f8a
      Trond Myklebust authored
      
      
      This modification is useful for debugging issues that happen while
      the socket is being initialised.
      
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      c7995f8a
    • SUNRPC: accept() may return sockets that are still in SYN_RECV · b2f21f7d
      Trond Myklebust authored
      
      
      We're seeing traces of the following form:
      
       [10952.396347] svc: transport ffff88042ba4a000 dequeued, inuse=2
       [10952.396351] svc: tcp_accept ffff88042ba4a000 sock ffff88042a6e4c80
       [10952.396362] nfsd: connect from 10.2.6.1, port=187
       [10952.396364] svc: svc_setup_socket ffff8800b99bcf00
       [10952.396368] setting up TCP socket for reading
       [10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800)
       [10952.396373] svc: transport ffff8803eb10a000 put into queue
       [10952.396375] svc: transport ffff88042ba4a000 put into queue
       [10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000)
       [10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2
       [10952.396381] svc_recv: found XPT_CLOSE
       [10952.396397] svc: svc_delete_xprt(ffff8803eb10a000)
       [10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000)
       [10952.396399] svc: svc_sock_detach(ffff8803eb10a000)
       [10952.396412] svc: svc_sock_free(ffff8803eb10a000)
      
      i.e. an immediate close of the socket after initialisation.
      
      The culprit appears to be the test at the end of svc_tcp_init, which
      checks if the newly created socket is in the TCP_ESTABLISHED state,
      and immediately closes it if not. The evidence appears to suggest that
      the socket might still be in the SYN_RECV state at this time.
      
      The fix is to check for both states, and then to add a check in
      svc_tcp_state_change() to ensure we don't close the socket when
      it transitions into TCP_ESTABLISHED.
      
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      b2f21f7d
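The two checks described in the commit above can be sketched in user-space C as follows. This is a minimal illustration only: the enum and both function names are hypothetical stand-ins and are not the kernel's `svc_tcp_init()` or `svc_tcp_state_change()` code.

```c
#include <stdbool.h>

/* Hypothetical socket states; the kernel uses TCP_ESTABLISHED, TCP_SYN_RECV, etc. */
enum sk_state { SK_ESTABLISHED, SK_SYN_RECV, SK_CLOSE_WAIT, SK_CLOSE };

/*
 * At initialisation time: close the freshly accepted socket only if it is
 * in neither of the two states a just-accepted connection may legitimately
 * be in.
 */
static bool close_at_init(enum sk_state s)
{
    return s != SK_ESTABLISHED && s != SK_SYN_RECV;
}

/*
 * In the state-change callback: a transition into the established state is
 * the expected follow-up to SYN_RECV and must not trigger a close.
 */
static bool close_on_state_change(enum sk_state new_state)
{
    return new_state != SK_ESTABLISHED;
}
```

Under these assumptions, a socket that is still in SYN_RECV survives initialisation and also survives its later transition to the established state, while any other transition still results in a close.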
    • SUNRPC: Handle EADDRNOTAVAIL on connection failures · 1f4c17a0
      Trond Myklebust authored
      
      
      If the connect attempt immediately fails with an EADDRNOTAVAIL error, then
      that means our choice of source port number was bad.
      This error is expected when we set the SO_REUSEPORT socket option and we
      have 2 sockets sharing the same source and destination address and port
      combinations.
      
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Fixes: 402e23b4 ("SUNRPC: Fix stupid typo in xs_sock_set_reuseport")
      Cc: stable@vger.kernel.org # v4.0+
      1f4c17a0
  7. Jul 24, 2016
  8. Jul 19, 2016
    • xprtrdma: fix semicolon.cocci warnings · 53d78523
      kbuild test robot authored
      
      
      net/sunrpc/xprtrdma/verbs.c:798:2-3: Unneeded semicolon
      
       Remove unneeded semicolon.
      
      Generated by: scripts/coccinelle/misc/semicolon.cocci
      
      CC: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      53d78523
    • sunrpc: Prevent resvport min/max inversion via sysfs and module parameter · ffb6ca33
      Frank Sorenson authored
      
      
      The current min/max resvport settings are independently limited
      by the entire range of allowed ports, so max_resvport can be
      set to a port lower than min_resvport.
      
      Prevent inversion of min/max values when set through sysfs and
      module parameter by setting the limits dependent on each other.
      
      Signed-off-by: Frank Sorenson <sorenson@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      ffb6ca33
    • sunrpc: Prevent resvport min/max inversion via sysctl · e08ea3a9
      Frank Sorenson authored
      
      
      The current min/max resvport settings are independently limited
      by the entire range of allowed ports, so max_resvport can be
      set to a port lower than min_resvport.
      
      Prevent inversion of min/max values when set through sysctl by
      setting the limits dependent on each other.
      
      Signed-off-by: Frank Sorenson <sorenson@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      e08ea3a9
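The idea of making the two limits depend on each other, as described in the two resvport inversion commits above, might look roughly like the sketch below. The struct, the setter names, and the `PORT_FLOOR`/`PORT_CEIL` bounds are assumptions chosen for illustration; the actual patches wire the limits into the sysctl table and module parameter handlers rather than using setters like these.

```c
#include <stdbool.h>

/* Assumed overall bounds for a port number; illustrative only. */
#define PORT_FLOOR 1u
#define PORT_CEIL  65535u

/* Hypothetical container for the two tunables. */
struct resvport_range {
    unsigned int min;
    unsigned int max;
};

/* The new minimum is bounded above by the current maximum. */
static bool set_resvport_min(struct resvport_range *r, unsigned int val)
{
    if (val < PORT_FLOOR || val > r->max)
        return false;
    r->min = val;
    return true;
}

/* The new maximum is bounded below by the current minimum. */
static bool set_resvport_max(struct resvport_range *r, unsigned int val)
{
    if (val < r->min || val > PORT_CEIL)
        return false;
    r->max = val;
    return true;
}
```

With each setter validated against the other value, no sequence of accepted writes can leave `max` below `min`, although `min == max` (a range of one port) remains allowed.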
    • sunrpc: Fix reserved port range calculation · 5d71899a
      Frank Sorenson authored
      
      
      The range calculation for choosing the random reserved port will panic
      with divide-by-zero when min_resvport == max_resvport, a range of one
      port, not zero.
      
      Fix the reserved port range calculation by adding one to the difference.
      
      Signed-off-by: Frank Sorenson <sorenson@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      5d71899a
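The arithmetic behind the reserved port range fix above can be illustrated with a small user-space sketch. The helper name `pick_resvport` and the explicit `rnd` parameter are hypothetical; the kernel code draws its own random value and is structured differently.

```c
#include <assert.h>

/*
 * Hypothetical helper: map a random value onto the inclusive port range
 * [min, max].
 */
static unsigned short pick_resvport(unsigned short min, unsigned short max,
                                    unsigned int rnd)
{
    assert(min <= max);

    /*
     * The range is inclusive, so its size is (max - min) + 1.  Without the
     * "+ 1", min == max would give a size of zero and the modulo below
     * would divide by zero.
     */
    unsigned int range = (unsigned int)max - min + 1;

    return (unsigned short)(min + rnd % range);
}
```

For example, `pick_resvport(665, 665, rnd)` returns 665 for any `rnd`, since the range contains exactly one port.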
    • sunrpc: Fix bit count when setting hashtable size to power-of-two · 34ae685c
      Frank Sorenson authored
      
      
      Author: Frank Sorenson <sorenson@redhat.com>
      Date:   2016-06-27 13:55:48 -0500
      
          sunrpc: Fix bit count when setting hashtable size to power-of-two
      
          The hashtable size is incorrectly calculated as the next higher
          power-of-two when being set to a power-of-two.  fls() returns the
          bit number of the most significant set bit, with the least
          significant bit being numbered '1'.  For a power-of-two, fls()
          will return a bit number which is one higher than the number of bits
          required, leading to a hashtable which is twice the requested size.
      
          In addition, the value of (1 << nbits) will always be at least num,
          so the test will never be true.
      
          Fix the hash table size calculation to correctly set hashtable
          size, and eliminate the unnecessary check.
      
      Signed-off-by: Frank Sorenson <sorenson@redhat.com>

      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      34ae685c
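The off-by-one described in the hashtable commit above is easy to reproduce outside the kernel. The sketch below uses a user-space stand-in for `fls()` with the same 1-based numbering, and shows one way to get the smallest sufficient bit count, namely taking `fls()` of `num - 1`. The function names are hypothetical, and this is not claimed to be the exact change made by the patch.

```c
#include <assert.h>

/* User-space stand-in for fls(): 1-based index of the most significant
 * set bit, or 0 when x is 0. */
static int fls_sketch(unsigned int x)
{
    int bit = 0;

    while (x) {
        bit++;
        x >>= 1;
    }
    return bit;
}

/* Behaviour the commit message describes as wrong: for a power-of-two
 * num, this is one bit more than needed. */
static int hash_bits_old(unsigned int num)
{
    return fls_sketch(num);
}

/* One possible correction: the smallest nbits with (1 << nbits) >= num. */
static int hash_bits_fixed(unsigned int num)
{
    assert(num > 0);
    return fls_sketch(num - 1);
}
```

For a requested size of 16, `hash_bits_old()` yields 5 bits (a 32-entry table) while `hash_bits_fixed()` yields 4 bits (16 entries). For a size that is not a power of two, such as 17, both yield 5 bits.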
    • sunrpc: move NO_CRKEY_TIMEOUT to the auth->au_flags · ce52914e
      Scott Mayhew authored
      
      
      A generic_cred can be used to look up a unx_cred or a gss_cred, so it's
      not really safe to use the generic_cred->acred->ac_flags to store
      the NO_CRKEY_TIMEOUT flag.  A lookup for a unx_cred triggered while the
      KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and
      KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated
      with the auth_cred to be in a state where they're perpetually doing 4K
      NFS_FILE_SYNC writes.
      
      This can be reproduced as follows:
      
      1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys.
      They do not need to be the same export, nor do they even need to be from
      the same NFS server.  Also, v3 is fine.
      $ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5
      $ sudo mount -o v3,sec=sys server2:/export /mnt/sys
      
      2. As the normal user, before accessing the kerberized mount, kinit with
      a short lifetime (but not so short that renewing the ticket would leave
      you within the 4-minute window again by the time the original ticket
      expires), e.g.
      $ kinit -l 10m -r 60m
      
      3. Do some I/O to the kerberized mount and verify that the writes are
      wsize, UNSTABLE:
      $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1
      
      4. Wait until you're within 4 minutes of key expiry, then do some more
      I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets
      set.  Verify that the writes are 4K, FILE_SYNC:
      $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1
      
      5. Now do some I/O to the sec=sys mount.  This will cause
      RPC_CRED_NO_CRKEY_TIMEOUT to be set:
      $ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1
      
      6. Writes for that user will now be permanently 4K, FILE_SYNC,
      regardless of which mount is being written to, until you reboot
      the client.  Renewing the kerberos ticket (assuming it hasn't already
      expired) will have no effect.  Grabbing a new kerberos ticket at this
      point will have no effect either.
      
      Move the flag to the auth->au_flags field (which is currently unused)
      and rename it slightly to reflect that it's no longer associated with
      the auth_cred->ac_flags.  Add the rpc_auth to the arg list of
      rpcauth_cred_key_to_expire and check the au_flags there too.  Finally,
      add the inode to the arg list of nfs_ctx_key_to_expire so we can
      determine the rpc_auth to pass to rpcauth_cred_key_to_expire.
      
      Signed-off-by: Scott Mayhew <smayhew@redhat.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      ce52914e
  9. Jul 16, 2016
  10. Jul 13, 2016
  11. Jul 11, 2016