  1. May 14, 2018
    • afs: Fix the handling of CB.InitCallBackState3 to find the server by UUID · 001ab5a6
      David Howells authored
      
      
      Fix the handling of the CB.InitCallBackState3 service call to find the
      record of a server that we're using by looking it up by the UUID passed as
      the parameter rather than by its address (of which it might have many, and
      which may change).
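
A minimal sketch of the idea, assuming a hypothetical flat table of server records (struct demo_server and demo_find_server_by_uuid() are illustrative names, not the kernel's structures or helpers). The record is matched on its 16-byte UUID, so a server with several addresses still resolves to a single record:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified server record; not the kernel's struct afs_server. */
struct demo_server {
	unsigned char uuid[16];	/* stable identity of the server */
	const char *name;	/* label used only for this illustration */
};

/* Return the record whose UUID matches, or NULL if none does. */
static const struct demo_server *
demo_find_server_by_uuid(const struct demo_server *tbl, size_t n,
			 const unsigned char uuid[16])
{
	for (size_t i = 0; i < n; i++)
		if (memcmp(tbl[i].uuid, uuid, 16) == 0)
			return &tbl[i];
	return NULL;
}
```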
      
      Fixes: c35eccb1 ("[AFS]: Implement the CB.InitCallBackState3 operation.")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix VNOVOL handling in address rotation · 3d9fa911
      David Howells authored
      
      
      If a volume location record lists multiple file servers for a volume, then
      it's possible that due to a misconfiguration or a changing configuration
      that one of the file servers doesn't know about it yet and will abort
      VNOVOL.  Currently, the rotation algorithm will stop with EREMOTEIO.
      
      Fix this by moving on to try the next server if VNOVOL is returned.  Once
      all the servers have been tried and the record rechecked, the algorithm
      will stop with EREMOTEIO or ENOMEDIUM.
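
The changed behaviour can be sketched roughly as follows. This is a simplified userspace model, not the kernel's rotation code: demo_rotate() and DEMO_VNOVOL are hypothetical names, the abort code value is assumed, and the step of rechecking the volume location record is left out.

```c
#include <errno.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121	/* Linux value, used as a fallback on other systems */
#endif

#define DEMO_VNOVOL 103	/* abort code value assumed for this sketch */

/*
 * results[i] is what server i answered: 0 for success, or an abort code.
 * Returns the index of the first server that succeeded, or -EREMOTEIO
 * once every server has been tried without success.
 */
static int demo_rotate(const int *results, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (results[i] == 0)
			return (int)i;
		if (results[i] == DEMO_VNOVOL)
			continue;	/* this server lacks the volume: try the next */
		return -EREMOTEIO;	/* other aborts end the rotation in this sketch */
	}
	return -EREMOTEIO;
}
```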
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix AFSFetchStatus decoder to provide OpenAFS compatibility · 684b0f68
      David Howells authored
      
      
      The OpenAFS server's RXAFS_InlineBulkStatus implementation has a bug
      whereby if an error occurs on one of the vnodes being queried, then the
      errorCode field is set correctly in the corresponding status, but the
      interfaceVersion field is left unset.
      
      Fix kAFS to deal with this by evaluating the AFSFetchStatus blob against
      the following cases when called from FS.InlineBulkStatus delivery:
      
       (1) If InterfaceVersion == 0 then:
      
           (a) If errorCode != 0 then it indicates the abort code for the
               corresponding vnode.
      
           (b) If errorCode == 0 then the status record is invalid.
      
       (2) If InterfaceVersion == 1 then:
      
           (a) If errorCode != 0 then it indicates the abort code for the
               corresponding vnode.
      
           (b) If errorCode == 0 then the status record is valid and can be
               parsed.
      
       (3) If InterfaceVersion is anything else then the status record is
           invalid.
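
The three cases above can be written as a small decision function. This is only a sketch of the rules as listed; the enum and function names are hypothetical and the real decoder does considerably more work.

```c
#include <stdint.h>

enum demo_status_verdict {
	DEMO_STATUS_VALID,	/* record can be parsed */
	DEMO_STATUS_ABORTED,	/* errorCode carries the vnode's abort code */
	DEMO_STATUS_INVALID,	/* record must be rejected */
};

/* Classify one status record returned by FS.InlineBulkStatus. */
static enum demo_status_verdict
demo_classify_bulk_status(uint32_t interface_version, uint32_t error_code)
{
	switch (interface_version) {
	case 0:		/* case (1): version left unset by the server */
		return error_code ? DEMO_STATUS_ABORTED : DEMO_STATUS_INVALID;
	case 1:		/* case (2): version set as expected */
		return error_code ? DEMO_STATUS_ABORTED : DEMO_STATUS_VALID;
	default:	/* case (3): anything else */
		return DEMO_STATUS_INVALID;
	}
}
```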
      
      Fixes: dd9fbcb8 ("afs: Rearrange status mapping")
      Reported-by: Jeffrey Altman <jaltman@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix server rotation's handling of fileserver probe failure · ec5a3b4b
      David Howells authored
      
      
      The server rotation algorithm just gives up if it fails to probe a
      fileserver.  Fix this by rotating to the next fileserver instead.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix refcounting in callback registration · d4a96bec
      David Howells authored
      
      
      The refcounting on afs_cb_interest struct objects in
      afs_register_server_cb_interest() is wrong as it uses the server list
      entry's call back interest pointer without regard for the fact that it
      might be replaced at any time and the object thrown away.
      
      Fix this by:
      
       (1) Putting a lock on the afs_server_list struct that can be used to
           mediate access to the callback interest pointers in the servers array.
      
       (2) Keeping a ref on the callback interest that we get from the entry.
      
       (3) Dropping the old reference held by vnode->cb_interest if we replace
           the pointer.
      
      Fixes: c435ee34 ("afs: Overhaul the callback handling")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix giving up callbacks on server destruction · f2686b09
      David Howells authored
      
      
      When a server record is destroyed, we want to send a message to the server
      telling it that we're giving up all the callbacks it has promised us.
      
      Apply two fixes to this:
      
       (1) Only send the FS.GiveUpAllCallBacks message if we actually got a
           callback from that server.  We assume this to be the case if we
           performed at least one successful FS operation on that server.
      
       (2) Send it to the address last used for that server rather than always
           picking the first address in the list (which might be unreachable).
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix address list parsing · 01fd79e6
      David Howells authored
      
      
      The parsing of port specifiers in the address list obtained from the DNS
      resolution upcall doesn't work as in4_pton() and in6_pton() will fail on
      encountering an unexpected delimiter (in this case, the '+' marking the
      port number).  However, in*_pton() can't be given multiple specifiers.
      
      Fix this by finding the delimiter in advance and not relying on in*_pton()
      to find the end of the address for us.
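
A userspace sketch of the same approach, assuming a single IPv4 entry of the form "ADDR" or "ADDR+PORT". The function name and the default-port parameter are hypothetical, and inet_pton() stands in for the kernel's in4_pton(); because inet_pton() needs a NUL-terminated string, the address part is copied out once the delimiter has been located.

```c
#include <arpa/inet.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "A.B.C.D" or "A.B.C.D+PORT" (IPv4 only in this sketch).
 * Returns 0 on success, -1 on malformed input.
 */
static int demo_parse_addr(const char *text, struct in_addr *addr,
			   unsigned int *port, unsigned int default_port)
{
	char buf[INET_ADDRSTRLEN];
	const char *plus = strchr(text, '+');	/* find the delimiter first */
	size_t len = plus ? (size_t)(plus - text) : strlen(text);

	if (len == 0 || len >= sizeof(buf))
		return -1;
	memcpy(buf, text, len);
	buf[len] = '\0';
	if (inet_pton(AF_INET, buf, addr) != 1)
		return -1;

	*port = default_port;
	if (plus) {
		char *end;
		unsigned long p = strtoul(plus + 1, &end, 10);

		/* Reject an empty, trailing-garbage or out-of-range port. */
		if (end == plus + 1 || *end != '\0' || p > 65535)
			return -1;
		*port = (unsigned int)p;
	}
	return 0;
}
```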
      
      Fixes: 8b2a464c ("afs: Add an address list concept")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix directory page locking · b61f7dcf
      David Howells authored
      
      
      The afs directory loading code (primarily afs_read_dir()) locks all the
      pages that hold a directory's content blob to defend against
      getdents/getdents races and getdents/lookup races where the competitors
      issue conflicting reads on the same data.  As the reads will complete
      consecutively, they may retrieve different versions of the data and
      one may overwrite the data that the other is busy parsing.
      
      Fix this by not locking the pages at all, but rather by turning the
      validation lock into an rwsem and getting an exclusive lock on it whilst
      reading the data or validating the attributes and a shared lock whilst
      parsing the data.  Sharing the attribute validation lock should be fine as
      the data fetch will retrieve the attributes also.
      
      The individual page locks aren't needed at all as the only place they're
      being used is to serialise data loading.
      
      Without this patch, the:
      
       	if (!test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) {
      		...
      	}
      
      part of afs_read_dir() may be skipped, leaving the pages unlocked when we
      hit the success: clause - in which case we try to unlock the not-locked
      pages, leading to the following oops:
      
        page:ffffe38b405b4300 count:3 mapcount:0 mapping:ffff98156c83a978 index:0x0
        flags: 0xfffe000001004(referenced|private)
        raw: 000fffe000001004 ffff98156c83a978 0000000000000000 00000003ffffffff
        raw: dead000000000100 dead000000000200 0000000000000001 ffff98156b27c000
        page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
        page->mem_cgroup:ffff98156b27c000
        ------------[ cut here ]------------
        kernel BUG at mm/filemap.c:1205!
        ...
        RIP: 0010:unlock_page+0x43/0x50
        ...
        Call Trace:
         afs_dir_iterate+0x789/0x8f0 [kafs]
         ? _cond_resched+0x15/0x30
         ? kmem_cache_alloc_trace+0x166/0x1d0
         ? afs_do_lookup+0x69/0x490 [kafs]
         ? afs_do_lookup+0x101/0x490 [kafs]
         ? key_default_cmp+0x20/0x20
         ? request_key+0x3c/0x80
         ? afs_lookup+0xf1/0x340 [kafs]
         ? __lookup_slow+0x97/0x150
         ? lookup_slow+0x35/0x50
         ? walk_component+0x1bf/0x490
         ? path_lookupat.isra.52+0x75/0x200
         ? filename_lookup.part.66+0xa0/0x170
         ? afs_end_vnode_operation+0x41/0x60 [kafs]
         ? __check_object_size+0x9c/0x171
         ? strncpy_from_user+0x4a/0x170
         ? vfs_statx+0x73/0xe0
         ? __do_sys_newlstat+0x39/0x70
         ? __x64_sys_getdents+0xc9/0x140
         ? __x64_sys_getdents+0x140/0x140
         ? do_syscall_64+0x5b/0x160
         ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: f3ddee8d ("afs: Fix directory handling")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
  2. Apr 20, 2018
    • afs: Fix server record deletion · 66062592
      David Howells authored
      
      
      AFS server records get removed from the net->fs_servers tree when
      they're deleted, but not from the net->fs_addresses{4,6} lists, which
      can lead to an oops in afs_find_server() when a server record has been
      removed, for instance during rmmod.
      
      Fix this by deleting the record from the by-address lists before posting
      it for RCU destruction.
      
      The reason this hasn't been noticed before is that the fileserver keeps
      probing the local cache manager, thereby keeping the service record
      alive, so the oops would only happen when a fileserver eventually gets
      bored and stops pinging or if the module gets rmmod'd and a call comes
      in from the fileserver during the window between the server records
      being destroyed and the socket being closed.
      
      The oops looks something like:
      
        BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
        ...
        Workqueue: kafsd afs_process_async_call [kafs]
        RIP: 0010:afs_find_server+0x271/0x36f [kafs]
        ...
        Call Trace:
         afs_deliver_cb_init_call_back_state3+0x1f2/0x21f [kafs]
         afs_deliver_to_call+0x1ee/0x5e8 [kafs]
         afs_process_async_call+0x5b/0xd0 [kafs]
         process_one_work+0x2c2/0x504
         worker_thread+0x1d4/0x2ac
         kthread+0x11f/0x127
         ret_from_fork+0x24/0x30
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. Apr 11, 2018
  4. Apr 09, 2018
    • afs: Do better accretion of small writes on newly created content · 5a813276
      David Howells authored
      
      
      Processes like ld that do lots of small writes that aren't necessarily
      contiguous result in a lot of small StoreData operations to the server, the
      idea being that if someone else changes the data on the server, we only
      write our changes over that and not the space between.  Further, we don't
      want to write back empty space if we can avoid it to make it easier for the
      server to do sparse files.
      
      However, making lots of tiny RPC ops is a lot less efficient for the server
      than one big one because each op requires allocation of resources and the
      taking of locks, so we want to compromise a bit.
      
      Reduce the load by the following:
      
       (1) If a file is just created locally or has just been truncated with
           O_TRUNC locally, allow subsequent writes to the file to be merged with
           intervening space if that space doesn't cross an entire intervening
           page.
      
       (2) Don't flush the file on ->flush() but rather on ->release() if the
           file was open for writing.
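
One way to read rule (1) is that two writes may be merged as long as the gap between them does not contain a complete page. The sketch below encodes that reading only; demo_can_merge() is a hypothetical helper and the kernel's actual test may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * prev_end:   offset just past the earlier write
 * next_start: offset at which the later write begins
 * Returns true if the gap [prev_end, next_start) holds no whole page.
 */
static bool demo_can_merge(uint64_t prev_end, uint64_t next_start,
			   uint64_t page_size)
{
	uint64_t first_boundary;

	if (next_start <= prev_end)
		return true;	/* adjacent or overlapping writes */

	/* First page boundary at or after the end of the earlier write. */
	first_boundary = (prev_end + page_size - 1) / page_size * page_size;

	/* A whole page fits in the gap if it ends at or before next_start. */
	return first_boundary + page_size > next_start;
}
```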
      
      Just linking vmlinux.o, without this patch, looking in /proc/fs/afs/stats:
      
      	file-wr : n=441 nb=513581204
      
      and after the patch:
      
      	file-wr : n=62 nb=513668555
      
      there were 379 fewer StoreData RPC operations at the expense of an extra
      87K being written.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Add stats for data transfer operations · 76a5cb6f
      David Howells authored
      
      
      Add statistics to /proc/fs/afs/stats for data transfer RPC operations.  New
      lines are added that look like:
      
      	file-rd : n=55794 nb=10252282150
      	file-wr : n=9789 nb=3247763645
      
      where n= indicates the number of ops completed and nb= indicates the number
      of bytes successfully transferred.  file-rd is the counts for read/fetch
      operations and file-wr the counts for write/store operations.
      
      Note that directory and symlink downloading are included in the file-rd
      stats at the moment.
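
For scripts that consume these lines, a small parser along the following lines may be enough. It assumes the line format shown above; struct demo_xfer_stats and demo_parse_stats_line() are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

struct demo_xfer_stats {
	unsigned long long n;	/* operations completed */
	unsigned long long nb;	/* bytes successfully transferred */
};

/*
 * Parse one "<tag> : n=<ops> nb=<bytes>" line, where tag is "file-rd"
 * or "file-wr".  Returns 0 if the line matches the requested tag.
 */
static int demo_parse_stats_line(const char *line, const char *tag,
				 struct demo_xfer_stats *out)
{
	char found[16];
	unsigned long long n, nb;

	if (sscanf(line, "%15s : n=%llu nb=%llu", found, &n, &nb) != 3)
		return -1;
	if (strcmp(found, tag) != 0)
		return -1;
	out->n = n;
	out->nb = nb;
	return 0;
}
```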
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Trace protocol errors · 5f702c8e
      David Howells authored
      
      
      Trace protocol errors detected in afs.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Locally edit directory data for mkdir/create/unlink/... · 63a4681f
      David Howells authored
      
      
      Locally edit the contents of an AFS directory upon a successful inode
      operation that modifies that directory (such as mkdir, create and unlink)
      so that we can avoid the current practice of re-downloading the directory
      after each change.
      
      This is viable provided that the directory version number we get back from
      the modifying RPC op is exactly incremented by 1 from what we had
      previously.  The data in the directory contents is in a defined format that
      we have to parse locally to perform lookups and readdir, so modifying isn't
      a problem.
      
      If the edit fails, we just clear the VALID flag on the directory and it
      will be reloaded next time it is needed.
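
The version check can be sketched as below. struct demo_dir and demo_dir_note_change() are hypothetical; the sketch only models the "exactly one more than before" condition and the fallback of clearing the valid flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cached state of one directory. */
struct demo_dir {
	uint64_t data_version;	/* version of the cached contents */
	bool valid;		/* contents may be used and edited locally */
};

/*
 * Record the data version returned by a modifying RPC op.  Returns true
 * if the cached contents may be edited in place; otherwise the contents
 * are marked invalid so that they get reloaded when next needed.
 */
static bool demo_dir_note_change(struct demo_dir *dir, uint64_t new_version)
{
	bool ok = dir->valid && new_version == dir->data_version + 1;

	dir->data_version = new_version;
	if (!ok)
		dir->valid = false;
	return ok;
}
```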
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Adjust the directory XDR structures · 00317636
      David Howells authored
      
      
      Adjust the AFS directory XDR structures in a number of superficial ways:
      
       (1) Rename them to all begin afs_xdr_.
      
       (2) Use u8 instead of uint8_t.
      
       (3) Mark the structures as __packed so they don't get rearranged by the
           compiler.
      
       (4) Rename the hdr member of afs_xdr_dir_block to meta.
      
       (5) Rename the pagehdr member of afs_xdr_dir_block to hdr.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Split the directory content defs into a header · 4ea219a8
      David Howells authored
      
      
      Split the directory content definitions into a header file so that they can
      be used by multiple .c files.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix directory handling · f3ddee8d
      David Howells authored
      
      
      AFS directories are structured blobs that are downloaded just like files
      and then parsed by the lookup and readdir code and, as such, are currently
      handled in the pagecache like any other file, with the entire directory
      content being thrown away each time the directory changes.
      
      However, since the blob is a known structure and since the data version
      counter on a directory increases by exactly one for each change committed
      to that directory, we can actually edit the directory locally rather than
      fetching it from the server after each locally-induced change.
      
      What we can't do, though, is mix data from the server and data from the
      client since the server is technically at liberty to rearrange or compress
      a directory if it sees fit, provided it updates the data version number
      when it does so and breaks the callback (ie. sends a notification).
      
      Further, lookup with lookup-ahead, readdir and, when it arrives, local
      editing are likely to want to scan the whole of a directory.
      
      So directory handling needs to be improved to maintain the coherency of the
      directory blob prior to permitting local directory editing.
      
      To this end:
      
       (1) If any directory page gets discarded, invalidate and reread the entire
           directory.
      
       (2) If readpage notes, when it fetches a single page, that the
           version number has changed, the entire directory is flagged for
           invalidation.
      
       (3) Read as much of the directory in one go as we can.
      
      Note that this removes local caching of directories in fscache for the
      moment as we can't pass the pages to fscache_read_or_alloc_pages() since
      page->lru is in use by the LRU.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Split the dynroot stuff out and give it its own ops tables · 66c7e1d3
      David Howells authored
      
      
      Split the AFS dynamic root stuff out of the main directory handling file
      and into its own file as they share little in common.
      
      The dynamic root code also gets its own dentry and inode ops tables.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Keep track of invalid-before version for dentry coherency · a4ff7401
      David Howells authored
      
      
      Each afs dentry is tagged with the version that the parent directory was at
      last time it was validated and, currently, if this differs, the directory
      is scanned and the dentry is refreshed.
      
      However, this leads to an excessive amount of revalidation on directories
      that get modified on the client without conflict with another client.  We
      know there's no conflict because the parent directory's data version number
      got incremented by exactly 1 on any create, mkdir, unlink, etc., therefore
      we can trust the current state of the unaffected dentries when we perform a
      local directory modification.
      
      Optimise by keeping track of the last version of the parent directory that
      was changed outside of the client in the parent directory's vnode and using
      that to validate the dentries rather than the current version.
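
A rough model of the bookkeeping, under the assumption that a dentry stays trustworthy as long as its tagged version is not older than the last externally caused change. The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-directory state. */
struct demo_dir_versions {
	uint64_t data_version;		/* current version of the directory */
	uint64_t invalid_before;	/* dentries tagged below this are stale */
};

/* Record a new data version seen for the directory. */
static void demo_dir_update(struct demo_dir_versions *dir,
			    uint64_t new_version, bool local_change)
{
	/* Only a local change that bumps the version by exactly 1 is known
	 * to be conflict-free; anything else invalidates older dentries. */
	if (!(local_change && new_version == dir->data_version + 1))
		dir->invalid_before = new_version;
	dir->data_version = new_version;
}

/* A dentry tagged with dentry_version can be trusted without a rescan. */
static bool demo_dentry_still_valid(const struct demo_dir_versions *dir,
				    uint64_t dentry_version)
{
	return dentry_version >= dir->invalid_before;
}
```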
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Rearrange status mapping · dd9fbcb8
      David Howells authored
      
      
      Rearrange the AFSFetchStatus to inode attribute mapping code in a number of
      ways:
      
       (1) Use an XDR structure rather than a series of incremented pointer
           accesses when decoding an AFSFetchStatus object.  This allows
           out-of-order decode.
      
       (2) Don't store the if_version value but rather just check it and abort if
           it's not something we can handle.
      
       (3) Store the owner and group in the status record as raw values rather
           than converting them to kuid/kgid.  Do that when they're mapped into
           i_uid/i_gid.
      
       (4) Validate the type and abort code up front and abort if they're wrong.
      
       (5) Split the inode attribute setting out into its own function from the
           XDR decode of an AFSFetchStatus object.  This allows it to be called
           from elsewhere too.
      
       (6) Differentiate changes to data from changes to metadata.
      
       (7) Use the split-out attribute mapping function from afs_iget().
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Make it possible to get the data version in readpage · 0c3a5ac2
      David Howells authored
      
      
      Store the data version number indicated by an FS.FetchData op into the read
      request structure so that it's accessible by the page reader.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Init inode before accessing cache · 5800db81
      David Howells authored
      
      
      We no longer parse symlinks when we get the inode to determine if this
      symlink is actually a mountpoint as we detect that by examining the mode
      instead (symlinks are always 0777 and mountpoints 0644).
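
The mode test amounts to something like the following, where demo_symlink_is_mountpoint() is a hypothetical helper that looks only at the permission bits of a symlink-type vnode:

```c
#include <stdbool.h>

/* For a vnode of symlink type: 0644 marks a mountpoint, 0777 a symlink. */
static bool demo_symlink_is_mountpoint(unsigned int mode)
{
	return (mode & 0777) == 0644;
}
```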
      
      Access the cache after mapping the status so that we don't have to manually
      set the inode size now.
      
      Note that this may need adjusting if the disconnected operation is
      implemented as the file metadata may have to be obtained from the cache.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Introduce a statistics proc file · d55b4da4
      David Howells authored
      
      
      Introduce a proc file that displays a bunch of statistics for the AFS
      filesystem in the current network namespace.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Dump bad status record · 888b3384
      David Howells authored
      
      
      Dump an AFS FileStatus record that is detected as invalid.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Implement @cell substitution handling · 37ab6368
      David Howells authored
      
      
      Implement @cell substitution handling such that if @cell is seen as a name
      in a dynamic root mount, then the name of the root cell for that network
      namespace will be substituted for @cell during lookup.
      
      The substitution of @cell for the current net namespace is set by writing
      the cell name to /proc/fs/afs/rootcell.  The value can be obtained by
      reading the file.
      
      For example:
      
      	# mount -t afs none /kafs -o dyn
      	# echo grand.central.org >/proc/fs/afs/rootcell
      	# ls /kafs/@cell
      	archive/  cvs/  doc/  local/  project/  service/  software/  user/  www/
      	# cat /proc/fs/afs/rootcell
      	grand.central.org
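
In outline, the lookup-time substitution behaves like this sketch. demo_subst_cell() is a hypothetical helper; the real code works on dentries rather than plain strings.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the name to look up: the root cell name if the component is
 * exactly "@cell" and a root cell has been set, else the name unchanged.
 */
static const char *demo_subst_cell(const char *name, const char *rootcell)
{
	if (rootcell != NULL && strcmp(name, "@cell") == 0)
		return rootcell;
	return name;
}
```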
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Implement @sys substitution handling · 6f8880d8
      David Howells authored
      
      
      Implement the AFS feature by which @sys at the end of a pathname component
      may be replaced by one of a list of values, typically naming the
      operating system.  Up to 16 alternatives may be specified and these are
      tried in turn until one works.  Each network namespace has[*] a separate
      independent list.
      
      Upon creation of a new network namespace, the list of values is
      initialised[*] to a single OpenAFS-compatible string representing arch type
      plus "_linux26".  For example, on x86_64, the sysname is "amd64_linux26".
      
      [*] Or will, once network namespace support is finalised in kAFS.
      
      The list may be set by:
      
      	# for i in foo bar linux-x86_64; do echo $i; done >/proc/fs/afs/sysname
      
      for which separate writes to the same fd are amalgamated and applied on
      close.  The LF character may be used as a separator to specify multiple
      items in the same write() call.
      
      The list may be cleared by:
      
      	# echo >/proc/fs/afs/sysname
      
      and read by:
      
      	# cat /proc/fs/afs/sysname
      	foo
      	bar
      	linux-x86_64
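
The "try each alternative in turn" behaviour can be modelled as follows. All names here are hypothetical: demo_subst_sys() takes a callback in place of a real directory lookup, and demo_only_bar_exists() is a stub that pretends only "prog.bar" exists.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NR_SYSNAME 16	/* maximum number of alternatives */

typedef bool (*demo_exists_fn)(const char *name);

/* Stand-in for a real lookup: only "prog.bar" exists. */
static bool demo_only_bar_exists(const char *name)
{
	return strcmp(name, "prog.bar") == 0;
}

/*
 * If name ends in "@sys", try each sysname in order and write the first
 * name that exists into out.  Returns the index used, or -1 if the name
 * has no "@sys" suffix or no alternative worked.
 */
static int demo_subst_sys(const char *name, const char *const *sysnames,
			  size_t nr, demo_exists_fn exists,
			  char *out, size_t outlen)
{
	size_t len = strlen(name);

	if (len < 4 || strcmp(name + len - 4, "@sys") != 0)
		return -1;
	if (nr > DEMO_NR_SYSNAME)
		nr = DEMO_NR_SYSNAME;

	for (size_t i = 0; i < nr; i++) {
		int n = snprintf(out, outlen, "%.*s%s",
				 (int)(len - 4), name, sysnames[i]);

		if (n < 0 || (size_t)n >= outlen)
			continue;	/* would not fit: skip this one */
		if (exists(out))
			return (int)i;
	}
	return -1;
}
```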
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Prospectively look up extra files when doing a single lookup · 5cf9dd55
      David Howells authored
      
      
      When afs_lookup() is called, prospectively look up the next 50 uncached
      fids also from that same directory and cache the results, rather than just
      looking up the one file requested.
      
      This allows us to use the FS.InlineBulkStatus RPC op to increase efficiency
      by fetching up to 50 file statuses at a time.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Don't over-increment the cell usage count when pinning it · 17814aef
      David Howells authored
      
      
      AFS cells that are added or set as the workstation cell through /proc are
      pinned against removal by setting the AFS_CELL_FL_NO_GC flag on them and
      taking a ref.  The ref should be only taken if the flag wasn't already set.
      
      Fix this by making it conditional.
      
      Without this an assertion failure will occur during module removal
      indicating that the refcount is too elevated.
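
The conditional ref can be illustrated with C11 atomics. This is a userspace analogue with hypothetical names (struct demo_cell, DEMO_CELL_FL_NO_GC, demo_pin_cell()), not the kernel's bitops and refcount API: the ref is taken only by the caller that actually sets the flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_CELL_FL_NO_GC 0x1u	/* hypothetical flag bit */

struct demo_cell {
	atomic_uint flags;
	atomic_int usage;
};

/* Pin the cell against removal; returns true if this call took the ref. */
static bool demo_pin_cell(struct demo_cell *cell)
{
	unsigned int old = atomic_fetch_or(&cell->flags, DEMO_CELL_FL_NO_GC);

	if (old & DEMO_CELL_FL_NO_GC)
		return false;	/* already pinned: do not take another ref */
	atomic_fetch_add(&cell->usage, 1);
	return true;
}
```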
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix checker warnings · fe342cf7
      David Howells authored
      
      
      Fix warnings raised by checker, including:
      
       (*) Warnings raised by unequal comparison for the purposes of sorting,
           where the endianness doesn't matter:
      
      fs/afs/addr_list.c:246:21: warning: restricted __be16 degrades to integer
      fs/afs/addr_list.c:246:30: warning: restricted __be16 degrades to integer
      fs/afs/addr_list.c:248:21: warning: restricted __be32 degrades to integer
      fs/afs/addr_list.c:248:49: warning: restricted __be32 degrades to integer
      fs/afs/addr_list.c:283:21: warning: restricted __be16 degrades to integer
      fs/afs/addr_list.c:283:30: warning: restricted __be16 degrades to integer
      
       (*) afs_set_cb_interest() is not actually used and can be removed.
      
       (*) afs_cell_gc_delay() should be provided with a sysctl.
      
       (*) afs_cell_destroy() needs to use rcu_access_pointer() to read
           cell->vl_addrs.
      
       (*) afs_init_fs_cursor() should be static.
      
       (*) struct afs_vnode::permit_cache needs to be marked __rcu.
      
       (*) afs_server_rcu() needs to use rcu_access_pointer().
      
       (*) afs_destroy_server() should use rcu_access_pointer() on
           server->addresses as the server object is no longer accessible.
      
       (*) afs_find_server() casts __be16/__be32 values to int in order to
           directly compare them for the purpose of finding a match in a list,
           but it should also annotate the cast with __force to avoid checker
           warnings.
      
       (*) afs_check_permit() accesses vnode->permit_cache outside of the RCU
           readlock, though it doesn't then access the value; the extraneous
           access is deleted.
      
      False positives:
      
       (*) Conditional locking around the code in xdr_decode_AFSFetchStatus.  This
           can be dealt with in a separate patch.
      
      fs/afs/fsclient.c:148:9: warning: context imbalance in 'xdr_decode_AFSFetchStatus' - different lock contexts for basic block
      
       (*) Incorrect handling of seq-retry lock context balance:
      
      fs/afs/inode.c:455:38: warning: context imbalance in 'afs_getattr' - different lock contexts for basic block
      fs/afs/server.c:52:17: warning: context imbalance in 'afs_find_server' - different lock contexts for basic block
      fs/afs/server.c:128:17: warning: context imbalance in 'afs_find_server_by_uuid' - different lock contexts for basic block
      
      Errors:
      
       (*) afs_lookup_cell_rcu() needs to break out of the seq-retry loop, not go
           round again if it successfully found the workstation cell.
      
       (*) Fix UUID decode in afs_deliver_cb_probe_uuid().
      
       (*) afs_cache_permit() has a missing rcu_read_unlock() before one of the
           jumps to the someone_else_changed_it label.  Move the unlock to after
           the label.
      
       (*) afs_vl_get_addrs_u() is using ntohl() rather than htonl() when
           encoding to XDR.
      
       (*) afs_deliver_yfsvl_get_endpoints() is using htonl() rather than ntohl()
           when decoding from XDR.
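
For the last two items, the intended direction of each conversion looks like the userspace sketch below (demo_xdr_put_u32() and demo_xdr_get_u32() are hypothetical helpers). On common platforms htonl() and ntohl() perform the same byte operation, so swapping them is typically caught by type annotations in a checker rather than by a runtime failure.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Encode: host order to network (big-endian) order, then store. */
static void demo_xdr_put_u32(unsigned char *buf, uint32_t host)
{
	uint32_t be = htonl(host);

	memcpy(buf, &be, sizeof(be));
}

/* Decode: load, then network (big-endian) order to host order. */
static uint32_t demo_xdr_get_u32(const unsigned char *buf)
{
	uint32_t be;

	memcpy(&be, buf, sizeof(be));
	return ntohl(be);
}
```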
      
      Signed-off-by: David Howells <dhowells@redhat.com>
  5. Apr 06, 2018
  6. Apr 04, 2018
    • fscache: Attach the index key and aux data to the cookie · 402cb8dd
      David Howells authored
      
      
      Attach copies of the index key and auxiliary data to the fscache cookie so
      that:
      
       (1) The callbacks to the netfs for this stuff can be eliminated.  This
           can simplify things in the cache as the information is still
           available, even after the cache has relinquished the cookie.
      
       (2) Simplifies the locking requirements of accessing the information as we
           don't have to worry about the netfs object going away on us.
      
       (3) The cache can do lazy updating of the coherency information on disk.
           As long as the cache is flushed before reboot/poweroff, there's no
           need to update the coherency info on disk every time it changes.
      
       (4) Cookies can be hashed or put in a tree as the index key is easily
           available.  This allows:
      
           (a) Checks for duplicate cookies can be made at the top fscache layer
               rather than down in the bowels of the cache backend.
      
           (b) Caching can be added to a netfs object that has a cookie if the
               cache is brought online after the netfs object is allocated.
      
      A certain amount of space is made in the cookie for inline copies of the
      data, but if it won't fit there, extra memory will be allocated for it.
      
      The downside of this is that live cache operation requires more memory.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Anna Schumaker <anna.schumaker@netapp.com>
      Tested-by: Steve Dickson <steved@redhat.com>
    • afs: Be more aggressive in retiring cached vnodes · 678edd09
      David Howells authored
      
      
      When relinquishing cookies, either due to iget failure or to inode
      eviction, retire a cookie if we think the corresponding vnode got deleted
      on the server rather than just letting it lie in the cache.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Use the vnode ID uniquifier in the cache key not the aux data · 27a3ee3a
      David Howells authored
      
      
      AFS vnodes (files) are referenced by a triplet of { volume ID, vnode ID,
      uniquifier }.  Currently, kafs is only using the vnode ID as the file key
      in the volume fscache index and checking the uniquifier on cookie
      acquisition against the contents of the auxiliary data stored in the cache.
      
      Unfortunately, this is subject to a race in which an FS.RemoveFile or
      FS.RemoveDir op is issued against the server but the local afs inode isn't
      torn down and disposed of before another thread issues something like
      FS.CreateFile.  The latter then gets given the vnode ID that just got
      removed, but with a new uniquifier and a cookie collision occurs in the
      cache because the cookie is only keyed on the vnode ID whereas the inode is
      keyed on the vnode ID plus the uniquifier.
      
      Fix this by keying the cookie on the uniquifier in addition to the vnode ID
      and dropping the uniquifier from the auxiliary data supplied.
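
      The collision can be illustrated with a small sketch.  The key layouts
      below (big-endian 32-bit fields) and the function names are assumptions
      made for the example and are not taken from the kafs source.

```python
import struct

def old_cache_key(vnode_id, uniquifier):
    # Before the fix: keyed on the vnode ID alone.  The uniquifier lived in
    # the auxiliary data, so it plays no part in the key.
    return struct.pack(">I", vnode_id)

def new_cache_key(vnode_id, uniquifier):
    # After the fix: the uniquifier is part of the key itself.
    return struct.pack(">II", vnode_id, uniquifier)

# A file is removed and its vnode ID is handed out again with a new
# uniquifier before the old inode has been torn down.
removed = (42, 1)
created = (42, 2)

old_keys_collide = old_cache_key(*removed) == old_cache_key(*created)
new_keys_collide = new_cache_key(*removed) == new_cache_key(*created)
```

      With the old scheme both objects map to the same key, so
      old_keys_collide is true.  With the new scheme the keys differ, which
      matches how the inode itself is keyed.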
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      27a3ee3a
    • David Howells's avatar
      afs: Invalidate cache on server data change · c1515999
      David Howells authored
      
      
      Invalidate any data stored in fscache for a vnode that changes on the
      server so that we don't end up with the cache in a bad state locally.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      c1515999
  7. Mar 27, 2018
    • David Howells's avatar
      rxrpc, afs: Use debug_ids rather than pointers in traces · a25e21f0
      David Howells authored
      
      
      In rxrpc and afs, use the debug_ids that are monotonically allocated to
      various objects as they're allocated rather than pointers, as kernel
      pointers are now hashed, making them less useful.  Further, the debug ids
      aren't reused anywhere near as quickly.
      
      In addition, allow kernel services that use rxrpc, such as afs, to take
      numbers from the rxrpc counter, assign them to their own call struct and
      pass them in to rxrpc for both client and service calls so that the trace
      lines for each will have the same ID tag.
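
      A rough userspace model of the idea, assuming a single shared counter:
      the service takes one number from it and stamps both its own call
      record and the rxrpc call with that number.  All names here are
      hypothetical; the kernel uses its own atomic counter and call structs.

```python
import itertools
import threading

class DebugIdAllocator:
    """Hands out monotonically increasing IDs (stand-in for the rxrpc counter)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_id(self):
        # The lock stands in for the atomic increment used in kernel code.
        with self._lock:
            return next(self._counter)

rxrpc_debug_ids = DebugIdAllocator()

def make_afs_call():
    # The service takes one ID and uses it for both its own call record and
    # the rxrpc call, so trace lines from both layers carry the same tag.
    debug_id = rxrpc_debug_ids.next_id()
    afs_call = {"layer": "afs", "debug_id": debug_id}
    rxrpc_call = {"layer": "rxrpc", "debug_id": debug_id}
    return afs_call, rxrpc_call
```

      Unlike a pointer, such an ID is not reused as soon as the object's
      memory is freed and reallocated, so it stays unambiguous in a trace.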
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      a25e21f0
  8. Mar 20, 2018
  9. Feb 06, 2018
    • David Howells's avatar
      afs: Support the AFS dynamic root · 4d673da1
      David Howells authored
      
      
      Support the AFS dynamic root, a pseudo-volume that doesn't connect to
      any server resource but is just a root directory that dynamically
      creates mountpoint directories, each of which is named after the cell
      it refers to.
      
      Such a mount can be created thus:
      
      	mount -t afs none /afs -o dyn
      
      Dynamic root superblocks aren't shared except by bind mounts and
      propagation.  Cell root volumes can then be mounted by referring to them by
      name, e.g.:
      
      	ls /afs/grand.central.org/
      	ls /afs/.grand.central.org/
      
      The kernel will upcall to consult the DNS if the address wasn't supplied
      directly.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      4d673da1
    • David Howells's avatar
      afs: Rearrange afs_select_fileserver() a little · 16280a15
      David Howells authored
      
      
      Rearrange afs_select_fileserver() a little to put the use_server chunk
      before the next_server chunk so that with the removal of a couple of gotos
      the main path through the function is all one sequence.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      16280a15
    • David Howells's avatar
      afs: Remove unused code · 63dc4e4a
      David Howells authored
      
      
      Remove some old unused code.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      63dc4e4a
    • David Howells's avatar
      afs: Fix server list handling · 45df8462
      David Howells authored
      
      
      Fix server list handling in the following ways:
      
       (1) In afs_alloc_volume(), remove duplicate server list build code.  The
           list has already been built by afs_alloc_server_list(), which
           afs_alloc_volume() calls beforehand, so the duplicate code just
           results in twice as many VL RPCs.
      
       (2) In afs_deliver_vl_get_entry_by_name_u(), use the number of server
           records indicated by ->nServers in the UVLDB record returned by the
           VL.GetEntryByNameU RPC call rather than scanning all NMAXNSERVERS
           slots.  Unused slots may contain garbage.
      
       (3) In afs_alloc_server_list(), don't stop converting a UVLDB record into
           a server list just because we can't look up one of the servers.  Just
           skip that server and go on to the next.  If we can't look up any of
           the servers then we'll fail at the end.
      
      Without this patch, an attempt to view the umich.edu root cell using
      something like "ls /afs/umich.edu" on a dynamic root (future patch) mount
      or an autocell mount will result in ENOMEDIUM.  The failure is due to kafs
      not stopping after nServers' worth of records have been read, but then
      trying to access a server with a garbage UUID and getting an error, which
      aborts the server list build.
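
      Points (2) and (3) can be sketched as follows.  This is a simplified
      model, not the kernel code: the record layout, the lookup callback and
      the NoUsableServers exception are all assumptions made for the example.

```python
class NoUsableServers(Exception):
    """Raised when none of the listed servers could be looked up."""

def build_server_list(record, lookup_server):
    """Convert a VLDB-style record into a list of server objects.

    record: dict with "nServers" (count of valid slots) and "servers"
            (a fixed-size slot array whose unused tail may hold garbage).
    lookup_server: callable taking a UUID; raises LookupError on failure.
    """
    servers = []
    # (2) Only trust the first nServers slots; ignore the unused tail.
    for uuid in record["servers"][:record["nServers"]]:
        try:
            servers.append(lookup_server(uuid))
        except LookupError:
            # (3) Skip a server we cannot look up and go on to the next.
            continue
    if not servers:
        # Fail only at the end, once every listed server has been tried.
        raise NoUsableServers("no server in the record could be looked up")
    return servers
```

      With a record that lists three servers of which one cannot be looked
      up, the function returns the other two and never touches the garbage
      slots beyond nServers.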
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: stable@vger.kernel.org
      45df8462