  1. Jun 11, 2016
    • rxrpc: Limit the listening backlog · 0e119b41
      David Howells authored
      
      Limit the socket incoming call backlog queue size so that a remote client
      can't pump in sufficient new calls that the server runs out of memory.  Note
      that this is partially theoretical at the moment since whilst the number of
      calls is limited, the number of packets trying to set up new calls is not.
      This will be addressed in a later patch.
      
      If the caller of listen() specifies a backlog of INT_MAX, then they get the
      current maximum; anything else greater than max_backlog, or anything
      negative, incurs EINVAL.
      
      The limit on the maximum queue size can be set by:
      
      	echo N >/proc/sys/net/rxrpc/max_backlog
      
      where 4<=N<=32.
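      As an illustrative sketch, the described listen() semantics amount to the
      following (not the actual net/rxrpc code; the sysctl variable name here is
      made up):
      
      	if (backlog == INT_MAX)
      		backlog = sysctl_rxrpc_max_backlog;	/* "give me the current maximum" */
      	else if (backlog < 0 || backlog > sysctl_rxrpc_max_backlog)
      		return -EINVAL;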
      
      Further, set the default backlog to 0, requiring listen() to be called
      before we start actually queueing new calls.  Whilst this is, in a sense, a
      change in the UAPI, the caller can't actually *accept* new calls anyway
      unless they've first called listen() to put the socket into the LISTENING
      state - thus the aforementioned new calls would otherwise just sit there,
      eating up kernel memory.  (Note that sockets that don't have a non-zero
      service ID bound don't get incoming calls anyway.)
      
      Given that the default backlog is now 0, make the AFS filesystem call
      kernel_listen() to set the maximum backlog for itself.
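      In-kernel, that is just a call through the kernel socket API; a minimal
      sketch of what the AFS side presumably does when opening its socket (the
      socket variable name is illustrative):
      
      	ret = kernel_listen(afs_socket, INT_MAX);	/* request the current maximum backlog */
      	if (ret < 0)
      		goto error;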
      
      Possible improvements include:
      
       (1) Trimming a too-large backlog to max_backlog when listen is called.
      
       (2) Trimming the backlog value whenever the value is used so that changes
           to max_backlog are applied to an open socket automatically.  Note that
           the AFS filesystem opens one socket and keeps it open for extended
           periods, so would miss out on changes to max_backlog.
      
       (3) Having a separate setting for the AFS filesystem.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0e119b41
  2. May 27, 2016
    • remove lots of IS_ERR_VALUE abuses · 287980e4
      Arnd Bergmann authored
      
      Most users of IS_ERR_VALUE() in the kernel are wrong, as they
      pass an 'int' into a function that takes an 'unsigned long'
      argument. This happens to work because the type is sign-extended
      on 64-bit architectures before it gets converted into an
      unsigned type.
      
      However, anything that passes an 'unsigned short' or 'unsigned int'
      argument into IS_ERR_VALUE() is guaranteed to be broken, as are
      8-bit integers and types that are wider than 'unsigned long'.
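      To see why, here is a compact illustration for a 64-bit machine (a sketch;
      IS_ERR_VALUE() is shown in the simplified form it effectively had, with
      MAX_ERRNO being 4095):
      
      	#include <errno.h>
      	#define MAX_ERRNO	4095
      	#define IS_ERR_VALUE(x)	((x) >= (unsigned long)-MAX_ERRNO)
      
      	unsigned int err = -ENOMEM;	/* wraps to 0xfffffff4 */
      
      	/*
      	 * In the comparison, err is zero-extended to the 64-bit value
      	 * 0x00000000fffffff4, which is far below (unsigned long)-MAX_ERRNO
      	 * (0xfffffffffffff001), so IS_ERR_VALUE(err) is false and the error
      	 * goes undetected.  A plain 'int' happens to work because it is
      	 * sign-extended to 0xfffffffffffffff4 first.
      	 */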
      
      Andrzej Hajda has already fixed a lot of the worst abusers that
      were causing actual bugs, but it would be nice to prevent any
      users that are not passing 'unsigned long' arguments.
      
      This patch changes all users of IS_ERR_VALUE() that I could find
      on 32-bit ARM randconfig builds and x86 allmodconfig. For the
      moment, this doesn't change the definition of IS_ERR_VALUE()
      because there are probably still architecture specific users
      elsewhere.
      
      Almost all the warnings I got are for files that are better off
      using 'if (err)' or 'if (err < 0)'.
      The only legitimate user I could find that we get a warning for
      is the (32-bit only) freescale fman driver, so I did not remove
      the IS_ERR_VALUE() there but changed the type to 'unsigned long'.
      For 9pfs, I just worked around one user whose calling conventions
      are so obscure that I did not dare change the behavior.
      
      I was using this definition for testing:
      
       #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
             unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))
      
      which ends up making all 16-bit or wider types work correctly with
      the most plausible interpretation of what IS_ERR_VALUE() was supposed
      to return according to its users, but also causes a compile-time
      warning for any users that do not pass an 'unsigned long' argument.
      
      I suggested this approach earlier this year, but back then we ended
      up deciding to just fix the users that are obviously broken. After
      the initial warning that caused me to get involved in the discussion
      (fs/gfs2/dir.c) showed up again in the mainline kernel, Linus
      asked me to send the whole thing again.
      
      [ Updated the 9p parts as per Al Viro  - Linus ]
      
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andrzej Hajda <a.hajda@samsung.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: https://lkml.org/lkml/2016/1/7/363
      Link: https://lkml.org/lkml/2016/5/27/486
      
      
      Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      287980e4
  3. May 02, 2016
    • make ext2_get_page() and friends work without external serialization · be5b82db
      Al Viro authored
      
      Right now ext2_get_page() (and its analogues in a bunch of other filesystems)
      relies upon the directory being locked - the way it sets and tests Checked and
      Error bits would be racy without that.  Switch to a slightly different scheme,
      _not_ setting Checked in case of failure.  That way the logic becomes
      	if Checked => OK
      	else if Error => fail
      	else if !validate => fail
      	else => OK
      with validation setting Checked or Error on success and failure respectively,
      and returning which one happened.  Equivalent to the current logic, but unlike
      the current logic it is not sensitive to set_bit and test_bit getting
      reordered by the CPU, etc.
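      In C, the scheme is roughly the following (a sketch; in ext2 the validator
      is ext2_check_page(), which sets Checked on success and Error on failure):
      
      	if (PageChecked(page))
      		return page;		/* previously validated: OK */
      	if (PageError(page))
      		goto fail;		/* previously failed validation */
      	if (!ext2_check_page(page))
      		goto fail;		/* fresh validation failed */
      	return page;			/* fresh validation succeeded */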
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      be5b82db
  4. Apr 11, 2016
    • rxrpc: Differentiate local and remote abort codes in structs · dc44b3a0
      David Howells authored
      
      In the rxrpc_connection and rxrpc_call structs, there's one field to hold
      the abort code, no matter whether that value was generated locally to be
      sent or was received from the peer via an abort packet.
      
      Split the abort code field in two for cleanliness' sake and add an error
      field to the rxrpc_call struct too, to hold the Linux error number
      (sometimes this is generated in a context where we can't return it to
      userspace directly).
      
      Furthermore, add a skb mark to indicate a packet that caused a local abort
      to be generated so that recvmsg() can pick up the correct abort code.  A
      future change will need to indicate the difference between local and remote
      aborts to userspace via a control message.
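      After the split, the relevant fields plausibly look like this (a sketch;
      the field names are a guess at the obvious naming):
      
      	struct rxrpc_call {
      		...
      		u32	local_abort;	/* abort code we generated locally */
      		u32	remote_abort;	/* abort code received from the peer */
      		int	error;		/* Linux error number for userspace */
      		...
      	};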
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dc44b3a0
    • afs: Wait for outstanding async calls before closing rxrpc socket · 2f02f7ae
      David Howells authored
      
      The afs filesystem needs to wait for any outstanding asynchronous calls
      (such as FS.GiveUpCallBacks cleaning up the callbacks lodged with a server)
      to complete before closing the AF_RXRPC socket when unloading the module.
      
      This may occur if the module is removed too quickly after unmounting all
      filesystems.  This will produce an error report that looks like:
      
      	AFS: Assertion failed
      	1 == 0 is false
      	0x1 == 0x0 is false
      	------------[ cut here ]------------
      	kernel BUG at ../fs/afs/rxrpc.c:135!
      	...
      	RIP: 0010:[<ffffffffa004111c>] afs_close_socket+0xec/0x107 [kafs]
      	...
      	Call Trace:
      	 [<ffffffffa004a160>] afs_exit+0x1f/0x57 [kafs]
      	 [<ffffffff810c30a0>] SyS_delete_module+0xec/0x17d
      	 [<ffffffff81610417>] entry_SYSCALL_64_fastpath+0x12/0x6b
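      The shape of the fix is to wait for outstanding calls to drain before the
      socket is released; a minimal sketch (the counter and the way it is polled
      here are illustrative, not the actual fs/afs code):
      
      	/* in afs_close_socket(), before the socket is released */
      	while (atomic_read(&afs_outstanding_calls) > 0)
      		msleep(10);	/* let async call completion catch up */
      	sock_release(afs_socket);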
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2f02f7ae
  5. Apr 04, 2016
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      That promise never materialized, and it is unlikely that it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE, and it's a constant source of confusion whether the
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straightforward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using the
      script below.  For some reason, coccinelle doesn't patch header files.
      I've run spatch on them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  6. Jan 22, 2016
    • wrappers for ->i_mutex access · 5955102c
      Al Viro authored
      
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
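      The wrappers are thin inlines along these lines (a sketch of two of them):
      
      	static inline void inode_lock(struct inode *inode)
      	{
      		mutex_lock(&inode->i_mutex);
      	}
      
      	static inline void inode_unlock(struct inode *inode)
      	{
      		mutex_unlock(&inode->i_mutex);
      	}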
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      5955102c
  7. Jan 15, 2016
    • kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov authored
      
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems override the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to the "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
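      Opting an allocation in is a one-flag change; an illustrative sketch of
      the two forms:
      
      	/* slab caches: add SLAB_ACCOUNT to the creation flags */
      	cachep = kmem_cache_create("task_struct", sizeof(struct task_struct),
      				   ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_ACCOUNT, NULL);
      
      	/* one-off allocations: add __GFP_ACCOUNT to the gfp mask */
      	ptr = kmalloc(size, GFP_KERNEL | __GFP_ACCOUNT);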
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5d097056
  8. Dec 09, 2015
    • don't put symlink bodies in pagecache into highmem · 21fc61c7
      Al Viro authored
      
      kmap() in page_follow_link_light() needed to go - holding an arbitrary
      number of kmaps for a long time is a great way to deadlock the system.
      
      A new helper (inode_nohighmem(inode)) needs to be used for pagecache
      symlink inodes; this is done for all in-tree cases.  page_follow_link_light()
      is instrumented to yell about anything missed.
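      Typical usage when instantiating a pagecache symlink inode, sketched on
      the ext2-style pattern:
      
      	if (S_ISLNK(inode->i_mode)) {
      		inode->i_op = &ext2_symlink_inode_operations;
      		inode_nohighmem(inode);		/* symlink body must not live in highmem */
      		inode->i_mapping->a_ops = &ext2_aops;
      	}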
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      21fc61c7
  9. Dec 09, 2014
    • put iov_iter into msghdr · c0371da6
      Al Viro authored
      
      Note that the code _using_ ->msg_iter at that point will be very
      unhappy with anything other than an unshifted iovec-backed iov_iter.
      We still need to convert users to proper primitives.
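      Callers that used to fill in msg_iov/msg_iovlen now initialise
      msg.msg_iter instead; roughly, with the iov_iter_init() primitive of this
      era (buf and len stand in for the caller's buffer):
      
      	struct msghdr msg = { .msg_name = NULL };
      	struct iovec iov = { .iov_base = buf, .iov_len = len };
      
      	iov_iter_init(&msg.msg_iter, WRITE, &iov, 1, len);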
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      c0371da6
  10. Sep 19, 2014
    • sched, cleanup, treewide: Remove set_current_state(TASK_RUNNING) after schedule() · f139caf2
      Kirill Tkhai authored
      
      schedule(), io_schedule() and schedule_timeout() always return with the
      task in TASK_RUNNING state, so setting that state once more is unnecessary.
      
      (All the places in the patch are visibly good; the only exception is
       kiblnd_scheduler() from:
      
            drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
      
       whose schedule() is one line above the standard 3 lines of unified diff
       context.)
      
      There are no places where set_current_state() is used for its mb() effect.
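      For reference, the removed pattern is the final line of sequences like:
      
      	set_current_state(TASK_INTERRUPTIBLE);
      	schedule();
      	set_current_state(TASK_RUNNING);	/* redundant: schedule() returns in TASK_RUNNING */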
      
      Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1410529254.3569.23.camel@tkhai
      
      
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Anil Belur <askb23@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dmitry Eremin <dmitry.eremin@intel.com>
      Cc: Frank Blaschka <blaschka@linux.vnet.ibm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Isaac Huang <he.huang@intel.com>
      Cc: James E.J. Bottomley <JBottomley@parallels.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: J. Bruce Fields <bfields@fieldses.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Laura Abbott <lauraa@codeaurora.org>
      Cc: Liang Zhen <liang.zhen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masaru Nomura <massa.nomura@gmail.com>
      Cc: Michael Opdenacker <michael.opdenacker@free-electrons.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Oleg Drokin <green@linuxhacker.ru>
      Cc: Peng Tao <bergwolf@gmail.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Robert Love <robert.w.love@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Ursula Braun <ursula.braun@de.ibm.com>
      Cc: Zi Shen Lim <zlim.lnx@gmail.com>
      Cc: devel@driverdev.osuosl.org
      Cc: dm-devel@redhat.com
      Cc: dri-devel@lists.freedesktop.org
      Cc: fcoe-devel@open-fcoe.org
      Cc: jfs-discussion@lists.sourceforge.net
      Cc: linux390@de.ibm.com
      Cc: linux-afs@lists.infradead.org
      Cc: linux-cris-kernel@axis.com
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-raid@vger.kernel.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-scsi@vger.kernel.org
      Cc: qla2xxx-upstream@qlogic.com
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: user-mode-linux-user@lists.sourceforge.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f139caf2
  11. Jun 02, 2014
    • locks: ensure that fl_owner is always initialized properly in flock and lease codepaths · 130d1f95
      Jeff Layton authored
      
      Currently, the fl_owner isn't set for flock locks. Some filesystems use
      byte-range locks to simulate flock locks and there is a common idiom in
      those that does:
      
          fl->fl_owner = (fl_owner_t)filp;
          fl->fl_start = 0;
          fl->fl_end = OFFSET_MAX;
      
      Since flock locks are generally "owned" by the open file description,
      move this into the common flock lock setup code. The fl_start and fl_end
      fields are already set appropriately, so remove the unneeded setting of
      that in flock ops in those filesystems as well.
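      The common setup code presumably now does something like the following
      (a sketch, not the exact fs/locks.c hunk):
      
      	fl->fl_file = filp;
      	fl->fl_owner = (fl_owner_t)filp;	/* owned by the open file description */
      	fl->fl_flags = FL_FLOCK;
      	fl->fl_start = 0;
      	fl->fl_end = OFFSET_MAX;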
      
      Finally, the lease code also sets fl_owner as if leases were owned by
      the process and not by the open file description. This is incorrect as
      leases have the same ownership semantics as flock locks. Set them the
      same way. The lease code doesn't actually use the fl_owner value for
      anything, so this is more for consistency's sake than a bugfix.
      
      Reported-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (Staging portion)
      Acked-by: J. Bruce Fields <bfields@fieldses.org>
      130d1f95
  12. May 23, 2014
    • AFS: Pass an afs_call* to call->async_workfn() instead of a work_struct* · 656f88dd
      David Howells authored
      
      call->async_workfn() can take an afs_call* arg rather than a work_struct* as
      the functions assigned there are now called from afs_async_workfn() which has
      to call container_of() anyway.
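      That is, the handler field's type changes along these lines (a sketch):
      
      	/* before */
      	void (*async_workfn)(struct work_struct *work);
      	/* after */
      	void (*async_workfn)(struct afs_call *call);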
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      656f88dd
    • AFS: Fix kafs module unloading · 150a6b47
      Nathaniel Wesley Filardo authored
      
      At present, it is not possible to successfully unload the kafs module if there
      are outstanding async outgoing calls (those made with afs_make_call()).  This
      appears to be due to the changes introduced by:
      
      	commit 05949945
      	Author: Tejun Heo <tj@kernel.org>
      	Date:   Fri Mar 7 10:24:50 2014 -0500
      	Subject: afs: don't use PREPARE_WORK
      
      which didn't go far enough.  The problem is due to:
      
       (1) The aforementioned commit introduced a separate handler function pointer
           in the call, call->async_workfn, in addition to the original workqueue
           item, call->async_work, for asynchronous operations, because the workqueue
           subsystem cannot handle a work item's function pointer being changed whilst
           the item is queued or being processed.
      
       (2) afs_async_workfn() was introduced in that commit to be the callback for
           call->async_work.  Its sole purpose is to run whatever call->async_workfn
           points to.
      
       (3) call->async_workfn is only used from afs_async_workfn(), which is only
           set on async_work by afs_collect_incoming_call() - ie. for incoming
           calls.
      
       (4) call->async_workfn is *not* set by afs_make_call() when outgoing calls are
           made, and call->async_work is set to afs_process_async_call() - and not
           afs_async_workfn().
      
       (5) afs_process_async_call() now changes call->async_workfn rather than
           call->async_work to point to afs_delete_async_call() to clean up, but this
           is only effective for incoming calls because call->async_work does not
           point to afs_async_workfn() for outgoing calls.
      
       (6) Because, for outgoing calls, call->async_work remains pointing to
           afs_process_async_call(), this results in an infinite loop.
      
      Instead, make the workqueue uniformly vector through call->async_workfn, via
      afs_async_workfn() and simply initialise call->async_workfn to point to
      afs_process_async_call() in afs_make_call().
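      A sketch of what that amounts to in afs_make_call():
      
      	call->async_workfn = afs_process_async_call;
      	INIT_WORK(&call->async_work, afs_async_workfn);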
      
      Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Tejun Heo <tj@kernel.org>
      150a6b47
    • AFS: Part of afs_end_call() is identical to code elsewhere, so split it · 6cf12869
      Nathaniel Wesley Filardo authored
      
      Split afs_end_call() into two pieces, one of which is identical to code in
      afs_process_async_call().  Replace the latter with a call to the first part of
      afs_end_call().
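      i.e. something like the following sketch, where the name of the split-out
      helper is illustrative:
      
      	static void afs_end_call_nofree(struct afs_call *call)
      	{
      		/* the cleanup shared with afs_process_async_call() */
      	}
      
      	static void afs_end_call(struct afs_call *call)
      	{
      		afs_end_call_nofree(call);
      		afs_free_call(call);
      	}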
      
      Signed-off-by: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
      Signed-off-by: David Howells <dhowells@redhat.com>
      6cf12869
  13. May 21, 2014
    • AFS: Fix cache manager service handlers · 6c67c7c3
      David Howells authored
      
      Fix the cache manager RPC service handlers.  The afs_send_empty_reply() and
      afs_send_simple_reply() functions:
      
       (a) Kill the call and free up the buffers associated with it if they fail.
      
       (b) Return with the call intact if they succeed.
      
      However, none of the callers actually check the result or clean up the call
      on success - and they may use the now non-existent data on failure.
      
      This was detected by Dan Carpenter using a static checker:
      
      	The patch 08e0e7c8: "[AF_RXRPC]: Make the in-kernel AFS
      	filesystem use AF_RXRPC." from Apr 26, 2007, leads to the following
      	static checker warning:
      	"fs/afs/cmservice.c:155 SRXAFSCB_CallBack()
      		 warn: 'call' was already freed."
      
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      6c67c7c3
  14. Apr 03, 2014
    • mm + fs: store shadow entries in page cache · 91b0abe3
      Johannes Weiner authored
      
      Reclaim will be leaving shadow entries in the page cache radix tree upon
      evicting the real page.  As those pages are found from the LRU, an
      iput() can lead to the inode being freed concurrently.  At this point,
      reclaim must no longer install shadow pages because the inode freeing
      code needs to ensure the page tree is really empty.
      
      Add an address_space flag, AS_EXITING, that the inode freeing code sets
      under the tree lock before doing the final truncate.  Reclaim will check
      for this flag before installing shadow pages.
      
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Minchan Kim <minchan@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Metin Doslu <metin@citusdata.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ozgun Erdogan <ozgun@citusdata.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Ryan Mallon <rmallon@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      91b0abe3
  15. Mar 07, 2014
    • afs: don't use PREPARE_WORK · 05949945
      Tejun Heo authored
      
      PREPARE_[DELAYED_]WORK() are being phased out.  They have few users
      and a nasty surprise in terms of reentrancy guarantee as workqueue
      considers work items to be different if they don't have the same work
      function.
      
      afs_call->async_work is multiplexed with multiple work functions.
      Introduce afs_async_workfn(), which invokes afs_call->async_workfn, always
      use it as the work function, and update the users to set the ->async_workfn
      field instead of overriding the work function using PREPARE_WORK().
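      Such a trampoline presumably boils down to (a sketch):
      
      	static void afs_async_workfn(struct work_struct *work)
      	{
      		struct afs_call *call = container_of(work, struct afs_call, async_work);
      
      		call->async_workfn(work);	/* dispatch to the per-call handler */
      	}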
      
      It would probably be best to route this with other related updates
      through the workqueue tree.
      
      Compile tested.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: linux-afs@lists.infradead.org
      05949945
  16. Sep 27, 2013
    • FS-Cache: Provide the ability to enable/disable cookies · 94d30ae9
      David Howells authored
      
      Provide the ability to enable and disable fscache cookies.  A disabled cookie
      will reject or ignore further requests to:
      
      	Acquire a child cookie
      	Invalidate and update backing objects
      	Check the consistency of a backing object
      	Allocate storage for backing page
      	Read backing pages
      	Write to backing pages
      
      but still allows:
      
      	Checks/waits on the completion of already in-progress objects
      	Uncaching of pages
      	Relinquishment of cookies
      
      Two new operations are provided:
      
       (1) Disable a cookie:
      
      	void fscache_disable_cookie(struct fscache_cookie *cookie,
      				    bool invalidate);
      
           If the cookie is not already disabled, this locks the cookie against other
           dis/enablement ops, marks the cookie as being disabled, discards or
           invalidates any backing objects and waits for cessation of activity on any
           associated object.
      
           This is a wrapper around a chunk split out of fscache_relinquish_cookie(),
           but it reinitialises the cookie such that it can be reenabled.
      
           All possible failures are handled internally.  The caller should consider
           calling fscache_uncache_all_inode_pages() afterwards to make sure all page
           markings are cleared up.
      
       (2) Enable a cookie:
      
      	void fscache_enable_cookie(struct fscache_cookie *cookie,
      				   bool (*can_enable)(void *data),
      				   void *data)
      
           If the cookie is not already enabled, this locks the cookie against other
           dis/enablement ops, invokes can_enable() and, if the cookie is not an
           index cookie, will begin the procedure of acquiring backing objects.
      
           The optional can_enable() function is passed the data argument and returns
           a ruling as to whether or not enablement should actually be permitted to
           begin.
      
           All possible failures are handled internally.  The cookie will only be
           marked as enabled if provisional backing objects are allocated.
      
      A later patch will introduce these to NFS.  Cookie enablement during nfs_open()
      is then contingent on i_writecount <= 0.  can_enable() checks for a race
      between open(O_RDONLY) and open(O_WRONLY/O_RDWR).  This simplifies NFS's cookie
      handling and allows us to get rid of open(O_RDONLY) accidentally introducing
      caching to an inode that's open for writing already.
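      A sketch of what such a can_enable() callback might look like (names are
      illustrative):
      
      	static bool nfs_fscache_can_enable(void *data)
      	{
      		struct inode *inode = data;
      
      		return atomic_read(&inode->i_writecount) <= 0;
      	}
      
      	/* at nfs_open() time: */
      	fscache_enable_cookie(cookie, nfs_fscache_can_enable, inode);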
      
      One operation has its API modified:
      
       (3) Acquire a cookie.
      
      	struct fscache_cookie *fscache_acquire_cookie(
      		struct fscache_cookie *parent,
      		const struct fscache_cookie_def *def,
      		void *netfs_data,
      		bool enable);
      
           This now has an additional argument that indicates whether the requested
           cookie should be enabled by default.  It doesn't need the can_enable()
           function because the caller must prevent multiple calls for the same netfs
           object and it doesn't need to take the enablement lock because no one else
           can get at the cookie before this returns.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      94d30ae9