  1. Jan 06, 2009
  2. Jan 05, 2009
    • add a vfs_fsync helper · 4c728ef5
      Christoph Hellwig authored
      
      
      Fsync currently has a fdatawrite/fdatawait pair around the method call,
      and a mutex_lock/unlock of the inode mutex.  All callers of fsync have
      to duplicate this, but we have a few and most of them don't quite get
      it right.  This patch adds a new vfs_fsync that takes care of this.
      It's a little more complicated than usual, as ->fsync might get a NULL
      file pointer and just a dentry from nfsd, but otherwise gets a file, and
      we want to take the mapping and file operations from it when it is there.
      
      Notes on the fsync callers:
      
       - ecryptfs wasn't calling filemap_fdatawrite / filemap_fdatawait on the
         lower file
       - coda wasn't calling filemap_fdatawrite / filemap_fdatawait on the host
         file, and was returning 0 when ->fsync was missing
       - shm was neither calling filemap_fdatawrite / filemap_fdatawait nor
         taking i_mutex.  Given that shared memory doesn't have disk
         backing, not doing anything in fsync seems fine, and I left it out
         of the vfs_fsync conversion for now.  In that case, though, we
         might skip passing it through to the lower file entirely and just
         call the no-op simple_sync_file directly.
      
      [and now actually export vfs_fsync]
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
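
The ordering that the vfs_fsync commit describes (start writeback, call ->fsync under the inode mutex, then wait for writeback) can be modeled in user space. This is a minimal sketch, not the kernel implementation: `struct fake_file`, `model_vfs_fsync` and the other `model_*` helpers are hypothetical names invented here, and returning `-EINVAL` for a missing ->fsync is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these are kernel types. */
struct fake_file {
    int (*fsync)(struct fake_file *f); /* may be NULL, like a missing ->fsync */
    int locked;                        /* stands in for the inode mutex */
    char trace[8];                     /* records the order of the steps */
    size_t n;
};

static void step(struct fake_file *f, char c)
{
    if (f->n + 1 < sizeof(f->trace)) {
        f->trace[f->n++] = c;
        f->trace[f->n] = '\0';
    }
}

/* Stand-ins for filemap_fdatawrite / filemap_fdatawait. */
static int model_fdatawrite(struct fake_file *f) { step(f, 'W'); return 0; }
static int model_fdatawait(struct fake_file *f)  { step(f, 'w'); return 0; }

/* Example ->fsync method: records 'S' if called with the lock held. */
int model_fsync_op(struct fake_file *f)
{
    step(f, f->locked ? 'S' : '!');
    return 0;
}

/* One helper that every caller can share instead of duplicating the steps. */
int model_vfs_fsync(struct fake_file *f)
{
    int ret, err;

    if (!f->fsync)
        return -EINVAL; /* assumption: report an error, do not return 0 */

    ret = model_fdatawrite(f);  /* start writeback of dirty data */
    f->locked = 1;              /* mutex_lock() in the real code */
    err = f->fsync(f);
    if (!ret)
        ret = err;
    f->locked = 0;              /* mutex_unlock() */
    err = model_fdatawait(f);   /* wait for writeback to finish */
    if (!ret)
        ret = err;
    return ret;
}
```

The first error encountered is the one returned, while later steps still run so that writeback is always waited for.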
    • inode->i_op is never NULL · acfa4380
      Al Viro authored
      
      
      We used to have a rather schizophrenic set of checks for NULL ->i_op even
      though it had been eliminated years ago.  You'd need to go out of your
      way to set it to NULL explicitly _and_ a bunch of code would die on
      such inodes anyway.  After killing two remaining places that still
      did that bogosity, all that crap can go away.
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • kill suid bit only for regular files · 7f5ff766
      Dmitri Monakhov authored
      
      
      Killing the suid bit is useless for non-regular files, so we don't have
      to do it for them.  In fact, a block device may trigger this path without
      holding dentry->d_inode->i_mutex.
      
      (akpm: concerns were expressed (by me) about S_ISDIR inodes)
      
      Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  3. Jan 04, 2009
    • fs: symlink write_begin allocation context fix · 54566b2c
      Nicholas Piggin authored
      
      
      With the write_begin/write_end aops, page_symlink was broken because it
      could no longer pass a GFP_NOFS type mask into the point where the
      allocations happened.  They are done in write_begin, which would always
      assume that the filesystem can be entered from reclaim.  This bug could
      cause filesystem deadlocks.
      
      The funny thing with having a gfp_t mask there is that it doesn't really
      allow the caller to arbitrarily tinker with the context in which it can be
      called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
      take the page lock.  The only thing any callers care about is __GFP_FS
      anyway, so turn that into a single flag.
      
      Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
      this flag in their write_begin function.  Change __grab_cache_page to
      accept a nofs argument as well, to honour that flag (while we're there,
      change the name to grab_cache_page_write_begin which is more instructive
      and does away with random leading underscores).
      
      This is really a more flexible way to go in the end anyway -- if a
      filesystem happens to want any extra allocations aside from the pagecache
      ones in its write_begin function, it may now use GFP_KERNEL (rather than
      GFP_NOFS) for common case allocations (e.g. ocfs2_alloc_write_ctxt, for a
      random example).
      
      [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
      [kosaki.motohiro@jp.fujitsu.com: fix fuse]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org> [2.6.28.x]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Cleaned up the calling convention: just pass in the AOP flags
        untouched to the grab_cache_page_write_begin() function.  That
        just simplifies everybody, and may even allow future expansion of the
        logic.   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
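
The idea of turning the one thing callers care about (__GFP_FS) into a single write_begin flag can be sketched as follows. The `SKETCH_*` constants and `sketch_write_begin_gfp` are hypothetical names with made-up bit values chosen for this example; the kernel defines its own values and its own helper.

```c
/* Hypothetical bit values for this sketch only; not the kernel's values. */
#define SKETCH_GFP_IO        0x1u
#define SKETCH_GFP_FS        0x2u
#define SKETCH_GFP_WAIT      0x4u
#define SKETCH_GFP_KERNEL    (SKETCH_GFP_WAIT | SKETCH_GFP_IO | SKETCH_GFP_FS)
#define SKETCH_AOP_FLAG_NOFS 0x4u

/*
 * Compute the allocation mask used for the pagecache page in write_begin.
 * The caller passes its AOP flags through untouched; the only effect of
 * the NOFS flag is to clear the "may enter the filesystem" bit.
 */
unsigned int sketch_write_begin_gfp(unsigned int mapping_gfp,
                                    unsigned int aop_flags)
{
    unsigned int gfp_notmask = 0;

    if (aop_flags & SKETCH_AOP_FLAG_NOFS)
        gfp_notmask = SKETCH_GFP_FS;

    return mapping_gfp & ~gfp_notmask;
}
```

A symlink-style caller would pass `SKETCH_AOP_FLAG_NOFS` so that reclaim triggered by the allocation cannot re-enter the filesystem, while ordinary writes keep the full mask.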
    • vmalloc.c: fix flushing in vmap_page_range() · 2e4e27c7
      Adam Lackorzynski authored
      
      
      The flush_cache_vmap in vmap_page_range() is called with the end of the
      range twice.  The following patch fixes this for me.
      
      Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
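
A flush that receives "the end of the range twice" is what happens when a loop advances its address cursor up to the end and the cursor is then reused as the start of the flush. The sketch below assumes that shape of bug and shows the usual remedy of saving the start before the loop. `struct flush_log`, `sketch_flush` and `sketch_map_range` are hypothetical names, not the functions in vmalloc.c.

```c
/* Records what range the (pretend) cache flush was asked to cover. */
struct flush_log {
    unsigned long start;
    unsigned long end;
};

static void sketch_flush(struct flush_log *log,
                         unsigned long start, unsigned long end)
{
    log->start = start;
    log->end = end;
}

void sketch_map_range(struct flush_log *log, unsigned long addr,
                      unsigned long end, unsigned long step)
{
    /* Remember the beginning before the loop advances addr. */
    unsigned long start = addr;

    while (addr < end)
        addr += step; /* stands in for mapping one chunk of the range */

    /*
     * Passing (addr, end) here would flush an empty range, because
     * addr has already reached end. Use the saved start instead.
     */
    sketch_flush(log, start, end);
}
```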
  4. Dec 31, 2008
  5. Dec 29, 2008
  6. Dec 20, 2008
  7. Dec 19, 2008
  8. Dec 18, 2008
  9. Dec 16, 2008
    • x86: consolidate __swp_XXX() macros · 1796316a
      Jan Beulich authored
      
      
      Impact: cleanup, code robustization
      
      The __swp_...() macros silently relied upon which bits are used for
      _PAGE_FILE and _PAGE_PROTNONE. After having changed _PAGE_PROTNONE in
      our Xen kernel to no longer overlap _PAGE_PAT, live locks and crashes
      were reported that could have been avoided if these macros properly
      used the symbolic constants. Since, as pointed out earlier, for Xen
      Dom0 support mainline likewise will need to eliminate the conflict
      between _PAGE_PAT and _PAGE_PROTNONE, this patch does all the necessary
      adjustments, plus it introduces a mechanism to check consistency
      between MAX_SWAPFILES_SHIFT and the actual encoding macros.
      
      This also fixes a latent bug in that x86-64 used a 6-bit mask in
      __swp_type(), and if MAX_SWAPFILES_SHIFT was increased beyond 5 in (the
      seemingly unrelated) linux/swap.h, this would have resulted in a
      collision with _PAGE_FILE.
      
      Non-PAE 32-bit code gets similarly adjusted for its pte_to_pgoff() and
      pgoff_to_pte() calculations.
      
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
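
Deriving the swap-entry encoding from symbolic bit positions, with a compile-time consistency check against the number of swap types, could look roughly like this. The bit layout and every `SK_`/`sk_` name are assumptions made for the sketch; they are not the x86 definitions.

```c
#include <stdint.h>

/* Hypothetical layout: bit 0 = present, bits 1..5 = swap type,
 * bit 6 = "file" marker, bits 7 and up = swap offset. */
#define SK_MAX_SWAPFILES_SHIFT 5
#define SK_PAGE_BIT_PRESENT    0
#define SK_PAGE_BIT_FILE       6

/* Derived from the symbolic constants rather than hard-coded numbers. */
#define SK_SWP_TYPE_SHIFT   (SK_PAGE_BIT_PRESENT + 1)
#define SK_SWP_TYPE_BITS    (SK_PAGE_BIT_FILE - SK_PAGE_BIT_PRESENT - 1)
#define SK_SWP_OFFSET_SHIFT (SK_PAGE_BIT_FILE + 1)

/* Fails the build if the type field cannot hold every swap type. */
_Static_assert(SK_MAX_SWAPFILES_SHIFT <= SK_SWP_TYPE_BITS,
               "swap type field is too narrow for MAX_SWAPFILES_SHIFT");

uint64_t sk_swp_entry(unsigned int type, uint64_t offset)
{
    return ((uint64_t)type << SK_SWP_TYPE_SHIFT) |
           (offset << SK_SWP_OFFSET_SHIFT);
}

unsigned int sk_swp_type(uint64_t entry)
{
    return (unsigned int)(entry >> SK_SWP_TYPE_SHIFT) &
           ((1u << SK_SWP_TYPE_BITS) - 1);
}

uint64_t sk_swp_offset(uint64_t entry)
{
    return entry >> SK_SWP_OFFSET_SHIFT;
}
```

If one of the flag bits is later moved, the shifts and the mask follow automatically, and the static assertion catches a type field that has become too small.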
    • mm: Don't touch uninitialized variable in do_pages_stat_array() · c095adbc
      KOSAKI Motohiro authored
      
      
      Commit 80bba129 removed one necessary variable initialization.  As a
      result, the following warning appeared:
      
          CC      mm/migrate.o
        mm/migrate.c: In function 'sys_move_pages':
        mm/migrate.c:1001: warning: 'err' may be used uninitialized in this function
      
      Worse, if find_vma() failed, the kernel read uninitialized memory.
      
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Brice Goglin <Brice.Goglin@inria.fr>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
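
The failure mode in the do_pages_stat_array() commit, a per-page status that is only assigned when a lookup succeeds, can be illustrated with a small user-space sketch. The `sk_*` names and the choice of `-EFAULT` as the default are assumptions for illustration, not the code in mm/migrate.c.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: an address range and its node. */
struct sk_vma {
    unsigned long start;
    unsigned long end;
    int node;
};

static const struct sk_vma *sk_find_vma(const struct sk_vma *vmas, size_t n,
                                        unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= vmas[i].start && addr < vmas[i].end)
            return &vmas[i];
    return NULL; /* lookup can fail, like find_vma() */
}

void sk_pages_stat(const struct sk_vma *vmas, size_t n,
                   const unsigned long *addrs, int *status, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* Initialized on every iteration, so the failure path below
         * never stores an indeterminate value. */
        int err = -EFAULT;
        const struct sk_vma *vma = sk_find_vma(vmas, n, addrs[i]);

        if (vma)
            err = vma->node;

        status[i] = err;
    }
}
```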
    • slob: do not pass the SLAB flags as GFP in kmem_cache_create() · 5e18e2b8
      Catalin Marinas authored
      
      
      The kmem_cache_create() function in the slob allocator passes the SLAB
      flags as GFP flags to the slob_alloc() function.  The patch changes this
      call to pass GFP_KERNEL as the other allocators seem to do.
      
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Matt Mackall <mpm@selenic.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. Dec 13, 2008
    • cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers. · 29c0177e
      Rusty Russell authored
      
      Impact: change calling convention of existing cpumask APIs
      
      Most cpumask functions started with cpus_: these have been replaced by
      cpumask_ ones which take struct cpumask pointers as expected.
      
      These four functions don't have good replacement names; fortunately
      they're rarely used, so we just change them over.
      
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: paulus@samba.org
      Cc: mingo@redhat.com
      Cc: tony.luck@intel.com
      Cc: ralf@linux-mips.org
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: cl@linux-foundation.org
      Cc: srostedt@redhat.com
  11. Dec 10, 2008
  12. Dec 08, 2008
  13. Dec 02, 2008
    • vmscan: evict streaming IO first · 9ff473b9
      Rik van Riel authored
      
      
      Count the insertion of new pages in the statistics used to drive the
      pageout scanning code.  This should help the kernel quickly evict
      streaming file IO.
      
      We count on the fact that new file pages start on the inactive file LRU
      and new anonymous pages start on the active anon list.  This means
      streaming file IO will increment the recent scanned file statistic, while
      leaving the recent rotated file statistic alone, driving pageout scanning
      to the file LRUs.
      
      Pageout activity does its own list manipulation.
      
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Tested-by: Gene Heskett <gene.heskett@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
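
The accounting in the vmscan commit can be pictured with a toy model: every newly inserted page bumps the "recent scanned" counter for its LRU type, and only pages that start on an active list also bump "recent rotated". The `sk_*` names and the pressure formula below are illustrative assumptions, not the kernel's scanning code.

```c
/* Index 0 = anonymous pages, index 1 = file pages. */
struct sk_lru_stat {
    unsigned long recent_scanned[2];
    unsigned long recent_rotated[2];
};

/* Account for a newly inserted page. */
void sk_count_new_page(struct sk_lru_stat *s, int file, int active)
{
    s->recent_scanned[file]++;
    if (active)
        s->recent_rotated[file]++;
}

/*
 * Toy scan-pressure figure: many scanned pages with few rotations means
 * the pages on that LRU are rarely reused, so it should be scanned harder.
 * The +1 terms only avoid division by zero.
 */
unsigned long sk_pressure(const struct sk_lru_stat *s, int file)
{
    return 100 * (s->recent_scanned[file] + 1) /
           (s->recent_rotated[file] + 1);
}
```

With new file pages starting inactive and new anonymous pages starting active, a burst of streaming file IO raises only the file "scanned" count, which pushes the model's pressure toward the file LRUs.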
    • bdi: register sysfs bdi device only once per queue · f1d0b063
      Kay Sievers authored
      
      
      Devices which share the same queue, like floppies and mtd devices, get
      registered multiple times in the bdi interface, but bdi accounts only the
      last registered device of the devices sharing one queue.
      
      On remove, all earlier registered devices leak, stay around in sysfs, and
      cause "duplicate filename" errors if the devices are re-created.
      
      This patch prevents the creation of multiple bdi interfaces per queue;
      the bdi device carries the dev_t name of the first block device
      registered among the devices sharing that queue.
      
      [akpm@linux-foundation.org: add a WARN_ON so we know which drivers are misbehaving]
      Tested-by: Peter Korsgaard <jacmet@sunsite.dk>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
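
"Register only once per queue" amounts to a guard at the top of the registration path. The sketch below is a user-space illustration under that assumption; `struct sk_bdi` and `sk_bdi_register` are hypothetical names, and the extra counter merely stands in for the warning that identifies misbehaving drivers.

```c
/* Hypothetical per-queue state shared by several block devices. */
struct sk_bdi {
    int registered;          /* non-zero once a device has been registered */
    unsigned int dev;        /* device number of the first registrant */
    unsigned int extra_regs; /* repeated attempts (where a warning would fire) */
};

int sk_bdi_register(struct sk_bdi *bdi, unsigned int dev)
{
    if (bdi->registered) {
        /* A second device on the same queue: keep the first name. */
        bdi->extra_regs++;
        return 0;
    }

    bdi->dev = dev;
    bdi->registered = 1;
    return 0;
}
```

Because later registrations are ignored, there is exactly one object to tear down on removal, so nothing is left behind to collide with a re-created device.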
    • memcg: memory hotplug fix for notifier callback · dc19f9db
      KAMEZAWA Hiroyuki authored
      
      Fixes for memcg/memory hotplug.
      
      While memory hotplug allocates/frees memmap, page_cgroup doesn't free
      page_cgroup at OFFLINE when page_cgroup is allocated via bootmem.
      (Because freeing bootmem requires special care.)
      
      Then, if page_cgroup is allocated by bootmem and memmap is freed/allocated
      by memory hotplug, page_cgroup->page == page is no longer true.
      
      But the current MEM_ONLINE handler doesn't check this, and doesn't update
      page_cgroup->page when it's not necessary to allocate page_cgroup.  (This
      was not found because memmap is not freed if SPARSEMEM_VMEMMAP is y.)
      
      And I noticed that MEM_ONLINE can be called against "part of section".
      So, freeing page_cgroup at CANCEL_ONLINE will cause trouble (freeing a
      page_cgroup that is in use).  Don't roll back at CANCEL.
      
      One more thing: the current memory hotplug notifier chain is stopped by
      slub because it sets NOTIFY_STOP_MASK in its return value.  So
      page_cgroup's callback is never called.  (It has lower priority than
      slub now.)
      
      I think this slub behavior is not intentional (a bug), and this patch
      fixes it.
      
      Another way of handling page_cgroup allocation to consider:
        - free page_cgroup at OFFLINE even if it's from bootmem
          and remove the special handler. But it requires more changes.
      
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12041
      
      
      
      Signed-off-by: KAMEZAWA Hiruyoki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Tested-by: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
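
The notifier problem in the memcg commit (a higher-priority callback returning a value with the stop bit set, so that later callbacks never run) can be reproduced with a toy chain. The `SK_`/`sk_` names and constant values are hypothetical and chosen for this sketch; it is not the kernel's notifier implementation.

```c
/* Hypothetical return codes for this sketch. */
#define SK_NOTIFY_OK        0x0001
#define SK_NOTIFY_STOP_MASK 0x8000

typedef int (*sk_notifier_fn)(int *calls);

/* Walk the chain in priority order; stop as soon as a callback
 * returns a value with the stop bit set. */
int sk_call_chain(sk_notifier_fn *chain, int n, int *calls)
{
    int ret = SK_NOTIFY_OK;

    for (int i = 0; i < n; i++) {
        ret = chain[i](calls);
        if (ret & SK_NOTIFY_STOP_MASK)
            break;
    }
    return ret;
}

/* Behaves like the callback described as buggy: succeeds but also
 * sets the stop bit, hiding the event from everyone after it. */
int sk_stopping_cb(int *calls)
{
    (*calls)++;
    return SK_NOTIFY_OK | SK_NOTIFY_STOP_MASK;
}

/* Well-behaved callback: reports success and lets the chain continue. */
int sk_ok_cb(int *calls)
{
    (*calls)++;
    return SK_NOTIFY_OK;
}
```

Placing `sk_stopping_cb` ahead of `sk_ok_cb` means the second callback never sees the event, which mirrors how a lower-priority handler can be starved.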