Skip to content
  1. Jun 06, 2013
    • Peter Zijlstra's avatar
      arch, mm: Remove tlb_fast_mode() · 29eb7782
      Peter Zijlstra authored
      
      
      Since the introduction of preemptible mmu_gather TLB fast mode has been
      broken. TLB fast mode relies on there being absolutely no concurrency;
      it frees pages first and invalidates TLBs later.
      
      However now we can get concurrency and stuff goes *bang*.
      
      This patch removes all tlb_fast_mode() code; it was found the better
      option vs trying to patch the hole by entangling tlb invalidation with
      the scheduler.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Luck <tony.luck@intel.com>
      Reported-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      29eb7782
  2. May 24, 2013
    • Cliff Wickman's avatar
      mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas · a9ff785e
      Cliff Wickman authored
      
      
      A panic can be caused by simply cat'ing /proc/<pid>/smaps while an
      application has a VM_PFNMAP range.  It happened in-house when a
      benchmarker was trying to decipher the memory layout of his program.
      
      /proc/<pid>/smaps and similar walks through a user page table should not
      be looking at VM_PFNMAP areas.
      
      Certain tests in walk_page_range() (specifically split_huge_page_pmd())
      assume that all the mapped PFN's are backed with page structures.  And
      this is not usually true for VM_PFNMAP areas.  This can result in panics
      on kernel page faults when attempting to address those page structures.
      
      There are a half dozen callers of walk_page_range() that walk through a
      task's entire page table (as N.  Horiguchi pointed out).  So rather than
      change all of them, this patch changes just walk_page_range() to ignore
      VM_PFNMAP areas.
      
      The logic of hugetlb_vma() is moved back into walk_page_range(), as we
      want to test any vma in the range.
      
      VM_PFNMAP areas are used by:
      - graphics memory manager   gpu/drm/drm_gem.c
      - global reference unit     sgi-gru/grufile.c
      - sgi special memory        char/mspec.c
      - and probably several out-of-tree modules
      
      [akpm@linux-foundation.org: remove now-unused hugetlb_vma() stub]
      Signed-off-by: default avatarCliff Wickman <cpw@sgi.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Sterba <dsterba@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a9ff785e
    • Randy Dunlap's avatar
      mm/memory_hotplug.c: fix printk format warnings · 348f9f05
      Randy Dunlap authored
      
      
      Fix printk format warnings in mm/memory_hotplug.c by using "%pa":
      
        mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type 'resource_size_t' [-Wformat]
        mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'resource_size_t' [-Wformat]
      
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Reported-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      348f9f05
    • Aneesh Kumar K.V's avatar
      mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer · 7c342512
      Aneesh Kumar K.V authored
      
      
      We should not use set_pmd_at to update pmd_t with pgtable_t pointer.
      set_pmd_at is used to set pmd with huge pte entries and architectures
      like ppc64, clear few flags from the pte when saving a new entry.
      Without this change we observe bad pte errors like below on ppc64 with
      THP enabled.
      
        BUG: Bad page map in process ld mm=0xc000001ee39f4780 pte:7fc3f37848000001 pmd:c000001ec0000000
      
      Signed-off-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Reviewed-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7c342512
    • Leonid Yegoshin's avatar
      mm compaction: fix of improper cache flush in migration code · c2cc499c
      Leonid Yegoshin authored
      
      
      Page 'new' during MIGRATION can't be flushed with flush_cache_page().
      Using flush_cache_page(vma, addr, pfn) is justified only if the page is
      already placed in process page table, and that is done right after
      flush_cache_page().  But without it the arch function has no knowledge
      of process PTE and does nothing.
      
      Besides that, flush_cache_page() flushes an application cache page, but
      the kernel has a different page virtual address and dirtied it.
      
      Replace it with flush_dcache_page(new) which is the proper usage.
      
      The old page is flushed in try_to_unmap_one() before migration.
      
      This bug takes place in Sead3 board with M14Kc MIPS CPU without cache
      aliasing (but Harvard arch - separate I and D cache) in tight memory
      environment (128MB) each 1-3days on SOAK test.  It fails in cc1 during
      kernel build (SIGILL, SIGBUS, SIGSEG) if CONFIG_COMPACTION is switched
      ON.
      
      Signed-off-by: default avatarLeonid Yegoshin <Leonid.Yegoshin@imgtec.com>
      Cc: Leonid Yegoshin <yegoshin@mips.com>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Acked-by: default avatarMel Gorman <mgorman@suse.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c2cc499c
    • Johannes Weiner's avatar
      mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge · 28ccddf7
      Johannes Weiner authored
      
      
      Commit 0c59b89c ("mm: memcg: push down PageSwapCache check into
      uncharge entry functions") added a VM_BUG_ON() on PageSwapCache in the
      uncharge path after checking that page flag once, assuming that the
      state is stable in all paths, but this is not the case and the condition
      triggers in user environments.  An uncharge after the last page table
      reference to the page goes away can race with reclaim adding the page to
      swap cache.
      
      Swap cache pages are usually uncharged when they are freed after
      swapout, from a path that also handles swap usage accounting and memcg
      lifetime management.  However, since the last page table reference is
      gone and thus no references to the swap slot left, the swap slot will be
      freed shortly when reclaim attempts to write the page to disk.  The
      whole swap accounting is not even necessary.
      
      So while the race condition for which this VM_BUG_ON was added is real
      and actually existed all along, there are no negative effects.  Remove
      the VM_BUG_ON again.
      
      Reported-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Reported-by: default avatarLingzhu Xiang <lxiang@redhat.com>
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      28ccddf7
    • Xiao Guangrong's avatar
      mm: mmu_notifier: re-fix freed page still mapped in secondary MMU · d34883d4
      Xiao Guangrong authored
      
      
      Commit 751efd86 ("mmu_notifier_unregister NULL Pointer deref and
      multiple ->release()") breaks the fix 3ad3d901 ("mm: mmu_notifier:
      fix freed page still mapped in secondary MMU").
      
      Since hlist_for_each_entry_rcu() is changed now, we can not revert that
      patch directly, so this patch reverts the commit and simply fix the bug
      spotted by that patch
      
      This bug spotted by commit 751efd86 is:
      
          There is a race condition between mmu_notifier_unregister() and
          __mmu_notifier_release().
      
          Assume two tasks, one calling mmu_notifier_unregister() as a result
          of a filp_close() ->flush() callout (task A), and the other calling
          mmu_notifier_release() from an mmput() (task B).
      
                              A                               B
          t1                                            srcu_read_lock()
          t2            if (!hlist_unhashed())
          t3                                            srcu_read_unlock()
          t4            srcu_read_lock()
          t5                                            hlist_del_init_rcu()
          t6                                            synchronize_srcu()
          t7            srcu_read_unlock()
          t8            hlist_del_rcu()  <--- NULL pointer deref.
      
      This can be fixed by using hlist_del_init_rcu instead of hlist_del_rcu.
      
      The another issue spotted in the commit is "multiple ->release()
      callouts", we needn't care it too much because it is really rare (e.g,
      can not happen on kvm since mmu-notify is unregistered after
      exit_mmap()) and the later call of multiple ->release should be fast
      since all the pages have already been released by the first call.
      Anyway, this issue should be fixed in a separate patch.
      
      -stable suggestions: Any version that has commit 751efd86 need to be
      backported.  I find the oldest version has this commit is 3.0-stable.
      
      [akpm@linux-foundation.org: tweak comments]
      Signed-off-by: default avatarXiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Tested-by: default avatarRobin Holt <holt@sgi.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d34883d4
  3. May 22, 2013
    • Ralf Baechle's avatar
      mm: Fix virt_to_page() warning · bb3ec6b0
      Ralf Baechle authored
      
      
      virt_to_page() is typically implemented as a macro containing a cast so
      that it will accept both pointers and unsigned long without causing a
      warning.
      
      But MIPS virt_to_page() uses virt_to_phys which is a function so passing
      an unsigned long will cause a warning:
      
          CC      mm/page_alloc.o
        mm/page_alloc.c: In function ‘free_reserved_area’:
        mm/page_alloc.c:5161:3: warning: passing argument 1 of ‘virt_to_phys’ makes pointer from integer without a cast [enabled by default]
        arch/mips/include/asm/io.h:119:100: note: expected ‘const volatile void *’ but argument is of type ‘long unsigned int’
      
      All others users of virt_to_page() in mm/ are passing a void *.
      
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Reported-by: default avatarEunbong Song <eunb.song@samsung.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bb3ec6b0
  4. May 09, 2013
  5. May 08, 2013
  6. May 06, 2013
  7. May 01, 2013
  8. Apr 30, 2013
  9. Apr 29, 2013
Loading