  1. Aug 05, 2016
  2. Jul 31, 2016
    • Rich Felker's avatar
      sh: fix build regression with CONFIG_OF && !CONFIG_OF_FLATTREE · 03767daa
      Rich Felker authored
      
      
      Such a configuration could only be selected by manually selecting
      CONFIG_OF; SH_DEVICE_TREE selects both. The affected code is using the
      flat DTB at boot time and thus rightfully should depend on
      OF_FLATTREE, not just OF.
      
      Signed-off-by: Rich Felker <dalias@libc.org>
      03767daa
    • Rich Felker's avatar
      sh: allow clocksource drivers to register sched_clock backends · b46ed370
      Rich Felker authored
      
      
      There is no arch-specific sched_clock implementation for sh, resulting
      in use of the old default jiffies-based implementation. Instead, use
      the modern generic sched_clock framework so that drivers can register
      better backends.
      
      Signed-off-by: Rich Felker <dalias@libc.org>
      b46ed370
    • Paul Gortmaker's avatar
      sh: make heartbeat driver explicitly non-modular · e75438e2
      Paul Gortmaker authored
      
      
      The Kconfig for this driver is currently:
      
      config HEARTBEAT
              bool "Heartbeat LED"
      
      ...meaning that it currently is not being built as a module by anyone.
      Let's remove the modular code that is essentially orphaned, so that
      when reading the driver there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.
      
      We explicitly disallow a driver unbind, since that doesn't have a
      sensible use case anyway, and it allows us to drop the ".remove"
      code for non-modular drivers.
      
      We also delete the MODULE_LICENSE tag etc. since all that information
      is already contained at the top of the file in the comments.
      
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-sh@vger.kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Rich Felker <dalias@libc.org>
      e75438e2
    • Paul Gortmaker's avatar
      sh: make board-secureedge5410 explicitly non-modular · f368d475
      Paul Gortmaker authored
      
      
      The Kconfig currently controlling compilation of this code is:
      
      config SH_SECUREEDGE5410
              bool "SecureEdge5410"
      
      ...meaning that it currently is not being built as a module by anyone.

      Let's remove the couple traces of modularity so that when reading the
      driver there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.
      
      We don't replace module.h with init.h since the file already has that.
      
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-sh@vger.kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Rich Felker <dalias@libc.org>
      f368d475
    • Paul Gortmaker's avatar
      sh: make mm/asids-debugfs explicitly non-modular · f15412aa
      Paul Gortmaker authored
      
      
      The Makefile/Kconfig currently controlling compilation of this code is:
      
      obj-$(CONFIG_DEBUG_FS)          += $(debugfs-y)
      debugfs-y                       := asids-debugfs.o
      
      lib/Kconfig.debug:config DEBUG_FS
      lib/Kconfig.debug:      bool "Debug Filesystem"
      
      ...meaning that it currently is not being built as a module by anyone.

      Let's remove the couple traces of modular code, so that when reading the
      driver there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.
      
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-sh@vger.kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Rich Felker <dalias@libc.org>
      f15412aa
    • Paul Gortmaker's avatar
      sh: make time.c explicitly non-modular · 7a65a34f
      Paul Gortmaker authored
      
      
      The Makefile currently controlling compilation of this code is:
      
      obj-y   := debugtraps.o dma-nommu.o dumpstack.o                 \
      [...]
                 syscalls_$(BITS).o time.o topology.o traps.o         \
                 traps_$(BITS).o unwinder.o
      
      ...meaning that it currently is not being built as a module by anyone.

      Let's remove the couple traces of modular code, so that when reading
      the driver there is no doubt it is builtin-only.
      
      Since module_init translates to device_initcall in the non-modular
      case, the init ordering remains unchanged with this commit.
      
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: linux-sh@vger.kernel.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Rich Felker <dalias@libc.org>
      7a65a34f
    • Rich Felker's avatar
      sh: fix futex/robust_list on nommu models · 72cc564f
      Rich Felker authored
      
      
      The futex cmpxchg runtime testing in kernel/futex.c depends on
      accesses to address 0 producing EFAULT, which obviously does not work
      on nommu. Since SH always has cmpxchg, disable the broken runtime
      detection.
      
      At some point this should be fixed at the kernel/futex.c level. UP
      machines can always provide a working cmpxchg with interrupt masking,
      and SMP cannot function without a working cmpxchg anyway.
      
      Signed-off-by: Rich Felker <dalias@libc.org>
      72cc564f
    • Rich Felker's avatar
      sh: disable aliased page logic on NOMMU models · 57155c65
      Rich Felker authored
      
      
      SH3/4 (with MMU) have a virtually indexed cache, requiring explicit
      work to avoid consistency problems arising from having the same
      physical address range cached in multiple cache lines. This is
      unneeded for the NOMMU case, and some of the resulting code paths
      (kmap_coherent) don't work. SH2 only avoided this problem by having a
      4-way associative cache with way size equal to the page size (4k),
      yielding no cache index bits outside of the page offset and thus no
      aliases.
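
      The aliasing argument above reduces to simple arithmetic. The
      following is a minimal sketch, not kernel code: the function name
      is hypothetical, and it only counts how many distinct cache
      "colors" a page can map to in a virtually indexed cache for a
      given way size and page size.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of distinct cache "colors" a page can
 * occupy in a virtually indexed cache. Index bits above the page offset
 * are what create aliases; when the way size does not exceed the page
 * size there are no such bits and the result is 1 (no aliasing).
 */
unsigned long cache_alias_colors(unsigned long way_size,
				 unsigned long page_size)
{
	assert(page_size != 0);
	if (way_size <= page_size)
		return 1;	/* all index bits lie within the page offset */
	return way_size / page_size;
}
```

      With a 4k way and 4k pages, as described for SH2, this yields a
      single color. A larger way size (16k is used in the test purely as
      an illustrative value) would yield several.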
      
      Signed-off-by: Rich Felker <dalias@libc.org>
      57155c65
    • Rich Felker's avatar
      sh: make sigcontext definition consistent across fpu/nofpu models · bbe6c778
      Rich Felker authored
      
      
      Up until now, the SH version of the sigcontext structure, and thus
      mcontext_t/ucontext_t, varied depending on the cpu model the kernel
      was built to run on. SH-4 (including SH-4A) and SH-2A used the form
      with space for FPU registers, and everything else used a form that
      omitted them.
      
      From a userspace perspective, however, the structure layout must be
      fixed for a given ABI. Traditionally glibc and uClibc used the form
      with space for FPU registers only when __SH4__ (which implies FPU;
      __SH4_NOFPU__ is the predefined macro for SH-4 but with no-FPU ABI)
      was defined. As a result:
      
      - SH-4 no-FPU programs never matched kernel sigcontext.
      
      - SH-3 programs did not match kernel sigcontext if run on SH-4,
        despite an apparent intent that they be compatible.
      
      - SH-2 and SH-2A programs (using uClibc) did not match kernel
        sigcontext if run on SH-2A.
      
      The mismatch might seem inconsequential because it occurs at the end
      of the sigcontext structure, but sigcontext is embedded as uc_mcontext
      in ucontext_t, where it is followed by uc_sigmask, an important member
      for signal handlers to have access to. In particular, access to
      uc_sigmask is necessary for a correct implementation of thread
      cancellation.
      
      It would be possible to retain support for both sigcontext ABIs via a
      personality mechanism, but since many configurations were already
      broken and nobody noticed, and since there are very few if any users
      of legacy no-FPU models anymore, I have opted to just remove the
      variation and always include space for the FPU registers in
      sigcontext. This was proposed and discussed on a thread "SH sigcontext
      ABI is broken" cross-posted to linux-sh, libc-alpha, and musl libc
      lists in June 2015, and no objections were raised.
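
      To illustrate why a size difference at the end of sigcontext
      matters, here is a rough sketch using simplified, hypothetical
      structures. The field names and register counts are made up for
      illustration and do not reproduce the real SH headers. It only
      shows that uc_sigmask moves when the embedded structure grows.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical layouts; not the real SH definitions. */
struct sigcontext_nofpu {
	unsigned long gregs[16];
};

struct sigcontext_fpu {
	unsigned long gregs[16];
	unsigned long fpregs[16];	/* extra space at the end */
};

struct ucontext_nofpu {
	struct sigcontext_nofpu uc_mcontext;
	unsigned long uc_sigmask;
};

struct ucontext_fpu {
	struct sigcontext_fpu uc_mcontext;
	unsigned long uc_sigmask;
};

static_assert(sizeof(struct sigcontext_fpu) > sizeof(struct sigcontext_nofpu),
	      "the FPU form must be the larger of the two");

/* Non-zero when the two views disagree about where uc_sigmask lives. */
int sigmask_offsets_differ(void)
{
	return offsetof(struct ucontext_nofpu, uc_sigmask) !=
	       offsetof(struct ucontext_fpu, uc_sigmask);
}
```

      A signal handler built against one layout and run on a kernel
      using the other would read uc_sigmask from the wrong offset.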
      
      Signed-off-by: Rich Felker <dalias@libc.org>
      bbe6c778
    • Pan Xinhui's avatar
      sh: cmpxchg: fix a bit shift bug in big_endian os · ff18143c
      Pan Xinhui authored
      
      
      Correct the bitoff calculation on big-endian systems.
      The current code works correctly for 1-byte operands but not for
      2-byte operands.
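
      The message does not show the corrected expression. As a rough
      illustration, assuming the sub-word operand is emulated inside an
      aligned 32-bit word, the shift on a big-endian layout has to take
      the operand size into account. The helper names below are
      hypothetical and are not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: bit shift of a `size`-byte operand located at
 * byte offset `off` inside an aligned 32-bit word, big-endian layout.
 */
unsigned int bitoff_be(size_t off, size_t size)
{
	assert(size == 1 || size == 2);
	assert(off + size <= sizeof(uint32_t));
	return (unsigned int)((sizeof(uint32_t) - size - off) * 8);
}

/*
 * A shift that implicitly assumes a 1-byte operand. Shown only for
 * comparison; it is an assumption about what a size-unaware formula
 * looks like, not a quote from the source.
 */
unsigned int bitoff_be_byte_only(size_t off)
{
	assert(off < sizeof(uint32_t));
	return (unsigned int)((sizeof(uint32_t) - 1 - off) * 8);
}
```

      For a 2-byte operand at offset 0 the size-aware shift is 16,
      whereas the byte-only formula gives 24 and would select the wrong
      bits.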
      
      Fixes: 3226aad8 ("sh: support 1 and 2 byte xchg")
      Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Rich Felker <dalias@libc.org>
      ff18143c
  3. Jul 29, 2016
    • Nitin Gupta's avatar
      sparc64: Trim page tables for 8M hugepages · 7bc3777c
      Nitin Gupta authored
      
      
      For PMD-aligned (8M) hugepages, we currently allocate
      all four page table levels, which is wasteful. We now
      allocate only down to the PMD level, which reduces the
      memory used by page tables.

      Also, when freeing the page tables for an 8M hugepage-backed region,
      make sure we don't try to access the non-existent PTE level.
      
      Orabug: 22630259
      
      Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7bc3777c
    • Josh Poimboeuf's avatar
      x86/power/64: Fix hibernation return address corruption · 4ce827b4
      Josh Poimboeuf authored
      In kernel bug 150021, a kernel panic was reported when restoring a
      hibernate image.  Only a picture of the oops was reported, so I can't
      paste the whole thing here.  But here are the most interesting parts:
      
        kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
        BUG: unable to handle kernel paging request at ffff8804615cfd78
        ...
        RIP: ffff8804615cfd78
        RSP: ffff8804615f0000
        RBP: ffff8804615cfdc0
        ...
        Call Trace:
         do_signal+0x23
         exit_to_usermode_loop+0x64
         ...
      
      The RIP is on the same page as RBP, so it apparently started executing
      on the stack.
      
      The bug was bisected to commit ef0f3ed5 (x86/asm/power: Create
      stack frames in hibernate_asm_64.S), which in retrospect seems quite
      dangerous, since that code saves and restores the stack pointer from a
      global variable ('saved_context').
      
      There are a lot of moving parts in the hibernate save and restore paths,
      so I don't know exactly what caused the panic.  Presumably, a FRAME_END
      was executed without the corresponding FRAME_BEGIN, or vice versa.  That
      would corrupt the return address on the stack and would be consistent
      with the details of the above panic.
      
      [ rjw: One major problem is that by the time the FRAME_BEGIN in
        restore_registers() is executed, the stack pointer value may not
        be valid any more.  Namely, the stack area pointed to by it
        previously may have been overwritten by some image memory contents
        and that page frame may now be used for whatever different purpose
        it had been allocated for before hibernation.  In that case, the
        FRAME_BEGIN will corrupt that memory. ]
      
      Instead of doing the frame pointer save/restore around the bounds of the
      affected functions, just do it around the call to swsusp_save().
      
      That has the same effect of ensuring that if swsusp_save() sleeps, the
      frame pointers will be correct.  It's also a much more obviously safe
      way to do it than the original patch.  And objtool still doesn't report
      any warnings.
      
      Fixes: ef0f3ed5 (x86/asm/power: Create stack frames in hibernate_asm_64.S)
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021
      
      
      Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
      Reported-by: Andre Reinke <andre.reinke@mailbox.org>
      Tested-by: Andre Reinke <andre.reinke@mailbox.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      4ce827b4
    • Dan Carpenter's avatar
      avr32: off by one in at32_init_pio() · 55f1cf83
      Dan Carpenter authored
      
      
      The pio_dev[] array has MAX_NR_PIO_DEVICES elements so the > should be
      >=.
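
      As a minimal sketch of the corrected bounds check (the constant's
      value and the function name here are illustrative assumptions,
      not taken from the source):

```c
#define MAX_NR_PIO_DEVICES 8	/* illustrative value only */

/*
 * An array of MAX_NR_PIO_DEVICES elements has valid indices
 * 0 .. MAX_NR_PIO_DEVICES - 1, so an index equal to the element count
 * must already be rejected. Testing with > would let it through.
 */
int pio_id_out_of_range(unsigned int pio_id)
{
	return pio_id >= MAX_NR_PIO_DEVICES;
}
```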
      
      Fixes: 5f97f7f9 ('[PATCH] avr32 architecture')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      55f1cf83
    • Hans-Christian Noren Egtvedt's avatar
      avr32: fixup code style in unistd.h and syscall_table.S · 6ad4a21b
      Hans-Christian Noren Egtvedt authored
      This patch replaces the mix of tabs and spaces used to align comments
      after code with spaces only.

      Also document why recvmmsg was defined twice in the syscall_table.S
      table, but only once in unistd.h. In short, it was wired into the table
      by a generic arch patch, but forgotten in unistd.h (a review slip).
      6ad4a21b
    • Hans-Christian Noren Egtvedt's avatar
      avr32: wire up preadv2 and pwritev2 syscalls · 389ce5a9
      Hans-Christian Noren Egtvedt authored
      
      
      This patch wires up the new preadv2 and pwritev2 syscalls on AVR32.
      
      On AVR32, all parameters beyond the 5th are passed on the stack. System
      calls don't use the stack -- they borrow a callee-saved register
      instead. This means that syscalls that take 6 parameters must be called
      through a stub that pushes the last parameter on the stack.
      
      Signed-off-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      389ce5a9
    • Mike Kravetz's avatar
      sparc64 mm: Fix base TSB sizing when hugetlb pages are used · af1b1a9b
      Mike Kravetz authored
      
      
      do_sparc64_fault() calculates both the base and huge page RSS sizes and
      uses this information in calls to tsb_grow().  The calculation for base
      page TSB size is not correct if the task uses hugetlb pages.  hugetlb
      pages are not accounted for in RSS, therefore the call to get_mm_rss(mm)
      does not include hugetlb pages.  However, the number of pages based on
      huge_pte_count (which does include hugetlb pages) is subtracted from
      this value.  This will result in an artificially small and often negative
      RSS calculation.  The base TSB size is then often set to max_tsb_size
      as the passed RSS is unsigned, so a negative value looks really big.
      
      THP pages are also accounted for in huge_pte_count, and THP pages are
      accounted for in RSS so the calculation in do_sparc64_fault() is correct
      if a task only uses THP pages.
      
      A single huge_pte_count is not sufficient for TSB sizing if both hugetlb
      and THP pages can be used.  Instead of a single counter, use two:  one
      for hugetlb and one for THP.
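
      The wraparound described above can be reproduced in isolation.
      This is a sketch under simplifying assumptions: the function and
      parameter names are hypothetical stand-ins, and all counts are
      treated as already expressed in base pages.

```c
/*
 * Single-counter form: huge_pte_count may include hugetlb pages that
 * were never part of mm_rss, so the unsigned subtraction can wrap
 * around to a very large value.
 */
unsigned long base_rss_single_counter(unsigned long mm_rss,
				      unsigned long huge_pte_count)
{
	return mm_rss - huge_pte_count;
}

/*
 * Split-counter form: only THP pages are accounted in mm_rss, so only
 * the THP count is subtracted from it.
 */
unsigned long base_rss_split_counter(unsigned long mm_rss,
				     unsigned long thp_pte_count)
{
	return mm_rss - thp_pte_count;
}
```

      With an RSS of 10 and a combined huge count of 12, the first form
      returns a value near the top of the unsigned range, which is how
      the base TSB ends up sized at the maximum.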
      
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      af1b1a9b
  4. Jul 28, 2016
    • Dennis Chen's avatar
      arm64:acpi: fix the acpi alignment exception when 'mem=' specified · cb0a6502
      Dennis Chen authored
      When booting an ACPI enabled kernel with 'mem=x', there is the
      possibility that ACPI data regions from the firmware will lie above the
      memory limit.  Ordinarily these will be removed by
      memblock_enforce_memory_limit(.).
      
      Unfortunately, this means that these regions will then be mapped by
      acpi_os_ioremap(.) as device memory (instead of normal) thus unaligned
      accesses will then provoke alignment faults.
      
      In this patch we adopt memblock_mem_limit_remove_map instead, and this
      preserves these ACPI data regions (marked NOMAP) thus ensuring that
      these regions are not mapped as device memory.
      
      For example, below is an alignment exception observed on ARM platform
      when booting the kernel with 'acpi=on mem=8G':
      
        ...
        Unable to handle kernel paging request at virtual address ffff0000080521e7
        pgd = ffff000008aa0000
        [ffff0000080521e7] *pgd=000000801fffe003, *pud=000000801fffd003, *pmd=000000801fffc003, *pte=00e80083ff1c1707
        Internal error: Oops: 96000021 [#1] PREEMPT SMP
        Modules linked in:
        CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc3-next-20160616+ #172
        Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1001A 02/09/2016
        task: ffff800001ef0000 ti: ffff800001ef8000 task.ti: ffff800001ef8000
        PC is at acpi_ns_lookup+0x520/0x734
        LR is at acpi_ns_lookup+0x4a4/0x734
        pc : [<ffff0000083b8b10>] lr : [<ffff0000083b8a94>] pstate: 60000045
        sp : ffff800001efb8b0
        x29: ffff800001efb8c0 x28: 000000000000001b
        x27: 0000000000000001 x26: 0000000000000000
        x25: ffff800001efb9e8 x24: ffff000008a10000
        x23: 0000000000000001 x22: 0000000000000001
        x21: ffff000008724000 x20: 000000000000001b
        x19: ffff0000080521e7 x18: 000000000000000d
        x17: 00000000000038ff x16: 0000000000000002
        x15: 0000000000000007 x14: 0000000000007fff
        x13: ffffff0000000000 x12: 0000000000000018
        x11: 000000001fffd200 x10: 00000000ffffff76
        x9 : 000000000000005f x8 : ffff000008725fa8
        x7 : ffff000008a8df70 x6 : ffff000008a8df70
        x5 : ffff000008a8d000 x4 : 0000000000000010
        x3 : 0000000000000010 x2 : 000000000000000c
        x1 : 0000000000000006 x0 : 0000000000000000
        ...
          acpi_ns_lookup+0x520/0x734
          acpi_ds_load1_begin_op+0x174/0x4fc
          acpi_ps_build_named_op+0xf8/0x220
          acpi_ps_create_op+0x208/0x33c
          acpi_ps_parse_loop+0x204/0x838
          acpi_ps_parse_aml+0x1bc/0x42c
          acpi_ns_one_complete_parse+0x1e8/0x22c
          acpi_ns_parse_table+0x8c/0x128
          acpi_ns_load_table+0xc0/0x1e8
          acpi_tb_load_namespace+0xf8/0x2e8
          acpi_load_tables+0x7c/0x110
          acpi_init+0x90/0x2c0
          do_one_initcall+0x38/0x12c
          kernel_init_freeable+0x148/0x1ec
          kernel_init+0x10/0xec
          ret_from_fork+0x10/0x40
        Code: b9009fbc 2a00037b 36380057 3219037b (b9400260)
        ---[ end trace 03381e5eb0a24de4 ]---
        Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      
      With 'efi=debug', we can see those ACPI regions loaded by firmware on
      that board as:
      
        efi:   0x0083ff185000-0x0083ff1b4fff [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
        efi:   0x0083ff1b5000-0x0083ff1c2fff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
        efi:   0x0083ff223000-0x0083ff224fff [ACPI Memory NVS    |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
      
      Link: http://lkml.kernel.org/r/1468475036-5852-3-git-send-email-dennis.chen@arm.com
      
      
      Acked-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Dennis Chen <dennis.chen@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Kaly Xin <kaly.xin@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cb0a6502
    • Mel Gorman's avatar
      mm: move most file-based accounting to the node · 11fb9989
      Mel Gorman authored
      There are now a number of accounting oddities such as mapped file pages
      being accounted for on the node while the total number of file pages is
      accounted on the zone.  This can be coped with to some extent but it's
      confusing so this patch moves the relevant file-based accounting to the
      node.  Due to
      throttling logic in the page allocator for reliable OOM detection, it is
      still necessary to track dirty and writeback pages on a per-zone basis.
      
      [mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
        Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
      Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
      
      
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      11fb9989
    • Mel Gorman's avatar
      mm: move page mapped accounting to the node · 50658e2e
      Mel Gorman authored
      Reclaim makes decisions based on the number of pages that are mapped but
      it's mixing node and zone information.  Account NR_FILE_MAPPED and
      NR_ANON_PAGES pages on the node.
      
      Link: http://lkml.kernel.org/r/1467970510-21195-18-git-send-email-mgorman@techsingularity.net
      
      
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      50658e2e
    • Mel Gorman's avatar
      mm, vmscan: move LRU lists to node · 599d0c95
      Mel Gorman authored
      This moves the LRU lists from the zone to the node and related data such
      as counters, tracing, congestion tracking and writeback tracking.
      
      Unfortunately, due to reclaim and compaction retry logic, it is
      necessary to account for the number of LRU pages on both the zone and
      the node.  Most reclaim logic is based on the node counters but the retry
      logic uses the zone counters which do not distinguish inactive and
      active sizes.  It would be possible to leave the LRU counters on a
      per-zone basis but it's a heavier calculation across multiple cache
      lines that is much more frequent than the retry checks.
      
      Other than the LRU counters, this is mostly a mechanical patch but note
      that it introduces a number of anomalies.  For example, the scans are
      per-zone but using per-node counters.  We also mark a node as congested
      when a zone is congested.  This causes weird problems that are fixed
      later but is easier to review.
      
      In the event that there is excessive overhead on 32-bit systems due to
      the nodes being on LRU then there are two potential solutions
      
      1. Long-term isolation of highmem pages when reclaim is lowmem
      
         When pages are skipped, they are immediately added back onto the LRU
         list. If lowmem reclaim persisted for long periods of time, the same
         highmem pages get continually scanned. The idea would be that lowmem
         keeps those pages on a separate list until a reclaim for highmem pages
         arrives that splices the highmem pages back onto the LRU. It potentially
         could be implemented similar to the UNEVICTABLE list.
      
         That would reduce the skip rate, with the potential corner case being that
         highmem pages have to be scanned and reclaimed to free lowmem slab pages.
      
      2. Linear scan lowmem pages if the initial LRU shrink fails
      
         This will break LRU ordering but may be preferable and faster during
         memory pressure than skipping LRU pages.
      
      Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.net
      
      
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      599d0c95
    • Vineet Gupta's avatar
      ARC: mm: don't loose PTE_SPECIAL in pte_modify() · 3925a16a
      Vineet Gupta authored
      
      
      LTP madvise05 was generating mm splat
      
      | [ARCLinux]# /sd/ltp/testcases/bin/madvise05
      | BUG: Bad page map in process madvise05  pte:80e08211 pmd:9f7d4000
      | page:9fdcfc90 count:1 mapcount:-1 mapping:  (null) index:0x0 flags: 0x404(referenced|reserved)
      | page dumped because: bad pte
      | addr:200b8000 vm_flags:00000070 anon_vma:  (null) mapping:  (null) index:1005c
      | file:  (null) fault:  (null) mmap:  (null) readpage:  (null)
      | CPU: 2 PID: 6707 Comm: madvise05
      
      And for newer kernels, the system was rendered unusable afterwards.
      
      The problem was mprotect->pte_modify() clearing PTE_SPECIAL (which is
      set to identify the special zero page wired to the pte).
      When pte was finally unmapped, special casing for zero page was not
      done, and instead it was treated as a "normal" page, tripping on the
      map counts etc.
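
      The mechanism can be sketched as follows. The bit assignments and
      names below are hypothetical and chosen only for illustration;
      they are not ARC's real definitions. The point is that a
      pte_modify()-style helper keeps only the bits named in a
      "change mask", so a flag missing from that mask is silently
      dropped.

```c
typedef unsigned long pteval_t;

/* Hypothetical bit assignments for illustration only. */
#define PTE_PRESENT	(1UL << 0)
#define PTE_WRITE	(1UL << 1)
#define PTE_SPECIAL	(1UL << 2)
#define PTE_PFN_MASK	(~0xfffUL)

/*
 * Bits that survive a protection change. Leaving PTE_SPECIAL out of
 * this mask is the kind of omission that would lose the flag.
 */
#define PTE_CHG_MASK	(PTE_PFN_MASK | PTE_SPECIAL)

/* Keep the preserved bits of the old pte and apply the new protection. */
pteval_t pte_modify_sketch(pteval_t pte, pteval_t newprot)
{
	return (pte & PTE_CHG_MASK) | newprot;
}
```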
      
      This fixes ARC STAR 9001053308
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      3925a16a
    • Dan Carpenter's avatar
      sparc32: off by ones in BUG_ON() · fa160828
      Dan Carpenter authored
      
      
      Smatch complains that these tests are off by one, which is true but not
      life threatening.
      
      	arch/sparc/kernel/irq_32.c:169 irq_link()
      	error: buffer overflow 'irq_map' 384 <= 384
      
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa160828
    • David S. Miller's avatar
      sparc: Don't leak context bits into thread->fault_address · 4f6deb8c
      David S. Miller authored
      
      
      On pre-Niagara systems, we fetch the fault address on data TLB
      exceptions from the TLB_TAG_ACCESS register.  But this register also
      contains the context ID associated with the fault in the low 13 bits
      of the register value.
      
      This propagates into current_thread_info()->fault_address and can
      cause trouble later on.
      
      So clear the low 13 bits out of the TLB_TAG_ACCESS value in the cases
      where it matters.
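
      In C terms, the masking amounts to something like the sketch
      below. The macro and function names are hypothetical; the actual
      change is presumably made in the low-level trap handling code
      rather than in a C helper like this.

```c
#define CONTEXT_ID_BITS	13	/* low bits holding the context ID */

/* Strip the context ID, leaving only the fault address bits. */
unsigned long strip_context_bits(unsigned long tag_access)
{
	return tag_access & ~((1UL << CONTEXT_ID_BITS) - 1);
}
```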
      
      Reported-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f6deb8c
  5. Jul 27, 2016
  6. Jul 26, 2016