Skip to content
  1. May 02, 2017
  2. May 01, 2017
  3. Apr 30, 2017
  4. Apr 28, 2017
    • Baoquan He's avatar
      x86/KASLR: Fix kexec kernel boot crash when KASLR randomization fails · da63b6b2
      Baoquan He authored
      
      
      Dave found that a kdump kernel with KASLR enabled will reset to the BIOS
      immediately if physical randomization failed to find a new position for
      the kernel. A kernel with the 'nokaslr' option works in this case.
      
      The reason is that KASLR will install a new page table for the identity
      mapping, while it missed building it for the original kernel location
      if KASLR physical randomization fails.
      
      This only happens in the kexec/kdump kernel, because the identity mapping
      has been built for kexec/kdump in the 1st kernel for the whole memory by
      calling init_pgtable(). Here if physical randomizaiton fails, it won't build
      the identity mapping for the original area of the kernel but change to a
      new page table '_pgtable'. Then the kernel will triple fault immediately
      caused by no identity mappings.
      
      The normal kernel won't see this bug, because it comes here via startup_32()
      and CR3 will be set to _pgtable already. In startup_32() the identity
      mapping is built for the 0~4G area. In KASLR we just append to the existing
      area instead of entirely overwriting it for on-demand identity mapping
      building. So the identity mapping for the original area of kernel is still
      there.
      
      To fix it we just switch to the new identity mapping page table when physical
      KASLR succeeds. Otherwise we keep the old page table unchanged just like
      "nokaslr" does.
      
      Signed-off-by: default avatarBaoquan He <bhe@redhat.com>
      Signed-off-by: default avatarDave Young <dyoung@redhat.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/1493278940-5885-1-git-send-email-bhe@redhat.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      da63b6b2
  5. Apr 26, 2017
    • Al Viro's avatar
      2fefc97b
    • Al Viro's avatar
      CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now · 701cac61
      Al Viro authored
      
      
      all architectures converted
      
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      701cac61
    • Al Viro's avatar
      m32r: switch to RAW_COPY_USER · 9a677341
      Al Viro authored
      
      
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      9a677341
    • Andy Lutomirski's avatar
      x86/mm: Fix flush_tlb_page() on Xen · dbd68d8e
      Andy Lutomirski authored
      
      
      flush_tlb_page() passes a bogus range to flush_tlb_others() and
      expects the latter to fix it up.  native_flush_tlb_others() has the
      fixup but Xen's version doesn't.  Move the fixup to
      flush_tlb_others().
      
      AFAICS the only real effect is that, without this fix, Xen would
      flush everything instead of just the one page on remote vCPUs in
      when flush_tlb_page() was called.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: e7b52ffd ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
      Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      dbd68d8e
    • Andy Lutomirski's avatar
      x86/mm: Make flush_tlb_mm_range() more predictable · ce27374f
      Andy Lutomirski authored
      
      
      I'm about to rewrite the function almost completely, but first I
      want to get a functional change out of the way.  Currently, if
      flush_tlb_mm_range() does not flush the local TLB at all, it will
      never do individual page flushes on remote CPUs.  This seems to be
      an accident, and preserving it will be awkward.  Let's change it
      first so that any regressions in the rewrite will be easier to
      bisect and so that the rewrite can attempt to change no visible
      behavior at all.
      
      The fix is simple: we can simply avoid short-circuiting the
      calculation of base_pages_to_flush.
      
      As a side effect, this also eliminates a potential corner case: if
      tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
      could have ended up flushing the entire address space one page at a
      time.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Acked-by: default avatarDave Hansen <dave.hansen@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ce27374f
    • Andy Lutomirski's avatar
      x86/mm: Remove flush_tlb() and flush_tlb_current_task() · 29961b59
      Andy Lutomirski authored
      
      
      I was trying to figure out what how flush_tlb_current_task() would
      possibly work correctly if current->mm != current->active_mm, but I
      realized I could spare myself the effort: it has no callers except
      the unused flush_tlb() macro.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      29961b59
    • Andy Lutomirski's avatar
      x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly() · 9ccee237
      Andy Lutomirski authored
      
      
      mark_screen_rdonly() is the last remaining caller of flush_tlb().
      flush_tlb_mm_range() is potentially faster and isn't obsolete.
      
      Compile-tested only because I don't know whether software that uses
      this mechanism even exists.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9ccee237
    • Kirill A. Shutemov's avatar
      x86/mm/64: Fix crash in remove_pagetable() · e6ab9c4d
      Kirill A. Shutemov authored
      
      
      remove_pagetable() does page walk using p*d_page_vaddr() plus cast.
      It's not canonical approach -- we usually use p*d_offset() for that.
      
      It works fine as long as all page table levels are present. We broke the
      invariant by introducing folded p4d page table level.
      
      As result, remove_pagetable() interprets PMD as PUD and it leads to
      crash:
      
      	BUG: unable to handle kernel paging request at ffff880300000000
      	IP: memchr_inv+0x60/0x110
      	PGD 317d067
      	P4D 317d067
      	PUD 3180067
      	PMD 33f102067
      	PTE 8000000300000060
      
      Let's fix this by using p*d_offset() instead of p*d_page_vaddr() for
      page walk.
      
      Reported-by: default avatarDan Williams <dan.j.williams@intel.com>
      Tested-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Fixes: f2a6a705 ("x86: Convert the rest of the code to support p4d_t")
      Link: http://lkml.kernel.org/r/20170425092557.21852-1-kirill.shutemov@linux.intel.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e6ab9c4d
    • Josh Poimboeuf's avatar
      x86/unwind: Dump all stacks in unwind_dump() · 262fa734
      Josh Poimboeuf authored
      
      
      Currently unwind_dump() dumps only the most recently accessed stack.
      But it has a few issues.
      
      In some cases, 'first_sp' can get out of sync with 'stack_info', causing
      unwind_dump() to start from the wrong address, flood the printk buffer,
      and eventually read a bad address.
      
      In other cases, dumping only the most recently accessed stack doesn't
      give enough data to diagnose the error.
      
      Fix both issues by dumping *all* stacks involved in the trace, not just
      the last one.
      
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 8b5e99f0 ("x86/unwind: Dump stack data on warnings")
      Link: http://lkml.kernel.org/r/016d6a9810d7d1bfc87ef8c0e6ee041c6744c909.1493171120.git.jpoimboe@redhat.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      262fa734
    • Josh Poimboeuf's avatar
      x86/unwind: Silence more entry-code related warnings · b0d50c7b
      Josh Poimboeuf authored
      
      
      Borislav Petkov reported the following unwinder warning:
      
        WARNING: kernel stack regs at ffffc9000024fea8 in udevadm:92 has bad 'bp' value 00007fffc4614d30
        unwind stack type:0 next_sp:          (null) mask:0x6 graph_idx:0
        ffffc9000024fea8: 000055a6100e9b38 (0x55a6100e9b38)
        ffffc9000024feb0: 000055a6100e9b35 (0x55a6100e9b35)
        ffffc9000024feb8: 000055a6100e9f68 (0x55a6100e9f68)
        ffffc9000024fec0: 000055a6100e9f50 (0x55a6100e9f50)
        ffffc9000024fec8: 00007fffc4614d30 (0x7fffc4614d30)
        ffffc9000024fed0: 000055a6100eaf50 (0x55a6100eaf50)
        ffffc9000024fed8: 0000000000000000 ...
        ffffc9000024fee0: 0000000000000100 (0x100)
        ffffc9000024fee8: ffff8801187df488 (0xffff8801187df488)
        ffffc9000024fef0: 00007ffffffff000 (0x7ffffffff000)
        ffffc9000024fef8: 0000000000000000 ...
        ffffc9000024ff10: ffffc9000024fe98 (0xffffc9000024fe98)
        ffffc9000024ff18: 00007fffc4614d00 (0x7fffc4614d00)
        ffffc9000024ff20: ffffffffffffff10 (0xffffffffffffff10)
        ffffc9000024ff28: ffffffff811c6c1f (SyS_newlstat+0xf/0x10)
        ffffc9000024ff30: 0000000000000010 (0x10)
        ffffc9000024ff38: 0000000000000296 (0x296)
        ffffc9000024ff40: ffffc9000024ff50 (0xffffc9000024ff50)
        ffffc9000024ff48: 0000000000000018 (0x18)
        ffffc9000024ff50: ffffffff816b2e6a (entry_SYSCALL_64_fastpath+0x18/0xa8)
        ...
      
      It unwinded from an interrupt which came in right after entry code
      called into a C syscall handler, before it had a chance to set up the
      frame pointer, so regs->bp still had its user space value.
      
      Add a check to silence warnings in such a case, where an interrupt
      has occurred and regs->sp is almost at the end of the stack.
      
      Reported-by: default avatarBorislav Petkov <bp@suse.de>
      Tested-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: c32c47c6 ("x86/unwind: Warn on bad frame pointer")
      Link: http://lkml.kernel.org/r/c695f0d0d4c2cfe6542b90e2d0520e11eb901eb5.1493171120.git.jpoimboe@redhat.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b0d50c7b
  6. Apr 25, 2017
Loading