Skip to content
  1. Nov 23, 2017
    • Borislav Petkov's avatar
      x86/umip: Fix insn_get_code_seg_params()'s return value · e2a5dca7
      Borislav Petkov authored
      
      
      In order to save on redundant structs definitions
      insn_get_code_seg_params() was made to return two 4-bit values in a char
      but clang complains:
      
        arch/x86/lib/insn-eval.c:780:10: warning: implicit conversion from 'int' to 'char'
      	  changes value from 132 to -124 [-Wconstant-conversion]
                        return INSN_CODE_SEG_PARAMS(4, 8);
                        ~~~~~~ ^~~~~~~~~~~~~~~~~~~~~~~~~~
        ./arch/x86/include/asm/insn-eval.h:16:57: note: expanded from macro 'INSN_CODE_SEG_PARAMS'
        #define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4))
      
      Those two values do get picked apart afterwards the opposite way of how
      they were ORed so wrt to the LSByte, the return value is the same.
      
      But this function returns -EINVAL in the error case, which is an int. So
      make it return an int which is the native word size anyway and thus fix
      the clang warning.
      
      Reported-by: default avatarKees Cook <keescook@google.com>
      Reported-by: default avatarNick Desaulniers <nick.desaulniers@gmail.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: ricardo.neri-calderon@linux.intel.com
      Link: https://lkml.kernel.org/r/20171123091951.1462-1-bp@alien8.de
      e2a5dca7
    • Chao Fan's avatar
      x86/boot/KASLR: Remove unused variable · 69550d41
      Chao Fan authored
      
      
      There are two variables "rc" in mem_avoid_memmap. One at the top of the
      function and another one inside the while() loop. Drop the outer one as it
      is unused. Cleanup some whitespace damage while at it.
      
      Signed-off-by: default avatarChao Fan <fanc.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: gregkh@linuxfoundation.org
      Cc: n-horiguchi@ah.jp.nec.com
      Cc: keescook@chromium.org
      Link: https://lkml.kernel.org/r/20171123090847.15293-1-fanc.fnst@cn.fujitsu.com
      69550d41
    • Andy Lutomirski's avatar
      x86/entry/64: Add missing irqflags tracing to native_load_gs_index() · ca37e57b
      Andy Lutomirski authored
      
      
      Running this code with IRQs enabled (where dummy_lock is a spinlock):
      
      static void check_load_gs_index(void)
      {
      	/* This will fail. */
      	load_gs_index(0xffff);
      
      	spin_lock(&dummy_lock);
      	spin_unlock(&dummy_lock);
      }
      
      Will generate a lockdep warning.  The issue is that the actual write
      to %gs would cause an exception with IRQs disabled, and the exception
      handler would, as an inadvertent side effect, update irqflag tracing
      to reflect the IRQs-off status.  native_load_gs_index() would then
      turn IRQs back on and return with irqflag tracing still thinking that
      IRQs were off.  The dummy lock-and-unlock causes lockdep to notice the
      error and warn.
      
      Fix it by adding the missing tracing.
      
      Apparently nothing did this in a context where it mattered.  I haven't
      tried to find a code path that would actually exhibit the warning if
      appropriately nasty user code were running.
      
      I suspect that the security impact of this bug is very, very low --
      production systems don't run with lockdep enabled, and the warning is
      mostly harmless anyway.
      
      Found during a quick audit of the entry code to try to track down an
      unrelated bug that Ingo found in some still-in-development code.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ca37e57b
  2. Nov 22, 2017
    • Andrey Ryabinin's avatar
      x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow · f68d62a5
      Andrey Ryabinin authored
      [ Note, this commit is a cherry-picked version of:
      
          d17a1d97: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")
      
        ... for easier x86 entry code testing and back-porting. ]
      
      The KASAN shadow is currently mapped using vmemmap_populate() since that
      provides a semi-convenient way to map pages into init_top_pgt.  However,
      since that no longer zeroes the mapped pages, it is not suitable for
      KASAN, which requires zeroed shadow memory.
      
      Add kasan_populate_shadow() interface and use it instead of
      vmemmap_populate().  Besides, this allows us to take advantage of
      gigantic pages and use them to populate the shadow, which should save us
      some memory wasted on page tables and reduce TLB pressure.
      
      Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
      
      
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarPavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Bob Picco <bob.picco@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f68d62a5
    • Andy Lutomirski's avatar
      x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing · 548c3050
      Andy Lutomirski authored
      
      
      When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
      before it.  This means that users of entry_SYSCALL_64_after_hwframe()
      were responsible for invoking TRACE_IRQS_OFF, and the one and only
      user (Xen, added in the same commit) got it wrong.
      
      I think this would manifest as a warning if a Xen PV guest with
      CONFIG_DEBUG_LOCKDEP=y were used with context tracking.  (The
      context tracking bit is to cause lockdep to get invoked before we
      turn IRQs back on.)  I haven't tested that for real yet because I
      can't get a kernel configured like that to boot at all on Xen PV.
      
      Move TRACE_IRQS_OFF below the label.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: 8a9949bc ("x86/xen/64: Rearrange the SYSCALL entries")
      Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      548c3050
  3. Nov 21, 2017
    • Ricardo Neri's avatar
      x86/umip: Print a warning into the syslog if UMIP-protected instructions are used · fd11a649
      Ricardo Neri authored
      
      
      Print a rate-limited warning when a user-space program attempts to execute
      any of the instructions that UMIP protects (i.e., SGDT, SIDT, SLDT, STR
      and SMSW).
      
      This is useful, because when CONFIG_X86_INTEL_UMIP=y is selected and
      supported by the hardware, user space programs that try to execute such
      instructions will receive a SIGSEGV signal that they might not expect.
      
      In the specific cases for which emulation is provided (instructions SGDT,
      SIDT and SMSW in protected and virtual-8086 modes), no signal is
      generated. However, a warning is helpful to encourage updates in such
      programs to avoid the use of such instructions.
      
      Warnings are printed via a customized printk() function that also provides
      information about the program that attempted to use the affected
      instructions.
      
      Utility macros are defined to wrap umip_printk() for the error and warning
      kernel log levels.
      
      While here, replace an existing call to the generic rate-limited pr_err()
      with the new umip_pr_err().
      
      Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarRicardo Neri <ricardo.neri-calderon@linux.intel.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: ricardo.neri@intel.com
      Link: http://lkml.kernel.org/r/1511233476-17088-1-git-send-email-ricardo.neri-calderon@linux.intel.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      fd11a649
  4. Nov 17, 2017
  5. Nov 16, 2017
    • Craig Bergstrom's avatar
      x86/mm: Limit mmap() of /dev/mem to valid physical addresses · be62a320
      Craig Bergstrom authored
      
      
      One thing /dev/mem access APIs should verify is that there's no way
      that excessively large pfn's can leak into the high bits of the
      page table entry.
      
      In particular, if people can use "very large physical page addresses"
      through /dev/mem to set the bits past bit 58 - SOFTW4 and permission
      key bits and NX bit, that could *really* confuse the kernel.
      
      We had an earlier attempt:
      
        ce56a86e ("x86/mm: Limit mmap() of /dev/mem to valid physical addresses")
      
      ... which turned out to be too restrictive (breaking mem=... bootups for example) and
      had to be reverted in:
      
        90edaac6 ("Revert "x86/mm: Limit mmap() of /dev/mem to valid physical addresses"")
      
      This v2 attempt modifies the original patch and makes sure that mmap(/dev/mem)
      limits the pfns so that it at least fits in the actual pteval_t architecturally:
      
       - Make sure mmap_mem() actually validates that the offset fits in phys_addr_t
      
          ( This may be indirectly true due to some other check, but it's not
            entirely obvious. )
      
       - Change valid_mmap_phys_addr_range() to just use phys_addr_valid()
         on the top byte
      
          ( Top byte is sufficient, because mmap_mem() has already checked that
            it cannot wrap. )
      
       - Add a few comments about what the valid_phys_addr_range() vs.
         valid_mmap_phys_addr_range() difference is.
      
      Signed-off-by: default avatarCraig Bergstrom <craigb@google.com>
      [ Fixed the checks and added comments. ]
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [ Collected the discussion and patches into a commit. ]
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: Sean Young <sean@mess.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/CA+55aFyEcOMb657vWSmrM13OxmHxC-XxeBmNis=DwVvpJUOogQ@mail.gmail.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      be62a320
    • Kirill A. Shutemov's avatar
      x86/mm: Prevent non-MAP_FIXED mapping across DEFAULT_MAP_WINDOW border · 1e0f25db
      Kirill A. Shutemov authored
      
      
      In case of 5-level paging, the kernel does not place any mapping above
      47-bit, unless userspace explicitly asks for it.
      
      Userspace can request an allocation from the full address space by
      specifying the mmap address hint above 47-bit.
      
      Nicholas noticed that the current implementation violates this interface:
      
        If user space requests a mapping at the end of the 47-bit address space
        with a length which causes the mapping to cross the 47-bit border
        (DEFAULT_MAP_WINDOW), then the vma is partially in the address space
        below and above.
      
      Sanity check the mmap address hint so that start and end of the resulting
      vma are on the same side of the 47-bit border. If that's not the case fall
      back to the code path which ignores the address hint and allocate from the
      regular address space below 47-bit.
      
      To make the checks consistent, mask out the address hints lower bits
      (either PAGE_MASK or huge_page_mask()) instead of using ALIGN() which can
      push them up to the next boundary.
      
      [ tglx: Moved the address check to a function and massaged comment and
        	changelog ]
      
      Reported-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: linux-mm@kvack.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lkml.kernel.org/r/20171115143607.81541-1-kirill.shutemov@linux.intel.com
      1e0f25db
  6. Nov 14, 2017
  7. Nov 12, 2017
    • Xiaochen Shen's avatar
      x86/intel_rdt: Fix a silent failure when writing zero value schemata · 2244645a
      Xiaochen Shen authored
      
      
      Writing an invalid schemata with no domain values (e.g., "(L3|MB):"),
      results in a silent failure, i.e. the last_cmd_status returns OK,
      
      Check for an empty value and set the result string with a proper error
      message and return -EINVAL.
      
      Before the fix:
       # mkdir /sys/fs/resctrl/p1
      
       # echo "L3:" > /sys/fs/resctrl/p1/schemata
       (silent failure)
       # cat /sys/fs/resctrl/info/last_cmd_status
       ok
      
       # echo "MB:" > /sys/fs/resctrl/p1/schemata
       (silent failure)
       # cat /sys/fs/resctrl/info/last_cmd_status
       ok
      
      After the fix:
       # mkdir /sys/fs/resctrl/p1
      
       # echo "L3:" > /sys/fs/resctrl/p1/schemata
       -bash: echo: write error: Invalid argument
       # cat /sys/fs/resctrl/info/last_cmd_status
       Missing 'L3' value
      
       # echo "MB:" > /sys/fs/resctrl/p1/schemata
       -bash: echo: write error: Invalid argument
       # cat /sys/fs/resctrl/info/last_cmd_status
       Missing 'MB' value
      
      [ Tony: This is an unintended side effect of the patch earlier to allow the
          	user to just write the value they want to change.  While allowing
          	user to specify less than all of the values, it also allows an
          	empty value. ]
      
      Fixes: c4026b7b ("x86/intel_rdt: Implement "update" mode when writing schemata file")
      Signed-off-by: default avatarXiaochen Shen <xiaochen.shen@intel.com>
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Link: https://lkml.kernel.org/r/20171110191624.20280-1-tony.luck@intel.com
      2244645a
  8. Nov 10, 2017
  9. Nov 09, 2017
Loading