  1. Aug 31, 2009
  2. Aug 23, 2009
  3. Aug 15, 2009
  4. Aug 04, 2009
  5. Aug 03, 2009
    • x86: fix assembly constraints in native_save_fl() · f1f029c7
      H. Peter Anvin authored
      
      
      From Gabe Black in bugzilla 13888:
      
      native_save_fl is implemented as follows:
      
        static inline unsigned long native_save_fl(void)
        {
                unsigned long flags;
        
                asm volatile("# __raw_save_flags\n\t"
                             "pushf ; pop %0"
                             : "=g" (flags)
                             : /* no input */
                             : "memory");
        
                return flags;
        }
      
      If gcc chooses to put flags on the stack, for instance because this is
      inlined into a larger function with more register pressure, the offset
      of the flags variable from the stack pointer will change when the
      pushf is performed. gcc doesn't attempt to understand that fact, and the
      address used for pop will still be the same. It will write to
      somewhere near flags on the stack but not actually into it and
      overwrite some other value.
      
      I saw this happen in the ide_device_add_all function when running in a
      simulator I work on. I'm assuming that some quirk of how the simulated
      hardware is set up caused the code path this is on to be executed when
      it normally wouldn't.
      
      A simple fix might be to change "=g" to "=r".
      
      Reported-by: Gabe Black <spamforgabe@umich.edu>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Stable Team <stable@kernel.org>
      f1f029c7
    • x86: Make 64-bit efi_ioremap use ioremap on MMIO regions · 6a7bbd57
      Paul Mackerras authored
      
      
      Booting current 64-bit x86 kernels on the latest Apple MacBook
      (MacBook5,2) via EFI gives the following warning:
      
      [    0.182209] ------------[ cut here ]------------
      [    0.182222] WARNING: at arch/x86/mm/pageattr.c:581 __cpa_process_fault+0x44/0xa0()
      [    0.182227] Hardware name: MacBook5,2
      [    0.182231] CPA: called for zero pte. vaddr = ffff8800ffe00000 cpa->vaddr = ffff8800ffe00000
      [    0.182236] Modules linked in:
      [    0.182242] Pid: 0, comm: swapper Not tainted 2.6.31-rc4 #6
      [    0.182246] Call Trace:
      [    0.182254]  [<ffffffff8102c754>] ? __cpa_process_fault+0x44/0xa0
      [    0.182261]  [<ffffffff81048668>] warn_slowpath_common+0x78/0xd0
      [    0.182266]  [<ffffffff81048744>] warn_slowpath_fmt+0x64/0x70
      [    0.182272]  [<ffffffff8102c7ec>] ? update_page_count+0x3c/0x50
      [    0.182280]  [<ffffffff818d25c5>] ? phys_pmd_init+0x140/0x22e
      [    0.182286]  [<ffffffff8102c754>] __cpa_process_fault+0x44/0xa0
      [    0.182292]  [<ffffffff8102ce60>] __change_page_attr_set_clr+0x5f0/0xb40
      [    0.182301]  [<ffffffff810d1035>] ? vm_unmap_aliases+0x175/0x190
      [    0.182307]  [<ffffffff8102d4ae>] change_page_attr_set_clr+0xfe/0x3d0
      [    0.182314]  [<ffffffff8102dcca>] _set_memory_uc+0x2a/0x30
      [    0.182319]  [<ffffffff8102dd4b>] set_memory_uc+0x7b/0xb0
      [    0.182327]  [<ffffffff818afe31>] efi_enter_virtual_mode+0x2ad/0x2c9
      [    0.182334]  [<ffffffff818a1c66>] start_kernel+0x2db/0x3f4
      [    0.182340]  [<ffffffff818a1289>] x86_64_start_reservations+0x99/0xb9
      [    0.182345]  [<ffffffff818a1389>] x86_64_start_kernel+0xe0/0xf2
      [    0.182357] ---[ end trace 4eaa2a86a8e2da22 ]---
      [    0.182982] init_memory_mapping: 00000000ffffc000-0000000100000000
      [    0.182993]  00ffffc000 - 0100000000 page 4k
      
      This happens because the 64-bit version of efi_ioremap calls
      init_memory_mapping for all addresses, regardless of whether they are
      RAM or MMIO.  The EFI tables on this machine ask for runtime access to
      some MMIO regions:
      
      [    0.000000] EFI: mem195: type=11, attr=0x8000000000000000, range=[0x0000000093400000-0x0000000093401000) (0MB)
      [    0.000000] EFI: mem196: type=11, attr=0x8000000000000000, range=[0x00000000ffc00000-0x00000000ffc40000) (0MB)
      [    0.000000] EFI: mem197: type=11, attr=0x8000000000000000, range=[0x00000000ffc40000-0x00000000ffc80000) (0MB)
      [    0.000000] EFI: mem198: type=11, attr=0x8000000000000000, range=[0x00000000ffc80000-0x00000000ffca4000) (0MB)
      [    0.000000] EFI: mem199: type=11, attr=0x8000000000000000, range=[0x00000000ffca4000-0x00000000ffcb4000) (0MB)
      [    0.000000] EFI: mem200: type=11, attr=0x8000000000000000, range=[0x00000000ffcb4000-0x00000000ffffc000) (3MB)
      [    0.000000] EFI: mem201: type=11, attr=0x8000000000000000, range=[0x00000000ffffc000-0x0000000100000000) (0MB)
      
      This arranges to pass the EFI memory type through to efi_ioremap, and
      makes efi_ioremap use ioremap rather than init_memory_mapping if the
      type is EFI_MEMORY_MAPPED_IO.  With this, the above warning goes away.
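      
      The dispatch described above can be sketched as follows. This is only
      an illustration: the enum and the function efi_choose_map_path are
      hypothetical names, and the value 11 for EFI_MEMORY_MAPPED_IO is taken
      from the "type=11" entries in the log above.

```c
#include <assert.h>

#define EFI_MEMORY_MAPPED_IO 11   /* matches "type=11" in the EFI log */

/* Hypothetical labels for the two mapping paths named in the text. */
enum efi_map_path {
    EFI_MAP_IOREMAP,               /* MMIO regions: go through ioremap */
    EFI_MAP_INIT_MEMORY_MAPPING,   /* everything else: init_memory_mapping */
};

/* Pick the mapping path from the EFI memory descriptor type. */
static enum efi_map_path efi_choose_map_path(unsigned int type)
{
    if (type == EFI_MEMORY_MAPPED_IO)
        return EFI_MAP_IOREMAP;
    return EFI_MAP_INIT_MEMORY_MAPPING;
}

static void efi_choose_map_path_demo(void)
{
    assert(efi_choose_map_path(11) == EFI_MAP_IOREMAP);
    assert(efi_choose_map_path(7) == EFI_MAP_INIT_MEMORY_MAPPING);
}
```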
      
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <19062.55858.533494.471153@cargo.ozlabs.ibm.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      6a7bbd57
  6. Jul 30, 2009
    • lguest: update commentry · a91d74a3
      Rusty Russell authored
      
      
      Every so often, after code shuffles, I need to go through and unbitrot
      the Lguest Journey (see drivers/lguest/README).  Since we now use RCU in
      a simple form in one place I took the opportunity to expand that explanation.
      
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      a91d74a3
    • lguest: fix comment style · 2e04ef76
      Rusty Russell authored
      
      
      I don't really notice it (except to begrudge the extra vertical
      space), but Ingo does.  And he pointed out that since one excuse for lguest
      is as a teaching tool, it should set a good example.
      
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      2e04ef76
  7. Jul 27, 2009
    • mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() · 9e1b32ca
      Benjamin Herrenschmidt authored
      
      
      Upcoming patches to support the new 64-bit "BookE" powerpc architecture
      will need to have the virtual address corresponding to the PTE page when
      freeing it, due to the way the HW table walker works.
      
      Basically, the TLB can be loaded with "large" pages that cover the whole
      virtual space (well, sort-of, half of it actually) represented by a PTE
      page, and which contain an "indirect" bit indicating that this TLB entry
      RPN points to an array of PTEs from which the TLB can then create direct
      entries. Thus, in order to invalidate those when PTE pages are deleted,
      we need the virtual address to pass to tlbilx or tlbivax instructions.
      
      The old trick of sticking it somewhere in the PTE page's struct page sucks
      too much; the address is almost readily available at all call sites and
      almost everybody implements these as macros, so we may as well add the
      argument everywhere. I added it to the pmd and pud variants for consistency.
      
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
      Acked-by: Nick Piggin <npiggin@suse.de>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9e1b32ca
  8. Jul 21, 2009
  9. Jul 17, 2009
  10. Jul 10, 2009
  11. Jul 04, 2009
    • x86: atomic64: Inline atomic64_read() again · a79f0da8
      Eric Dumazet authored
      
      
      Now that atomic64_read() is lightweight (no register pressure and
      small icache footprint), we can inline it again.
      
      Also use the "=&A" constraint instead of "+A" to avoid a warning
      about an uninitialized 'res' variable. (gcc had to force 0 into eax/edx)
      
        $ size vmlinux.prev vmlinux.after
           text    data     bss     dec     hex filename
        4908667  451676 1684868 7045211  6b805b vmlinux.prev
        4908651  451676 1684868 7045195  6b804b vmlinux.after
      
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <4A4E1AA2.30002@gmail.com>
      [ Also fix typo in atomic64_set() export ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a79f0da8
  12. Jul 03, 2009
    • x86: atomic64: Improve atomic64_xchg() · 3a8d1788
      Ingo Molnar authored
      
      
      Remove the read-first logic from atomic64_xchg() and simplify
      the loop.
      
      This function was the last user of __atomic64_read() - remove it.
      
      Also, change the 'real_val' assumption from the somewhat quirky
      1ULL << 32 value to the (just as arbitrary, but simpler) value
      of 0.
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <tip-05118ab8859492ac9ddda0154cf90e37b0a4a0b0@git.kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a8d1788
    • x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP · 8e049ef0
      Paul Mackerras authored
      
      
      Occasionally we get bugs where atomic_read or atomic_set are
      used on atomic64_t variables or vice versa.  These bugs don't
      generate warnings on x86 because atomic_read and atomic_set are
      coded as macros rather than C functions, so we don't get any
      type-checking on their arguments; similarly for atomic64_read
      and atomic64_set in 64-bit kernels.
      
      This converts them to C functions so that the arguments are
      type-checked and bugs like this will get caught more easily. It
      also converts atomic_cmpxchg and atomic_xchg, and
      atomic64_cmpxchg and atomic64_xchg on 64-bit, so we get
      type-checking on their arguments too.
      
      Compiling a typical 64-bit x86 config, this generates no new
      warnings, and the vmlinux text is 86 bytes smaller.
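      
      The difference between the macro and function forms can be sketched in
      plain C. The sketch_* names and struct layouts below are hypothetical
      and leave out the volatile access details of the real primitives; the
      point is only that the function form gives the compiler an argument
      type to check.

```c
#include <assert.h>

typedef struct { int counter; } sketch_atomic_t;
typedef struct { long long counter; } sketch_atomic64_t;

/* Macro form: accepts anything with a .counter member, no type check. */
#define sketch_atomic_read_macro(v) ((v)->counter)

/* Function form: passing a sketch_atomic64_t * here draws a warning. */
static inline int sketch_atomic_read(const sketch_atomic_t *v)
{
    return v->counter;
}

static inline long long sketch_atomic64_read(const sketch_atomic64_t *v)
{
    return v->counter;
}

static void sketch_atomic_demo(void)
{
    sketch_atomic_t a = { 5 };
    sketch_atomic64_t b = { 1LL << 40 };

    assert(sketch_atomic_read(&a) == 5);
    assert(sketch_atomic64_read(&b) == (1LL << 40));
    /* The macro silently accepts the "wrong" type. */
    assert(sketch_atomic_read_macro(&b) == (1LL << 40));
}
```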
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8e049ef0
    • x86: Remove unused function lapic_watchdog_ok() · c7210e1f
      Jaswinder Singh Rajput authored
      
      
      lapic_watchdog_ok() is a global function but no one is using it.
      
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <1246554335.2242.29.camel@jaswinder.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c7210e1f
    • x86: Fix fixmap page order for FIX_TEXT_POKE0,1 · 12b9d7cc
      Mathieu Desnoyers authored
      
      
      Masami reported:
      
      > Since the fixmap pages are assigned higher address to lower,
      > text_poke() has to use it with inverted order (FIX_TEXT_POKE1
      > to FIX_TEXT_POKE0).
      
      I prefer to just invert the order of the fixmap declaration.
      It's simpler and more straightforward.
      
      Backward fixmaps seem to be used by both x86 32 and 64.
      
      It's a really rare but nasty bug, because it only hurts when
      the instructions to patch cross a page boundary. If this
      happens, the fixmap write accesses will spill onto the following
      fixmap, which may very well crash the system. And if this does not
      crash the system, it could leave illegal instructions in place.
      Thanks Masami for finding this.
      
      It seems to have crept into the 2.6.30-rc series, so this calls
      for a -stable inclusion.
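      
      A small sketch of why declaration order matters, assuming the usual
      "higher index, lower address" fixmap layout described by Masami. The
      SKETCH_* names and the top address are hypothetical; only the ordering
      of the two text-poke slots mirrors the fix.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT   12
#define SKETCH_PAGE_SIZE    (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_FIXADDR_TOP  0xff000000UL   /* hypothetical top address */

enum sketch_fixed_addresses {
    SKETCH_FIX_HOLE,
    SKETCH_FIX_TEXT_POKE1,   /* declared first, so it gets the higher address */
    SKETCH_FIX_TEXT_POKE0,   /* declared last, so it gets the lower address */
};

/* Fixmap slots are handed out backward: higher index, lower address. */
static unsigned long sketch_fix_to_virt(unsigned int idx)
{
    return SKETCH_FIXADDR_TOP - ((unsigned long)idx << SKETCH_PAGE_SHIFT);
}

static void sketch_fixmap_demo(void)
{
    unsigned long poke0 = sketch_fix_to_virt(SKETCH_FIX_TEXT_POKE0);
    unsigned long poke1 = sketch_fix_to_virt(SKETCH_FIX_TEXT_POKE1);

    /* A write that runs off the end of the POKE0 page lands in POKE1. */
    assert(poke0 + SKETCH_PAGE_SIZE == poke1);
}
```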
      
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <20090701213722.GH19926@Krystal>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      12b9d7cc
    • x86: atomic64: Make atomic_read() type-safe · 32171208
      Ingo Molnar authored
      
      
      Linus noticed that atomic64_xchg() uses atomic_read(), which
      happens to work because atomic_read() is a macro so the
      .counter value gets u64-read on 32-bit too - but this is really
      bogus and serious bugs are waiting to happen.
      
      Change atomic_read() to be a type-safe inline, and this exposes
      the atomic64 bogosity as well:
      
        arch/x86/lib/atomic64_32.c: In function ‘atomic64_xchg’:
        arch/x86/lib/atomic64_32.c:39: warning: passing argument 1 of ‘atomic_read’ from incompatible pointer type
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      32171208
    • x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file · b7882b7c
      Ingo Molnar authored
      
      
      Linus noted that the atomic64_t primitives are all inlines
      currently which is crazy because these functions have a large
      register footprint anyway.
      
      Move them to a separate file: arch/x86/lib/atomic64_32.c
      
      Also, while at it, rename all uses of 'unsigned long long' to
      the much shorter u64.
      
      This makes the appearance of the prototypes a lot nicer - and
      it also uncovered a few bugs where (yet unused) API variants
      had 'long' as their return type instead of u64.
      
      [ More intrusive changes are not yet done in this patch. ]
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b7882b7c
    • x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too · bbf2a330
      Eric Dumazet authored
      
      
      Locked instructions on two cache lines at once are painful. If
      atomic64_t uses two cache lines, my test program is 10x slower.
      
      The chance for that is significant: 4/32 or 12.5%.
      
      Make sure an atomic64_t is 8 bytes aligned.
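      
      The 12.5% figure can be reproduced with a short count, assuming it
      refers to a 32-byte cache line and 4-byte default alignment (that
      reading of "4/32" is an assumption, as are the helper and type names
      below). The sketch also shows one way to request 8-byte alignment with
      a GCC-style attribute.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the start offsets within one cache line at which an object of
 * `size` bytes, aligned to `align` bytes, spills into the next line.
 */
static unsigned int straddling_offsets(size_t line, size_t align, size_t size)
{
    unsigned int count = 0;
    size_t off;

    for (off = 0; off < line; off += align)
        if (off + size > line)
            count++;
    return count;
}

/* Hypothetical 64-bit counter type with explicit 8-byte alignment. */
typedef struct {
    unsigned long long counter;
} __attribute__((aligned(8))) sketch_aligned_atomic64_t;

static void straddling_demo(void)
{
    /* 4-byte alignment: 1 of the 8 possible slots straddles, i.e. 12.5%. */
    assert(straddling_offsets(32, 4, 8) == 1);
    /* 8-byte alignment: no slot straddles. */
    assert(straddling_offsets(32, 8, 8) == 0);
    assert(__alignof__(sketch_aligned_atomic64_t) == 8);
}
```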
      
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
      [ changed it to __aligned(8) as per Andrew's suggestion ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      bbf2a330
  13. Jul 02, 2009
    • x86: fix power-of-2 round_up/round_down macros · 43644679
      Linus Torvalds authored
      
      
      These macros had two bugs:
       - the type of the mask was not correctly expanded to the full size of
         the argument being expanded, resulting in possible loss of high bits
         when mixing types.
       - the alignment argument was evaluated twice, despite the macro looking
         like a fancy function (but it really does need to be a macro, since
         it works on arbitrary integer types)
      
      Noticed by Peter Anvin, and with a fix that is a modification of his
      suggestion (bug noticed by Yinghai Lu).
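      
      One way to write macros that avoid both problems, shown here as a
      sketch under the sketch_ prefix rather than as the exact text of the
      patch. It relies on the GNU __typeof__ extension; the mask is cast to
      the type of the value being rounded, and each argument is evaluated
      once.

```c
#include <assert.h>

/* Mask widened to the type of x, so high bits of x survive the AND. */
#define sketch_round_mask(x, y)  ((__typeof__(x))((y) - 1))
#define sketch_round_up(x, y)    ((((x) - 1) | sketch_round_mask(x, y)) + 1)
#define sketch_round_down(x, y)  ((x) & ~sketch_round_mask(x, y))

static void sketch_round_demo(void)
{
    unsigned long long addr = 0x1FFFFFFFFFULL;  /* 37-bit value */
    unsigned int align = 0x1000;                /* 32-bit power of two */

    /* Naive form: the 32-bit mask is zero-extended and drops the high bits. */
    assert((addr & ~(align - 1)) == 0xFFFFF000ULL);

    /* Typed mask: the high bits are preserved. */
    assert(sketch_round_down(addr, align) == 0x1FFFFFF000ULL);
    assert(sketch_round_up(addr, align) == 0x2000000000ULL);

    assert(sketch_round_up(5, 4) == 8);
    assert(sketch_round_up(8, 4) == 8);
}
```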
      
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      43644679
  14. Jul 01, 2009
    • perf_counter: Ignore the nmi call frames in the x86-64 backtraces · 0406ca6d
      Frederic Weisbecker authored
      
      
      Almost every callchain recorded with perf record is filled up
      with the internal perfcounter nmi frames:
      
       perf_callchain
       perf_counter_overflow
       intel_pmu_handle_irq
       perf_counter_nmi_handler
       notifier_call_chain
       atomic_notifier_call_chain
       notify_die
       do_nmi
       nmi
      
      We want to ignore these frames as they are not interesting for
      instrumentation. To solve this, we simply ignore every frame
      from nmi context.
      
      New example of "perf report -s sym -c" after this patch:
      
      9.59%  [k] search_by_key
                   4.88%
                      search_by_key
                      reiserfs_read_locked_inode
                      reiserfs_iget
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      
                   3.19%
                      search_by_key
                      search_by_entry_key
                      reiserfs_find_entry
                      reiserfs_lookup
                      do_lookup
                      __link_path_walk
                      path_walk
                      do_path_lookup
                      user_path_at
                      vfs_fstatat
                      vfs_lstat
                      sys_newlstat
                      system_call_fastpath
                      __lxstat
                      0x406fb1
      [...]
      
      For now this patch only solves the problem in x86-64.
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0406ca6d
    • Fix pci_unmap_addr() et al on i386. · 788d84bb
      David Woodhouse authored
      
      
      We can run a 32-bit kernel on boxes with an IOMMU, so we need
      pci_unmap_addr() etc. to work -- without it, drivers will leak mappings.
      
      To be honest, this whole thing looks like it's more pain than it's
      worth; I'm half inclined to remove the no-op #else case altogether.
      
      But this is the minimal fix, which just does the right thing if
      CONFIG_DMAR is set.
      
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Cc: stable@kernel.org  [ for 2.6.30 ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      788d84bb
    • x86: Remove double declaration of MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1 · 44973998
      Jaswinder Singh Rajput authored
      
      
      MSR_P6_EVNTSEL0 and MSR_P6_EVNTSEL1 are already declared in msr-index.h.
      
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      LKML-Reference: <1246450778.6940.8.camel@hpdv5.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      44973998
  15. Jun 30, 2009
    • x86: Fix fixmap ordering · 789d03f5
      Jan Beulich authored
      
      
      The merge of the 32- and 64-bit fixmap headers made a latent
      bug on x86-64 a real one: with the right config settings
      it is possible for FIX_OHCI1394_BASE to overlap the FIX_BTMAP_*
      range.
      
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: <stable@kernel.org> # for 2.6.30.x
      LKML-Reference: <4A4A0A8702000078000082E8@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      789d03f5
  16. Jun 25, 2009
  17. Jun 24, 2009
  18. Jun 22, 2009
    • x86: fix pageattr handling for lpage percpu allocator and re-enable it · e59a1bb2
      Tejun Heo authored
      
      
      The lpage allocator aliases a PMD page for each cpu and returns whatever
      is unused to the page allocator.  When the pageattr of the recycled
      pages is changed, the two aliases point to overlapping regions with
      different attributes, which isn't allowed and is known to cause subtle
      data corruption in certain cases.
      
      This can be handled in a similar manner to the x86_64 highmap alias.
      The pageattr code should detect whether the target pages have a PMD alias,
      split the PMD alias and synchronize the attributes.
      
      pcpur allocator is updated to keep the allocated PMD pages map sorted
      in ascending address order and provide pcpu_lpage_remapped() function
      which binary searches the array to determine whether the given address
      is aliased and if so to which address.  pageattr is updated to use
      pcpu_lpage_remapped() to detect the PMD alias and split it up as
      necessary from cpa_process_alias().
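      
      The lookup described above amounts to a binary search over address
      ranges sorted in ascending order. The following is a sketch of that
      idea only: the struct, its fields and the function name are
      hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: [start, start + size) is aliased at `alias`. */
struct sketch_lpage_ent {
    unsigned long start;
    unsigned long size;
    unsigned long alias;
};

/*
 * Binary search over `map`, which must be sorted by ascending start and
 * hold non-overlapping ranges. Returns the alias address for `addr`, or
 * 0 if the address is not aliased.
 */
static unsigned long sketch_lpage_remapped(const struct sketch_lpage_ent *map,
                                           size_t nr, unsigned long addr)
{
    size_t lo = 0, hi = nr;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const struct sketch_lpage_ent *e = &map[mid];

        if (addr < e->start)
            hi = mid;
        else if (addr - e->start >= e->size)
            lo = mid + 1;
        else
            return e->alias + (addr - e->start);
    }
    return 0;
}

static void sketch_lpage_demo(void)
{
    static const struct sketch_lpage_ent map[] = {
        { 0x200000UL, 0x200000UL, 0x40000000UL },
        { 0x800000UL, 0x200000UL, 0x40200000UL },
    };

    assert(sketch_lpage_remapped(map, 2, 0x800010UL) == 0x40200010UL);
    assert(sketch_lpage_remapped(map, 2, 0x200000UL) == 0x40000000UL);
    assert(sketch_lpage_remapped(map, 2, 0x500000UL) == 0);
    assert(sketch_lpage_remapped(map, 2, 0x100UL) == 0);
}
```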
      
      Jan Beulich spotted the original problem and incorrect usage of vaddr
      instead of laddr for lookup.
      
      With this, lpage percpu allocator should work correctly.  Re-enable
      it.
      
      [ Impact: fix subtle lpage pageattr bug and re-enable lpage ]
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Jan Beulich <JBeulich@novell.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      e59a1bb2
  19. Jun 20, 2009
    • x86, 64-bit: Clean up user address masking · 9063c61f
      Linus Torvalds authored
      
      
      The discussion about using "access_ok()" in get_user_pages_fast() (see
      commit 7f818906: "x86: don't use
      'access_ok()' as a range check in get_user_pages_fast()" for details and
      end result), made us notice that x86-64 was really being very sloppy
      about virtual address checking.
      
      So be way more careful and straightforward about masking x86-64 virtual
      addresses:
      
       - All the VIRTUAL_MASK* variants now cover half of the address
         space; it's not like we can use the full mask on a signed
         integer, and the larger mask just invites mistakes when
         applying it to either half of the 48-bit address space.
      
       - /proc/kcore's kc_offset_to_vaddr() becomes a lot more
         obvious when it transforms a file offset into a
         (kernel-half) virtual address.
      
       - Unify/simplify the 32-bit and 64-bit USER_DS definition to
         be based on TASK_SIZE_MAX.
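      
      The first two points can be sketched with a half-space mask, assuming
      a 47-bit shift (half of the 48-bit address space mentioned above). The
      SKETCH_* and sketch_* names are hypothetical stand-ins, not the
      kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_VIRTUAL_MASK_SHIFT 47
#define SKETCH_VIRTUAL_MASK \
    ((UINT64_C(1) << SKETCH_VIRTUAL_MASK_SHIFT) - 1)

/* Strip the sign-extended upper bits to get a file offset. */
static uint64_t sketch_kc_vaddr_to_offset(uint64_t vaddr)
{
    return vaddr & SKETCH_VIRTUAL_MASK;
}

/* Set the upper bits again to get a kernel-half virtual address. */
static uint64_t sketch_kc_offset_to_vaddr(uint64_t off)
{
    return off | ~SKETCH_VIRTUAL_MASK;
}

static void sketch_kcore_demo(void)
{
    uint64_t vaddr = UINT64_C(0xffff880000001000);
    uint64_t off = sketch_kc_vaddr_to_offset(vaddr);

    assert(off == UINT64_C(0x0000080000001000));
    assert(sketch_kc_offset_to_vaddr(off) == vaddr);
}
```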
      
      This cleanup and more careful/obvious user virtual address checking also
      uncovered a buglet in the x86-64 implementation of strnlen_user(): it
      would do an "access_ok()" check on the whole potential area, even if the
      string itself was much shorter, and thus return an error even for valid
      strings. Our sloppy checking had hidden this.
      
      So this fixes 'strnlen_user()' to do this properly, the same way we
      already handled user strings in 'strncpy_from_user()'.  Namely by just
      checking the first byte, and then relying on fault handling for the
      rest.  That always works, since we impose a guard page that cannot be
      mapped at the end of the user space address space (and even if we
      didn't, we'd have the address space hole).
      
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9063c61f
  20. Jun 19, 2009
    • perf_counter, x86: Improve interactions with fast-gup · 0c871971
      Ingo Molnar authored
      
      
      Improve a few details in perfcounter call-chain recording that
      makes use of fast-GUP:
      
      - Use ACCESS_ONCE() to observe the pte value. ptes are fundamentally
        racy and can be changed on another CPU, so we have to be careful
        about how we access them. The PAE branch is already careful with
        read-barriers - but the non-PAE and 64-bit side needs an
        ACCESS_ONCE() to make sure the pte value is observed only once.
      
      - make the checks a bit stricter so that we can feed it any kind of
        cra^H^H^H user-space input ;-)
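      
      A sketch of the single-observation pattern from the first point. The
      ACCESS_ONCE() definition below is the usual volatile-cast form; the
      pte type, flag values and helper name are hypothetical and only serve
      the illustration.

```c
#include <assert.h>

/* Force exactly one load of x through a volatile-qualified access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

typedef struct { unsigned long pte; } sketch_pte_t;   /* hypothetical layout */

#define SKETCH_PAGE_PRESENT 0x1UL
#define SKETCH_PAGE_USER    0x4UL

/* Check a pte that another CPU may change at any time. */
static int sketch_pte_user_present(sketch_pte_t *ptep)
{
    /* One load: both flag checks below see the same snapshot. */
    unsigned long val = ACCESS_ONCE(ptep->pte);
    unsigned long mask = SKETCH_PAGE_PRESENT | SKETCH_PAGE_USER;

    return (val & mask) == mask;
}

static void sketch_pte_demo(void)
{
    sketch_pte_t user = { SKETCH_PAGE_PRESENT | SKETCH_PAGE_USER };
    sketch_pte_t kernel_only = { SKETCH_PAGE_PRESENT };

    assert(sketch_pte_user_present(&user) == 1);
    assert(sketch_pte_user_present(&kernel_only) == 0);
}
```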
      
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0c871971
  21. Jun 18, 2009
  22. Jun 17, 2009