  1. Jun 06, 2012
    • uprobes: Teach handle_swbp() to rely on "is_swbp" rather than uprobes_srcu · 56bb4cf6
      Oleg Nesterov authored
      
      
      Currently handle_swbp() assumes that it can't race with
      unregister, so it roughly does:
      
      	if (find_uprobe(vaddr))
      		process_uprobe();
      	else
      		send_sig(SIGTRAP);
      
      This relies on the not-really-working uprobes_srcu code we are
      going to remove, see the next patch.
      
      With this patch we rely on the result of
      is_swbp_at_addr(bp_vaddr) if find_uprobe() fails.
      
      If is_swbp == 1, we hit a normal int3 and should send
      SIGTRAP.
      
      If is_swbp == 0, we raced with uprobe_unregister() and simply
      restart this insn.
      
      The "difficult" case is is_swbp == -EFAULT, when we can't read
      this memory. In this case I think we should restart too, and
      this is more correct compared to the current code which sends
      SIGTRAP.
      
      Ignoring ENOMEM/etc from get_user_pages(), this can only happen
      if another thread unmaps this memory before find_active_uprobe()
      takes mmap_sem. It would be better to pretend it was unmapped
      before this insn was executed, restart, and get SIGSEGV.
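The three outcomes described above can be modelled in a small userspace sketch. This is not the kernel code: the enum and swbp_decide() are hypothetical names introduced here for illustration, and the real handle_swbp() works on pt_regs and the uprobe tree rather than on two plain arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for this sketch; none of these are kernel identifiers. */
enum swbp_action {
	SWBP_PROCESS_UPROBE,	/* a uprobe was found: handle it */
	SWBP_SEND_SIGTRAP,	/* ordinary int3: deliver SIGTRAP */
	SWBP_RESTART_INSN,	/* raced with unregister or unmap: re-execute */
};

/*
 * Decide what to do after a breakpoint trap.
 * uprobe_found: whether the uprobe lookup succeeded.
 * is_swbp:      1 if a breakpoint insn is at the address, 0 if not,
 *               negative errno if the memory could not be read.
 */
enum swbp_action swbp_decide(bool uprobe_found, int is_swbp)
{
	if (uprobe_found)
		return SWBP_PROCESS_UPROBE;

	/* The commit message only describes 1, 0 and negative values. */
	assert(is_swbp <= 1);

	if (is_swbp > 0)
		return SWBP_SEND_SIGTRAP;

	/* is_swbp == 0 (probe removed) or < 0 (memory gone): restart. */
	return SWBP_RESTART_INSN;
}
```

In the -EFAULT case the restarted instruction is expected to fault again, so the task would get SIGSEGV instead of a misleading SIGTRAP.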
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192947.GF8057@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Change register_for_each_vma() to take mm->mmap_sem for writing · 77fc4af1
      Oleg Nesterov authored
      
      
      Change register_for_each_vma() to take mm->mmap_sem for writing.
      This is a bit unfortunate but hopefully not too bad, since this
      is the slow path anyway.
      
      This is needed to ensure that find_active_uprobe() can not race
      with uprobe_register() which adds the new bp at the same
      bp_vaddr, after find_uprobe() fails and before
      is_swbp_at_addr_fast() checks the memory.
      
      IOW, this is needed to ensure that if find_active_uprobe()
      returns NULL but is_swbp == true, we can safely assume that it
      was the "normal" int3 and we should send SIGTRAP.
      
      There is another reason for this change. We are going to replace
      uprobes_state->count with MMF_ flags set by register/unregister
      and cleared by find_active_uprobe(), and set/clear shouldn't
      race with each other.
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192928.GE8057@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Teach find_active_uprobe() to provide the "is_swbp" info · d790d346
      Oleg Nesterov authored
      
      
      A separate patch to simplify the review, and for the
      documentation.
      
      The patch adds another "int *is_swbp" argument to
      find_active_uprobe(). So far its only caller doesn't use this
      info.
      
      With this patch find_active_uprobe() additionally does:
      
      	- if find_vma() + ->vm_start check fails, *is_swbp = -EFAULT
      
      	- otherwise, if valid_vma() + find_uprobe() fails, *is_swbp
      	  holds the result of is_swbp_at_addr(), which can be
      	  negative too. The latter is only possible if we raced with
      	  another thread which did munmap/etc after we hit this bp.
      
      IOW, if find_active_uprobe(&is_swbp) returns NULL, the caller
      can look at is_swbp to figure out whether the current insn is bp
      or not, or detect the race with another thread if it is
      negative.
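A rough userspace model of the contract described above, assuming a toy address space. Everything prefixed with toy_ is hypothetical and exists only in this sketch; the real function works on mm_struct and vm_area_struct and takes mmap_sem, and the toy lookup folds the find_vma() and ->vm_start checks into one range test.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SWBP_INSN 0xcc	/* x86 int3 opcode */

/* Toy stand-ins for kernel structures; all names are hypothetical. */
struct toy_vma {
	uintptr_t start, end;	/* mapped range [start, end) */
	const uint8_t *bytes;	/* backing memory, NULL if unreadable */
};

struct toy_mm {
	const struct toy_vma *vmas;
	size_t nr_vmas;
	const uintptr_t *probes;	/* addresses with a registered probe */
	size_t nr_probes;
};

static const struct toy_vma *toy_find_vma(const struct toy_mm *mm,
					  uintptr_t addr)
{
	for (size_t i = 0; i < mm->nr_vmas; i++)
		if (addr >= mm->vmas[i].start && addr < mm->vmas[i].end)
			return &mm->vmas[i];
	return NULL;
}

/* 1 if a breakpoint insn is at addr, 0 if not, -EFAULT if unreadable. */
static int toy_is_swbp_at_addr(const struct toy_vma *vma, uintptr_t addr)
{
	if (!vma->bytes)
		return -EFAULT;
	return vma->bytes[addr - vma->start] == TOY_SWBP_INSN;
}

/*
 * Returns the index of the matching probe, or -1 if there is none.
 * On failure *is_swbp tells the caller why; on success it is left
 * untouched in this sketch.
 */
int toy_find_active_uprobe(const struct toy_mm *mm, uintptr_t bp_vaddr,
			   int *is_swbp)
{
	const struct toy_vma *vma;

	assert(mm != NULL && is_swbp != NULL);

	vma = toy_find_vma(mm, bp_vaddr);
	if (!vma) {
		*is_swbp = -EFAULT;	/* the mapping is gone */
		return -1;
	}

	for (size_t i = 0; i < mm->nr_probes; i++)
		if (mm->probes[i] == bp_vaddr)
			return (int)i;

	*is_swbp = toy_is_swbp_at_addr(vma, bp_vaddr);
	return -1;
}
```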
      
      Note: I think that performance-wise this change is fine. This
      adds is_swbp_at_addr(), but only if we raced with
      uprobe_unregister() or if we hit the "normal" int3 but this mm
      has uprobes as well. And even in this case the slow
      read_opcode() path is very unlikely: this insn recently
      triggered do_int3(), so __copy_from_user_inatomic() shouldn't
      fail in the likely case.
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192914.GD8057@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Introduce find_active_uprobe() helper · 3a9ea052
      Oleg Nesterov authored
      
      
      No functional changes. Move the "find uprobe" code from
      handle_swbp() to the new helper, find_active_uprobe().
      
      Note: with or without this change, the find-active-uprobe logic
      is not exactly right. We can race with another thread which
      unmaps the memory with the valid uprobe before we take
      mm->mmap_sem. We can't find this uprobe simply because
      find_vma() fails. In this case we wrongly assume that this trap
      was not caused by uprobe and send the erroneous SIGTRAP. See the
      next changes.
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192857.GC8057@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Change read_opcode() to use FOLL_FORCE · a3d7bb47
      Oleg Nesterov authored
      
      
      set_orig_insn()->read_opcode() should not fail if the probed
      task did mprotect() after uprobe_register(), so change it to
      use FOLL_FORCE. Without FOLL_WRITE this doesn't have any "side"
      effect but allows reading the !VM_READ memory.
      
      There is another reason for this change, we are going to use
      is_swbp_at_addr() from handle_swbp() which can race with another
      thread doing mprotect().
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192759.GB8057@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • uprobes: Optimize is_swbp_at_addr() for current->mm · c00b2750
      Oleg Nesterov authored
      
      
      Change is_swbp_at_addr() to try to avoid the costly
      read_opcode() if mm == current->mm, since
      __copy_from_user_inatomic() should succeed in the likely case.
      
      Currently this optimization is not important, but we are going
      to add more is_swbp_at_addr(current->mm) callers.
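The shape of this optimization, a cheap read attempted first with the expensive read as a fallback, might look roughly like the userspace sketch below. All toy_ names are hypothetical: toy_read_fast() stands in for __copy_from_user_inatomic(), toy_read_slow() for read_opcode(), and the "resident" flag is an assumption used here to simulate the fast read faulting.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SWBP_INSN 0xcc	/* x86 int3 opcode */

/* Hypothetical model of probed memory; not a kernel structure. */
struct toy_mem {
	const uint8_t *bytes;
	size_t len;
	bool resident;		/* false: the fast read would fault */
	unsigned slow_reads;	/* how many times the slow path ran */
};

/* Cheap read that may fail, like an atomic user copy. */
static int toy_read_fast(const struct toy_mem *m, size_t off, uint8_t *opcode)
{
	if (!m->resident || off >= m->len)
		return -EFAULT;
	*opcode = m->bytes[off];
	return 0;
}

/* Expensive read that succeeds whenever the offset is mapped. */
static int toy_read_slow(struct toy_mem *m, size_t off, uint8_t *opcode)
{
	m->slow_reads++;
	if (off >= m->len)
		return -EFAULT;
	*opcode = m->bytes[off];
	return 0;
}

/* 1 if a breakpoint insn is at off, 0 if not, negative errno on failure. */
int toy_is_swbp_fast_then_slow(struct toy_mem *m, bool mm_is_current,
			       size_t off)
{
	uint8_t opcode;
	int ret;

	assert(m != NULL);

	/* The fast path is only valid for the caller's own address space. */
	if (mm_is_current && toy_read_fast(m, off, &opcode) == 0)
		goto out;

	ret = toy_read_slow(m, off, &opcode);
	if (ret)
		return ret;
out:
	return opcode == TOY_SWBP_INSN;
}
```

The slow_reads counter is only there so a caller can observe which path was taken.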
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anton Arapov <anton@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20120529192744.GA8057@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix · 436d03fa
      Masami Hiramatsu authored
      
      
      Fix the x86 instruction decoder to decode bsr/bsf/jmpe with
      operand-size prefix (66h). This fixes the test case failure
      reported by Linus, attached below.
      
      bsf/bsr/jmpe have a special encoding. The opcode map in the
      Intel Software Developer's Manual, vol. 2, says they have
      TZCNT/LZCNT variants when prefixed with F3h. However, it gives
      no information about the other prefixes, 66h or F2h.
      The current instruction decoder treats those as
      bad instructions, but the processor actually accepts at least
      the operand-size prefix.
      
      H. Peter Anvin further explains:
      
       " TZCNT/LZCNT are F3 + BSF/BSR exactly because the F2 and
         F3 prefixes have historically been no-ops with most instructions.
         This allows software to unconditionally use the prefixed versions
         and get TZCNT/LZCNT on the processors that have them if they don't
         care about the difference. "
      
      This fixes errors reported by test_get_len:
      
        Warning: arch/x86/tools/test_get_len found difference at <em_bsf>:ffffffff81036d87
        Warning: ffffffff81036de5:	66 0f bc c2          	bsf    %dx,%ax
        Warning: objdump says 4 bytes, but insn_get_length() says 3
        Warning: arch/x86/tools/test_get_len found difference at <em_bsr>:ffffffff81036ea6
        Warning: ffffffff81036f04:	66 0f bd c2          	bsr    %dx,%ax
        Warning: objdump says 4 bytes, but insn_get_length() says 3
        Warning: decoded and checked 13298882 instructions with 2 warnings
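To make the length mismatch above concrete, here is a toy length calculation for the two reported byte sequences. It is not the kernel's insn_get_length(): toy_bsx_length() is a hypothetical helper that only understands bsf/bsr (0f bc / 0f bd) with a register-direct ModRM byte, and it simply counts any leading 66h, F2h or F3h prefix bytes as part of the instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Length in bytes of a bsf/bsr instruction with a register operand,
 * or 0 if the bytes fall outside this toy subset.
 * Hypothetical helper for illustration only.
 */
size_t toy_bsx_length(const uint8_t *insn, size_t avail)
{
	size_t i = 0;

	assert(insn != NULL);

	/* Count the operand-size (66h) and F2h/F3h prefixes. */
	while (i < avail &&
	       (insn[i] == 0x66 || insn[i] == 0xf2 || insn[i] == 0xf3))
		i++;

	/* Need the 0f escape byte, the opcode byte and a ModRM byte. */
	if (i + 3 > avail)
		return 0;
	if (insn[i] != 0x0f || (insn[i + 1] != 0xbc && insn[i + 1] != 0xbd))
		return 0;

	/* ModRM mod == 3: register operand, no SIB or displacement. */
	if ((insn[i + 2] >> 6) != 3)
		return 0;

	return i + 3;
}
```

With the prefix counted, 66 0f bc c2 comes out as 4 bytes, matching what objdump reports in the warnings, while the unprefixed 0f bc c2 is 3 bytes.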
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reported-by: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <yrl.pp-manager.tt@hitachi.com>
      Link: http://lkml.kernel.org/r/20120604150911.22338.43296.stgit@localhost.localdomain
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'perf-urgent-for-mingo' of... · 02e03040
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf fixes from Arnaldo Carvalho de Melo:
      
       * Endianness fixes from Jiri Olsa
      
       * Fixes for make perf tarball
      
       * Fix for DSO name in perf script callchains, from David Ahern
      
       * Segfault fixes for perf top --callchain, from Namhyung Kim
      
       * Minor function result fixes from Srikar Dronamraju
      
       * Add missing 3rd ioctl parameter, from Namhyung Kim
      
       * Fix pager usage in minimal embedded systems, from Avik Sil
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. Jun 05, 2012
  3. Jun 04, 2012
  4. Jun 03, 2012
  5. Jun 02, 2012