Commit b70100f2 authored by Linus Torvalds
Pull probes updates from Masami Hiramatsu:

 - kprobes: use struct_size() for variable size kretprobe_instance data
   structure.

 - eprobe: Simplify trace_eprobe list iteration.

 - probe events: Data structure field access support on BTF argument.

     - Update BTF argument support on the functions in the kernel
       loadable modules (only loaded modules are supported).

     - Move generic BTF access function (search function prototype and
       get function parameters) to a separate file.

     - Add a function to search a member of data structure in BTF.

     - Support accessing BTF data structure member from probe args by
       C-like arrow('->') and dot('.') operators. e.g.
          't sched_switch next=next->pid vruntime=next->se.vruntime'

     - Support accessing BTF data structure member from $retval. e.g.
          'f getname_flags%return +0($retval->name):string'

     - Add string type checking if BTF type info is available. This
       rejects the ":string" type if the user specifies it for a non
       "char pointer" type.

     - Automatically treat the fprobe event as a function return event
       if $retval is used.

 - selftests/ftrace: Add BTF data field access test cases.

 - Documentation: Update fprobe event example with BTF data field.
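
The field access syntax described above ('next->se.vruntime') can be
pictured as a chain of member lookups. The following is a minimal
userspace sketch of splitting such an expression into steps; it is not
the kernel's parser, and the names split_field_access, access_step and
access_op are hypothetical, introduced only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not taken from the kernel sources. */
enum access_op { OP_BASE, OP_ARROW, OP_DOT };

struct access_step {
	enum access_op op;	/* how this name is reached from the previous one */
	char name[32];		/* argument or member name */
};

/*
 * Split an expression such as "next->se.vruntime" into steps.
 * Returns the number of steps, or -1 if the expression is empty,
 * malformed, or does not fit into the output array.
 */
static int split_field_access(const char *expr, struct access_step *steps,
			      int max_steps)
{
	enum access_op op = OP_BASE;
	int n = 0;

	while (*expr) {
		/* A name runs up to the next '-' or '.' character. */
		size_t len = strcspn(expr, "-.");

		if (len == 0 || len >= sizeof(steps[0].name) || n >= max_steps)
			return -1;
		steps[n].op = op;
		memcpy(steps[n].name, expr, len);
		steps[n].name[len] = '\0';
		n++;
		expr += len;

		if (expr[0] == '-' && expr[1] == '>') {
			op = OP_ARROW;
			expr += 2;
		} else if (expr[0] == '.') {
			op = OP_DOT;
			expr += 1;
		} else if (expr[0] != '\0') {
			return -1;	/* a lone '-' is not an operator */
		} else {
			break;		/* end of expression */
		}
		if (*expr == '\0')
			return -1;	/* operator without a member name */
	}
	return n > 0 ? n : -1;
}
```

In a real implementation each step would be resolved against BTF type
information: an arrow step requires the previous type to be a pointer,
and a dot step requires it to be a struct or union.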

* tag 'probes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation: tracing: Update fprobe event example with BTF field
  selftests/ftrace: Add BTF fields access testcases
  tracing/fprobe-event: Assume fprobe is a return event by $retval
  tracing/probes: Add string type check with BTF
  tracing/probes: Support BTF field access from $retval
  tracing/probes: Support BTF based data structure field access
  tracing/probes: Add a function to search a member of a struct/union
  tracing/probes: Move finding func-proto API and getting func-param API to trace_btf
  tracing/probes: Support BTF argument on module functions
  tracing/eprobe: Iterate trace_eprobe directly
  kernel: kprobes: Use struct_size()
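
The struct_size() conversion listed last replaces an open-coded
"sizeof(struct) + data_size" with a helper that saturates on overflow
instead of wrapping. The following is a userspace sketch of that idea
only; struct instance, flex_size and instance_alloc are hypothetical
stand-ins, not the kernel's struct kretprobe_instance or its macro.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in: a fixed header followed by a flexible array. */
struct instance {
	void *owner;
	char data[];
};

/*
 * Approximates the arithmetic behind struct_size(p, member, count):
 * header + count * elem, saturating to SIZE_MAX on overflow so that
 * the allocation fails rather than returning an undersized buffer.
 */
static size_t flex_size(size_t header, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * count;
}

/* Allocate a zeroed instance with data_size bytes of trailing data. */
static struct instance *instance_alloc(size_t data_size)
{
	size_t bytes = flex_size(sizeof(struct instance),
				 sizeof(((struct instance *)0)->data[0]),
				 data_size);

	if (bytes == SIZE_MAX)
		return NULL;	/* size computation overflowed */
	return calloc(1, bytes);
}
```

With the open-coded sum, a very large data_size could wrap around and
yield a small allocation; with the saturating form the request simply
fails.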
parents e021c5f1 a2439a4c
+46 −18
@@ -79,9 +79,9 @@ automatically set by the given name. ::
  f:fprobes/myprobe vfs_read count=count pos=pos

 It also chooses the fetch type from BTF information. For example, in the above
-example, the ``count`` is unsigned long, and the ``pos`` is a pointer. Thus, both
-are converted to 64bit unsigned long, but only ``pos`` has "%Lx" print-format as
-below ::
+example, the ``count`` is unsigned long, and the ``pos`` is a pointer. Thus,
+both are converted to 64bit unsigned long, but only ``pos`` has "%Lx"
+print-format as below ::

  # cat events/fprobes/myprobe/format
  name: myprobe
@@ -105,9 +105,47 @@ is expanded to all function arguments of the function or the tracepoint. ::
  # cat dynamic_events
  f:fprobes/myprobe vfs_read file=file buf=buf count=count pos=pos

-BTF also affects the ``$retval``. If user doesn't set any type, the retval type is
-automatically picked from the BTF. If the function returns ``void``, ``$retval``
-is rejected.
+BTF also affects the ``$retval``. If user doesn't set any type, the retval
+type is automatically picked from the BTF. If the function returns ``void``,
+``$retval`` is rejected.
+
+You can access the data fields of a data structure using arrow operator ``->``
+(for pointer type) and dot operator ``.`` (for data structure type.)::
+
+ # echo 't sched_switch preempt prev_pid=prev->pid next_pid=next->pid' >> dynamic_events
+
+The field access operators, ``->`` and ``.`` can be combined for accessing deeper
+members and other structure members pointed by the member. e.g. ``foo->bar.baz->qux``
+If there is an anonymous union member, you can directly access it as C code does.
+For example::
+
+ struct {
+	union {
+		int a;
+		int b;
+	};
+ } *foo;
+
+To access ``a`` and ``b``, use ``foo->a`` and ``foo->b`` in this case.
+
+This data field access is available for the return value via ``$retval``,
+e.g. ``$retval->name``.
+
+For these BTF arguments and fields, ``:string`` and ``:ustring`` change the
+behavior. If these are used for BTF argument or field, it checks whether
+the BTF type of the argument or the data field is ``char *`` or ``char []``,
+or not.  If not, it rejects applying the string types. Also, with the BTF
+support, you don't need a memory dereference operator (``+0(PTR)``) for
+accessing the string pointed by a ``PTR``. It automatically adds the memory
+dereference operator according to the BTF type. e.g. ::
+
+ # echo 't sched_switch prev->comm:string' >> dynamic_events
+ # echo 'f getname_flags%return $retval->name:string' >> dynamic_events
+
+The ``prev->comm`` is an embedded char array in the data structure, and
+``$retval->name`` is a char pointer in the data structure. But in both
+cases, you can use ``:string`` type to get the string.
+

 Usage examples
 --------------
@@ -161,10 +199,10 @@ parameters. This means you can access any field values in the task
 structure pointed by the ``prev`` and ``next`` arguments.

 For example, usually ``task_struct::start_time`` is not traced, but with this
-traceprobe event, you can trace it as below.
+traceprobe event, you can trace that field as below.
 ::

-  # echo 't sched_switch comm=+1896(next):string start_time=+1728(next):u64' > dynamic_events
+  # echo 't sched_switch comm=next->comm:string next->start_time' > dynamic_events
   # head -n 20 trace | tail
  #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
  #              | |         |   |||||     |         |
@@ -176,13 +214,3 @@ traceprobe event, you can trace it as below.
            <idle>-0       [000] d..3.  5606.690317: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000
       kworker/0:1-14      [000] d..3.  5606.690339: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="swapper/0" usage=2 start_time=0
            <idle>-0       [000] d..3.  5606.692368: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000
-
-Currently, to find the offset of a specific field in the data structure,
-you need to build kernel with debuginfo and run `perf probe` command with
-`-D` option. e.g.
-::
-
- # perf probe -D "__probestub_sched_switch next->comm:string next->start_time"
- p:probe/__probestub_sched_switch __probestub_sched_switch+0 comm=+1896(%cx):string start_time=+1728(%cx):u64
-
-And replace the ``%cx`` with the ``next``.
+1 −0
@@ -209,6 +209,7 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 int btf_check_and_fixup_fields(const struct btf *btf, struct btf_record *rec);
 bool btf_type_is_void(const struct btf_type *t);
 s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
+s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p);
 const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
 					       u32 id, u32 *res_id);
 const struct btf_type *btf_type_resolve_ptr(const struct btf *btf,
+1 −1
@@ -553,7 +553,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
 	return -ENOENT;
 }

-static s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
+s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
 {
 	struct btf *btf;
 	s32 ret;
+2 −4
@@ -2232,8 +2232,7 @@ int register_kretprobe(struct kretprobe *rp)
 		return -ENOMEM;

 	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
+		inst = kzalloc(struct_size(inst, data, rp->data_size), GFP_KERNEL);
 		if (inst == NULL) {
 			rethook_free(rp->rh);
 			rp->rh = NULL;
@@ -2256,8 +2255,7 @@ int register_kretprobe(struct kretprobe *rp)

 	rp->rph->rp = rp;
 	for (i = 0; i < rp->maxactive; i++) {
-		inst = kzalloc(sizeof(struct kretprobe_instance) +
-			       rp->data_size, GFP_KERNEL);
+		inst = kzalloc(struct_size(inst, data, rp->data_size), GFP_KERNEL);
 		if (inst == NULL) {
 			refcount_set(&rp->rph->ref, i);
 			free_rp_inst(rp);
+1 −0
@@ -99,6 +99,7 @@ obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
 endif
 obj-$(CONFIG_DYNAMIC_EVENTS) += trace_dynevent.o
 obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
+obj-$(CONFIG_PROBE_EVENTS_BTF_ARGS) += trace_btf.o
 obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
 obj-$(CONFIG_BOOTTIME_TRACING) += trace_boot.o
 obj-$(CONFIG_FTRACE_RECORD_RECURSION) += trace_recursion_record.o