================
Event Histograms
================

1. Introduction
===============

Histogram triggers are special event triggers that can be used to
aggregate trace event data into histograms. For information on
trace events and event triggers, see Documentation/trace/events.rst.

2. Histogram Trigger Command
============================

A histogram trigger command is an event trigger command that
aggregates event hits into a hash table keyed on one or more trace
event format fields (or stacktrace) and a set of running totals
derived from one or more trace event format fields and/or event
counts (hitcount).
The format of a hist trigger is as follows::
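
  hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>]
    [:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue]
    [:clear][:name=histname1][:nohitcount][:<handler>.<action>] [if <filter>]
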
When a matching event is hit, an entry is added to a hash table
using the key(s) and value(s) named. Keys and values correspond to
fields in the event's format description. Values must correspond to
numeric fields - on an event hit, the value(s) will be added to a
sum kept for that field. The special string 'hitcount' can be used
in place of an explicit value field - this is simply a count of
event hits. If 'values' isn't specified, an implicit 'hitcount'
value will be automatically created and used as the only value.
Keys can be any field, or the special string 'stacktrace', which
will use the event's kernel stacktrace as the key. The keywords
'keys' or 'key' can be used to specify keys, and the keywords
'values', 'vals', or 'val' can be used to specify values. Compound
keys consisting of up to three fields can be specified by the 'keys'
keyword. Hashing a compound key produces a unique entry in the
table for each unique combination of component keys, and can be
useful for providing more fine-grained summaries of event data.
Additionally, sort keys consisting of up to two fields can be
specified by the 'sort' keyword. If more than one field is
specified, the result will be a 'sort within a sort': the first key
is taken to be the primary sort key and the second the secondary
key. If a hist trigger is given a name using the 'name' parameter,
its histogram data will be shared with other triggers of the same
name, and trigger hits will update this common data. Only triggers
with 'compatible' fields can be combined in this way; triggers are
'compatible' if the fields named in the trigger share the same
number and type of fields and those fields also have the same names.
Note that any two events always share the compatible 'hitcount' and
'stacktrace' fields and can therefore be combined using those
fields, however pointless that may be.
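
As a rough illustration of how the keys, values and sort parameters fit together, the sketch below assembles a trigger string from its parts. The helper name build_hist_trigger is hypothetical and not part of any kernel tooling, and the field names in the second call (common_pid, id) are only illustrative; the limits it enforces are the ones stated above (up to three key fields, up to two sort fields).

```shell
#!/bin/sh
# Hypothetical helper: compose a hist trigger string.
# Usage: build_hist_trigger <keys> [values] [sort]
build_hist_trigger() {
    keys=$1
    vals=${2:-}
    sort=${3:-}
    # Compound keys may name at most three comma-separated fields.
    case $keys in
        ''|*,*,*,*)
            echo "keys: expected one to three comma-separated fields" >&2
            return 1 ;;
    esac
    # Sort keys may name at most two comma-separated fields.
    case $sort in
        *,*,*)
            echo "sort: expected at most two comma-separated fields" >&2
            return 1 ;;
    esac
    trigger="hist:keys=$keys"
    # Leaving values empty relies on the implicit 'hitcount' value.
    if [ -n "$vals" ]; then trigger="$trigger:vals=$vals"; fi
    if [ -n "$sort" ]; then trigger="$trigger:sort=$sort"; fi
    printf '%s\n' "$trigger"
}

build_hist_trigger skbaddr.hex len
build_hist_trigger common_pid,id hitcount hitcount.descending
```

The resulting string is what would be written to an event's trigger file; the helper itself never touches tracefs.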
'hist' triggers add a 'hist' file to each event's subdirectory.
Reading the 'hist' file for the event will dump the hash table in
its entirety to stdout. If there are multiple hist triggers
attached to an event, there will be a table for each trigger in the
output. The table displayed for a named trigger will be the same as
any other instance having the same name. Each printed hash table
entry is a simple list of the keys and values comprising the entry;
keys are printed first and are delineated by curly braces, and are
followed by the set of value fields for the entry. By default,
numeric fields are displayed as base-10 integers. This can be
modified by appending any of the following modifiers to the field
name:

=============  =================================================
.hex           display a number as a hex value
.sym           display an address as a symbol
.sym-offset    display an address as a symbol and offset
.syscall       display a syscall id as a system call name
.execname      display a common_pid as a program name
.log2          display log2 value rather than raw number
.buckets=size  display grouping of values rather than raw number
.usecs         display a common_timestamp in microseconds
.percent       display a number as a percentage value
.graph         display a bar-graph of a value
.stacktrace    display as a stacktrace (must be a long[] type)
=============  =================================================

Note that in general the semantics of a given field aren't
interpreted when applying a modifier to it, but there are some
restrictions to be aware of in this regard:
- only the 'hex' modifier can be used for values (because values
are essentially sums, and the other modifiers don't make sense
in that context).
- the 'execname' modifier can only be used on a 'common_pid'. The
reason for this is that the execname is simply the 'comm' value
saved for the 'current' process when an event was triggered,
which is the same as the common_pid value saved by the event
tracing code. Trying to apply that comm value to other pid
values wouldn't be correct, and typically events that care save
pid-specific comm fields in the event itself.
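
The 'execname' restriction can be checked before a trigger is written. The following is a minimal sketch only: check_key_modifier is a hypothetical helper, next_pid is an illustrative field name, and the check mirrors the rule as described above, not the kernel's own validation.

```shell
#!/bin/sh
# Hypothetical pre-check for a single key spec such as 'common_pid.execname'.
check_key_modifier() {
    case $1 in
        common_pid.execname)
            return 0 ;;          # the one field '.execname' applies to
        *.execname)
            echo "'.execname' can only be used on common_pid: $1" >&2
            return 1 ;;
        *)
            return 0 ;;          # other modifiers are not checked here
    esac
}

check_key_modifier common_pid.execname && echo "accepted: common_pid.execname"
check_key_modifier next_pid.execname 2>/dev/null || echo "rejected: next_pid.execname"
```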
A typical usage scenario would be the following to enable a hist
trigger, read its current contents, and then turn it off::

  # echo 'hist:keys=skbaddr.hex:vals=len' > \
      /sys/kernel/tracing/events/net/netif_rx/trigger

  # cat /sys/kernel/tracing/events/net/netif_rx/hist

  # echo '!hist:keys=skbaddr.hex:vals=len' > \
      /sys/kernel/tracing/events/net/netif_rx/trigger

The trigger file itself can be read to show the details of the
currently attached hist trigger. This information is also displayed
at the top of the 'hist' file when read.
By default, the size of the hash table is 2048 entries. The 'size'
parameter can be used to specify more or fewer than that. The units
are in terms of hashtable entries - if a run uses more entries than
specified, the results will show the number of 'drops', the number
of hits that were ignored. The size should be a power of 2 between
128 and 131072 (any non-power-of-2 number specified will be rounded
up).
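
To see what table size a given request would end up with, the arithmetic can be sketched as below, assuming non-power-of-2 requests are rounded up to the next power of 2. The helper hist_table_size is hypothetical, and treating results outside 128..131072 as errors is an assumption of this sketch, not a statement about kernel behavior.

```shell
#!/bin/sh
# Hypothetical helper: effective hash table size for a requested 'size'.
hist_table_size() {
    want=$1
    size=1
    # Double until the requested size is reached or exceeded (round up).
    while [ "$size" -lt "$want" ]; do
        size=$((size * 2))
    done
    # Assumption of this sketch: out-of-range sizes are reported as errors.
    if [ "$size" -lt 128 ] || [ "$size" -gt 131072 ]; then
        echo "size $want is outside the 128..131072 range" >&2
        return 1
    fi
    echo "$size"
}

hist_table_size 3000    # prints 4096
hist_table_size 2048    # prints 2048
```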
The 'sort' parameter can be used to specify a value field to sort
on. The default if unspecified is 'hitcount' and the default sort
order is 'ascending'. To sort in the opposite direction, append
'.descending' to the sort key.
The 'pause' parameter can be used to pause an existing hist trigger
or to start a hist trigger but not log any events until told to do
so. 'continue' or 'cont' can be used to start or restart a paused
hist trigger.
The 'clear' parameter will clear the contents of a running hist
trigger and leave its current paused/active state.
Note that the 'pause', 'cont', and 'clear' parameters should be
applied using the 'append' shell operator ('>>') if applied to an
existing trigger, rather than via the '>' operator, which will cause
the trigger to be removed through truncation.
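
The difference between the two operators can be captured in a small wrapper that always appends. This is a sketch under stated assumptions: hist_ctl is a hypothetical helper, and the demonstration writes to a scratch file created with mktemp, which only stands in for a real trigger file and shows the shell redirection, not the kernel's handling of the command.

```shell
#!/bin/sh
# Hypothetical helper: apply pause/cont/clear to an already-attached trigger.
# $1 = trigger file, $2 = hist spec of the existing trigger, $3 = parameter
hist_ctl() {
    case $3 in
        pause|cont|continue|clear) ;;
        *)
            echo "unknown control parameter: $3" >&2
            return 1 ;;
    esac
    # '>>' appends to the file; '>' would truncate it first.
    echo "$2:$3" >> "$1"
}

# Demonstration on a scratch file, not on tracefs.
demo=$(mktemp)
echo 'hist:keys=skbaddr.hex:vals=len' > "$demo"
hist_ctl "$demo" 'hist:keys=skbaddr.hex:vals=len' pause
cat "$demo"
rm -f "$demo"
```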
The 'nohitcount' (or NOHC) parameter will suppress display of
raw hitcount in the histogram. This option requires at least one
value field which is not a 'raw hitcount'. For example,
'hist:...:vals=hitcount:nohitcount' is rejected, but
'hist:...:vals=hitcount.percent:nohitcount' is OK.
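
The rule for 'nohitcount' can be expressed as a simple check over the values list. The helper nohitcount_ok below is hypothetical and only approximates the rule as stated above: at least one value must be something other than a raw 'hitcount'.

```shell
#!/bin/sh
# Hypothetical check: succeed if 'nohitcount' makes sense for a vals list.
# $1 = comma-separated values, e.g. 'hitcount.percent' or 'hitcount,len'
nohitcount_ok() {
    old_ifs=$IFS
    IFS=,
    for v in $1; do
        if [ "$v" != hitcount ]; then
            IFS=$old_ifs
            return 0             # found a value that is not a raw hitcount
        fi
    done
    IFS=$old_ifs
    return 1                     # only raw hitcount (or nothing) was given
}

nohitcount_ok hitcount.percent && echo "ok: vals=hitcount.percent:nohitcount"
nohitcount_ok hitcount || echo "rejected: vals=hitcount:nohitcount"
```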
- enable_hist/disable_hist
The enable_hist and disable_hist triggers can be used to have one
event conditionally start and stop another event's already-attached
hist trigger. Any number of enable_hist and disable_hist triggers
can be attached to a given event, allowing that event to kick off
and stop aggregations on a host of other events.
The format is very similar to the enable/disable_event triggers::

  enable_hist:<system>:<event>[:count]
  disable_hist:<system>:<event>[:count]

Instead of enabling or disabling the tracing of the target event
into the trace buffer as the enable/disable_event triggers do, the
enable/disable_hist triggers enable or disable the aggregation of
the target event into a hash table.
A typical usage scenario for the enable_hist/disable_hist triggers
would be to first set up a paused hist trigger on some event,
followed by an enable_hist/disable_hist pair that turns the hist
aggregation on and off when conditions of interest are hit::
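
  # echo 'hist:keys=skbaddr.hex:vals=len:pause' > \
      /sys/kernel/tracing/events/net/netif_receive_skb/trigger

  # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \
      /sys/kernel/tracing/events/sched/sched_process_exec/trigger

  # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \
      /sys/kernel/tracing/events/sched/sched_process_exit/trigger
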
The above sets up an initially paused hist trigger which is unpaused
and starts aggregating events when a given program is executed, and
which stops aggregating when the process exits and the hist trigger
is paused again.
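
The same pattern can be generalized to any program. The sketch below wraps the three writes in a hypothetical helper, hist_while_running, that takes the tracefs root as a parameter so it can be tried against a scratch directory first. It assumes that sched/sched_process_exec and sched/sched_process_exit are the events that fire when the program is executed and when it exits.

```shell
#!/bin/sh
# Hypothetical helper: aggregate netif_receive_skb only while a program runs.
# $1 = tracefs root, $2 = executable path, $3 = comm name of the process
hist_while_running() {
    ev=$1/events
    # Start paused, then let exec/exit of the program toggle aggregation.
    echo 'hist:keys=skbaddr.hex:vals=len:pause' \
        > "$ev/net/netif_receive_skb/trigger" &&
    echo "enable_hist:net:netif_receive_skb if filename==$2" \
        > "$ev/sched/sched_process_exec/trigger" &&
    echo "disable_hist:net:netif_receive_skb if comm==$3" \
        > "$ev/sched/sched_process_exit/trigger"
}

# Dry run against a scratch tree standing in for /sys/kernel/tracing.
root=$(mktemp -d)
mkdir -p "$root/events/net/netif_receive_skb" \
         "$root/events/sched/sched_process_exec" \
         "$root/events/sched/sched_process_exit"
hist_while_running "$root" /usr/bin/wget wget
cat "$root/events/sched/sched_process_exec/trigger"
rm -rf "$root"
```

On a real system the first argument would be the tracefs mount point, and the writes require root privileges.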
The examples below provide a more concrete illustration of the
concepts and typical usage patterns discussed above.

'special' event fields
----------------------

There are a number of 'special event fields' available for use as
keys or values in a hist trigger. These look like and behave as if
they were actual event fields, but aren't really part of the event's
field definition or format file. They are however available for any
event, and can be used anywhere an actual event field could be.
They are:
====================== ==== =======================================
common_timestamp       u64  timestamp (from ring buffer) associated
