starting the trigger with ':pause' appended. This allows you to
start the trigger only when you're ready to start collecting data
and not before. For example, you could start the trigger in a
paused state, then unpause it and do something you want to measure,
then pause the trigger again when done.
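As a rough sketch of that manual sequence, the helper below creates a
hist trigger in the paused state, resumes it around a single command,
and pauses it again afterwards. The function name hist_measure, its
arguments, and the TRACEFS override are hypothetical conveniences and
not part of the tracing interface; the sketch assumes the 'pause' and
'cont' hist trigger params and needs root on a real system:

```shell
# Hypothetical helper: keep a hist trigger active only while a command runs.
#   $1 = event path relative to events/, e.g. net/netif_receive_skb
#   $2 = hist trigger spec, e.g. hist:keys=stacktrace:vals=len
#   remaining args = the command to measure
# TRACEFS is an assumed override; it defaults to the path used in this
# document.
hist_measure() {
    trigger="${TRACEFS:-/sys/kernel/debug/tracing}/events/$1/trigger"
    spec=$2
    shift 2

    echo "${spec}:pause" >> "$trigger"   # create the trigger, paused
    echo "${spec}:cont" >> "$trigger"    # unpause just before the work
    "$@"                                 # the thing being measured
    status=$?
    echo "${spec}:pause" >> "$trigger"   # pause again when done
    return "$status"
}
```

A call might look like
hist_measure net/netif_receive_skb 'hist:keys=stacktrace:vals=len' sleep 5,
after which the event's 'hist' file holds only what was logged while
the command ran.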
Of course, doing this manually can be difficult and error-prone, but
it is possible to automatically start and stop a hist trigger based
on some condition, via the enable_hist and disable_hist triggers.
For example, suppose we wanted to take a look at the relative
weights in terms of skb length for each callpath that leads to a
netif_receive_skb event when downloading a decent-sized file using
wget.
First we set up an initially paused stacktrace trigger on the
netif_receive_skb event::
# echo 'hist:key=stacktrace:vals=len:pause' > \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
Next, we set up an 'enable_hist' trigger on the sched_process_exec
event, with an 'if filename==/usr/bin/wget' filter. The effect of
this new trigger is that it will 'unpause' the hist trigger we just
set up on netif_receive_skb if and only if it sees a
sched_process_exec event with a filename of '/usr/bin/wget'. When
that happens, all netif_receive_skb events are aggregated into a
hash table keyed on stacktrace::
# echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \
/sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
The aggregation continues until the netif_receive_skb hist trigger is
paused again, which is what the following disable_hist trigger does
by creating a similar setup on the sched_process_exit event, using
the filter 'comm==wget'::
# echo 'disable_hist:net:netif_receive_skb if comm==wget' > \
/sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
Whenever a process exits and its comm field matches the disable_hist
trigger's 'comm==wget' filter, the netif_receive_skb hist trigger is
disabled.
The overall effect is that netif_receive_skb events are aggregated
into the hash table only for the duration of the wget run. Executing
a wget command and then reading the 'hist' file displays the
histogram collected during that run::
$ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz
# cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
# trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused]
{ stacktrace:
__netif_receive_skb_core+0x46d/0x990
__netif_receive_skb+0x18/0x60
netif_receive_skb_internal+0x23/0x90
napi_gro_receive+0xc8/0x100
ieee80211_deliver_skb+0xd6/0x270 [mac80211]
ieee80211_rx_handlers+0xccf/0x22f0 [mac80211]
ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211]
ieee80211_rx+0x31d/0x900 [mac80211]
iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm]
iwl_rx_dispatch+0x8e/0xf0 [iwldvm]
iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi]
irq_thread_fn+0x20/0x50
irq_thread+0x11f/0x150
kthread+0xd2/0xf0
ret_from_fork+0x42/0x70
} hitcount: 85 len: 28884
{ stacktrace:
__netif_receive_skb_core+0x46d/0x990
__netif_receive_skb+0x18/0x60
netif_receive_skb_internal+0x23/0x90
napi_gro_complete+0xa4/0xe0
dev_gro_receive+0x23a/0x360
napi_gro_receive+0x30/0x100
ieee80211_deliver_skb+0xd6/0x270 [mac80211]
ieee80211_rx_handlers+0xccf/0x22f0 [mac80211]
ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211]
ieee80211_rx+0x31d/0x900 [mac80211]
iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm]
iwl_rx_dispatch+0x8e/0xf0 [iwldvm]
iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi]
irq_thread_fn+0x20/0x50
irq_thread+0x11f/0x150
kthread+0xd2/0xf0
} hitcount: 98 len: 664329
{ stacktrace:
__netif_receive_skb_core+0x46d/0x990
__netif_receive_skb+0x18/0x60
process_backlog+0xa8/0x150
net_rx_action+0x15d/0x340
__do_softirq+0x114/0x2c0
do_softirq_own_stack+0x1c/0x30
do_softirq+0x65/0x70
__local_bh_enable_ip+0xb5/0xc0
ip_finish_output+0x1f4/0x840
ip_output+0x6b/0xc0
ip_local_out_sk+0x31/0x40
ip_send_skb+0x1a/0x50
udp_send_skb+0x173/0x2a0
udp_sendmsg+0x2bf/0x9f0
inet_sendmsg+0x64/0xa0
sock_sendmsg+0x3d/0x50
} hitcount: 115 len: 13030
{ stacktrace:
__netif_receive_skb_core+0x46d/0x990
__netif_receive_skb+0x18/0x60
netif_receive_skb_internal+0x23/0x90
napi_gro_complete+0xa4/0xe0
napi_gro_flush+0x6d/0x90
iwl_pcie_irq_handler+0x92a/0x12f0 [iwlwifi]
irq_thread_fn+0x20/0x50
irq_thread+0x11f/0x150
kthread+0xd2/0xf0
ret_from_fork+0x42/0x70
} hitcount: 934 len: 5512212
Totals:
Hits: 1232
Entries: 4
Dropped: 0
The above shows all the netif_receive_skb callpaths and their total
lengths for the duration of the wget command.
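To turn those totals into the relative weights the example set out to
find, the 'hist' output can be post-processed. The following is only a
sketch: the function name hist_len_share is hypothetical, and it
assumes stacktrace-keyed output in which each entry closes with a
'} hitcount: N len: M' line, as shown above:

```shell
# Hypothetical helper: read stacktrace-keyed 'hist' output on stdin and
# print each entry's share of the summed 'len' value.
hist_len_share() {
    awk '
    BEGIN { n = 0 }
    # An entry ends with a line such as "} hitcount: 85 len: 28884".
    $1 == "}" {
        for (i = 2; i < NF; i++) {
            if ($i == "hitcount:") hits[n] = $(i + 1)
            if ($i == "len:") bytes[n] = $(i + 1)
        }
        total += bytes[n]
        n++
    }
    END {
        for (i = 0; i < n; i++)
            printf "entry %d: hitcount=%d len=%d share=%.1f%%\n",
                   i + 1, hits[i], bytes[i],
                   (total ? 100 * bytes[i] / total : 0)
    }'
}
```

It would be fed from the event's 'hist' file, for example
hist_len_share < /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist.
Entries are numbered in the order they appear in the file.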
The 'clear' hist trigger param can be used to clear the hash table.
Suppose we wanted to try another run of the previous example but
this time also wanted to see the complete list of events that went
into the histogram. In order to avoid having to set everything up
again, we can just clear the histogram first::
# echo 'hist:key=stacktrace:vals=len:clear' >> \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
Just to verify that it is in fact cleared, here's what we now see in
the hist file::
# cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
# trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused]
Totals:
Hits: 0
Entries: 0
Dropped: 0
Since we want to see the detailed list of every netif_receive_skb
event occurring during the new run, which are in fact the same
events being aggregated into the hash table, we add 'enable_event'
and 'disable_event' triggers to the triggering sched_process_exec
and sched_process_exit events, as follows::
# echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \
/sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
# echo 'disable_event:net:netif_receive_skb if comm==wget' > \
/sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
If you read the trigger files for the sched_process_exec and
sched_process_exit triggers, you should see two triggers for each:
one enabling/disabling the hist aggregation and the other
enabling/disabling the logging of events::
# cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget
enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget
# cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
disable_event:net:netif_receive_skb:unlimited if comm==wget
disable_hist:net:netif_receive_skb:unlimited if comm==wget
In other words, whenever either of the sched_process_exec or
sched_process_exit events is hit and matches 'wget', it enables or
disables both the histogram and the event log, and what you end up
with is a hash table and set of events just covering the specified
duration. Run the wget command again::
$ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz
Displaying the 'hist' file should show something similar to what you
saw in the last run, but this time you should also see the
individual events in the trace file::
# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 183/1426 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
wget-15108 [000] ..s1 31769.606929: netif_receive_skb: dev=lo skbaddr=ffff88009c353100 len=60
wget-15108 [000] ..s1 31769.606999: netif_receive_skb: dev=lo skbaddr=ffff88009c353200 len=60
dnsmasq-1382 [000] ..s1 31769.677652: netif_receive_skb: dev=lo skbaddr=ffff88009c352b00 len=130
dnsmasq-1382 [000] ..s1 31769.685917: netif_receive_skb: dev=lo skbaddr=ffff88009c352200 len=138
##### CPU 2 buffer started ####
irq/29-iwlwifi-559 [002] ..s. 31772.031529: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433d00 len=2948
irq/29-iwlwifi-559 [002] ..s. 31772.031572: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432200 len=1500
irq/29-iwlwifi-559 [002] ..s. 31772.032196: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433100 len=2948
irq/29-iwlwifi-559 [002] ..s. 31772.032761: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433000 len=2948
irq/29-iwlwifi-559 [002] ..s. 31772.033220: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432e00 len=1500
.
.
.
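Because the logged events are the same ones that were aggregated, the
trace output can be summarized independently as a cross-check. The
following is a minimal sketch, with a hypothetical function name
(trace_len_by_dev), that assumes the 'dev=... len=...' event format
shown above and sums 'len' per device:

```shell
# Hypothetical helper: read 'trace' output on stdin and print, per
# device, the number of netif_receive_skb events and their summed len.
trace_len_by_dev() {
    awk '
    /netif_receive_skb:/ {
        dev = ""; bytes = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^dev=/) dev = substr($i, 5)    # text after "dev="
            if ($i ~ /^len=/) bytes = substr($i, 5)  # text after "len="
        }
        count[dev]++
        total[dev] += bytes
    }
    END {
        for (dev in total)
            printf "%s events=%d len=%d\n", dev, count[dev], total[dev]
    }' | sort
}
```

It would be run as trace_len_by_dev < /sys/kernel/debug/tracing/trace.
Header lines and the 'CPU buffer started' markers are skipped because
they do not contain the event name.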
The following example demonstrates how multiple hist triggers can be
attached to a given event. This capability can be useful for
creating a set of different summaries derived from the same set of
events, or for comparing the effects of different filters, among
other things::
# echo 'hist:keys=skbaddr.hex:vals=len if len < 0' >> \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=skbaddr.hex:vals=len if len > 4096' >> \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=skbaddr.hex:vals=len if len == 256' >> \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=skbaddr.hex:vals=len' >> \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=len:vals=common_preempt_count' >> \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
The above set of commands creates four triggers differing only in
their filters, along with a completely different though fairly
nonsensical trigger. Note that in order to append multiple hist
triggers to the same file, you should use the '>>' operator to
append them ('>' will also add the new hist trigger, but will remove
any existing hist triggers beforehand).
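When several triggers differ only in their filters, the appends can be
generated in a loop. This is a sketch only; add_hist_filters and its
arguments are hypothetical, and it always uses '>>' so that earlier
triggers are kept:

```shell
# Hypothetical helper: append one hist trigger per filter expression.
#   $1 = path to the event's trigger file
#   $2 = hist spec shared by all of the triggers
#   remaining args = filter expressions; an empty string means no filter
add_hist_filters() {
    trigger_file=$1
    spec=$2
    shift 2
    for filter in "$@"; do
        if [ -n "$filter" ]; then
            echo "${spec} if ${filter}" >> "$trigger_file"
        else
            echo "${spec}" >> "$trigger_file"
        fi
    done
}
```

Called with the spec 'hist:keys=skbaddr.hex:vals=len' and the filters
'len < 0', 'len > 4096', 'len == 256' and '', it issues the same four
skbaddr writes as the commands above.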
Displaying the contents of the 'hist' file for the event shows the
contents of all five histograms::
# cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
# event histogram
#
# trigger info: hist:keys=len:vals=hitcount,common_preempt_count:sort=hitcount:size=2048 [active]
#
{ len: 176 } hitcount: 1 common_preempt_count: 0
{ len: 223 } hitcount: 1 common_preempt_count: 0
{ len: 4854 } hitcount: 1 common_preempt_count: 0
{ len: 395 } hitcount: 1 common_preempt_count: 0
{ len: 177 } hitcount: 1 common_preempt_count: 0
{ len: 446 } hitcount: 1 common_preempt_count: 0
{ len: 1601 } hitcount: 1 common_preempt_count: 0
.
.
.
{ len: 1280 } hitcount: 66 common_preempt_count: 0
{ len: 116 } hitcount: 81 common_preempt_count: 40
{ len: 708 } hitcount: 112 common_preempt_count: 0
{ len: 46 } hitcount: 221 common_preempt_count: 0
{ len: 1264 } hitcount: 458 common_preempt_count: 0
Totals:
Hits: 1428
Entries: 147
Dropped: 0
# event histogram
#
# trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active]
#
{ skbaddr: ffff8800baee5e00 } hitcount: 1 len: 130
{ skbaddr: ffff88005f3d5600 } hitcount: 1 len: 1280
{ skbaddr: ffff88005f3d4900 } hitcount: 1 len: 1280
{ skbaddr: ffff88009fed6300 } hitcount: 1 len: 115
{ skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 115
{ skbaddr: ffff88008cdb1900 } hitcount: 1 len: 46
{ skbaddr: ffff880064b5ef00 } hitcount: 1 len: 118
{ skbaddr: ffff880044e3c700 } hitcount: 1 len: 60
{ skbaddr: ffff880100065900 } hitcount: 1 len: 46
{ skbaddr: ffff8800d46bd500 } hitcount: 1 len: 116
{ skbaddr: ffff88005f3d5f00 } hitcount: 1 len: 1280
{ skbaddr: ffff880100064700 } hitcount: 1 len: 365
{ skbaddr: ffff8800badb6f00 } hitcount: 1 len: 60
.
.
.
{ skbaddr: ffff88009fe0be00 } hitcount: 27 len: 24677
{ skbaddr: ffff88009fe0a400 } hitcount: 27 len: 23052
{ skbaddr: ffff88009fe0b700 } hitcount: 31 len: 25589
{ skbaddr: ffff88009fe0b600 } hitcount: 32 len: 27326
{ skbaddr: ffff88006a462800 } hitcount: 68 len: 71678
{ skbaddr: ffff88006a463700 } hitcount: 70 len: 72678
{ skbaddr: ffff88006a462b00 } hitcount: 71 len: 77589
{ skbaddr: ffff88006a463600 } hitcount: 73 len: 71307
{ skbaddr: ffff88006a462200 } hitcount: 81 len: 81032
Totals:
Hits: 1451
Entries: 318
Dropped: 0
# event histogram
#
# trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len == 256 [active]
#
Totals:
Hits: 0
Entries: 0
Dropped: 0
# event histogram
#
# trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len > 4096 [active]
#
{ skbaddr: ffff88009fd2c300 } hitcount: 1 len: 7212
{ skbaddr: ffff8800d2bcce00 } hitcount: 1 len: 7212
{ skbaddr: ffff8800d2bcd700 } hitcount: 1 len: 7212
{ skbaddr: ffff8800d2bcda00 } hitcount: 1 len: 21492
{ skbaddr: ffff8800ae2e2d00 } hitcount: 1 len: 7212
{ skbaddr: ffff8800d2bcdb00 } hitcount: 1 len: 7212
{ skbaddr: ffff88006a4df500 } hitcount: 1 len: 4854
{ skbaddr: ffff88008ce47b00 } hitcount: 1 len: 18636
{ skbaddr: ffff8800ae2e2200 } hitcount: 1 len: 12924
{ skbaddr: ffff88005f3e1000 } hitcount: 1 len: 4356
{ skbaddr: ffff8800d2bcdc00 } hitcount: 2 len: 24420
{ skbaddr: ffff8800d2bcc200 } hitcount: 2 len: 12996
Totals:
Hits: 14
Entries: 12
Dropped: 0
# event histogram
#
# trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len < 0 [active]
#
Totals:
Hits: 0
Entries: 0
Dropped: 0
Named triggers can be used to have triggers share a common set of
histogram data. This capability is mostly useful for combining the
output of events generated by tracepoints contained inside inline
functions, but names can be used in a hist trigger on any event.
For example, these two triggers when hit will update the same 'len'
field in the shared 'foo' histogram data::
# echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \
/sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \
/sys/kernel/debug/tracing/events/net/netif_rx/trigger
You can see that they're updating common histogram data by reading
each event's hist files at the same time::
# cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist;
cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
# event histogram
#
# trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active]
#
{ skbaddr: ffff88000ad53500 } hitcount: 1 len: 46
{ skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76
{ skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468
{ skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46
{ skbaddr: ffff88009ff09100 } hitcount: 1 len: 52
{ skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168
{ skbaddr: ffff88006a54f400 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260
{ skbaddr: ffff880064505000 } hitcount: 1 len: 46
{ skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32
{ skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46
{ skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44
{ skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168
{ skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40
{ skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40
{ skbaddr: ffff880064505f00 } hitcount: 1 len: 174
{ skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160
{ skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76
{ skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32
{ skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988
{ skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46
{ skbaddr: ffff88002e37a200 } hitcount: 1 len: 44
{ skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676
{ skbaddr: ffff88000ad52600 } hitcount: 2 len: 107
{ skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92
{ skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142
{ skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220
{ skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92
{ skbaddr: ffff8800d021f800 } hitcount: 2 len: 92
{ skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675
{ skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138
{ skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138
{ skbaddr: ffff88002e37a100 } hitcount: 4 len: 184
{ skbaddr: ffff880064504400 } hitcount: 4 len: 184
{ skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184
{ skbaddr: ffff88000ad53700 } hitcount: 5 len: 230
{ skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196
{ skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276
{ skbaddr: ffff88006a54f900 } hitcount: 6 len: 276
Totals:
Hits: 81
Entries: 42
Dropped: 0
# event histogram
#
# trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active]
#
{ skbaddr: ffff88000ad53500 } hitcount: 1 len: 46
{ skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76
{ skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468
{ skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46
{ skbaddr: ffff88009ff09100 } hitcount: 1 len: 52
{ skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168
{ skbaddr: ffff88006a54f400 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260
{ skbaddr: ffff880064505000 } hitcount: 1 len: 46
{ skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32
{ skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46
{ skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44
{ skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168
{ skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40
{ skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40
{ skbaddr: ffff880064505f00 } hitcount: 1 len: 174
{ skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160
{ skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76
{ skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32
{ skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46
{ skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988
{ skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46
{ skbaddr: ffff88002e37a200 } hitcount: 1 len: 44
{ skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676
{ skbaddr: ffff88000ad52600 } hitcount: 2 len: 107
{ skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92
{ skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142
{ skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220
{ skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92
{ skbaddr: ffff8800d021f800 } hitcount: 2 len: 92
{ skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675
{ skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138
{ skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138
{ skbaddr: ffff88002e37a100 } hitcount: 4 len: 184
{ skbaddr: ffff880064504400 } hitcount: 4 len: 184
{ skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184
{ skbaddr: ffff88000ad53700 } hitcount: 5 len: 230
{ skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196
{ skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276
{ skbaddr: ffff88006a54f900 } hitcount: 6 len: 276
Totals:
Hits: 81
Entries: 42
Dropped: 0
And here's an example that shows how to combine histogram data from
any two events even if they don't share any 'compatible' fields
other than 'hitcount' and 'stacktrace'. These commands create a
couple of triggers named 'bar' using those fields::
# echo 'hist:name=bar:key=stacktrace:val=hitcount' > \
/sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
# echo 'hist:name=bar:key=stacktrace:val=hitcount' > \
/sys/kernel/debug/tracing/events/net/netif_rx/trigger
And displaying the output of either shows some interesting if
somewhat confusing output::
# cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
# cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
# event histogram
#
# trigger info: hist:name=bar:keys=stacktrace:vals=hitcount:sort=hitcount:size=2048 [active]
#
{ stacktrace:
_do_fork+0x18e/0x330
kernel_thread+0x29/0x30
kthreadd+0x154/0x1b0
ret_from_fork+0x3f/0x70
} hitcount: 1
{ stacktrace:
netif_rx_internal+0xb2/0xd0
netif_rx_ni+0x20/0x70
dev_loopback_xmit+0xaa/0xd0
ip_mc_output+0x126/0x240
ip_local_out_sk+0x31/0x40
igmp_send_report+0x1e9/0x230
igmp_timer_expire+0xe9/0x120
call_timer_fn+0x39/0xf0
run_timer_softirq+0x1e1/0x290
__do_softirq+0xfd/0x290
irq_exit+0x98/0xb0
smp_apic_timer_interrupt+0x4a/0x60
apic_timer_interrupt+0x6d/0x80
cpuidle_enter+0x17/0x20
call_cpuidle+0x3b/0x60
cpu_startup_entry+0x22d/0x310
} hitcount: 1
{ stacktrace:
netif_rx_internal+0xb2/0xd0
netif_rx_ni+0x20/0x70
dev_loopback_xmit+0xaa/0xd0
ip_mc_output+0x17f/0x240
ip_local_out_sk+0x31/0x40
ip_send_skb+0x1a/0x50
udp_send_skb+0x13e/0x270
udp_sendmsg+0x2bf/0x980
inet_sendmsg+0x67/0xa0
sock_sendmsg+0x38/0x50
SYSC_sendto+0xef/0x170
SyS_sendto+0xe/0x10
entry_SYSCALL_64_fastpath+0x12/0x6a
} hitcount: 2
{ stacktrace:
netif_rx_internal+0xb2/0xd0
netif_rx+0x1c/0x60
loopback_xmit+0x6c/0xb0
dev_hard_start_xmit+0x219/0x3a0
__dev_queue_xmit+0x415/0x4f0
dev_queue_xmit_sk+0x13/0x20
ip_finish_output2+0x237/0x340
ip_finish_output+0x113/0x1d0
ip_output+0x66/0xc0
ip_local_out_sk+0x31/0x40
ip_send_skb+0x1a/0x50
udp_send_skb+0x16d/0x270
udp_sendmsg+0x2bf/0x980
inet_sendmsg+0x67/0xa0
sock_sendmsg+0x38/0x50
___sys_sendmsg+0x14e/0x270
} hitcount: 76
{ stacktrace:
netif_rx_internal+0xb2/0xd0
netif_rx+0x1c/0x60
loopback_xmit+0x6c/0xb0
dev_hard_start_xmit+0x219/0x3a0
__dev_queue_xmit+0x415/0x4f0
dev_queue_xmit_sk+0x13/0x20
ip_finish_output2+0x237/0x340
ip_finish_output+0x113/0x1d0
ip_output+0x66/0xc0
ip_local_out_sk+0x31/0x40
ip_send_skb+0x1a/0x50
udp_send_skb+0x16d/0x270
udp_sendmsg+0x2bf/0x980
inet_sendmsg+0x67/0xa0
sock_sendmsg+0x38/0x50
___sys_sendmsg+0x269/0x270
} hitcount: 77
{ stacktrace:
netif_rx_internal+0xb2/0xd0
netif_rx+0x1c/0x60
loopback_xmit+0x6c/0xb0
dev_hard_start_xmit+0x219/0x3a0
__dev_queue_xmit+0x415/0x4f0
dev_queue_xmit_sk+0x13/0x20
ip_finish_output2+0x237/0x340
ip_finish_output+0x113/0x1d0
ip_output+0x66/0xc0
ip_local_out_sk+0x31/0x40
ip_send_skb+0x1a/0x50
udp_send_skb+0x16d/0x270
udp_sendmsg+0x2bf/0x980
inet_sendmsg+0x67/0xa0
sock_sendmsg+0x38/0x50
SYSC_sendto+0xef/0x170
} hitcount: 88
{ stacktrace:
_do_fork+0x18e/0x330
SyS_clone+0x19/0x20
entry_SYSCALL_64_fastpath+0x12/0x6a
} hitcount: 244
Totals:
Hits: 489
Entries: 7
Dropped: 0
2.2 Inter-event hist triggers
-----------------------------
Inter-event hist triggers are hist triggers that combine values from
one or more other events and create a histogram using that data. Data
from an inter-event histogram can in turn become the source for
further combined histograms, thus providing a chain of related
histograms, which is important for some applications.
The most important example of an inter-event quantity that can be used
in this manner is latency, which is simply a difference in timestamps
between two events. Although latency is the most important
inter-event quantity, note that because the support is completely
general across the trace event subsystem, any event field can be used
in an inter-event quantity.
An example of a histogram that combines data from other histograms
into a useful chain would be a 'wakeupswitch latency' histogram that
combines a 'wakeup latency' histogram and a 'switch latency'
histogram.
Normally, a hist trigger specification consists of a (possibly
compound) key along with one or more numeric values, which are
continually updated sums associated with that key. A histogram
specification in this case consists of individual key and value
specifications that refer to trace event fields associated with a
single event type.
The inter-event hist trigger extension allows fields from multiple
events to be referenced and combined into a multi-event histogram
specification. In support of this overall goal, a few enabling
features have been added to the hist trigger support:
- In order to compute an inter-event quantity, a value from one
event needs to be saved and then referenced from another event. This
requires the introduction of support for histogram 'variables'.
- The computation of inter-event quantities and their combination
require some minimal amount of support for applying simple
expressions to variables (+ and -).
- A histogram consisting of inter-event quantities isn't logically a
histogram on either event (so having the 'hist' file for either
event host the histogram output doesn't really make sense). To
address the idea that the histogram is associated with a
combination of events, support is added allowing the creation of
'synthetic' events that are events derived from other events.
These synthetic events are full-fledged events just like any other
and can be used as such, as for instance to create the
'combination' histograms mentioned previously.
- A set of 'actions' can be associated with histogram entries -
these can be used to generate the previously mentioned synthetic
events, but can also be used for other purposes, such as for
example saving context when a 'max' latency has been hit.
- Trace events don't have a 'timestamp' associated with them, but
there is an implicit timestamp saved along with an event in the
underlying ftrace ring buffer. This timestamp is now exposed as a
synthetic field named 'common_timestamp' which can be used in
histograms as if it were any other event field; it isn't an actual
field in the trace format but rather is a synthesized value that
nonetheless can be used as if it were an actual field. By default
it is in units of nanoseconds; appending '.usecs' to a
common_timestamp field changes the units to microseconds.
A note on inter-event timestamps: If common_timestamp is used in a
histogram, the trace buffer is automatically switched over to using
absolute timestamps and the "global" trace clock, in order to avoid
bogus timestamp differences with other clocks that aren't coherent
across CPUs. This can be overridden by specifying one of the other
trace clocks instead, using the "clock=XXX" hist trigger attribute,
where XXX is any of the clocks listed in the tracing/trace_clock
pseudo-file.
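As an illustration of the "clock=XXX" attribute, the sketch below
appends it to a hist trigger spec only if the requested clock appears
in the trace_clock pseudo-file. The function name hist_with_clock and
the TRACEFS override are hypothetical; the helper only prints the
resulting spec and does not install it:

```shell
# Hypothetical helper: print a hist trigger spec with a clock= attribute,
# but only if the clock is listed in the trace_clock pseudo-file.
#   $1 = hist trigger spec, $2 = clock name
hist_with_clock() {
    clocks="${TRACEFS:-/sys/kernel/debug/tracing}/trace_clock"
    # The currently selected clock is shown in brackets, e.g. "[local]";
    # strip the brackets so every listed name can be compared.
    for name in $(tr -d '[]' < "$clocks"); do
        if [ "$name" = "$2" ]; then
            echo "${1}:clock=${2}"
            return 0
        fi
    done
    echo "clock '$2' is not listed in $clocks" >&2
    return 1
}
```

The printed spec can then be appended to an event's trigger file in
the usual way.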
These features are described in more detail in the following sections.
2.2.1 Histogram Variables
-------------------------
Variables are simply named locations used for saving and retrieving
values between matching events. A 'matching' event is defined as an
event that has a matching key - if a variable is saved for a histogram
entry corresponding to that key, any subsequent event with a matching
key can access that variable.
A variable's value is normally available to any subsequent event until
it is set to something else by a subsequent event. The one exception
to that rule is that any variable used in an expression is essentially
'read-once' - once it's used by an expression in a subsequent event,
it's reset to its 'unset' state, which means it can't be used again
unless it's set again. This ensures not only that an event doesn't
use an uninitialized variable in a calculation, but also that the variable
is used only once and not for any unrelated subsequent match.
The basic syntax for saving a variable is to simply prefix a unique
variable name not corresponding to any keyword along with an '=' sign
to any event field.
Either keys or values can be saved and retrieved in this way. This
creates a variable named 'ts0' for a histogram entry with the key
'next_pid'::
# echo 'hist:keys=next_pid:vals=$ts0:ts0=common_timestamp ...' >> \
event/trigger
The ts0 variable can be accessed by any subsequent event having the
same pid as 'next_pid'.
Variable references are formed by prepending the variable name with
the '$' sign. Thus for example, the ts0 variable above would be
referenced as '$ts0' in expressions.
Because 'vals=' is used, the common_timestamp variable value above
will also be summed as a normal histogram value would (though for a
timestamp it makes little sense).
The below shows that a key value can also be saved in the same way::
# echo 'hist:timer_pid=common_pid:key=timer_pid ...' >> event/trigger
If a variable isn't a key variable or prefixed with 'vals=', the
associated event field will be saved in a variable but won't be summed
as a value::
# echo 'hist:keys=next_pid:ts1=common_timestamp ...' >> event/trigger
Multiple variables can be assigned at the same time. The below would
result in both ts0 and b being created as variables, with both
common_timestamp and field1 additionally being summed as values::
# echo 'hist:keys=pid:vals=$ts0,$b:ts0=common_timestamp,b=field1 ...' >> \
event/trigger
Note that variable assignments can appear either preceding or
following their use. The command below behaves identically to the
command above::
# echo 'hist:keys=pid:ts0=common_timestamp,b=field1:vals=$ts0,$b ...' >> \
event/trigger
Any number of variables not bound to a 'vals=' prefix can also be
assigned by simply separating them with colons. Below is the same
thing but without the values being summed in the histogram::
# echo 'hist:keys=pid:ts0=common_timestamp:b=field1 ...' >> event/trigger
Variables set as above can be referenced and used in expressions on
another event.
For example, here's how a latency can be calculated::
# echo 'hist:keys=pid,prio:ts0=common_timestamp ...' >> event1/trigger
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp-$ts0 ...' >> event2/trigger
In the first line above, the event's timestamp is saved into the
variable ts0. In the next line, ts0 is subtracted from the second
event's timestamp to produce the latency, which is then assigned into
yet another variable, 'wakeup_lat'. The hist trigger below in turn
makes use of the wakeup_lat variable to compute a combined latency
using the same key and variable from yet another event::
# echo 'hist:key=pid:wakeupswitch_lat=$wakeup_lat+$switchtime_lat ...' >> event3/trigger
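To make the event1/event2 placeholders more concrete, the sketch below
prints (but does not install) one possible pair of triggers. It is an
assumption of this example that sched_waking and sched_switch play the
roles of the two events, with the 'pid' and 'next_pid' fields
identifying the same task; the function name latency_trigger_pair is
hypothetical:

```shell
# Hypothetical sketch: print a pair of triggers that save a timestamp on
# one event and compute a latency from it on another.
#   $1 = name of the variable that receives the latency
# Each output line is "<event path> <trigger>".
latency_trigger_pair() {
    var=$1
    # Single quotes (or an escaped '$') keep the shell from expanding
    # $ts0 itself; it must reach the kernel literally.
    printf '%s\n' 'sched/sched_waking hist:keys=pid:ts0=common_timestamp.usecs'
    printf '%s\n' "sched/sched_switch hist:keys=next_pid:${var}=common_timestamp.usecs-\$ts0"
}
```

The quoting matters when writing such triggers by hand: inside double
quotes an unescaped $ts0 would be replaced by the shell, normally with
an empty string, before the kernel ever sees it.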
2.2.2 Synthetic Events
----------------------
Synthetic events are user-defined events generated from hist trigger
variables or fields associated with one or more other events. Their
purpose is to provide a mechanism for displaying data spanning
multiple events consistent with the existing and already familiar
usage for normal events.
To define a synthetic event, the user writes a simple specification
consisting of the name of the new event along with one or more
variables and their types, which can be any valid field type,
separated by semicolons, to the tracing/synthetic_events file.
For instance, the following creates a new event named 'wakeup_latency'
with 3 fields: lat, pid, and prio. Each of those fields is simply a
variable reference to a variable on another event::
# echo 'wakeup_latency \
u64 lat; \
pid_t pid; \
int prio' >> \
/sys/kernel/debug/tracing/synthetic_events
Reading the tracing/synthetic_events file lists all the currently
defined synthetic events, in this case the event defined above::
# cat /sys/kernel/debug/tracing/synthetic_events
wakeup_latency u64 lat; pid_t pid; int prio
An existing synthetic event definition can be removed by prepending
the command that defined it with a '!'::
# echo '!wakeup_latency u64 lat; pid_t pid; int prio' >> \
/sys/kernel/debug/tracing/synthetic_events
At this point, there isn't yet an actual 'wakeup_latency' event
instantiated in the event subsystem - for this to happen, a 'hist
trigger action' needs to be instantiated and bound to actual fields
and variables defined on other events (see Section 2.2.3 below on
how that is done using hist trigger 'onmatch' action). Once that is
done, the 'wakeup_latency' synthetic event instance is created.
A histogram can now be defined for the new synthetic event::
# echo 'hist:keys=pid,prio,lat.log2:sort=pid,lat' >> \
/sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
The new event is created under the tracing/events/synthetic/ directory
and looks and behaves just like any other event::
# ls /sys/kernel/debug/tracing/events/synthetic/wakeup_latency
enable filter format hist id trigger
Like any other event, once a histogram is enabled for the event, the
output can be displayed by reading the event's 'hist' file.
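The definition format described in this section can be sketched as a
small shell helper that assembles the string written to the
tracing/synthetic_events file. The helper name 'synth_def' is
hypothetical; the script only prints the definition, since actually
writing it requires root and a mounted tracefs:

```shell
#!/bin/sh
# synth_def is a hypothetical helper: it joins an event name and a list
# of 'type name' field pairs into the 'name type field; type field' form.
synth_def() {
    def=$1
    shift
    sep=' '
    for field in "$@"; do
        def="$def$sep$field"
        sep='; '
    done
    printf '%s\n' "$def"
}

def=$(synth_def wakeup_latency 'u64 lat' 'pid_t pid' 'int prio')
printf '%s\n' "$def"

# To apply the definition (assumes root and the tracefs path used in
# this document), one would append it to the synthetic_events file:
#   echo "$def" >> /sys/kernel/debug/tracing/synthetic_events
```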
2.2.3 Hist trigger 'handlers' and 'actions'
-------------------------------------------
A hist trigger 'action' is a function that's executed (in most cases
conditionally) whenever a histogram entry is added or updated.
When a histogram entry is added or updated, a hist trigger 'handler'
is what decides whether the corresponding action is actually invoked
or not.
Hist trigger handlers and actions are paired together in the general
form:
<handler>.<action>
To specify a handler.action pair for a given event, simply specify
that handler.action pair between colons in the hist trigger
specification.
In theory, any handler can be combined with any action, but in
practice, not every handler.action combination is currently supported;
if a given handler.action combination isn't supported, the hist
trigger will fail with -EINVAL.
The default 'handler.action' if none is explicitly specified is as it
always has been, to simply update the set of values associated with an
entry. Some applications, however, may want to perform additional
actions at that point, such as generating another event, or comparing
and saving a maximum.
The supported handlers and actions are listed below, and each is
described in more detail in the following paragraphs, in the context
of descriptions of some common and useful handler.action combinations.
The available handlers are:
- onmatch(matching.event) - invoke action on any addition or update
- onmax(var) - invoke action if var exceeds current max
- onchange(var) - invoke action if var changes
The available actions are:
- trace(<synthetic_event_name>,param list) - generate synthetic event
- save(field,...) - save current event fields
- snapshot() - snapshot the trace buffer
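To illustrate how a handler and an action from the lists above are
combined and placed in a trigger, the sketch below builds one
'<handler>.<action>' pair and embeds it between colons. The helper
name 'handler_action' is hypothetical; the event, variable and field
names are borrowed from the wakeup_latency example in this document.
Nothing is written to the tracing files:

```shell
#!/bin/sh
# handler_action is a hypothetical helper that forms '<handler>.<action>'.
handler_action() {
    printf '%s.%s\n' "$1" "$2"
}

pair=$(handler_action 'onmatch(sched.sched_waking)' \
                      'trace(wakeup_latency,$wakeup_lat,$saved_pid,next_prio)')

# The pair sits between colons like any other hist trigger clause.
# \$ts0 is escaped so the shell leaves the variable reference intact.
trigger="hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-\$ts0:$pair"
printf '%s\n' "$trigger"
```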
The following commonly-used handler.action pairs are available:
- onmatch(matching.event).trace(<synthetic_event_name>,param list)
The 'onmatch(matching.event).trace(<synthetic_event_name>,param
list)' hist trigger action is invoked whenever an event matches
and the histogram entry would be added or updated. It causes the
named synthetic event to be generated with the values given in the
'param list'. The result is the generation of a synthetic event
that consists of the values contained in those variables at the
time the invoking event was hit. For example, if the synthetic
event name is 'wakeup_latency', a wakeup_latency event is
generated using onmatch(event).trace(wakeup_latency,arg1,arg2).
There is also an equivalent alternative form available for
generating synthetic events. In this form, the synthetic event
name is used as if it were a function name. For example, using
the 'wakeup_latency' synthetic event name again, the
wakeup_latency event would be generated by invoking it as if it
were a function call, with the event field values passed in as
arguments: onmatch(event).wakeup_latency(arg1,arg2). The syntax
for this form is:
onmatch(matching.event).<synthetic_event_name>(param list)
In either case, the 'param list' consists of one or more
parameters which may be either variables or fields defined on
either the 'matching.event' or the target event. The variables or
fields specified in the param list may be either fully-qualified
or unqualified. If a variable is specified as unqualified, it
must be unique between the two events. A field name used as a
param can be unqualified if it refers to the target event, but
must be fully qualified if it refers to the matching event. A
fully-qualified name is of the form 'system.event_name.$var_name'
or 'system.event_name.field'.
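The two fully-qualified forms can be sketched with a pair of shell
helpers. The helper names 'fq_var' and 'fq_field' are hypothetical;
the example assumes sched_waking is the matching event and
sched_switch is the target event, as in the wakeup_latency example
in this document:

```shell
#!/bin/sh
# Hypothetical helpers for the two fully-qualified name forms.
# fq_var:   system.event_name.$var_name
# fq_field: system.event_name.field
fq_var()   { printf '%s.%s.$%s\n' "$1" "$2" "$3"; }
fq_field() { printf '%s.%s.%s\n' "$1" "$2" "$3"; }

# A variable defined on the matching event, written fully qualified.
var=$(fq_var sched sched_waking saved_pid)

# A field on the matching event must be fully qualified.
field=$(fq_field sched sched_waking pid)

# A field on the target event (next_prio here) may stay unqualified.
params="$var,next_prio"
printf '%s\n' "$var" "$field" "$params"
```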
The 'matching.event' specification is simply the fully qualified
event name of the event that matches the target event for the
onmatch() functionality, in the form 'system.event_name'. The
histogram keys of both events are compared to determine whether the
events match. If multiple histogram keys are used, they must all
match, in the specified order.
Finally, the number and type of variables/fields in the 'param
list' must match the number and types of the fields in the
synthetic event being generated.
As an example, the below defines a simple synthetic event and uses
a variable defined on the sched_wakeup_new event as a parameter
when invoking the synthetic event. Here we define the synthetic
event::
# echo 'wakeup_new_test pid_t pid' >> \
/sys/kernel/debug/tracing/synthetic_events
# cat /sys/kernel/debug/tracing/synthetic_events
wakeup_new_test pid_t pid
The following hist trigger both defines the missing testpid
variable and specifies an onmatch() action that generates a
wakeup_new_test synthetic event whenever a sched_wakeup_new event
occurs, which because of the 'if comm == "cyclictest"' filter only
happens when the executable is cyclictest::
# echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
wakeup_new_test($testpid) if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
Or, equivalently, using the 'trace' keyword syntax::
# echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
Creating and displaying a histogram based on those events is now
just a matter of using the fields and new synthetic event in the
tracing/events/synthetic directory, as usual::
# echo 'hist:keys=pid:sort=pid' >> \
/sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/trigger
Running 'cyclictest' should cause sched_wakeup_new events to generate
wakeup_new_test synthetic events, which should result in histogram
output in the wakeup_new_test event's hist file::
# cat /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/hist
A more typical usage would be to use two events to calculate a
latency. The following example uses a set of hist triggers to
produce a 'wakeup_latency' histogram.
First, we define a 'wakeup_latency' synthetic event::
# echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> \
/sys/kernel/debug/tracing/synthetic_events
Next, we specify that whenever we see a sched_waking event for a
cyclictest thread, save the timestamp in a 'ts0' variable::
# echo 'hist:keys=$saved_pid:saved_pid=pid:ts0=common_timestamp.usecs \
if comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_waking/trigger
Then, when the corresponding thread is actually scheduled onto the
CPU by a sched_switch event (saved_pid matches next_pid), calculate
the latency and use that along with another variable and an event field
to generate a wakeup_latency synthetic event::
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:\
onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,\
$saved_pid,next_prio) if next_comm=="cyclictest"' >> \
/sys/kernel/debug/tracing/events/sched/sched_switch/trigger
We also need to create a histogram on the wakeup_latency synthetic
event in order to aggregate the generated synthetic event data::
# echo 'hist:keys=pid,prio,lat:sort=pid,lat' >> \
/sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
Finally, once we've run cyclictest to actually generate some
events, we can see the output by looking at the wakeup_latency
synthetic event's hist file::
# cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist
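The four setup steps above can be collected into one script. The
version below is a dry-run sketch: it prints the commands rather than
executing them, so it can be inspected without root. The 'emit'
helper and the TRACING and COMM variables are hypothetical names
introduced here; the trigger strings are the ones shown above,
written on a single line:

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of writing to tracefs.
# TRACING is the tracefs path used throughout this document; COMM is
# the task name to filter on. Both names are assumptions of this sketch.
TRACING=/sys/kernel/debug/tracing
COMM=cyclictest

# emit is a hypothetical helper.
# $1 = file relative to $TRACING, $2 = text to append to that file
emit() {
    printf "echo '%s' >> %s/%s\n" "$2" "$TRACING" "$1"
}

# Save the wakeup timestamp in ts0, keyed on the woken pid.
waking_trigger='hist:keys=$saved_pid:saved_pid=pid:ts0=common_timestamp.usecs if comm=="'"$COMM"'"'

# On the matching sched_switch, compute the latency and generate
# the wakeup_latency synthetic event.
switch_trigger='hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,$saved_pid,next_prio) if next_comm=="'"$COMM"'"'

emit synthetic_events 'wakeup_latency u64 lat; pid_t pid; int prio'
emit events/sched/sched_waking/trigger "$waking_trigger"
emit events/sched/sched_switch/trigger "$switch_trigger"
emit events/synthetic/wakeup_latency/trigger 'hist:keys=pid,prio,lat:sort=pid,lat'
```

Changing COMM is enough to measure a different task, since both
filters are built from it.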
- onmax(var).save(field,...)
The 'onmax(var).save(field,...)' hist trigger action is invoked
whenever the value of 'var' associated with a histogram entry
exceeds the current maximum contained in that variable.
The end result is that the trace event fields specified as the
onmax.save() params will be saved if 'var' exceeds the current