What’s New with ftrace? – Steven Rostedt, Vmware

This is an update of what changed in ftrace since Steven’s presentation about its state in v3.18.

What ftrace already did: function tracing, function graph tracing, snapshots, trace events, triggers, and debugging. It uses the hooks installed by gcc’s profiling. Since gcc 4.6, gcc has the -mfentry modifier to call fentry() (instead of mcount()) before the stack frame is set up, so that you get access to the parameters. This is used by live kernel patching to replace a function. The mcount/fentry call adds about 13% of overhead, so they are replaced by NOPs so the overhead becomes unmeasurable. When tracing is enabled for a function (set_ftrace_filter), the NOPs are replaced by the original jump.

In addition to filtering on function name, you can filter on PID or children. Triggers define additional things that get done when a trace point is hit: saving a stack trace, save the trace buffer (= snapshot), turn trace off or on. Also profiling info can be gathered: hit counts, stack usage. max depth limits how deep you trace (starting from the switch from userspace, i.e. syscall or page fault or interrupt).

Instead of tracing on functions, you can also trace on trace events (trace events are added by ftrace to each trace point; the trace point is defined by the source code and gets used by ftrace, perf, …).

For debugging: trace_printk prints to the trace buffer instead of the log buffer, requires no locking etc. A sysctl allows to dump the trace buffer on panic.

Since 4.0: when two different tracers are attached to different functions, a separate trampoline is instantiated for each. This avoid iterating over the tracers which is quite expensive. Before, the trace function would be called directly but only if only a single tracer is in use (globally). Now, it can still be called directly as long as it’s on a different function/tracepoint. To be able to do this, the ftrace infra needs to allocate space for the new trampoline and make it executable, so it’s a bit complicated.

Also added in 4.0 is NOT in the trace event filter logic. It existed for functions but not for events.

Also added in 4.0, the very dangerous tp_printk kernel command line option. It will use printk() instead of tracing to a buffer.

[missed some things that went to fast]

Since 4.4 it is possible to filter triggers on CPU# or PID.

Since 4.4 it is possible to set tracing options before the tracer is enabled, so you can do e.g. filtering immediately.

Since 4.4 it is possible to filter functions by module (or not module) with :mod:modulename. Including *:mod:!* to trace all functions not in modules. Since 4.14 it is also possible to do this before the module is loaded.

Since 4.4 there is a separate filter file for PIDs, which overcomes the limit on the string size on the generic expression filter.

Since 4.9 the event PID filter is extended to also trace children of that PID, so you don’t need to add each child as it appears. Since 4.14 also for functions.,

Since 4.14, the filters have full glob support, also * in the middle and ? and [].

Since 4.14, a hardware latency detector is added to detect SMIs/NMIs, by running loops with interrupts disabled. You give it a width and an interval (called window).

What’s coming: module init functions, …

Steven ran out of time so not everything is covered. Check the slides.














Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s