Measuring the Impacts of the Preempt-RT Patch – Maxime Chevallier, Smile

Maxime worked on several projects involving Preempt-RT:

  • Simulation on a PC of a real-time system, which needed real-time response on a network interface.
  • Test bench interfacing with real-time software that needs to react within 1 second but has a lot to do in that time.
  • Embedded telematics board: it must never lose an incoming message. Since the customer could add CPU load, the RT patch was needed to make sure message handling has priority.
  • Medical image processing: each frame needs to be processed before the next one arrives.

Real-time means deterministic behaviour: bounded latencies, absolute priorities for tasks (SCHED_FIFO, SCHED_RR and SCHED_DEADLINE), and handling of complex cases like priority inversion (rt-mutex with priority inheritance), starvation, …. Most of this is already in upstream Linux. What Preempt-RT still adds is full kernel preemption, plus various optimisations for the worst-case scenario instead of the common-case scenario.
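As an illustration of the absolute priorities mentioned above, here is a minimal sketch (not from the talk) of moving the current process into SCHED_FIFO with Python's os module. The function name try_set_fifo and the priority value 10 are made up for the example; the call normally fails without root or CAP_SYS_NICE, so the sketch just reports that.

```python
import os


def try_set_fifo(priority: int) -> bool:
    """Try to switch the calling process to SCHED_FIFO at `priority`.

    Returns True on success and False when the kernel refuses, which is the
    usual outcome without root or CAP_SYS_NICE.
    """
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    if not lowest <= priority <= highest:
        raise ValueError(f"priority must be in [{lowest}, {highest}]")
    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically PermissionError (EPERM) for unprivileged processes.
        return False
    return True


if __name__ == "__main__":
    # The priority 10 is an arbitrary value chosen for illustration.
    print("SCHED_FIFO active:", try_set_fifo(10))
```

A SCHED_FIFO task runs until it blocks or a higher-priority task becomes runnable, so a busy loop at this priority can starve everything else on its core.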

Full kernel preemption consists of forcing threaded interrupts (so interrupts get priorities as well) and making locks sleepable: a spinlock normally doesn’t allow anything else to run on the same CPU, while a sleepable lock yields when it can’t take the lock. Nothing else changes, so the whole normal Linux OS is still there. Only the non-RT tasks have to live with what the RT tasks leave over.

To analyse the effect of the Preempt-RT patch, use tools like vmstat, mpstat and pidstat. E.g. mpstat shows how many interrupts each core handles. However, take care when comparing results, because the same counters mean different things with and without the patch. For example, without threaded interrupts, interrupts are not counted as context switches in these tools, while with threaded interrupts each interrupt gives 2 involuntary context switches (one to the interrupt thread and one back).
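To cross-check what the tools report, the kernel's per-task counters can be read directly from /proc/<pid>/status. The sketch below is not from the talk; the helper names parse_ctxt_switches and read_ctxt_switches are invented for the example. Sampling a task before and after a test run and comparing the deltas on a stock and a Preempt-RT kernel makes the extra involuntary switches visible.

```python
from pathlib import Path

# Field names as they appear in /proc/<pid>/status.
FIELDS = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def parse_ctxt_switches(status_text: str) -> dict:
    """Extract the context-switch counters from /proc/<pid>/status text."""
    counters = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in FIELDS:
            counters[key] = int(value.strip())
    return counters


def read_ctxt_switches(pid="self") -> dict:
    """Read the counters of a live process (Linux only)."""
    return parse_ctxt_switches(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    print(read_ctxt_switches())
```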

As a benchmark, use stress-ng with a fixed number of operations and measure the execution time. A CPU-only workload shows no difference, but the “fault” stressor (which triggers page faults) is significantly slower with Preempt-RT. So you need to test this for your own workload. Note that stress-ng contains cyclictest as well.
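A minimal sketch of such a fixed-work benchmark, assuming stress-ng is installed and using its `--<stressor> N` and `--<stressor>-ops N` options. The helper names and the operation count of 100000 are assumptions made for the example, not values from the talk. Run the same script on both kernels and compare the times.

```python
import subprocess
import time


def stress_ng_argv(stressor: str, ops: int, workers: int = 1) -> list:
    """Build a stress-ng command line that stops after `ops` bogo operations."""
    return ["stress-ng", f"--{stressor}", str(workers),
            f"--{stressor}-ops", str(ops)]


def time_command(argv: list) -> float:
    """Run `argv` to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


if __name__ == "__main__":
    # 100_000 operations is an arbitrary amount of work for illustration.
    for stressor in ("cpu", "fault"):
        elapsed = time_command(stress_ng_argv(stressor, ops=100_000))
        print(f"{stressor}: {elapsed:.2f} s")
```

Fixing the amount of work rather than the duration is what makes the wall-clock time comparable between the two kernels.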

In addition to applying Preempt-RT, more is needed to improve predictability:

  • Disable deep-sleep CPU idle states (this increases power consumption). Tweak them with cpuidle in /sys/devices/system/cpu/cpuX/cpuidle/stateX or in the BIOS.
  • DVFS: use a fixed frequency.
  • Disable hyperthreading
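For the idle states in the list above, a minimal sketch (not from the talk) that reads the cpuidle sysfs files `name`, `latency` (exit latency in microseconds) and `disable` for one CPU and picks the states deeper than a chosen latency budget. The function names and the 10 µs budget are assumptions for illustration. Disabling a state means writing 1 to its `disable` file as root, which the sketch deliberately leaves out.

```python
from pathlib import Path


def list_idle_states(cpu_dir) -> list:
    """Return (state, name, exit latency in µs, disabled) tuples for one CPU."""
    state_dirs = sorted(Path(cpu_dir, "cpuidle").glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        latency_us = int((state / "latency").read_text())
        disabled = (state / "disable").read_text().strip() != "0"
        states.append((state.name, name, latency_us, disabled))
    return states


def states_to_disable(states: list, max_latency_us: int) -> list:
    """Names of still-enabled states whose exit latency exceeds the budget."""
    return [state for state, _name, latency_us, disabled in states
            if latency_us > max_latency_us and not disabled]


if __name__ == "__main__":
    cpu0 = "/sys/devices/system/cpu/cpu0"
    states = list_idle_states(cpu0)
    for entry in states:
        print(entry)
    # 10 µs is an arbitrary latency budget chosen for illustration.
    print("candidates to disable:", states_to_disable(states, 10))
```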

Clearly, you need to know the system. For example, DMA can cause latencies on the SoC bus. SMIs are not maskable (they do thermal management…), so measure how long they take. Hardware resource sharing matters too (e.g. a SIMD unit shared between different cores).
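One rough way to see stalls that software cannot mask is to spin on a clock and record the largest gap between two consecutive reads. The sketch below is only an illustration of that idea and is not from the talk: the function name max_gap_ns is made up, the Python interpreter adds jitter of its own, and the gap includes every source of delay (preemption, interrupts, SMIs), so it gives an upper bound rather than a pure SMI duration.

```python
import time


def max_gap_ns(duration_s: float) -> int:
    """Spin on the monotonic clock and return the largest gap between reads."""
    deadline = time.monotonic_ns() + int(duration_s * 1e9)
    previous = time.monotonic_ns()
    worst = 0
    while previous < deadline:
        now = time.monotonic_ns()
        # A large jump means this loop was not running for that long.
        worst = max(worst, now - previous)
        previous = now
    return worst


if __name__ == "__main__":
    # One second of sampling is an arbitrary choice for illustration.
    print(f"largest gap: {max_gap_ns(1.0) / 1000:.1f} µs")
```

Running it at a real-time priority on an otherwise idle, isolated core narrows the possible causes of any gap that remains.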
