Real Safe Times in the Jailhouse Hypervisor – Jan Kiszka, Siemens

Jailhouse is a hypervisor that allows running safety-critical tasks on a multicore system in parallel with Linux. Jailhouse tries to be simple rather than feature-complete, and concentrates on controlling access to resources (memory areas, CPUs, interrupts) rather than really virtualizing them, i.e. each resource is assigned to exactly one guest. Isolation is enforced by the hypervisor; it does not rely on cooperation between the guests.
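To make the "one guest per resource" rule concrete, here is a minimal sketch of a check that no CPU, interrupt or memory range is claimed by two guests. The names `MemRegion`, `CellSpec` and `find_conflicts` are made up for illustration and are not Jailhouse's configuration format or API. The sketch also ignores regions that are deliberately shared for communication between guests.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MemRegion:
    """A physical memory range [start, start + size)."""
    start: int
    size: int

    def overlaps(self, other: "MemRegion") -> bool:
        return (self.start < other.start + other.size
                and other.start < self.start + self.size)


@dataclass
class CellSpec:
    """Hypothetical description of the resources one guest may touch."""
    name: str
    cpus: set = field(default_factory=set)
    irqs: set = field(default_factory=set)
    memory: list = field(default_factory=list)


def find_conflicts(cells):
    """Return descriptions of every resource claimed by two guests."""
    conflicts = []
    for i, a in enumerate(cells):
        for b in cells[i + 1:]:
            for cpu in sorted(a.cpus & b.cpus):
                conflicts.append(f"cpu {cpu}: {a.name} and {b.name}")
            for irq in sorted(a.irqs & b.irqs):
                conflicts.append(f"irq {irq}: {a.name} and {b.name}")
            for ra in a.memory:
                for rb in b.memory:
                    if ra.overlaps(rb):
                        conflicts.append(
                            f"memory {ra.start:#x}: {a.name} and {b.name}")
    return conflicts


# Illustrative values only: three CPUs for Linux, one for the RT guest.
linux = CellSpec("linux", cpus={0, 1, 2}, irqs={1, 4},
                 memory=[MemRegion(0x0, 0x3000_0000)])
rt = CellSpec("rt", cpus={3}, irqs={9},
              memory=[MemRegion(0x3000_0000, 0x100_0000)])
# A misconfigured guest that reuses a CPU, an IRQ and part of the memory.
bad = CellSpec("bad", cpus={3}, irqs={4},
               memory=[MemRegion(0x2FFF_F000, 0x2000)])

print(find_conflicts([linux, rt]))       # clean partitioning
print(find_conflicts([linux, rt, bad]))  # lists each double claim
```

A real hypervisor enforces this at run time with hardware support rather than with a one-off check, but the invariant it protects is the same.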

Jailhouse partitions an already booted system: once Linux is running, you load a module and start a daemon. It offloads work to Linux, e.g. booting, configuration (no need to do this in the bootloader), control and monitoring. The disadvantage of this approach is that it is not easy to boot Linux in a cell (at least on x86, because there is a lot of BIOS stuff going on there that Jailhouse doesn't support), and that your boot time becomes longer (Linux has to boot before you can start your RT cell).

The partitions are called cells in Jailhouse. There is one root cell (Linux), and one or more other cells (the real-time, safety-critical part). Within a cell, anything goes, but it's not possible to access resources of another cell, or to do things with global effect (e.g. reset). This is symmetrical, so if the real-time cell crashes, it doesn't bring down Linux. The root cell should not be able to misbehave and interfere with the RT cell either, so any cell can lock things down (e.g. the possibility to shut it down) so that even Linux cannot modify it. Jailhouse also provides the means to validate the cells, i.e. to check that what is running now is the same thing that you tested before.
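The lock-down and validation ideas can be illustrated with a toy model. This is only a sketch: `Cell`, `ToyHypervisor` and `CellLocked` are hypothetical names, and comparing a SHA-256 digest is an assumption made for the example, since the talk did not say how Jailhouse validates cells.

```python
import hashlib


class CellLocked(Exception):
    """Raised when a management request is refused because of a lock."""


class Cell:
    def __init__(self, name, image: bytes):
        self.name = name
        self.image = image
        self.running = False
        self.locked = False  # set by the cell itself, not by the root cell

    def digest(self) -> str:
        """Fingerprint of the image, used here to stand in for validation."""
        return hashlib.sha256(self.image).hexdigest()


class ToyHypervisor:
    def __init__(self):
        self.cells = {}

    def start(self, cell, expected_digest):
        # Refuse to run anything other than the image that was tested.
        if cell.digest() != expected_digest:
            raise ValueError(f"{cell.name}: image differs from the tested one")
        cell.running = True
        self.cells[cell.name] = cell

    def shutdown(self, name):
        # Even a request coming from the root cell is refused while locked.
        cell = self.cells[name]
        if cell.locked:
            raise CellLocked(f"{name} has locked itself against shutdown")
        cell.running = False


image = b"rt-app-v1"                         # placeholder for the RT binary
tested = hashlib.sha256(image).hexdigest()   # recorded when it was tested

hv = ToyHypervisor()
rt = Cell("rt", image)
hv.start(rt, tested)

rt.locked = True
try:
    hv.shutdown("rt")
except CellLocked as err:
    print("refused:", err)
```

The point of the model is that the decision to refuse sits in the hypervisor, so a misbehaving root cell cannot bypass it.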

Jailhouse obviously can’t avoid hardware errors or errors in the RT software. It can however capture and forward hardware error reports.

Jailhouse is currently going through a certification process (a first review by TÜV has been completed).

Jailhouse initially focused on x86 with VT-x and VT-d. It supports direct interrupt delivery, so interrupts don't have to go through the hypervisor first. Basically, there is no runtime overhead except the latency added by the IOMMU. Of course, communication between cells still has some overhead. AMD is in the process of adding AMD64 support. The ARMv7 port required almost no changes in the Jailhouse core, but there's still a lot of tweaking to be done. There are no plans for QorIQ.

The application on the RT cell will have to be adapted to deal with the fact that it doesn't have access to all the devices it would normally have. For communication between cells, there are IPC interrupts and shared memory. Implementing a virtual PCI device is being considered.
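As a rough illustration of how shared memory and an IPC interrupt fit together, the sketch below puts a single-producer, single-consumer ring in a flat buffer and "rings a doorbell" after each write. The layout, the `SharedRing` class and the doorbell callback are all assumptions made for this example, not Jailhouse's actual protocol. Real shared memory between CPUs would also need memory barriers, which plain Python does not model.

```python
import struct


class SharedRing:
    """Single-producer, single-consumer ring of fixed-size slots."""
    HEADER = struct.Struct("<II")  # head (next write), tail (next read)

    def __init__(self, buf, slot_size=16):
        self.buf = buf
        self.slot_size = slot_size  # 1 length byte + payload bytes
        self.slots = (len(buf) - self.HEADER.size) // slot_size

    def send(self, payload: bytes, ring_doorbell) -> bool:
        if len(payload) > self.slot_size - 1:
            raise ValueError("payload too large for one slot")
        head, tail = self.HEADER.unpack_from(self.buf, 0)
        if (head + 1) % self.slots == tail:
            return False  # ring is full, nothing written
        off = self.HEADER.size + head * self.slot_size
        self.buf[off] = len(payload)
        self.buf[off + 1:off + 1 + len(payload)] = payload
        # Publish the slot only after its content is in place.
        self.HEADER.pack_into(self.buf, 0, (head + 1) % self.slots, tail)
        ring_doorbell()  # stands in for raising the IPC interrupt
        return True

    def receive(self):
        head, tail = self.HEADER.unpack_from(self.buf, 0)
        if head == tail:
            return None  # ring is empty
        off = self.HEADER.size + tail * self.slot_size
        length = self.buf[off]
        payload = bytes(self.buf[off + 1:off + 1 + length])
        self.HEADER.pack_into(self.buf, 0, head, (tail + 1) % self.slots)
        return payload


shared = bytearray(8 + 4 * 16)   # stands in for the shared memory region
linux_side = SharedRing(shared)  # both sides map the same bytes
rt_side = SharedRing(shared)
pending = []                     # stands in for the interrupt line

linux_side.send(b"start", lambda: pending.append("irq"))
if pending:                      # the RT side reacts to the "interrupt"
    pending.pop()
    print(rt_side.receive())
```

One slot is always left empty so that a full ring can be told apart from an empty one; with four slots, at most three messages can be in flight.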

Jailhouse has a skeleton "inmate" for an OS-less application running on an RT cell. For more complex things, you can use an existing RTOS. In that case, you have to remove the platform bringup stuff, replace anything based on legacy BIOS, PIC or PIT, remap the timers to the ones that are made available by Jailhouse, and add support for inter-cell I/O. A reference port has been done for RTEMS.

Debugging something that runs in a cell is a bit tricky. A hardware debugger may be hard to get, and emulation can be slow; it may even be impossible to emulate Jailhouse. Therefore, KVM was extended to make it emulate things in the same way that Jailhouse makes them available. This way, the KVM debugger can be used. Of course, in this environment there are no RT guarantees (interrupts are emulated!).
