[Sean’s a member of the OpenEmbedded board, and former member of Yocto’s advisory board.]
This talk is for people who want to get started with video on i.MX6, and hopefully helps them avoid some of the issues Sean's team encountered during a service engagement for a customer in 2013. The project was about replacing an FPGA with a portable design on i.MX6: process and display video from two independent sensors, with output to either HDMI or an 800×600 OLED. One sensor was 1280×1024@60fps, the other 640×480@30fps. Output had to be 30fps with <100ms latency. In addition to just pushing the video, some algorithmic processing was going to be implemented by the customer, so the video pipeline should leave as much CPU free as possible.
Hardware was not going to be available at the start of the project, so development started on a SabreLite board with some generic sensors. Lesson learned: make sure you have enough spare sensors, because they'll die.
The i.MX6 has a 3D GPU and two IPUs for video processing, all with existing software support. They expected to be able to use the IPU for format conversion and to transfer the data frames. It turned out that the selected sensors produced Bayer formats that are not supported by the IPU (although the TRM says they are), so the IPU could only be used as a DMA engine. It did work great for resizing, but IPU resources need to be allocated carefully to avoid congestion.
The Cortex-A9 has a NEON SIMD unit. It is tightly coupled to the core, so running work on NEON stalls the rest of the processor; they had expected to be able to do more computing in parallel.
The sensors used a well-defined interface (CSI), so capturing frames was expected to be straightforward. However, since the format was not supported by the IPU, only raw mode could be used. Getting acceptable performance out of the format conversion took a lot of work, and the conversion also consumed a large share of the latency budget, mainly because of the data transfers.
They expected to be able to use the GPU for computation through OpenGL, so they tried to do the frame conversion there. They did find a conversion algorithm for the Bayer BGGR format in the GLSL shader language, but the data transfer rate into the GPU was not high enough for the input frame rate. It took about seven weeks to convince the customer that this approach wasn't going to work.
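For flavour, a GLSL fragment shader doing the same naive nearest-neighbour BGGR conversion might look roughly like this. This is a sketch, not the algorithm the team used; it assumes the Bayer data has been uploaded as a single-channel texture and that `texel` holds the reciprocal of the texture dimensions. Regardless of shader quality, the bottleneck they hit was the texture upload, not the shading.

```glsl
uniform sampler2D bayer;   /* raw Bayer frame as a single-channel texture */
uniform vec2 texel;        /* 1.0 / texture dimensions (assumed uniform) */

void main(void)
{
    /* Snap to the top-left of the 2x2 BGGR cell (B G / G R). */
    vec2 cell = floor(gl_TexCoord[0].st / (2.0 * texel)) * 2.0 * texel;
    float b = texture2D(bayer, cell + vec2(0.5, 0.5) * texel).r;
    float g = texture2D(bayer, cell + vec2(1.5, 0.5) * texel).r;
    float r = texture2D(bayer, cell + vec2(1.5, 1.5) * texel).r;
    gl_FragColor = vec4(r, g, b, 1.0);
}
```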
Freescale provides V4L2 drivers for the IPU, covering the DMA engine and resizing functionality, on kernel 3.0.35 with GStreamer 0.10.36. The mfw_v4lsrc element provides entry into the GStreamer pipeline. The V4L2 drivers worked largely as expected, except for one scheduling bug. GStreamer was OK, but they had to rewrite some plugins to improve performance, and they added CPU affinity to the queue element. The Freescale elements only work with DMA buffers to a very limited extent: you have to use them in exactly the prescribed way (or modify the plugin).
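As an illustration of how the Freescale elements slot into a pipeline, a capture-to-display test on GStreamer 0.10 might be launched like this. The element names come from the Freescale plugin set mentioned above, but the specific properties and device path are assumptions, not the project's actual pipeline:

```
gst-launch-0.10 mfw_v4lsrc device=/dev/video0 \
    capture-width=640 capture-height=480 fps-n=30 \
    ! queue ! mfw_v4lsink
```

Deviating from this prescribed src-to-sink arrangement (for example, inserting an element that copies buffers) tends to break the DMA-buffer handoff between the Freescale elements.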