Managing Casual Contributors – Ann Barcomb, University of Limerick

Ann researches how to manage contributors in open source projects. Before that she worked as a developer and as a community manager. This presentation is a summary of research results on casual contributors.

Casual contributors are important to a project because they often provide the majority of contributions, including non-patch contributions: testing, help at a conference, … They increase innovation and software quality, and they also diffuse knowledge about the project to their social network.

“Casual” is not a good word, because these contributors are often committed. They are not necessarily one-off, and they are often habitual contributors in other communities. Perhaps “episodic” is a better term.

Many of the community management techniques also apply to casual contributors. But without a strategy, how do you know you’re managing them effectively? If you’re only looking at converting drive-by contributors to habitual contributors, you miss the valuable episodic contributors that also remain committed to your project.

Five factors influence the intention to remain (intention to remain is used as a predictor for actually remaining; it turns out to be the best one). Motives: enjoyment, socialising, personal benefit (the last is a negative motive: once the itch has been scratched, there is no reason to stay). Technical barriers discourage enjoyment and socialising, and they affect episodic contributors more.

Social norms: how a contributor perceives the response of their social network to their participation (peer pressure). For open source contributions, the social network often knows less about the involvement than with other volunteer work, so it is less relevant. However, contributors are responsive to personal invitations; this is true for volunteers in general and also for OSS contributors, especially non-code contributors.

Psychological sense of community: do they feel welcome? Inclusivity is often mentioned as an important factor.

Satisfaction: did the experience match the expectation? This is one of the strongest factors towards intent to remain.

Organisational commitment: identifying with the community, feeling part of something bigger. People who talk about their involvement with family and friends are more likely to remain (though this is not a causal relation).

Create a strategy for episodic contributions:

  • Decide objectives: understand them, get more of them, retain them, get them to do more useful work, …
  • Identify appropriate tasks for episodic contributors: small, focused, small learning curve. There are also specialised contributors, who have skills that others in the community don’t have. For them, you want to separate their domain knowledge from the details of your project.
  • Practices to support the goals:
    • Guided introductory events, mentoring: reduce the technical barriers and offer social interaction. Appeals to social motives.
    • Encourage all contributors to talk about their participation in their network. Enable this by creating content. Why?
      • way for recruitment
      • correlation with intent to remain
    • Recognize non-coding activities
      • sense of community: they fit in
      • satisfaction
    • Awareness of contributors’ expertise, to identify specialist knowledge and to recognize their skills
    • Time-based releases in any process so people can easily plan their involvement.
  • Measure results: big research in itself so not expanded on here.





[Note: slides that are online have a lot more text than what was presented at the conference, so it should be quite readable.]


Jmake: Dependable Compilation for Kernel Janitors – Julia Lawall, Inria

Software gets bigger and bigger, this is certainly the case for Linux as well. In addition, Linux is configurable so not all code is built. Different kinds of developers are involved: casual contributors, maintainers and janitors.

Janitors clean up other people’s messes. They know coding style conventions and API changes. However, they don’t know the subsystems they touch deeply, and often they can’t test their changes well. There is a risk of a silent compiler failure: the janitor modifies some code and compilation succeeds, but the compiler didn’t actually build the modified code because it was configured out.

JMake addresses the silent compiler failure to improve the reliability of janitor code. It grew out of the Coccinelle work: when doing this kind of tree-wide change, it is hard to make sure that you’re actually testing what you changed. People want immediate feedback on what they do, so an online tool that sends a mail is not appropriate. Even under allyesconfig, some parts are not built.

JMake looks at a diff and what gets built, and reports any modified line that did not get built. JMake can also find that the line can be built by compiling allyesconfig for a different architecture.

Tools available to JMake: make.cross with allyesconfig, and the in-tree defconfigs. Trying all of these would take too long, so JMake uses heuristics. For files in arch/, allyesconfig for that architecture is used. For drivers etc. it is always x86/allyesconfig. If that fails, JMake looks in the Makefile for a CONFIG variable associated with the C file. A final heuristic is to use the same architecture as for the other lines in the patch.
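The heuristic ordering above can be sketched roughly as follows. This is an illustration only: the function and the returned strategy strings are invented, not JMake's actual code.

```python
def choose_build_strategies(path, makefile_config=None, patch_arch=None):
    """Return build attempts in the order JMake's heuristics suggest
    (illustrative sketch, not the real implementation)."""
    # Files under arch/ are built with allyesconfig for that architecture.
    if path.startswith("arch/"):
        arch = path.split("/")[1]
        return ["make.cross ARCH=%s allyesconfig" % arch]
    # Drivers etc.: always try x86/allyesconfig first.
    strategies = ["make allyesconfig"]
    # If that fails, check the Makefile for a CONFIG variable guarding
    # the C file and try a configuration that enables it.
    if makefile_config:
        strategies.append("config enabling %s" % makefile_config)
    # Final fallback: the architecture used by other lines of the patch.
    if patch_arch:
        strategies.append("make.cross ARCH=%s allyesconfig" % patch_arch)
    return strategies
```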

Header files are an extra challenge, because you don’t know where they will be used. An additional complication is that header files with the same name are often used for different architectures. Also, conditional compilation is especially heavily used in .h files, e.g. not including content unless some symbol is defined.

To find out which lines are actually compiled, you could: look at the line numbers in the compiled code (.lst file), but this doesn’t work for macros; introduce a syntax error in the modified line and check that an error is reported, but this is not reliable and is compiler-specific; or mutate the source code and verify that the mutation appears in the .i file (preprocessed code), and if so, also verify that the (unmodified) source file actually gets built. The mutation is done by adding a string (which never gets modified by the preprocessor) surrounded by characters that are not valid in C. If the mutation is in a macro, it will end up on a different line, but it will end up somewhere.
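The mutation approach can be illustrated with a toy preprocessor. Everything here is invented for illustration: the marker string, function names, and the simplified #ifdef handling are mine, not JMake's.

```python
MARKER = '@"jmake_mutation"@'   # string bracketed by characters invalid in C

def mutate(source_lines, lineno):
    """Append the marker to the line under test (1-indexed)."""
    lines = list(source_lines)
    lines[lineno - 1] += MARKER
    return lines

def fake_preprocess(lines, defined):
    """Tiny stand-in for cpp that only handles #ifdef/#endif."""
    out, keep = [], [True]
    for ln in lines:
        s = ln.strip()
        if s.startswith("#ifdef"):
            keep.append(keep[-1] and s.split()[1] in defined)
        elif s.startswith("#endif"):
            keep.pop()
        elif keep[-1]:
            out.append(ln)
    return "\n".join(out)

def line_is_compiled(preprocessed_text):
    # If the marker survives preprocessing, the modified line is part of
    # the code the compiler actually sees (a macro may move it to another
    # line, but it still ends up somewhere in the .i file).
    return MARKER in preprocessed_text
```

A line guarded by an undefined CONFIG symbol loses the marker during preprocessing, which is exactly the silent failure JMake is looking for.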

To run JMake, you give it a commit ID or a range of commit IDs to look at. It then goes through the above process, looking at changes in blocks that are not interrupted by #ifdef. For each file it reports how it managed to build the modification: “make” = x86/allyesconfig; “make.cross ARCH=…” = ARCH/allyesconfig; “make.cross ARCH=foo:bar_defconfig” = with a different defconfig than allyesconfig; Failure = needs to be looked at manually. For a commit that affects 83 files it took about 8.5 minutes on Julia’s laptop.

Julia ran it on 11K commits. 96% of the modified non-arch files are visible in x86/allyesconfig. 365 .c files and 75 .h files outside of arch/ are not visible on x86 but are on some other architecture. 415 .c files do compile, but not with all modified lines; 54 of these can be covered on other architectures, but in 361 cases JMake fails.

Some issues:

  • Config options that are never set.
  • Changes made in both the #ifdef and the #else branch can never be fully built in a single configuration.
  • Changes in #ifdef MODULE are not covered, because module builds are not tested.

Julia made an objective definition of what a janitor is, based on metrics about the type of commits people make: basically, a janitor makes changes in a lot of different files, in different subsystems. She detected 21 janitor commits that do not get built on x86/allyesconfig.

The tool works well when you’re reacting to dependencies, e.g. adapting to new arguments of a function. It does not work when you create dependencies, e.g. adding const to a declaration: the latter would require building all users of that function, and JMake doesn’t do that.


Protecting Your System from the Scum of the Universe – Gilad Ben-Yossef, Arm Holdings

Gilad is the maintainer of the ARM TrustZone CryptoCell Linux device driver.

Smart devices are used for everything, so we need to be able to trust them. However, we also want a frictionless user experience and to be able to do anything with the device. This is guaranteed to fail, so we need a second line of defence, a trusted way of failing. If someone gets hold of our device, we don’t want them to have access to all our secrets or to gain access to additional resources, and we want to know about the compromise and be able to get them out again. We want trusted boot: reboot the device and it is safe again.

All the components are there, we have to make them fit together.

Secure boot (Android style, but others are similar): a chain of trust through the boot process, where each component verifies the next one. The ROM uses a public key in e.g. eFuses to verify the bootloader. The bootloader verifies the kernel and the boot fs. The OS verifies the full rootfs. The root key can also be in flash with just its hash in eFuses, or it can be a certificate chain with just the hash of the root in eFuses.
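The chain of trust can be sketched with hashes standing in for signature checks. This is a simplification I made for illustration: real secure boot verifies signatures against the fused public key, not bare hashes.

```python
import hashlib

def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_chain(root_hash, stages):
    """Each stage is (image, hash_of_next_image); the verifier holds only
    root_hash (e.g. burned into eFuses). Sketch only."""
    expected = root_hash
    for image, next_hash in stages:
        # The expected hash covers the image plus the next-stage hash it
        # embeds, so tampering with either is detected.
        if digest(image + next_hash) != expected:
            return False
        expected = next_hash
    return True
```

Breaking any link, e.g. swapping in a modified bootloader, makes every later stage unverifiable.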

Checking the rootfs is done with dm-verity, which prevents a persistent rootkit: if the persistent storage is changed, we will know. dm-verity adds hashes and a signature to a read-only filesystem using device-mapper. The check is done every time we access the filesystem, not at boot. It uses a Merkle tree of hashes of blocks to arrive at a root hash that can be verified through a signature. The Merkle tree is stored on the device, so we need to verify log4096(device size) hashes. Cfr. the figure in the slides.
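A minimal Merkle tree sketch of this check (I use a fan-out of 2 for brevity; dm-verity actually packs 128 32-byte hashes into each 4096-byte hash block, which is where the log4096 depth comes from):

```python
import hashlib

def h(data):
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Build hash levels bottom-up; returns the levels, leaves first."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def verify_block(block, index, levels):
    """Check one block on access: one hash per tree level instead of
    hashing the whole device."""
    digest = h(block)
    for level in levels[:-1]:
        if level[index] != digest:
            return False
        # A parent hashes the concatenation of its left and right child.
        digest = h(level[index & ~1] + level[index | 1])
        index //= 2
    return levels[-1][0] == digest
```

Only the single root hash needs to be covered by the signature; everything below it is implied.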

This works only for read-only devices: when a block changes, the entire Merkle tree changes. For read-write data, the simplest option is full-disk encryption (dm-crypt), which implicitly does authentication. dm-crypt is a device-mapper layer between the actual filesystem (e.g. ext4) and the block device (e.g. eMMC), so neither of these knows about the encryption. It uses a single key for everything accessing the device, and the key is kept in memory all the time. The key is password-protected.

The problem with whole-disk encryption: multiple users, and it is not possible to skip encryption for some use cases. For example, if the alarm clock app is in encrypted storage and the device reboots during the night, you have to give the password before the alarm clock can start running… fscrypt solves this by pushing encryption into the filesystem layer, which allows different or no encryption keys for different directories and files. So e.g. the alarm clock app may be encrypted with a key that is stored in the rootfs, while the sensitive information is encrypted with a user-provided password. A limitation of fscrypt is that it doesn’t hide all metadata, e.g. the file size is not encrypted. Multiple keys can be loaded separately into the kernel. When the key is available in the kernel, you can see the file. When the key is not loaded, you can see there is a file but not its name or content.
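The per-directory key idea can be modeled in a few lines. This is a toy of my own making: the XOR "cipher" and the HMAC key derivation are illustrative only; real fscrypt uses AES with a proper KDF and also encrypts file names, which this sketch leaves readable.

```python
import hashlib, hmac, os

class EncryptedDir:
    """Toy model of fscrypt-style per-directory master keys with derived
    per-file keys. Not the kernel's algorithm."""
    def __init__(self):
        self.files = {}                      # name -> (nonce, ciphertext)

    @staticmethod
    def _file_key(master_key, nonce):
        # Derive a per-file key from the directory's master key.
        return hmac.new(master_key, nonce, hashlib.sha256).digest()

    @staticmethod
    def _xor(key, data):
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def write(self, master_key, name, data):
        nonce = os.urandom(16)
        key = self._file_key(master_key, nonce)
        self.files[name] = (nonce, self._xor(key, data))

    def read(self, master_key, name):
        nonce, ct = self.files[name]
        return self._xor(self._file_key(master_key, nonce), ct)

    def sizes(self):
        # Even without any key, file sizes (metadata) remain visible.
        return {name: len(ct) for name, (_, ct) in self.files.items()}
```

With two directories holding different master keys, the alarm clock’s data can be readable at boot while the user’s documents stay locked until the password-derived key is loaded.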

The problem is the key: it has to be put in the kernel and stay there, so it is vulnerable if the kernel is compromised. The solution is some trusted execution environment, e.g. TrustZone on ARM. TrustZone provides a privileged mode hosting a TEE (Trusted Execution Environment) that has access to memory the normal OS has no access to. The OS asks the TEE to store the key in memory that is not accessible to the kernel. It is never possible to get it out again; to do encryption, the kernel asks the TEE to put the key in a hardware crypto engine.

Instead of a TEE, you can also use a Trusted Platform Module (TPM) discrete from the CPU. Keys are directly stored in there and never go to flash; they are even generated in the TPM so they really never ever go to memory. But of course the TPM can still have bugs that can be exploited. The TPM can also do attestation: give access to certificates only if a certain set of hashes (of the HW and SW state) is provided in a certain order. This is done with Integrity Measurement Architecture (IMA) subsystem in Linux. Attestation is a way to check a sequence of hashes without needing to store all the hashes.
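The hash-sequence check behind attestation can be sketched as a TPM-style PCR extend. This is simplified: real PCRs live inside the TPM, and IMA additionally logs each individual measurement.

```python
import hashlib

def extend(pcr, measurement):
    # New PCR value = H(old PCR || H(measurement)). The register thus
    # commits to the entire sequence of measurements, in order, without
    # storing every individual hash.
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()

def measure_boot(components):
    pcr = b"\x00" * 32          # PCRs start zeroed at reset
    for component in components:
        pcr = extend(pcr, component)
    return pcr
```

Attestation then amounts to comparing the final register value against the expected one: the same components hashed in a different order, or with one missing, yield a different value.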



Introducing the “Lab in a Box” Concept – Patrick Titiano & Kevin Hilman, BayLibre

BayLibre does HW and SW support for embedded (Linux) systems, and also does kernelci.

Lab in a box = PC with stuff to connect a board and running LAVA.

KernelCI is a distributed test farm on 250 boards doing 2700 boot-tests per day. Pulls from various trees, builds various configs, distributes them to the boards, and sends the results to the relevant mailing lists. The build servers that pull the git repos and build the kernels are centralized, the boards are distributed in a dozen labs. The Lab in a Box is an easy-to-set-up lab. Also AGL uses the KernelCI tools but with a different centralized test master.

The first reason for Lab in a Box is to clean up the cable mess and shelves: provide something that is easily maintainable, shareable and easily duplicated. Also simplify the administration (i.e. deploying LAVA, now easier through the use of Docker; simplifying the device descriptions; knowing which tty device is which board) by adding a web administration & control panel. Ultimately, accelerate deployment. Everything is in one case, including the software, but still at relatively low cost.


Design challenges:

  • A lot of stuff in one box, not much space.
  • Power control: different boards have different supply requirements.
  • Make it easy to install a board, so you don’t have to spend a day on it.

The box is a normal PC tower with

  • Celeron quadcore with 8GB RAM and 120GB SSD – required for LAVA.
  • Plenty of fans in case cooling is needed.
  • ATX power supply also powers the boards – it provides 5V and 12V. This saves on power wiring.
  • Home made power measurement and control board called ACME cape on BBB.
  • USB hub for network consoles + FTDI USB serial cables.
  • Network switch. Each device on a separate LAN. Lavabox itself needs internet access to connect to kernelCI.
  • 6 DUTs: RPi3, BBB, Le Potato, DragonBoard, R-Car M3, SABRE Lite. They are installed in drive bays, so they are easy to insert and remove.

The BOM cost (excluding DUTs) is about 400 EUR, but you can use components that you already have to reduce costs.

The DragonBoard uses fastboot, so an extra USB cable is needed to drive it. Some boards don’t have Ethernet, so they use the NCM gadget or USB storage instead. Some devices are powered over USB.

The LAVA slave provides DHCP, TFTP, NFS, … It also knows (through its config) which USB-serial port goes to which board (using udev rules; FTDI cables have a unique ID). It manages updates through fastboot or USB storage where needed. lavapdu-daemon controls the power of each board, with backends for various PDUs including the BBB-ACME.

The LAVA master schedules the tests. It also holds the descriptions of the boards and gives them to the slaves (even though they are of no use to the master; it would be more logical to put them on the slaves). A board has a device-type (e.g. BBB, which includes how to boot it and give it a kernel) and a device (the instance, which specifies which ports etc. to use). These are written in the jinja2 templating language, which makes them very powerful: you can e.g. set up custom things for a specific device.

A squid proxy avoids downloading the same kernel over and over again.

All the pieces are in docker containers and combined with docker-compose.yml file.

Lab in a Box is an example, you can build your own and replace any of the components. The SW doesn’t depend on the specific HW, thanks to the LAVA abstraction layers that put everything in config files.


Advantages:

  • Less of a mess: it can all fit in a nice box, which fits comfortably in an apartment.
  • Integrated SW and easy administration (still under development).
  • Good demonstrator to evangelise CI.
  • Easy to replace DUTs.


Drawbacks:

  • Tedious to build the PC case. Needs drilling and soldering.
  • Pretty densely packed so not easy to build.
  • Limit on DUT size to fit in a drive bay.
  • Only 5V and 12V DUTs are supported, and the load must be balanced across the ATX outputs. This can be solved with a higher-power ATX supply that has more rails.
  • No button presses for boards that need that, need separate relay for it.
  • Even with a larger case, it wouldn’t be possible to add more devices: wiring is the limitation.
  • Does not really scale to installations with dozens of boards. Develop a rack-mounted solution for that.
  • No standard DUT connector for the boards, it’s always custom wiring. Should work with board manufacturers to develop a standard connector.
  • Too complex and expensive for a 1-board lab. Build mini version for this use case.
  • Administrative control panel doesn’t exist yet – currently a YAML file.
  • No documentation (yet).

A competing project: a NanoPi hat that connects to a single DUT, which will be open hardware. It is published on the Tizen wiki.

BoF: Collaborating to Create the Secure OTA Update Systems for Linux – Alan Bennet & Ricardo Salveti, Open Source Foundries

There are too many open source OTA implementations doing similar things. There are a lot of pieces that could be shared, especially on the security side to make sure we get it right.


Requirements:

  • Atomic updates, and fast ones: the system can’t be offline for a long time. This is easier if you have a read-only rootfs.
  • All pieces must be updateable, including bootloader.
  • Failsafe/rollback, using a watchdog. Necessary because you may have different versions of the hardware.
  • Verification of the image (incl. signing).
  • No vendor lock-in.
  • Trusted boot.

Two basic modes. Block-based (= A/B bank) usually implies a full update, but can also do a diff update; examples: SWUpdate, Mender, RAUC, ResinOS. A file-based update doesn’t overwrite a partition but individual files; it could still use multiple partitions, or overlays. The server side is more complex because it needs to calculate what has to be updated; examples: OSTree (used in several projects, e.g. Project Atomic, flatpak, AGL) and swupd (Intel-specific).
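The block-based A/B scheme can be sketched as follows. This is a toy model of my own; real implementations such as SWUpdate or Mender manage bootloader flags and watchdog-driven rollback.

```python
class ABUpdater:
    """Toy A/B bank update: write to the inactive bank, switch only
    after verification, and keep the old bank for rollback."""
    def __init__(self, image):
        self.banks = {"A": image, "B": None}
        self.active = "A"

    def inactive(self):
        return "B" if self.active == "A" else "A"

    def update(self, image, verify):
        bank = self.inactive()
        self.banks[bank] = image     # the running system stays untouched
        if verify(image):
            self.active = bank       # atomic switch, effective at reboot
            return True
        return False                 # verification failed: old bank stays active
```

The switch is a single flag flip, which is what makes the update atomic: a failure at any earlier point leaves the previous bank bootable.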

Trusted boot is still problematic. It is hardware specific, TEE is not widely used.

Secure software distribution is also not solved. HTTPS is obviously not enough, even if you check the HTTPS certificate. E.g. a downgrade attack: send a valid, signed but vulnerable old version of the software to devices. There is a specification (based on work for Tor): The Update Framework (TUF), which enumerates what should be checked. E.g. Docker and pip implement it.
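A minimal sketch of the kind of checks TUF mandates beyond a signature. The field names and the stand-in clock are invented for illustration; real TUF metadata is richer, with separate roles and key thresholds.

```python
FAKE_NOW = 100  # stand-in clock for this sketch

def accept_update(current_version, metadata, verify_signature):
    """Reject anything unsigned, old, or expired, so a validly signed
    but vulnerable old image cannot be replayed (downgrade attack)."""
    if not verify_signature(metadata):
        return False                             # bad or missing signature
    if metadata["version"] <= current_version:
        return False                             # downgrade or replay
    if metadata["expires"] < FAKE_NOW:
        return False                             # stale metadata replayed
    return True
```

The version and expiry checks are exactly what plain HTTPS plus image signing misses: the attacker's old image passes both.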

There was no time for discussion; Ricardo used all the time for the introductory presentation. However, there is a wiki page.

Automation beyond Testing and Embedded System Validation – Jan Luebbe, Pengutronix

Pengutronix builds embedded Linux systems for customers: everything below the application. In addition to the kernel, that includes mesa, wayland, Qt, chromium, gstreamer, … All of that changes all the time and sometimes breaks. This kind of testing is “solved” by Jenkins and LAVA.


Low Level Sensor Programming and Security Enforcement with MRAA – Brendan Le Foll, Intel Corporation

MRAA is a simple userspace I/O protocol to unify a plethora of interfaces: UART, GPIO, I2C, ADCs (IIO), 1-wire, … MRAA is the API spec; libmraa is the C/C++ implementation. There are also bindings for Python, Node.js and Java, plus unsupported bindings for a bunch of other languages (e.g. Lua). It is made for monkeys, so easier is better. On Linux, MRAA makes the I/O that is typically reserved for the kernel available to userspace. It’s mainly for quick prototyping, but it turns out to be used in actual products. Platform quirks are abstracted and lots of devboards are supported, e.g. it does the pinmuxing if necessary. Sometimes it even uses devmem if the crappy vendor kernel doesn’t allow things to be done properly.

Most calls are synchronous.

The GPIO interface allows registering an ISR with a callback function.

On top of this API, a sensor library has been added: UPM (Useful Plugins for MRAA). It gives code examples of how to use each sensor.

To add a board, there are 3 ways:

  • Raw mode: no platform definition, just map the pins to the kernel representation, e.g. GPIO numbers.
  • C platform configuration: the same kind of mapping, but it can also override things where necessary.
  • JSON file: similar to raw mode, but you can give names etc.; there are just no overrides.
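The difference between the raw and JSON styles might look like this (invented example data; the real libmraa board definitions differ in detail):

```python
# Raw mode: nothing but a map from logical pin to kernel GPIO number.
RAW_BOARD = {0: 17, 1: 27, 2: 22}

# JSON-style definition: like raw mode plus human-readable names,
# but no behavioural overrides (those need a C platform configuration).
JSON_BOARD = {
    "name": "example-board",
    "pins": [
        {"index": 0, "name": "LED", "gpio": 17},
        {"index": 1, "name": "BUTTON", "gpio": 27},
    ],
}

def kernel_gpio(board, pin):
    """Resolve a logical MRAA pin to the kernel GPIO number."""
    if isinstance(board, dict) and "pins" in board:
        return next(p["gpio"] for p in board["pins"] if p["index"] == pin)
    return board[pin]
```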

To do things like devmem manipulations safely, there is a daemon that checks permissions.

On Android, there is a peripheralmanager that authorizes access to GPIOs etc. This was reused to support MRAA: a backend was added to libmraa that talks to peripheralmanager over Binder. This way, all the sensors become available on Android.

AFB is the equivalent of Binder in Automotive Grade Linux. Every application has a SMACK security context, and a binder in the same security context. The binder exposes the bindings that the application has access to (and only those). AFB doesn’t require the rest of AGL. To use MRAA with this, there is a global libmraa that actually talks to the kernel, and another libmraa in each application that talks to the binder, which in turn talks to the global libmraa. This way, each application can only access the messages that were meant for it. The two libmraas are in fact built differently: the application libmraa is built with BUILDARCH=AFB, which replaces all the normal kernel calls with calls to AFB’s binder. In a similar way, it is possible to build a libmraa that uses I/Os that are not directly accessible by the kernel, e.g. an extension board connected over UART.