Developers Care About the License: Using SPDX to Describe License Information – Jilayne Lovejoy, ARM

Jilayne previously worked for a company that provided audit services, where she learned about all the ways in which developers fail to provide license information in their code. If developers did that better, some of the audit tools would not be necessary.
What developers really care about is sharing code. So how do we go about that, given that code is protected by copyright by default? By giving permission. If you don’t do that (i.e. if you don’t specify a license), it’s not open source.
GitHub is notorious for repositories without licenses. When GitHub added a way to create a license file at repository creation time, the share of projects with a license jumped from 10% to 20%. Repositories with a higher rating are more likely to have a license.
How do we specify the license? The usual way is a LICENSE.txt file. The problem is that it may get lost as the package travels downstream (e.g. as binaries). In addition, many different components may be collected together, so the license may be hard to find. And when someone takes a few files out of the package, the package-level license file no longer travels with them. So it’s important to put a license notice in every file.
SPDX is a human- and machine-readable format that focuses on capturing facts, not interpretation. It’s a pretty large standard: 72 different use cases were considered when defining it.
So how do you use it as a developer? First of all, use a short identifier from the SPDX License List. The list also includes guidelines for deciding whether a license text matches. Put this identifier in every source file, e.g. SPDX-License-Identifier: xxxx, with or without the usual short license notice next to it. Of course, also include the full license text in the project. See the POCO project for an example.
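To make the "identifier in every source file" advice concrete, here is a minimal Python sketch that looks for the SPDX-License-Identifier tag near the top of each file and lists the files that lack one. It was not shown in the talk; the function names, the 20-line window and the list of file suffixes are assumptions chosen for illustration.

```python
import re
from pathlib import Path

# Matches e.g. "// SPDX-License-Identifier: MIT" or "# SPDX-License-Identifier: GPL-2.0+"
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_license_id(text, max_lines=20):
    """Return the expression from the first SPDX tag in the leading lines, or None."""
    for line in text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            value = match.group(1).strip()
            # Drop a trailing block-comment terminator, as in "/* ... */".
            if value.endswith("*/"):
                value = value[:-2].strip()
            return value
    return None


def report_missing(root, suffixes=(".c", ".h", ".py")):
    """List source files under root that carry no SPDX tag (hypothetical helper)."""
    missing = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            if find_license_id(text) is None:
                missing.append(path)
    return missing
```

Calling report_missing(".") on a checkout would give a quick list of files that still need a tag. It only checks that a tag is present, not that the identifier is on the SPDX License List.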
What if more than one license applies? There is a license expression syntax; see appendix IV in the license list. The operators are AND, OR, WITH (for exceptions) and + (for “or later”).
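As a rough illustration of how these operators combine, the following Python sketch parses an expression into nested tuples. It is not from the talk and it is not a validator: it assumes the precedence (from tightest to loosest) is +, WITH, AND, OR, which is my reading of the SPDX specification, and it does not check identifiers against the License List. All function names are made up for this example.

```python
import re

# A token is a parenthesis or an identifier/keyword, optionally ending in "+".
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.\-]+\+?)")
KEYWORDS = {"AND", "OR", "WITH"}


def tokenize(expression):
    expression = expression.strip()
    tokens, pos = [], 0
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(expression):
    """Parse a license expression into nested tuples (assumed precedence: +, WITH, AND, OR)."""
    tree, rest = _parse_or(tokenize(expression))
    if rest:
        raise ValueError(f"unexpected token: {rest[0]!r}")
    return tree


def _parse_or(tokens):
    left, tokens = _parse_and(tokens)
    while tokens and tokens[0] == "OR":
        right, tokens = _parse_and(tokens[1:])
        left = ("OR", left, right)
    return left, tokens


def _parse_and(tokens):
    left, tokens = _parse_with(tokens)
    while tokens and tokens[0] == "AND":
        right, tokens = _parse_with(tokens[1:])
        left = ("AND", left, right)
    return left, tokens


def _parse_with(tokens):
    left, tokens = _parse_atom(tokens)
    if tokens and tokens[0] == "WITH":
        if len(tokens) < 2 or tokens[1] in KEYWORDS or tokens[1] in "()":
            raise ValueError("WITH must be followed by an exception identifier")
        return ("WITH", left, tokens[1]), tokens[2:]
    return left, tokens


def _parse_atom(tokens):
    if not tokens:
        raise ValueError("unexpected end of expression")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        tree, rest = _parse_or(rest)
        if not rest or rest[0] != ")":
            raise ValueError("missing closing parenthesis")
        return tree, rest[1:]
    if head in KEYWORDS or head == ")":
        raise ValueError(f"expected a license identifier, got {head!r}")
    if head.endswith("+"):
        # "GPL-2.0+" means "GPL-2.0 or any later version".
        return ("OR-LATER", head[:-1]), rest
    return head, rest
```

For example, parse("MIT AND (Apache-2.0 OR BSD-3-Clause)") yields ("AND", "MIT", ("OR", "Apache-2.0", "BSD-3-Clause")).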
The second way to express license information is to provide an SPDX document. The SPDX file contains file checksums, so you need a tool to generate it. Options include FOSSology (install it or use the unomaha instance), Wind River (submit through a website), Yocto, Debian and a Maven plugin (each generates the document during the build), an Eclipse plugin (under development), DoSOCS (stores the information in a database), and more.
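To show why a tool is needed, here is a small Python sketch of the checksum part only: it walks a directory and prints a file name and SHA-1 checksum for each file. The FileName and FileChecksum field names follow my understanding of the SPDX tag-value format. The output is a fragment, not a valid SPDX document, and none of the tools listed above necessarily work this way.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_entries(root):
    """Build tag-value style lines for every file under root (illustrative fragment only)."""
    root = Path(root)
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root).as_posix()
        lines.append(f"FileName: ./{relative}")
        lines.append(f"FileChecksum: SHA1: {sha1_of_file(path)}")
        lines.append("")  # blank line between file entries
    return "\n".join(lines)
```

A real SPDX document carries much more (document and package information, license findings per file), which is what the listed tools add on top of the checksums.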
What SPDX by itself doesn’t solve is getting good data: do you trust the SPDX file you get from someone else?
If not all files are actually used to generate the binary, SPDX allows you to express dependencies between files (e.g. between the binary and the corresponding source) and to use these to work out what the license of the binary is.

