Until a few years ago, the 80/20 rule was valid: in any significant piece of software, 80 percent of the content should not be yours. It makes no economic sense to try to develop more than 20 percent of any software because it’s likely someone has already built components with the necessary functionality. Instead, focus on developing what gives you a competitive advantage. In recent years, this balance might have even shifted to 90/10.
That’s where the software bill of materials (SBOM) comes in. It’s a formal record containing details and supply chain relationships of all the components used in building software. These components can be Open Source or proprietary, freely available or paid-for, widely available or access-restricted. The information present in an SBOM can be used in a multitude of ways, helping answer various contractual, legal, or technical queries about the software.
Early efforts to provide SBOMs were driven mostly by the desire for legal compliance. Every software component is under a specific license, which might impose some obligations on its use. To be legally compliant, one must satisfy all the obligations of all the licenses. This is simple to state but hard to accomplish. An obvious first step is to have a record of all components and all licenses, which is exactly what an SBOM is.
However, in the past couple of years, following a series of software supply chain attacks, security has become the driving force behind SBOM adoption and behind the need to know exactly which components are inside each piece of software. SBOMs are now expected to accompany some types of software delivery. For example, the United States Executive Order (EO) 14028 advises US government agencies to start requiring SBOMs for any hardware or software product they acquire.
What is a software bill of materials (SBOM)?
At a conceptual level, an SBOM is like a simple table of contents: it’s a comprehensive list of software components, with information on name, version, origin, and possibly additional information about licensing, vulnerabilities, provenance, or any other areas of interest. Because this information is simple and easily understood, it can be expressed in several formats: as a table, as a text document, as a spreadsheet, and so on. For the information to be useful, however, both parties to an exchange must understand and agree on the same format.
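To make the "table of contents" idea concrete, here is a minimal sketch in Python of an SBOM as a flat list of component records written out as CSV. The `Component` class, the `to_csv` helper, and the two example components are all hypothetical, chosen only for illustration; they do not follow any standard format.

```python
import csv
import io
from dataclasses import dataclass, asdict

FIELDS = ["name", "version", "origin", "license"]


@dataclass(frozen=True)
class Component:
    """One row of a conceptual SBOM (hypothetical structure, not a standard)."""
    name: str
    version: str
    origin: str
    license: str = "unknown"  # additional information is optional


def to_csv(components):
    """Render the component list as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for component in components:
        writer.writerow(asdict(component))
    return buf.getvalue()


# Made-up components: one third-party, one developed in-house.
sbom = [
    Component("libexample", "1.2.3", "https://example.com/libexample", "MIT"),
    Component("internal-core", "0.9.0", "in-house"),
]

print(to_csv(sbom))
```

The same records could just as easily be printed as a text table or loaded into a spreadsheet, which is exactly why an agreed-upon format matters once the list is exchanged between organizations.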
Software Package Data Exchange (SPDX)
More than ten years ago, a group of interested individuals representing various companies started working on the problem of defining a common, standardized format, which they called Software Package Data Exchange (SPDX). Everyone agreed that this standard should not be a competitive advantage for any specific company, so the work has followed open source principles throughout, with participation open to anyone who wants to contribute.
SPDX is an open standard for communicating SBOM information. In 2021 it was ratified as the international standard ISO/IEC 5962:2021. The SPDX specification is produced collaboratively by a large number of participants, organized into working groups according to their interests and expertise. Intel has been an active participant in many groups since the beginning, such as the technical team defining the SPDX specification, the legal team working on the SPDX License List, and the outreach team promoting the use of SPDX.
The approach taken by SPDX is that the information present in an SBOM should be factual. For example, it simply records the license declared for each software component and avoids legal interpretations of license terms or obligations. Another important characteristic of SPDX is that the information can be encoded in a variety of formats, such as plain text with minimal structure (the tag-value format), JSON, XML, RDF, and even spreadsheets.
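As an illustration of one set of facts in two encodings, the sketch below renders a single package record both as tag-value text and as JSON. The field names follow SPDX 2.x conventions as far as I know them (`PackageName`, `PackageVersion`, `PackageLicenseDeclared` in tag-value; `name`, `versionInfo`, `licenseDeclared` in JSON). The package itself is made up, the helper functions are hypothetical, and the output is a fragment rather than a complete, valid SPDX document.

```python
import json

# A made-up package record, keyed by SPDX 2.x JSON field names.
package = {
    "name": "libexample",
    "SPDXID": "SPDXRef-Package-libexample",
    "versionInfo": "1.2.3",
    "downloadLocation": "NOASSERTION",  # SPDX keyword for "no claim is made"
    "licenseDeclared": "MIT",           # recorded as declared, not interpreted
}

# Mapping from JSON field names to the corresponding tag-value tags.
TAG_NAMES = {
    "name": "PackageName",
    "SPDXID": "SPDXID",
    "versionInfo": "PackageVersion",
    "downloadLocation": "PackageDownloadLocation",
    "licenseDeclared": "PackageLicenseDeclared",
}


def to_tag_value(pkg):
    """Encode the record as 'Tag: value' lines, one field per line."""
    return "\n".join(f"{TAG_NAMES[key]}: {pkg[key]}" for key in TAG_NAMES if key in pkg)


def to_json(pkg):
    """Encode the same record as a JSON object."""
    return json.dumps(pkg, indent=2, sort_keys=True)


print(to_tag_value(package))
print(to_json(package))
```

Both outputs carry identical facts; only the syntax differs, so a recipient can pick whichever encoding suits its tooling.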
The structure of an SPDX document is hierarchical. In addition to information relevant to the document itself, like author and date, the information is presented at levels of increasing granularity, corresponding to packages, files, or snippets. Almost all the information at every level is optional, so one can generate an SBOM giving a general view or one containing information in excruciating detail. The flexibility of the format makes it ideal for any number of real-world use cases. For example, a recipient of an SBOM might only be interested in security vulnerability information, while another might care about which licenses the different components are under and the legal obligations they impose.
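The hierarchy described above can be sketched with a few nested data classes. This is a deliberately simplified model with hypothetical class and field names, not the actual SPDX data model; it only shows how optional fields let the same structure carry either a general overview or fine-grained detail.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Snippet:
    byte_range: Tuple[int, int]
    license: Optional[str] = None


@dataclass
class File:
    name: str
    license: Optional[str] = None
    snippets: List[Snippet] = field(default_factory=list)


@dataclass
class Package:
    name: str
    version: Optional[str] = None  # almost everything is optional
    license: Optional[str] = None
    files: List[File] = field(default_factory=list)


@dataclass
class Document:
    author: str   # document-level information
    created: str
    packages: List[Package] = field(default_factory=list)


def count_elements(doc):
    """Count the elements recorded at each level of granularity."""
    files = [f for p in doc.packages for f in p.files]
    snippets = [s for f in files for s in f.snippets]
    return {"packages": len(doc.packages), "files": len(files), "snippets": len(snippets)}


# A general view: only the package name is recorded.
overview = Document("Example Org", "2022-01-01T00:00:00Z", [Package("libexample")])

# A detailed view of the same package, down to a snippet inside one file.
detailed = Document(
    "Example Org",
    "2022-01-01T00:00:00Z",
    [Package("libexample", "1.2.3", "MIT",
             [File("src/main.c", "MIT", [Snippet((0, 120), "BSD-3-Clause")])])],
)

print(count_elements(overview))
print(count_elements(detailed))
```

A recipient interested only in licensing could walk this structure and read the `license` fields, while one interested in vulnerabilities would need only package names and versions.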
A number of tools can handle SPDX documents. They can be classified by their functionality and by the precise point in the software supply chain where they operate. For example, an SPDX document might be produced while software is being built, or it might be generated afterward by analyzing the software already built. Other tools consume this information and can analyze, transform, compare, or merge SPDX documents.
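To give a feel for what a consuming tool does, here is a minimal sketch of comparing and merging two component lists. It assumes each SBOM has already been reduced to a plain mapping from component name to version; the `compare` and `merge` functions and the component names are hypothetical, and real SPDX tools work on full documents with far more fields.

```python
def compare(old, new):
    """Report components added, removed, or changed between two SBOMs."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return {"added": added, "removed": removed, "changed": changed}


def merge(a, b):
    """Combine two SBOMs, refusing to guess when they disagree on a version."""
    conflicts = sorted(n for n in a.keys() & b.keys() if a[n] != b[n])
    if conflicts:
        raise ValueError(f"conflicting versions for: {', '.join(conflicts)}")
    return {**a, **b}


# Made-up component lists for two releases of the same product.
release_1 = {"libalpha": "1.2.3", "libbeta": "2.0.0"}
release_2 = {"libalpha": "1.3.0", "libgamma": "0.4.1"}

print(compare(release_1, release_2))
print(merge({"libbeta": "2.0.0"}, {"libgamma": "0.4.1"}))
```

Raising an error on conflicting versions is a design choice made for this sketch: silently picking one side would hide exactly the kind of discrepancy an SBOM is meant to expose.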
Working groups are currently designing the next major release. SPDX version 3 is a substantial effort that restructures the SBOM information into modular, compartmentalized sections. This will make it possible, for example, to have an SBOM with special emphasis on security and vulnerability information and less content on licensing details. Given the ever-increasing number of use cases for SBOMs, this modular approach is expected to result in more widespread adoption.