Commit 5bf502b6 authored by Mark Lodato

Split into multiple files.

parent 21e12e71
# Background
## Motivating example
Consider the example of using curl through its
[official docker image][curlimages/curl]. What threats are we exposed to in the
software supply chain? (We choose curl simply because it is a popular
open-source package, not to single it out.)
The first problem is figuring out the actual supply chain. This requires
significant manual effort, guesswork, and blind trust. Working backwards:
* The "latest" tag in Docker Hub points to
* It claims to have come from a Dockerfile in the
  GitHub repository.
* That Dockerfile reads the following artifacts, assuming there are no further
  fetches during build time:
    * Docker Hub image:
    * Alpine packages: libssh2 libssh2-dev libssh2-static autoconf automake
      build-base groff openssl curl-dev python3 python3-dev libtool curl
      stunnel perl nghttp2
    * File at URL:
* Each of the dependencies has its own supply chain, but let's look at
  [curl-dev], which contains the actual "curl" source code.
    * The package, like all Alpine packages, has its build script defined in an
      APKBUILD in the Alpine git repo. There are several build dependencies:
        * File at URL:
            * The APKBUILD includes a sha256 hash of this file. It is not clear
              where that hash came from.
        * Alpine packages: openssl-dev nghttp2-dev zlib-dev brotli-dev autoconf
          automake groff libtool perl
    * The source tarball was _presumably_ built from the actual upstream GitHub
      curl/curl@curl-7_72_0, by
      running the commands `./buildconf && ./configure && make && ./maketgz
      7.72.0`. That command has a set of dependencies, but those are not well
      documented.
* Finally, there are the systems that actually ran the builds above. We have
  no indication about their software, configuration, or runtime state.
Suppose some developer's machine is compromised. What attacks could potentially
be performed unilaterally with only that developer's credentials? (None of these
are confirmed.)
* Directly upload a malicious image to Docker Hub.
* Point the CI/CD system to build from an unofficial Dockerfile.
* Upload a malicious Dockerfile (or other file) in the
git repo.
* Upload a malicious
* Upload a malicious APKBUILD in Alpine's git repo.
* Upload a malicious [curl-dev] Alpine package to the Alpine repository. (Not
sure if this is possible.)
* Upload a malicious (Won't
be detected by APKBUILD's hash if the upload happens before the hash is
* Upload a malicious change to the curl/curl git repo.
* Attack any of the systems involved in the supply chain, as in the
SolarWinds attack.
SLSA intends to cover all of these threats. When all artifacts in the supply
chain have a sufficient SLSA level, consumers can gain confidence that most of
these attacks are mitigated, first via self-certification and eventually through
automated verification.
Finally, note that all of this is just for curl's own first-party supply chain
steps. The dependencies, namely the Alpine base image and packages, have their
own similar threats. And they too have dependencies, which have other
dependencies, and so on. Each dependency has its
[own SLSA level](#scope-of-slsa) and the
[composition of SLSA levels](#composition-of-slsa-levels) describes the entire
supply chain's security.
For another look at Docker supply chain security, see
_Who's at the Helm?_
For a much broader look at open source security, including these issues and many
more, see [Threats, Risks, and Mitigations in the Open Source Ecosystem].
## Vision: Case Study
Let's consider how we might secure [curlimages/curl] from the
[motivating example](#motivating-example) using the SLSA framework.
### Incrementally reaching SLSA 3
Let's start by incrementally applying the SLSA principles to the final Docker
image.
#### SLSA 0: Initial state
Initially the Docker image is SLSA 0. There is no provenance. It is difficult to
determine who built the artifact and what sources and dependencies were used.
The diagram shows that the (mutable) locator `curlimages/curl:7.72.0` points to
(immutable) artifact `sha256:3c3ff…`.
#### SLSA 1: Provenance
We can reach SLSA 1 by scripting the build and generating
provenance. The build script was
already automated via `make`, so we use simple tooling to generate the
provenance on every release. Provenance records the output artifact hash, the
builder (in this case, our local machine), and the top-level source containing
the build script.
In the updated diagram, the provenance attestation says that the artifact
`sha256:3c3ff…` was built from
At SLSA 1, the provenance does not protect against tampering or forging but may
be useful for vulnerability management.
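
As a rough illustration of what "simple tooling" for SLSA 1 might produce, the sketch below records the three items named above: the output artifact hash, the builder, and the top-level source. The field names, the `make_provenance` helper, and the example source reference are hypothetical and do not follow any official provenance schema.

```python
import hashlib
import json


def make_provenance(artifact_bytes: bytes, builder: str, source: str) -> dict:
    """Return a minimal, unsigned provenance record (hypothetical schema)."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return {
        # Immutable reference to the output artifact.
        "artifact": {"sha256": digest},
        # Who or what ran the build (at SLSA 1, possibly a local machine).
        "builder": builder,
        # Top-level source containing the build script.
        "source": source,
    }


# Placeholder inputs for illustration only.
record = make_provenance(
    b"example image bytes",
    builder="local-machine",
    source="git+https://example.com/repo@abc123",
)
print(json.dumps(record, indent=2))
```

Because nothing here is signed or generated by a trusted service, such a record can be forged trivially, which matches the SLSA 1 caveat above.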
#### SLSA 1.5 and 2: Build service
To reach SLSA 1.5 (and later SLSA 2), we must switch to a hosted build service
that generates provenance for us. This updated provenance should also include
dependencies on a best-effort basis. SLSA 2 additionally requires the source and
build platforms to implement additional security controls, which might need to
be enabled.
In the updated diagram, the provenance now lists some dependencies, such as the
base image (`alpine:3.11.5`) and apk packages (e.g. `curl-dev`).
At SLSA 2, the provenance is significantly more trustworthy than before. Only
highly skilled adversaries are likely able to forge it.
#### SLSA 3: Hermeticity and two-person review
SLSA 3 [requires](#level-requirements) two-party source control and hermetic
builds. Hermeticity in particular guarantees that the dependencies are complete.
Once these controls are enabled, the Docker image will be SLSA 3.
In the updated diagram, the provenance now attests to its hermeticity and
includes the `cacert.pem` dependency, which was absent before.
At SLSA 3, we have high confidence that the provenance is complete and
trustworthy and that no single person can unilaterally change the top-level
source.
### Full graph
We can recursively apply the same steps above to lock down dependencies. Each
non-source dependency gets its own provenance, which in turn lists more
dependencies, and so on.
The final diagram shows a subset of the graph, highlighting the path to the
upstream source repository (curl/curl) and the
certificate file (cacert.pem).
In reality, the graph is intractably large due to the fanout of dependencies.
There will need to be some way to trim the graph to focus on the most important
components. While this can reasonably be done by hand, we do not yet have a
solid vision for how best to do this in a scalable, generic, automated way. One
idea is to use ecosystem-specific heuristics. For example, Debian packages are
built and organized in a very uniform way, which may allow Debian-specific
heuristics.
### Composition of SLSA levels
An artifact's SLSA level is not transitive, so some aggregate measure of
security risk across the whole supply chain is necessary. In other words, each
node in our graph has its own, independent SLSA level. Just because an
artifact's level is N does not imply anything about its dependencies' levels.
In our example, suppose that the final [curlimages/curl] Docker image were SLSA
3 but its [curl-dev] dependency were SLSA 0. Then this would imply a significant
security risk: an adversary could potentially introduce malicious behavior into
the final image by modifying the source code found in the [curl-dev] package.
That said, even being able to _identify_ that it has a SLSA 0 dependency has
tremendous value because it can help focus efforts.
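
To make the "identify the weak dependency" idea concrete, here is a small sketch that walks a dependency graph and ranks transitive dependencies by SLSA level, lowest first. The graph, the levels assigned to each node, and the function names are hypothetical, chosen only to mirror the example above (a SLSA 3 image with a SLSA 0 `curl-dev`). It is not a proposed aggregate risk measure.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical SLSA levels for a few nodes in the graph.
LEVELS: Dict[str, float] = {
    "curlimages/curl": 3,
    "alpine": 2,
    "curl-dev": 0,
    "source-tarball": 1,
}

# Hypothetical edges: artifact -> direct dependencies.
DEPENDS_ON: Dict[str, List[str]] = {
    "curlimages/curl": ["alpine", "curl-dev"],
    "curl-dev": ["source-tarball"],
}


def transitive_dependencies(root: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Collect every dependency reachable from root (excluding root itself)."""
    seen: Set[str] = set()
    stack = list(depends_on.get(root, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(depends_on.get(node, []))
    return seen


def weakest_dependencies(
    root: str, levels: Dict[str, float], depends_on: Dict[str, List[str]]
) -> List[Tuple[str, float]]:
    """Rank dependencies by level, lowest first; unknown levels count as 0."""
    deps = transitive_dependencies(root, depends_on)
    return sorted(
        ((name, levels.get(name, 0)) for name in deps),
        key=lambda item: (item[1], item[0]),
    )


ranked = weakest_dependencies("curlimages/curl", LEVELS, DEPENDS_ON)
print(ranked)
```

Treating an unknown level as 0 is a deliberately pessimistic assumption of this sketch, since a dependency with no provenance gives no assurance.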
Formation of this aggregate risk measure is left for future work. It is perhaps
too early to develop such a measure without real-world data. Once SLSA becomes
more widely adopted, we expect patterns to emerge and the task to get a bit
easier.
### Accreditation and delegation
Accreditation and delegation will play a large role in the SLSA framework. It is
not practical for every software consumer to fully vet every platform and fully
walk the entire graph of every artifact. Auditors and/or accreditation bodies
can verify and assert that a platform or vendor meets the SLSA requirements when
configured in a certain way. Similarly, there may be some way to "trust" an
artifact without analyzing its dependencies. This may be particularly valuable
for closed source software.
<!-- Links -->
[Threats, Risks, and Mitigations in the Open Source Ecosystem]:
# Frequently Asked Questions
## What about reproducible builds?
When talking about reproducible
builds, there are two related but distinct concepts: "reproducible" and
"verified reproducible."
"Reproducible" means that repeating the build with the same inputs results in
bit-for-bit identical output. This property has many benefits,
including easier debugging, more confident cherry-pick releases, better build
caching and storage efficiency, and accurate dependency tracking.
For these reasons, SLSA 3 [requires](#level-requirements) reproducible builds
unless there is a justification why the build cannot be made reproducible.
Example justifications include profile-guided optimizations or code signing that
invalidates hashes. Note that there is no actual reproduction, just a claim that
reproduction is possible.
"Verified reproducible" means using two or more independent build systems to
corroborate the provenance of a build. In this way, one can create an overall
system that is more trustworthy than any of the individual components. This is
often suggested as a solution to supply chain integrity. Indeed, this is one option to secure
build steps of a supply chain. When designed correctly, such a system can
satisfy all of the SLSA build requirements.
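
The corroboration step can be sketched as follows: each independent rebuilder reports its output, and a digest is accepted only if enough rebuilders agree on it bit for bit. The rebuilder names, the `quorum` parameter, and the `corroborated_digest` function are hypothetical. A real system would compare signed provenance rather than raw bytes held in memory.

```python
import hashlib
from collections import Counter
from typing import Dict, Optional


def corroborated_digest(outputs: Dict[str, bytes], quorum: int) -> Optional[str]:
    """Return the sha256 digest that at least `quorum` rebuilders agree on, if any."""
    if not outputs:
        return None
    counts = Counter(hashlib.sha256(data).hexdigest() for data in outputs.values())
    digest, votes = counts.most_common(1)[0]
    return digest if votes >= quorum else None


# Placeholder build outputs from three hypothetical rebuilders.
outputs = {
    "rebuilder-a": b"identical build output",
    "rebuilder-b": b"identical build output",
    "rebuilder-c": b"tampered build output",
}
print(corroborated_digest(outputs, quorum=2))
print(corroborated_digest(outputs, quorum=3))
```

With a quorum of two the matching pair is accepted. Requiring all three to agree yields `None`, which surfaces the disagreement instead of hiding it.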
That said, verified reproducible builds are not a complete solution to supply
chain integrity, nor are they practical in all cases:
* Reproducible builds do not address source, dependency, or distribution
threats.
* Reproducers must truly be independent, lest they all be susceptible to the
same attack. For example, if all rebuilders run the same pipeline software,
and that software has a vulnerability that can be triggered by sending a
build request, then an attacker can compromise all rebuilders, violating the
assumption above.
* Some builds cannot easily be made reproducible, as noted above.
* Closed-source reproducible builds require the code owner to either grant
source access to multiple independent rebuilders, which is unacceptable in
many cases, or develop multiple, independent in-house rebuilders, which is
likely prohibitively expensive.
Therefore, SLSA does not require verified reproducible builds directly. Instead,
verified reproducible builds are one option for implementing the requirements.
For more on reproducibility, see
_Hermetic, Reproducible, or Verifiable?_
# SLSA definitions
_Reminder: The definitions below are not yet finalized and subject to change. In
particular, (1) we expect to renumber the levels to be integers; and (2) levels
2-3 are likely to undergo further changes._
An artifact's **SLSA level** describes the integrity strength of its direct
supply chain, meaning its direct sources and build steps. To verify that the
artifact meets this level, **provenance** is required. This serves as evidence
that the level's requirements have been met.
## Terminology
An **artifact** is an immutable blob of data. Example artifacts: a file, a git
commit, a directory of files (serialized in some way), a container image, a
firmware image. The primary use case is for _software_ artifacts, but SLSA can
be used for any type of artifact.
A **software supply chain** is a sequence of steps resulting in the creation of
an artifact. We represent a supply chain as a
directed acyclic graph
of sources, builds, dependencies, and packages. Furthermore, each source, build,
and package may be hosted on a platform, such as Source Code Management (SCM) or
Continuous Integration / Continuous Deployment (CI/CD). Note that one artifact's
supply chain is a combination of its dependencies' supply chains plus its own
sources and builds.
The following diagram shows the relationship between concepts.
![Software Supply Chain Model](images/supply-chain-model.svg)
<table>
<tr><th>Term<th>Description<th>Example</tr>
<tr><td>Source
<td>Artifact that was directly authored or reviewed by persons, without modification. It is the beginning of the supply chain; we do not trace the provenance back any further.
<td>Git commit (source) hosted on GitHub (platform).</tr>
<tr><td>Build
<td>Process that transforms a set of input artifacts into a set of output artifacts. The inputs may be sources, dependencies, or ephemeral build outputs.
<td>.travis.yml (process) run by Travis CI (platform).</tr>
<tr><td>Package
<td>Artifact that is "published" for use by others. In the model, it is
always the output of a build process, though that build process can be a
no-op.
<td>Docker image (package) distributed on DockerHub (platform).</tr>
<tr><td>Dependency
<td>Artifact that is an input to a build process but that is not a source. In
the model, it is always a package.
<td>Alpine package (package) distributed on Alpine Linux (platform).</tr>
</table>
Special cases:
* A ZIP file containing source code is a package, not a source, because it
is built from some other source, such as a git commit.
## Level descriptions
_This section is non-normative._
There are four SLSA levels. SLSA 3 is the current highest level and represents
the ideal end state. SLSA 1–2 offer lower security guarantees but are easier to
meet. In our experience, achieving SLSA 3 can take many years and significant
effort, so intermediate milestones are important.
<table>
<tr><th>Level<th>Meaning</tr>
<tr><td>SLSA 3
<td>"Auditable and Non-Unilateral." High confidence that (1) one can correctly and easily trace back to the original source code, its change history, and all dependencies and (2) no single person has the power to make a meaningful change to the software without review.</tr>
<tr><td>SLSA 2
<td>"Auditable." Moderate confidence that one can trace back to the original source code and change history. However, trusted persons still have the ability to make unilateral changes, and the list of dependencies is likely incomplete.</tr>
<tr><td>SLSA 1.5
<td>Stepping stone to higher levels. Moderate confidence that one can determine either who authorized the artifact or what systems produced the artifact. Protects against tampering after the build.</tr>
<tr><td>SLSA 1
<td>Entry point into SLSA. Provenance indicates the artifact's origins without any integrity guarantees.</tr>
</table>
## Level requirements
<!-- When editing this table, also edit ../ -->
<table>
<tr><th colspan="2"> <th colspan="4">Required at</tr>
<tr><th colspan="2">Requirement<th>SLSA 1<th>SLSA 1.5<th>SLSA 2<th>SLSA 3</tr>
<tr><td rowspan="4">Source
<td>Version Controlled <td> <td>✓<td>✓<td>✓</tr>
<tr><td>Verified History <td> <td> <td>✓<td>✓</tr>
<tr><td>Retained Indefinitely <td> <td> <td>18 mo.<td>✓</tr>
<tr><td>Two-Person Reviewed <td> <td> <td> <td>✓</tr>
<tr><td rowspan="6">Build
<td>Scripted <td>✓<td>✓<td>✓<td>✓</tr>
<tr><td>Build Service <td> <td>✓<td>✓<td>✓</tr>
<tr><td>Ephemeral Environment <td> <td> <td>✓<td>✓</tr>
<tr><td>Isolated <td> <td> <td>✓<td>✓</tr>
<tr><td>Hermetic <td> <td> <td> <td>✓</tr>
<tr><td>Reproducible <td> <td> <td> <td>○</tr>
<tr><td rowspan="5">Provenance
<td>Available <td>✓<td>✓<td>✓<td>✓</tr>
<tr><td>Authenticated <td> <td>✓<td>✓<td>✓</tr>
<tr><td>Service Generated <td> <td>✓<td>✓<td>✓</tr>
<tr><td>Non-Falsifiable <td> <td> <td>✓<td>✓</tr>
<tr><td>Dependencies Complete <td> <td> <td> <td>✓</tr>
<tr><td rowspan="3">Common
<td>Security <td> <td> <td> <td>✓</tr>
<tr><td>Access <td> <td> <td> <td>✓</tr>
<tr><td>Superusers <td> <td> <td> <td>✓</tr>
</table>
_○ = required unless there is a justification_
Note: The actual requirements will necessarily be much more detailed and
nuanced. We only provide a brief summary here for clarity.
**[Source]** Requirements for the artifact's top-level source (i.e. the one
containing the build script):
* **[Version Controlled]** Every change to the source is tracked in a version
control system that identifies who made the change, what the change was, and
when that change occurred.
* **[Verified History]** The version control history indicates which actor
identities (author, uploader, reviewer, etc.) and timestamps were strongly
authenticated. For example, GitHub-generated merge commits for pull requests
meet this requirement.
* **[Retained Indefinitely]** The artifact and its change history are retained
indefinitely and cannot be deleted.
* **[Two-Person Reviewed]** At least two trusted persons agreed to every change
in the history.
**[Build]** Requirements for the artifact's build process:
* **[Scripted]** All build steps were fully defined in some sort of "build
script". The only manual command, if any, was to invoke the build script.
* **[Build Service]** All build steps ran using some build service, such as a
Continuous Integration (CI) platform, not on a developer's workstation.
* **[Ephemeral Environment]** The build steps ran in an ephemeral environment,
such as a container or VM, provisioned solely for this build, and not reused
by other builds.
* **[Isolated]** The build steps ran in an isolated environment free of
influence from other build instances, whether prior or concurrent. Build
caches, if used, are purely content-addressable to prevent tampering.
* **[Hermetic]** All build steps, sources, and dependencies were fully
declared up front with immutable references, and the build steps ran with no
network access. All dependencies were fetched by the build service control
plane and checked for integrity.
* **[Reproducible]** Re-running the build steps with identical input artifacts
results in bit-for-bit identical output. (Builds that cannot meet this must
provide a justification.)
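
One narrow piece of the Hermetic requirement, "declared up front with immutable references", can be checked mechanically. The sketch below flags dependency references that are not pinned to a digest. It assumes references are written in a `name@sha256:<64 hex digits>` form, and the `unpinned_references` helper and the example digest are hypothetical. It does not check network isolation or how dependencies were fetched.

```python
import re
from typing import List

# Assumed convention: an immutable reference ends in "@sha256:<64 hex digits>".
PINNED = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_references(references: List[str]) -> List[str]:
    """Return the references that are mutable, i.e. not pinned to a digest."""
    return [ref for ref in references if not PINNED.search(ref)]


# The digest below is a placeholder, not a real image digest.
declared = [
    "alpine:3.11.5",              # tag only: mutable
    "alpine@sha256:" + "a" * 64,  # digest-pinned: immutable
]
print(unpinned_references(declared))
```

A tag such as `alpine:3.11.5` is a mutable locator in the same sense as `curlimages/curl:7.72.0` in the case study, so it is reported as unpinned.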
**[Provenance]** Requirements for the artifact's provenance:
* **[Available]** Provenance is available to the consumer of the artifact, or
to whomever is verifying the policy, and it identifies at least the
artifact, the system that performed the build, and the top-level source. All
artifact references are immutable, such as via a cryptographic hash.
* **[Authenticated]** Provenance's authenticity and integrity can be verified,
such as through a digital signature.
* **[Service Generated]** Provenance is generated by the build service itself,
as opposed to user-provided tooling running on top of the service.
* **[Non-Falsifiable]** Provenance cannot be falsified by the build service's
users.
* **[Dependencies Complete]** Provenance records all build dependencies,
meaning every artifact that was available to the build script. This includes
the initial state of the machine, VM, or container of the build worker.
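
A consumer-side check of the Available requirement might look like the sketch below: confirm that the provenance names the artifact, the builder, and the top-level source, and that the artifact's hash matches the recorded one. The record layout and the `check_provenance` function are hypothetical. Verifying authenticity (for example, a signature) is deliberately left out.

```python
import hashlib
from typing import List


def check_provenance(artifact: bytes, provenance: dict) -> List[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems: List[str] = []
    recorded = (provenance.get("artifact") or {}).get("sha256")
    if not recorded:
        problems.append("missing artifact digest")
    for field in ("builder", "source"):
        if not provenance.get(field):
            problems.append("missing field: " + field)
    # Recompute the digest so the check does not rely on a mutable locator.
    if recorded and hashlib.sha256(artifact).hexdigest() != recorded:
        problems.append("artifact digest does not match provenance")
    return problems


# Placeholder artifact and provenance record for illustration.
artifact = b"example image bytes"
good = {
    "artifact": {"sha256": hashlib.sha256(artifact).hexdigest()},
    "builder": "example-build-service",
    "source": "git+https://example.com/repo@abc123",
}
print(check_provenance(artifact, good))
print(check_provenance(b"something else", good))
```

The second call reports a digest mismatch, which is the situation a consumer would see if the artifact were swapped after the provenance was produced.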
**[Common]** Common requirements for every trusted system involved in the supply
chain (source, build, distribution, etc.):
* **[Security]** The system meets some TBD baseline security standard to
prevent compromise. (Patching, vulnerability scanning, user isolation,
transport security, secure boot, machine identity, etc. Perhaps
[NIST 800-53](
or a subset thereof.)
* **[Access]** All physical and remote access must be rare, logged, and gated
behind multi-party approval.
* **[Superusers]** Only a small number of platform admins may override the
guarantees listed here. Doing so MUST require approval of a second platform
admin.
## Scope of SLSA
SLSA is not transitive. It describes the integrity protections of an artifact's
build process and top-level source, but nothing about the artifact's
dependencies. Dependencies have their own SLSA ratings, and it is possible for a
SLSA 3 artifact to be built from SLSA 0 dependencies.
The reason for non-transitivity is to make the problem tractable. If SLSA 3
required dependencies to be SLSA 3, then reaching SLSA 3 would require starting
at the very beginning of the supply chain and working forward. This is
backwards, forcing us to work on the least risky component first and blocking
any progress further downstream. By making each artifact's SLSA rating
independent from one another, it allows parallel progress and prioritization
based on risk. (This is a lesson we learned when deploying other security
controls at scale throughout Google.)
We expect SLSA ratings to be composed to describe a supply chain's overall
security stance, as described in the
[composition of SLSA levels](#composition-of-slsa-levels).