Datadog-Sbom-Generator
This repository contains the source code of Datadog's SBOM Generator.
Its goal is to scan a cloned repository folder, extract the dependencies that would
be installed on your systems, and produce a CycloneDX SBOM from them.
If you're interested in this repository, you might be interested in Setting up Software Composition Analysis in your repositories.
How to install
- Go to the release page
- Select the version you want to use (or use the latest version)
- Download the asset depending on your operating system and your CPU architecture
- Unzip the asset
Running the scanner
To scan a repository folder and generate an SBOM, you can use this command:
datadog-sbom-generator scan -o "/tmp/sbom.json" "/path/to/directory"
For detailed documentation on all commands and options, see USAGE.md.
You can also get help directly from the command line:
datadog-sbom-generator --help
datadog-sbom-generator scan --help
Supported package managers
This tool sources all dependencies by parsing package manager files. New package managers appear all the time, so not all of them are supported. Here is the list of supported package managers:
| Language | Package Manager |
|---|---|
| .NET | Nuget |
| C++ | Conan |
| Go | Golang |
| Java | Gradle, Maven, Bazel (rules_jvm_external) |
| JavaScript | Bun, NPM, PNPM, Yarn, package.json |
| PHP | Composer |
| Python | Pdm, Pipenv, Poetry, Requirements, uv |
| Ruby | Bundler |
| Rust | Crates |
| Swift | Swift Package Manager |
Limitations
Datadog SBOM Generator reads package manager dependency declaration files or their lock files. This means it can only scan
dependencies that are declared in the standard way enforced by each supported dependency manager.
Known limitations are detailed below, by language.
Python
This tool supports extracting packages from the following Python lockfiles and requirements files:
- requirements*.txt
- Pipfile.lock
- poetry.lock
- pdm.lock
- uv.lock
This tool also supports enriching lockfile-derived package information from the following package manager declaration files:
pyproject.toml (no lockfile)
- This tool supports extracting packages directly from pyproject.toml when no sibling lockfile (Pipfile.lock, poetry.lock, pdm.lock, uv.lock) is present.
- Parses PEP 621 dependencies and optional-dependencies, PEP 735 dependency-groups, and Poetry dependency sections.
- Keeps exact pins as versions.
- Preserves version ranges as datadog:version-range.
- Skips unsupported direct references, arbitrary equality (===), and unversioned dependencies.
- All packages are marked as direct dependencies requiring transitive enrichment.
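The pinning rules above can be sketched as a small classifier. This is an illustration only, not the tool's implementation: `Kind`, its constants, and `classifyRequirement` are hypothetical names, and the parsing is a simplified reading of PEP 508 requirement strings (extras are left in the name, and wildcard pins such as `==1.2.*` are treated as ranges by assumption).

```go
package main

import (
	"fmt"
	"strings"
)

// Kind is a hypothetical label for how a declared requirement is handled.
type Kind string

const (
	KindExact   Kind = "exact"   // exact pin, kept as the package version
	KindRange   Kind = "range"   // range, preserved as datadog:version-range
	KindSkipped Kind = "skipped" // direct reference, ===, or unversioned
)

// classifyRequirement applies the rules listed above to a simplified
// PEP 508 requirement string. Extras such as "pkg[extra]" stay in the name.
func classifyRequirement(req string) (name string, kind Kind, spec string) {
	// Environment markers after ";" do not affect the version specifier.
	if i := strings.Index(req, ";"); i >= 0 {
		req = req[:i]
	}
	req = strings.TrimSpace(req)

	// "name @ url" is a direct reference, which is skipped.
	if i := strings.Index(req, "@"); i >= 0 {
		return strings.TrimSpace(req[:i]), KindSkipped, ""
	}

	// No comparison operator at all means the dependency is unversioned.
	i := strings.IndexAny(req, "=<>!~")
	if i < 0 {
		return req, KindSkipped, ""
	}
	name = strings.TrimSpace(req[:i])
	spec = strings.TrimSpace(req[i:])

	switch {
	case strings.HasPrefix(spec, "==="):
		// Arbitrary equality is skipped.
		return name, KindSkipped, ""
	case strings.HasPrefix(spec, "==") && !strings.ContainsAny(spec, ",*"):
		// A single "==" clause without wildcards is an exact pin.
		return name, KindExact, strings.TrimSpace(strings.TrimPrefix(spec, "=="))
	default:
		return name, KindRange, spec
	}
}

func main() {
	for _, req := range []string{
		"requests==2.31.0",
		"flask>=2.0,<3",
		"pkg===1.0",
		"numpy",
		"mypkg @ https://example.com/mypkg.whl",
	} {
		name, kind, spec := classifyRequirement(req)
		fmt.Printf("%-40s name=%s kind=%s spec=%q\n", req, name, kind, spec)
	}
}
```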
Java
Maven
- This tool only supports extracting packages and locations from pom.xml.
- It can only scan pom.xml files which are stored in the same repository.
- If a pom file defines a parent that is not stored in the repository or is an artifact hosted by an artifact registry, the scanner will try to download it from Maven Central. If the scanner cannot locate it there, or cannot access it, it won't be able to resolve the version.
Gradle
- This tool only supports extracting packages from gradle.lockfile.
- This tool only supports package information enrichment from build.gradle and gradle/verification-metadata.xml files.
Bazel
- This tool supports extracting packages from maven_install.json (and any {name}_maven_install.json variant) produced by rules_jvm_external.
- Both the v1 dependency_tree format (rules_jvm_external < 5.1) and the v2/v3 artifacts map format (rules_jvm_external ≥ 5.1) are supported.
- IsDirect is not set; distinguishing direct from transitive dependencies would require parsing the Bazel workspace files.
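One plausible way to tell the two lockfile layouts apart is to look at the top-level JSON keys named above. The sketch below assumes that `dependency_tree` and `artifacts` appear as top-level keys in the respective formats; `detectLockFormat` is a hypothetical helper, the sample documents are hand-written, and the scanner itself may detect the format differently.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// detectLockFormat reports which layout a maven_install.json document
// appears to use, based only on its top-level keys (an assumption).
func detectLockFormat(data []byte) (string, error) {
	var top map[string]json.RawMessage
	if err := json.Unmarshal(data, &top); err != nil {
		return "", fmt.Errorf("parsing maven_install.json: %w", err)
	}
	if _, ok := top["dependency_tree"]; ok {
		return "v1", nil // dependency_tree format
	}
	if _, ok := top["artifacts"]; ok {
		return "v2+", nil // artifacts map format
	}
	return "", errors.New("unrecognized maven_install.json layout")
}

func main() {
	// Hand-written, heavily trimmed samples for illustration only.
	v1 := []byte(`{"dependency_tree": {"dependencies": []}}`)
	v2 := []byte(`{"artifacts": {"com.google.guava:guava": {"version": "32.1.2-jre"}}}`)

	for _, doc := range [][]byte{v1, v2} {
		format, err := detectLockFormat(doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("detected format:", format)
	}
}
```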
JavaScript and TypeScript
NPM, Yarn and PNPM have workspace support
Bun
- This tool only supports extracting packages from bun.lock (the text JSONC lockfile introduced in Bun 1.2). The legacy binary bun.lockb format is not supported.
- This tool only supports package information enrichment from package.json.
package.json (no lockfile)
- This tool supports extracting packages directly from package.json when no active lockfile (package-lock.json, yarn.lock, pnpm-lock.yaml) is present in the package directory or an ancestor directory.
- Parses dependencies, devDependencies, and optionalDependencies sections.
- Keeps exact pins as versions.
- Preserves version ranges as datadog:version-range.
- Skips unsupported non-version specifiers such as local file paths, URLs, and dist-tags.
- All packages are marked as direct dependencies requiring transitive enrichment.
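To make the rules above concrete, here is a minimal sketch that reads the three sections of a package.json document and labels each specifier. It is not the scanner's code: `classifySpecifier`, `classifyManifest`, and the `manifest` type are hypothetical, and the range detection is a deliberate simplification of npm's semver grammar (for example, a bare `x` or a `v`-prefixed version is treated as skipped here).

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// exactVersion matches a plain semver version such as 1.2.3 or 1.2.3-beta.1.
var exactVersion = regexp.MustCompile(`^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// classifySpecifier returns "exact", "range", or "skipped" for one specifier.
func classifySpecifier(spec string) string {
	spec = strings.TrimSpace(spec)
	switch {
	case spec == "":
		return "skipped"
	case strings.ContainsAny(spec, ":/"):
		return "skipped" // file: paths, URLs, git and GitHub shorthands
	case exactVersion.MatchString(spec):
		return "exact"
	case strings.ContainsRune("^~<>=*0123456789", rune(spec[0])):
		return "range"
	default:
		return "skipped" // dist-tags such as "latest"
	}
}

// manifest holds only the three package.json sections that are parsed.
type manifest struct {
	Dependencies         map[string]string `json:"dependencies"`
	DevDependencies      map[string]string `json:"devDependencies"`
	OptionalDependencies map[string]string `json:"optionalDependencies"`
}

// classifyManifest maps each declared package name to its classification.
func classifyManifest(data []byte) (map[string]string, error) {
	var m manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, fmt.Errorf("parsing package.json: %w", err)
	}
	out := map[string]string{}
	for _, section := range []map[string]string{m.Dependencies, m.DevDependencies, m.OptionalDependencies} {
		for name, spec := range section {
			out[name] = classifySpecifier(spec)
		}
	}
	return out, nil
}

func main() {
	sample := []byte(`{
	  "dependencies": {"left-pad": "1.3.0", "lodash": "^4.17.0"},
	  "devDependencies": {"local-tool": "file:../local-tool", "typescript": "latest"},
	  "optionalDependencies": {"fsevents": ">=2.0.0 <3.0.0"}
	}`)

	result, err := classifyManifest(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	names := make([]string, 0, len(result))
	for name := range result {
		names = append(names, name)
	}
	sort.Strings(names) // stable output order
	for _, name := range names {
		fmt.Printf("%s: %s\n", name, result[name])
	}
}
```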
NPM
- This tool only supports extracting packages from package-lock.json.
- This tool only supports package information enrichment from package.json.
Yarn
- This tool only supports extracting packages from yarn.lock.
- This tool only supports package information enrichment from package.json.
PNPM
- This tool only supports extracting packages from pnpm-lock.yaml.
- This tool only supports package information enrichment from package.json.
.NET
Nuget
- This tool supports extracting packages from packages.lock.json and *.csproj.
- This tool only supports package information enrichment from *.csproj when parsing packages.lock.json.
- Central and build configuration discovery:
  - The tool automatically discovers Directory.Packages.props and Directory.Build.props.
  - Discovery is limited to the scanned directory.
  - Only configuration files found within the scan scope are considered; parent directories outside the scan root are intentionally ignored.
Ruby
Bundler
- This tool only supports extracting packages from Gemfile.lock.
- This tool only supports package information enrichment from Gemfile and *.gemspec.
- If the version of a package is defined in a variable, the location reported by the scanner will be the usage of the variable.
- Dependencies sourced from Git repositories won't have any version reported.
C++
Conan
- This tool only supports extracting packages from conan.lock.
Rust
Crates
- This tool only supports extracting packages from Cargo.lock.
- This tool supports package information enrichment from Cargo.toml, including dependencies declared in [dependencies], [dev-dependencies], and [build-dependencies] sections.
- Workspace support is not currently available.
- Renaming dependencies is not supported.
Swift
Swift Package Manager
- This tool only supports extracting packages from Package.resolved (v1, v2, and v3 formats).
- This tool only supports package information enrichment from Package.swift.
- When Xcode writes the lockfile at .swiftpm/configuration/Package.resolved, Package.swift enrichment is not available because the manifest is two directory levels above the lockfile.
- IsDirect is only set for packages declared with a URL in Package.swift. Registry-based dependencies (.package(id: ...)) are always reported as transitive.
Using as a Go library
In addition to the CLI binary, datadog-sbom-generator can be imported as a Go library.
Quick start — run the bundled example
The repository ships with a ready-made example program at examples/main.go that you can run directly against any local directory to see the full library output:
# Scan the built-in Cargo.lock fixture (no arguments needed)
go run examples/main.go
# Scan your own repository
go run examples/main.go /path/to/your/repo
The program prints the generated CycloneDX SBOM JSON to stdout and a summary of all manifest build files (with their dependencies) to stderr. It is the fastest way to verify library behaviour end-to-end without writing any code.
Installation
go get github.com/DataDog/datadog-sbom-generator/pkg/sbomgen
Usage
package main

import (
	"fmt"
	"log"

	"github.com/DataDog/datadog-sbom-generator/pkg/sbomgen"
)

func main() {
	// GenerateSBOM returns the SBOM as JSON bytes.
	sbom, err := sbomgen.GenerateSBOM([]string{"/path/to/repo"}, sbomgen.DefaultOptions())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(sbom)) // CycloneDX 1.5 JSON

	// Extract the manifest build files and their transitive dependencies.
	buildFiles := sbomgen.GetBuildFileTrees(sbom)
	for bf, rels := range buildFiles {
		fmt.Printf("[%s] %s (id: %s)\n", bf.FileType, bf.FilePath, rels.ID)
		for _, dep := range rels.Dependencies {
			fmt.Printf(" dep: %s\n", dep.FilePath)
		}
	}
}
API
GenerateSBOM(dirs []string, opts Options) ([]byte, error)
Scans the given directories for lockfiles and returns a CycloneDX 1.5 SBOM as pretty-printed JSON bytes.
Returns an error if dirs is empty or if the scan fails.
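Because the result is plain CycloneDX JSON, it can be decoded with the standard library alone. The sketch below models only a handful of standard CycloneDX fields (`bomFormat`, `specVersion`, `components`, and each component's `type`, `name`, `version`, `purl`); the `bom` and `component` types and the `summarize` helper are hypothetical, and the sample document is hand-written rather than real output from the tool.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bom models only the CycloneDX fields this sketch reads.
type bom struct {
	BOMFormat   string      `json:"bomFormat"`
	SpecVersion string      `json:"specVersion"`
	Components  []component `json:"components"`
}

// component is a trimmed view of a CycloneDX component entry.
type component struct {
	Type    string `json:"type"`
	Name    string `json:"name"`
	Version string `json:"version"`
	PURL    string `json:"purl"`
}

// summarize decodes SBOM bytes and returns one line per component.
func summarize(sbom []byte) ([]string, error) {
	var b bom
	if err := json.Unmarshal(sbom, &b); err != nil {
		return nil, fmt.Errorf("decoding SBOM: %w", err)
	}
	if b.BOMFormat != "CycloneDX" {
		return nil, fmt.Errorf("unexpected bomFormat %q", b.BOMFormat)
	}
	lines := make([]string, 0, len(b.Components))
	for _, c := range b.Components {
		lines = append(lines, fmt.Sprintf("%s %s@%s %s", c.Type, c.Name, c.Version, c.PURL))
	}
	return lines, nil
}

func main() {
	// Hand-written sample standing in for the bytes returned by GenerateSBOM.
	sample := []byte(`{
	  "bomFormat": "CycloneDX",
	  "specVersion": "1.5",
	  "components": [
	    {"type": "library", "name": "serde", "version": "1.0.0", "purl": "pkg:cargo/serde@1.0.0"}
	  ]
	}`)

	lines, err := summarize(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```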
GetBuildFileTrees(sbom []byte, filters ...FileType) map[BuildFile]BuildFileRelations
Parses a CycloneDX SBOM (as returned by GenerateSBOM) and returns all manifest build files enriched with their transitive dependencies.
Each BuildFile carries the file type (e.g. FileTypePomXML) and its path relative to the repository root. Each BuildFileRelations value holds an ID string (ecosystem-specific identifier, e.g. Maven "groupId:artifactId") and a Dependencies slice containing all transitively reachable build files sorted by file path.
Pass one or more FileType constants to restrict the result to a specific ecosystem:
// Only pom.xml files, with Maven transitive dependencies resolved.
mavenFiles := sbomgen.GetBuildFileTrees(sbom, sbomgen.FileTypePomXML)
DefaultOptions() Options
Returns sensible defaults: recursive scanning enabled, no path exclusions.
Options
| Field | Type | Description |
|---|---|---|
| Recursive | bool | Scan subdirectories recursively (default: true) |
| ExcludePaths | []string | Glob patterns to exclude from scanning |
| ExtractMavenPomArtifactIds | bool | Emit file-type components and dependency edges for Maven POMs, enabling GetBuildFileTrees dependency and ID resolution (default: true) |
Contributing
Contributions are welcome! You can contribute by:
- Reporting issues or requesting features via GitHub Issues
- Submitting pull requests with improvements or bug fixes
For detailed information on building, testing, and developing the project, see CONTRIBUTING.md.
License
The Datadog version of datadog-sbom-generator is licensed under the Apache License, Version 2.0.
Acknowledgement
This project builds upon portions of the osv-scanner project originally developed by Google and released under the Apache License 2.0.
We thank the original authors for their foundational work.