datadog-sbom-generator

v1.16.0
Published: Jun 4, 2026 License: Apache-2.0

README

Datadog-Sbom-Generator

This repository contains the source code of Datadog's SBOM Generator. It scans a cloned repository folder, extracts the dependencies that would be installed on your systems, and produces a CycloneDX SBOM from them.

If you're interested in this repository, you might be interested in Setting up Software Composition Analysis in your repositories.

How to install

  1. Go to the release page
  2. Select the version you want to use (or use the latest version)
  3. Download the asset depending on your operating system and your CPU architecture
  4. Unzip the asset

Running the scanner

To scan a repository folder and generate an SBOM, use this command:

datadog-sbom-generator scan -o "/tmp/sbom.json" "/path/to/directory"

For detailed documentation on all commands and options, see USAGE.md.

You can also get help directly from the command line:

datadog-sbom-generator --help
datadog-sbom-generator scan --help
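
The file written by the scan can be consumed like any other CycloneDX JSON document. The sketch below is an illustration only and is not part of this tool: it decodes just the bomFormat, specVersion, and components fields (name, version, purl) defined by the CycloneDX JSON schema. The parseSBOM helper and the sample document are made up for the example; in practice you would read the file passed to -o.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// component holds the subset of CycloneDX component fields used here.
type component struct {
	Name    string `json:"name"`
	Version string `json:"version"`
	PURL    string `json:"purl"`
}

// bom holds the subset of the CycloneDX document used here.
type bom struct {
	BOMFormat   string      `json:"bomFormat"`
	SpecVersion string      `json:"specVersion"`
	Components  []component `json:"components"`
}

// parseSBOM (hypothetical helper) decodes CycloneDX JSON bytes and
// returns the components it lists.
func parseSBOM(data []byte) ([]component, error) {
	var doc bom
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, fmt.Errorf("decoding SBOM: %w", err)
	}
	if doc.BOMFormat != "CycloneDX" {
		return nil, fmt.Errorf("unexpected bomFormat %q", doc.BOMFormat)
	}
	return doc.Components, nil
}

func main() {
	// Made-up sample; replace with os.ReadFile("/tmp/sbom.json") for real output.
	sample := []byte(`{
	  "bomFormat": "CycloneDX",
	  "specVersion": "1.5",
	  "components": [
	    {"name": "serde", "version": "1.0.0", "purl": "pkg:cargo/serde@1.0.0"}
	  ]
	}`)

	comps, err := parseSBOM(sample)
	if err != nil {
		log.Fatal(err)
	}
	for _, c := range comps {
		fmt.Printf("%s %s (%s)\n", c.Name, c.Version, c.PURL)
	}
}
```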

Supported package managers

This tool sources all dependencies by parsing package manager files. As new package managers appear every day, we do not support all of them. Here is the list of supported package managers:

Language    Package Manager
.NET        Nuget
C++         Conan
Go          Golang
Java        Gradle, Maven, Bazel (rules_jvm_external)
JavaScript  Bun, NPM, PNPM, Yarn, package.json
PHP         Composer
Python      Pdm, Pipenv, Poetry, Requirements, uv
Ruby        Bundler
Rust        Crates
Swift       Swift Package Manager

Limitations

Datadog SBOM Generator reads package manager dependency declaration files or their lock files. This means it can only detect dependencies that are declared in the standard way enforced by each supported dependency manager.

Known limitations are listed below by language.

Python

This tool supports extracting packages from the following Python lockfiles and requirements files:

  • requirements*.txt
  • Pipfile.lock
  • poetry.lock
  • pdm.lock
  • uv.lock

This tool also supports enriching lockfile-derived package information from the following package manager declaration files:

  • Pipfile
  • pyproject.toml
pyproject.toml (no lockfile)
  • This tool supports extracting packages directly from pyproject.toml when no sibling lockfile (Pipfile.lock, poetry.lock, pdm.lock, uv.lock) is present.
  • Parses PEP 621 dependencies and optional-dependencies, PEP 735 dependency-groups, and Poetry dependency sections.
  • Keeps exact pins as versions.
  • Preserves version ranges as datadog:version-range.
  • Skips unsupported direct references, arbitrary equality (===), and unversioned dependencies.
  • All packages are marked as direct dependencies requiring transitive enrichment.
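
The pin, range, and skip rules above can be sketched as a small classifier over PEP 508 requirement strings. This is a simplified illustration under stated assumptions, not the tool's actual parser: the classify function and its types are hypothetical, environment markers are simply dropped, and anything containing "@" is treated as a direct reference.

```go
package main

import (
	"fmt"
	"strings"
)

// kind describes how a requirement would be reported in this sketch.
type kind string

const (
	exactPin     kind = "exact pin"     // kept as the component version
	versionRange kind = "version range" // preserved as datadog:version-range
	skipped      kind = "skipped"       // not reported
)

// requirement is the hypothetical result of classifying one entry.
type requirement struct {
	Name string
	Kind kind
	Spec string
}

// classify applies the rules listed above to a single requirement string.
func classify(req string) requirement {
	// Drop the environment marker, if any.
	req, _, _ = strings.Cut(req, ";")
	req = strings.TrimSpace(req)

	// Direct references ("name @ url") are skipped.
	if strings.Contains(req, "@") {
		return requirement{Kind: skipped}
	}

	// The name ends at the first extras, specifier, or space character.
	i := strings.IndexAny(req, "[<>=!~ (")
	if i < 0 {
		return requirement{Name: req, Kind: skipped} // unversioned
	}
	name, rest := req[:i], req[i:]

	// Ignore extras such as "[standard]".
	if strings.HasPrefix(rest, "[") {
		_, after, found := strings.Cut(rest, "]")
		if !found {
			return requirement{Name: name, Kind: skipped}
		}
		rest = after
	}
	rest = strings.ReplaceAll(strings.Trim(strings.TrimSpace(rest), "()"), " ", "")

	switch {
	case rest == "", strings.Contains(rest, "==="):
		// Unversioned or arbitrary equality.
		return requirement{Name: name, Kind: skipped}
	case strings.HasPrefix(rest, "==") && !strings.ContainsAny(rest, ",*"):
		return requirement{Name: name, Kind: exactPin, Spec: strings.TrimPrefix(rest, "==")}
	default:
		return requirement{Name: name, Kind: versionRange, Spec: rest}
	}
}

func main() {
	for _, r := range []string{
		"requests==2.31.0",
		"flask>=2.0,<3.0",
		"numpy",
		"pkg===1.0",
		"mylib @ https://example.com/mylib.whl",
	} {
		fmt.Printf("%-40s %+v\n", r, classify(r))
	}
}
```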
Java
Maven
  • This tool only supports extracting packages and locations from pom.xml.
  • It can only scan pom.xml files which are stored in the same repository.
  • If a pom file defines a parent that is not stored in the repository, or that is an artifact hosted by an artifact registry, the scanner will try to download it from Maven Central. If the scanner cannot locate it there, or cannot access it, it won't be able to resolve the version.
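
To check which parent a pom.xml declares before scanning, the parent coordinates can be read with the Go standard library. The sketch below is an illustration and not the scanner's own resolution logic: readParent and its types are hypothetical, and only the standard parent fields of the POM (groupId, artifactId, version, relativePath) are decoded.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// parentRef mirrors the <parent> element of a pom.xml.
type parentRef struct {
	GroupID      string `xml:"groupId"`
	ArtifactID   string `xml:"artifactId"`
	Version      string `xml:"version"`
	RelativePath string `xml:"relativePath"` // empty when absent or empty
}

// project holds the subset of the POM used in this sketch.
type project struct {
	XMLName xml.Name   `xml:"project"`
	Parent  *parentRef `xml:"parent"`
}

// readParent (hypothetical helper) returns the declared parent,
// or nil when the POM has no <parent> element.
func readParent(pom []byte) (*parentRef, error) {
	var p project
	if err := xml.Unmarshal(pom, &p); err != nil {
		return nil, fmt.Errorf("parsing pom.xml: %w", err)
	}
	return p.Parent, nil
}

func main() {
	// Made-up sample POM for illustration.
	sample := []byte(`<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>parent-pom</artifactId>
    <version>1.2.3</version>
  </parent>
  <artifactId>app</artifactId>
</project>`)

	parent, err := readParent(sample)
	if err != nil {
		log.Fatal(err)
	}
	if parent == nil {
		fmt.Println("no parent declared")
		return
	}
	fmt.Printf("parent: %s:%s:%s\n", parent.GroupID, parent.ArtifactID, parent.Version)
}
```

A parent that is neither present in the scanned repository nor reachable on Maven Central is the case described above where versions cannot be resolved.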
Gradle
  • This tool only supports extracting packages from gradle.lockfile.
  • This tool only supports package information enrichment from build.gradle and gradle/verification-metadata.xml files.
Bazel
  • This tool supports extracting packages from maven_install.json (and any {name}_maven_install.json variant) produced by rules_jvm_external.
  • Both the v1 dependency_tree format (rules_jvm_external < 5.1) and the v2/v3 artifacts map format (rules_jvm_external ≥ 5.1) are supported.
  • IsDirect is not set; distinguishing direct from transitive dependencies would require parsing the Bazel workspace files.
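
One way to tell the two lockfile layouts apart is to look at the top-level JSON keys. The sketch below is an assumption-laden illustration, not the tool's implementation: it assumes that "dependency_tree" and "artifacts" (the names used above) appear as top-level keys of maven_install.json, and detectFormat is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"log"
)

// detectFormat (hypothetical helper) reports which maven_install.json
// layout the given bytes use, based only on top-level keys.
// Assumption: v1 files have a top-level "dependency_tree" key and
// v2/v3 files have a top-level "artifacts" key.
func detectFormat(data []byte) (string, error) {
	var top map[string]json.RawMessage
	if err := json.Unmarshal(data, &top); err != nil {
		return "", fmt.Errorf("decoding lockfile: %w", err)
	}
	if _, ok := top["dependency_tree"]; ok {
		return "v1", nil
	}
	if _, ok := top["artifacts"]; ok {
		return "v2/v3", nil
	}
	return "", errors.New("unrecognized maven_install.json layout")
}

func main() {
	// Made-up minimal samples for illustration.
	samples := []string{
		`{"dependency_tree": {"dependencies": []}}`,
		`{"artifacts": {}, "version": "2"}`,
	}
	for _, s := range samples {
		format, err := detectFormat([]byte(s))
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(format)
	}
}
```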
JavaScript and TypeScript

NPM, Yarn, and PNPM have workspace support.

Bun
  • This tool only supports extracting packages from bun.lock (the text JSONC lockfile introduced in Bun 1.2). The legacy binary bun.lockb format is not supported.
  • This tool only supports package information enrichment from package.json.
package.json (no lockfile)
  • This tool supports extracting packages directly from package.json when no active lockfile (package-lock.json, yarn.lock, pnpm-lock.yaml) is present in the package directory or an ancestor directory.
  • Parses dependencies, devDependencies, and optionalDependencies sections.
  • Keeps exact pins as versions.
  • Preserves version ranges as datadog:version-range.
  • Skips unsupported non-version specifiers such as local file paths, URLs, and dist-tags.
  • All packages are marked as direct dependencies requiring transitive enrichment.
NPM
  • This tool only supports extracting packages from package-lock.json.
  • This tool only supports package information enrichment from package.json.
Yarn
  • This tool only supports extracting packages from yarn.lock.
  • This tool only supports package information enrichment from package.json.
PNPM
  • This tool only supports extracting packages from pnpm-lock.yaml.
  • This tool only supports package information enrichment from package.json.
.NET
Nuget
  • This tool supports extracting packages from packages.lock.json and *.csproj.
  • This tool only supports package information enrichment from *.csproj when parsing packages.lock.json.
  • Central and build configuration discovery:
    • The tool automatically discovers Directory.Packages.props and Directory.Build.props.
    • Discovery is limited to the scanned directory.
    • Only configuration files found within the scan scope are considered; parent directories outside the scan root are intentionally ignored.
Ruby
Bundler
  • This tool only supports extracting packages from Gemfile.lock.
  • This tool only supports package information enrichment from Gemfile and *.gemspec.
  • If the version of a package is defined in a variable, the location reported by the scanner will be the usage of the variable.
  • Dependencies sourced from Git repositories won't have any version reported.
C++
Conan
  • This tool only supports extracting packages from conan.lock.
Rust
Crates
  • This tool only supports extracting packages from Cargo.lock.
  • This tool supports package information enrichment from Cargo.toml, including dependencies declared in [dependencies], [dev-dependencies], and [build-dependencies] sections.
  • Workspace support is not currently available.
  • Renaming dependencies is not supported.
Swift
Swift Package Manager
  • This tool only supports extracting packages from Package.resolved (v1, v2, and v3 formats).
  • This tool only supports package information enrichment from Package.swift.
  • When Xcode writes the lockfile at .swiftpm/configuration/Package.resolved, Package.swift enrichment is not available because the manifest is two directory levels above the lockfile.
  • IsDirect is only set for packages declared with a URL in Package.swift. Registry-based dependencies (.package(id: ...)) are always reported as transitive.

Using as a Go library

In addition to the CLI binary, datadog-sbom-generator can be imported as a Go library.

Quick start — run the bundled example

The repository ships with a ready-made example program at examples/main.go that you can run directly against any local directory to see the full library output:

# Scan the built-in Cargo.lock fixture (no arguments needed)
go run examples/main.go

# Scan your own repository
go run examples/main.go /path/to/your/repo

The program prints the generated CycloneDX SBOM JSON to stdout and a summary of all manifest build files (with their dependencies) to stderr. It is the fastest way to verify library behaviour end-to-end without writing any code.

Installation
go get github.com/DataDog/datadog-sbom-generator/pkg/sbomgen
Usage
package main

import (
    "fmt"
    "log"

    "github.com/DataDog/datadog-sbom-generator/pkg/sbomgen"
)

func main() {
    // GenerateSBOM returns the SBOM as JSON bytes, which are used below.
    sbom, err := sbomgen.GenerateSBOM([]string{"/path/to/repo"}, sbomgen.DefaultOptions())
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(sbom)) // CycloneDX 1.5 JSON

    // Extract the manifest build files and their transitive dependencies.
    buildFiles := sbomgen.GetBuildFileTrees(sbom)
    for bf, rels := range buildFiles {
        fmt.Printf("[%s] %s (id: %s)\n", bf.FileType, bf.FilePath, rels.ID)
        for _, dep := range rels.Dependencies {
            fmt.Printf("  dep: %s\n", dep.FilePath)
        }
    }
}
API
GenerateSBOM(dirs []string, opts Options) ([]byte, error)

Scans the given directories for lockfiles and returns a CycloneDX 1.5 SBOM as pretty-printed JSON bytes.

Returns an error if dirs is empty or if the scan fails.

GetBuildFileTrees(sbom []byte, filters ...FileType) map[BuildFile]BuildFileRelations

Parses a CycloneDX SBOM (as returned by GenerateSBOM) and returns all manifest build files enriched with their transitive dependencies.

Each BuildFile carries the file type (e.g. FileTypePomXML) and its path relative to the repository root. Each BuildFileRelations value holds an ID string (ecosystem-specific identifier, e.g. Maven "groupId:artifactId") and a Dependencies slice containing all transitively reachable build files sorted by file path.
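
The phrase "all transitively reachable build files sorted by file path" can be illustrated with a plain graph traversal. The sketch below uses only the standard library and is not the library's implementation: transitiveDeps and the sample graph are hypothetical, and build files are represented as simple path strings rather than BuildFile values.

```go
package main

import (
	"fmt"
	"sort"
)

// transitiveDeps (hypothetical helper) returns every file reachable from
// start by following direct dependency edges, sorted by path.
// The start file itself is not included, and cycles are tolerated.
func transitiveDeps(direct map[string][]string, start string) []string {
	seen := map[string]bool{start: true}
	var reachable []string
	stack := []string{start}
	for len(stack) > 0 {
		current := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, next := range direct[current] {
			if !seen[next] {
				seen[next] = true
				reachable = append(reachable, next)
				stack = append(stack, next)
			}
		}
	}
	sort.Strings(reachable)
	return reachable
}

func main() {
	// Made-up graph: app depends on lib, lib depends on core.
	direct := map[string][]string{
		"app/pom.xml": {"lib/pom.xml"},
		"lib/pom.xml": {"core/pom.xml"},
	}
	for _, dep := range transitiveDeps(direct, "app/pom.xml") {
		fmt.Println(dep)
	}
}
```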

Pass one or more FileType constants to restrict the result to a specific ecosystem:

// Only pom.xml files, with Maven transitive dependencies resolved.
mavenFiles := sbomgen.GetBuildFileTrees(sbom, sbomgen.FileTypePomXML)
DefaultOptions() Options

Returns sensible defaults: recursive scanning enabled, no path exclusions.

Options
Field                       Type      Description
Recursive                   bool      Scan subdirectories recursively (default: true)
ExcludePaths                []string  Glob patterns to exclude from scanning
ExtractMavenPomArtifactIds  bool      Emit file-type components and dependency edges for Maven POMs, enabling GetBuildFileTrees dependency and ID resolution (default: true)

Contributing

Contributions are welcome! You can contribute by:

  • Reporting issues or requesting features via GitHub Issues
  • Submitting pull requests with improvements or bug fixes

For detailed information on building, testing, and developing the project, see CONTRIBUTING.md.

License

The Datadog version of datadog-sbom-generator is licensed under the Apache License, Version 2.0.

Acknowledgement

This project builds upon portions of the osv-scanner project originally developed by Google and released under the Apache License 2.0. We thank the original authors for their foundational work.

Directories

Path Synopsis
cmd
This program demonstrates using the sbomgen library to generate a CycloneDX SBOM and extract build file dependencies from it.
internal
pkg
reporter
Package reporter is a generated GoMock package.
sbomgen
Package sbomgen provides a library API for generating CycloneDX SBOMs.
