spdx

package module
v0.1.0 Latest
Published: Jan 18, 2026 License: MIT Imports: 11 Imported by: 0

README

spdx

Go library for SPDX license expression parsing, normalization, and validation.

Normalizes informal license strings from the real world (like "Apache 2" or "MIT License") to valid SPDX identifiers (like "Apache-2.0" or "MIT"). Useful when working with package metadata from registries where license fields often contain non-standard values.

Installation

go get github.com/git-pkgs/spdx

Usage

Normalize informal license strings

import "github.com/git-pkgs/spdx"

// Normalize converts informal strings to valid SPDX identifiers
id, err := spdx.Normalize("Apache 2")                   // "Apache-2.0"
id, err = spdx.Normalize("MIT License")                 // "MIT"
id, err = spdx.Normalize("GPL v3")                      // "GPL-3.0-or-later"
id, err = spdx.Normalize("GNU General Public License")  // "GPL-3.0-or-later"
id, err = spdx.Normalize("BSD 3-Clause")                // "BSD-3-Clause"
id, err = spdx.Normalize("CC BY 4.0")                   // "CC-BY-4.0"

Parse and normalize expressions

// Parse handles both strict SPDX IDs and informal license names
expr, err := spdx.Parse("MIT OR Apache-2.0")
fmt.Println(expr.String())  // "MIT OR Apache-2.0"

expr, err = spdx.Parse("Apache 2 OR MIT License")
fmt.Println(expr.String())  // "Apache-2.0 OR MIT"

expr, err = spdx.Parse("GPL v3 AND BSD 3-Clause")
fmt.Println(expr.String())  // "GPL-3.0-or-later AND BSD-3-Clause"

// Handles operator precedence (AND binds tighter than OR)
expr, err = spdx.Parse("MIT OR GPL-2.0-only AND Apache-2.0")
fmt.Println(expr.String())  // "MIT OR (GPL-2.0-only AND Apache-2.0)"

// ParseStrict requires valid SPDX IDs (no fuzzy normalization)
expr, err = spdx.ParseStrict("MIT OR Apache-2.0")  // succeeds
expr, err = spdx.ParseStrict("Apache 2 OR MIT")    // fails

Validate licenses

// Check if a string is valid SPDX
spdx.Valid("MIT OR Apache-2.0")     // true
spdx.Valid("FAKEYLICENSE")          // false

// Check if a single identifier is valid
spdx.ValidLicense("MIT")            // true
spdx.ValidLicense("Apache 2")       // false (informal, not valid SPDX)

// Validate multiple licenses at once
valid, invalid := spdx.ValidateLicenses([]string{"MIT", "Apache-2.0", "FAKE"})
// valid: false, invalid: ["FAKE"]

Check license compatibility

// Check if allowed licenses satisfy an expression
satisfied, err := spdx.Satisfies("MIT OR Apache-2.0", []string{"MIT"})
// true

satisfied, err = spdx.Satisfies("MIT AND Apache-2.0", []string{"MIT"})
// false (both required)

Extract licenses from expressions

licenses, err := spdx.ExtractLicenses("(MIT AND GPL-2.0-only) OR Apache-2.0")
// ["Apache-2.0", "GPL-2.0-only", "MIT"]

Get license categories

Categories are sourced from scancode-licensedb (OSS licenses only) and updated weekly.

// Get the category for a license
cat := spdx.LicenseCategory("MIT")          // spdx.CategoryPermissive
cat = spdx.LicenseCategory("GPL-3.0-only")  // spdx.CategoryCopyleft
cat = spdx.LicenseCategory("MPL-2.0")       // spdx.CategoryCopyleftLimited
cat = spdx.LicenseCategory("Unlicense")     // spdx.CategoryPublicDomain

// Check license type
spdx.IsPermissive("MIT")        // true
spdx.IsPermissive("GPL-3.0")    // false
spdx.IsCopyleft("GPL-3.0-only") // true
spdx.IsCopyleft("LGPL-2.1")     // true (weak copyleft)

// Get categories for an expression
cats, err := spdx.ExpressionCategories("MIT OR GPL-3.0-only")
// []Category{CategoryPermissive, CategoryCopyleft}

// Check expressions for copyleft
spdx.HasCopyleft("MIT OR Apache-2.0")     // false
spdx.HasCopyleft("MIT OR GPL-3.0-only")   // true
spdx.IsFullyPermissive("MIT OR Apache-2.0") // true
spdx.IsFullyPermissive("MIT OR GPL-3.0")    // false

// Get detailed license info
info := spdx.GetLicenseInfo("MIT")
// info.Category: CategoryPermissive
// info.IsException: false
// info.IsDeprecated: false

Available categories:

  • CategoryPermissive - MIT, Apache-2.0, BSD-*
  • CategoryCopyleft - GPL-*, AGPL-*
  • CategoryCopyleftLimited - LGPL-*, MPL-*, EPL-*
  • CategoryPublicDomain - Unlicense, CC0-1.0
  • CategoryCommercial - Commercial licenses
  • CategoryProprietaryFree - Free but proprietary
  • CategorySourceAvailable - Source-available licenses
  • CategoryPatentLicense - Patent grants
  • CategoryFreeRestricted - Free with restrictions
  • CategoryCLA - Contributor agreements
  • CategoryUnstated - No license stated

Normalization examples

The library handles many common variations found in package registries:

Input                          Output
-----------------------------  ----------------
Apache 2                       Apache-2.0
Apache License 2.0             Apache-2.0
Apache License, Version 2.0    Apache-2.0
MIT License                    MIT
M.I.T.                         MIT
GPL v3                         GPL-3.0-or-later
GNU General Public License v3  GPL-3.0-or-later
LGPL 2.1                       LGPL-2.1-only
BSD 3-Clause                   BSD-3-Clause
3-Clause BSD                   BSD-3-Clause
Simplified BSD                 BSD-2-Clause
MPL 2.0                        MPL-2.0
Mozilla Public License         MPL-2.0
CC BY 4.0                      CC-BY-4.0
Attribution-NonCommercial      CC-BY-NC-4.0
Unlicense                      Unlicense
WTFPL                          WTFPL

Performance

Designed for processing large numbers of licenses:

BenchmarkNormalize-8       49116    24381 ns/op   (~5µs per license)
BenchmarkNormalizeBatch-8    372  3271336 ns/op   (~3.3µs per license at scale)
BenchmarkParse-8          236752     5263 ns/op   (includes normalization)
BenchmarkValid-8          789087     1506 ns/op   (strict validation)

Prior art

This library combines approaches from several existing implementations:

License

MIT

Documentation

Overview

Package spdx provides SPDX license expression parsing, normalization, and validation. It normalizes informal license strings (like "Apache 2" or "MIT License") to valid SPDX identifiers (like "Apache-2.0" or "MIT"), and validates/parses SPDX expressions.

Constants

This section is empty.

Variables

var (
	ErrEmptyExpression     = errors.New("empty expression")
	ErrUnexpectedToken     = errors.New("unexpected token")
	ErrUnbalancedParens    = errors.New("unbalanced parentheses")
	ErrInvalidLicenseID    = errors.New("invalid license identifier")
	ErrInvalidException    = errors.New("invalid exception identifier")
	ErrMissingOperand      = errors.New("missing operand")
	ErrInvalidSpecialValue = errors.New("NONE and NOASSERTION must be standalone")
)

Parser errors

var ErrInvalidLicense = errors.New("invalid license")

ErrInvalidLicense is returned when a license string cannot be normalized or validated.

Functions

func ExtractLicenses

func ExtractLicenses(expression string) ([]string, error)

ExtractLicenses extracts all unique license identifiers from an SPDX expression. Returns a slice of license identifiers or an error if parsing fails.

Example:

ExtractLicenses("MIT OR Apache-2.0")
// returns ["MIT", "Apache-2.0"], nil

ExtractLicenses("(MIT AND GPL-2.0) OR Apache-2.0")
// returns ["Apache-2.0", "GPL-2.0", "MIT"], nil

func HasCopyleft

func HasCopyleft(expression string) bool

HasCopyleft returns true if any license in the expression has copyleft requirements. This includes both full Copyleft and Copyleft Limited (weak copyleft like LGPL, MPL).

Example:

HasCopyleft("MIT OR Apache-2.0")       // false
HasCopyleft("MIT OR GPL-3.0-only")     // true
HasCopyleft("MIT AND LGPL-2.1-only")   // true

func IsCommercial

func IsCommercial(license string) bool

IsCommercial returns true if the license is commercial/proprietary.

func IsCopyleft

func IsCopyleft(license string) bool

IsCopyleft returns true if the license has copyleft requirements. This includes both full Copyleft and Copyleft Limited (weak copyleft).

func IsFullyPermissive

func IsFullyPermissive(expression string) bool

IsFullyPermissive returns true if all licenses in the expression are permissive. This includes Permissive and Public Domain categories.

Example:

IsFullyPermissive("MIT OR Apache-2.0")     // true
IsFullyPermissive("MIT AND BSD-3-Clause") // true
IsFullyPermissive("MIT OR GPL-3.0-only")  // false
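
The two expression-level predicates can be pictured as an "any" check and an "all" check over per-license categories. The following is a minimal stdlib-only sketch, not the package's implementation: the `categories` table, the lowercase names, and the choice to take an already-extracted list of identifiers are all assumptions made for illustration. Unknown identifiers are treated as not permissive here, which is also an assumption.

```go
package main

import "fmt"

// category mirrors a few of the documented Category values.
type category string

const (
	permissive      category = "Permissive"
	copyleft        category = "Copyleft"
	copyleftLimited category = "Copyleft Limited"
	publicDomain    category = "Public Domain"
)

// categories is a tiny hypothetical table; the real data comes from
// scancode-licensedb and is far larger.
var categories = map[string]category{
	"MIT":           permissive,
	"Apache-2.0":    permissive,
	"BSD-3-Clause":  permissive,
	"GPL-3.0-only":  copyleft,
	"LGPL-2.1-only": copyleftLimited,
	"Unlicense":     publicDomain,
}

// hasCopyleft reports whether any identifier is Copyleft or Copyleft Limited.
func hasCopyleft(ids []string) bool {
	for _, id := range ids {
		if c := categories[id]; c == copyleft || c == copyleftLimited {
			return true
		}
	}
	return false
}

// isFullyPermissive reports whether every identifier is Permissive or
// Public Domain. Identifiers missing from the table count as not permissive.
func isFullyPermissive(ids []string) bool {
	for _, id := range ids {
		if c := categories[id]; c != permissive && c != publicDomain {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(hasCopyleft([]string{"MIT", "LGPL-2.1-only"}))      // weak copyleft counts
	fmt.Println(isFullyPermissive([]string{"MIT", "BSD-3-Clause"})) // all permissive
	fmt.Println(isFullyPermissive([]string{"MIT", "GPL-3.0-only"})) // one copyleft license
}
```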

func IsPermissive

func IsPermissive(license string) bool

IsPermissive returns true if the license is in a permissive category. This includes Permissive, Public Domain, and similar open categories.

func Normalize

func Normalize(license string) (string, error)

Normalize converts an informal license string to a valid SPDX identifier. It handles common variations like "Apache 2", "MIT License", "GPL v3", etc. Returns the normalized SPDX identifier or an error if normalization fails.

Example:

Normalize("Apache 2")           // returns "Apache-2.0", nil
Normalize("MIT License")        // returns "MIT", nil
Normalize("GPL v3")             // returns "GPL-3.0-or-later", nil
Normalize("UNKNOWN-LICENSE")    // returns "", ErrInvalidLicense
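
The source does not describe how Normalize maps informal strings to identifiers. One simple way to picture the behaviour is a cleaned-up key looked up in an alias table, as in the stdlib-only sketch below. The `aliases` table, the `normalize` function, and the cleaning rule (lowercase, collapse whitespace) are hypothetical; the entries themselves are copied from the README's normalization examples.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errInvalidLicense stands in for the package's ErrInvalidLicense.
var errInvalidLicense = errors.New("invalid license")

// aliases is a hypothetical lookup table keyed by cleaned-up input.
var aliases = map[string]string{
	"apache 2":           "Apache-2.0",
	"apache license 2.0": "Apache-2.0",
	"mit license":        "MIT",
	"m.i.t.":             "MIT",
	"gpl v3":             "GPL-3.0-or-later",
	"bsd 3-clause":       "BSD-3-Clause",
	"simplified bsd":     "BSD-2-Clause",
}

// normalize lowercases the input, collapses runs of whitespace, and looks
// the result up in the alias table.
func normalize(s string) (string, error) {
	key := strings.ToLower(strings.Join(strings.Fields(s), " "))
	if id, ok := aliases[key]; ok {
		return id, nil
	}
	return "", fmt.Errorf("%q: %w", s, errInvalidLicense)
}

func main() {
	for _, in := range []string{"Apache 2", "  MIT   License ", "UNKNOWN-LICENSE"} {
		id, err := normalize(in)
		fmt.Printf("%q -> %q, err=%v\n", in, id, err)
	}
}
```

A table like this only covers strings it has seen; a real normalizer presumably also needs pattern-based rules for version numbers and word order.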

func NormalizeExpression

func NormalizeExpression(expression string) (string, error)

NormalizeExpression normalizes an SPDX expression, converting each license identifier to its canonical form and ensuring proper operator precedence. This only handles case normalization of already-valid SPDX identifiers. For informal license names like "Apache 2", use NormalizeExpressionLax.

Example:

NormalizeExpression("mit OR apache-2.0")
// returns "MIT OR Apache-2.0", nil

NormalizeExpression("mit OR gpl-2.0 AND apache-2.0")
// returns "MIT OR (GPL-2.0 AND Apache-2.0)", nil
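
The precedence rule behind the second example (AND binds tighter than OR) is commonly implemented with a two-level recursive-descent parser. The sketch below is an illustrative stand-in, not the package's parser: `node`, `parser`, and `parse` are hypothetical names, operators are matched case-sensitively, and WITH, `+`, and identifier validation are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a leaf (op == "") or a binary AND/OR node.
type node struct {
	op, id      string
	left, right *node
}

// String parenthesizes a child only when its operator differs from the parent's.
func (n *node) String() string {
	if n.op == "" {
		return n.id
	}
	return wrap(n.left, n.op) + " " + n.op + " " + wrap(n.right, n.op)
}

func wrap(child *node, parentOp string) string {
	if child.op != "" && child.op != parentOp {
		return "(" + child.String() + ")"
	}
	return child.String()
}

type parser struct {
	toks []string
	pos  int
}

func (p *parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

// parseBinary parses operands with next and joins them with op (left-associative).
func (p *parser) parseBinary(op string, next func() (*node, error)) (*node, error) {
	left, err := next()
	if err != nil {
		return nil, err
	}
	for p.peek() == op {
		p.pos++
		right, err := next()
		if err != nil {
			return nil, err
		}
		left = &node{op: op, left: left, right: right}
	}
	return left, nil
}

// OR is the loosest level; its operands are AND chains.
func (p *parser) parseOr() (*node, error)  { return p.parseBinary("OR", p.parseAnd) }
func (p *parser) parseAnd() (*node, error) { return p.parseBinary("AND", p.parseAtom) }

func (p *parser) parseAtom() (*node, error) {
	tok := p.peek()
	switch tok {
	case "", "AND", "OR", ")":
		return nil, fmt.Errorf("missing operand at token %d", p.pos)
	case "(":
		p.pos++
		n, err := p.parseOr()
		if err != nil {
			return nil, err
		}
		if p.peek() != ")" {
			return nil, fmt.Errorf("unbalanced parentheses")
		}
		p.pos++
		return n, nil
	}
	p.pos++
	return &node{id: tok}, nil
}

func parse(s string) (*node, error) {
	s = strings.NewReplacer("(", " ( ", ")", " ) ").Replace(s)
	p := &parser{toks: strings.Fields(s)}
	n, err := p.parseOr()
	if err != nil {
		return nil, err
	}
	if p.pos != len(p.toks) {
		return nil, fmt.Errorf("unexpected token %q", p.peek())
	}
	return n, nil
}

func main() {
	n, err := parse("MIT OR GPL-2.0 AND Apache-2.0")
	fmt.Println(n, err)
}
```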

func NormalizeExpressionLax

func NormalizeExpressionLax(expression string) (string, error)

NormalizeExpressionLax normalizes an SPDX expression with lax handling of informal license names. It converts informal names like "Apache 2" or "MIT License" to their canonical SPDX forms within expressions.

Example:

NormalizeExpressionLax("Apache 2 OR MIT License")
// returns "Apache-2.0 OR MIT", nil

NormalizeExpressionLax("GPL v3 AND BSD 3-Clause")
// returns "GPL-3.0-or-later AND BSD-3-Clause", nil

func Satisfies

func Satisfies(expression string, allowed []string) (bool, error)

Satisfies checks if the allowed licenses satisfy the given SPDX expression. This is a convenience wrapper around github.com/github/go-spdx/v2/spdxexp.Satisfies.
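
The README examples for Satisfies follow ordinary boolean semantics: an OR is satisfied when either side is allowed, and an AND only when both sides are. A minimal sketch of that evaluation over a hand-built tree follows. The `expr`, `license`, `and`, and `or` types are hypothetical stand-ins, and the underlying go-spdx implementation may apply further rules (for example around license versions) that are not modelled here.

```go
package main

import "fmt"

// expr is a hypothetical stand-in for a parsed expression.
type expr interface {
	satisfiedBy(allowed map[string]bool) bool
}

type license string

type and struct{ left, right expr }

type or struct{ left, right expr }

func (l license) satisfiedBy(allowed map[string]bool) bool { return allowed[string(l)] }

// AND needs both operands to be allowed.
func (e and) satisfiedBy(allowed map[string]bool) bool {
	return e.left.satisfiedBy(allowed) && e.right.satisfiedBy(allowed)
}

// OR needs at least one allowed operand.
func (e or) satisfiedBy(allowed map[string]bool) bool {
	return e.left.satisfiedBy(allowed) || e.right.satisfiedBy(allowed)
}

// satisfies turns the allow list into a set and evaluates the tree.
func satisfies(e expr, allowed []string) bool {
	set := make(map[string]bool, len(allowed))
	for _, id := range allowed {
		set[id] = true
	}
	return e.satisfiedBy(set)
}

func main() {
	allowed := []string{"MIT"}
	fmt.Println(satisfies(or{license("MIT"), license("Apache-2.0")}, allowed))
	fmt.Println(satisfies(and{license("MIT"), license("Apache-2.0")}, allowed))
}
```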

func Valid

func Valid(expression string) bool

Valid checks if the given string is a valid SPDX expression. This performs strict validation - informal license names like "Apache 2" are not valid. Returns true if valid, false otherwise.

func ValidLicense

func ValidLicense(license string) bool

ValidLicense checks if the given string is a valid SPDX license identifier. Returns true if valid, false otherwise.

func ValidateLicenses

func ValidateLicenses(licenses []string) (bool, []string)

ValidateLicenses checks if all given license identifiers are valid SPDX identifiers. Returns true and nil if all are valid, or false and the list of invalid licenses.
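
The return contract (true and nil when everything is valid, otherwise false plus the offending identifiers) can be sketched as below. This is an illustration only: `validateLicenses` and the `known` set are hypothetical, and the real function checks against the full SPDX license list.

```go
package main

import "fmt"

// known is a hypothetical stand-in for the SPDX license list.
var known = map[string]bool{"MIT": true, "Apache-2.0": true, "BSD-3-Clause": true}

// validateLicenses returns (true, nil) when every identifier is known,
// otherwise false and the unknown identifiers in input order.
func validateLicenses(licenses []string) (bool, []string) {
	var invalid []string
	for _, id := range licenses {
		if !known[id] {
			invalid = append(invalid, id)
		}
	}
	return len(invalid) == 0, invalid
}

func main() {
	ok, invalid := validateLicenses([]string{"MIT", "Apache-2.0", "FAKE"})
	fmt.Println(ok, invalid)
}
```

Because the slice is nil on success, callers can branch on the boolean alone and only inspect the slice when reporting errors.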

Types

type AndExpression

type AndExpression struct {
	Left  Expression
	Right Expression
}

AndExpression represents an AND combination of expressions.

func (*AndExpression) Licenses

func (e *AndExpression) Licenses() []string

func (*AndExpression) String

func (e *AndExpression) String() string

type Category

type Category string

Category represents a license category from scancode-licensedb.

const (
	CategoryPermissive      Category = "Permissive"
	CategoryCopyleft        Category = "Copyleft"
	CategoryCopyleftLimited Category = "Copyleft Limited"
	CategoryCommercial      Category = "Commercial"
	CategoryProprietaryFree Category = "Proprietary Free"
	CategoryPublicDomain    Category = "Public Domain"
	CategoryPatentLicense   Category = "Patent License"
	CategorySourceAvailable Category = "Source-available"
	CategoryFreeRestricted  Category = "Free Restricted"
	CategoryCLA             Category = "CLA"
	CategoryUnstated        Category = "Unstated License"
	CategoryUnknown         Category = "Unknown"
)

func ExpressionCategories

func ExpressionCategories(expression string) ([]Category, error)

ExpressionCategories returns all unique categories for licenses in an expression. It parses the expression and returns the category for each license found.

Example:

ExpressionCategories("MIT OR Apache-2.0")
// []Category{CategoryPermissive}  (both are Permissive)

ExpressionCategories("MIT OR GPL-3.0-only")
// []Category{CategoryPermissive, CategoryCopyleft}
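
Collecting unique categories amounts to deduplicating a per-license lookup. The sketch below keeps first-seen order, which matches the two examples above but is otherwise an assumption about the package's ordering. The `table` map and `uniqueCategories` function are hypothetical, and the function takes identifiers that have already been extracted from an expression.

```go
package main

import "fmt"

type Category string

const (
	CategoryPermissive Category = "Permissive"
	CategoryCopyleft   Category = "Copyleft"
	CategoryUnknown    Category = "Unknown"
)

// table is a small hypothetical category lookup.
var table = map[string]Category{
	"MIT":          CategoryPermissive,
	"Apache-2.0":   CategoryPermissive,
	"GPL-3.0-only": CategoryCopyleft,
}

// uniqueCategories returns each category once, in first-seen order.
// Identifiers missing from the table map to CategoryUnknown.
func uniqueCategories(ids []string) []Category {
	seen := make(map[Category]bool)
	var out []Category
	for _, id := range ids {
		c, ok := table[id]
		if !ok {
			c = CategoryUnknown
		}
		if !seen[c] {
			seen[c] = true
			out = append(out, c)
		}
	}
	return out
}

func main() {
	fmt.Println(uniqueCategories([]string{"MIT", "Apache-2.0"}))
	fmt.Println(uniqueCategories([]string{"MIT", "GPL-3.0-only"}))
}
```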

func LicenseCategory

func LicenseCategory(license string) Category

LicenseCategory returns the category for a given license identifier. It accepts SPDX identifiers (like "MIT", "Apache-2.0") or scancode keys. Returns CategoryUnknown if the license is not found.

Example:

LicenseCategory("MIT")           // CategoryPermissive
LicenseCategory("GPL-3.0-only")  // CategoryCopyleft
LicenseCategory("MPL-2.0")       // CategoryCopyleftLimited

type Expression

type Expression interface {
	// String returns the normalized string representation.
	String() string
	// Licenses returns all license identifiers in the expression.
	Licenses() []string
	// contains filtered or unexported methods
}

Expression represents a parsed SPDX expression.

func Parse

func Parse(expression string) (Expression, error)

Parse parses an SPDX expression string into an Expression tree. It handles both strict SPDX identifiers and informal license names (like "Apache 2" or "MIT License") by normalizing them automatically.

Example:

Parse("MIT")                     // *License{ID: "MIT"}
Parse("MIT OR Apache-2.0")       // *OrExpression{...}
Parse("mit OR apache 2")         // normalizes to "MIT OR Apache-2.0"
Parse("GPL v3 AND BSD")          // normalizes to "GPL-3.0-or-later AND BSD-2-Clause"

For strict SPDX-only parsing (no fuzzy normalization), use ParseStrict.

func ParseLax deprecated

func ParseLax(expression string) (Expression, error)

ParseLax parses an SPDX expression with lax handling of informal license names. It normalizes informal license strings like "Apache 2", "MIT License", "GPL v3".

Deprecated: Use Parse instead, which now handles informal license names automatically. ParseLax is kept for backwards compatibility.

Example:

ParseLax("Apache 2 OR MIT License")  // "Apache-2.0 OR MIT"
ParseLax("GPL v3 AND BSD 3-Clause")  // "GPL-3.0-or-later AND BSD-3-Clause"

func ParseStrict

func ParseStrict(expression string) (Expression, error)

ParseStrict parses an SPDX expression requiring strict SPDX identifiers. Unlike Parse, it does not normalize informal license names. Use this when you need to validate that an expression uses only exact SPDX license identifiers.

Example:

ParseStrict("MIT OR Apache-2.0")  // succeeds
ParseStrict("mit OR apache 2")    // fails - "apache 2" is not a valid SPDX ID

type License

type License struct {
	ID        string // The canonical license ID
	Plus      bool   // True if followed by +
	Exception string // Exception ID if using WITH
}

License represents a single SPDX license identifier.

func (*License) Licenses

func (l *License) Licenses() []string

func (*License) String

func (l *License) String() string
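
The Plus and Exception fields correspond to the `+` suffix and the WITH operator of SPDX expression syntax. The sketch below shows how such a value could be rendered in that syntax. It re-declares the struct locally, and `render` is a hypothetical helper; the exact output of the package's own String method is not documented here and may differ.

```go
package main

import "fmt"

// License is a local copy of the documented struct fields.
type License struct {
	ID        string // The canonical license ID
	Plus      bool   // True if followed by +
	Exception string // Exception ID if using WITH
}

// render writes the license in SPDX expression syntax:
// the ID, an optional "+", then an optional " WITH <exception>".
func render(l License) string {
	s := l.ID
	if l.Plus {
		s += "+"
	}
	if l.Exception != "" {
		s += " WITH " + l.Exception
	}
	return s
}

func main() {
	fmt.Println(render(License{ID: "MIT"}))
	fmt.Println(render(License{ID: "Apache-2.0", Plus: true}))
	fmt.Println(render(License{ID: "GPL-2.0-only", Exception: "Classpath-exception-2.0"}))
}
```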

type LicenseError

type LicenseError struct {
	License string
	Err     error
}

LicenseError wraps an error with the license that caused it.

func (*LicenseError) Error

func (e *LicenseError) Error() string

func (*LicenseError) Unwrap

func (e *LicenseError) Unwrap() error
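
Because LicenseError exposes Unwrap, callers can combine errors.Is (to test for a sentinel such as ErrInvalidLicense) with errors.As (to recover the offending license string). The sketch below re-declares the documented types locally so that it runs without the package. The `check` function, the two helpers, and the Error message format are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented declarations.
var ErrInvalidLicense = errors.New("invalid license")

type LicenseError struct {
	License string
	Err     error
}

// Error uses an assumed message format.
func (e *LicenseError) Error() string { return e.License + ": " + e.Err.Error() }
func (e *LicenseError) Unwrap() error { return e.Err }

// check is a hypothetical validator that wraps the sentinel error.
func check(license string) error {
	if license != "MIT" {
		return &LicenseError{License: license, Err: ErrInvalidLicense}
	}
	return nil
}

// isInvalid reports whether err wraps ErrInvalidLicense.
func isInvalid(err error) bool { return errors.Is(err, ErrInvalidLicense) }

// offending returns the license carried by a LicenseError, or "".
func offending(err error) string {
	var le *LicenseError
	if errors.As(err, &le) {
		return le.License
	}
	return ""
}

func main() {
	err := check("FAKE")
	fmt.Println(isInvalid(err), offending(err))
}
```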

type LicenseInfo

type LicenseInfo struct {
	Key          string   // scancode license key
	SPDXKey      string   // primary SPDX identifier
	Category     Category // license category
	IsException  bool     // true if this is a license exception
	IsDeprecated bool     // true if deprecated
}

LicenseInfo contains detailed information about a license.

func GetLicenseInfo

func GetLicenseInfo(license string) *LicenseInfo

GetLicenseInfo returns detailed information about a license. Returns nil if the license is not found.
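
Since the result is a pointer that is nil for unknown licenses, callers should check it before reading any field. A minimal sketch of that guard, using a local copy of the struct and a hypothetical `lookup` function in place of GetLicenseInfo (the table contents are illustrative):

```go
package main

import "fmt"

type Category string

// LicenseInfo is a local copy of the documented struct.
type LicenseInfo struct {
	Key          string
	SPDXKey      string
	Category     Category
	IsException  bool
	IsDeprecated bool
}

// infos is a hypothetical one-entry table.
var infos = map[string]*LicenseInfo{
	"MIT": {Key: "mit", SPDXKey: "MIT", Category: "Permissive"},
}

// lookup stands in for GetLicenseInfo: it returns nil when nothing is found.
func lookup(license string) *LicenseInfo { return infos[license] }

// describe guards against the nil result before touching any field.
func describe(license string) string {
	info := lookup(license)
	if info == nil {
		return license + ": not found"
	}
	return fmt.Sprintf("%s (%s)", info.SPDXKey, info.Category)
}

func main() {
	fmt.Println(describe("MIT"))
	fmt.Println(describe("FAKE"))
}
```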

type LicenseRef

type LicenseRef struct {
	DocumentRef string // Optional document reference
	LicenseRef  string // The license reference ID
}

LicenseRef represents a custom license reference.

func (*LicenseRef) Licenses

func (l *LicenseRef) Licenses() []string

func (*LicenseRef) String

func (l *LicenseRef) String() string

type OrExpression

type OrExpression struct {
	Left  Expression
	Right Expression
}

OrExpression represents an OR combination of expressions.

func (*OrExpression) Licenses

func (e *OrExpression) Licenses() []string

func (*OrExpression) String

func (e *OrExpression) String() string

type SpecialValue

type SpecialValue struct {
	Value string
}

SpecialValue represents NONE or NOASSERTION.

func (*SpecialValue) Licenses

func (s *SpecialValue) Licenses() []string

func (*SpecialValue) String

func (s *SpecialValue) String() string
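
The ErrInvalidSpecialValue message ("NONE and NOASSERTION must be standalone") implies that these two values cannot be combined with other operands. A rough sketch of such a check follows. `checkSpecial` is a hypothetical helper that works on whitespace-separated tokens and ignores parentheses; how the package's parser actually enforces the rule is not shown in the source.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errInvalidSpecialValue mirrors the documented error message.
var errInvalidSpecialValue = errors.New("NONE and NOASSERTION must be standalone")

// checkSpecial rejects NONE or NOASSERTION when any other token is present.
func checkSpecial(expression string) error {
	cleaned := strings.NewReplacer("(", " ", ")", " ").Replace(expression)
	toks := strings.Fields(cleaned)
	for _, t := range toks {
		if (t == "NONE" || t == "NOASSERTION") && len(toks) > 1 {
			return errInvalidSpecialValue
		}
	}
	return nil
}

func main() {
	fmt.Println(checkSpecial("NOASSERTION"))
	fmt.Println(checkSpecial("MIT OR NONE"))
}
```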
