epub

package module
v1.0.0 Latest
Published: Apr 6, 2026 License: MIT Imports: 7 Imported by: 0

README

epub

A pure Go library with zero dependencies for reading, writing, and validating EPUB files.

go get github.com/billdaws/epub

Features

  • Read EPUB 2 and EPUB 3 archives
  • Parse OPF package metadata (title, authors, language, identifier, publication date)
  • Parse table of contents — prefers EPUB 3 nav documents, falls back to EPUB 2 NCX
  • Read manifest content items, safe for concurrent use
  • Write valid EPUB 3 archives
  • Validate parsed packages against structural rules

Usage

Reading metadata

pkg, err := epub.OpenPackage("the-republic.epub")
if err != nil {
    log.Fatal(err)
}
fmt.Println(pkg.Metadata.Title)
fmt.Println(pkg.Metadata.Authors)

Reading content items

Open returns a Reader that keeps the archive open between reads. The Reader is safe for concurrent use.

r, err := epub.Open("the-republic.epub")
if err != nil {
    log.Fatal(err)
}
defer r.Close()

for _, item := range r.Package.Manifest {
    if item.MediaType != "application/xhtml+xml" {
        continue
    }
    data, err := r.ReadItem(item)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s: %d bytes\n", item.Href, len(data))
}

Reading the table of contents

toc, err := epub.OpenTOC("the-republic.epub")
if err != nil {
    log.Fatal(err)
}

var printTOC func(points []epub.NavPoint, depth int)
printTOC = func(points []epub.NavPoint, depth int) {
    for _, p := range points {
        fmt.Printf("%s%s\n", strings.Repeat("  ", depth), p.Title)
        printTOC(p.Children, depth+1)
    }
}
printTOC(toc, 0)

Writing an EPUB

nav := `<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<body>
  <nav epub:type="toc"><ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol></nav>
</body>
</html>`

chapter := `<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<body><h1>Chapter 1</h1><p>It was a dark and stormy night.</p></body>
</html>`

book := epub.Book{
    Metadata: epub.Metadata{
        Title:      "My Book",
        Language:   "en",
        Identifier: "urn:uuid:1234-5678",
        Authors:    []string{"Jane Doe"},
    },
    Items: []epub.ContentItem{
        {ID: "nav", Href: "nav.xhtml", MediaType: "application/xhtml+xml", Properties: "nav", Content: []byte(nav)},
        {ID: "ch1", Href: "chapter1.xhtml", MediaType: "application/xhtml+xml", Content: []byte(chapter)},
    },
    Spine: []string{"ch1"},
}

f, err := os.Create("output.epub")
if err != nil {
    log.Fatal(err)
}
defer f.Close()

if err := epub.Write(f, book); err != nil {
    log.Fatal(err)
}

Validating a package

pkg, err := epub.OpenPackage("suspect.epub")
if err != nil {
    log.Fatal(err)
}

violations := epub.Validate(pkg)
for _, v := range violations {
    fmt.Printf("[%s] %s\n", v.Code, v.Message)
}

API overview

Function / Type              Description
Open(path)                   Open an EPUB for repeated content reads (thread-safe)
OpenPackage(path)            Parse OPF metadata in one call
OpenTOC(path)                Parse the table of contents
OpenContainer(path)          Parse only the container (rootfile path)
DecodePackageV2(r, opfPath)  Decode an EPUB 2 OPF from an io.Reader
DecodePackageV3(r, opfPath)  Decode an EPUB 3 OPF from an io.Reader
Write(dst, book)             Encode a Book as a valid EPUB 3 archive
Validate(pkg)                Check a Package against structural rules

License

See LICENSE.

Contributing

See CONTRIBUTING.

Documentation

Overview

Package epub provides support for reading, writing, and validating EPUB files.

Reading

Use Open for repeated access to content items within a single archive; the file is kept open between calls. Use OpenPackage for one-shot access to the OPF metadata, OpenTOC for the table of contents, or OpenContainer when you only need the rootfile path.

Writing

Use Write to encode a Book value as a valid EPUB 3 archive.

Validation

Use Validate to check a parsed Package against the structural rules for its EPUB version.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Write

func Write(dst io.Writer, book Book) error

Write encodes book as a valid EPUB 3 file and writes it to dst. It returns a MissingMetadataError if any required metadata field (Title, Language, Identifier) is empty.
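For orientation, the EPUB container format requires the ZIP archive to begin with an uncompressed entry named mimetype whose content is application/epub+zip. The following is a minimal standard-library sketch of that layout rule only. It is not this package's Write implementation, and buildSkeleton and firstEntry are hypothetical helper names introduced for illustration.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// containerDoc points readers at the OPF document; the path is an example value.
const containerDoc = `<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>`

// buildSkeleton (hypothetical) writes only the first two entries of an EPUB
// archive: the stored mimetype entry, then META-INF/container.xml.
func buildSkeleton() ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)

	// The mimetype entry comes first and uses zip.Store (no compression).
	mt, err := zw.CreateHeader(&zip.FileHeader{Name: "mimetype", Method: zip.Store})
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(mt, "application/epub+zip"); err != nil {
		return nil, err
	}

	// Later entries may use the default Deflate compression.
	c, err := zw.Create("META-INF/container.xml")
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(c, containerDoc); err != nil {
		return nil, err
	}

	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// firstEntry (hypothetical) reports the name of the first ZIP entry and
// whether it is stored without compression.
func firstEntry(data []byte) (string, bool, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", false, err
	}
	if len(zr.File) == 0 {
		return "", false, fmt.Errorf("archive is empty")
	}
	f := zr.File[0]
	return f.Name, f.Method == zip.Store, nil
}

func main() {
	data, err := buildSkeleton()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	name, stored, err := firstEntry(data)
	fmt.Println(name, stored, err)
}
```

The sketch covers entry order and compression method only; a complete archive also needs the OPF document and the content files.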

Types

type Book

type Book struct {
	Metadata Metadata      `json:"metadata"` // bibliographic metadata written to the OPF <metadata> element
	Items    []ContentItem `json:"items"`    // content files added to the manifest and stored in the ZIP
	Spine    []string      `json:"spine"`    // IDs of Items in reading order
}

Book is the input to Write: bibliographic metadata, content items, and the spine reading order.

type Container

type Container struct {
	// RootfilePath is the path within the ZIP to the root OPF document,
	// e.g. "OEBPS/content.opf".
	RootfilePath string `json:"rootfile_path"`
}

Container represents a parsed EPUB container (META-INF/container.xml).

func OpenContainer

func OpenContainer(path string) (*Container, error)

OpenContainer opens the .epub file at path and parses its container, returning the location of the root OPF document.
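To make the container step concrete, here is a minimal sketch of extracting the rootfile path from the text of META-INF/container.xml using only encoding/xml. It assumes the standard container layout and is not this package's implementation; containerXML, rootfilePath, and sampleContainer are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"strings"
)

// containerXML (hypothetical) mirrors the parts of container.xml needed here.
type containerXML struct {
	Rootfiles []struct {
		FullPath  string `xml:"full-path,attr"`
		MediaType string `xml:"media-type,attr"`
	} `xml:"rootfiles>rootfile"`
}

// rootfilePath returns the full-path of the first rootfile entry that has one.
func rootfilePath(doc string) (string, error) {
	var c containerXML
	if err := xml.NewDecoder(strings.NewReader(doc)).Decode(&c); err != nil {
		return "", err
	}
	for _, rf := range c.Rootfiles {
		if rf.FullPath != "" {
			return rf.FullPath, nil
		}
	}
	return "", errors.New("container.xml has no usable rootfile entry")
}

// sampleContainer is example input, not taken from a real book.
const sampleContainer = `<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>`

func main() {
	path, err := rootfilePath(sampleContainer)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(path)
}
```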

type ContentItem

type ContentItem struct {
	ID         string `json:"id"`                   // manifest item id attribute; must be unique within the Book
	Href       string `json:"href"`                 // path relative to the OPF document, e.g. "chapter1.xhtml"
	MediaType  string `json:"media_type"`           // MIME type, e.g. "application/xhtml+xml"
	Properties string `json:"properties,omitempty"` // space-separated EPUB 3 properties, e.g. "nav"
	Content    []byte `json:"content"`              // raw file bytes written into the ZIP
}

ContentItem is one content file to be written into the EPUB manifest and ZIP. Href is relative to the OPF document (e.g. "chapter1.xhtml"); Write stores it under OEBPS/ in the ZIP.

type FileNotFoundError

type FileNotFoundError struct {
	Path string
}

FileNotFoundError is returned when a required file is absent from the EPUB ZIP archive. Path is the ZIP-relative path that was expected.

func (*FileNotFoundError) Error

func (e *FileNotFoundError) Error() string

type Item

type Item struct {
	ID        string `json:"id"`         // manifest item id attribute
	Href      string `json:"href"`       // relative to the OPF document's directory
	MediaType string `json:"media_type"` // e.g. "application/xhtml+xml", "image/jpeg"
	// Properties contains space-separated property values (EPUB 3 only, e.g. "nav", "cover-image").
	// Empty for EPUB 2 items.
	Properties string `json:"properties,omitempty"`
}

Item is an entry in the OPF manifest.

type ItemNotFoundError

type ItemNotFoundError struct {
	ID   string // manifest item id attribute
	Href string // ZIP-relative path that was expected
}

ItemNotFoundError is returned by Reader.ReadItem when the manifest item's content file is absent from the ZIP archive.
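Because the error types in this package implement Error on a pointer receiver, callers would typically match them with errors.As and a pointer target. The sketch below shows that pattern with a local stand-in type so that it runs without the package. itemNotFoundError, readItem, and missingHref are hypothetical, and whether the library wraps its errors is an assumption the pattern does not depend on.

```go
package main

import (
	"errors"
	"fmt"
)

// itemNotFoundError is a local stand-in with the same fields as the
// documented ItemNotFoundError.
type itemNotFoundError struct {
	ID   string
	Href string
}

func (e *itemNotFoundError) Error() string {
	return fmt.Sprintf("item %q not found at %s", e.ID, e.Href)
}

// readItem (hypothetical) always fails, wrapping the typed error the way an
// intermediate layer might.
func readItem(id string) ([]byte, error) {
	return nil, fmt.Errorf("read item: %w", &itemNotFoundError{ID: id, Href: "OEBPS/" + id + ".xhtml"})
}

// missingHref reports the expected path when err is, or wraps, an
// *itemNotFoundError.
func missingHref(err error) (string, bool) {
	var nf *itemNotFoundError
	if errors.As(err, &nf) {
		return nf.Href, true
	}
	return "", false
}

func main() {
	_, err := readItem("ch9")
	if href, ok := missingHref(err); ok {
		fmt.Println("missing content file:", href)
	}
}
```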

func (*ItemNotFoundError) Error

func (e *ItemNotFoundError) Error() string

type MalformedContainerError

type MalformedContainerError struct{}

MalformedContainerError is returned when META-INF/container.xml contains no usable rootfile entry.

func (*MalformedContainerError) Error

func (e *MalformedContainerError) Error() string

type Metadata

type Metadata struct {
	Title           string   `json:"title"`                      // dc:title
	Authors         []string `json:"authors,omitempty"`          // dc:creator values, in document order
	Language        string   `json:"language"`                   // dc:language (BCP 47 tag, e.g. "en")
	Identifier      string   `json:"identifier"`                 // dc:identifier matching unique-identifier, or first dc:identifier
	PublicationDate string   `json:"publication_date,omitempty"` // "YYYY-MM-DD" or "YYYY" as written in the file; empty if absent
}

Metadata holds Dublin Core bibliographic information from the OPF.
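The struct tags above determine the JSON shape: fields tagged omitempty disappear when empty, while title, language, and identifier are always emitted. A minimal sketch, using a local copy of the struct so that it runs without the package (metadata and encode are hypothetical names):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metadata is a local copy of the documented Metadata fields and tags.
type metadata struct {
	Title           string   `json:"title"`
	Authors         []string `json:"authors,omitempty"`
	Language        string   `json:"language"`
	Identifier      string   `json:"identifier"`
	PublicationDate string   `json:"publication_date,omitempty"`
}

// encode marshals m to a JSON string.
func encode(m metadata) (string, error) {
	b, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Authors and PublicationDate are left empty, so they are omitted.
	s, err := encode(metadata{Title: "My Book", Language: "en", Identifier: "urn:uuid:1234-5678"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```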

type MissingMetadataError

type MissingMetadataError struct {
	Field string
}

MissingMetadataError is returned by Write when a required metadata field is empty. Field is the name of the missing field (e.g. "title").

func (*MissingMetadataError) Error

func (e *MissingMetadataError) Error() string

type MissingNavElementError

type MissingNavElementError struct{}

MissingNavElementError is returned when the EPUB 3 navigation document contains no <nav epub:type="toc"> element.

func (*MissingNavElementError) Error

func (e *MissingNavElementError) Error() string

type MissingTOCError

type MissingTOCError struct{}

MissingTOCError is returned when neither an EPUB 3 navigation document nor an EPUB 2 NCX item is present in the OPF manifest.

func (*MissingTOCError) Error

func (e *MissingTOCError) Error() string

type NavPoint

type NavPoint struct {
	Title    string     `json:"title"`              // display text for the TOC entry
	Src      string     `json:"src"`                // ZIP-relative path, may include a fragment (e.g. "OEBPS/ch1.xhtml#sec1")
	Children []NavPoint `json:"children,omitempty"` // nested entries, nil if there are none
}

NavPoint is one entry in the table of contents. Children represent nested sections.
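Since Src may carry a fragment, a caller that wants the underlying files has to strip it. The following sketch splits Src values and collects each distinct file in first-reference order. It uses a local stand-in type so that it runs on its own; navPoint, splitSrc, and files are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// navPoint is a local stand-in with the same fields as the documented NavPoint.
type navPoint struct {
	Title    string
	Src      string
	Children []navPoint
}

// splitSrc separates a Src value into its file path and optional fragment.
func splitSrc(src string) (path, fragment string) {
	path, fragment, _ = strings.Cut(src, "#")
	return path, fragment
}

// files walks the tree depth-first and returns each distinct file path once,
// in the order it is first referenced.
func files(points []navPoint) []string {
	seen := make(map[string]bool)
	var out []string
	var walk func([]navPoint)
	walk = func(ps []navPoint) {
		for _, p := range ps {
			path, _ := splitSrc(p.Src)
			if path != "" && !seen[path] {
				seen[path] = true
				out = append(out, path)
			}
			walk(p.Children)
		}
	}
	walk(points)
	return out
}

func main() {
	toc := []navPoint{
		{Title: "Chapter 1", Src: "OEBPS/ch1.xhtml", Children: []navPoint{
			{Title: "Section 1.1", Src: "OEBPS/ch1.xhtml#sec1"},
		}},
		{Title: "Chapter 2", Src: "OEBPS/ch2.xhtml"},
	}
	fmt.Println(files(toc))
}
```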

func OpenTOC

func OpenTOC(epubPath string) ([]NavPoint, error)

OpenTOC opens the .epub file at epubPath and returns the table of contents. It prefers the EPUB 3 navigation document when present and falls back to the EPUB 2 NCX.

type Package

type Package struct {
	Version  string      `json:"version"`  // e.g. "2.0" or "3.0", as written in the OPF
	Metadata Metadata    `json:"metadata"` // bibliographic metadata from the OPF <metadata> element
	Manifest []Item      `json:"manifest"` // all items declared in the OPF <manifest>
	Spine    []SpineItem `json:"spine"`    // reading order declared in the OPF <spine>
}

Package is the parsed contents of an OPF package document.

func DecodePackageV2

func DecodePackageV2(r io.Reader, opfPath string) (*Package, error)

DecodePackageV2 parses r as an EPUB 2 OPF document. It ignores the version attribute; use this when you already know the content is EPUB 2.

func DecodePackageV3

func DecodePackageV3(r io.Reader, opfPath string) (*Package, error)

DecodePackageV3 parses r as an EPUB 3 OPF document. It ignores the version attribute; use this when you already know the content is EPUB 3.

func OpenPackage

func OpenPackage(path string) (*Package, error)

OpenPackage opens the .epub file at path, locates the OPF document via the container, and returns the parsed Package.

type Reader

type Reader struct {
	// Package is the parsed OPF document for the open EPUB.
	Package *Package
	// contains filtered or unexported fields
}

Reader holds an open EPUB file and its parsed Package, allowing content items to be read without reopening the archive on each call. The caller must call Close when done. Reader is safe for concurrent use.

func Open

func Open(path string) (*Reader, error)

Open opens the .epub file at path, parses its package document, and returns a Reader. The caller must call Close when done to release the file handle.

func (*Reader) Close

func (r *Reader) Close() error

Close closes the underlying EPUB file. It waits for any in-progress ReadItem calls to complete before closing.
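One common way to obtain this kind of guarantee is a sync.RWMutex: reads hold the read lock, and closing takes the write lock, which cannot be acquired until in-progress reads finish. The sketch below illustrates that general pattern. It is an assumption about how such behaviour can be built, not a description of this package's internals, and guardedReader and errClosed are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errClosed = errors.New("reader is closed")

// guardedReader (hypothetical) stands in for an open archive.
type guardedReader struct {
	mu     sync.RWMutex
	closed bool
	data   map[string][]byte
}

// read may be called from many goroutines at once; each holds a read lock.
func (g *guardedReader) read(href string) ([]byte, error) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	if g.closed {
		return nil, errClosed
	}
	b, ok := g.data[href]
	if !ok {
		return nil, fmt.Errorf("no entry %q", href)
	}
	return b, nil
}

// close takes the write lock, so it blocks until every in-progress read has
// released its read lock.
func (g *guardedReader) close() error {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.closed = true
	return nil
}

func main() {
	g := &guardedReader{data: map[string][]byte{
		"ch1.xhtml": []byte("<p>one</p>"),
		"ch2.xhtml": []byte("<p>two</p>"),
	}}

	var wg sync.WaitGroup
	for _, href := range []string{"ch1.xhtml", "ch2.xhtml"} {
		wg.Add(1)
		go func(h string) {
			defer wg.Done()
			b, err := g.read(h)
			fmt.Println(h, len(b), err)
		}(href)
	}
	wg.Wait()

	g.close()
	_, err := g.read("ch1.xhtml")
	fmt.Println(err)
}
```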

func (*Reader) ReadItem

func (r *Reader) ReadItem(item Item) ([]byte, error)

ReadItem returns the raw bytes of item's content file within the EPUB. item must come from r.Package.Manifest. Multiple goroutines may call ReadItem concurrently.

type SpineItem

type SpineItem struct {
	IDRef  string `json:"id_ref"` // references a manifest item by its ID
	Linear bool   `json:"linear"` // false only when the OPF explicitly sets linear="no"
}

SpineItem is one entry in the OPF spine, identifying a manifest item by ID.
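The Linear rule above (false only for an explicit linear="no") can be sketched with encoding/xml as follows. This is an illustration of the rule under the standard OPF spine syntax, not the package's decoder; spineItem, spineXML, and parseSpine are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// spineItem is a local stand-in with the same fields as the documented SpineItem.
type spineItem struct {
	IDRef  string
	Linear bool
}

// spineXML (hypothetical) captures the itemref attributes of a <spine> element.
type spineXML struct {
	ItemRefs []struct {
		IDRef  string `xml:"idref,attr"`
		Linear string `xml:"linear,attr"`
	} `xml:"itemref"`
}

// parseSpine decodes a <spine> fragment. An absent linear attribute, or any
// value other than "no", yields Linear == true.
func parseSpine(doc string) ([]spineItem, error) {
	var s spineXML
	if err := xml.Unmarshal([]byte(doc), &s); err != nil {
		return nil, err
	}
	items := make([]spineItem, 0, len(s.ItemRefs))
	for _, ref := range s.ItemRefs {
		items = append(items, spineItem{IDRef: ref.IDRef, Linear: ref.Linear != "no"})
	}
	return items, nil
}

func main() {
	items, err := parseSpine(`<spine><itemref idref="cover" linear="no"/><itemref idref="ch1"/></spine>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, it := range items {
		fmt.Println(it.IDRef, it.Linear)
	}
}
```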

type UnsupportedVersionError

type UnsupportedVersionError struct {
	Version string // the version string from the OPF version attribute
	Path    string // path to the OPF document within the EPUB archive
}

UnsupportedVersionError is returned when an OPF document declares a version that this package does not recognise.

func (*UnsupportedVersionError) Error

func (e *UnsupportedVersionError) Error() string

type Violation

type Violation struct {
	Code    ViolationCode `json:"code"`    // machine-readable rule identifier
	Message string        `json:"message"` // human-readable description of the violation
}

Violation describes a single structural rule violation in a Package.

func Validate

func Validate(pkg *Package) []Violation

Validate checks that pkg conforms to the structural rules for its EPUB version and returns all violations found. An empty slice means the package is valid. Validate does not open any files; it inspects only the parsed Package value.
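To illustrate what purely structural checks of this kind look like, here is a sketch of two of them, duplicate manifest IDs and spine references to unknown IDs, written against local stand-in types. The code strings match the documented ViolationCode values; the check function, the stand-in types, and the message wording are hypothetical, and the package's actual rules may differ in detail.

```go
package main

import "fmt"

// item, spineItem, and violation are local stand-ins for the documented types.
type item struct{ ID, Href, MediaType string }
type spineItem struct{ IDRef string }
type violation struct{ Code, Message string }

// check (hypothetical) reports duplicate manifest IDs and spine entries that
// reference an ID absent from the manifest. It inspects values only and opens
// no files.
func check(manifest []item, spine []spineItem) []violation {
	var out []violation
	seen := make(map[string]bool)
	for _, it := range manifest {
		if it.ID == "" {
			continue // a missing ID would be a separate rule, not covered here
		}
		if seen[it.ID] {
			out = append(out, violation{
				Code:    "duplicate-item-id",
				Message: fmt.Sprintf("manifest id %q is declared more than once", it.ID),
			})
		}
		seen[it.ID] = true
	}
	for _, s := range spine {
		if !seen[s.IDRef] {
			out = append(out, violation{
				Code:    "broken-spine-idref",
				Message: fmt.Sprintf("spine references unknown manifest id %q", s.IDRef),
			})
		}
	}
	return out
}

func main() {
	manifest := []item{
		{ID: "ch1", Href: "chapter1.xhtml", MediaType: "application/xhtml+xml"},
		{ID: "ch1", Href: "chapter1-copy.xhtml", MediaType: "application/xhtml+xml"},
	}
	spine := []spineItem{{IDRef: "ch1"}, {IDRef: "ch2"}}
	for _, v := range check(manifest, spine) {
		fmt.Printf("[%s] %s\n", v.Code, v.Message)
	}
}
```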

type ViolationCode

type ViolationCode string

ViolationCode identifies the structural rule that was violated.

const (
	// Metadata violations.
	ViolationMissingTitle      ViolationCode = "missing-title"
	ViolationMissingLanguage   ViolationCode = "missing-language"
	ViolationMissingIdentifier ViolationCode = "missing-identifier"

	// Manifest violations.
	ViolationEmptyManifest        ViolationCode = "empty-manifest"
	ViolationMissingItemID        ViolationCode = "missing-item-id"
	ViolationMissingItemHref      ViolationCode = "missing-item-href"
	ViolationMissingItemMediaType ViolationCode = "missing-item-media-type"
	ViolationDuplicateItemID      ViolationCode = "duplicate-item-id"
	ViolationDuplicateItemHref    ViolationCode = "duplicate-item-href"

	// Spine violations.
	ViolationEmptySpine       ViolationCode = "empty-spine"
	ViolationBrokenSpineIDRef ViolationCode = "broken-spine-idref"

	// Version-specific structural violations.
	ViolationMissingNav ViolationCode = "missing-nav" // EPUB 3: nav document required
	ViolationMissingNCX ViolationCode = "missing-ncx" // EPUB 2: NCX document required
)

Directories

Path Synopsis
tools
commit-msg command
Validates a commit message against the Conventional Commits specification using the rules from @commitlint/config-conventional.
