buildcache

package
v0.6.0
Warning

This package is not in the latest version of its module.

Published: Jan 31, 2026 | License: MIT | Imports: 10 | Imported by: 0

Documentation

Overview

Package buildcache provides incremental build caching for markata-go.

The cache tracks an input hash (content + frontmatter + template) for each post so that a rebuild can be skipped when the inputs have not changed. It also tracks global hashes (templates, config) so that the entire cache can be invalidated when they change.

Cache File Structure

The cache is stored in .markata/build-cache.json with the structure:

{
  "version": 1,
  "config_hash": "abc123",
  "templates_hash": "def456",
  "posts": {
    "path/to/post.md": {
      "input_hash": "xyz789",
      "output_path": "output/post/index.html",
      "template": "post.html"
    }
  },
  "graph": {
    "dependencies": {"path/to/post.md": ["linked-slug"]},
    "path_to_slug": {"path/to/post.md": "post-slug"}
  }
}
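The structure above can be read with encoding/json. The following is a minimal sketch, not the package's own loader: the types cacheFile, postEntry and graphEntry and the function parseCacheFile are hypothetical names that mirror only the fields shown in the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cacheFile mirrors only the fields shown in the JSON example.
// These local types are illustrative stand-ins, not the package's types.
type cacheFile struct {
	Version       int                  `json:"version"`
	ConfigHash    string               `json:"config_hash"`
	TemplatesHash string               `json:"templates_hash"`
	Posts         map[string]postEntry `json:"posts"`
	Graph         graphEntry           `json:"graph"`
}

type postEntry struct {
	InputHash  string `json:"input_hash"`
	OutputPath string `json:"output_path"`
	Template   string `json:"template"`
}

type graphEntry struct {
	Dependencies map[string][]string `json:"dependencies"`
	PathToSlug   map[string]string   `json:"path_to_slug"`
}

// sample repeats the documented example verbatim.
const sample = `{
  "version": 1,
  "config_hash": "abc123",
  "templates_hash": "def456",
  "posts": {
    "path/to/post.md": {
      "input_hash": "xyz789",
      "output_path": "output/post/index.html",
      "template": "post.html"
    }
  },
  "graph": {
    "dependencies": {"path/to/post.md": ["linked-slug"]},
    "path_to_slug": {"path/to/post.md": "post-slug"}
  }
}`

// parseCacheFile decodes raw JSON bytes into a cacheFile.
func parseCacheFile(data []byte) (cacheFile, error) {
	var c cacheFile
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	c, err := parseCacheFile([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	post := c.Posts["path/to/post.md"]
	fmt.Println(c.Version, post.Template, c.Graph.PathToSlug["path/to/post.md"])
}
```

Posts and the graph are both keyed by source path, so a single path lookup is enough to find a post's hash, output location and outgoing links.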

Constants

const CacheFileName = "build-cache.json"

CacheFileName is the name of the build cache file.

const CacheVersion = 1

CacheVersion is incremented when the cache format changes.

const DefaultCacheDir = ".markata"

DefaultCacheDir is the directory for cache files.

const FullHTMLCacheDir = "fullhtml-cache"

FullHTMLCacheDir is the subdirectory for cached full page HTML files.

const HTMLCacheDir = "html-cache"

HTMLCacheDir is the subdirectory for cached rendered HTML files.

const PostCacheDir = "post-cache"

PostCacheDir is the subdirectory for cached parsed post JSON files.

Functions

func ComputePostInputHash

func ComputePostInputHash(content, frontmatter, template string) string

ComputePostInputHash computes the input hash for a post. This combines: content + frontmatter fields that affect output + template name.
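One way such a combined hash could be built is sketched below. The function inputHash is a hypothetical stand-in: the choice of SHA-256, hex encoding, and a NUL separator between the parts are assumptions, and the real function may select frontmatter fields or join the parts differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// inputHash is a hypothetical stand-in for ComputePostInputHash.
// Assumptions: SHA-256, hex output, and a NUL byte after each part so that
// ("ab", "c", "") and ("a", "bc", "") do not produce the same digest.
func inputHash(content, frontmatter, template string) string {
	h := sha256.New()
	for _, part := range []string{content, frontmatter, template} {
		h.Write([]byte(part))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := inputHash("# Hello", "title: Hello", "post.html")
	b := inputHash("# Hello", "title: Hello", "page.html")
	// A template change alone is enough to produce a different hash.
	fmt.Println(a != b)
}
```

Separating the parts matters: plain concatenation would let text move between content and frontmatter without changing the hash.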

func ContentHash

func ContentHash(content string) string

ContentHash computes a hash of just the markdown content.

func HashContent

func HashContent(content string) string

HashContent computes a SHA256 hash of the given content.

func HashDirectory

func HashDirectory(dir string, extensions []string) (string, error)

HashDirectory computes a combined hash of all files in a directory. Files are processed in sorted order for deterministic output.

func HashFile

func HashFile(path string) (string, error)

HashFile computes a SHA256 hash of a file's contents.

Types

type Cache

type Cache struct {

	// Version of the cache format
	Version int `json:"version"`

	// ConfigHash is the hash of the config file contents
	ConfigHash string `json:"config_hash"`

	// TemplatesHash is the combined hash of all template files
	TemplatesHash string `json:"templates_hash"`

	// Posts maps source path to cached post metadata
	Posts map[string]*PostCache `json:"posts"`

	// Feeds maps feed slug to cached feed metadata
	Feeds map[string]*FeedCache `json:"feeds,omitempty"`

	// Graph tracks dependencies between posts for transitive invalidation
	Graph *DependencyGraph `json:"graph,omitempty"`
	// contains filtered or unexported fields
}

Cache manages incremental build state.

func Load

func Load(cacheDir string) (*Cache, error)

Load reads the cache from disk. It returns a new empty cache if the file does not exist.

func New

func New(cacheDir string) *Cache

New creates a new empty cache.

func (*Cache) CacheArticleHTML

func (c *Cache) CacheArticleHTML(sourcePath, contentHash, articleHTML string) error

CacheArticleHTML stores rendered HTML for a post.

func (*Cache) CacheFullHTML

func (c *Cache) CacheFullHTML(sourcePath, fullHTML string) error

CacheFullHTML stores the full page HTML for a post.

func (*Cache) CacheLinkHrefs

func (c *Cache) CacheLinkHrefs(sourcePath, articleHash string, hrefs []string)

CacheLinkHrefs stores extracted hrefs for a post keyed by article hash.

func (*Cache) CachePostData

func (c *Cache) CachePostData(sourcePath string, modTime int64, postData *CachedPostData) error

CachePostData stores parsed post data to disk.

func (*Cache) GetAffectedPosts

func (c *Cache) GetAffectedPosts(changedSlugs []string) []string

GetAffectedPosts returns all posts that need rebuilding when the given slugs change. This performs transitive closure via the dependency graph.

func (*Cache) GetCachedArticleHTML

func (c *Cache) GetCachedArticleHTML(sourcePath, contentHash string) string

GetCachedArticleHTML returns the cached rendered HTML for a post if available. Returns empty string if not cached or cache is stale.

func (*Cache) GetCachedFullHTML

func (c *Cache) GetCachedFullHTML(sourcePath string) string

GetCachedFullHTML returns the cached full page HTML for a post if available.

func (*Cache) GetCachedLinkHrefs

func (c *Cache) GetCachedLinkHrefs(sourcePath, articleHash string) []string

GetCachedLinkHrefs returns cached hrefs for a post if the article hash matches. Returns nil if no matching cache exists.

func (*Cache) GetCachedPost

func (c *Cache) GetCachedPost(sourcePath string) *PostCache

GetCachedPost returns the cached post metadata if the file hasn't changed. Returns nil if file is not in cache or has changed.

func (*Cache) GetCachedPostData

func (c *Cache) GetCachedPostData(sourcePath string, modTime int64) *CachedPostData

GetCachedPostData returns cached post data if ModTime matches. Returns nil if post is not cached or file has changed.

func (*Cache) GetChangedSlugs

func (c *Cache) GetChangedSlugs() []string

GetChangedSlugs returns the slugs that changed during this build.

func (*Cache) GetFeedHash

func (c *Cache) GetFeedHash(slug string) string

GetFeedHash returns the cached hash for a feed, or empty string if not cached.

func (*Cache) GetStats

func (c *Cache) GetStats() CacheStats

GetStats returns build statistics as a struct.

func (*Cache) GraphSize

func (c *Cache) GraphSize() int

GraphSize returns the number of posts with dependencies tracked.

func (*Cache) IsFileUnchanged

func (c *Cache) IsFileUnchanged(sourcePath string, modTime int64) bool

IsFileUnchanged checks if a file's ModTime matches the cached value. Returns true if the file has not changed since last build. Returns false if file is not in cache or ModTime differs.
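The check amounts to an exact comparison of timestamps. Below is a small sketch with a hypothetical modTimes map in place of the real cache; the timestamp is taken in Unix nanoseconds, the unit documented for PostCache.ModTime.

```go
package main

import (
	"fmt"
	"os"
)

// modTimes maps a source path to the modification time recorded at the last
// build, in Unix nanoseconds. It is an illustrative stand-in for the cache.
type modTimes map[string]int64

// unchanged reports whether path is known and its recorded time equals modTime.
func (m modTimes) unchanged(path string, modTime int64) bool {
	cached, ok := m[path]
	return ok && cached == modTime
}

func main() {
	f, err := os.CreateTemp("", "post-*.md")
	if err != nil {
		fmt.Println("temp file error:", err)
		return
	}
	defer os.Remove(f.Name())
	f.Close()

	info, err := os.Stat(f.Name())
	if err != nil {
		fmt.Println("stat error:", err)
		return
	}
	now := info.ModTime().UnixNano()
	seen := modTimes{f.Name(): now}

	fmt.Println(seen.unchanged(f.Name(), now))   // same timestamp
	fmt.Println(seen.unchanged(f.Name(), now+1)) // timestamp differs
	fmt.Println(seen.unchanged("other.md", now)) // never recorded
}
```

A modification-time check is cheaper than hashing because it needs only a stat call, at the cost of treating a touched but otherwise identical file as changed.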

func (*Cache) MarkRebuilt

func (c *Cache) MarkRebuilt(sourcePath, inputHash, outputPath, template string)

MarkRebuilt records that a post was rebuilt with the given hash. It takes no slug argument; use MarkRebuiltWithSlug to also record the slug for dependency invalidation.

func (*Cache) MarkRebuiltWithSlug

func (c *Cache) MarkRebuiltWithSlug(sourcePath, slug, inputHash, outputPath, template string)

MarkRebuiltWithSlug records that a post was rebuilt with the given hash. Also records that this slug changed, for dependency invalidation.

func (*Cache) MarkSkipped

func (c *Cache) MarkSkipped()

MarkSkipped records that a post was skipped (already up to date).

func (*Cache) MarkSlugChanged

func (c *Cache) MarkSlugChanged(slug string)

MarkSlugChanged records that a slug changed this build. Used for dependency invalidation.

func (*Cache) RemoveStale

func (c *Cache) RemoveStale(currentPaths map[string]bool) int

RemoveStale removes cache entries for posts that no longer exist. Returns the number of entries removed.
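The pruning step can be sketched as a single pass over the post map. The function removeStale below is hypothetical and works on a plain map; whether the real method also cleans up graph entries or cached HTML files is not stated here.

```go
package main

import "fmt"

// removeStale deletes every entry of posts whose key is absent from
// currentPaths and returns how many entries were deleted.
// Deleting from a map while ranging over it is safe in Go.
func removeStale(posts map[string]string, currentPaths map[string]bool) int {
	removed := 0
	for path := range posts {
		if !currentPaths[path] {
			delete(posts, path)
			removed++
		}
	}
	return removed
}

func main() {
	posts := map[string]string{
		"posts/kept.md":    "hash-1",
		"posts/deleted.md": "hash-2",
	}
	current := map[string]bool{"posts/kept.md": true}
	fmt.Println(removeStale(posts, current), len(posts))
}
```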

func (*Cache) ResetStats

func (c *Cache) ResetStats()

ResetStats resets the build statistics for a new build.

func (*Cache) Save

func (c *Cache) Save() error

Save writes the cache to disk.

func (*Cache) SetConfigHash

func (c *Cache) SetConfigHash(hash string) bool

SetConfigHash updates the config hash and invalidates if changed.
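A sketch of the update-and-invalidate pattern is shown below. The type state and its method are hypothetical, and two points are assumptions: that the returned bool reports whether the hash changed, and that invalidation means dropping every post entry.

```go
package main

import "fmt"

// state is an illustrative stand-in for the cache's global hash and posts.
type state struct {
	configHash string
	posts      map[string]string // source path -> input hash
}

// setConfigHash stores hash and reports whether it differed from the previous
// value. Assumption: a change drops every post entry so that all posts rebuild.
func (s *state) setConfigHash(hash string) bool {
	if s.configHash == hash {
		return false
	}
	s.configHash = hash
	s.posts = map[string]string{}
	return true
}

func main() {
	s := &state{configHash: "abc123", posts: map[string]string{"a.md": "h1"}}
	fmt.Println(s.setConfigHash("abc123"), len(s.posts)) // same hash, entries kept
	fmt.Println(s.setConfigHash("zzz999"), len(s.posts)) // new hash, entries dropped
}
```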

func (*Cache) SetDependencies

func (c *Cache) SetDependencies(sourcePath, sourceSlug string, targets []string)

SetDependencies records what targets a source post links to. This delegates to the underlying DependencyGraph.

func (*Cache) SetFeedHash

func (c *Cache) SetFeedHash(slug, hash string)

SetFeedHash stores the hash for a feed in the cache.

func (*Cache) SetTemplatesHash

func (c *Cache) SetTemplatesHash(hash string) bool

SetTemplatesHash updates the templates hash and invalidates if changed.

func (*Cache) ShouldRebuild

func (c *Cache) ShouldRebuild(sourcePath, inputHash, template string) bool

ShouldRebuild checks if a post needs rebuilding based on input hash. Returns true if the post should be rebuilt (hash mismatch or not in cache). Also returns true if any post this one depends on has changed.

func (*Cache) ShouldRebuildWithSlug

func (c *Cache) ShouldRebuildWithSlug(sourcePath, _, inputHash, template string) bool

ShouldRebuildWithSlug checks if a post needs rebuilding based on input hash and also checks if any of the posts it depends on have changed this build. The second parameter is the slug; it is declared as _ in the signature above, so the function accepts it without reading it.

func (*Cache) Stats

func (c *Cache) Stats() (skipped, rebuilt int)

Stats returns build statistics.

func (*Cache) UpdateModTime

func (c *Cache) UpdateModTime(sourcePath string, modTime int64, slug string)

UpdateModTime updates the ModTime for a post.

type CacheStats

type CacheStats struct {
	Skipped int
	Rebuilt int
}

CacheStats holds build statistics.

type CachedPostData

type CachedPostData struct {
	Path           string            `json:"path"`
	Content        string            `json:"content"`
	Slug           string            `json:"slug"`
	Href           string            `json:"href"`
	Title          *string           `json:"title,omitempty"`
	Date           *time.Time        `json:"date,omitempty"`
	Published      bool              `json:"published"`
	Draft          bool              `json:"draft"`
	Private        bool              `json:"private"`
	Skip           bool              `json:"skip"`
	Tags           []string          `json:"tags,omitempty"`
	Description    *string           `json:"description,omitempty"`
	Template       string            `json:"template"`
	Templates      map[string]string `json:"templates,omitempty"`
	RawFrontmatter string            `json:"raw_frontmatter"`
	InputHash      string            `json:"input_hash"`
	Extra          map[string]any    `json:"extra,omitempty"`
}

CachedPostData holds the serializable parts of a Post for caching. This excludes rendered HTML which is cached separately.

type DependencyGraph

type DependencyGraph struct {

	// Dependencies maps source path -> target slugs (what this post links TO)
	// Key is the source file path, values are slugs of linked posts
	Dependencies map[string][]string `json:"dependencies,omitempty"`

	// PathToSlug maps source path -> its slug (for reverse lookups during traversal)
	PathToSlug map[string]string `json:"path_to_slug,omitempty"`

	// Dependents maps target slug -> source paths (who links to this post)
	// This is computed from Dependencies and not persisted
	Dependents map[string][]string `json:"-"`
	// contains filtered or unexported fields
}

DependencyGraph tracks relationships between posts for incremental builds. It maintains both forward dependencies (what a post links to) and reverse dependencies (what posts link to a given post) for efficient invalidation.

Example:

post-a.md contains [[post-b]] and [[post-c]]
post-b.md contains [[post-c]]

Dependencies (forward):
  "post-a" -> ["post-b", "post-c"]
  "post-b" -> ["post-c"]

Dependents (reverse, computed):
  "post-b" -> ["post-a"]
  "post-c" -> ["post-a", "post-b"]

When post-c changes, GetAffectedPosts returns ["post-a", "post-b"] (transitive).

func NewDependencyGraph

func NewDependencyGraph() *DependencyGraph

NewDependencyGraph creates a new empty dependency graph.

func (*DependencyGraph) Clear

func (g *DependencyGraph) Clear()

Clear removes all dependencies from the graph.

func (*DependencyGraph) GetAffectedPosts

func (g *DependencyGraph) GetAffectedPosts(changed []string) []string

GetAffectedPosts returns all posts that need to be rebuilt when the given posts change. This performs a transitive closure using BFS to find all posts that directly or indirectly depend on the changed posts.

The input is a list of changed post slugs. The output is a list of source paths that need rebuilding. The changed posts themselves are NOT included in the result.

Example: If A->B->C and C changes, returns [A, B] (both depend on C).
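The traversal described above can be sketched as a breadth-first search over the reverse map. The function affected is hypothetical; it takes the Dependents and PathToSlug maps as plain arguments and sorts its result, which is an assumption made here so the output is stable.

```go
package main

import (
	"fmt"
	"sort"
)

// affected walks the reverse dependency map breadth-first, starting from the
// changed slugs, and returns the source paths that depend on them directly or
// indirectly. Paths belonging to the changed slugs themselves are excluded.
func affected(dependents map[string][]string, pathToSlug map[string]string, changed []string) []string {
	seenSlug := make(map[string]bool, len(changed))
	queue := make([]string, 0, len(changed))
	for _, s := range changed {
		if !seenSlug[s] {
			seenSlug[s] = true
			queue = append(queue, s)
		}
	}
	seenPath := map[string]bool{}
	var result []string
	for len(queue) > 0 {
		slug := queue[0]
		queue = queue[1:]
		for _, path := range dependents[slug] {
			if seenPath[path] {
				continue
			}
			seenPath[path] = true
			next, ok := pathToSlug[path]
			if ok && seenSlug[next] {
				continue // this path's slug is a changed post or was already visited
			}
			result = append(result, path)
			if ok {
				seenSlug[next] = true
				queue = append(queue, next) // follow links into this post too
			}
		}
	}
	sort.Strings(result)
	return result
}

func main() {
	// A -> B -> C: a.md links to B, b.md links to C.
	dependents := map[string][]string{"C": {"b.md"}, "B": {"a.md"}}
	pathToSlug := map[string]string{"a.md": "A", "b.md": "B"}
	fmt.Println(affected(dependents, pathToSlug, []string{"C"}))
}
```

The path-to-slug lookup is what makes the closure transitive: each affected path is converted back to its slug so that posts linking to it are visited in turn.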

func (*DependencyGraph) GetDependencies

func (g *DependencyGraph) GetDependencies(sourcePath string) []string

GetDependencies returns the targets that a source post links to.

func (*DependencyGraph) GetDirectDependents

func (g *DependencyGraph) GetDirectDependents(target string) []string

GetDirectDependents returns posts that directly link to the given target. The target can be a slug or path.

func (*DependencyGraph) HasDependencies

func (g *DependencyGraph) HasDependencies(sourcePath string) bool

HasDependencies returns true if the source has any dependencies.

func (*DependencyGraph) HasDependents

func (g *DependencyGraph) HasDependents(target string) bool

HasDependents returns true if any posts depend on the target.

func (*DependencyGraph) RebuildReverse

func (g *DependencyGraph) RebuildReverse()

RebuildReverse reconstructs the Dependents map from Dependencies. This should be called after loading the graph from disk.
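Because only the forward map is persisted, the reverse map has to be derived after loading. A minimal sketch of that inversion follows; the function reverse is hypothetical, and sorting each list is an assumption added to make the output independent of map iteration order.

```go
package main

import (
	"fmt"
	"sort"
)

// reverse inverts a forward map (source path -> target slugs) into a reverse
// map (target slug -> source paths that link to it).
func reverse(dependencies map[string][]string) map[string][]string {
	dependents := make(map[string][]string)
	for path, targets := range dependencies {
		for _, target := range targets {
			dependents[target] = append(dependents[target], path)
		}
	}
	// Map iteration order is random, so sort each list for stable results.
	for target := range dependents {
		sort.Strings(dependents[target])
	}
	return dependents
}

func main() {
	forward := map[string][]string{
		"post-a.md": {"post-b", "post-c"},
		"post-b.md": {"post-c"},
	}
	rev := reverse(forward)
	fmt.Println(rev["post-b"], rev["post-c"])
}
```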

func (*DependencyGraph) RemoveSource

func (g *DependencyGraph) RemoveSource(sourcePath string)

RemoveSource removes all dependencies for a source post. Use this when a post is deleted.

func (*DependencyGraph) SetDependencies

func (g *DependencyGraph) SetDependencies(sourcePath, sourceSlug string, targets []string)

SetDependencies records what targets a source post links to. This replaces any existing dependencies for the source. The sourceSlug is the slug of the source post (for reverse lookups). The targets are slugs of linked posts.

func (*DependencyGraph) Size

func (g *DependencyGraph) Size() int

Size returns the number of source posts with dependencies.

type FeedCache

type FeedCache struct {
	// Hash is the hash of the feed's content (post slugs, config)
	Hash string `json:"hash"`
}

FeedCache stores cached metadata for a single feed.

type PostCache

type PostCache struct {
	// InputHash is the hash of content + frontmatter + template
	InputHash string `json:"input_hash"`

	// OutputPath is the primary output file path
	OutputPath string `json:"output_path"`

	// Template is the template name used for rendering
	Template string `json:"template"`

	// OutputHash is the hash of the rendered output (optional, for verification)
	OutputHash string `json:"output_hash,omitempty"`

	// ContentHash is the hash of just the markdown content (for render caching)
	ContentHash string `json:"content_hash,omitempty"`

	// ArticleHTMLPath is the path to the cached rendered HTML file
	ArticleHTMLPath string `json:"article_html_path,omitempty"`

	// FullHTMLPath is the path to the cached full page HTML file
	FullHTMLPath string `json:"full_html_path,omitempty"`

	// ModTime is the file modification time (Unix nanoseconds)
	ModTime int64 `json:"mod_time,omitempty"`

	// Slug is the post's slug for dependency tracking
	Slug string `json:"slug,omitempty"`

	// LinkHrefsHash is a hash of the post's article HTML used for link extraction caching
	LinkHrefsHash string `json:"link_hrefs_hash,omitempty"`

	// LinkHrefs caches extracted href values for the post
	LinkHrefs []string `json:"link_hrefs,omitempty"`
}

PostCache stores cached metadata for a single post.
