package pruner

Version: v0.8.0 (latest)
Published: Mar 29, 2026
License: MIT
Imports: 6
Imported by: 0

Repository

github.com/SynapsesOS/synapses

Documentation

Overview

Package pruner strips boilerplate from web content using the Tier 0 (0.8B) model.

Web pages contain 30-50% non-technical noise: navigation menus, cookie banners, footers, sidebars, and ads. Sending this noise to the distillation pipeline wastes LLM compute and dilutes the resulting summary. The Pruner extracts only the core technical paragraphs before handing content to the Ingestor.

This is a Tier 0 (Reflex) task: simple extraction, no reasoning, no JSON output. The 0.8B model is fast enough (<3s on CPU) and accurate enough for this job.

Index

type Pruner
- func New(client llm.LLMClient, timeout time.Duration) *Pruner
- func (p *Pruner) Prune(ctx context.Context, content string) (string, error)

Types

type Pruner

type Pruner struct {
	// contains filtered or unexported fields
}

Pruner strips boilerplate from web page text using a small LLM.

func New

func New(client llm.LLMClient, timeout time.Duration) *Pruner

New creates a Pruner backed by the given LLM client. timeout is the per-request deadline; if timeout <= 0, it defaults to 10s.

func (*Pruner) Prune

func (p *Pruner) Prune(ctx context.Context, content string) (string, error)

Prune extracts core technical content from raw web page text. It returns the pruned content, or the original content if the LLM call fails. The returned string is always non-empty when the input is non-empty.

Source Files

pruner.go