ferret

package module
v2.0.0-alpha.2
Published: Mar 27, 2026 License: Apache-2.0 Imports: 20 Imported by: 10

README

Ferret

📢 Notice: This branch contains the upcoming Ferret v2. For the stable v1 release, please visit Ferret v1.


What is it?

Ferret is a declarative system for working with web data: extracting it, querying it, and turning it into structured results for testing, analytics, machine learning, and other workflows. It lets users focus on the data they need while abstracting away the complexity of browser automation, page interaction, and underlying execution details.

Features
  • Declarative query language
  • Works with static and dynamic web pages
  • Embeddable in Go applications
  • Extensible runtime and function system
  • Portable and fast

Getting started

go get github.com/MontFerret/ferret/v2@latest

There are currently two ways to start with Ferret v2:

  • Native v2 API - recommended for new projects
  • compat module - recommended as a first migration step for existing v1 integrations

New projects

Use the native v2 API built around the following flow:

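Engine -> compile query -> create session -> run

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/MontFerret/ferret/v2/pkg/engine"
	// Assumed import path for the file package that provides *file.Source.
	"github.com/MontFerret/ferret/v2/pkg/file"
)

func main() {
	ctx := context.Background()

	// Create the engine once and close it on shutdown.
	eng, err := engine.New()
	if err != nil {
		log.Fatal(err)
	}
	defer eng.Close()

	// Compile takes a context and a *file.Source. file.NewSource is an
	// assumption: this page does not document how a Source is constructed.
	plan, err := eng.Compile(ctx, file.NewSource("example.fql", `RETURN 1 + 1`))
	if err != nil {
		log.Fatal(err)
	}
	defer plan.Close()

	// NewSession takes a context and optional session options.
	session, err := plan.NewSession(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	result, err := session.Run(ctx)
	if err != nil {
		log.Fatal(err)
	}

	// Output.Content is a []byte, so convert it before printing.
	fmt.Println(string(result.Content))
}

Migration from v1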

Ferret v2 introduces a new architecture and public API, so embedding it directly is different from v1.

To make migration easier, v2 includes a compat module that provides a v1-style API. Its goal is to make upgrades incremental instead of forcing a full rewrite up front.

For many projects, the easiest migration path will be:

  • switch imports from v1 to the compat package
  • get the project compiling again
  • migrate incrementally to the native v2 API over time

A small helper script for rewriting import paths is planned to simplify this process further.

The compatibility layer is intended as a migration aid, not the long-term preferred API. New projects should use the native v2 packages directly.

Alpha status

Ferret v2 is currently in active development.

Alpha releases are intended for early adopters, experimentation, and feedback. Some APIs and language features may still change before the stable v2 release.

Documentation

Constants

This section is empty.

Variables

var (
	ParseLogLevel     = runtime.ParseLogLevel
	MustParseLogLevel = runtime.MustParseLogLevel
	FormatError       = diagnostics.Format
)

Functions

This section is empty.

Types

type AfterCompileHook

type AfterCompileHook func(ctx context.Context, err error) error

AfterCompileHook runs after each compilation attempt. Hooks run in LIFO order and receive the compilation error (if any).

type AfterRunHook

type AfterRunHook func(ctx context.Context, err error) error

AfterRunHook runs after each session run attempt. Hooks run in LIFO order, receive the run error (if any), and aggregate hook errors.

type BeforeCompileHook

type BeforeCompileHook func(ctx context.Context) error

BeforeCompileHook runs before compilation starts. Hooks run in FIFO order and stop on the first error.

type BeforeRunHook

type BeforeRunHook func(ctx context.Context) (context.Context, error)

BeforeRunHook runs before each session run. It can return a derived context for subsequent hooks and VM execution.
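
The run-hook contract described above can be modeled without Ferret itself. The following is a minimal sketch, not the engine's implementation: the hook types mirror the documented signatures, while runWithHooks, demo, and ctxKey are hypothetical names. It assumes that a failing before-hook aborts the run and that after-hook errors are joined with the run error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// Local mirrors of the documented hook signatures.
type BeforeRunHook func(ctx context.Context) (context.Context, error)
type AfterRunHook func(ctx context.Context, err error) error

type ctxKey string

// runWithHooks is a hypothetical driver, not Ferret's implementation.
func runWithHooks(ctx context.Context, before []BeforeRunHook, after []AfterRunHook,
	run func(context.Context) error) error {
	// FIFO: each hook may replace the context seen by later hooks and by run.
	for _, h := range before {
		if h == nil {
			continue
		}
		next, err := h(ctx)
		if err != nil {
			return err // assumption: a failing before-hook aborts the run
		}
		if next != nil {
			ctx = next
		}
	}
	runErr := run(ctx)
	errs := []error{runErr}
	// LIFO: the last registered hook runs first; hook errors are aggregated.
	for i := len(after) - 1; i >= 0; i-- {
		if after[i] == nil {
			continue
		}
		if err := after[i](ctx, runErr); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...) // nil when every entry is nil
}

// demo records the order in which the hooks and the run step execute.
func demo() string {
	var steps []string
	key := ctxKey("trace")
	before := func(name string) BeforeRunHook {
		return func(ctx context.Context) (context.Context, error) {
			prev, _ := ctx.Value(key).(string)
			steps = append(steps, "before:"+name)
			return context.WithValue(ctx, key, prev+name), nil
		}
	}
	after := func(name string) AfterRunHook {
		return func(ctx context.Context, err error) error {
			steps = append(steps, "after:"+name)
			return nil
		}
	}
	run := func(ctx context.Context) error {
		v, _ := ctx.Value(key).(string)
		steps = append(steps, "run:"+v)
		return nil
	}
	_ = runWithHooks(context.Background(),
		[]BeforeRunHook{before("a"), before("b")},
		[]AfterRunHook{after("x"), after("y")}, run)
	return strings.Join(steps, " ")
}

func main() {
	fmt.Println(demo()) // before:a before:b run:ab after:y after:x
}
```

The run step sees the value "ab" because each before-hook builds on the context returned by the previous one.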

type Bootstrap

type Bootstrap interface {
	Host() HostConfigurer
	Hooks() HookRegistrar
}

Bootstrap defines an interface for configuring the host and registering lifecycle hooks with the runtime engine.

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

func New

func New(setters ...Option) (*Engine, error)

func (*Engine) Close

func (e *Engine) Close() error

func (*Engine) Compile

func (e *Engine) Compile(ctx context.Context, src *file.Source) (*Plan, error)

func (*Engine) Load

func (e *Engine) Load(data []byte) (*Plan, error)

Load decodes a serialized program artifact and wraps it in a reusable plan.

func (*Engine) Run

func (e *Engine) Run(ctx context.Context, src *file.Source, opts ...SessionOption) (*Output, error)

type EngineCloseHook

type EngineCloseHook func() error

EngineCloseHook runs during engine shutdown. Close hooks are executed in LIFO order and their errors are aggregated.

type EngineHookRegistrar

type EngineHookRegistrar interface {
	// OnInit registers a hook executed in FIFO order during engine initialization.
	// A nil hook is ignored.
	OnInit(hook EngineInitHook)
	// OnClose registers a hook executed in LIFO order when the engine is closed.
	// A nil hook is ignored.
	OnClose(hook EngineCloseHook)
}

EngineHookRegistrar registers hooks for engine initialization and shutdown.

type EngineInitHook

type EngineInitHook func() error

EngineInitHook runs during engine initialization. Returning an error stops initialization immediately.
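
The two ordering rules for engine hooks (init hooks in FIFO order, stopping at the first error; close hooks in LIFO order, with errors aggregated) can be sketched with the standard library alone. This is an illustrative model, not Ferret's code: runInit, runClose, demoInit, and demoClose are hypothetical helpers, and errors.Join stands in for whatever aggregation the engine actually uses.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Local mirrors of the documented hook signatures.
type EngineInitHook func() error
type EngineCloseHook func() error

// runInit runs hooks in registration order and stops at the first error.
func runInit(hooks []EngineInitHook) error {
	for _, h := range hooks {
		if h == nil {
			continue // nil hooks are ignored
		}
		if err := h(); err != nil {
			return err
		}
	}
	return nil
}

// runClose runs hooks in reverse registration order and joins every error.
func runClose(hooks []EngineCloseHook) error {
	var errs []error
	for i := len(hooks) - 1; i >= 0; i-- {
		if hooks[i] == nil {
			continue
		}
		if err := hooks[i](); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// demoInit reports how many init hooks ran before the failure.
func demoInit() (int, error) {
	ran := 0
	ok := func() error { ran++; return nil }
	bad := func() error { ran++; return errors.New("init failed") }
	err := runInit([]EngineInitHook{ok, bad, ok})
	return ran, err
}

// demoClose reports the order in which close hooks ran and the joined error.
func demoClose() (string, error) {
	var order []string
	hook := func(name string, fail bool) EngineCloseHook {
		return func() error {
			order = append(order, name)
			if fail {
				return errors.New(name + " failed")
			}
			return nil
		}
	}
	err := runClose([]EngineCloseHook{hook("a", false), hook("b", true), hook("c", true)})
	return strings.Join(order, " "), err
}

func main() {
	ran, initErr := demoInit()
	fmt.Println(ran, initErr)
	order, closeErr := demoClose()
	fmt.Println(order)
	fmt.Println(closeErr)
}
```

Running close hooks in reverse mirrors how deferred calls unwind in Go: resources registered last are released first.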

type HookRegistrar

type HookRegistrar interface {
	// Engine returns the registrar for engine lifecycle hooks.
	Engine() EngineHookRegistrar
	// Plan returns the registrar for plan lifecycle hooks.
	Plan() PlanHookRegistrar
	// Session returns the registrar for session lifecycle hooks.
	Session() SessionHookRegistrar
}

HookRegistrar provides access to lifecycle hook registrars for engine, plan, and session stages.

type HostConfigurer

type HostConfigurer interface {
	Params() runtime.Params
	Library() runtime.Library
	Encoding() encoding.CodecRegistrar
}

type Module

type Module interface {
	Name() string
	Register(Bootstrap) error
}

Module represents a self-contained unit of functionality that can be registered with the engine.

type Option

type Option func(env *options) error

func WithAfterCompileHook

func WithAfterCompileHook(hook AfterCompileHook) Option

WithAfterCompileHook returns an Option that registers a hook to execute after each compilation attempt. The hook receives the compilation error (if any). It returns an error if hook is nil.
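
The Option constructors on this page follow the functional-options pattern, where each option is a function that may reject its input. The sketch below shows the general shape under stated assumptions: the options struct, its field, newOptions, and noopHook are hypothetical stand-ins, since the engine's real settings type is unexported and not documented here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Local mirror of the documented hook signature.
type AfterCompileHook func(ctx context.Context, err error) error

// options is a hypothetical stand-in for the engine's unexported settings.
type options struct {
	afterCompile []AfterCompileHook
}

// Option has the same shape as the documented type: it mutates the settings
// and may return an error.
type Option func(env *options) error

// WithAfterCompileHook rejects a nil hook when the option is applied.
func WithAfterCompileHook(hook AfterCompileHook) Option {
	return func(env *options) error {
		if hook == nil {
			return errors.New("after-compile hook is nil")
		}
		env.afterCompile = append(env.afterCompile, hook)
		return nil
	}
}

// newOptions applies the setters in order and stops at the first error.
func newOptions(setters ...Option) (*options, error) {
	env := &options{}
	for _, set := range setters {
		if err := set(env); err != nil {
			return nil, err
		}
	}
	return env, nil
}

// noopHook is a placeholder hook used only for the demonstration.
var noopHook AfterCompileHook = func(context.Context, error) error { return nil }

func main() {
	_, err := newOptions(WithAfterCompileHook(nil))
	fmt.Println("nil hook:", err)

	env, err := newOptions(WithAfterCompileHook(noopHook))
	fmt.Println("valid hook:", len(env.afterCompile), err)
}
```

Returning the error from the option, rather than panicking, lets a constructor such as New report a bad configuration through its own error result.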

func WithAfterRunHook

func WithAfterRunHook(hook AfterRunHook) Option

WithAfterRunHook returns an Option that registers a hook to execute after each session run attempt. The hook receives the run error (if any). It returns an error if hook is nil.

func WithBeforeCompileHook

func WithBeforeCompileHook(hook BeforeCompileHook) Option

WithBeforeCompileHook returns an Option that registers a hook to execute before each compilation attempt. It returns an error if hook is nil.

func WithBeforeRunHook

func WithBeforeRunHook(hook BeforeRunHook) Option

WithBeforeRunHook returns an Option that registers a hook to execute before each session run. The hook can replace the context used by subsequent hooks and VM execution. It returns an error if hook is nil.

func WithCompilerOptions

func WithCompilerOptions(opts ...compiler.Option) Option

WithCompilerOptions returns an Option that appends the provided compiler options to the engine's configuration. An empty list has no effect.

func WithEncodingCodec

func WithEncodingCodec(contentType string, codec encoding.Codec) Option

WithEncodingCodec registers or overrides a codec for the given content type.

func WithEncodingRegistry

func WithEncodingRegistry(registry *encoding.Registry) Option

WithEncodingRegistry sets a custom encoding registry for query execution.

func WithEngineCloseHook

func WithEngineCloseHook(hook EngineCloseHook) Option

WithEngineCloseHook returns an Option that registers a hook to execute when the engine is closed. It returns an error if hook is nil.

func WithEngineInitHook

func WithEngineInitHook(hook EngineInitHook) Option

WithEngineInitHook returns an Option that registers a hook to execute during engine initialization. It returns an error if hook is nil.

func WithFunctions

func WithFunctions(funcs *runtime.Functions) Option

WithFunctions returns an Option that sets the provided *runtime.Functions on the engine's configuration. A nil value has no effect.

func WithFunctionsRegistrar

func WithFunctionsRegistrar(setter func(fns runtime.FunctionDefs)) Option

WithFunctionsRegistrar creates an Option that invokes the provided registrar with the engine's runtime.FunctionDefs if the registrar is not nil.

func WithLog

func WithLog(writer io.Writer) Option

WithLog sets the writer for logging output. The writer can be any io.Writer, such as os.Stdout or a file.

func WithLogFields

func WithLogFields(fields map[string]any) Option

WithLogFields sets the fields to be included in log entries. These fields can provide additional context for debugging and monitoring purposes.

func WithLogLevel

func WithLogLevel(lvl runtime.LogLevel) Option

WithLogLevel sets the logging level for the engine. The logging level determines the severity of log messages that will be recorded.

func WithMaxActiveSessions

func WithMaxActiveSessions(n int) Option

WithMaxActiveSessions sets an engine-wide limit on concurrently active sessions.

This limit applies to Session objects created from any plan compiled by the engine. When the limit is reached, Plan.NewSession blocks until another session is closed or the provided context is canceled.

Use this when you want to put a global cap on query execution concurrency and the host-side resources that come with it, such as CPU, memory, network traffic, or downstream service pressure.

This is different from WithMaxIdleVMsPerPlan and WithMaxVMsPerPlan: WithMaxActiveSessions controls how many sessions may be running or checked out at once across the engine, while the VM options control how each individual plan manages its VM pool.

A value of 0 disables the limit.
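
The blocking behavior described for WithMaxActiveSessions resembles a counting semaphore whose acquire step also honors context cancellation. The following is a sketch of that idea only, not Ferret's implementation; sessionLimiter and its methods are hypothetical, and treating negative values like 0 is an assumption.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sessionLimiter is a hypothetical model of an engine-wide session cap.
type sessionLimiter struct {
	slots chan struct{} // nil when the limit is disabled
}

// newSessionLimiter returns a limiter for n concurrent sessions.
// 0 disables the limit; negative values are treated the same (assumption).
func newSessionLimiter(n int) *sessionLimiter {
	if n <= 0 {
		return &sessionLimiter{}
	}
	return &sessionLimiter{slots: make(chan struct{}, n)}
}

// acquire blocks until a slot is free or ctx is done.
func (l *sessionLimiter) acquire(ctx context.Context) error {
	if l.slots == nil {
		return nil
	}
	select {
	case l.slots <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// release frees one slot; it is a no-op when nothing is held.
func (l *sessionLimiter) release() {
	if l.slots == nil {
		return
	}
	select {
	case <-l.slots:
	default:
	}
}

// demo fills a limiter of size 1, then shows a blocked and an unblocked acquire.
func demo() (blocked, afterRelease error) {
	lim := newSessionLimiter(1)
	_ = lim.acquire(context.Background())

	// The only slot is taken, so this attempt waits until the timeout fires.
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	blocked = lim.acquire(ctx)

	// Closing a session frees its slot, so the next attempt succeeds.
	lim.release()
	afterRelease = lim.acquire(context.Background())
	return blocked, afterRelease
}

func main() {
	blocked, afterRelease := demo()
	fmt.Println("while full:", blocked)
	fmt.Println("after release:", afterRelease)
}
```

In practice this means callers of Plan.NewSession should pass a context with a deadline when a limit is set, so a saturated engine produces a timely error instead of an indefinite wait.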

func WithMaxIdleVMsPerPlan

func WithMaxIdleVMsPerPlan(n int) Option

WithMaxIdleVMsPerPlan sets how many closed-session VMs each plan keeps warm for reuse after they become idle.

This is a retention setting, not a concurrency limit. It only controls how many unused VMs remain cached in a plan's pool after sessions are closed. When the idle cache is full, additional returned VMs are closed instead of retained.

Use this when the same compiled plan is executed repeatedly and you want to trade some steady-state memory for faster session creation by reusing already initialized VMs.

This is different from WithMaxVMsPerPlan: WithMaxIdleVMsPerPlan controls how many unused VMs stay cached, while WithMaxVMsPerPlan controls the maximum total number of VMs the plan may own at all, including both idle and currently borrowed VMs.

A value of 0 disables idle retention for the plan.

func WithMaxVMsPerPlan

func WithMaxVMsPerPlan(n int) Option

WithMaxVMsPerPlan sets a hard per-plan limit on the total number of VMs the plan's pool may own at one time.

The total includes both idle VMs kept in the pool and VMs currently borrowed by active sessions created from that plan. When the limit is reached and no idle VM is available to reuse, session creation fails with vm.ErrPoolExhausted.

Use this when you need a strict upper bound on the memory or resource footprint of a single hot plan, even if that plan is under heavy concurrent load.

This is different from WithMaxActiveSessions: WithMaxVMsPerPlan limits VM ownership for one plan, while WithMaxActiveSessions limits active session concurrency across the entire engine.

This is also different from WithMaxIdleVMsPerPlan: WithMaxVMsPerPlan is a hard cap, while WithMaxIdleVMsPerPlan only decides how many unused VMs are retained after demand drops.

A value of 0 means the plan may create as many VMs as needed, subject only to other limits such as WithMaxActiveSessions.
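
The difference between the idle-retention setting and the hard cap is easier to see in a toy pool. The model below is a sketch under stated assumptions and does not reflect Ferret's internal pool: pool, vm, get, put, and errPoolExhausted are hypothetical names, with errPoolExhausted standing in for vm.ErrPoolExhausted.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errPoolExhausted stands in for vm.ErrPoolExhausted in this model.
var errPoolExhausted = errors.New("pool exhausted")

type vm struct{ id int }

// pool is a hypothetical per-plan VM pool.
type pool struct {
	mu       sync.Mutex
	maxIdle  int   // retention: idle VMs kept for reuse (0 keeps none)
	maxTotal int   // hard cap on idle + borrowed VMs (0 means unlimited)
	idle     []*vm // VMs waiting for reuse
	borrowed int   // VMs currently held by sessions
	created  int   // total VMs ever created
	closed   int   // VMs discarded because the idle cache was full
}

// get reuses an idle VM if possible, otherwise creates one within the cap.
func (p *pool) get() (*vm, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.idle); n > 0 {
		v := p.idle[n-1]
		p.idle = p.idle[:n-1]
		p.borrowed++
		return v, nil
	}
	if p.maxTotal > 0 && p.borrowed+len(p.idle) >= p.maxTotal {
		return nil, errPoolExhausted
	}
	p.created++
	p.borrowed++
	return &vm{id: p.created}, nil
}

// put returns a VM; it is retained only while the idle cache has room.
func (p *pool) put(v *vm) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.borrowed--
	if len(p.idle) < p.maxIdle {
		p.idle = append(p.idle, v)
		return
	}
	p.closed++ // idle cache full: the VM is closed instead of retained
}

func main() {
	p := &pool{maxIdle: 1, maxTotal: 2}
	a, _ := p.get()
	b, _ := p.get()
	_, err := p.get()
	fmt.Println("third get:", err)

	p.put(a)
	p.put(b)
	fmt.Println("idle:", len(p.idle), "closed:", p.closed)

	c, _ := p.get()
	fmt.Println("reused first VM:", c == a, "created:", p.created)
}
```

With maxIdle set to 1 and maxTotal set to 2, the third concurrent request fails immediately, and only one of the two returned VMs survives for reuse.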

func WithModules

func WithModules(module ...Module) Option

WithModules returns an Option that appends the provided modules to the engine's configuration. An empty list has no effect.

func WithNamespace

func WithNamespace(ns runtime.Namespace) Option

WithNamespace returns an Option that sets the engine's library from the provided runtime.Namespace. A nil namespace has no effect.

func WithPlanCloseHook

func WithPlanCloseHook(hook PlanCloseHook) Option

WithPlanCloseHook returns an Option that registers a hook to execute when a plan is closed. It returns an error if hook is nil.

func WithProgramLoader

func WithProgramLoader(loader *artifact.Loader) Option

WithProgramLoader sets a custom artifact loader for Engine.Load.

func WithSessionCloseHook

func WithSessionCloseHook(hook SessionCloseHook) Option

WithSessionCloseHook returns an Option that registers a hook to execute when a session is closed. It returns an error if hook is nil.

func WithoutStdlib

func WithoutStdlib() Option

WithoutStdlib disables the standard library, so no built-in functions are registered by default.

type Output

type Output struct {
	ContentType string
	Content     []byte
}

type Plan

type Plan struct {
	// contains filtered or unexported fields
}

func (*Plan) Close

func (p *Plan) Close() error

func (*Plan) NewSession

func (p *Plan) NewSession(ctx context.Context, setters ...SessionOption) (*Session, error)

func (*Plan) Params

func (p *Plan) Params() []string

Params returns the list of parameter names declared in the query.

type PlanCloseHook

type PlanCloseHook func() error

PlanCloseHook runs when a plan is closed. Close hooks are executed in LIFO order and their errors are aggregated.

type PlanHookRegistrar

type PlanHookRegistrar interface {
	// BeforeCompile registers a hook executed in FIFO order before compilation starts.
	// A nil hook is ignored.
	BeforeCompile(hook BeforeCompileHook)
	// AfterCompile registers a hook executed in LIFO order after each compilation attempt.
	// It receives the compilation error (if any), and a nil hook is ignored.
	AfterCompile(hook AfterCompileHook)
	// OnClose registers a hook executed in LIFO order when a plan is closed.
	// A nil hook is ignored.
	OnClose(hook PlanCloseHook)
}

PlanHookRegistrar registers hooks for compilation and plan shutdown.

type Session

type Session struct {
	// contains filtered or unexported fields
}

Session represents the execution of a compiled Ferret program. It holds the state of the execution, including the virtual machine, environment, and encoding registry. A Session is created from a Plan and can be run to obtain results.

Session is not safe for concurrent use by multiple goroutines, except that Close is idempotent and safe to call multiple times, including concurrently. It is typically used for a single logical execution. When a Session is created directly via Plan.NewSession, it may be reused for multiple sequential runs as long as the environment and encoding registry are not modified between runs. Helper APIs such as Engine.Run may take ownership of the Session and close it after a single execution, in which case the caller must not attempt to reuse it.

func (*Session) Close

func (s *Session) Close() error

Close releases the session's borrowed VM and runs close hooks. It is idempotent and safe to call multiple times, including concurrently.
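
An idempotent, concurrency-safe Close is commonly built on sync.Once. The sketch below shows that general technique; it is an assumption about how such a guarantee can be met, not a description of how Session is implemented, and the session type and closeConcurrently helper are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// session is a hypothetical type with an idempotent Close.
type session struct {
	once     sync.Once
	err      error // result of the first Close, returned by every later call
	releases int   // how many times the release logic actually ran
}

// Close runs the release logic exactly once, no matter how many goroutines
// call it, and returns the same result to every caller.
func (s *session) Close() error {
	s.once.Do(func() {
		s.releases++ // stands in for returning the VM and running close hooks
		s.err = nil  // a real release could fail; the first result is kept
	})
	return s.err
}

// closeConcurrently calls Close from n goroutines and reports how many times
// the release logic ran.
func closeConcurrently(n int) int {
	s := &session{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = s.Close()
		}()
	}
	wg.Wait()
	return s.releases
}

func main() {
	fmt.Println(closeConcurrently(16)) // 1
}
```

Because sync.Once makes later callers wait for the first call to finish, every caller observes the completed release before Close returns.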

func (*Session) Run

func (s *Session) Run(c context.Context) (*Output, error)

type SessionCloseHook

type SessionCloseHook func() error

SessionCloseHook runs when a session is closed. Close hooks are executed in LIFO order and their errors are aggregated.

type SessionHookRegistrar

type SessionHookRegistrar interface {
	// BeforeRun registers a hook executed in FIFO order before each session run.
	// Hooks can replace the context passed to subsequent hooks and VM execution.
	// A nil hook is ignored.
	BeforeRun(hook BeforeRunHook)
	// AfterRun registers a hook executed in LIFO order after each run attempt.
	// It receives the run error (if any), and a nil hook is ignored.
	AfterRun(hook AfterRunHook)
	// OnClose registers a hook executed in LIFO order when a session is closed.
	// A nil hook is ignored.
	OnClose(hook SessionCloseHook)
}

SessionHookRegistrar registers hooks for session execution and shutdown.

type SessionOption

type SessionOption func(*sessionOptions) error

func WithEnvironmentOptions

func WithEnvironmentOptions(opts ...vm.EnvironmentOption) SessionOption

func WithOutputContentType

func WithOutputContentType(contentType string) SessionOption

func WithSessionParam

func WithSessionParam(name string, value runtime.Value) SessionOption

func WithSessionParams

func WithSessionParams(params runtime.Params) SessionOption

Directories

Path: Synopsis

compat: Package compat provides a Ferret v1-compatible public API surface, built on top of the Ferret v2 engine.
compat/compiler: Package compiler provides a v1-compatible Compiler for the Ferret compatibility layer.
compat/runtime: Package runtime provides v1-compatible runtime types for the Ferret compatibility layer.
compat/runtime/core: Package core provides v1-compatible runtime core types for the Ferret compatibility layer.
compat/runtime/values: Package values provides v1-compatible concrete value types and helpers for the Ferret compatibility layer.
compat/runtime/values/types: Package types provides v1-compatible type constants for the Ferret compatibility layer.
examples/embedded (command)
examples/extensible (command)
pkg/asm
pkg/parser/tools (command)
pkg/sdk
pkg/vm
pkg/vm/internal/mem: Package mem provides VM storage helpers and the narrow ownership layer used to clean up direct register-held closers.
test/e2e (command)
