scraper

package
v0.1.31
Published: Mar 9, 2026 | License: GPL-3.0 | Imports: 10 | Imported by: 0

Documentation

Overview

Package scraper extracts metadata such as titles, descriptions, keywords, and favicons from web pages, with configurable options for the scraping process.

Constants

This section is empty.

Variables

var ErrScrapeNotStarted = errors.New("scrape not started")

Functions

This section is empty.

Types

type OptFn

type OptFn func(*Options)

func WithContext

func WithContext(ctx context.Context) OptFn

func WithCustomSpinner

func WithCustomSpinner(sp *rotato.Rotato) OptFn

func WithSpinner

func WithSpinner(mesg string) OptFn

type Options

type Options struct {
	// contains filtered or unexported fields
}

type Scraper

type Scraper struct {
	Options
}

func New

func New(s string, opts ...OptFn) *Scraper

New creates a new Scraper.

func (*Scraper) Desc

func (s *Scraper) Desc() (string, error)

Desc retrieves the page description from the Scraper's Doc field, defaulting to a predefined value if not found.

default: `no description available (unfiled)`

func (*Scraper) Favicon

func (s *Scraper) Favicon() (string, error)

Favicon returns the URL of the favicon, or an empty string if not found.

func (*Scraper) Host added in v0.1.31

func (s *Scraper) Host() string
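
The source gives no description for Host. Assuming it returns the host component of the URL being scraped (an assumption, not confirmed by the documentation), the standard-library equivalent would look roughly like the sketch below. hostOf is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostOf returns the host name of rawURL without any port,
// or an empty string if the URL cannot be parsed.
// This is an illustrative helper, not part of the scraper package.
func hostOf(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	return u.Hostname()
}

func main() {
	fmt.Println(hostOf("https://example.com:8443/path?q=1"))
}
```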

func (*Scraper) Keywords

func (s *Scraper) Keywords() (string, error)

Keywords extracts the content of the meta keywords tag.

func (*Scraper) Start

func (s *Scraper) Start() error

Start fetches and parses the URL content.

func (*Scraper) TagsRepo added in v0.1.31

func (s *Scraper) TagsRepo() ([]string, error)

TagsRepo returns repository topics (if present).

func (*Scraper) Title

func (s *Scraper) Title() (string, error)

Title retrieves the page title from the Scraper's Doc field, falling back to a default value if not found.

default: `untitled (unfiled)`
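
The fallback described for Title (and, analogously, for Desc) can be sketched with the standard library alone. The real package reads from a parsed document; the regular expression here is a deliberate simplification, titleOrDefault is a hypothetical helper, and treating a blank title as missing is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// defaultTitle is the documented fallback value.
const defaultTitle = "untitled (unfiled)"

// titleRe captures the text of the first <title> element.
// (?is) makes the match case-insensitive and lets . span newlines.
var titleRe = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)

// titleOrDefault returns the trimmed <title> text of htmlSrc,
// or defaultTitle when the element is missing or blank.
func titleOrDefault(htmlSrc string) string {
	m := titleRe.FindStringSubmatch(htmlSrc)
	if m == nil {
		return defaultTitle
	}
	if t := strings.TrimSpace(m[1]); t != "" {
		return t
	}
	return defaultTitle
}

func main() {
	fmt.Println(titleOrDefault("<html><head><title> Go Docs </title></head></html>"))
	fmt.Println(titleOrDefault("<html><head></head></html>"))
}
```

A regular expression is adequate for a sketch but not for arbitrary HTML; a real scraper would use an HTML parser.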

Directories

Path Synopsis
Package wayback provides a client for the Internet Archive Wayback Machine API.
