Documentation ¶
Overview ¶
Package generator generates llms.txt files from website content using the Firecrawl and OpenAI APIs.
The package supports:
- Website mapping and URL discovery via Firecrawl
- Content scraping with configurable parameters
- AI-powered title and description generation via OpenAI
- Concurrent processing with rate limiting
- Generation of both summary (llms.txt) and full content (llms-full.txt) files
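The overview does not say how concurrency and rate limiting are implemented. The following is a minimal sketch of one common approach, assuming a semaphore channel to bound in-flight work and a ticker to space out request starts. The names processAll and demoProcess, and the worker count and interval used, are hypothetical and are not part of this package.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// processAll calls work for every URL with at most maxWorkers goroutines in
// flight and at most one new start per interval. Results keep input order.
// Both maxWorkers and interval must be greater than zero.
func processAll(ctx context.Context, urls []string, maxWorkers int, interval time.Duration,
	work func(context.Context, string) (string, error)) ([]string, []error) {

	results := make([]string, len(urls))
	errs := make([]error, len(urls))
	sem := make(chan struct{}, maxWorkers) // bounds concurrency
	ticker := time.NewTicker(interval)     // spaces out request starts
	defer ticker.Stop()

	var wg sync.WaitGroup
	for i, u := range urls {
		select {
		case <-ctx.Done():
			// Cancelled: record the reason and skip the remaining URLs.
			errs[i] = ctx.Err()
			continue
		case <-ticker.C:
		}
		sem <- struct{}{} // wait for a free worker slot
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only its own index, so no lock is needed.
			results[i], errs[i] = work(ctx, u)
		}(i, u)
	}
	wg.Wait()
	return results, errs
}

// demoProcess runs processAll over three example URLs with a stand-in worker.
func demoProcess() ([]string, []error) {
	urls := []string{"https://example.com/a", "https://example.com/b", "https://example.com/c"}
	return processAll(context.Background(), urls, 2, 10*time.Millisecond,
		func(_ context.Context, u string) (string, error) {
			return "scraped " + u, nil
		})
}

func main() {
	results, errs := demoProcess()
	for i, r := range results {
		fmt.Println(r, errs[i])
	}
}
```

Writing results by index keeps the output in discovery order even though workers finish out of order.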
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ParseDomainFromURL ¶
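This page lists ParseDomainFromURL without a signature or description. As an illustration only, the sketch below shows one way a domain can be extracted from a URL with the standard net/url package. The helper parseDomain, its signature, and its choice to strip a leading "www." are assumptions, not the package's actual behavior.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseDomain is a hypothetical stand-in; the real ParseDomainFromURL
// signature is not shown in this documentation.
func parseDomain(raw string) (string, error) {
	// Without a scheme, url.Parse treats the host as part of the path,
	// so add one before parsing.
	if !strings.Contains(raw, "://") {
		raw = "https://" + raw
	}
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	host := u.Hostname() // drops any port
	if host == "" {
		return "", fmt.Errorf("no host in %q", raw)
	}
	// Assumption: normalize case and drop a leading "www.".
	return strings.TrimPrefix(strings.ToLower(host), "www."), nil
}

func main() {
	d, err := parseDomain("https://www.example.com:8080/docs")
	fmt.Println(d, err) // example.com <nil>
}
```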
Types ¶
type FirecrawlClient ¶
type FirecrawlClient interface {
	MapWebsite(ctx context.Context, url string, limit int, options FirecrawlOptions) ([]string, error)
	ScrapeURL(ctx context.Context, url string, options FirecrawlOptions) (*ScrapedData, error)
}
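Because FirecrawlClient is an interface, a generator can be exercised in tests without network access. The sketch below shows an in-memory fake under stated assumptions: the interface is redeclared locally so the example compiles on its own, FirecrawlOptions and ScrapedData are placeholders because their fields are not shown on this page, and fakeClient, demoFake, and the Markdown field are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// FirecrawlOptions and ScrapedData are placeholders; the real field sets
// are not shown in this documentation.
type FirecrawlOptions struct{}

type ScrapedData struct {
	Markdown string // hypothetical field
}

// FirecrawlClient mirrors the interface documented above.
type FirecrawlClient interface {
	MapWebsite(ctx context.Context, url string, limit int, options FirecrawlOptions) ([]string, error)
	ScrapeURL(ctx context.Context, url string, options FirecrawlOptions) (*ScrapedData, error)
}

// fakeClient serves canned pages from memory.
type fakeClient struct {
	urls  []string          // URLs in discovery order
	pages map[string]string // URL -> page content
}

// Compile-time check that fakeClient satisfies the interface.
var _ FirecrawlClient = (*fakeClient)(nil)

func (f *fakeClient) MapWebsite(ctx context.Context, url string, limit int, _ FirecrawlOptions) ([]string, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	// Assumption: a non-positive limit means "no limit".
	if limit > 0 && limit < len(f.urls) {
		return f.urls[:limit], nil
	}
	return f.urls, nil
}

func (f *fakeClient) ScrapeURL(ctx context.Context, url string, _ FirecrawlOptions) (*ScrapedData, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	body, ok := f.pages[url]
	if !ok {
		return nil, errors.New("page not found: " + url)
	}
	return &ScrapedData{Markdown: body}, nil
}

// demoFake maps the fake site with a limit of 2, then scrapes the first URL.
func demoFake() ([]string, string, error) {
	var client FirecrawlClient = &fakeClient{
		urls:  []string{"https://example.com/", "https://example.com/docs", "https://example.com/blog"},
		pages: map[string]string{"https://example.com/": "# Home"},
	}
	ctx := context.Background()
	urls, err := client.MapWebsite(ctx, "https://example.com", 2, FirecrawlOptions{})
	if err != nil {
		return nil, "", err
	}
	data, err := client.ScrapeURL(ctx, urls[0], FirecrawlOptions{})
	if err != nil {
		return urls, "", err
	}
	return urls, data.Markdown, nil
}

func main() {
	urls, body, err := demoFake()
	fmt.Println(urls, body, err)
}
```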
func NewFirecrawlClient ¶
func NewFirecrawlClient(apiKey string) (FirecrawlClient, error)
NewFirecrawlClient initializes a new *firecrawl.FirecrawlApp with the given API key and returns it as a FirecrawlClient, or an error if initialization fails.
type FirecrawlOptions ¶
type GenerationOptions ¶
type GenerationResult ¶
type LLMsTxtGenerator ¶
type LLMsTxtGenerator struct {
// contains filtered or unexported fields
}
func NewLLMsTxtGenerator ¶
func NewLLMsTxtGenerator(firecrawlClient FirecrawlClient, SummarizerClient gollm.SummarizerClient, options GenerationOptions) *LLMsTxtGenerator
NewLLMsTxtGenerator creates a new instance of LLMsTxtGenerator with the provided clients and options.
Parameters:
- firecrawlClient: Client for website mapping and content scraping
- SummarizerClient: Client for AI-powered content analysis and description generation
- options: Configuration options for generation behavior, timeouts, and processing limits
Returns a configured generator ready to process websites and generate llms.txt files.
func (*LLMsTxtGenerator) GenerateLLMsTXT ¶
func (g *LLMsTxtGenerator) GenerateLLMsTXT(ctx context.Context, targetURL string) (*GenerationResult, error)
GenerateLLMsTXT generates both llms.txt and llms-full.txt files from a target URL.
The process includes:
- Mapping the website to discover all available URLs
- Processing URLs in configurable batches with rate limiting
- Scraping content from each URL using Firecrawl
- Generating AI-powered titles and descriptions using OpenAI
- Building structured output files
Parameters:
- ctx: Context for cancellation and timeout control
- targetURL: The base URL of the website to process
Returns GenerationResult containing the generated content and processing statistics, or an error if the generation process fails.
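The process described above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the package's implementation: the source struct holds stand-in functions for the Firecrawl and summarizer calls, the names page, source, generate, and demoGenerate are hypothetical, and the output layout (a heading followed by "- [title](url): description" entries) is assumed rather than taken from this page.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"sync"
)

// page is a hypothetical record for one processed URL.
type page struct{ URL, Title, Description, Markdown string }

// source bundles stand-ins for the Firecrawl and summarizer calls.
type source struct {
	mapSite   func(ctx context.Context, target string) ([]string, error)
	scrape    func(ctx context.Context, u string) (string, error)
	summarize func(ctx context.Context, u, markdown string) (title, desc string, err error)
}

// processOne scrapes and then summarizes a single URL.
func (s source) processOne(ctx context.Context, u string) (page, error) {
	md, err := s.scrape(ctx, u)
	if err != nil {
		return page{}, err
	}
	title, desc, err := s.summarize(ctx, u, md)
	if err != nil {
		return page{}, err
	}
	return page{URL: u, Title: title, Description: desc, Markdown: md}, nil
}

// generate maps the site, processes URLs batch by batch, and builds both
// outputs. Failed pages are skipped and counted. batchSize must be > 0.
func generate(ctx context.Context, s source, target string, batchSize int) (llms, full string, failed int, err error) {
	urls, err := s.mapSite(ctx, target)
	if err != nil {
		return "", "", 0, fmt.Errorf("map %s: %w", target, err)
	}
	pages := make([]*page, len(urls)) // a nil entry marks a failed page
	for start := 0; start < len(urls); start += batchSize {
		if err := ctx.Err(); err != nil {
			return "", "", 0, err
		}
		end := start + batchSize
		if end > len(urls) {
			end = len(urls)
		}
		var wg sync.WaitGroup
		for i := start; i < end; i++ {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				if p, err := s.processOne(ctx, urls[i]); err == nil {
					pages[i] = &p
				}
			}(i)
		}
		wg.Wait() // finish the whole batch before starting the next
	}
	var a, b strings.Builder
	fmt.Fprintf(&a, "# %s llms.txt\n\n", target)
	fmt.Fprintf(&b, "# %s llms-full.txt\n\n", target)
	for _, p := range pages {
		if p == nil {
			failed++
			continue
		}
		fmt.Fprintf(&a, "- [%s](%s): %s\n", p.Title, p.URL, p.Description)
		fmt.Fprintf(&b, "## %s\n%s\n\n", p.Title, p.Markdown)
	}
	return a.String(), b.String(), failed, nil
}

// demoGenerate runs the pipeline against canned data; one page fails to scrape.
func demoGenerate() (string, string, int, error) {
	s := source{
		mapSite: func(_ context.Context, target string) ([]string, error) {
			return []string{target + "/a", target + "/b", target + "/missing"}, nil
		},
		scrape: func(_ context.Context, u string) (string, error) {
			if strings.HasSuffix(u, "/missing") {
				return "", fmt.Errorf("404 for %s", u)
			}
			return "content of " + u, nil
		},
		summarize: func(_ context.Context, u, _ string) (string, string, error) {
			return "Title " + u[strings.LastIndex(u, "/")+1:], "Short description", nil
		},
	}
	return generate(context.Background(), s, "https://example.com", 2)
}

func main() {
	llms, _, failed, err := demoGenerate()
	fmt.Print(llms)
	fmt.Println("failed:", failed, "err:", err)
}
```

Skipping failed pages instead of aborting is a design choice made for this sketch; the page does not state how GenerateLLMsTXT handles per-URL failures.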
func (*LLMsTxtGenerator) SystemPrompt ¶
func (g *LLMsTxtGenerator) SystemPrompt() string
func (*LLMsTxtGenerator) UserPrompt ¶
func (g *LLMsTxtGenerator) UserPrompt(uri string) string