Documentation
Overview ¶
Package filefilter provides a concurrent pipeline for filtering files by size, content type, and glob patterns.
Constants ¶
This section is empty.
Variables ¶
var ErrPathNotAllowed = errors.New("paths are not allowed in exclude rules")
ErrPathNotAllowed is returned when an exclude rule contains a path separator.
Functions ¶
func ExpandExcludeNames ¶
ExpandExcludeNames validates user-provided names and converts them into global glob patterns. Each input must be a basename (for example, "node_modules"), not a path.
func IsTextContent ¶
IsTextContent reports whether the data slice holds text content, using the null-byte method. See: https://docs.google.com/document/d/1GYir_j0ITTxg_CqyAw8BeUZYCCUyNMAePbGw5nsTGYE/
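A null-byte check of this kind can be sketched in a few lines. The function below is a hypothetical stand-in, assuming a `[]byte` argument and a `bool` result; the real function may inspect only a prefix of the data or apply further rules.

```go
package main

import (
	"bytes"
	"fmt"
)

// isTextContent is a hypothetical stand-in for IsTextContent.
// It treats data as text when it contains no NUL (0x00) byte.
func isTextContent(data []byte) bool {
	return bytes.IndexByte(data, 0) == -1
}

func main() {
	fmt.Println(isTextContent([]byte("hello, world\n")))
	fmt.Println(isTextContent([]byte{0x89, 'P', 'N', 'G', 0x00}))
}
```

Note that under this sketch an empty slice counts as text, because it contains no NUL byte.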
Types ¶
type Analytics ¶
type Analytics interface {
	RecordSizeFiltered(total int)
	RecordFileFilterTimeMs(startTime time.Time)
}
Analytics defines the metrics recording interface used by the filter pipeline.
type FileFilter ¶
FileFilter defines the contract for any logic that decides if a file should be dropped.
func FileSizeFilter ¶
func FileSizeFilter(logger *zerolog.Logger) FileFilter
FileSizeFilter returns a filter that drops empty files and files larger than 1 MB.
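The size rule can be expressed as a single predicate. This is a sketch only: the FileFilter method set is not shown on this page, so the example uses a plain hypothetical function, and it assumes that 1 MB means 1,048,576 bytes and that a file of exactly that size is kept.

```go
package main

import "fmt"

// maxFileSize is an assumption: 1 MB interpreted as 1 << 20 bytes.
const maxFileSize = 1 << 20

// shouldDropBySize is a hypothetical predicate mirroring the documented
// rule: drop empty files and files larger than 1 MB.
func shouldDropBySize(size int64) bool {
	return size == 0 || size > maxFileSize
}

func main() {
	for _, size := range []int64{0, 512, maxFileSize, maxFileSize + 1} {
		fmt.Printf("%d bytes dropped: %t\n", size, shouldDropBySize(size))
	}
}
```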
func TextFileOnlyFilter ¶
func TextFileOnlyFilter(logger *zerolog.Logger) FileFilter
TextFileOnlyFilter returns a filter that drops binary files based on a null-byte heuristic.
type Option ¶
type Option func(*Pipeline)
Option defines the functional option type.
func WithAnalytics ¶
WithAnalytics sets the analytics recorder for filter metrics.
func WithConcurrency ¶
WithConcurrency allows overriding the default worker count.
func WithExcludeGlobs ¶
WithExcludeGlobs adds user-defined patterns to the pipeline's exclude list.
func WithFilters ¶
func WithFilters(filters ...FileFilter) Option
WithFilters passes one or more filters, such as FileSizeFilter and TextFileOnlyFilter, to the pipeline.
func WithLogger ¶
WithLogger sets the logger for the pipeline.
type Pipeline ¶
type Pipeline struct {
	// contains filtered or unexported fields
}
Pipeline holds the configuration for the filtering process.
func NewPipeline ¶
NewPipeline creates a filter pipeline with default settings. Unless overridden with WithConcurrency, the worker count defaults to runtime.NumCPU().
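To show what a worker count of runtime.NumCPU() means in practice, here is a minimal sketch of a bounded worker pool that filters a list of names. It is not the package's implementation: `filterConcurrently`, its parameters, and the use of file names as input are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"sync"
)

// filterConcurrently is a hypothetical helper. It returns the items for
// which keep reports true, evaluating keep on a fixed number of worker
// goroutines. Input order is preserved in the result.
func filterConcurrently(items []string, workers int, keep func(string) bool) []string {
	if workers < 1 {
		workers = runtime.NumCPU() // same default the documentation describes
	}
	verdicts := make([]bool, len(items))
	indexes := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range indexes {
				// Each index is received by exactly one worker, so
				// writes to verdicts never overlap.
				verdicts[i] = keep(items[i])
			}
		}()
	}
	for i := range items {
		indexes <- i
	}
	close(indexes)
	wg.Wait()

	kept := make([]string, 0, len(items))
	for i, ok := range verdicts {
		if ok {
			kept = append(kept, items[i])
		}
	}
	return kept
}

func main() {
	files := []string{"main.go", "logo.png", "README.md"}
	kept := filterConcurrently(files, 0, func(name string) bool {
		return !strings.HasSuffix(name, ".png")
	})
	fmt.Println(kept)
}
```

Tying the pool size to the CPU count suits CPU-bound checks; for filters dominated by disk reads, a different worker count may perform better, which is presumably why WithConcurrency exists.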