Documentation ¶
Index ¶
- Constants
- func CalculateOptimalScrapers(requestScrapers []string, configPriority []string, parsed *ParsedInput) []string
- func DetectPartSuffix(nameWithoutExt, id string) (int, string, string)
- func FilterScrapersForURL(userScrapers []string, parsed *ParsedInput) []string
- func GroupByID(results []MatchResult) map[string][]MatchResult
- func ReorderWithPriority(scrapers []string, priority string) []string
- type MatchResult
  - func FilterMultiPart(results []MatchResult) []MatchResult
  - func FilterSinglePart(results []MatchResult) []MatchResult
  - func ValidateMultipartInDirectory(results []MatchResult) []MatchResult
- type Matcher
  - func NewMatcher(cfg *config.MatchingConfig) (*Matcher, error)
  - func (m *Matcher) Match(files []scanner.FileInfo) []MatchResult
  - func (m *Matcher) MatchFile(file scanner.FileInfo) *MatchResult
- type ParsedInput
  - func ParseInput(input string, registry *models.ScraperRegistry) (*ParsedInput, error)
Constants ¶
const (
	// PatternExplicit indicates explicit multipart patterns (pt1, part2, -1, -2).
	// These are always considered multipart without directory context validation.
	PatternExplicit = "explicit"

	// PatternLetter indicates ambiguous single-letter patterns (A, B, C).
	// These need directory context validation to confirm multipart status.
	PatternLetter = "letter"

	// PatternNone indicates no multipart pattern detected.
	PatternNone = ""
)
Pattern type constants for multipart detection
Variables ¶
This section is empty.
Functions ¶
func CalculateOptimalScrapers ¶
func CalculateOptimalScrapers(
	requestScrapers []string,
	configPriority []string,
	parsed *ParsedInput,
) []string
CalculateOptimalScrapers determines the optimal scraper list for a given input. This consolidates the scraper selection logic used by both /scrape and /rescrape endpoints, ensuring consistent behavior and preventing logic drift.
The function applies two optimizations when a URL is detected:

 1. FILTER: reduce the scraper list to only URL-compatible scrapers.
 2. REORDER: place the hinted scraper first for best performance.
Parameters:
- requestScrapers: User's explicitly selected scrapers (can be empty)
- configPriority: Default scraper priority from configuration
- parsed: Result from ParseInput containing URL detection info (can be nil)
Returns the optimized scraper list to use for scraping.
func DetectPartSuffix ¶
func DetectPartSuffix(nameWithoutExt, id string) (int, string, string)

DetectPartSuffix parses the portion of the filename after the first occurrence of id and returns (number, suffix, patternType), where:
- number: 0 for single-part, 1..N for part index
- suffix: normalized string to append to base name (including leading dash)
- patternType: "explicit" for unambiguous patterns, "letter" for ambiguous single-letter, "" for no pattern detected
func FilterScrapersForURL ¶
func FilterScrapersForURL(userScrapers []string, parsed *ParsedInput) []string
FilterScrapersForURL filters a list of scrapers to only those compatible with a parsed URL. This helper is used by API endpoints to optimize scraper selection when a URL is detected.
Parameters:
- userScrapers: User's selected scrapers (can be empty to use all compatible)
- parsed: Result from ParseInput containing CompatibleScrapers
Returns the filtered scrapers, or all compatible scrapers if userScrapers is empty. If no compatible scrapers exist, returns an empty slice (the caller should handle this case).
func GroupByID ¶
func GroupByID(results []MatchResult) map[string][]MatchResult
GroupByID groups match results by their ID
func ReorderWithPriority ¶
func ReorderWithPriority(scrapers []string, priority string) []string

ReorderWithPriority moves the priority scraper to the front of the list. This is useful when multiple compatible scrapers exist for a URL: the hinted scraper should be tried first for best performance.
Parameters:
- scrapers: List of scraper names
- priority: Scraper name to move to front
Returns the reordered list with the priority scraper first. If scrapers is empty, returns a single-item list containing just the priority scraper.
Types ¶
type MatchResult ¶
type MatchResult struct {
	File             scanner.FileInfo
	ID               string // Extracted JAV ID (e.g., "IPX-535")
	PartNumber       int    // 0 = single-part, 1..N = part index
	PartSuffix       string // "-A", "-pt1", "-part2" (always with leading dash)
	IsMultiPart      bool   // Whether this is a multi-part file
	MatchedBy        string // "regex" or "builtin"
	MultipartPattern string // Pattern type: "explicit", "letter", or "" (see PatternExplicit, PatternLetter, PatternNone)
}
MatchResult represents a matched file with extracted ID
func FilterMultiPart ¶
func FilterMultiPart(results []MatchResult) []MatchResult
FilterMultiPart filters results to only include multi-part files
func FilterSinglePart ¶
func FilterSinglePart(results []MatchResult) []MatchResult
FilterSinglePart filters results to only include single-part files
func ValidateMultipartInDirectory ¶
func ValidateMultipartInDirectory(results []MatchResult) []MatchResult
ValidateMultipartInDirectory validates letter-based multipart patterns by checking for sibling files in the same directory with the same ID. Files with ambiguous letter patterns (-A, -B, -C) are only marked as multipart if multiple files with the same movie ID exist in the same directory. This prevents false positives for files like "ABW-121-C.mp4" where -C means Chinese subtitles, not part 3.
type Matcher ¶
type Matcher struct {
	// contains filtered or unexported fields
}
Matcher identifies JAV IDs from filenames
func NewMatcher ¶
func NewMatcher(cfg *config.MatchingConfig) (*Matcher, error)
NewMatcher creates a new file matcher
func (*Matcher) Match ¶
func (m *Matcher) Match(files []scanner.FileInfo) []MatchResult
Match extracts JAV IDs from a list of files
func (*Matcher) MatchFile ¶
func (m *Matcher) MatchFile(file scanner.FileInfo) *MatchResult
MatchFile attempts to extract a JAV ID from a single file
func (*Matcher) MatchString ¶
MatchString is a helper to extract ID from a string directly
type ParsedInput ¶
type ParsedInput struct {
	ID                 string   // Extracted movie ID
	ScraperHint        string   // Suggested scraper ("dmm", "r18dev", or "")
	IsURL              bool     // true if input was a URL
	CompatibleScrapers []string // List of scrapers that can handle this URL (if IsURL)
}
ParsedInput represents the result of parsing user input
func ParseInput ¶
func ParseInput(input string, registry *models.ScraperRegistry) (*ParsedInput, error)
ParseInput determines whether the input is a URL or an ID and extracts the movie ID. The parser is agnostic about URL patterns: it delegates URL detection to scrapers that implement the URLHandler interface. If no scraper handles the URL, the input is treated as a plain movie ID.
When input is a URL, the function also returns the list of all compatible scrapers that can handle the URL, avoiding redundant registry iteration in callers.