scanner

package
v1.0.0 Latest
Warning: this package is not in the latest version of its module.
Published: Oct 10, 2025 | License: MIT | Imports: 16 | Imported by: 0

Documentation

Overview

Package scanner provides file system scanning capabilities for the doppel duplicate file finder.

This package handles the initial phase of duplicate detection by:

  • Recursively traversing directory structures
  • Applying filters to exclude unwanted files and directories
  • Grouping files by size to optimize duplicate detection
  • Processing command-line directory arguments and removing subdirectories

The scanner works in conjunction with the filter package to efficiently collect candidate files for duplicate detection.

Constants

This section is empty.

Variables

This section is empty.

Functions

func GetDirectoriesFromArgs

func GetDirectoriesFromArgs(c *cli.Command) ([]string, error)

GetDirectoriesFromArgs returns the directories to scan from command arguments.
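
The overview mentions that directory arguments are processed with subdirectories removed. The sketch below illustrates one way such pruning could work. It is not the package's implementation: pruneNested and isWithin are hypothetical helpers, and the sketch neither resolves symlinks nor converts relative paths to absolute ones.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// isWithin reports whether child is parent itself or lies below it.
// Hypothetical helper, not part of the scanner package.
func isWithin(parent, child string) bool {
	rel, err := filepath.Rel(parent, child)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// pruneNested drops duplicates and any directory that is nested inside
// another directory in the list. Hypothetical helper for illustration.
func pruneNested(dirs []string) []string {
	cleaned := make([]string, 0, len(dirs))
	for _, d := range dirs {
		cleaned = append(cleaned, filepath.Clean(d))
	}
	// A parent path is a strict prefix of its children, so after sorting
	// every parent appears before the directories it contains.
	sort.Strings(cleaned)

	var kept []string
	for _, d := range cleaned {
		nested := false
		for _, k := range kept {
			if isWithin(k, d) {
				nested = true
				break
			}
		}
		if !nested {
			kept = append(kept, d)
		}
	}
	return kept
}

func main() {
	dirs := []string{"/data/photos", "/data", "/database", "/data/photos/2024"}
	fmt.Println(pruneNested(dirs))
}
```

Comparing with filepath.Rel rather than a plain string prefix avoids treating /database as a child of /data.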

func GroupFilesBySize

func GroupFilesBySize(
	ctx context.Context,
	directories []string,
	filterConfig *filter.Config,
	stats *model.Stats,
	verbose bool,
) (map[int64][]FileInfo, error)

GroupFilesBySize scans directories and groups files by their size.
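
Grouping by size is a cheap first pass, because files of different sizes cannot be duplicates. The following is a minimal sketch of that idea using only the standard library. groupBySize and demo are hypothetical names; the sketch ignores filtering, statistics, and verbose output, and it assumes that sizes seen only once can be dropped, which may not match what GroupFilesBySize actually returns.

```go
package main

import (
	"context"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// groupBySize walks each root and collects regular files keyed by size.
// Hypothetical stand-in for illustration, not the package's function.
func groupBySize(ctx context.Context, roots []string) (map[int64][]string, error) {
	groups := make(map[int64][]string)
	for _, root := range roots {
		err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
			if err != nil {
				return err
			}
			// Stop the walk early if the caller cancelled the context.
			if ctxErr := ctx.Err(); ctxErr != nil {
				return ctxErr
			}
			if !d.Type().IsRegular() {
				return nil // skip directories, symlinks, devices
			}
			info, err := d.Info()
			if err != nil {
				return err
			}
			groups[info.Size()] = append(groups[info.Size()], path)
			return nil
		})
		if err != nil {
			return nil, err
		}
	}
	// Assumption: a size that occurs once cannot have a duplicate.
	for size, paths := range groups {
		if len(paths) < 2 {
			delete(groups, size)
		}
	}
	return groups, nil
}

// demo builds a throwaway directory with three small files and groups them.
func demo() (map[int64][]string, error) {
	dir, err := os.MkdirTemp("", "scanner-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	samples := map[string]string{"a.txt": "hello", "b.txt": "world", "c.txt": "hi"}
	for name, body := range samples {
		if err := os.WriteFile(filepath.Join(dir, name), []byte(body), 0o600); err != nil {
			return nil, err
		}
	}
	return groupBySize(context.Background(), []string{dir})
}

func main() {
	groups, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for size, paths := range groups {
		fmt.Printf("%d bytes: %d candidate files\n", size, len(paths))
	}
}
```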

func HashFile

func HashFile(filePath string, hasher hash.Hash, buf []byte) (string, error)

HashFile computes the hash of an entire file.
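
The signature suggests that the caller supplies both the hasher and a reusable read buffer. A minimal sketch of that pattern follows. hashFile and demo are hypothetical names, SHA-256 is chosen only for the demonstration, and hex encoding of the digest is an assumption, since the documentation does not say how the returned string is formatted.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"os"
	"path/filepath"
)

// hashFile streams the whole file through h using buf as scratch space.
// Hypothetical stand-in; the hex output format is an assumption.
func hashFile(path string, h hash.Hash, buf []byte) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	if len(buf) == 0 {
		buf = make([]byte, 32*1024) // fall back to a default buffer
	}
	h.Reset() // allow the caller to reuse one hasher across files
	for {
		n, err := f.Read(buf)
		if n > 0 {
			h.Write(buf[:n]) // hash.Hash.Write never returns an error
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// demo hashes a temporary file containing the text "hello".
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "hash-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "sample.txt")
	if err := os.WriteFile(path, []byte("hello"), 0o600); err != nil {
		return "", err
	}
	return hashFile(path, sha256.New(), make([]byte, 4096))
}

func main() {
	digest, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(digest)
}
```

Passing the hasher and buffer in lets a worker reuse both across many files instead of allocating per call.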

func QuickHashFile (added in v1.0.0)

func QuickHashFile(filePath string, size int64, hasher *xxh3.Hasher, buf []byte) (uint64, error)

QuickHashFile computes an XXH3 hash of the first and last portions of a file. It serves as a quick preliminary check before the full hash is computed.
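
The head-and-tail idea can be sketched as follows. To stay within the standard library, this sketch substitutes FNV-1a (hash/fnv) for XXH3, so the values differ from what QuickHashFile produces. quickHash and demo are hypothetical names, and the choice to use the buffer length as the portion size and to hash small files in full is an assumption.

```go
package main

import (
	"fmt"
	"hash"
	"hash/fnv"
	"io"
	"os"
	"path/filepath"
)

// quickHash hashes the first and last len(buf) bytes of the file.
// Files no larger than two buffers are hashed in full.
// Hypothetical stand-in using FNV-1a instead of XXH3.
func quickHash(path string, size int64, h hash.Hash64, buf []byte) (uint64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	if len(buf) == 0 {
		buf = make([]byte, 4096)
	}
	chunk := int64(len(buf))
	h.Reset()

	if size <= 2*chunk {
		if _, err := io.Copy(h, f); err != nil {
			return 0, err
		}
		return h.Sum64(), nil
	}
	// Head portion.
	if _, err := io.ReadFull(f, buf); err != nil {
		return 0, err
	}
	h.Write(buf)
	// Tail portion, read without disturbing the file offset.
	tail := io.NewSectionReader(f, size-chunk, chunk)
	if _, err := io.ReadFull(tail, buf); err != nil {
		return 0, err
	}
	h.Write(buf)
	return h.Sum64(), nil
}

// demo hashes three 12-byte files with a 4-byte portion size.
func demo() ([]uint64, error) {
	dir, err := os.MkdirTemp("", "quickhash-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	// Files 0 and 1 differ only in the middle; file 2 differs in the tail.
	bodies := []string{"AAAAxxxxZZZZ", "AAAAyyyyZZZZ", "AAAAxxxxZZZY"}
	sums := make([]uint64, 0, len(bodies))
	buf := make([]byte, 4)
	for i, body := range bodies {
		path := filepath.Join(dir, fmt.Sprintf("f%d.bin", i))
		if err := os.WriteFile(path, []byte(body), 0o600); err != nil {
			return nil, err
		}
		sum, err := quickHash(path, int64(len(body)), fnv.New64a(), buf)
		if err != nil {
			return nil, err
		}
		sums = append(sums, sum)
	}
	return sums, nil
}

func main() {
	sums, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("middle-only difference detected:", sums[0] != sums[1])
	fmt.Println("tail difference detected:", sums[0] != sums[2])
}
```

The demo shows why a matching quick hash is only a hint: files that differ solely in the middle collide, so a full hash is still needed to confirm a duplicate.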

Types

type FileInfo

type FileInfo struct {
	Path string `json:"path" yaml:"path"`
	Size int64  `json:"size" yaml:"size"`
	Hash string `json:"hash" yaml:"hash"`
}

FileInfo represents a file with its path, size, and hash.
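
The struct tags determine the field names used in serialized output. The sketch below copies the type definition locally so that it runs on its own and encodes a value with encoding/json. The encode helper and the sample values are hypothetical; the yaml tags would need a third-party YAML library and are not exercised here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// FileInfo is a local copy of the documented type, for illustration only.
type FileInfo struct {
	Path string `json:"path" yaml:"path"`
	Size int64  `json:"size" yaml:"size"`
	Hash string `json:"hash" yaml:"hash"`
}

// encode returns the JSON form of fi. Hypothetical helper.
func encode(fi FileInfo) (string, error) {
	b, err := json.Marshal(fi)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encode(FileInfo{Path: "/data/a.txt", Size: 5, Hash: "abc"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```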
