duplicate

package
Version: v0.80.4
Published: Feb 1, 2026
License: MIT
Imports: 10
Imported by: 0

Documentation

Overview

Package duplicate provides memo duplicate detection for P2-C002. It contains the detector implementation and the similarity calculation.

Constants

const (
	DuplicateThreshold = 0.9 // >90% = duplicate
	RelatedThreshold   = 0.7 // 70-90% = related
	DefaultTopK        = 5
)

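Thresholds for duplicate detection: a similarity above DuplicateThreshold marks a memo as a duplicate, and a similarity between RelatedThreshold and DuplicateThreshold marks it as related. DefaultTopK is the default top-K value.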
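The threshold comments suggest a simple banding of the final score. The sketch below shows one way to apply it; `classify` and the level strings it returns are hypothetical, and treating exactly 0.7 as "related" is an assumption, since the package does not document the boundary handling.

```go
package main

import "fmt"

// Values copied from the package constants.
const (
	DuplicateThreshold = 0.9
	RelatedThreshold   = 0.7
)

// classify is a hypothetical helper, not part of the package API.
// It maps a similarity score in [0, 1] to a level name.
func classify(score float64) string {
	switch {
	case score > DuplicateThreshold:
		return "duplicate" // more than 90% similar
	case score >= RelatedThreshold:
		return "related" // 70-90% similar (inclusive bounds are an assumption)
	default:
		return "none"
	}
}

func main() {
	for _, s := range []float64{0.95, 0.80, 0.40} {
		fmt.Printf("%.2f -> %s\n", s, classify(s))
	}
}
```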

const TimeDecayDays = 7

TimeDecayDays is the decay period for time proximity calculation.

Variables

var DefaultWeights = Weights{
	Vector:     0.5,
	TagCoOccur: 0.3,
	TimeProx:   0.2,
}

DefaultWeights are the default weights for similarity calculation.

Functions

func CalculateWeightedSimilarity

func CalculateWeightedSimilarity(b *Breakdown, w Weights) float64

CalculateWeightedSimilarity computes weighted similarity from breakdown.
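A weighted similarity of this shape is commonly a linear combination of the component scores. The following is a minimal sketch under that assumption; `weightedSimilarity` is a hypothetical stand-in, and the real function may normalize or clamp the result differently.

```go
package main

import "fmt"

// Local copies of the documented types, trimmed of JSON tags.
type Breakdown struct {
	Vector     float64
	TagCoOccur float64
	TimeProx   float64
}

type Weights struct {
	Vector     float64
	TagCoOccur float64
	TimeProx   float64
}

// Values copied from the package's DefaultWeights.
var DefaultWeights = Weights{Vector: 0.5, TagCoOccur: 0.3, TimeProx: 0.2}

// weightedSimilarity is a hypothetical sketch: a plain weighted sum
// of the three component scores. A nil breakdown scores 0 (assumption).
func weightedSimilarity(b *Breakdown, w Weights) float64 {
	if b == nil {
		return 0
	}
	return b.Vector*w.Vector + b.TagCoOccur*w.TagCoOccur + b.TimeProx*w.TimeProx
}

func main() {
	b := &Breakdown{Vector: 0.92, TagCoOccur: 0.5, TimeProx: 1.0}
	// 0.92*0.5 + 0.5*0.3 + 1.0*0.2
	fmt.Printf("%.2f\n", weightedSimilarity(b, DefaultWeights)) // prints 0.81
}
```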

func CosineSimilarity

func CosineSimilarity(a, b []float32) float64

CosineSimilarity calculates cosine similarity between two vectors.

func ExtractTitle

func ExtractTitle(content string) string

ExtractTitle extracts title from memo content (first line).
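Taking the first line of a string is straightforward with `strings.Cut`. This is a minimal sketch; `firstLine` is a hypothetical name, and trimming surrounding whitespace is an assumption (the real function may also strip Markdown heading markers or cap the length).

```go
package main

import (
	"fmt"
	"strings"
)

// firstLine is a hypothetical sketch: it returns everything before
// the first newline, with surrounding whitespace removed.
func firstLine(content string) string {
	line, _, _ := strings.Cut(content, "\n")
	return strings.TrimSpace(line)
}

func main() {
	fmt.Println(firstLine("Shopping list\nmilk\neggs"))
}
```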

func FindSharedTags

func FindSharedTags(tags1, tags2 []string) []string

FindSharedTags returns tags that appear in both slices.

func TagCoOccurrence

func TagCoOccurrence(tags1, tags2 []string) float64

TagCoOccurrence calculates Jaccard similarity between two tag sets.
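Jaccard similarity is the size of the intersection divided by the size of the union of the two sets. The sketch below illustrates it; `jaccard` is a hypothetical name, and returning 0 when both tag sets are empty is an assumption.

```go
package main

import "fmt"

// toSet converts a slice to a set, collapsing repeated tags.
func toSet(tags []string) map[string]bool {
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return set
}

// jaccard is a hypothetical sketch: |A ∩ B| / |A ∪ B|.
func jaccard(tags1, tags2 []string) float64 {
	a, b := toSet(tags1), toSet(tags2)
	if len(a) == 0 && len(b) == 0 {
		return 0 // assumption: two untagged memos share no tag signal
	}
	intersection := 0
	for t := range a {
		if b[t] {
			intersection++
		}
	}
	union := len(a) + len(b) - intersection
	return float64(intersection) / float64(union)
}

func main() {
	// One shared tag ("ai") out of three distinct tags.
	fmt.Printf("%.3f\n", jaccard([]string{"go", "ai"}, []string{"ai", "db"}))
}
```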

func TimeProximity

func TimeProximity(newTime, candidateTime time.Time) float64

TimeProximity calculates time proximity using exponential decay. It returns 1.0 when both times fall on the same day, and the score decays exponentially over TimeDecayDays.

func Truncate

func Truncate(content string, maxLen int) string

Truncate truncates content to maxLen characters.
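If "characters" means Unicode code points rather than bytes, truncation has to work on runes to avoid cutting a multi-byte character in half. The following sketch assumes that reading; `truncateRunes` is a hypothetical name, and whether the real function appends an ellipsis is not documented, so none is added here.

```go
package main

import "fmt"

// truncateRunes is a hypothetical sketch that keeps at most maxLen
// Unicode code points of content.
func truncateRunes(content string, maxLen int) string {
	if maxLen <= 0 {
		return ""
	}
	runes := []rune(content)
	if len(runes) <= maxLen {
		return content // already short enough
	}
	return string(runes[:maxLen])
}

func main() {
	fmt.Println(truncateRunes("héllo wörld", 5))
}
```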

Types

type Breakdown

type Breakdown struct {
	Vector     float64 `json:"vector"`
	TagCoOccur float64 `json:"tag_co_occur"`
	TimeProx   float64 `json:"time_prox"`
}

Breakdown shows how similarity was calculated.

type DetectRequest

type DetectRequest struct {
	Title   string   `json:"title"`
	Content string   `json:"content"`
	Tags    []string `json:"tags,omitempty"`
	TopK    int      `json:"top_k,omitempty"`
	UserID  int32    `json:"user_id"`
}

DetectRequest contains input for duplicate detection.
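The JSON tags indicate the wire format of a request. As an illustration, the sketch below decodes a request body and fills in a missing `top_k`; `decodeRequest` is a hypothetical helper, and falling back to DefaultTopK when TopK is unset is an assumption about how the default is applied.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Value copied from the package constant.
const DefaultTopK = 5

// Local copy of the documented type.
type DetectRequest struct {
	Title   string   `json:"title"`
	Content string   `json:"content"`
	Tags    []string `json:"tags,omitempty"`
	TopK    int      `json:"top_k,omitempty"`
	UserID  int32    `json:"user_id"`
}

// decodeRequest is a hypothetical helper that parses a JSON body and
// applies DefaultTopK when top_k is missing or not positive.
func decodeRequest(data []byte) (*DetectRequest, error) {
	var req DetectRequest
	if err := json.Unmarshal(data, &req); err != nil {
		return nil, fmt.Errorf("decode detect request: %w", err)
	}
	if req.TopK <= 0 {
		req.TopK = DefaultTopK
	}
	return &req, nil
}

func main() {
	body := []byte(`{"title":"Groceries","content":"milk, eggs","tags":["home"],"user_id":7}`)
	req, err := decodeRequest(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("title=%q tags=%v top_k=%d user=%d\n", req.Title, req.Tags, req.TopK, req.UserID)
}
```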

type DetectResponse

type DetectResponse struct {
	Duplicates   []SimilarMemo `json:"duplicates,omitempty"`
	Related      []SimilarMemo `json:"related,omitempty"`
	LatencyMs    int64         `json:"latency_ms"`
	HasDuplicate bool          `json:"has_duplicate"`
	HasRelated   bool          `json:"has_related"`
}

DetectResponse contains detection results.

type DuplicateDetector

type DuplicateDetector interface {
	// Detect finds duplicate and related memos for given content.
	Detect(ctx context.Context, req *DetectRequest) (*DetectResponse, error)

	// Merge merges source memo into target memo.
	Merge(ctx context.Context, userID int32, sourceID, targetID string) error

	// Link creates a bidirectional relation between two memos.
	Link(ctx context.Context, userID int32, memoID1, memoID2 string) error
}

DuplicateDetector detects duplicate and related memos.

func NewDuplicateDetector

func NewDuplicateDetector(s *store.Store, embedding ai.EmbeddingService, model string) DuplicateDetector

NewDuplicateDetector creates a new DuplicateDetector.

func NewDuplicateDetectorWithWeights

func NewDuplicateDetectorWithWeights(s *store.Store, embedding ai.EmbeddingService, model string, weights Weights) DuplicateDetector

NewDuplicateDetectorWithWeights creates a detector with custom weights.

type SimilarMemo

type SimilarMemo struct {
	Breakdown  *Breakdown `json:"breakdown,omitempty"`
	ID         string     `json:"id"`
	Name       string     `json:"name"`
	Title      string     `json:"title"`
	Snippet    string     `json:"snippet"`
	Level      string     `json:"level"`
	SharedTags []string   `json:"shared_tags,omitempty"`
	Similarity float64    `json:"similarity"`
}

SimilarMemo represents a memo similar to the input.

type Weights

type Weights struct {
	Vector     float64
	TagCoOccur float64
	TimeProx   float64
}

Weights for similarity calculation.
