threading

package
v0.74.5
Warning

This package is not in the latest version of its module.

Published: May 17, 2026 License: MIT Imports: 5 Imported by: 0

Documentation

Overview

Package threading provides generic email thread reconstruction. It can reconstruct threading relationships even when In-Reply-To and References headers are missing, using subject matching, date proximity, and embedded message hints.

Constants

This section is empty.

Variables

This section is empty.

Functions

func DefaultSubjectNormalizer

func DefaultSubjectNormalizer(subject string) string

DefaultSubjectNormalizer removes common reply/forward prefixes.
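
The documentation does not list which prefixes are removed, so the following is only a sketch of what a normalizer with the same signature could look like. The name normalizeSubject and the prefix set (Re, Fw, Fwd, with an optional counter such as "[2]") are assumptions for illustration, not the package's implementation. A function of this shape could be assigned to Config.SubjectNormalizer.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// replyPrefix matches one leading reply/forward marker such as "Re:",
// "RE[2]:", "Fw:" or "Fwd:". The prefix set is an assumption.
var replyPrefix = regexp.MustCompile(`(?i)^\s*(re|fwd?)(\[\d+\])?\s*:\s*`)

// normalizeSubject is a hypothetical stand-in with the same signature as
// DefaultSubjectNormalizer. It strips stacked prefixes until none remain.
func normalizeSubject(subject string) string {
	s := strings.TrimSpace(subject)
	for {
		stripped := replyPrefix.ReplaceAllString(s, "")
		if stripped == s {
			return s
		}
		s = stripped
	}
}

func main() {
	fmt.Println(normalizeSubject("Re: FW: Q3 budget"))
}
```

Looping until the string stops changing handles stacked prefixes ("Re: Fwd: Re: ...") without needing a more complex pattern.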

Types

type Config

type Config struct {
	// MaxParentAge is the maximum age difference for subject-based matching.
	// Default: 7 days.
	MaxParentAge time.Duration

	// RequireParticipantOverlap requires messages to share at least one
	// participant for subject-based matching. Default: true.
	RequireParticipantOverlap bool

	// SubjectNormalizer is a custom function to normalize subjects.
	// If nil, uses DefaultSubjectNormalizer.
	SubjectNormalizer func(string) string
}

Config contains configuration options for thread reconstruction.

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns the default reconstruction configuration.

type EmbeddedHint

type EmbeddedHint struct {
	// SenderPattern is a pattern to match against participant addresses
	// (e.g., "john.smith" or "john.smith@enron.com").
	SenderPattern string

	// Date is the date of the embedded message (if parseable).
	Date time.Time

	// Subject is the subject of the embedded message (if available).
	Subject string

	// Type indicates the type of embedding: "reply", "forward", "quoted".
	Type string
}

EmbeddedHint represents information about a message embedded in the body, such as a quoted reply or forwarded message.
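
The documentation does not specify how SenderPattern is compared against participant addresses. As an illustration only, the sketch below assumes a case-insensitive substring match, which fits both example patterns in the field comment. The helper matchesSender is hypothetical, and the EmbeddedHint type is copied from the declaration above so that the block is self-contained.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// EmbeddedHint is copied from the package documentation.
type EmbeddedHint struct {
	SenderPattern string
	Date          time.Time
	Subject       string
	Type          string
}

// matchesSender is a hypothetical helper. It reports whether any participant
// address contains the hint's SenderPattern, ignoring case (an assumption).
func matchesSender(h EmbeddedHint, participants []string) bool {
	pattern := strings.ToLower(strings.TrimSpace(h.SenderPattern))
	if pattern == "" {
		return false // an empty pattern carries no information
	}
	for _, addr := range participants {
		if strings.Contains(strings.ToLower(addr), pattern) {
			return true
		}
	}
	return false
}

func main() {
	hint := EmbeddedHint{SenderPattern: "john.smith", Type: "reply"}
	fmt.Println(matchesSender(hint, []string{"John.Smith@enron.com", "jane@example.com"}))
}
```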

type Reconstructor

type Reconstructor struct {
	// contains filtered or unexported fields
}

Reconstructor builds thread relationships across messages.

func NewReconstructor

func NewReconstructor() *Reconstructor

NewReconstructor creates a new thread reconstructor with default config.

func NewReconstructorWithConfig

func NewReconstructorWithConfig(config Config) *Reconstructor

NewReconstructorWithConfig creates a new thread reconstructor with custom config.

func (*Reconstructor) AddMessage

func (r *Reconstructor) AddMessage(msg ThreadableMessage)

AddMessage adds a message to the reconstruction pool.

func (*Reconstructor) AddMessages

func (r *Reconstructor) AddMessages(msgs []ThreadableMessage)

AddMessages adds multiple messages to the reconstruction pool.

func (*Reconstructor) GetThreadingInfo

func (r *Reconstructor) GetThreadingInfo(msgID string) (ThreadingInfo, bool)

GetThreadingInfo returns the threading info for a message.

func (*Reconstructor) GetThreads

func (r *Reconstructor) GetThreads() []*Thread

GetThreads returns all reconstructed threads.

func (*Reconstructor) Reconstruct

func (r *Reconstructor) Reconstruct()

Reconstruct performs thread reconstruction across all messages.

func (*Reconstructor) Stats

func (r *Reconstructor) Stats() Stats

Stats returns threading statistics.

type Stats

type Stats struct {
	TotalMessages        int `json:"total_messages"`
	TotalThreads         int `json:"total_threads"`
	UniqueSubjects       int `json:"unique_subjects"`
	MessagesWithParent   int `json:"messages_with_parent"`
	MessagesWithRefs     int `json:"messages_with_refs"`
	SingleMessageThreads int `json:"single_message_threads"`
	SmallThreads         int `json:"small_threads"`  // 2-5 messages
	MediumThreads        int `json:"medium_threads"` // 6-20 messages
	LargeThreads         int `json:"large_threads"`  // 21+ messages
}

Stats contains statistics about thread reconstruction.

type Thread

type Thread struct {
	// ID is a unique identifier for the thread.
	ID string `json:"id"`

	// Subject is the normalized subject of the thread.
	Subject string `json:"subject"`

	// RootMessageID is the MessageID of the first message in the thread.
	RootMessageID string `json:"root_message_id"`

	// MessageIDs contains all message IDs in the thread, sorted by date.
	MessageIDs []string `json:"message_ids"`

	// Participants contains all unique email addresses in the thread.
	Participants []string `json:"participants"`

	// StartDate is the date of the first message.
	StartDate time.Time `json:"start_date"`

	// EndDate is the date of the last message.
	EndDate time.Time `json:"end_date"`

	// Size is the number of messages in the thread.
	Size int `json:"size"`
}

Thread represents a collection of related messages.

type ThreadableMessage

type ThreadableMessage interface {
	// GetMessageID returns the unique message identifier.
	GetMessageID() string

	// GetDate returns the message date.
	GetDate() time.Time

	// GetSubject returns the message subject.
	GetSubject() string

	// GetInReplyTo returns the In-Reply-To header value (may be empty).
	GetInReplyTo() string

	// GetReferences returns the References header values (may be empty).
	GetReferences() []string

	// GetParticipants returns all email addresses involved in the message
	// (From, To, Cc, Bcc).
	GetParticipants() []string

	// GetEmbeddedMessageHints returns hints about embedded/quoted messages
	// that can be used for threading when headers are missing.
	GetEmbeddedMessageHints() []EmbeddedHint

	// SetThreadingInfo is called after reconstruction to provide
	// the computed threading information back to the message.
	SetThreadingInfo(info ThreadingInfo)
}

ThreadableMessage is the interface that messages must implement for thread reconstruction.
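
A minimal sketch of a type that satisfies this interface is shown below. The Email struct and its fields are hypothetical. The interface, EmbeddedHint, and ThreadingInfo are copied from the declarations on this page so that the block compiles without the package import; in real code they would come from the threading package.

```go
package main

import (
	"fmt"
	"time"
)

// Declarations copied from the package documentation (comments omitted).
type EmbeddedHint struct {
	SenderPattern string
	Date          time.Time
	Subject       string
	Type          string
}

type ThreadingInfo struct {
	ThreadID   string
	ParentID   string
	References []string
	Depth      int
}

type ThreadableMessage interface {
	GetMessageID() string
	GetDate() time.Time
	GetSubject() string
	GetInReplyTo() string
	GetReferences() []string
	GetParticipants() []string
	GetEmbeddedMessageHints() []EmbeddedHint
	SetThreadingInfo(info ThreadingInfo)
}

// Email is a hypothetical message type used only for this sketch.
type Email struct {
	MessageID  string
	Date       time.Time
	Subject    string
	InReplyTo  string
	References []string
	From       string
	To         []string
	Info       ThreadingInfo // filled in by SetThreadingInfo
}

func (e *Email) GetMessageID() string    { return e.MessageID }
func (e *Email) GetDate() time.Time      { return e.Date }
func (e *Email) GetSubject() string      { return e.Subject }
func (e *Email) GetInReplyTo() string    { return e.InReplyTo }
func (e *Email) GetReferences() []string { return e.References }

// GetParticipants returns the sender followed by the recipients.
func (e *Email) GetParticipants() []string {
	return append([]string{e.From}, e.To...)
}

// GetEmbeddedMessageHints returns nil: this sketch does not parse bodies.
func (e *Email) GetEmbeddedMessageHints() []EmbeddedHint { return nil }

// SetThreadingInfo stores the computed info on the message.
func (e *Email) SetThreadingInfo(info ThreadingInfo) { e.Info = info }

// Compile-time check that *Email satisfies the interface.
var _ ThreadableMessage = (*Email)(nil)

func main() {
	e := &Email{MessageID: "m1", From: "a@example.com", To: []string{"b@example.com"}}
	var m ThreadableMessage = e
	m.SetThreadingInfo(ThreadingInfo{ThreadID: "t1"})
	fmt.Println(m.GetParticipants(), e.Info.ThreadID)
}
```

The methods use a pointer receiver so that SetThreadingInfo updates the original value rather than a copy. With a value receiver the stored info would be lost.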

type ThreadingInfo

type ThreadingInfo struct {
	// ThreadID is a unique identifier for the thread this message belongs to.
	ThreadID string

	// ParentID is the MessageID of the parent message in the thread.
	// Empty if this is a root message.
	ParentID string

	// References is the reconstructed chain of message IDs leading to this message.
	References []string

	// Depth is the nesting depth in the thread (0 for root messages).
	Depth int
}

ThreadingInfo contains the computed threading information for a message.
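
To show how ParentID, References, and Depth relate, the sketch below derives them from a child-to-parent map. The function resolve and the map representation are hypothetical. Using the root message ID as ThreadID is an assumption; the documentation only says that ThreadID is unique per thread.

```go
package main

import "fmt"

// ThreadingInfo is copied from the package documentation.
type ThreadingInfo struct {
	ThreadID   string
	ParentID   string
	References []string
	Depth      int
}

// resolve is a hypothetical helper. It walks the child-to-parent map upward
// from id and returns the ancestor chain ordered from root to direct parent.
func resolve(parents map[string]string, id string) ThreadingInfo {
	var chain []string
	visited := map[string]bool{id: true} // guards against cycles in bad data
	for cur := id; ; {
		p, ok := parents[cur]
		if !ok || p == "" || visited[p] {
			break
		}
		visited[p] = true
		chain = append(chain, p)
		cur = p
	}
	// chain is parent-first; reverse it so the root comes first.
	for i, j := 0, len(chain)-1; i < j; i, j = i+1, j-1 {
		chain[i], chain[j] = chain[j], chain[i]
	}
	info := ThreadingInfo{ThreadID: id, References: chain, Depth: len(chain)}
	if len(chain) > 0 {
		info.ThreadID = chain[0] // assumption: thread ID is the root message ID
		info.ParentID = chain[len(chain)-1]
	}
	return info
}

func main() {
	parents := map[string]string{"c": "b", "b": "a"}
	fmt.Printf("%+v\n", resolve(parents, "c"))
	fmt.Printf("%+v\n", resolve(parents, "a"))
}
```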
