Documentation ¶
Overview ¶
Package sync provides synchronization logic between Notion and local storage.
Index ¶
- Variables
- type Crawler
- func (c *Crawler) AddDatabase(ctx context.Context, databaseID, folder string, forceUpdate bool) error
- func (c *Crawler) AddRootPage(ctx context.Context, pageID, folder string, forceUpdate bool) error
- func (c *Crawler) Commit(ctx context.Context, message string) error
- func (c *Crawler) CommitChanges(ctx context.Context, message string) error
- func (c *Crawler) EnsureTransaction(ctx context.Context) error
- func (c *Crawler) GetPage(ctx context.Context, pageID string, folder string) error
- func (c *Crawler) GetStatus(ctx context.Context, folderFilter string) (*StatusInfo, error)
- func (c *Crawler) ListPages(ctx context.Context, folderFilter string, asTree bool) ([]*FolderInfo, error)
- func (c *Crawler) ProcessQueue(ctx context.Context, folderFilter string, maxPages int, maxFiles int, ...) error
- func (c *Crawler) ProcessQueueWithCallback(ctx context.Context, folderFilter string, ...) error
- func (c *Crawler) Pull(ctx context.Context, opts PullOptions) (*PullResult, error)
- func (c *Crawler) Reindex(ctx context.Context, dryRun bool) error
- func (c *Crawler) ScanPage(ctx context.Context, pageID string) error
- func (c *Crawler) SetTransaction(tx store.Transaction)
- func (c *Crawler) Transaction() store.Transaction
- type CrawlerOption
- type FileManifest
- type FileRegistry
- type FolderInfo
- type FolderStatus
- type PageInfo
- type PageRegistry
- type PullOptions
- type PullResult
- type QueueCallback
- type QueueInfo
- type State
- type StatusInfo
Constants ¶
This section is empty.
Variables ¶
var ErrFileTooLarge = errors.New("file exceeds maximum size limit")
ErrFileTooLarge is returned when a file exceeds the maximum size limit.
Functions ¶
This section is empty.
Types ¶
type Crawler ¶
type Crawler struct {
// contains filtered or unexported fields
}
Crawler synchronizes Notion pages to local storage using folder-based organization.
func NewCrawler ¶
NewCrawler creates a new crawler.
func (*Crawler) AddDatabase ¶
func (c *Crawler) AddDatabase(ctx context.Context, databaseID, folder string, forceUpdate bool) error
AddDatabase adds all pages from a database to a folder.
func (*Crawler) AddRootPage ¶
func (c *Crawler) AddRootPage(ctx context.Context, pageID, folder string, forceUpdate bool) error
AddRootPage adds a page as a root page in a folder and queues it for syncing.
func (*Crawler) Commit ¶ added in v0.4.0
func (c *Crawler) Commit(ctx context.Context, message string) error
Commit commits the current transaction with the given message. After the commit, a new transaction is started automatically.
func (*Crawler) CommitChanges ¶
func (c *Crawler) CommitChanges(ctx context.Context, message string) error
CommitChanges commits pending changes to git.
func (*Crawler) EnsureTransaction ¶ added in v0.4.0
func (c *Crawler) EnsureTransaction(ctx context.Context) error
EnsureTransaction ensures a transaction is available. If no transaction exists, a new one is created.
func (*Crawler) GetPage ¶
func (c *Crawler) GetPage(ctx context.Context, pageID string, folder string) error
GetPage fetches a single page and places it in the correct location based on its parent hierarchy. Unlike AddRootPage, this does not mark the page as a root page. If folder is empty, it is determined from the parent chain.
func (*Crawler) ListPages ¶
func (c *Crawler) ListPages(ctx context.Context, folderFilter string, asTree bool) ([]*FolderInfo, error)
ListPages returns page information for display.
func (*Crawler) ProcessQueue ¶
func (c *Crawler) ProcessQueue(ctx context.Context, folderFilter string, maxPages int, maxFiles int, maxQueueFiles int, maxTime time.Duration) error
ProcessQueue processes all queue entries, optionally filtered by folder. maxPages limits the number of pages to fetch (0 = unlimited). maxTime limits the duration of the sync (0 = unlimited).
func (*Crawler) ProcessQueueWithCallback ¶
func (c *Crawler) ProcessQueueWithCallback(ctx context.Context, folderFilter string, maxPages, maxFiles, maxQueueFiles int, maxTime time.Duration, callback QueueCallback) error
ProcessQueueWithCallback is like ProcessQueue but calls the callback after each queue file is processed.
func (*Crawler) Pull ¶
func (c *Crawler) Pull(ctx context.Context, opts PullOptions) (*PullResult, error)
Pull fetches all pages changed since the last pull and queues them for sync.
func (*Crawler) ScanPage ¶
func (c *Crawler) ScanPage(ctx context.Context, pageID string) error
ScanPage re-scans a page to discover all child pages and queues them. This is useful for discovering new child pages under an existing root page.
func (*Crawler) SetTransaction ¶ added in v0.4.0
func (c *Crawler) SetTransaction(tx store.Transaction)
SetTransaction sets an external transaction.
func (*Crawler) Transaction ¶ added in v0.4.0
func (c *Crawler) Transaction() store.Transaction
Transaction returns the current transaction.
type CrawlerOption ¶
type CrawlerOption func(*Crawler)
CrawlerOption configures the crawler.
func WithCrawlerLogger ¶
func WithCrawlerLogger(l *slog.Logger) CrawlerOption
WithCrawlerLogger sets a custom logger.
type FileManifest ¶
type FileManifest struct {
FileID string `json:"file_id"`
ParentPageID string `json:"parent_page_id"`
DownloadedAt time.Time `json:"downloaded_at"`
}
FileManifest is stored alongside downloaded files as (unknown).meta.json. It contains metadata for local file identification.
type FileRegistry ¶
type FileRegistry struct {
ID string `json:"id"` // File ID extracted from S3 URL
FilePath string `json:"file_path"` // Local file path (directory + name)
SourceURL string `json:"source_url"` // Original S3 URL
LastSynced time.Time `json:"last_synced"`
}
FileRegistry is stored in .notion-sync/ids/file-{id}.json. It contains metadata for tracking downloaded files (images, attachments, etc.).
type FolderInfo ¶
type FolderInfo struct {
Name string
RootPages int
TotalPages int
OrphanedPages int
Pages []*PageInfo
}
FolderInfo contains information about a folder.
type FolderStatus ¶
type FolderStatus struct {
Name string
PageCount int
RootPages int
LastSynced *time.Time
QueuedPages int
}
FolderStatus contains status for a specific folder.
type PageInfo ¶
type PageInfo struct {
ID string
Title string
Path string
LastSynced time.Time
IsRoot bool
IsOrphaned bool
ParentID string
Children []*PageInfo
}
PageInfo contains displayable information about a page.
type PageRegistry ¶
type PageRegistry struct {
ID string `json:"id"`
Type string `json:"type"` // "page" or "database"
Folder string `json:"folder"`
FilePath string `json:"file_path"`
Title string `json:"title"`
LastEdited time.Time `json:"last_edited"`
LastSynced time.Time `json:"last_synced"`
IsRoot bool `json:"is_root"`
ParentID string `json:"parent_id,omitempty"`
Children []string `json:"children,omitempty"`
ContentHash string `json:"content_hash,omitempty"`
}
PageRegistry is stored in .notion-sync/ids/page-{id}.json. It contains all metadata needed to locate and identify a page or database.
type PullOptions ¶
type PullOptions struct {
Folder string // Filter to specific folder (empty = all folders)
Since time.Duration // Override for cutoff time (0 = use LastPullTime)
MaxPages int // Maximum number of pages to queue (0 = unlimited)
All bool // Include pages not yet tracked
DryRun bool // Preview without modifying
Verbose bool // Show detailed output
}
PullOptions configures the pull operation.
type PullResult ¶
type PullResult struct {
PagesFound int
PagesQueued int
PagesSkipped int
NewPages int
UpdatedPages int
CutoffTime time.Time
}
PullResult contains the result of a pull operation.
type QueueCallback ¶
type QueueCallback func() error
QueueCallback is called after each queue file is processed (written or deleted).
type State ¶
type State struct {
Version int `json:"version"`
Folders []string `json:"folders"`
LastPullTime *time.Time `json:"last_pull_time,omitempty"`
OldestPullResult *time.Time `json:"oldest_pull_result,omitempty"` // Oldest page seen in last pull
}
State is persisted in .notion-sync/state.json. It is simplified to contain only folder names. Page metadata is stored in:
- Frontmatter of markdown files (last_synced, file_path)
- Page registries (.notion-sync/ids/page-{id}.json)
type StatusInfo ¶
type StatusInfo struct {
FolderCount int
TotalPages int
TotalRootPages int
QueueEntries []*QueueInfo
Folders map[string]*FolderStatus
}
StatusInfo contains sync status information.