notion

package
v0.13.0 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 19, 2026 License: Apache-2.0 Imports: 15 Imported by: 0

README

Notion

Extract page and database metadata from a Notion workspace using the Notion API.

Usage

source:
  name: notion
  scope: my-workspace
  config:
    token: ntn_your_integration_token
    extract:
      - pages
      - databases

Configuration

| Key | Type | Required | Description |
| --- | --- | --- | --- |
| token | string | Yes | Notion internal integration token. |
| base_url | string | No | Override Notion API base URL. Defaults to https://api.notion.com. |
| extract | []string | No | Entity types to extract. Defaults to all: pages, databases. |
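
The defaulting rule for extract can be sketched as follows. This is a minimal illustration, not the extractor's own code: the helper name resolveExtract and the allEntityTypes variable are hypothetical, and only the two entity types from the table are assumed to exist.

```go
package main

import "fmt"

// allEntityTypes lists the entity types named in the configuration table.
var allEntityTypes = []string{"pages", "databases"}

// resolveExtract (hypothetical helper) returns the set of entity types to
// extract, falling back to every known type when the list is empty.
func resolveExtract(configured []string) map[string]bool {
	if len(configured) == 0 {
		configured = allEntityTypes
	}
	enabled := make(map[string]bool, len(configured))
	for _, t := range configured {
		enabled[t] = true
	}
	return enabled
}

func main() {
	fmt.Println(resolveExtract(nil))               // extract omitted: all types enabled
	fmt.Println(resolveExtract([]string{"pages"})) // only pages enabled
}
```

Returning a set keeps the later "should I extract databases?" checks to a single map lookup.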

Entities

The extractor emits document entities for both pages and databases.

Entity: document (page)

| Field | Sample Value |
| --- | --- |
| urn | urn:notion:my-workspace:document:abc123-def456 |
| name | Data Pipeline Architecture |
| properties.page_id | abc123-def456 |
| properties.created_at | 2024-01-15T10:30:00Z |
| properties.updated_at | 2024-03-20T14:15:00Z |
| properties.created_by | Alice |
| properties.last_edited_by | Bob |
| properties.web_url | https://www.notion.so/Data-Pipeline-abc123 |
| properties.archived | false |

Entity: document (database)

| Field | Sample Value |
| --- | --- |
| urn | urn:notion:my-workspace:document:db-789 |
| name | Project Tracker |
| description | Track all engineering projects |
| properties.database_id | db-789 |
| properties.created_at | 2024-01-10T09:00:00Z |
| properties.updated_at | 2024-03-18T16:00:00Z |
| properties.created_by | Alice |
| properties.columns | ["Name", "Status", "Priority"] |
| properties.web_url | https://www.notion.so/db-789 |

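
Both sample URNs follow the layout urn:notion:<scope>:document:<id>, where scope comes from the source configuration. A minimal sketch of assembling one is shown below; the helper name documentURN is hypothetical, and the layout is inferred from the sample values only.

```go
package main

import "fmt"

// documentURN (hypothetical helper) assembles a URN in the layout seen in the
// sample values: urn:notion:<scope>:document:<id>.
func documentURN(scope, id string) string {
	return fmt.Sprintf("urn:notion:%s:document:%s", scope, id)
}

func main() {
	fmt.Println(documentURN("my-workspace", "abc123-def456")) // page sample
	fmt.Println(documentURN("my-workspace", "db-789"))        // database sample
}
```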
Edges

| Type | Source | Target | Description |
| --- | --- | --- | --- |
| child_of | document | document | Page is a child of another page |
| belongs_to | document | document | Page belongs to a database |
| owned_by | document | user | Page/database is owned by its creator |
| documented_by | document | any | Page references a data asset via URN in its content |

URN Reference Detection

The extractor reads page block content and scans for URN patterns (urn:service:scope:type:id), emitting documented_by edges to link documentation to data assets.
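
The scanning step might look roughly like the sketch below. The regular expression is an assumption made for illustration (the extractor's actual pattern is not shown here), and findURNs is a hypothetical helper that also removes duplicates.

```go
package main

import (
	"fmt"
	"regexp"
)

// urnPattern is an assumed pattern for five-part URNs of the form
// urn:service:scope:type:id. The final segment may contain dots and slashes
// but must end in a letter, digit, underscore, or hyphen, so trailing
// sentence punctuation is not captured.
var urnPattern = regexp.MustCompile(
	`urn:[A-Za-z0-9_.-]+:[A-Za-z0-9_.-]+:[A-Za-z0-9_.-]+:[A-Za-z0-9_./-]*[A-Za-z0-9_-]`)

// findURNs (hypothetical helper) returns the distinct URNs found in text,
// in order of first appearance.
func findURNs(text string) []string {
	seen := make(map[string]bool)
	var urns []string
	for _, m := range urnPattern.FindAllString(text, -1) {
		if !seen[m] {
			seen[m] = true
			urns = append(urns, m)
		}
	}
	return urns
}

func main() {
	text := "Loads urn:bigquery:prod:table:sales.orders, then joins urn:kafka:prod:topic:events."
	for _, u := range findURNs(text) {
		fmt.Println(u)
	}
}
```

Each URN returned this way would become the target of one documented_by edge from the page's document entity.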

Contributing

Refer to the contribution guidelines for information on contributing to this module.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Block

type Block struct {
	ID   string `json:"id"`
	Type string `json:"type"`
	// Each supported block type is decoded into the same content shape so its text can be scanned for URNs.
	Paragraph    *blockContent `json:"paragraph,omitempty"`
	Heading1     *blockContent `json:"heading_1,omitempty"`
	Heading2     *blockContent `json:"heading_2,omitempty"`
	Heading3     *blockContent `json:"heading_3,omitempty"`
	BulletedList *blockContent `json:"bulleted_list_item,omitempty"`
	NumberedList *blockContent `json:"numbered_list_item,omitempty"`
	Quote        *blockContent `json:"quote,omitempty"`
	Callout      *blockContent `json:"callout,omitempty"`
	Code         *blockContent `json:"code,omitempty"`
}

Block represents a Notion block (used for reading page content).

func (*Block) PlainText

func (b *Block) PlainText() string

PlainText extracts all plain text from a block.
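
One way such a method could work is sketched below with local stand-in types. The shape of the unexported blockContent type is not documented here, so the content struct and its rich_text field are assumptions based on how the Notion API nests rich text under each block type; only two block types are mirrored to keep the sketch short.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type richText struct {
	PlainText string `json:"plain_text"`
}

// content is an assumed stand-in for the unexported blockContent type.
type content struct {
	RichText []richText `json:"rich_text"`
}

// block mirrors a subset of the documented Block fields.
type block struct {
	Type      string   `json:"type"`
	Paragraph *content `json:"paragraph,omitempty"`
	Heading1  *content `json:"heading_1,omitempty"`
}

// plainText concatenates the plain_text of every rich text fragment found in
// whichever content fields are populated.
func (b *block) plainText() string {
	var sb strings.Builder
	for _, c := range []*content{b.Paragraph, b.Heading1} {
		if c == nil {
			continue
		}
		for _, rt := range c.RichText {
			sb.WriteString(rt.PlainText)
		}
	}
	return sb.String()
}

func main() {
	raw := `{"type":"paragraph","paragraph":{"rich_text":[{"plain_text":"Owned by "},{"plain_text":"urn:kafka:prod:topic:events"}]}}`
	var b block
	if err := json.Unmarshal([]byte(raw), &b); err != nil {
		panic(err)
	}
	fmt.Println(b.plainText())
}
```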

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client wraps the Notion API.

func NewClient

func NewClient(token string) *Client

NewClient creates a new Notion API client.

func (*Client) GetBlockChildren

func (c *Client) GetBlockChildren(ctx context.Context, blockID string) ([]Block, error)

GetBlockChildren returns the top-level blocks of a page or block.

func (*Client) SearchDatabases

func (c *Client) SearchDatabases(ctx context.Context) ([]Database, error)

SearchDatabases returns all databases.

func (*Client) SearchPages

func (c *Client) SearchPages(ctx context.Context) ([]Page, error)

SearchPages returns all pages. The signature takes no query parameter, so results are not filtered.

func (*Client) SetBaseURL

func (c *Client) SetBaseURL(url string)

SetBaseURL overrides the API base URL (used for testing).

type Config

type Config struct {
	Token   string   `json:"token" yaml:"token" mapstructure:"token" validate:"required"`
	BaseURL string   `json:"base_url" yaml:"base_url" mapstructure:"base_url" default:"https://api.notion.com"`
	Extract []string `json:"extract" yaml:"extract" mapstructure:"extract" validate:"omitempty,dive,oneof=pages databases"`
}

type Database

type Database struct {
	ID             string         `json:"id"`
	CreatedTime    time.Time      `json:"created_time"`
	LastEditedTime time.Time      `json:"last_edited_time"`
	CreatedBy      User           `json:"created_by"`
	LastEditedBy   User           `json:"last_edited_by"`
	Title          []RichText     `json:"title"`
	Description    []RichText     `json:"description"`
	Archived       bool           `json:"archived"`
	URL            string         `json:"url"`
	Parent         Parent         `json:"parent"`
	Properties     map[string]any `json:"properties"`
}

Database represents a Notion database.
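
The name and properties.columns values in the database entity could plausibly be derived from Title and Properties as sketched below. Both helpers are hypothetical, and treating the keys of Properties as column names is an assumption. Go map iteration order is random, so the sketch sorts the keys; the resulting order may therefore differ from the order shown in the sample value.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// richText mirrors the documented RichText type.
type richText struct {
	PlainText string `json:"plain_text"`
}

// joinRichText (hypothetical helper) concatenates rich text fragments into
// one string, for example to build a database title.
func joinRichText(parts []richText) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p.PlainText)
	}
	return sb.String()
}

// columnNames (hypothetical helper) returns the property names of a database
// in sorted order so the output is deterministic.
func columnNames(props map[string]any) []string {
	names := make([]string, 0, len(props))
	for name := range props {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	title := []richText{{PlainText: "Project "}, {PlainText: "Tracker"}}
	props := map[string]any{"Name": nil, "Status": nil, "Priority": nil}
	fmt.Println(joinRichText(title))
	fmt.Println(columnNames(props))
}
```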

type Extractor

type Extractor struct {
	plugins.BaseExtractor
	// contains filtered or unexported fields
}

func New

func New(logger log.Logger) *Extractor

func (*Extractor) Extract

func (e *Extractor) Extract(ctx context.Context, emit plugins.Emit) error

func (*Extractor) Init

func (e *Extractor) Init(ctx context.Context, config plugins.Config) error

type Page

type Page struct {
	ID             string         `json:"id"`
	CreatedTime    time.Time      `json:"created_time"`
	LastEditedTime time.Time      `json:"last_edited_time"`
	CreatedBy      User           `json:"created_by"`
	LastEditedBy   User           `json:"last_edited_by"`
	Archived       bool           `json:"archived"`
	URL            string         `json:"url"`
	Parent         Parent         `json:"parent"`
	Properties     map[string]any `json:"properties"`
}

Page represents a Notion page.
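
The JSON tags above mean a page object decodes directly with encoding/json, including the RFC 3339 timestamps. The sketch below uses local stand-in types that mirror a subset of the documented fields, and the payload is illustrative, built from the README sample values rather than from a real API response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// parent and page mirror a subset of the documented Parent and Page fields so
// the sketch compiles on its own.
type parent struct {
	Type       string `json:"type"`
	PageID     string `json:"page_id,omitempty"`
	DatabaseID string `json:"database_id,omitempty"`
}

type page struct {
	ID             string    `json:"id"`
	CreatedTime    time.Time `json:"created_time"`
	LastEditedTime time.Time `json:"last_edited_time"`
	Archived       bool      `json:"archived"`
	URL            string    `json:"url"`
	Parent         parent    `json:"parent"`
}

// decodePage (hypothetical helper) parses one page object from JSON.
func decodePage(raw string) (page, error) {
	var p page
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	raw := `{
		"id": "abc123-def456",
		"created_time": "2024-01-15T10:30:00Z",
		"last_edited_time": "2024-03-20T14:15:00Z",
		"archived": false,
		"url": "https://www.notion.so/Data-Pipeline-abc123",
		"parent": {"type": "database_id", "database_id": "db-789"}
	}`
	p, err := decodePage(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.ID, p.CreatedTime.Format(time.RFC3339), p.Parent.Type)
}
```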

type Parent

type Parent struct {
	Type        string `json:"type"`
	PageID      string `json:"page_id,omitempty"`
	DatabaseID  string `json:"database_id,omitempty"`
	WorkspaceID string `json:"workspace,omitempty"`
}

Parent represents the parent of a page or database.
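
The child_of and belongs_to edges from the README map naturally onto the parent type. The sketch below shows one possible mapping; the parentEdge helper is hypothetical, and it assumes the Type field carries the Notion API values "page_id" and "database_id", with any other value (such as a workspace-level parent) producing no edge.

```go
package main

import "fmt"

// parent mirrors the documented Parent fields used here.
type parent struct {
	Type       string
	PageID     string
	DatabaseID string
}

// parentEdge (hypothetical helper) returns the edge type and target ID implied
// by a parent. ok is false when the parent yields no edge.
func parentEdge(p parent) (edgeType, targetID string, ok bool) {
	switch p.Type {
	case "page_id":
		return "child_of", p.PageID, true
	case "database_id":
		return "belongs_to", p.DatabaseID, true
	default:
		return "", "", false
	}
}

func main() {
	for _, p := range []parent{
		{Type: "page_id", PageID: "abc123-def456"},
		{Type: "database_id", DatabaseID: "db-789"},
		{Type: "workspace"},
	} {
		edge, target, ok := parentEdge(p)
		fmt.Println(edge, target, ok)
	}
}
```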

type RichText

type RichText struct {
	PlainText string `json:"plain_text"`
}

RichText represents a Notion rich text object.

type User

type User struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

User represents a Notion user.
