mcp-data-platform

Module version: v0.1.0
Published: Jan 21, 2026 License: Apache-2.0

txn2/mcp-data-platform

A semantic data platform MCP server that composes multiple data tools with bidirectional cross-injection: each tool response automatically includes relevant context from the other services.

Features

  • Semantic-First Data Access: All data queries include business context from DataHub
  • Bidirectional Cross-Injection:
    • Trino results enriched with DataHub metadata (owners, tags, deprecation)
    • DataHub searches include query availability from Trino
    • S3 listings enriched with matching DataHub datasets
    • DataHub S3 searches include storage availability
  • OAuth 2.1 Authentication: OIDC, API keys, PKCE, Dynamic Client Registration
  • Role-Based Personas: Tool filtering with wildcard patterns (allow/deny rules)
  • Comprehensive Audit Logging: PostgreSQL-backed audit trail
  • Middleware Architecture: Extensible request/response processing

Architecture

graph LR
    subgraph "MCP Data Platform"
        DataHub[DataHub<br/>Semantic Metadata]
        Platform[Platform<br/>Bridge]
        Trino[Trino<br/>Query Engine]
        S3[S3<br/>Object Storage]

        DataHub <-->|"enrichment"| Platform
        Platform <-->|"enrichment"| Trino
        Platform <-->|"enrichment"| S3
    end

    Client([MCP Client]) --> Platform
    Platform --> Client

Cross-Injection Flow:

  • Trino → DataHub: Query results include owners, tags, glossary terms, deprecation warnings
  • DataHub → Trino: Search results include query availability and sample SQL
  • S3 → DataHub: Object listings include matching dataset metadata from DataHub
  • DataHub → S3: Search results for S3 datasets include storage availability
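The Trino → DataHub direction of the flow above can be pictured as a wrapper around a tool handler. The following is a minimal sketch only, not the platform's actual middleware API: `Handler`, `Metadata`, `MetadataLookup`, `WithSemanticContext`, and the table name `sales.orders` are all hypothetical names introduced for illustration.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Handler is a hypothetical tool handler: it receives a table name and returns text.
type Handler func(ctx context.Context, table string) (string, error)

// Metadata is a hypothetical subset of what a semantic provider might return.
type Metadata struct {
	Owners     []string
	Tags       []string
	Deprecated bool
}

// MetadataLookup is a hypothetical semantic lookup, for example one backed by DataHub.
type MetadataLookup func(ctx context.Context, table string) (Metadata, bool)

// WithSemanticContext wraps next so that its result is followed by
// metadata for the same table whenever the lookup finds any.
func WithSemanticContext(lookup MetadataLookup, next Handler) Handler {
	return func(ctx context.Context, table string) (string, error) {
		out, err := next(ctx, table)
		if err != nil {
			return "", err
		}
		md, ok := lookup(ctx, table)
		if !ok {
			return out, nil // no metadata known: return the result unchanged
		}
		var b strings.Builder
		b.WriteString(out)
		fmt.Fprintf(&b, "\nowners: %s", strings.Join(md.Owners, ", "))
		fmt.Fprintf(&b, "\ntags: %s", strings.Join(md.Tags, ", "))
		if md.Deprecated {
			b.WriteString("\nwarning: dataset is deprecated")
		}
		return b.String(), nil
	}
}

// demo builds a wrapped handler from stubbed query and lookup functions.
func demo(table string) (string, error) {
	query := func(ctx context.Context, table string) (string, error) {
		return "3 rows from " + table, nil
	}
	lookup := func(ctx context.Context, table string) (Metadata, bool) {
		if table != "sales.orders" {
			return Metadata{}, false
		}
		return Metadata{Owners: []string{"data-team"}, Tags: []string{"pii"}, Deprecated: true}, true
	}
	return WithSemanticContext(lookup, query)(context.Background(), table)
}

func main() {
	out, err := demo("sales.orders")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The other three directions would follow the same shape under this assumption: run the primary tool, look up the matching entity in the other service, and append whatever is found without failing the original call.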

Installation

Go Install
go install github.com/txn2/mcp-data-platform/cmd/mcp-data-platform@latest

From Source
git clone https://github.com/txn2/mcp-data-platform.git
cd mcp-data-platform
go build -o mcp-data-platform ./cmd/mcp-data-platform

Quick Start

Standalone Server
# Run with stdio transport (default)
./mcp-data-platform

# Run with configuration file
./mcp-data-platform --config configs/platform.yaml

# Run with SSE transport
./mcp-data-platform --transport sse --address :8080

Claude Code CLI
claude mcp add mcp-data-platform -- mcp-data-platform

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "mcp-data-platform": {
      "command": "mcp-data-platform",
      "args": ["--config", "/path/to/platform.yaml"]
    }
  }
}

Configuration

Create a platform.yaml configuration file:

server:
  name: mcp-data-platform
  transport: stdio

auth:
  oidc:
    enabled: true
    issuer: "https://auth.example.com/realms/platform"
    client_id: "mcp-data-platform"
  api_keys:
    enabled: true
    keys:
      - key: "${API_KEY_ADMIN}"
        name: "admin"
        roles: ["admin"]

personas:
  definitions:
    analyst:
      display_name: "Data Analyst"
      roles: ["analyst"]
      tools:
        allow: ["trino_*", "datahub_*"]
        deny: ["*_delete_*"]
    admin:
      display_name: "Administrator"
      roles: ["admin"]
      tools:
        allow: ["*"]
  default_persona: analyst

semantic:
  provider: datahub
  cache:
    enabled: true
    ttl: 5m

injection:
  trino_semantic_enrichment: true
  datahub_query_enrichment: true

audit:
  enabled: true
  log_tool_calls: true
  retention_days: 90

database:
  dsn: "${DATABASE_URL}"
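
The allow and deny patterns under personas in the configuration above could be evaluated roughly as follows. This is a sketch under stated assumptions rather than the code in pkg/persona: the `ToolRules` type, the deny-before-allow precedence, and the example tool names are hypothetical, and the standard library's `path.Match` stands in for whatever wildcard matcher the platform uses.

```go
package main

import (
	"fmt"
	"path"
)

// ToolRules is a hypothetical stand-in for a persona's tools section.
type ToolRules struct {
	Allow []string
	Deny  []string
}

// matchesAny reports whether name matches at least one glob pattern.
func matchesAny(patterns []string, name string) bool {
	for _, p := range patterns {
		// path.Match returns an error only for malformed patterns;
		// a malformed pattern is treated here as a non-match.
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// Allowed checks deny rules first, then allow rules (an assumed precedence).
// A tool that matches neither list is rejected.
func (r ToolRules) Allowed(tool string) bool {
	if matchesAny(r.Deny, tool) {
		return false
	}
	return matchesAny(r.Allow, tool)
}

func main() {
	analyst := ToolRules{
		Allow: []string{"trino_*", "datahub_*"},
		Deny:  []string{"*_delete_*"},
	}
	// The tool names below are illustrative only.
	for _, tool := range []string{"trino_query", "datahub_delete_entity", "s3_list_objects"} {
		fmt.Printf("%s allowed=%v\n", tool, analyst.Allowed(tool))
	}
}
```

Checking deny before allow means a broad allow such as `"*"` cannot accidentally re-enable a tool that a deny rule excludes, which is the usual reason for that ordering.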

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| DATABASE_URL | PostgreSQL connection string for audit logs | - |
| API_KEY_ADMIN | Admin API key (if using API key auth) | - |
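
The README does not say how `${DATABASE_URL}` and `${API_KEY_ADMIN}` placeholders are resolved when the configuration is loaded. One plausible approach, shown here purely as an assumption, is to expand them with the standard library's `os.Expand` before parsing the YAML; `expandConfig` is a hypothetical helper, not part of the platform.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig substitutes ${VAR} references in raw using lookup and returns
// the expanded text plus the names of any variables that were not set.
// Note that os.Expand also expands the bare $VAR form.
func expandConfig(raw string, lookup func(string) (string, bool)) (string, []string) {
	var missing []string
	expanded := os.Expand(raw, func(name string) string {
		value, ok := lookup(name)
		if !ok {
			missing = append(missing, name)
		}
		return value // empty string when the variable is unset
	})
	return expanded, missing
}

func main() {
	raw := "database:\n  dsn: \"${DATABASE_URL}\"\n"
	// os.LookupEnv has exactly the signature expandConfig expects.
	expanded, missing := expandConfig(raw, os.LookupEnv)
	if len(missing) > 0 {
		fmt.Println("unset variables:", missing)
	}
	fmt.Print(expanded)
}
```

Reporting the unset names separately makes it possible to fail fast at startup instead of connecting with an empty DSN or accepting an empty API key.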

Core Packages

| Package | Description |
|---------|-------------|
| pkg/platform | Main platform facade and configuration |
| pkg/auth | OIDC and API key authentication |
| pkg/oauth | OAuth 2.1 server with DCR and PKCE |
| pkg/persona | Role-based personas and tool filtering |
| pkg/semantic | Semantic metadata provider abstraction |
| pkg/query | Query execution provider abstraction |
| pkg/middleware | Request/response middleware chain |
| pkg/registry | Toolkit registration and management |
| pkg/audit | Audit logging with PostgreSQL storage |
| pkg/tuning | Prompts, hints, and operational rules |
| pkg/storage | S3-compatible storage provider abstraction |
| pkg/toolkits | Toolkit implementations (Trino, DataHub, S3) |
| pkg/client | Platform client utilities |

Development

# Run tests with race detection
go test -race ./...

# Run linter
golangci-lint run ./...

# Run security scan
gosec ./...

# Build
go build -o mcp-data-platform ./cmd/mcp-data-platform

Library Usage

The platform can be imported and used as a library:

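package main

import (
    "context"
    "log"

    "github.com/txn2/mcp-data-platform/pkg/platform"
)

// run keeps setup in one function so the deferred Close executes
// before main exits on error.
func run(ctx context.Context) error {
    // Load configuration
    cfg, err := platform.LoadConfig("platform.yaml")
    if err != nil {
        return err
    }

    // Create platform
    p, err := platform.New(platform.WithConfig(cfg))
    if err != nil {
        return err
    }
    defer p.Close()

    // Start the platform
    if err := p.Start(ctx); err != nil {
        return err
    }

    // Access the MCP server
    mcpServer := p.MCPServer()
    _ = mcpServer // placeholder so the example compiles; use the server here

    return nil
}

func main() {
    if err := run(context.Background()); err != nil {
        log.Fatal(err)
    }
}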

Contributing

We welcome contributions for bug fixes, tests, and documentation. Please ensure:

  1. All tests pass (go test -race ./...)
  2. Code is formatted (gofmt)
  3. Linter passes (golangci-lint run ./...)
  4. Security scan passes (gosec ./...)

License

Apache License 2.0


Open source by Craig Johnston, sponsored by Deasil Works, Inc.

Directories

| Path | Synopsis |
|------|----------|
| cmd/mcp-data-platform (command) | Package main provides the entry point for the mcp-data-platform server. |
| internal/server | Package server provides a factory for creating the MCP server. |
| pkg/audit | Package audit provides audit logging for the platform. |
| pkg/audit/postgres | Package postgres provides PostgreSQL storage for audit logs. |
| pkg/auth | Package auth provides authentication support for the platform. |
| pkg/middleware | Package middleware provides the middleware chain for tool handlers. |
| pkg/oauth | Package oauth provides OAuth 2.1 server capabilities. |
| pkg/oauth/postgres | Package postgres provides PostgreSQL storage for OAuth. |
| pkg/persona | Package persona provides persona-based access control and customization. |
| pkg/platform | Package platform provides the main platform orchestration. |
| pkg/query | Package query provides abstractions for query execution providers. |
| pkg/query/trino | Package trino provides a Trino implementation of the query provider. |
| pkg/registry | Package registry provides toolkit registration and management. |
| pkg/semantic | Package semantic provides abstractions for semantic metadata providers. |
| pkg/semantic/datahub | Package datahub provides a DataHub implementation of the semantic provider. |
| pkg/storage | Package storage provides abstractions for storage providers. |
| pkg/storage/s3 | Package s3 provides an S3 implementation of the storage provider. |
| pkg/toolkits/datahub | Package datahub provides a DataHub toolkit adapter for the MCP data platform. |
| pkg/toolkits/s3 | Package s3 provides an S3 toolkit adapter for the MCP data platform. |
| pkg/toolkits/trino | Package trino provides a Trino toolkit adapter for the MCP data platform. |
| pkg/tools | Package tools provides MCP tool definitions for mcp-data-platform. |
| pkg/tuning | Package tuning provides AI tuning capabilities for the platform. |
