rlhf

package
v0.1.4 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 9, 2026 License: MIT Imports: 12 Imported by: 0

README

RLHF Auto-Collection Package

This package implements automatic collection of RLHF (Reinforcement Learning from Human Feedback) data for improving AI model performance.

Architecture

Components
  • Collector: Main service that manages interaction capture, queuing, and submission
  • FeedbackPrompt: TUI component for collecting explicit user feedback
  • Types: Data structures for interactions and feedback signals
Key Features
  1. Automatic Capture: Captures all user-AI interactions automatically
  2. Implicit Feedback: Tracks user actions (regenerate, copy, edit, continue)
  3. Explicit Feedback: Periodic prompts for user ratings and comments
  4. Background Processing: Non-blocking batch submission to API
  5. Privacy Controls: Opt-out mechanism and review-before-submit option
  6. Retry Logic: Automatic retry with exponential backoff on failures

Usage

Initializing the Collector
import (
    "log"

    "github.com/AINative-studio/ainative-code/internal/rlhf"
    "github.com/AINative-studio/ainative-code/internal/config"
    rlhfClient "github.com/AINative-studio/ainative-code/internal/client/rlhf"
)

// Load configuration
cfg := config.LoadRLHFConfig()

// Create RLHF API client
client := rlhfClient.New(
    rlhfClient.WithBaseURL(cfg.Endpoint),
)

// Create collector
collector := rlhf.NewCollector(cfg, client)

// Start background worker
if err := collector.Start(); err != nil {
    log.Fatal(err)
}
defer collector.Stop()
Capturing Interactions
// Capture a user-AI interaction
interactionID := collector.CaptureInteraction(
    "What is the capital of France?",  // prompt
    "The capital of France is Paris.", // response
    "claude-3-5-sonnet-20241022",      // model ID
)
Recording Implicit Feedback
// User regenerates response (negative signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionRegenerate)

// User copies response (positive signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionCopy)

// User edits response (negative signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionEdit)

// User continues conversation (positive signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionContinue)
Recording Explicit Feedback
// User provides rating and comment
collector.RecordExplicitFeedback(
    interactionID,
    0.95,                     // score (0.0-1.0)
    "Very helpful response!", // comment
)
Checking for Feedback Prompts
// Check if user should be prompted for feedback
if collector.ShouldPromptForFeedback() {
    // Show feedback prompt UI
    model := rlhf.NewFeedbackPromptModel(interactionID)
    // ... run TUI model
}

Configuration

Required Settings
services:
  rlhf:
    enabled: true
    endpoint: https://api.ainative.studio/v1/rlhf
    api_key: your_api_key
    auto_collect: true
Optional Settings
services:
  rlhf:
    # Privacy controls
    opt_out: false
    review_before_submit: false

    # Batch settings
    batch_size: 10
    batch_interval: 5m

    # Feedback prompt settings
    prompt_interval: 5

    # Implicit feedback scores
    implicit_feedback:
      enabled: true
      regenerate_score: 0.2
      edit_response_score: 0.3
      copy_response_score: 0.8
      continue_score: 0.7

API Reference

Collector
NewCollector
func NewCollector(cfg *config.RLHFConfig, client *rlhf.Client) *Collector

Creates a new RLHF collector instance.

Start
func (c *Collector) Start() error

Starts the background worker for batch processing.

Stop
func (c *Collector) Stop() error

Gracefully stops the collector and flushes remaining interactions.

CaptureInteraction
func (c *Collector) CaptureInteraction(prompt, response, modelID string) string

Captures a user-AI interaction. Returns the interaction ID.

RecordImplicitFeedback
func (c *Collector) RecordImplicitFeedback(interactionID string, action FeedbackAction)

Records an implicit feedback signal for an interaction.

RecordExplicitFeedback
func (c *Collector) RecordExplicitFeedback(interactionID string, score float64, feedback string)

Records explicit user feedback for an interaction.

ShouldPromptForFeedback
func (c *Collector) ShouldPromptForFeedback() bool

Returns true if the user should be prompted for feedback.
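
One plausible reading of `prompt_interval: 5` is "prompt after every fifth interaction". The sketch below encodes that reading; it is an assumption about the semantics, and `shouldPrompt` is a hypothetical helper rather than the method's real logic.

```go
package main

// shouldPrompt reports whether a feedback prompt is due after count
// interactions, assuming a prompt is shown on every interval-th interaction.
// A non-positive interval disables prompting.
func shouldPrompt(count, interval int) bool {
	if interval <= 0 || count <= 0 {
		return false
	}
	return count%interval == 0
}
```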

FeedbackAction Types
const (
    ActionRegenerate FeedbackAction = "regenerate"
    ActionEdit       FeedbackAction = "edit"
    ActionCopy       FeedbackAction = "copy"
    ActionContinue   FeedbackAction = "continue"
)

Data Structures

InteractionData
type InteractionData struct {
    ID               string
    Prompt           string
    Response         string
    Timestamp        time.Time
    ModelID          string
    SessionID        string
    ImplicitScore    float64
    ExplicitScore    float64
    UserFeedback     string
    Metadata         map[string]interface{}
    ImplicitSignals  []ImplicitSignal
    HasExplicitScore bool
}
ImplicitSignal
type ImplicitSignal struct {
    Action    string
    Timestamp time.Time
    Score     float64
}
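
The README does not say how the per-signal scores in `ImplicitSignals` are combined into `ImplicitScore`. As an illustration only, the sketch below assumes a simple arithmetic mean; `meanImplicitScore` is a hypothetical helper.

```go
package main

import "time"

// ImplicitSignal copies the struct documented above.
type ImplicitSignal struct {
	Action    string
	Timestamp time.Time
	Score     float64
}

// meanImplicitScore averages the scores of all recorded signals.
// The second return value is false when there are no signals, so callers
// can tell "no data" apart from a genuine score of 0.
// Averaging is an assumption; the package may weight signals differently.
func meanImplicitScore(signals []ImplicitSignal) (float64, bool) {
	if len(signals) == 0 {
		return 0, false
	}
	sum := 0.0
	for _, s := range signals {
		sum += s.Score
	}
	return sum / float64(len(signals)), true
}
```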

Testing

Running Tests
# Run all tests
go test ./internal/rlhf/...

# Run with coverage
go test -cover ./internal/rlhf/...

# Run specific test
go test -run TestCollector_CaptureInteraction ./internal/rlhf/...
Mock Client

Use the provided MockRLHFClient for testing:

mockClient := &MockRLHFClient{
    SubmittedBatches: make([]*rlhf.BatchInteractionFeedback, 0),
    ShouldFail:       false,
}

collector := rlhf.NewCollector(cfg, mockClient)
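
The snippet above shows only the mock's fields. A sketch of how such a mock could behave follows, assuming the collector submits batches through a small interface. The `batchSubmitter` interface, the `SubmitBatch` method, and the fields of the stand-in batch type are all hypothetical; the documented `NewCollector` signature takes a concrete `*rlhf.Client`.

```go
package main

import "errors"

// BatchInteractionFeedback is a local stand-in for the client package's
// batch type; its field is hypothetical.
type BatchInteractionFeedback struct {
	InteractionIDs []string
}

// batchSubmitter is a hypothetical interface a collector could depend on
// so that a mock can replace the real client in tests.
type batchSubmitter interface {
	SubmitBatch(b *BatchInteractionFeedback) error
}

// MockRLHFClient records submitted batches and can simulate failures.
type MockRLHFClient struct {
	SubmittedBatches []*BatchInteractionFeedback
	ShouldFail       bool
}

// SubmitBatch stores the batch, or returns an error when ShouldFail is set.
func (m *MockRLHFClient) SubmitBatch(b *BatchInteractionFeedback) error {
	if m.ShouldFail {
		return errors.New("mock: submission failed")
	}
	m.SubmittedBatches = append(m.SubmittedBatches, b)
	return nil
}

// Compile-time check that the mock satisfies the interface.
var _ batchSubmitter = (*MockRLHFClient)(nil)
```

Setting `ShouldFail` to true is a convenient way to exercise the retry path in tests.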

Privacy & Security

Data Collected
  • User prompts
  • AI responses
  • Timestamps
  • Model IDs
  • Session IDs (anonymous UUIDs)
  • Implicit signals (user actions)
  • Explicit feedback (ratings and comments)
Data NOT Collected
  • Personal identifying information (PII)
  • File paths or system information
  • API keys or credentials
  • IP addresses
Privacy Features
  • Opt-out: Set opt_out: true to disable all collection
  • Review before submit: Set review_before_submit: true to let the user review data before API submission
  • Session isolation: Each session has a unique anonymous ID
  • No disk storage: Data is held only in memory until it is submitted

Performance

Benchmarks

Typical performance characteristics:

  • Capture: < 1ms per interaction
  • Memory: ~1KB per queued interaction
  • CPU: < 0.1% additional usage
  • Network: 1 API call per batch (default: 10 interactions)
Optimization Tips
  1. Adjust batch size: Larger batches = fewer API calls
  2. Tune interval: Balance between latency and freshness
  3. Monitor queue size: Use GetQueueSize() to check backlog
  4. Handle failures: Retry logic reduces the risk of data loss on transient failures
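
The trade-off in tips 1 and 3 comes down to simple arithmetic, sketched below. Both helpers are hypothetical, and the memory estimate just applies the approximate 1KB-per-interaction figure from the benchmarks.

```go
package main

// apiCallsNeeded returns how many batch submissions are required to send
// n queued interactions: ceil(n / batchSize).
func apiCallsNeeded(n, batchSize int) int {
	if n <= 0 || batchSize <= 0 {
		return 0
	}
	return (n + batchSize - 1) / batchSize
}

// estimatedQueueBytes applies the rough 1KB-per-interaction figure to a
// queue length, for example one read from GetQueueSize().
func estimatedQueueBytes(queued int) int {
	if queued < 0 {
		return 0
	}
	return queued * 1024
}
```

For example, 25 queued interactions need three calls at the default batch size of 10, but only one call at a batch size of 25.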

Troubleshooting

Common Issues
No interactions captured
  • Check auto_collect: true in config
  • Verify opt_out: false
  • Ensure collector is started with Start()
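
The checks above can be folded into a small diagnostic helper. This is a sketch: `captureBlockers` and its parameters are hypothetical, and the `enabled` check is added from the required settings rather than from this list.

```go
package main

// captureBlockers returns the reasons, if any, why interactions would not
// be captured. An empty result means none of the known blockers apply.
func captureBlockers(enabled, autoCollect, optOut, started bool) []string {
	var reasons []string
	if !enabled {
		reasons = append(reasons, "enabled is false")
	}
	if !autoCollect {
		reasons = append(reasons, "auto_collect is false")
	}
	if optOut {
		reasons = append(reasons, "opt_out is true")
	}
	if !started {
		reasons = append(reasons, "Start() has not been called")
	}
	return reasons
}
```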
High memory usage
  • Reduce batch_size
  • Decrease batch_interval for more frequent submission
  • Check for API connection issues
Submissions failing
  • Verify API endpoint is reachable
  • Check API key is valid
  • Review error logs
Logging

Enable debug logging to troubleshoot:

logger.SetLevel("debug")

Look for RLHF-related log messages:

[INFO] Starting RLHF auto-collector
[DEBUG] Captured interaction: id=abc123
[DEBUG] Recorded implicit feedback: action=copy score=0.8
[INFO] Processing RLHF batch: batch_size=10
[INFO] RLHF batch submitted: success_count=10

Contributing

When contributing to this package:

  1. Add tests: All new features must have unit tests
  2. Update docs: Document new configuration options
  3. Privacy first: Always respect user privacy settings
  4. Performance: Profile any changes that affect capture path
  5. Backwards compatibility: Don't break existing configs

License

Copyright 2026 AINative Studio. All rights reserved.

Documentation

Types

type Collector

type Collector struct {
	// contains filtered or unexported fields
}

Collector manages automatic collection and submission of RLHF data

func NewCollector

func NewCollector(cfg *config.RLHFConfig, client *rlhf.Client) *Collector

NewCollector creates a new RLHF collector instance

func (*Collector) CaptureInteraction

func (c *Collector) CaptureInteraction(prompt, response, modelID string) string

CaptureInteraction captures a new interaction for potential submission

func (*Collector) GetInteractionCount

func (c *Collector) GetInteractionCount() int

GetInteractionCount returns the total interaction count

func (*Collector) GetQueueSize

func (c *Collector) GetQueueSize() int

GetQueueSize returns the current queue size

func (*Collector) RecordExplicitFeedback

func (c *Collector) RecordExplicitFeedback(interactionID string, score float64, feedback string)

RecordExplicitFeedback records explicit user feedback

func (*Collector) RecordImplicitFeedback

func (c *Collector) RecordImplicitFeedback(interactionID string, action FeedbackAction)

RecordImplicitFeedback records an implicit feedback signal

func (*Collector) ShouldPromptForFeedback

func (c *Collector) ShouldPromptForFeedback() bool

ShouldPromptForFeedback determines if user should be prompted for feedback

func (*Collector) Start

func (c *Collector) Start() error

Start begins the background collection and submission worker

func (*Collector) Stop

func (c *Collector) Stop() error

Stop gracefully shuts down the collector

type FeedbackAction

type FeedbackAction string

FeedbackAction represents user actions that indicate quality

const (
	ActionRegenerate FeedbackAction = "regenerate"
	ActionEdit       FeedbackAction = "edit"
	ActionCopy       FeedbackAction = "copy"
	ActionContinue   FeedbackAction = "continue"
)

type FeedbackPromptModel

type FeedbackPromptModel struct {
	// contains filtered or unexported fields
}

FeedbackPromptModel represents the feedback prompt UI component

func NewFeedbackPromptModel

func NewFeedbackPromptModel(interactionID string) FeedbackPromptModel

NewFeedbackPromptModel creates a new feedback prompt model

func (FeedbackPromptModel) GetResult

func (m FeedbackPromptModel) GetResult() *FeedbackResult

GetResult returns the feedback result

func (FeedbackPromptModel) Init

func (m FeedbackPromptModel) Init() tea.Cmd

Init initializes the model

func (FeedbackPromptModel) IsDismissed

func (m FeedbackPromptModel) IsDismissed() bool

IsDismissed returns whether the prompt was dismissed

func (FeedbackPromptModel) IsSubmitted

func (m FeedbackPromptModel) IsSubmitted() bool

IsSubmitted returns whether the feedback was submitted

func (FeedbackPromptModel) Update

Update handles messages

func (FeedbackPromptModel) View

func (m FeedbackPromptModel) View() string

View renders the feedback prompt

type FeedbackResult

type FeedbackResult struct {
	InteractionID string
	Rating        int
	Comment       string
	Dismissed     bool
}

FeedbackResult represents the result of a feedback prompt
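
`FeedbackResult.Rating` is an integer, while `RecordExplicitFeedback` takes a score between 0.0 and 1.0, so some conversion is needed between the two. The sketch below assumes a 1 to 5 rating scale mapped linearly onto that range; the scale and the `ratingToScore` helper are assumptions, not documented behaviour.

```go
package main

import "fmt"

// ratingToScore maps a rating on an assumed 1-5 scale linearly onto
// the 0.0-1.0 range: 1 becomes 0.0 and 5 becomes 1.0.
func ratingToScore(rating int) (float64, error) {
	if rating < 1 || rating > 5 {
		return 0, fmt.Errorf("rating %d is outside the assumed 1-5 scale", rating)
	}
	return float64(rating-1) / 4.0, nil
}
```

A caller could pass the converted value, together with `FeedbackResult.Comment`, to `RecordExplicitFeedback` once the prompt has been submitted.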

type ImplicitSignal

type ImplicitSignal struct {
	Action    string // "regenerate", "edit", "copy", "continue"
	Timestamp time.Time
	Score     float64
}

ImplicitSignal represents an implicit feedback action

type InteractionData

type InteractionData struct {
	ID               string
	Prompt           string
	Response         string
	Timestamp        time.Time
	ModelID          string
	SessionID        string
	ImplicitScore    float64
	ExplicitScore    float64
	UserFeedback     string
	Metadata         map[string]interface{}
	ImplicitSignals  []ImplicitSignal
	HasExplicitScore bool
}

InteractionData represents a captured interaction with metadata
