rlhf

package

Version: v0.1.4 (latest) | Published: Jan 9, 2026 | License: MIT | Imports: 12 | Imported by: 0

Repository

github.com/AINative-studio/ainative-code

README ¶

RLHF Auto-Collection Package

This package implements automatic collection of RLHF (Reinforcement Learning from Human Feedback) data for improving AI model performance.

Architecture

Components

Collector: Main service that manages interaction capture, queuing, and submission
FeedbackPrompt: TUI component for collecting explicit user feedback
Types: Data structures for interactions and feedback signals

Key Features

Automatic Capture: Captures all user-AI interactions automatically
Implicit Feedback: Tracks user actions (regenerate, copy, edit, continue)
Explicit Feedback: Periodic prompts for user ratings and comments
Background Processing: Non-blocking batch submission to API
Privacy Controls: Opt-out mechanism and review-before-submit option
Retry Logic: Automatic retry with exponential backoff on failures
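
The retry behaviour listed above can be pictured with a minimal sketch. The helper name `retryWithBackoff`, its parameters, and the doubling factor are assumptions made for illustration; the package's actual retry policy is not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient simulates a temporary API failure in this sketch.
var errTransient = errors.New("transient failure")

// retryWithBackoff is a hypothetical helper (not part of the rlhf package).
// It calls fn up to maxAttempts times and doubles the wait after each failure.
func retryWithBackoff(maxAttempts int, base time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	delay := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retryWithBackoff(4, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // fail the first two attempts
		}
		return nil
	})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

A production version would usually cap the delay and add jitter so that many clients do not retry in lockstep.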

Usage

Initializing the Collector

import (
    "log"

    "github.com/AINative-studio/ainative-code/internal/rlhf"
    "github.com/AINative-studio/ainative-code/internal/config"
    rlhfClient "github.com/AINative-studio/ainative-code/internal/client/rlhf"
)

// Load configuration
cfg := config.LoadRLHFConfig()

// Create RLHF API client
client := rlhfClient.New(
    rlhfClient.WithBaseURL(cfg.Endpoint),
)

// Create collector
collector := rlhf.NewCollector(cfg, client)

// Start background worker
if err := collector.Start(); err != nil {
    log.Fatal(err)
}
defer collector.Stop()

Capturing Interactions

// Capture a user-AI interaction
interactionID := collector.CaptureInteraction(
    "What is the capital of France?",  // prompt
    "The capital of France is Paris.", // response
    "claude-3-5-sonnet-20241022",      // model ID
)

Recording Implicit Feedback

// User regenerates response (negative signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionRegenerate)

// User copies response (positive signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionCopy)

// User edits response (negative signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionEdit)

// User continues conversation (positive signal)
collector.RecordImplicitFeedback(interactionID, rlhf.ActionContinue)

Recording Explicit Feedback

// User provides rating and comment
collector.RecordExplicitFeedback(
    interactionID,
    0.95,                     // score (0.0-1.0)
    "Very helpful response!", // comment
)

Checking for Feedback Prompts

// Check if user should be prompted for feedback
if collector.ShouldPromptForFeedback() {
    // Show feedback prompt UI
    model := rlhf.NewFeedbackPromptModel(interactionID)
    // ... run TUI model
}

Configuration

Required Settings

services:
  rlhf:
    enabled: true
    endpoint: https://api.ainative.studio/v1/rlhf
    api_key: your_api_key
    auto_collect: true

Optional Settings

services:
  rlhf:
    # Privacy controls
    opt_out: false
    review_before_submit: false

    # Batch settings
    batch_size: 10
    batch_interval: 5m

    # Feedback prompt settings
    prompt_interval: 5

    # Implicit feedback scores
    implicit_feedback:
      enabled: true
      regenerate_score: 0.2
      edit_response_score: 0.3
      copy_response_score: 0.8
      continue_score: 0.7
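
To illustrate how the implicit feedback scores above might feed into a single value such as `ImplicitScore`, here is a minimal sketch. The score table uses the defaults shown above; combining signals by averaging is an assumption, and the package may aggregate them differently.

```go
package main

import "fmt"

// FeedbackAction and its constants are copied from the API reference
// so that the sketch is self-contained.
type FeedbackAction string

const (
	ActionRegenerate FeedbackAction = "regenerate"
	ActionEdit       FeedbackAction = "edit"
	ActionCopy       FeedbackAction = "copy"
	ActionContinue   FeedbackAction = "continue"
)

// defaultScores holds the default values from the optional settings.
var defaultScores = map[FeedbackAction]float64{
	ActionRegenerate: 0.2,
	ActionEdit:       0.3,
	ActionCopy:       0.8,
	ActionContinue:   0.7,
}

// aggregate averages the scores of the recorded actions (an assumed policy).
// It returns ok=false when no known action was recorded.
func aggregate(actions []FeedbackAction) (score float64, ok bool) {
	var sum float64
	n := 0
	for _, a := range actions {
		if s, known := defaultScores[a]; known {
			sum += s
			n++
		}
	}
	if n == 0 {
		return 0, false
	}
	return sum / float64(n), true
}

func main() {
	score, ok := aggregate([]FeedbackAction{ActionCopy, ActionContinue})
	fmt.Printf("%.2f %v\n", score, ok) // prints "0.75 true"
}
```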

API Reference

Collector

NewCollector

func NewCollector(cfg *config.RLHFConfig, client *rlhf.Client) *Collector

Creates a new RLHF collector instance.

Start

func (c *Collector) Start() error

Starts the background worker for batch processing.

Stop

func (c *Collector) Stop() error

Gracefully stops the collector and flushes remaining interactions.
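
The Start/Stop lifecycle described above (a background worker that batches by size and interval, then flushes on shutdown) can be sketched as follows. The `batcher` type and all of its members are hypothetical stand-ins; this is not the collector's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// batcher is a hypothetical stand-in for the collector's background worker.
type batcher struct {
	in       chan string
	done     chan struct{}
	wg       sync.WaitGroup
	size     int
	interval time.Duration // must be > 0 (time.NewTicker panics otherwise)
	submit   func(batch []string)
}

func newBatcher(size int, interval time.Duration, submit func([]string)) *batcher {
	return &batcher{
		in:       make(chan string, 64),
		done:     make(chan struct{}),
		size:     size,
		interval: interval,
		submit:   submit,
	}
}

// Start launches the worker goroutine.
func (b *batcher) Start() {
	b.wg.Add(1)
	go func() {
		defer b.wg.Done()
		ticker := time.NewTicker(b.interval)
		defer ticker.Stop()
		var pending []string
		flush := func() {
			if len(pending) > 0 {
				b.submit(pending)
				pending = nil // start a fresh slice for the next batch
			}
		}
		add := func(item string) {
			pending = append(pending, item)
			if len(pending) >= b.size {
				flush()
			}
		}
		for {
			select {
			case item := <-b.in:
				add(item)
			case <-ticker.C:
				flush() // submit a partial batch when the interval elapses
			case <-b.done:
				for { // drain whatever is still queued, then flush and exit
					select {
					case item := <-b.in:
						add(item)
					default:
						flush()
						return
					}
				}
			}
		}
	}()
}

// Add queues one item; it must not be called after Stop.
func (b *batcher) Add(item string) { b.in <- item }

// Stop signals the worker and waits until the final flush has completed.
func (b *batcher) Stop() {
	close(b.done)
	b.wg.Wait()
}

func main() {
	var sizes []int
	b := newBatcher(10, time.Hour, func(batch []string) { sizes = append(sizes, len(batch)) })
	b.Start()
	for i := 0; i < 25; i++ {
		b.Add(fmt.Sprintf("interaction-%d", i))
	}
	b.Stop()
	fmt.Println(sizes) // prints "[10 10 5]"
}
```

Waiting on the `sync.WaitGroup` in `Stop` is what makes the shutdown graceful in this sketch: the caller only returns once the last partial batch has been handed to `submit`.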

CaptureInteraction

func (c *Collector) CaptureInteraction(prompt, response, modelID string) string

Captures a user-AI interaction. Returns the interaction ID.

RecordImplicitFeedback

func (c *Collector) RecordImplicitFeedback(interactionID string, action FeedbackAction)

Records an implicit feedback signal for an interaction.

RecordExplicitFeedback

func (c *Collector) RecordExplicitFeedback(interactionID string, score float64, feedback string)

Records explicit user feedback for an interaction.

ShouldPromptForFeedback

func (c *Collector) ShouldPromptForFeedback() bool

Returns true if the user should be prompted for feedback.
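
One plausible reading of `prompt_interval: 5` is "prompt after every fifth interaction". The sketch below encodes that reading; the unit of `prompt_interval` is not documented, so both the interpretation and the helper name `shouldPrompt` are assumptions.

```go
package main

import "fmt"

// shouldPrompt reports whether a feedback prompt is due, assuming that
// prompt_interval counts interactions (hypothetical interpretation).
func shouldPrompt(interactionCount, promptInterval int) bool {
	if promptInterval <= 0 || interactionCount <= 0 {
		return false
	}
	return interactionCount%promptInterval == 0
}

func main() {
	for n := 1; n <= 10; n++ {
		if shouldPrompt(n, 5) {
			fmt.Println("prompt after interaction", n)
		}
	}
}
```

With an interval of 5 the loop prints a line for interactions 5 and 10 only.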

FeedbackAction Types

const (
    ActionRegenerate FeedbackAction = "regenerate"
    ActionEdit       FeedbackAction = "edit"
    ActionCopy       FeedbackAction = "copy"
    ActionContinue   FeedbackAction = "continue"
)

Data Structures

InteractionData

type InteractionData struct {
    ID               string
    Prompt           string
    Response         string
    Timestamp        time.Time
    ModelID          string
    SessionID        string
    ImplicitScore    float64
    ExplicitScore    float64
    UserFeedback     string
    Metadata         map[string]interface{}
    ImplicitSignals  []ImplicitSignal
    HasExplicitScore bool
}

ImplicitSignal

type ImplicitSignal struct {
    Action    string
    Timestamp time.Time
    Score     float64
}

Testing

Running Tests

# Run all tests
go test ./internal/rlhf/...

# Run with coverage
go test -cover ./internal/rlhf/...

# Run specific test
go test -run TestCollector_CaptureInteraction ./internal/rlhf/...

Mock Client

Use the provided MockRLHFClient for testing:

mockClient := &MockRLHFClient{
    SubmittedBatches: make([]*rlhf.BatchInteractionFeedback, 0),
    ShouldFail:       false,
}

collector := rlhf.NewCollector(cfg, mockClient)
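
Note that the documented NewCollector signature takes a concrete `*rlhf.Client`, so how MockRLHFClient is wired in depends on the package's test files. As a general illustration of the pattern, the sketch below shows a mock with the same two fields behind a small interface. The `submitter` interface, the `batch` type, and the `flush` function are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// batch is a hypothetical stand-in for rlhf.BatchInteractionFeedback.
type batch struct{ InteractionIDs []string }

// submitter is a hypothetical interface that both a real client and a mock could satisfy.
type submitter interface {
	Submit(b *batch) error
}

// mockClient records submitted batches, or fails on demand.
type mockClient struct {
	SubmittedBatches []*batch
	ShouldFail       bool
}

var errMockFailure = errors.New("mock: submission failed")

func (m *mockClient) Submit(b *batch) error {
	if m.ShouldFail {
		return errMockFailure
	}
	m.SubmittedBatches = append(m.SubmittedBatches, b)
	return nil
}

// flush is the code under test: it sends the pending IDs as one batch.
func flush(c submitter, ids []string) error {
	if len(ids) == 0 {
		return nil
	}
	return c.Submit(&batch{InteractionIDs: ids})
}

func main() {
	m := &mockClient{}
	err := flush(m, []string{"abc123", "def456"})
	fmt.Println(len(m.SubmittedBatches), err) // prints "1 <nil>"

	m.ShouldFail = true
	fmt.Println(flush(m, []string{"ghi789"})) // prints "mock: submission failed"
}
```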

Privacy & Security

Data Collected

User prompts
AI responses
Timestamps
Model IDs
Session IDs (anonymous UUIDs)
Implicit signals (user actions)
Explicit feedback (ratings and comments)

Data NOT Collected

Personal identifying information (PII)
File paths or system information
API keys or credentials
IP addresses

Privacy Features

Opt-out: Set opt_out: true to disable all collection
Review before submit: Set review_before_submit: true so the user can review data before it is submitted to the API
Session isolation: Each session has a unique anonymous ID
No disk storage: Data is held only in memory until it is submitted
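
The privacy features above amount to two gates: one before capture and one before submission. The following is a minimal sketch of that logic. The `privacySettings` struct and both functions are hypothetical, and the `approve` callback stands in for whatever review UI the application provides.

```go
package main

import "fmt"

// privacySettings mirrors the documented config keys; the struct itself is hypothetical.
type privacySettings struct {
	Enabled            bool // enabled
	AutoCollect        bool // auto_collect
	OptOut             bool // opt_out
	ReviewBeforeSubmit bool // review_before_submit
}

// canCapture reports whether interactions should be captured at all.
func (p privacySettings) canCapture() bool {
	return p.Enabled && p.AutoCollect && !p.OptOut
}

// selectForSubmit returns the interaction IDs that may be sent to the API.
func selectForSubmit(p privacySettings, pending []string, approve func(id string) bool) []string {
	if !p.canCapture() {
		return nil // opted out or disabled: submit nothing
	}
	if !p.ReviewBeforeSubmit {
		return pending // no review step configured
	}
	var out []string
	for _, id := range pending {
		if approve(id) {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	p := privacySettings{Enabled: true, AutoCollect: true, ReviewBeforeSubmit: true}
	ids := selectForSubmit(p, []string{"a", "b", "c"}, func(id string) bool { return id != "b" })
	fmt.Println(ids) // prints "[a c]"
}
```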

Performance

Benchmarks

Typical performance characteristics:

Capture: < 1ms per interaction
Memory: ~1KB per queued interaction
CPU: < 0.1% additional usage
Network: 1 API call per batch (default: 10 interactions)

Optimization Tips

Adjust batch size: Larger batches = fewer API calls
Tune interval: Balance between latency and freshness
Monitor queue size: Use GetQueueSize() to check backlog
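Handle failures: Retry logic reduces data loss from transient API failures; because the queue lives only in memory, interactions still queued when the process exits without a clean Stop() are not recoverable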
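
As a sketch of the queue-monitoring tip, the helper below turns a GetQueueSize() reading into a count of pending batches and a rough memory estimate based on the ~1KB-per-interaction figure quoted above. The `queueSizer` interface, `fakeCollector`, and `backlog` are hypothetical; only the GetQueueSize method comes from the documented API.

```go
package main

import "fmt"

// queueSizer is satisfied by any type with the documented GetQueueSize method.
type queueSizer interface {
	GetQueueSize() int
}

// fakeCollector stands in for *rlhf.Collector so the sketch runs on its own.
type fakeCollector struct{ queued int }

func (f fakeCollector) GetQueueSize() int { return f.queued }

// backlog reports how many batches are waiting and an approximate
// memory footprint in KB (one KB per queued interaction).
func backlog(q queueSizer, batchSize int) (batches int, approxKB int) {
	n := q.GetQueueSize()
	if batchSize < 1 {
		batchSize = 1
	}
	batches = (n + batchSize - 1) / batchSize // ceiling division
	return batches, n
}

func main() {
	batches, kb := backlog(fakeCollector{queued: 25}, 10)
	fmt.Printf("pending batches=%d approx memory=%dKB\n", batches, kb)
}
```

A backlog that keeps growing across several intervals usually points at failing submissions rather than at capture volume.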

Troubleshooting

Common Issues

No interactions captured

Check that auto_collect is set to true in the config
Verify that opt_out is set to false
Ensure collector is started with Start()

High memory usage

Reduce batch_size
Decrease batch_interval for more frequent submission
Check for API connection issues

Submissions failing

Verify API endpoint is reachable
Check API key is valid
Review error logs

Logging

Enable debug logging to troubleshoot:

logger.SetLevel("debug")

Look for RLHF-related log messages:

[INFO] Starting RLHF auto-collector
[DEBUG] Captured interaction: id=abc123
[DEBUG] Recorded implicit feedback: action=copy score=0.8
[INFO] Processing RLHF batch: batch_size=10
[INFO] RLHF batch submitted: success_count=10

Contributing

When contributing to this package:

Add tests: All new features must have unit tests
Update docs: Document new configuration options
Privacy first: Always respect user privacy settings
Performance: Profile any changes that affect the capture path
Backwards compatibility: Don't break existing configs

License

MIT

Documentation ¶

Index ¶

type Collector
- func NewCollector(cfg *config.RLHFConfig, client *rlhf.Client) *Collector
type FeedbackAction
type FeedbackPromptModel
- func NewFeedbackPromptModel(interactionID string) FeedbackPromptModel
type FeedbackResult
type ImplicitSignal
type InteractionData

Types ¶

type Collector ¶

type Collector struct {
	// contains filtered or unexported fields
}

Collector manages automatic collection and submission of RLHF data

func NewCollector ¶

func NewCollector(cfg *config.RLHFConfig, client *rlhf.Client) *Collector

NewCollector creates a new RLHF collector instance

func (*Collector) CaptureInteraction ¶

func (c *Collector) CaptureInteraction(prompt, response, modelID string) string

CaptureInteraction captures a new interaction for potential submission

func (*Collector) GetInteractionCount ¶

func (c *Collector) GetInteractionCount() int

GetInteractionCount returns the total interaction count

func (*Collector) GetQueueSize ¶

func (c *Collector) GetQueueSize() int

GetQueueSize returns the current queue size

func (*Collector) RecordExplicitFeedback ¶

func (c *Collector) RecordExplicitFeedback(interactionID string, score float64, feedback string)

RecordExplicitFeedback records explicit user feedback

func (*Collector) RecordImplicitFeedback ¶

func (c *Collector) RecordImplicitFeedback(interactionID string, action FeedbackAction)

RecordImplicitFeedback records an implicit feedback signal

func (*Collector) ShouldPromptForFeedback ¶

func (c *Collector) ShouldPromptForFeedback() bool

ShouldPromptForFeedback determines if user should be prompted for feedback

func (*Collector) Start ¶

func (c *Collector) Start() error

Start begins the background collection and submission worker

func (*Collector) Stop ¶

func (c *Collector) Stop() error

Stop gracefully shuts down the collector

type FeedbackAction ¶

type FeedbackAction string

FeedbackAction represents user actions that indicate quality

const (
	ActionRegenerate FeedbackAction = "regenerate"
	ActionEdit       FeedbackAction = "edit"
	ActionCopy       FeedbackAction = "copy"
	ActionContinue   FeedbackAction = "continue"
)

type FeedbackPromptModel ¶

type FeedbackPromptModel struct {
	// contains filtered or unexported fields
}

FeedbackPromptModel represents the feedback prompt UI component

func NewFeedbackPromptModel ¶

func NewFeedbackPromptModel(interactionID string) FeedbackPromptModel

NewFeedbackPromptModel creates a new feedback prompt model

func (FeedbackPromptModel) GetResult ¶

func (m FeedbackPromptModel) GetResult() *FeedbackResult

GetResult returns the feedback result

func (FeedbackPromptModel) Init ¶

func (m FeedbackPromptModel) Init() tea.Cmd

Init initializes the model

func (FeedbackPromptModel) IsDismissed ¶

func (m FeedbackPromptModel) IsDismissed() bool

IsDismissed returns whether the prompt was dismissed

func (FeedbackPromptModel) IsSubmitted ¶

func (m FeedbackPromptModel) IsSubmitted() bool

IsSubmitted returns whether the feedback was submitted

func (FeedbackPromptModel) Update ¶

func (m FeedbackPromptModel) Update(msg tea.Msg) (FeedbackPromptModel, tea.Cmd)

Update handles messages

func (FeedbackPromptModel) View ¶

func (m FeedbackPromptModel) View() string

View renders the feedback prompt

type FeedbackResult ¶

type FeedbackResult struct {
	InteractionID string
	Rating        int
	Comment       string
	Dismissed     bool
}

FeedbackResult represents the result of a feedback prompt
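
FeedbackResult carries an integer Rating, while RecordExplicitFeedback takes a float64 score in the 0.0-1.0 range, so some conversion is needed between the two. The sketch below assumes a 1 to 5 rating scale mapped linearly onto the score range. The scale, the `maxRating` constant, and the `toScore` helper are assumptions, since the document does not state how the package converts ratings.

```go
package main

import "fmt"

// FeedbackResult copies the documented fields so the sketch is self-contained.
type FeedbackResult struct {
	InteractionID string
	Rating        int
	Comment       string
	Dismissed     bool
}

// maxRating is an assumption: the rating scale is not documented.
const maxRating = 5

// toScore maps a 1..maxRating rating linearly onto 0.0..1.0.
// It returns ok=false when there is no usable rating.
func toScore(r *FeedbackResult) (score float64, ok bool) {
	if r == nil || r.Dismissed || r.Rating < 1 || r.Rating > maxRating {
		return 0, false
	}
	return float64(r.Rating-1) / float64(maxRating-1), true
}

func main() {
	res := &FeedbackResult{InteractionID: "abc123", Rating: 4, Comment: "Helpful"}
	if score, ok := toScore(res); ok {
		fmt.Printf("%s -> %.2f\n", res.InteractionID, score) // prints "abc123 -> 0.75"
	}
}
```

In an application, the resulting score and `res.Comment` would then be passed to `collector.RecordExplicitFeedback` together with `res.InteractionID`.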

type ImplicitSignal ¶

type ImplicitSignal struct {
	Action    string // "regenerate", "edit", "copy", "continue"
	Timestamp time.Time
	Score     float64
}

ImplicitSignal represents an implicit feedback action

type InteractionData ¶

type InteractionData struct {
	ID               string
	Prompt           string
	Response         string
	Timestamp        time.Time
	ModelID          string
	SessionID        string
	ImplicitScore    float64
	ExplicitScore    float64
	UserFeedback     string
	Metadata         map[string]interface{}
	ImplicitSignals  []ImplicitSignal
	HasExplicitScore bool
}

InteractionData represents a captured interaction with metadata
