English | 简体中文
AI Proxy
Next-generation AI gateway, using OpenAI as the protocol entry point.
Feature
- Intelligent error retry
- Channel selection based on priority and error rate
- Alert notifications
- Channel balance warning
- Error rate warning
- Unauthorized channel warning
- and more...
- Logging and auditing
- Comprehensive request log data
- Request and response body recording
- Request log tracing
- Data statistics and analysis
- Request volume statistics
- Error volume statistics
- RPM TPM statistics
- Consumption statistics
- Model statistics
- Channel error rate analysis
- and more...
- Rerank support
- PDF support
- STT model mapping support
- Multi-tenant system separation
- Model RPM TPM limits
- Think model support
<think> split to reasoning_content
- Prompt Token Cache billing support
- Inline tiktoken, no need to download tiktoken file
- API
Swagger documentation support http://host:port/swagger/index.html
How to use
Sealos
Use Sealos built-in model capabilities, click to Sealos.
FastGPT
Use AI Proxy to access models, click to FastGPT.
Deploy
Use Docker
docker run -d --name aiproxy -p 3000:3000 -v $(pwd)/aiproxy:/aiproxy ghcr.io/labring/aiproxy:latest
Use Docker Compose
Copy docker-compose.yaml to directory.
docker-compose up -d
Envs
Basic Configuration
LISTEN: The listen address, default is :3000
ADMIN_KEY: The admin key for the AI Proxy Service, admin key is used to admin api and relay api, default is empty
INTERNAL_TOKEN: Internal token for service authentication, default is empty
FFPROBE_ENABLED: Whether to enable ffprobe, default is false
Debug Options
DEBUG: Enable debug mode, default is false
DEBUG_SQL: Enable SQL debugging, default is false
Database Options
SQL_DSN: The database connection string, default is empty, eg: postgres://postgres:postgres@localhost:5432/postgres
LOG_SQL_DSN: The log database connection string, default is empty, eg: postgres://postgres:postgres@localhost:5432/postgres
REDIS_CONN_STRING: The redis connection string, default is empty, eg: redis://localhost:6379
DISABLE_AUTO_MIGRATE_DB: Disable automatic database migration, default is false
SQL_MAX_IDLE_CONNS: The maximum number of idle connections in the database, default is 100
SQL_MAX_OPEN_CONNS: The maximum number of open connections to the database, default is 1000
SQL_MAX_LIFETIME: The maximum lifetime of a connection in seconds, default is 60
SQLITE_PATH: The path to the sqlite database, default is aiproxy.db
SQL_BUSY_TIMEOUT: The busy timeout for the database, default is 3000
Notify Options
NOTIFY_NOTE: Custom notification note, default is AI Proxy
NOTIFY_FEISHU_WEBHOOK: The feishu notify webhook url, default is empty, eg: https://open.feishu.cn/open-apis/bot/v2/hook/xxxx
Model Configuration
DISABLE_MODEL_CONFIG: Disable model configuration, default is false
RETRY_TIMES: Number of retry attempts, default is 0
ENABLE_MODEL_ERROR_AUTO_BAN: Enable automatic banning of models with errors, default is false
MODEL_ERROR_AUTO_BAN_RATE: Rate threshold for auto-banning models with errors, default is 0.3
TIMEOUT_WITH_MODEL_TYPE: Timeout settings for different model types, default is {}
DEFAULT_CHANNEL_MODELS: Default models for each channel, default is {}
DEFAULT_CHANNEL_MODEL_MAPPING: Model mapping for each channel, default is {}
Logging Configuration
LOG_STORAGE_HOURS: Hours to store logs (0 means unlimited), default is 0
LOG_CONTENT_STORAGE_HOURS: Hours to store log content ip endpoint ttfb_milliseconds, default is 0
SAVE_ALL_LOG_DETAIL: Save all log details, default is false
LOG_DETAIL_REQUEST_BODY_MAX_SIZE: Maximum size for request body in log details, default is 128KB
LOG_DETAIL_RESPONSE_BODY_MAX_SIZE: Maximum size for response body in log details, default is 128KB
LOG_DETAIL_STORAGE_HOURS: Hours to store log details, default is 72 (3 days)
CLEAN_LOG_BATCH_SIZE: Batch size for cleaning logs, cleaning interval is 1 minute, default is 2000
Service Control
DISABLE_SERVE: Disable serving requests, default false
GROUP_MAX_TOKEN_NUM: Maximum number of tokens per group (0 means unlimited), default is 0
GROUP_CONSUME_LEVEL_RATIO: Consumption level ratio for groups, default is {}
GEMINI_SAFETY_SETTING: Safety setting for Gemini models, default is BLOCK_NONE
BILLING_ENABLED: Enable billing functionality, default is true
IP_GROUPS_THRESHOLD: IP group threshold, when the same IP is used by multiple groups, send a warning, default is 0
IP_GROUPS_BAN_THRESHOLD: IP group ban threshold, when the same IP is used by multiple groups, ban it and all groups, default is 0