

Ratchet is an AI bot that helps reduce operational toil.
## Architecture

```
                        ┌──────────────────┐
                        │    Slack API     │
                        └────────┬─────────┘
                                 │
                        ┌────────▼─────────┐
                        │ Slack Integration│
                        └────────┬─────────┘
                                 │
┌──────────────┐        ┌────────▼─────────┐        ┌──────────────┐
│   OpenAI/    │◄───────┤       Bot        ├───────►│  PostgreSQL  │
│    Ollama    │        │  (Coordinator)   │        │   Database   │
└──────────────┘        └────────┬─────────┘        └──────────────┘
                                 │
        ┌────────────────────────┼──────────────────────┐
        │                        │                      │
┌───────▼──────┐         ┌───────▼──────┐         ┌─────▼─────┐
│    Tools     │         │  Background  │         │    Web    │
│              │         │   Workers    │         │  Server   │
└──────┬───────┘         └──────┬───────┘         └───────────┘
       │                        │
┌──────┴────────────────┐  ┌────┴───────────────────────────────┐
│ ● channel_monitor     │  │ ● backfill_thread_worker           │
│ ● classifier          │  │ ● channel_onboard_worker           │
│ ● documentation       │  │ ● documentation_refresh_worker     │
│ ● channel reports     │  │                                    │
│ ● runbooks            │  └────────────────────────────────────┘
│ ● bot usage           │
└───────────────────────┘
```
## How AI is Used

As shown in the architecture above, Ratchet sits at the intersection of Slack, LLMs, Tools, and a Database.
When it's added to a public channel in Slack, it starts recording all messages in its database (including some historical messages as part of onboarding). Now, whenever someone pings `@ratchet ...`, it treats that Slack thread as a conversation between users and the LLM. It uses the tools (built-in and those added by customers) to respond in the Slack thread.
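The coordinator loop can be pictured roughly as below. This is a minimal sketch, not Ratchet's actual code: the `LLM` interface, `Tool` struct, and `respond` function are hypothetical stand-ins for however the bot wires Slack threads, tool definitions, and OpenAI/Ollama together.

```go
package ratchet

import (
	"context"
	"fmt"
)

// Message is a simplified view of a stored Slack message.
type Message struct {
	Role string // "user" or "assistant"
	Text string
}

// Tool is a capability the bot exposes to the LLM (runbook lookup, channel
// reports, documentation search, ...).
type Tool struct {
	Name        string
	Description string
	Run         func(ctx context.Context, args string) (string, error)
}

// LLM is a hypothetical client interface; in practice this would be backed by
// OpenAI or Ollama. Chat returns either a final reply or a tool-call request.
type LLM interface {
	Chat(ctx context.Context, history []Message, tools []Tool) (reply, toolName, toolArgs string, err error)
}

// respond treats the Slack thread as a conversation: it keeps asking the LLM
// what to do next, running any requested tool and feeding the result back,
// until the LLM produces a final reply to post in the thread.
func respond(ctx context.Context, llm LLM, tools []Tool, thread []Message) (string, error) {
	byName := make(map[string]Tool, len(tools))
	for _, t := range tools {
		byName[t.Name] = t
	}
	for round := 0; round < 5; round++ { // cap the number of tool-call rounds
		reply, name, args, err := llm.Chat(ctx, thread, tools)
		if err != nil {
			return "", err
		}
		if name == "" {
			return reply, nil // no tool requested: this is the answer
		}
		tool, ok := byName[name]
		if !ok {
			return "", fmt.Errorf("unknown tool %q", name)
		}
		out, err := tool.Run(ctx, args)
		if err != nil {
			out = "tool error: " + err.Error()
		}
		thread = append(thread, Message{Role: "assistant", Text: out})
	}
	return "", fmt.Errorf("gave up after too many tool calls")
}
```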
Currently, Ratchet focuses on two aspects of toil:
### Operational toil (-ops channels)

To help deal with on-call issues, Ratchet can classify incoming messages as firing alerts and will try to post a helpful runbook based on historical triaging of that alert. Additionally, it can post a channel report with insights on where the team can focus next to reduce operational toil.
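As an illustration of the runbook step, here is a hedged sketch: once a message has been classified as a firing alert, the bot could fetch the runbook stored for that alert. The `runbooks` table, its columns, and the `lookupRunbook` function are assumptions made for this example, not Ratchet's actual schema.

```go
package ratchet

import (
	"context"
	"database/sql"
	"errors"
)

// lookupRunbook fetches the runbook distilled from how alertName was triaged
// in past threads. Table and column names are assumptions for this example.
func lookupRunbook(ctx context.Context, db *sql.DB, alertName string) (string, error) {
	var runbook string
	err := db.QueryRowContext(ctx,
		`SELECT attrs->>'runbook' FROM runbooks WHERE alert_name = $1`,
		alertName,
	).Scan(&runbook)
	if errors.Is(err, sql.ErrNoRows) {
		// No runbook yet; the bot could instead summarize past triage threads.
		return "", nil
	}
	return runbook, err
}
```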
### Documentation (-help channels)

Helping users use your systems/services well is a full-time job. Ratchet can integrate with your documentation sources and try to answer user questions. On top of that, depending on the source, it can also update your documentation based on conversations in Slack, which ensures documentation health keeps improving.
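A plausible shape for the retrieval half of this, assuming a pgvector-backed `doc_chunks` table and pgx (both assumptions, not Ratchet's actual schema): embed the user's question elsewhere, then rank documentation chunks by vector distance.

```go
package ratchet

import (
	"context"
	"fmt"
	"strings"

	"github.com/jackc/pgx/v5"
)

// relatedDocs returns the documentation chunks closest to a question
// embedding (computed elsewhere). The doc_chunks table and its columns are
// assumptions made for this example.
func relatedDocs(ctx context.Context, conn *pgx.Conn, questionEmbedding []float32) ([]string, error) {
	// pgvector accepts the text form "[0.1,0.2,...]" cast to vector, which
	// avoids depending on a specific vector codec.
	parts := make([]string, len(questionEmbedding))
	for i, v := range questionEmbedding {
		parts[i] = fmt.Sprintf("%g", v)
	}
	vec := "[" + strings.Join(parts, ",") + "]"

	rows, err := conn.Query(ctx,
		`SELECT content FROM doc_chunks ORDER BY embedding <=> $1::vector LIMIT 5`,
		vec)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var chunks []string
	for rows.Next() {
		var c string
		if err := rows.Scan(&c); err != nil {
			return nil, err
		}
		chunks = append(chunks, c)
	}
	return chunks, rows.Err()
}
```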
## Built with

| Tool | Purpose |
|------|---------|
| Production | |
| Go | Main programming language for the application |
| Slack | Platform for bot interaction and message handling |
| PostgreSQL | Primary database for storing messages, runbooks and embeddings |
| SQLc | Type-safe SQL query generation from schema |
| Riverqueue | Background job processing and scheduling |
| pgvector | Vector database for storing embeddings |
| Development | |
| Ollama | Local LLM inference server |
| Docker | Containerization and deployment |
| Github Actions | CI/CD pipeline automation |
| Cursor | IDE for writing code |
## Lessons Learned

- PostgreSQL as database, queue, and vector DB is working out great.
- Slack as a data source seems to be enough to derive value.
  - Though the Slack API is poorly documented and inconsistent.
- Investing in building a UI for visibility ended up wasting a lot of time.
  - Even with AI tools, it is hard for a backend engineer to get right.
  - Even after you figure out HTML/CSS/JS, dealing with security concerns and deploying to production is a pain.
  - A JSON API, on the other hand, is great. It just works, and you can post-process output with `jq` efficiently.
  - River queue and its UI are great though.
- For the database schema, instead of using individual columns for each attribute, using an `attrs` column as `jsonb` is great (see the sketch after this list).
  - SQLc support for custom types and automatic serialization/deserialization to `jsonb` is great.
- Given the focus of the bot is on AI, we could have saved time by:
  - Not focusing on non-AI features (like matching open/close incidents manually or building a UI).
  - Not aiming for perfect data collection, since AI copes well with imperfect data.
- On the AI front:
  - Ollama is great for local development.
  - The qwen2.5:7b model is fast and good enough for local development.
  - Cursor IDE is great for writing code.
  - Using paid models like Claude Sonnet to improve your own prompts does wonders.
  - Giving the LLM as much context as possible (like all historical messages instead of only new ones) helps.
- MCP, tools, agents:
  - Initially we classified each user message's intent and then ran a set of deterministic logic based on that.
  - Instead, just exposing all the functionality as tools and letting the AI figure out how to proceed is much more powerful.
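As a concrete (and hypothetical) example of the attrs-as-jsonb pattern referenced above: a struct that serializes itself to and from the `attrs` column could implement the standard Scanner/Valuer pair, and an sqlc column override pointing at such a type gives the automatic round trip mentioned in the list. Field names here are illustrative, not Ratchet's actual schema.

```go
package ratchet

import (
	"database/sql/driver"
	"encoding/json"
	"fmt"
)

// MessageAttrs holds per-message attributes in a single `attrs jsonb` column
// instead of one column per field. Field names are illustrative.
type MessageAttrs struct {
	AlertName string `json:"alert_name,omitempty"`
	Severity  string `json:"severity,omitempty"`
	BotName   string `json:"bot_name,omitempty"`
}

// Value serializes the struct to JSON on the way into PostgreSQL.
func (a MessageAttrs) Value() (driver.Value, error) {
	return json.Marshal(a)
}

// Scan deserializes the jsonb column back into the struct. Pointing an sqlc
// column override at this type makes the round trip automatic.
func (a *MessageAttrs) Scan(src any) error {
	switch v := src.(type) {
	case []byte:
		return json.Unmarshal(v, a)
	case string:
		return json.Unmarshal([]byte(v), a)
	default:
		return fmt.Errorf("unsupported type %T for MessageAttrs", src)
	}
}
```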
## Contributing

- To a Slack workspace where you are an admin, add an app using the manifest from `app-manifest.yaml`.
- Get access to a working Slack app/bot token and add it to the `.env` file (which is gitignored) as:

  ```
  RATCHET_SLACK_APP_TOKEN=xapp-...
  RATCHET_SLACK_BOT_TOKEN=xoxb-...
  RATCHET_CLASSIFIER_INCIDENT_CLASSIFICATION_BINARY=path/to/binary
  ```

- Install Docker (for PostgreSQL) and Ollama (for local LLM access).
- Start the binary using:

  ```
  go run ./cmd/ratchet --help
  go run ./cmd/ratchet
  ```

- Onboard a channel where the ratchet bot is added and then ask it to process a message. Remember, you will need to onboard again if you want to test it against new messages, since in dev mode we do not enable Slack Socket Mode.

  ```
  curl http://localhost:5001/api/channels/ratchet-test/onboard -X POST
  curl http://localhost:5001/api/commands/generate?channel_id=FAKECHANNELID\&ts=1750324493.161329
  ```
## OpenTelemetry (OTEL)

### Tracing

Traces are created with the OTEL SDK but are sent to both an OTEL endpoint and Sentry.
Currently, Sentry is not fully compatible with OTEL by design:

- Span attributes are not copied to searchable tags; they're copied to the `otel` context.
- Span events are not copied at all.
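A minimal sketch of that dual export, assuming the sentry-go OTel integration (`sentryotel`) and the OTLP/HTTP exporter; the function name and wiring are illustrative rather than Ratchet's actual setup.

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/getsentry/sentry-go"
	sentryotel "github.com/getsentry/sentry-go/otel"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// initTracing wires one tracer provider to two destinations: an OTLP/HTTP
// exporter (e.g. Tempo via OTEL_EXPORTER_OTLP_ENDPOINT) and Sentry via its
// OTel span processor. Sampling is 100%, matching the dev setup below.
func initTracing(ctx context.Context) (*sdktrace.TracerProvider, error) {
	if err := sentry.Init(sentry.ClientOptions{
		Dsn:              os.Getenv("SENTRY_DSN"),
		EnableTracing:    true,
		TracesSampleRate: 1.0,
	}); err != nil {
		return nil, err
	}

	// Honors OTEL_EXPORTER_OTLP_ENDPOINT from the environment.
	exporter, err := otlptracehttp.New(ctx)
	if err != nil {
		return nil, err
	}

	tp := sdktrace.NewTracerProvider(
		sdktrace.WithSampler(sdktrace.AlwaysSample()),
		sdktrace.WithBatcher(exporter),
		sdktrace.WithSpanProcessor(sentryotel.NewSentrySpanProcessor()),
	)
	otel.SetTracerProvider(tp)
	otel.SetTextMapPropagator(sentryotel.NewSentryPropagator())
	return tp, nil
}

func main() {
	tp, err := initTracing(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	defer func() { _ = tp.Shutdown(context.Background()) }()
	// ... start the bot ...
}
```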
### Development Environment

Traces are sampled at 100% in dev.

To export traces to Sentry, add the following to your `.envrc` file:

```
export SENTRY_DSN=<Project DSN URL>
```

To export traces to a Grafana Tempo OTEL endpoint, add the following to your `.envrc` file:

```
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```

To view traces, run Grafana and Tempo in Docker by following the instructions in https://github.com/grafana/tempo/blob/main/example/docker-compose/readme.md.
You can delete the k6-tracing load generator service from the `docker-compose.yml` file.
Then you can access Grafana at http://localhost:3000.