# Talk

Talking with AI is a breeze.

## Highlighted Features
- Focus on voice-driven dialogues
- Broad range of service providers to choose from
- Modern and stylish user interface
- Unified, standalone binary
## How to use

Prepare a `talk.yaml` file. Here is a simple example using ChatGPT, Whisper and
ElevenLabs:

```yaml
speech-to-text:
  whisper: open-ai-01
llm:
  chat-gpt: open-ai-01
text-to-speech:
  elevenlabs: elevenlabs-01
# provide your confidential information below
creds:
  open-ai-01: "sk-2dwY1IAeEysbnDNuAKJDXofX1IAeEysbnDNuAKJDXofXF5"
  elevenlabs-01: "711sfpb9kk15sds8m4czuk5rozvp43a4"
```
How about Google Text-to-Speech and Google Speech-to-Text? No problem:
see talk.google.example.yaml.

For a full example, see talk.full.example.yaml.
## Docker

```shell
docker run -it -v ./talk.yaml:/etc/talk/talk.yaml -p 8000:8000 proxoar/talk
```
With proxy:

```shell
docker run -it -v ./talk.yaml:/etc/talk/talk.yaml \
  -e HTTPS_PROXY=http://localhost:7890 \
  -e HTTP_PROXY=http://localhost:7890 \
  -e ALL_PROXY=socks5://127.0.0.1:7891 \
  -p 8000:8000 proxoar/talk
```
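If you prefer Docker Compose, the same `docker run` flags translate into a Compose file like the sketch below. The service name is illustrative, and the proxy variables are only needed if you actually use a proxy:

```yaml
# docker-compose.yaml — mirrors the `docker run` command above
services:
  talk:
    image: proxoar/talk
    ports:
      - "8000:8000"
    volumes:
      - ./talk.yaml:/etc/talk/talk.yaml
    environment:                          # optional: only if you need a proxy
      HTTPS_PROXY: http://localhost:7890
      HTTP_PROXY: http://localhost:7890
      ALL_PROXY: socks5://127.0.0.1:7891
```

Start it with `docker compose up -d`.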
Refer to terraform for an example deployment. The same approach applies to Kubernetes.
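For reference, a minimal Kubernetes setup could mount `talk.yaml` from a ConfigMap. This is a sketch, not the project's Terraform output; the resource names (`talk`, `talk-config`) are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: talk                            # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels: { app: talk }
  template:
    metadata:
      labels: { app: talk }
    spec:
      containers:
        - name: talk
          image: proxoar/talk
          ports:
            - containerPort: 8000
          volumeMounts:
            - name: config
              mountPath: /etc/talk      # talk.yaml lands at /etc/talk/talk.yaml
      volumes:
        - name: config
          configMap:
            name: talk-config           # create with: kubectl create configmap talk-config --from-file=talk.yaml
```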
## From scratch

```shell
# clone projects
mkdir talk-projects && cd talk-projects
git clone git@github.com:proxoar/talk.git
git clone git@github.com:proxoar/talk-web.git

# build web assets with yarn
cd talk-web && ./script/build-and-copy-to-backend.sh

# build the Go binary
cd ../talk && go build cmd/talk/talk.go

# run with an explicit config
./talk --config ./talk.yaml

# or simply `./talk`, as it automatically locates talk.yaml in /etc/talk/ and ./talk.yaml
./talk
```
## Troubleshooting

**I can't start recording**

For security reasons, browsers block non-HTTPS websites from accessing your microphone.

Solutions:

- Run Talk behind a reverse proxy such as Nginx and set up TLS in Nginx.
- Open chrome://flags/ in Chrome, find `Insecure origins treated as secure`, and enable it for your Talk origin.

Rest assured, HTTPS support is on its way and will be implemented shortly.
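As a sketch of the reverse-proxy solution, a minimal Nginx server block might look like this. The hostname and certificate paths are placeholders; `proxy_buffering off` is included because Talk streams responses over SSE:

```nginx
server {
    listen 443 ssl;
    server_name talk.example.com;               # placeholder hostname

    ssl_certificate     /etc/ssl/talk.crt;      # placeholder certificate paths
    ssl_certificate_key /etc/ssl/talk.key;

    location / {
        proxy_pass http://127.0.0.1:8000;       # Talk listens on port 8000
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_http_version 1.1;
        proxy_buffering off;                    # keep SSE streaming responsive
        proxy_read_timeout 1h;                  # allow long-lived SSE connections
    }
}
```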
## Browser compatibility

|            | Arc | Chrome | Firefox | Edge | Safari |
|------------|-----|--------|---------|------|--------|
| Microphone | ✅  | ✅     | ✅      | ❌   | ❌     |
| UI         | ✅  | ✅     | ✅      | ✅   | ❌     |
## Q&A
Q: Why not use TypeScript for both the frontend and backend development?
A:
- When I embarked on this project, I was largely inspired by Hugh, a project
primarily coded in Python, supplemented with HTML and a touch of JavaScript. To broaden the horizons of text-to-speech
providers, I revamped the backend logic using Go, transforming it into a Go-based project.
- Crafting backend logic with Go feels incredibly intuitive—it distills everything down to a single binary.
- Moreover, my skills in frontend development were somewhat rudimentary at that time.
Q: Will a mobile browser-friendly version be made available?
A: Streamlining the website for mobile usage would be a time-intensive endeavour and, given my current time constraints,
it isn't the primary concern. As it stands, the site performs best on desktop browsers based on the Chromium
engine, with certain limitations on browsers such as Safari.
## Roadmap
- Google TTS
- Google STT
- OpenAI Whisper STT
- Setting language, speed, stability, etc.
- Choose voice
- Docker image
- Server-Sent Events (SSE)
- More LLMs other than ChatGPT
- Download and import text history
- Download chat MP3
- Prompt template
## Contributing

We're in the midst of a dynamic development stage for this project and warmly invite new contributors.
Contributing documentation will be ready soon.
## Credits

### Front-end
- React: The library for web and native user interfaces
- vite: Next generation frontend tooling. It's fast!
- valtio: Valtio makes proxy-state simple for React and Vanilla
- wavesurfer.js: Audio waveform player
- granim.js: Create fluid and interactive gradient animations with this small JavaScript library
### Back-end
- This project draws inspiration from Hugh, a remarkable tool that enables
seamless communication with AI using minimal code.
- go-openai: OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for
Go.
- echo: High performance, minimalist Go web framework
- elevenlabs-go: A Go API client library for the ElevenLabs speech synthesis platform
- r3labs/sse: Server-Sent Events server and client for Golang
We would also like to thank all other open-source projects and communities not listed here for their valuable
contributions to our project.