Talk to ProposalForge in plain English — powered end-to-end by Cloudflare's Workers AI Neurons, Durable Objects, and the @cloudflare/voice SDK.
A Neuron is Cloudflare's unit of AI compute: the metering currency for Workers AI. Every transcription, every spoken reply, and every Workers AI model inference consumes Neurons. ProposalForge's voice assistant runs its real-time speech pipeline entirely on Workers AI at the edge, so there are no separate GPU servers to manage and the free tier covers everyday use.
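
To make Neuron metering concrete, here is a rough sketch that estimates one day's charge from a Neuron count. The 10,000-Neuron daily free allocation and the $0.011 per 1,000 Neurons rate are assumptions based on Cloudflare's published Workers AI pricing and should be checked against the current pricing page. The function name and the sample usage figures are hypothetical.

```typescript
// Assumed pricing constants; verify against Cloudflare's current Workers AI pricing.
const FREE_NEURONS_PER_DAY = 10_000;
const USD_PER_1000_NEURONS = 0.011;

// Estimate the billable cost in USD for one day of Neuron usage.
function estimateDailyCostUsd(neuronsUsed: number): number {
  if (neuronsUsed < 0) {
    throw new RangeError("neuronsUsed must be non-negative");
  }
  // Only usage above the daily free allocation is billable.
  const billable = Math.max(0, neuronsUsed - FREE_NEURONS_PER_DAY);
  return (billable / 1000) * USD_PER_1000_NEURONS;
}

// Hypothetical usage figures, for illustration only.
console.log(estimateDailyCostUsd(8_000)); // stays inside the free allocation
console.log(estimateDailyCostUsd(25_000).toFixed(3)); // 15,000 Neurons are billable
```

Under these assumed rates, a day that stays below the free allocation costs nothing, and only the excess is charged.
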
The 🎙️ voice button opens a real-time, full-duplex conversation with ProposalForge. Speech-to-text, the reasoning model, and text-to-speech are orchestrated by a stateful agent running on the edge — no phone numbers, no native app, just the browser microphone. Here is exactly what powers it:
| Layer | Cloudflare Technology | Role in Voice |
|---|---|---|
| 🧠 AI compute | Workers AI (Neurons) | Runs the speech + language models on Cloudflare's edge GPUs, metered in Neurons. |
| 🎤 Speech-to-Text | @cf/deepgram/flux | Continuous streaming STT with built-in turn detection; transcribes you as you speak. |
| 🔊 Text-to-Speech | @cf/deepgram/aura-1 | Converts the assistant's reply into natural spoken audio, streamed back to your browser. |
| 🗣️ Voice runtime | @cloudflare/voice SDK | Wires the microphone stream → STT → LLM → TTS pipeline and handles barge-in / interruptions. |
| 📌 Session state | Durable Objects | One VoiceAgent instance per call holds the live conversation, auth, and audio session. |
| 🧩 Reasoning (LLM) | Google Gemini → @cf/google/gemma-4-26b-a4b-it | Gemini is the primary model; Workers AI Gemma is the automatic fallback if Gemini is unavailable. |
| 🔧 Tools / data | MCP Server over HTTP | Lets the voice agent list, create, send, and PDF your proposals — securely scoped to your token. |
| 🌐 Edge routing | Workers (proxy) | A proxy Worker serves the assistant on a clean branded URL at the network edge. |
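
The Reasoning row in the table describes a primary model with an automatic fallback. A minimal sketch of that pattern follows, assuming each provider can be wrapped as a function from prompt to reply text. The `LlmProvider` type, `generateWithFallback`, and both stub providers are hypothetical stand-ins, not ProposalForge's actual code.

```typescript
// A provider takes a prompt and resolves to the model's reply text.
type LlmProvider = (prompt: string) => Promise<string>;

// Try each provider in order and return the first successful reply.
async function generateWithFallback(
  prompt: string,
  providers: LlmProvider[],
): Promise<string> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      return await provider(prompt);
    } catch (err) {
      // Record the failure and move on to the next provider.
      failures.push(String(err));
    }
  }
  throw new Error(`All LLM providers failed: ${failures.join("; ")}`);
}

// Hypothetical stubs: a failing primary (Gemini) and a working fallback (Gemma).
const geminiStub: LlmProvider = async () => {
  throw new Error("primary unavailable");
};
const gemmaStub: LlmProvider = async (prompt) => `gemma reply to: ${prompt}`;

generateWithFallback("Summarize my latest proposal", [geminiStub, gemmaStub])
  .then((reply) => console.log(reply));
```

Ordering the providers in an array keeps the fallback policy in one place, so adding a third provider would not change the calling code.
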
The whole pipeline is declared on a Durable Object. The transcriber and TTS bind directly to the Workers AI runtime via this.env.AI.
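
A minimal sketch of such a binding, under stated assumptions: `AiBinding` is a local stand-in for the Workers AI binding's `run(model, inputs)` method, `VoiceAgentSketch` is a hypothetical plain class rather than a real Durable Object, and the `{ audio }` and `{ text }` input shapes are guesses. The @cloudflare/voice SDK itself is not used here, since its API is not shown in this document.

```typescript
// Local stand-in for the Workers AI binding, which exposes run(model, inputs).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding;
}

// Model identifiers taken from the table above.
const STT_MODEL = "@cf/deepgram/flux";
const TTS_MODEL = "@cf/deepgram/aura-1";

// Hypothetical agent: one instance would hold the state for one call.
class VoiceAgentSketch {
  env: Env;
  spoken: string[] = []; // replies spoken so far in this call

  constructor(env: Env) {
    this.env = env;
  }

  // Assumed input shape { audio }; the real transcriber streams continuously.
  transcribe(audio: Uint8Array): Promise<unknown> {
    return this.env.AI.run(STT_MODEL, { audio: Array.from(audio) });
  }

  // Assumed input shape { text } for speech synthesis.
  speak(text: string): Promise<unknown> {
    this.spoken.push(text);
    return this.env.AI.run(TTS_MODEL, { text });
  }
}

// Usage with a stub binding that records which models were invoked.
const calls: string[] = [];
const stubEnv: Env = {
  AI: {
    run: async (model) => {
      calls.push(model);
      return null;
    },
  },
};
const agent = new VoiceAgentSketch(stubEnv);
agent.speak("Your proposal was sent.");
console.log(calls);
```

Both speech steps go through the same binding, so the agent needs no separate credentials or endpoints for STT and TTS.
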
Speech models run on Cloudflare's own network, close to the Worker, so there is no round-trip to a separate GPU cloud and replies arrive with minimal delay.
STT, TTS, and the Workers AI fallback model are all billed in Neurons. The free tier includes a daily Neuron allowance, enough for everyday voice use at zero cost.
Durable Objects keep each conversation alive with full context, so the assistant remembers what you said earlier in the call.
The agent can only touch your proposals — every MCP tool call carries your JWT, with no shared database binding.
If one model provider is unavailable, the agent automatically falls back to another so the conversation keeps going.
The entire voice stack is serverless and globally distributed — no GPU instances to provision, patch, or scale.
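
The token scoping described above can be illustrated with a sketch of how a tool call might be built. MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method. The Bearer scheme, the `buildToolCall` helper, the `list_proposals` tool name, and the example URL are all assumptions for illustration.

```typescript
// Everything needed to send one MCP tool call over HTTP.
interface ToolCallRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a JSON-RPC 2.0 "tools/call" request that carries the user's JWT.
function buildToolCall(
  serverUrl: string,
  jwt: string,
  toolName: string,
  args: Record<string, unknown>,
  id = 1,
): ToolCallRequest {
  if (!jwt) {
    throw new Error("a user token is required for every tool call");
  }
  return {
    url: serverUrl,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
      // Assumed scheme: the JWT travels as a Bearer token.
      Authorization: `Bearer ${jwt}`,
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id,
      method: "tools/call",
      params: { name: toolName, arguments: args },
    }),
  };
}

// Hypothetical server URL, token placeholder, and tool name.
const request = buildToolCall("https://example.com/mcp", "user-jwt", "list_proposals", {});
console.log(request.headers.Authorization);
// To send it: fetch(request.url, request)
```

Because the token is attached per request, the server can authorize each call against the caller's own proposals instead of relying on shared database access.
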
Sign in, tap the 🎙️ button, and ask it to draft, send, or summarize a proposal — out loud.