personal_project
experiment
Realtime Voice Interview Agent
A realtime AI that interviews you out loud, just like a real hiring manager would.
// what problem this solves
Voice AI demos make realtime agents look simple, but production systems usually fail at orchestration. Agents connect but cannot hear the user, tools fire while the assistant is still introducing itself, and orphaned processes stay alive in the background burning money and confusing sessions. The hard part is not getting a demo to speak once. It is making the whole lifecycle behave reliably under real conditions.
// what I built
I built a production-hardened orchestration layer around Gemini 2.5 and LiveKit for a realtime voice interview agent. Instead of stopping at the happy path, I designed the system to handle the full session lifecycle end to end: secure token issuance, room setup, atomic agent dispatch, synchronized audio startup, and state-aware tool execution. The result was a voice agent that behaved like a product, not a fragile demo.
// how it works
The key architectural shift was moving away from manual dispatch and adopting an atomic auto-dispatch pattern through LiveKit room configuration embedded in the JWT. That kept the agent and user in sync from the moment the room was created. I also solved the classic audio-blindness issue by reversing the expected startup sequence: the agent joins first to establish the audio pipeline, then identifies the user through JWT identity once the session is live. On top of that, I added a Greeting Guard that blocks tool execution until the assistant has finished speaking and is actively listening, which prevents race conditions and bad writes at the start of a call.
// result
- 99.9% reliable audio startup by eliminating random subscription and timing failures
- 100% server-side credential generation with Firebase-backed verification and zero token exposure
- Near-zero dispatch failure rate through atomic room creation and agent startup
- Zombie-free Cloud Run lifecycle with proper health-server binding and clean shutdown behavior
- State-aware tool execution that prevents race conditions during the first seconds of a call
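The state-aware tool execution in the last bullet comes from the Greeting Guard. A minimal sketch of such a gate could look like the following. Every name here (`GreetingGuard`, `ToolBlockedError`, `save_answer`) is hypothetical, and the sketch assumes the surrounding agent framework can report when the greeting has finished and when the agent is listening. It also assumes the gate should close again whenever the agent stops listening, which may be stricter than the real implementation.

```python
from typing import Any, Callable


class ToolBlockedError(RuntimeError):
    """Raised when a tool call arrives before the guard has opened."""


class GreetingGuard:
    """Blocks tool calls until the greeting is done and the agent is listening."""

    def __init__(self) -> None:
        self._greeting_finished = False
        self._listening = False

    def mark_greeting_finished(self) -> None:
        # Call this from whatever hook signals that the greeting audio ended.
        self._greeting_finished = True

    def set_listening(self, listening: bool) -> None:
        # Call this whenever the agent enters or leaves its listening state.
        self._listening = listening

    @property
    def is_open(self) -> bool:
        return self._greeting_finished and self._listening

    def run_tool(self, tool: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Run the tool only when the guard is open; otherwise refuse loudly."""
        if not self.is_open:
            raise ToolBlockedError("tool call rejected: agent is not listening yet")
        return tool(*args, **kwargs)


def save_answer(store: dict, question: str, answer: str) -> dict:
    """Hypothetical tool that records an interview answer."""
    store[question] = answer
    return store


if __name__ == "__main__":
    guard = GreetingGuard()
    answers: dict = {}
    try:
        guard.run_tool(save_answer, answers, "intro", "too early")
    except ToolBlockedError as exc:
        print(exc)
    guard.mark_greeting_finished()
    guard.set_listening(True)
    guard.run_tool(save_answer, answers, "intro", "I build voice agents")
    print(answers)
```

Raising an error instead of silently dropping the call keeps early tool attempts visible in logs, which makes startup race conditions easier to spot.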
// the stack