Voice AI Therapist Training Simulator
Developed comprehensive prompt and evaluation frameworks for a Voice AI-based virtual patient simulator designed to train therapists in realistic clinical scenarios. This project transformed a brittle proof-of-concept into a production-ready, publicly launched platform by introducing scalable architectures for patient avatar creation, state management, and performance evaluation.

Project Highlights
The Challenge
The development team was building on the ElevenLabs agent architecture, which significantly limited scalability and resilience. Their proof-of-concept was brittle and prone to breaking during demos, which cast doubt on its production readiness. The core challenge wasn't general technical capability: the team excelled at front-end and back-end development. They had simply never worked with voice AI agents before, so they were learning everything from scratch and needed an expert to accelerate development and guide them toward a production-grade architecture. The business needed to move from an unstable demo to a publicly launchable product that could scale reliably.
The Approach
As the subject matter expert on Voice AI, I joined the team as an individual contributor focused on building out the prompt and evaluation architecture. For the prompt architecture, I combined elements of game design with modular, extensible prompt creation principles to create a 23-unit dynamic framework. For evaluation, I implemented industry best practices for Voice AI in a 7-section framework customized for therapeutic training. I introduced LiveKit and Langfuse to overcome the limitations of the ElevenLabs agent architecture, enabling proper state management and patient progress tracking. My AI workflow used Claude Code and Gemini for ideation, documentation, and rapid prototyping, which let me set up a working agent in a single afternoon for validation before team handoff.
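To illustrate the modular idea, here is a minimal Python sketch of how a patient prompt might be assembled from independent units. It is not the project's actual framework: the `PromptUnit` class, the `build_patient_prompt` function, the unit names, and the state keys are all hypothetical, and only four units are shown rather than 23.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptUnit:
    """One self-contained block of a patient prompt (hypothetical structure)."""
    name: str
    template: str          # may contain {placeholders} filled from session state
    enabled: bool = True   # units can be switched off per avatar


def build_patient_prompt(units: list[PromptUnit], state: dict[str, str]) -> str:
    """Render the enabled units in order and join them into one prompt."""
    sections = []
    for unit in units:
        if not unit.enabled:
            continue
        sections.append(f"## {unit.name}\n{unit.template.format(**state)}")
    return "\n\n".join(sections)


# Illustrative units only; the real unit set is not described in this write-up.
units = [
    PromptUnit("Identity", "You are {patient_name}, a therapy patient."),
    PromptUnit("Presenting concern", "You are seeking help for {concern}."),
    PromptUnit("Session state", "Your current trust in the therapist is {trust_level}."),
    PromptUnit("Debug notes", "Internal only.", enabled=False),
]

state = {
    "patient_name": "Alex",
    "concern": "work-related anxiety",
    "trust_level": "low",
}

prompt = build_patient_prompt(units, state)
print(prompt)
```

Keeping each unit independent means a new avatar can be defined by swapping or disabling units, and the session state can change between turns without rewriting the prompt by hand.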
Learning & Outcomes
Delivered a working product ready for public launch in December 2025, followed by commercial production in Q1 2026. Created a 23-unit dynamic patient prompt architecture enabling scalable, precise avatar creation, and a 7-section evaluation framework ensuring quality control. The frameworks fundamentally changed team operations: avatar creation became extensible and scalable, state management became reliable, and systematic performance tracking eliminated the demo-breaking issues of the original proof-of-concept.
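As a rough illustration of how a sectioned evaluation framework can support systematic tracking, the Python sketch below aggregates per-section scores into a weighted overall score and flags sections under a threshold. The section names, weights, threshold, and function names are assumptions made for this example; they are not the seven sections of the actual framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionScore:
    """Score for one evaluation section (hypothetical structure)."""
    section: str
    score: float   # normalized to the range 0.0 to 1.0
    weight: float  # relative importance of this section


def overall_score(scores: list[SectionScore]) -> float:
    """Return the weighted average of all section scores."""
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(s.score * s.weight for s in scores) / total_weight


def failing_sections(scores: list[SectionScore], threshold: float = 0.7) -> list[str]:
    """Return the names of sections scoring below the threshold."""
    return [s.section for s in scores if s.score < threshold]


# Illustrative sections only; three are shown for brevity.
session_scores = [
    SectionScore("Response latency", 0.9, 1.0),
    SectionScore("Persona consistency", 0.6, 2.0),
    SectionScore("Clinical realism", 0.8, 1.0),
]

print(overall_score(session_scores))
print(failing_sections(session_scores))
```

Scoring each section separately makes regressions visible per dimension instead of hiding them inside a single pass/fail judgment on a demo.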