Selected work

Projects

Production systems, research, and technical explorations spanning conversational AI, speech synthesis, and audio processing.

In Production

Flowent — Real-Time Voice Agent for Language Immersion

A production voice agent with streaming bidirectional audio, sub-second turn latency, and multi-turn speech interaction, grounded in immersion methodology from cognitive science.

Live on the App Store and Google Play, with streaming TTS.

voice-agent, real-time, tts, language-learning

Demo Details

react-native-tts-kit

On-device text-to-speech for React Native — local inference, no server round-trip, private by default.

on-device, react-native, tts, mobile

Demo Source Details

Research

Visual-Guided Prosody & Emotional TTS

A text-to-speech system that conditions prosody and emotional expression on visual cues, producing more natural, context-aware synthesized speech.

tts, prosody, emotional-synthesis, computer-vision

Details

Automated TTS Dataset Generation

An AI-driven workflow that generates high-quality TTS training datasets, easing the data bottleneck for expressive speech models.

tts, datasets, automation, python

Details

Other

GPT-2 from Scratch

A GPT-2 language model implemented from scratch in PyTorch, trained to generate Shakespeare-style text.

nlp, pytorch, transformers, gpt

Demo Details

Music Style Transfer with RAVE

Real-time music style transfer using RAVE (Realtime Audio Variational autoEncoder) for creative audio manipulation.

audio, pytorch, rave, music

Demo Details

Stem Separation (Spleeter + HT-Demucs)

Audio stem separation using Spleeter and HT-Demucs for high-quality vocal and instrumental isolation.

audio, pytorch, spleeter, ht-demucs

Demo Details

Evaluating & Improving Chain-of-Thought Reasoning

A framework for evaluating and improving Chain-of-Thought reasoning in LLMs, using causal mediation analysis and faithfulness measurement.

llms, cot, causal-inference, python

Source Details