All Posts

Technical insights on AI systems, production ML, and research findings.

project, nlp, pytorch, transformers, gpt

GPT-2 from Scratch

An implementation of the GPT-2 language model from scratch, with a demonstration of text generation.

project, audio, pytorch, rave, music

Music Style Transfer with RAVE

Real-time music style transfer using RAVE (Realtime Audio Variational autoEncoder) for creative audio manipulation.

project, audio, pytorch, spleeter, ht-demucs

Spleeter-HT-Demucs Stem Separation

Audio stem separation using the Spleeter and HT-Demucs models to isolate vocals and instrumentals.

project, tts, ai, prosody, emotional-synthesis, computer-vision, deep-learning

Visually Guided Prosody and Emotional Speech Synthesis

A text-to-speech system that uses visual cues to generate natural prosody and emotional expression in synthesized speech, producing more human-like and contextually appropriate voice output.