Updated 8 days ago

Persona

Providing an AI Language learning companion that helps you gain new perspectives on the languages you are learning and their cultural contexts.

Members 2

Team info →

youtu.be/i0L6Ytlljoc

github.com/lxyhan/Persona-UofT-Hacks-12

medium.com/@mattparvaneh

Persona transforms language learning by combining advanced computer vision techniques, neural networks, and real-time 3D animation into an AI tutoring agent that truly understands you. Our system simultaneously processes facial expressions for emotional engagement, tracks precise lip movements for pronunciation feedback, and generates fluid, lip-synced 3D animations - all in real-time. Through natural conversations with our intelligent avatar, users explore language from multiple perspectives, receiving instant, personalized guidance that adapts to their unique learning style. Built in 36 hours, Persona demonstrates how cutting-edge AI can create a more intuitive, comprehensive approach to language mastery.

Creative Technical Architecture:

Real-Time Computer Vision Pipeline:

Continuous facial analysis using deep learning models
Advanced facial landmark detection for precise pronunciation tracking
Emotion recognition neural networks for engagement monitoring
Multi-threaded processing for simultaneous feature extraction

Dynamic 3D Animation System:

Real-time rigging and animation using Mixamo and Blender
Live lip-sync generation through Rhubarb phoneme detection
Custom animation blending for fluid character movement
Synchronized facial expression mapping to avatar

Natural Language Processing and Speech Generation

WhisperAPI for real-time speech-to-text processing
ElevenLabs for dynamic voice generation
Claude-LLM-powered conversation engine
Parallel processing of multiple AI models simultaneously

System Integration: Our microservices architecture orchestrates multiple complex processes in parallel:

Real-time video processing and facial analysis
Dynamic 3D character animation and rendering
Speech processing and generation
LLM-based conversation management
Synchronized audio-visual output generation

Built in 36 hours, Persona demonstrates the potential of combining cutting-edge technologies in computer vision, 3D graphics, and AI. The system maintains fluid, natural interactions while processing multiple real-time data streams - from facial expression analysis to pronunciation feedback - creating an unprecedented language learning experience that adapts to each user's needs instantly.