BhashaBlend
AI-powered video dubbing and subtitle generation platform for regional languages.
Overview
BhashaBlend is a video localization platform that automatically generates dubbing and subtitles in Marathi and Hindi for English videos. We built it during the SIH '23 hackathon.
The system extracts speech from videos, converts it into text using Whisper, translates and aligns subtitles, and reintegrates multilingual captions into the original video. This improves accessibility and expands content reach for regional language audiences.
Architecture
BhashaBlend follows a modular media processing pipeline:
- User uploads an English video
- Audio is extracted using FFmpeg
- Speech is transcribed using Whisper
- Subtitles are segmented and aligned
- Text is translated into Marathi and Hindi
- Multilingual subtitles are generated
- Subtitles are embedded back into the video
- Final dubbed and subtitled video is delivered
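Because each stage is separate, it can be tuned or replaced on its own. Subtitle timing is derived from the transcription timestamps, which keeps captions synchronized with the speech.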
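As an illustration of the audio extraction step, here is a minimal sketch of how the FFmpeg command could be assembled. The function names, the `.wav` naming rule, and the choice of mono 16 kHz PCM are assumptions made for this example, not the project's confirmed settings.

```python
from __future__ import annotations

from pathlib import Path


def build_extract_audio_cmd(video_path: str, wav_path: str) -> list[str]:
    """Return an FFmpeg command that extracts a mono 16 kHz WAV track.

    The sample rate and channel count are assumed values chosen to suit
    speech recognition; adjust them to match the real pipeline.
    """
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", video_path,     # input video
        "-vn",                # drop the video stream
        "-ac", "1",           # downmix to mono
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # encode as 16-bit PCM (WAV)
        wav_path,
    ]


def default_wav_path(video_path: str) -> str:
    """Hypothetical naming rule: same location, .wav extension."""
    return str(Path(video_path).with_suffix(".wav"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed.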
Key Features
Speech-to-Text Pipeline
- High-accuracy transcription using Whisper
- Noise reduction and preprocessing
- Timestamp-based subtitle segmentation
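To show what timestamp-based segmentation might look like in practice, the sketch below turns transcription segments into SubRip (SRT) text. It assumes each segment is a dict with `start`, `end` (in seconds), and `text` keys, which is the shape of the `segments` list returned by the open-source `whisper` package; the helper names are hypothetical.

```python
from __future__ import annotations


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues, numbered from 1."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()  # transcribed text may carry a leading space
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

Writing the returned string to a `.srt` file with UTF-8 encoding keeps Devanagari text intact once the segments have been translated.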
Multilingual Subtitle Generation
- English to Marathi and Hindi translation
- Context-aware sentence alignment
- Support for dual-language subtitles
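One possible way to produce dual-language subtitles is to stack two translations inside each cue. The sketch below assumes both segment lists are already aligned one-to-one and share the timing of the first list; `merge_dual` is a hypothetical helper, and real alignment may need more than index matching.

```python
from __future__ import annotations


def merge_dual(primary: list[dict], secondary: list[dict]) -> list[dict]:
    """Combine two aligned segment lists into two-line cues.

    Timing is taken from `primary`; the `secondary` text is placed on
    the line below it.
    """
    if len(primary) != len(secondary):
        raise ValueError("segment lists must be aligned one-to-one")
    merged = []
    for first, second in zip(primary, secondary):
        merged.append({
            "start": first["start"],
            "end": first["end"],
            "text": f"{first['text'].strip()}\n{second['text'].strip()}",
        })
    return merged
```

The merged segments keep the same `start`/`end`/`text` shape, so they can be written out by whatever routine produces single-language subtitle files.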
Video Processing
- Audio extraction and re-muxing
- Subtitle embedding using FFmpeg
- Output in standard MP4 format
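The embedding step could be sketched as follows. This example assumes soft (selectable) subtitle tracks muxed into the MP4 with FFmpeg's `mov_text` codec rather than burned-in captions; the function name and the file names in the usage note are hypothetical.

```python
from __future__ import annotations


def build_mux_cmd(
    video_path: str,
    subtitle_tracks: list[tuple[str, str]],
    output_path: str,
) -> list[str]:
    """Return an FFmpeg command that adds subtitle tracks to an MP4.

    `subtitle_tracks` holds (srt_path, language_code) pairs, where the
    code is an ISO 639-2 tag such as "hin" or "mar".
    """
    cmd = ["ffmpeg", "-y", "-i", video_path]
    for srt_path, _ in subtitle_tracks:
        cmd += ["-i", srt_path]

    # Keep the original video and (if present) audio streams.
    cmd += ["-map", "0:v", "-map", "0:a?"]
    # Input 0 is the video, so subtitle inputs start at index 1.
    for idx in range(len(subtitle_tracks)):
        cmd += ["-map", f"{idx + 1}:0"]

    # Copy audio and video without re-encoding; MP4 stores text
    # subtitles as mov_text.
    cmd += ["-c:v", "copy", "-c:a", "copy", "-c:s", "mov_text"]
    for idx, (_, lang) in enumerate(subtitle_tracks):
        cmd += [f"-metadata:s:s:{idx}", f"language={lang}"]

    cmd.append(output_path)
    return cmd
```

For example, `build_mux_cmd("talk.mp4", [("talk.hi.srt", "hin"), ("talk.mr.srt", "mar")], "talk.subbed.mp4")` yields a command with one Hindi and one Marathi track. Burning subtitles into the picture would instead require re-encoding the video with a subtitle filter.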
