TranscriptAI

AI/ML SaaS Application

A full-featured AI transcription platform built in partnership with Pima Web Design. TranscriptAI converts audio and video files into accurate, speaker-labeled transcripts.

Technologies Used

Python, OpenAI Whisper, pyannote.audio, FFmpeg, librosa, noisereduce, PyTorch, React, TypeScript, Electron, Vite

Application Screenshots

Drag and drop upload with security controls

Model selection and speaker detection settings

Audio waveform with speaker-labeled transcript

Usage stats and account management

Project Goals

Build a production-grade transcription product that supports multiple media formats, speaker identification, fast processing, and export-ready outputs.

My Solution

I engineered a four-stage audio processing pipeline: FFmpeg converts incoming media to a common audio format, an enhancement step reduces noise for a cleaner signal, diarization maps each stretch of speech to a speaker, and Whisper produces the high-accuracy transcription.
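
To make two of these stages concrete, here is a minimal Python sketch, not the project's actual code. It assumes the conversion step targets 16 kHz mono WAV (a common input format for speech models) and that speaker mapping picks, for each transcript segment, the diarization turn with the largest time overlap. The function names `build_ffmpeg_command` and `assign_speakers`, the tuple layouts, and the `UNKNOWN` fallback label are all hypothetical.

```python
from typing import List, Tuple

Turn = Tuple[float, float, str]     # (start_sec, end_sec, speaker_label)
Segment = Tuple[float, float, str]  # (start_sec, end_sec, text)


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an FFmpeg command that extracts mono PCM audio from any input."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", src,                # input audio or video file
        "-vn",                    # drop any video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample (16 kHz is an assumption)
        "-c:a", "pcm_s16le",      # 16-bit PCM, the usual WAV payload
        dst,
    ]


def assign_speakers(segments: List[Segment], turns: List[Turn]) -> List[Tuple[str, str]]:
    """Label each transcript segment with the speaker whose turn overlaps it most."""
    labeled = []
    for seg_start, seg_end, text in segments:
        best_speaker = "UNKNOWN"  # hypothetical fallback when nothing overlaps
        best_overlap = 0.0
        for turn_start, turn_end, speaker in turns:
            # Length of the shared time window; negative means no overlap.
            overlap = min(seg_end, turn_end) - max(seg_start, turn_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labeled.append((best_speaker, text))
    return labeled
```

The command list can be passed to `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed. Choosing the speaker by maximum overlap is one simple policy; segments that straddle a speaker change would need word-level timestamps to split cleanly.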

Key Features

OpenAI Whisper transcription (tiny to large models)
Automatic speaker diarization (up to 10 speakers)
GPU acceleration with CUDA support
Audio enhancement and noise reduction
Support for 15+ audio/video formats
Auto-detect language or manual selection
Interactive transcript viewer with audio sync
AI-powered summary generation
Keyword extraction
Color-coded speaker labels in exports
DOCX export with timestamps
Batch upload for multiple files
Secure and encrypted file handling
Usage tracking and analytics dashboard
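
As an illustration of how speaker labels and timestamps might come together before export, the following Python sketch merges consecutive segments from the same speaker and renders each as a timestamped line. It is a hedged example only: the `[HH:MM:SS] SPEAKER: text` layout, the tuple shape, and the function names `format_timestamp` and `to_transcript_lines` are assumptions, and writing the lines into a DOCX file is left out.

```python
from typing import List, Tuple

LabeledSegment = Tuple[float, str, str]  # (start_sec, speaker_label, text)


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS, truncating fractional seconds."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def to_transcript_lines(segments: List[LabeledSegment]) -> List[str]:
    """Merge consecutive segments by the same speaker into one timestamped line."""
    merged: List[list] = []
    for start, speaker, text in segments:
        if merged and merged[-1][1] == speaker:
            # Same speaker keeps talking: extend the current line.
            merged[-1][2] += " " + text
        else:
            # New speaker: start a line stamped with this segment's start time.
            merged.append([start, speaker, text])
    return [f"[{format_timestamp(s)}] {spk}: {txt}" for s, spk, txt in merged]
```

Each merged line keeps the start time of its first segment, so a reader can jump from the text back to the matching point in the audio.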

Results & Impact

TranscriptAI achieved 98.7 percent average accuracy on clear audio, supported up to 10 speakers, and delivered usable outputs through an interface accessible to non-technical users.
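
The write-up does not say how the accuracy figure was measured. One common convention for transcription is to report 1 minus the word error rate (WER) against a human reference transcript, and the sketch below shows that calculation under this assumption. The function name `word_error_rate` and the lowercase, whitespace-based tokenization are illustrative choices, not the project's evaluation method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this convention, a transcript with one wrong word out of four has a WER of 0.25, which corresponds to 75 percent accuracy.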