Glossary
Auto-Captioning
Auto-captioning is the automatic generation of text captions overlaid on video using speech-to-text technology.
Captions are essential for short-form video: over 80% of social media videos are watched without sound. Auto-captioning uses speech-to-text (STT) technology to transcribe spoken words and display them as animated text overlays, with each word shown at the moment it is spoken.
AutoClip uses Deepgram for speech-to-text transcription with word-level timestamps, ensuring captions are precisely synchronized with speech. Multiple caption styles are available, from clean minimal text to trendy animated styles.
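To make word-level timing concrete, here is a minimal Python sketch of how timestamped words might be grouped into short caption cues. It is an illustration, not AutoClip's actual implementation: the `word`/`start`/`end` dictionary shape, the `group_words` helper, and the `max_words` and `max_gap` thresholds are all assumptions chosen for the example, and the sample data is hand-written rather than real transcription output.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    text: str     # words shown together on screen
    start: float  # seconds, taken from the first word in the cue
    end: float    # seconds, taken from the last word in the cue


def _to_cue(chunk):
    """Collapse a list of word dicts into a single cue."""
    return Cue(
        text=" ".join(w["word"] for w in chunk),
        start=chunk[0]["start"],
        end=chunk[-1]["end"],
    )


def group_words(words, max_words=3, max_gap=0.6):
    """Group timestamped words into cues (hypothetical helper).

    A new cue starts when the current one already holds max_words words,
    or when the pause before the next word exceeds max_gap seconds.
    """
    cues, current = [], []
    for w in words:
        too_long = len(current) >= max_words
        long_pause = bool(current) and w["start"] - current[-1]["end"] > max_gap
        if current and (too_long or long_pause):
            cues.append(_to_cue(current))
            current = []
        current.append(w)
    if current:
        cues.append(_to_cue(current))
    return cues


# Hand-written sample data; the field names are an assumed shape.
sample_words = [
    {"word": "captions", "start": 0.00, "end": 0.42},
    {"word": "keep", "start": 0.45, "end": 0.70},
    {"word": "viewers", "start": 0.72, "end": 1.10},
    {"word": "watching", "start": 1.12, "end": 1.60},
    {"word": "even", "start": 2.50, "end": 2.75},
    {"word": "on", "start": 2.78, "end": 2.90},
    {"word": "mute", "start": 2.92, "end": 3.30},
]

cues = group_words(sample_words)
for cue in cues:
    print(f"{cue.start:5.2f} to {cue.end:5.2f}  {cue.text}")
```

Because each cue inherits its start and end times directly from the words it contains, the on-screen text stays aligned with the speech even when the speaker pauses.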
Frequently Asked Questions
Why are captions important for short-form video?
Over 80% of social media videos are watched on mute. Captions ensure your content is accessible and engaging even without sound.
Can I customize caption styles in AutoClip?
Yes, AutoClip offers multiple caption styles. Pro and Scale plans include additional caption customization options.
Are AutoClip's captions burned into the video permanently?
Yes. AutoClip burns captions directly into the video frames rather than attaching them as a separate subtitle track. This ensures captions display correctly on every platform (TikTok, Reels, Shorts) without compatibility issues.
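For readers curious what burning in looks like in practice, the Python sketch below shows one common approach: write the cues to an SRT file, then have ffmpeg render them onto the frames with its `subtitles` filter. This is an assumption for illustration only; the source does not say which tools AutoClip uses, and the helper names (`srt_timestamp`, `to_srt`, `burn_in_command`) and file names are hypothetical.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render (text, start, end) tuples as SRT text, numbered from 1."""
    blocks = []
    for index, (text, start, end) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


def burn_in_command(video_in, srt_path, video_out):
    """Build an ffmpeg argument list that draws the subtitles onto the video.

    The subtitles filter re-encodes the video with the text rendered into
    each frame; the audio stream is copied unchanged. Paths containing
    special characters would need extra escaping, which this sketch omits.
    """
    return [
        "ffmpeg", "-i", video_in,
        "-vf", f"subtitles={srt_path}",
        "-c:a", "copy",
        video_out,
    ]


# Hypothetical cues and file names, for illustration only.
example_cues = [
    ("captions keep viewers", 0.0, 1.1),
    ("even on mute", 2.5, 3.3),
]
srt_text = to_srt(example_cues)
command = burn_in_command("clip.mp4", "captions.srt", "clip_captioned.mp4")
print(srt_text)
print(" ".join(command))
```

The command list can be passed to `subprocess.run` once the SRT text has been saved to `captions.srt`. The resulting file carries no separate subtitle track, so the text appears wherever the video plays.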
Put Auto-Captioning to Work
AutoClip handles the full pipeline: viral moment detection, 9:16 reframing, captions, and auto-posting. Start clipping for free.