AutoClip Research

Data Studies on Viral Clips and Shorts

AutoClip runs original research on clip virality patterns using production data from our AI pipeline. Every study draws from real clips — no survey data, no simulations. The findings are public and free to cite.

We process clips across YouTube and Twitch channels for thousands of clippers. That volume produces patterns worth publishing. This hub collects them.

Published Studies

86% of all signal matches are scene cuts or energy peaks

Methodology: 175 complete clips from AutoClip's production pipeline. Corpus-level aggregates only — no user data. Scoring via Gemini 2.5 Flash across six signal types.

Topics: Virality signals · Clip duration · Signal distribution · Score analysis
Read full study

More studies in progress. We publish findings when the corpus is large enough to be meaningful — typically 100+ clips per analysis.

“We built this because we kept hearing that virality is unpredictable. It's not. Scene cuts and energy peaks account for 86% of the signal in our corpus. Once you know that, you can choose source video differently. You can prioritize clips with visible edit rhythm. The data changes how you clip.”

How AutoClip research works

All studies use AutoClip's production pipeline data — clips processed by real clippers, not synthetic test runs. Before any analysis, we keep only records with status: complete, which excludes failed jobs, in-progress jobs, and internal test clips.
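As a minimal sketch of that filter step (the record fields here are illustrative, not AutoClip's actual schema):

```python
# Hypothetical record shape; AutoClip's real pipeline schema may differ.
records = [
    {"id": 1, "status": "complete", "score": 72},
    {"id": 2, "status": "failed", "score": None},
    {"id": 3, "status": "in_progress", "score": None},
    {"id": 4, "status": "complete", "score": 88},
]

# Keep only finished jobs. This drops failed and in-progress jobs;
# internal test clips are assumed never to reach "complete" status.
analysis_set = [r for r in records if r["status"] == "complete"]

print(len(analysis_set))  # 2
```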

Anonymization: We publish corpus-level aggregates. No study includes individual clip URLs, channel names, user identifiers, or any content that could be traced back to a specific clipper or creator. The data is counts and distributions, not records.
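"Counts and distributions, not records" can be sketched like this — the published artifact is an aggregate table, with no per-clip identifiers retained (signal labels here are taken from the study's six signal types; the sample values are made up):

```python
from collections import Counter

# Hypothetical per-clip signal labels. No URLs, channel names,
# or user identifiers ever enter the aggregation.
signal_matches = ["scene_cut", "energy_peak", "scene_cut", "laughter", "scene_cut"]

distribution = Counter(signal_matches)
total = sum(distribution.values())

# What gets published: shares per signal type, nothing traceable to a clip.
shares = {signal: count / total for signal, count in distribution.items()}
print(shares["scene_cut"])  # 0.6
```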

Sample sizes: We don't publish findings until the sample is large enough to be meaningful. The virality signals study used 175 clips, drawn from the 2,000 most recently processed records at the time of analysis. As the corpus grows, we'll revisit earlier findings with larger samples.

Scoring model: AutoClip's AI pipeline uses Gemini 2.5 Flash for virality scoring. The model analyzes video transcripts and six audio/visual signal types (scene cuts, energy peaks, speech rate changes, music swells, laughter, applause). The composite score (0-100) reflects hook strength, emotional impact, pacing, quotability, and visual appeal — each scored 0-20.
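The composite arithmetic described above — five dimensions, each 0-20, summing to a 0-100 score — can be written out as a small sketch (the function name and argument names are assumptions, not AutoClip's internal API):

```python
def composite_score(hook, emotional_impact, pacing, quotability, visual_appeal):
    """Sum five 0-20 component scores into a 0-100 composite.

    Hypothetical helper mirroring the study's description; the actual
    scoring happens inside the Gemini 2.5 Flash pipeline.
    """
    components = [hook, emotional_impact, pacing, quotability, visual_appeal]
    if not all(0 <= c <= 20 for c in components):
        raise ValueError("each component must be scored 0-20")
    return sum(components)

print(composite_score(18, 15, 12, 14, 16))  # 75
```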

License: All AutoClip research is published under CC BY 4.0. You can cite, reproduce, or adapt the findings with attribution.

Journalists: cite AutoClip data

If you're writing about viral clip patterns, short-form video performance, or AI in content creation, the findings on this page are free to cite under CC BY 4.0. Attribution: “AutoClip research, autoclip.dev/research.”

For data questions, press inquiries, or early access to upcoming studies, email george.teifel@gmail.com.