Generate App Store preview videos from HTML
Build an App Store preview video in HTML — phone-frame entrance, paginated screenshots, cross-faded captions — and render to deterministic MP4 at every required size.
An App Store preview video is a 15-to-30-second silent loop that auto-plays on your App Store listing, showing the app in motion before a user taps. The expensive way is to film a screen recording and edit it in After Effects every time the UI changes. The cheap, reproducible way is to model the preview as HTML — a phone frame, a stack of screenshots, a caption track — and render the MP4 from hf-seek-driven animation.
This post walks through that template: a phone-frame that slides up from the bottom, three screenshots that paginate inside it with a soft horizontal swipe, and captions that cross-fade above each frame.
The format Apple actually wants
Apple's App Store preview spec is rigid: H.264 (or HEVC), specific resolutions per device family (886×1920 for 6.5" iPhones, 1080×1920 for 5.5" iPhones, 1200×1600 for iPads, and so on), up to 30 fps, and 15–30 seconds. The file has to match the device family's resolution exactly; Apple does not letterbox.
That's a forcing function. You can't shoot one master and crop. Every device family is a separate render. The HTML-driven pipeline shines here because you change one CSS variable (the viewport size) and re-render the same animation at the new aspect.
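As a rough illustration of "same animation, new aspect", the phone's box can be derived from the viewport rather than hard-coded. This is a minimal sketch under assumed layout rules: the caption band height, the side margin, and the 9:19.5 phone aspect are all hypothetical values, not part of the template described here.

```javascript
// Hypothetical layout rule: captions take the top band of the frame,
// and the phone fills the rest while keeping a fixed aspect ratio.
function phoneBox(viewW, viewH, captionFraction = 0.22, phoneAspect = 9 / 19.5, margin = 0.06) {
  const availW = viewW * (1 - 2 * margin);      // width left after side margins
  const availH = viewH * (1 - captionFraction); // height left under the captions
  // Take the largest phone that fits both constraints.
  const height = Math.min(availH, availW / phoneAspect);
  const width = height * phoneAspect;
  return {
    width,
    height,
    left: (viewW - width) / 2, // centered horizontally
    top: viewH - height,       // resting flush with the bottom edge (assumption)
  };
}

// The same rule applied to a tall iPhone frame and a squarer iPad frame.
for (const [w, h] of [[886, 1920], [1200, 1600]]) {
  console.log(`${w}x${h}`, phoneBox(w, h));
}
```

The resulting values could be written into CSS custom properties before rendering, so the animation code itself never mentions a pixel size.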
Why a phone frame at all
The phone frame is a signpost. The viewer's eye is scanning a grid of apps; the phone-shaped silhouette inside your preview tells them "this is software running on your device" before they parse a single word of copy. Drop the frame and the preview reads as a brand video, which is the wrong genre.
The phone enters from the bottom because the App Store layout puts your screenshots below the title — the eye is already moving downward, and an upward slide meets that motion. An entrance from the side fights the reading direction.
The three-panel pagination
The screen inside the phone is a horizontal reel — three panels at 100% width each, joined edge-to-edge, with the whole reel three times the width of the screen. To page from screen 1 to screen 2, translate the reel by -33.333%. To page to screen 3, translate by -66.666%. The transitions are short (0.2s) ease-in-out swipes between panels; the dwell on each panel is longer (1.4s) so the caption can be read.
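// '.reel' is an assumed selector for the three-panel strip inside the phone screen.
const reel = document.querySelector('.reel');
// Cubic ease-in-out: maps progress u in [0, 1] to an eased value in [0, 1].
const easeInOut = (u) => (u < 0.5 ? 4 * u ** 3 : 1 - (-2 * u + 2) ** 3 / 2);
addEventListener('hf-seek', (e) => {
  const t = e.detail.time; // seek time in seconds
  let x; // reel position in panels: 0, 1, or 2, fractional mid-swipe
  if (t < 2.4) x = 0;
  else if (t < 2.6) x = easeInOut((t - 2.4) / 0.2); // swipe 1 -> 2
  else if (t < 4.0) x = 1;
  else if (t < 4.2) x = 1 + easeInOut((t - 4.0) / 0.2); // swipe 2 -> 3
  else x = 2;
  // One panel is a third of the reel's own width.
  reel.style.transform = `translateX(${-x * (100 / 3)}%)`;
});
The page-dots at the bottom flip between discrete states: no fade, no slide. They're the same kind of indicator the user has seen ten thousand times in onboarding carousels, and the recognition does the work.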
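The discrete dot flip can be sketched as a pure function of time. This assumes each dot changes at the midpoint of its swipe (2.5s and 4.1s), and the `.dot` selector and `is-active` class are hypothetical names.

```javascript
// Index of the lit dot at time t (seconds). No interpolation:
// the dot jumps at the midpoint of each 0.2s swipe (an assumption).
function activeDot(t) {
  if (t < 2.5) return 0;
  if (t < 4.1) return 1;
  return 2;
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== 'undefined') {
  const dots = document.querySelectorAll('.dot'); // hypothetical selector
  addEventListener('hf-seek', (e) => {
    const active = activeDot(e.detail.time);
    dots.forEach((dot, i) => dot.classList.toggle('is-active', i === active));
  });
}
```

Because the state depends only on the seek time, seeking to the same timestamp always lights the same dot.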
Captions: cross-fade, never slide
Captions sit above the phone, not on top of the screenshot. The captions cross-fade between panels — caption 1 fades out as caption 2 fades in, with a 0.2s overlap. Sliding captions left and right at the same time as the screen swipes creates a visual collision; the eye can't track both motions and the message blurs.
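A cross-fade with a 0.2s overlap can be expressed as one opacity per caption, computed from the seek time. The sketch below assumes the fades start with the swipes (2.4s and 4.0s) and uses a linear ramp; the `.caption` selector is hypothetical.

```javascript
const FADE_STARTS = [2.4, 4.0]; // assumed to line up with the panel swipes
const FADE_SECONDS = 0.2;

const clamp01 = (u) => Math.min(1, Math.max(0, u));

// Returns [o1, o2, o3]: the opacity of each caption at time t (seconds).
// During a fade the outgoing and incoming captions always sum to 1.
function captionOpacities(t) {
  const p1 = clamp01((t - FADE_STARTS[0]) / FADE_SECONDS); // progress of fade 1 -> 2
  const p2 = clamp01((t - FADE_STARTS[1]) / FADE_SECONDS); // progress of fade 2 -> 3
  return [1 - p1, p1 - p2, p2];
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== 'undefined') {
  const captions = document.querySelectorAll('.caption'); // hypothetical selector
  addEventListener('hf-seek', (e) => {
    const opacities = captionOpacities(e.detail.time);
    captions.forEach((el, i) => { el.style.opacity = String(opacities[i]); });
  });
}
```

Only opacity changes, so the captions never move while the reel is swiping underneath them.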
Determinism is the entire game
Apple's review process is slow, so every resubmission is expensive. If your preview video is non-deterministic, meaning that rendering it twice produces visually different MP4s, then every fix forces you to re-check every locale, because any of those renders may have drifted. Make the animation hf-seek-driven, pin every easing curve, and the next render is byte-identical to the last unless you changed the source.
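One way to check the byte-identical claim on your own renders is to hash two outputs of the same source and compare them. This is a small Node.js sketch; the function names and file names are illustrative only.

```javascript
const { createHash } = require('node:crypto');
const { readFileSync } = require('node:fs');

// SHA-256 of a buffer, as a hex string.
function sha256(bytes) {
  return createHash('sha256').update(bytes).digest('hex');
}

// True when both files contain exactly the same bytes.
function sameRender(pathA, pathB) {
  return sha256(readFileSync(pathA)) === sha256(readFileSync(pathB));
}

// Example usage, after rendering the same source twice:
//   sameRender('run-1.mp4', 'run-2.mp4')
```

A check like this fits in CI: if the hash changes without a source change, something in the pipeline is not pinned.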
This is the argument made at length in the deterministic video manifesto — for App Store assets in particular, it pays for itself the first time you change a caption and don't have to QA fourteen locales.
Render to MP4
hyperframes render input.html --out clip.mp4 --duration 6 --fps 30
For the App Store-specific sizes, render the same source at each required viewport:
# 6.5" iPhone (886x1920)
hyperframes render preview.html --out preview-6_5.mp4 \
--duration 15 --fps 30 --width 886 --height 1920
# 5.5" iPhone (1080x1920)
hyperframes render preview.html --out preview-5_5.mp4 \
--duration 15 --fps 30 --width 1080 --height 1920
# 12.9" iPad (1200x1600)
hyperframes render preview.html --out preview-12_9.mp4 \
--duration 15 --fps 30 --width 1200 --height 1600
Common mistakes
Filming a screen recording and calling it a preview. Screen recordings have variable frame rates, compression artifacts from the OS recorder, and a UI that goes stale the minute you ship a new build. The HTML preview tracks your design system — when the button color changes in your app, you change one CSS variable and re-render.
Stuffing in more than three screens. Treat 15 seconds as the budget for a first-time viewer. Three screens at five seconds each is the right pace. Five screens at three seconds each is a slideshow, and the viewer doesn't read fast enough.
Adding voiceover. The preview auto-plays muted. Apple is explicit about this. Any animation that depends on audio cues will land badly. Design for silence; let the captions carry the narrative.
For more on programmatic asset pipelines see batch personalized videos from CSV and the broader pattern in programmatic video generation in Node.js.
FAQ
Does Apple require a specific codec?
H.264 High Profile or HEVC, in an M4V or MP4 container, 30 fps. HyperFrames outputs H.264 by default, which Apple accepts in every device family. HEVC saves bandwidth but is slower to encode; for a 15-second preview the savings are negligible.
What if the app UI is more complex than three screens?
It usually isn't — once you cut. The preview is not a feature tour. Pick the three moments that describe the app to a stranger and ignore everything else. If you can't pick three, you don't have a story; you have a backlog.
Can the preview match my marketing site exactly?
Yes — that's the point of building it in HTML. Share a stylesheet between the marketing site and the preview template, drive both from the same design tokens, and the preview becomes an extension of the site rather than a separate asset.
How long should the slide-up entrance be?
0.8 to 1.0 seconds with a cubic ease-out. Shorter feels jumpy; longer feels lazy. The phone should be fully in place before the first screen's caption appears.
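The entrance described above can be sketched as a function from seek time to vertical offset. It assumes a 0.9s duration (inside the suggested range) and expresses the offset as a percentage of the phone's own height; the `.phone` selector is hypothetical.

```javascript
// Cubic ease-out: fast start, gentle landing. Input is clamped to [0, 1].
function easeOutCubic(u) {
  const c = Math.min(1, Math.max(0, u));
  return 1 - Math.pow(1 - c, 3);
}

const ENTRANCE_SECONDS = 0.9; // assumed value within the 0.8-1.0s range

// Offset of the phone at time t (seconds), in percent of its own height:
// 100 means shifted down by a full phone height, 0 means at rest.
function phoneOffsetPercent(t) {
  return 100 * (1 - easeOutCubic(t / ENTRANCE_SECONDS));
}

// Browser wiring, skipped when there is no DOM.
if (typeof document !== 'undefined') {
  const phone = document.querySelector('.phone'); // hypothetical selector
  addEventListener('hf-seek', (e) => {
    phone.style.transform = `translateY(${phoneOffsetPercent(e.detail.time)}%)`;
  });
}
```

After 0.9s the offset stays at 0, so the first caption can safely start at any later timestamp.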
What about Google Play?
Google Play accepts a YouTube URL rather than a direct MP4 upload, but the constraints are similar: 30 seconds max, 1920×1080 or 1080×1920, no audio narration required. The same HTML template renders for both stores by changing only the output dimensions.
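Since only the output dimensions change between targets, the render commands can be generated from one table of sizes. This sketch reuses the CLI flags shown earlier in this post; the output file naming and the choice of a landscape size for Google Play are assumptions.

```javascript
// Output sizes taken from this post; edit the list to match your targets.
const SIZES = [
  { width: 886, height: 1920 },
  { width: 1080, height: 1920 },
  { width: 1200, height: 1600 },
  { width: 1920, height: 1080 }, // landscape, e.g. for the Google Play upload
];

// Argument list for one render, using the flags from the examples above.
function renderArgs(source, { width, height }, seconds = 15, fps = 30) {
  return [
    'render', source,
    '--out', `preview-${width}x${height}.mp4`, // hypothetical naming scheme
    '--duration', String(seconds),
    '--fps', String(fps),
    '--width', String(width),
    '--height', String(height),
  ];
}

// Print one shell command per size.
const commands = SIZES.map((size) =>
  ['hyperframes', ...renderArgs('preview.html', size)].join(' ')
);
console.log(commands.join('\n'));
```

Keeping the sizes in one list means a new device family is a one-line change rather than another copied command.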
Close
The App Store preview is a high-stakes asset that gets rebuilt every release if you film it and rebuilt every quarter if you template it. Pick the second option. Model the phone in HTML, drive the animation with hf-seek, render every required size from one source, and the next time your designer changes the primary color you ship the new previews in under five minutes.
Start at the quickstart or jump straight into the playground.
Cite this post
@misc{tanaka2026generate,
author = {Kira Tanaka},
title = {Generate App Store preview videos from HTML},
year = {2026},
url = {https://hyperframes.video/blog/app-store-preview-video},
note = {HyperFrames blog}
}
Kira Tanaka. (2026, May 21). Generate App Store preview videos from HTML. HyperFrames. https://hyperframes.video/blog/app-store-preview-video
[Generate App Store preview videos from HTML](https://hyperframes.video/blog/app-store-preview-video) — Kira Tanaka, 2026
Kira works on the render core: headless Chromium scheduling, frame capture, and the encoder pipeline. She cares about reproducible builds and small numbers next to the word "variance."