# HyperFrames — Full Content HTML to MP4 video rendering for developers and AI agents. Open-source Remotion alternative — write HTML, render deterministic, frame-perfect video from Node.js, Next.js or CI. Source: https://hyperframes.video Generated: 2026-05-23T19:18:11.334Z Index: https://hyperframes.video/llms.txt Sitemap: https://hyperframes.video/sitemap.xml This file is the complete prose of the HyperFrames docs and blog, concatenated in a single plain-text stream for LLM ingestion. Code is MIT-licensed; prose is CC BY 4.0. Attribution appreciated. --- ## Docs # Composition URL: https://hyperframes.video/docs/concepts/composition Description: A HyperFrames composition is plain HTML with a sprinkling of data attributes. The root carries dimensions, children carry timing. A composition is one HTML document. No framework, no JSX, no project file. The renderer reads it the way a browser does — then samples it deterministically frame by frame. ## What you'll learn - The difference between an HTML document and a HyperFrames composition - What raw markup looks like before and after timing is added - How to size and pace a composition with knobs ## Raw HTML vs composition The composition is the same markup you'd write for any web page — plus three or four `data-*` attributes that turn it into video. Drag the slider to compare.
[Interactive before/after demo: "Hello, friend." as plain HTML, then as a composition.]
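The demo's markup does not survive in this text version, so here is a minimal sketch of the difference, written as a small JavaScript helper that emits both strings. The attribute names (`data-width`, `data-height`, `data-start`, `data-duration`) come from these docs; the helper name `toComposition`, the `<h1>` element, and the specific values are assumptions for illustration only, not the demo's actual source.

```javascript
// Plain HTML: what you would write for any web page.
const plainHtml = '<h1>Hello, friend.</h1>';

// Hypothetical helper: wraps child markup in a composition root.
// The root carries dimensions; the child carries timing.
function toComposition(childTag, text, { width, height, start, duration }) {
  const child = `<${childTag} data-start="${start}" data-duration="${duration}">${text}</${childTag}>`;
  return `<div data-width="${width}" data-height="${height}">${child}</div>`;
}

const compositionHtml = toComposition('h1', 'Hello, friend.', {
  width: 1280, height: 720, start: 0, duration: 1.5,
});

console.log(plainHtml);
console.log(compositionHtml);
```

Because the root in this sketch omits `data-duration`, the total length would be inferred from the longest child track, here 1.5 seconds.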
## The composition root

The first element with both `data-width` and `data-height` is the composition root. Anything outside it is ignored by the renderer, which is useful for hiding scaffolding, dev-only overlays, or notes.

If you omit `data-duration` on the root, the total length is inferred from the longest child track.

## Children are tracks

Every direct child of the root with `data-start` or `data-duration` becomes a *track*. A track can be a media element, a `<div>`, a custom element, or a `<canvas>` you draw into yourself. Tracks may overlap in time; layering uses source order plus `z-index`.

## Try the dimensions
Demo knobs (the preview displays `{{$W}} × {{$H}}`):

| Knob | Type | Default | Min | Max |
|---|---|---|---|---|
| `W` | number | 1280 | 320 | 1920 |
| `H` | number | 720 | 240 | 1080 |
| `DUR` | number | 1.5 | 0.2 | 5 |

## Why not React?

Because every model, every templating engine, and every CMS already speaks HTML. A composition you can write in three lines of Vim is a composition an LLM can generate in one prompt, a templating system can serve from a server response, and a junior dev can read at 2am.

React adds a build step, a runtime, and a hydration model that the renderer has to either reimplement or ignore. We chose ignore.

That said, if you like JSX, render to HTML and feed *that* to HyperFrames. The renderer is happy with whatever HTML you can produce.

## Next

- [Data attributes](/docs/concepts/data-attributes): the full reference table
- [Timing & tracks](/docs/concepts/timing-and-tracks): how the seek clock works

---

# Data attributes

URL: https://hyperframes.video/docs/concepts/data-attributes
Description: Every HyperFrames data-* attribute, what it does, and what it looks like in isolation. The quick-jump reference.

HyperFrames reads timing and layout from HTML `data-*` attributes. They are inert in a normal browser and stripped from the captured DOM, so they never leak into the rendered pixels.

## What you'll learn

- The full attribute table for roots, tracks, and media
- What `data-fade` and `data-loop` look like on their own
- Where to find the JSON schema that backs all of this

## Composition root

| Attribute | Required | Default | Description |
|---|---|---|---|
| `data-width` | yes | none | Output width in pixels. |
| `data-height` | yes | none | Output height in pixels. |
| `data-duration` | no | inferred | Total duration in seconds. If omitted, derived from longest child track. |
| `data-fps` | no | `60` | Output frame rate. |
| `data-bg` | no | `#000` | Background color when transparent encoding is off. |

## Tracks

| Attribute | Description |
|---|---|
| `data-start` | Seconds from composition start. |
| `data-duration` | How long this track is on stage. |
| `data-end` | Alternative to `data-duration`. |
| `data-loop` | If present, the track loops within its window. |
| `data-fade` | `"in"`, `"out"`, or `"both"`; applies a 200 ms fade. |
| `data-track` | Optional name for the inspector. |

## Media

| Attribute | Applies to | Description |
|---|---|---|
| `data-volume` | `
`}, {label: "Poll data", lang: "json", code: `{ "question": "Should we ship on Friday?", "total": 1284, "options": [ { "label": "Yes, ship it", "percent": 46, "color": "green", "winner": true }, { "label": "Wait til Monday", "percent": 28, "color": "signal" }, { "label": "Ship on Thursday", "percent": 18, "color": "ink" }, { "label": "Other", "percent": 8, "color": "blue" } ] }`} ]} /> This is the pattern for [programmatic video from data](/docs/recipes/programmatic-video-from-data) — one HTML template, N rows of JSON, N MP4s. A daily standup poll, a Twitter poll archive, a customer survey roll-up — same template, different data. ## Render to MP4 Save the polished version as `poll.html`, then: ```bash hyperframes render poll.html --out clip.mp4 --duration 6 --fps 30 ``` Six seconds at 30 fps gives you 2.8s of bar race, a 400ms beat, the 600ms ribbon sweep, and a 2.2s hold on the final state. That hold is critical — short videos on social autoplay loop instantly, and a clean final frame is what gets screenshotted. See the [quickstart](/docs/getting-started/quickstart) for installation and the [playground](/playground) to iterate on the HTML interactively before committing it to a batch. ## FAQ ### How do I generate one MP4 per poll automatically? Build an HTML template with placeholders, then loop over your poll dataset and write one HTML file per poll. Run `hyperframes render` against each. The [programmatic video recipe](/docs/recipes/programmatic-video-from-data) walks through the full pipeline including filename templating and parallel rendering. ### Can I show more than four options? Yes — the layout is grid-based, so adding rows scales vertically. Above six options the card gets tall enough that the eye stops reading top-to-bottom and starts scanning. If you have eight options, either bucket the bottom four into "Other" or break into two columns. ### Why not use Chart.js or D3? You can. 
For a chart this constrained (one axis, a fixed number of rows, brand colors), a library adds 80kb to render four divs. The HTML version is 60 lines, has no dependencies, and is trivial to drive from `hf-seek`. Libraries become worth it around the complexity of stacked area charts or live legends.

### How do I match my brand colors?

The example uses HyperFrames brand tokens. Replace `#2b66ff`, `#1f8a5b`, `#ff3b1f`, `#0a0a0a` with your palette. The variable knobs above let you preview a single accent color; for a full brand swap, replace all four and the cream background `#f6f5f1`.

### Can the bars race at different speeds?

Yes, but it usually looks worse. The visual contract is "all options finish at the same moment, but their final lengths differ." Staggered finishes feel like a competition, not a result. If you want emphasis, hold the bars for 400ms and *then* slide in the winner ribbon: that's a result reveal, not a race.

## Related

- [Animated bar chart tutorial](/blog/animated-bar-chart-tutorial): vertical variant, same easing
- [Animated comparison tables](/blog/animated-comparison-table): when the data is 2x2, not 4x1
- [Easing curves cheatsheet](/blog/easing-curves-cheatsheet): why cubic ease-out is the right call here

---

# Podcast audiograms with animated waveforms

URL: https://hyperframes.video/blog/animated-podcast-audiogram
Published: 2026-05-21T12:00:00.000Z
Tags: podcast, audiogram, waveform, animation, mp4
Author: hf-team

A podcast audiogram generator is what turns an audio-only medium into a scrollable social asset. The pattern is settled: an episode title on top, a square cover or color block in the middle, captions in sync underneath, and a 32-bar waveform somewhere prominent. The waveform is what makes the eye stop: it implies audio, and the implication is enough to turn a static post into a video moment.

The polished version below renders the waveform from a single deterministic function.
No audio file, no FFT, no library — each bar's height is a phase-offset sine wave with a per-bar amplitude envelope. The result reads as "audio playing" without ever touching audio.
Episode 47
Why deterministic rendering matters
## Why synthetic beats sampled

The first instinct when building an audiogram is to extract the audio's actual waveform: FFT the file, get amplitude buckets per frame, drive the bars. This works, but it is the wrong tool for the use case.

The viewer cannot hear the audio. Most social feeds autoplay muted. The visual job of the waveform is to "imply audio happening", not to "represent specific audio." A synthetic waveform with believable motion does that job and has three operational advantages:

1. **No audio pipeline.** You don't need ffmpeg, an audio decoder, or the original audio file at render time.
2. **Deterministic by construction.** A function of `t` is byte-reproducible; a sampled FFT depends on decoder version, sample rate, and bucketing.
3. **You control the dynamics.** Real podcast audio has long quiet stretches and short loud bursts, which produce visually unreadable bar patterns. Synthetic dynamics stay in the readable range.

## The synthetic waveform function

Each bar's height comes from three components: a phase based on its index, a slow envelope that breathes across the bar array, and two faster sine waves that wobble the height.

```javascript
bars.forEach((bar, i) => {
  const phase = i * 0.42;                                  // per-bar phase offset
  const env = 0.55 + 0.45 * Math.sin(i * 0.31 + t * 0.7);  // slow envelope
  const wob = Math.sin(t * 6.0 + phase) * 0.5
            + Math.sin(t * 11.0 + phase * 2) * 0.3;        // two-rate wobble
  const h = 12 + Math.max(0, env + wob) * 42;              // height in px
  bar.style.height = `${h}px`;
});
```

Three things make this read as "audio":

- **The phase offset** (`i * 0.42`): adjacent bars don't move in lockstep. Without phase offsets, all 32 bars move identically and the eye reads it as one big block bouncing, not as a waveform.
- **The envelope**: the slow `sin(i * 0.31 + t * 0.7)` creates a "shape" across the bar array that drifts over time. Loud sections cluster, then move.
- **The two-frequency wobble**: the `t * 6.0` and `t * 11.0` terms are angular rates of 6 and 11 rad/s (roughly 0.95 Hz and 1.75 Hz), and their beating gives the bars a non-periodic feel. A single sine looks too clean; two loosely related frequencies look like noise.
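To check the determinism claim outside a browser, the same formula can be pulled out into a pure function. This is only a sketch: `barHeight`, `frameHeights`, and `BAR_COUNT` are names introduced here for illustration, and the constants are copied from the snippet above.

```javascript
// Hypothetical pure version of the waveform formula: no DOM, no audio.
const BAR_COUNT = 32;

function barHeight(i, t) {
  const phase = i * 0.42;                                  // per-bar phase offset
  const env = 0.55 + 0.45 * Math.sin(i * 0.31 + t * 0.7);  // slow envelope
  const wob = Math.sin(t * 6.0 + phase) * 0.5
            + Math.sin(t * 11.0 + phase * 2) * 0.3;        // two-rate wobble
  return 12 + Math.max(0, env + wob) * 42;                 // height in px
}

// Heights for every bar at time t (seconds).
function frameHeights(t) {
  return Array.from({ length: BAR_COUNT }, (_, i) => barHeight(i, t));
}

// Same t in, same heights out, regardless of what was computed in between.
const a = frameHeights(2.5);
frameHeights(0.1); // "seek" somewhere else
const b = frameHeights(2.5);
console.log(a.every((h, i) => h === b[i]));
```

Because `env` is at most 1.0 and `wob` is at most 0.8, heights stay between 12px and 87.6px, which is the "readable range" the synthetic approach is meant to guarantee.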
## Caption sync

Captions don't need lip-sync precision because the audio is muted. They need *cadence* sync: appear in the rough region where the speaker would be saying them, hold for a readable beat, then swap.

```javascript
const captions = [
  { t: 0.2, text: "The web platform already knows how to render." },
  { t: 1.7, text: "What it can't do is render the same way twice." },
  { t: 3.3, text: "That's the problem deterministic rendering solves." },
  { t: 4.8, text: "Same HTML, same MP4, every time." }
];

// Pick the last caption whose start time has passed.
// Assumes `captions` is sorted by ascending `t`.
let current = captions[0].text;
for (const c of captions) if (t >= c.t) current = c.text;
cap.textContent = current; // plain text, so no HTML parsing is needed
```

This is a step function: the caption is whichever entry has the largest `t` value not exceeding the current frame time. No fade between captions; a hard cut reads as "next line of dialog" and matches how people parse captioned video.

Aim for 1.5 to 2.0 seconds per caption and 6 to 10 words each. Shorter captions get lost; longer ones can't be read in the time they are on screen. If your transcript naturally has long sentences, break them into clauses at natural phrase boundaries ("the platform / already knows / how to render").

## Tweak the audiogram
Demo knobs (the card title renders `{{title}}`):

| Knob | Type | Default |
|---|---|---|
| `barColor` | color | `#ff3b1f` |
| `title` | text | Why deterministic rendering matters |
| `intensity` | number | 1.0 |

## Data shape for an episode

For a podcast publishing weekly, the pipeline is: produce the episode, ASR-transcribe the chosen 30-second clip, group the ASR output into roughly 4-6 captions, run the template once per clip, post the MP4. The waveform is fully synthetic, so it's not part of the data; only the episode metadata and captions are.
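As a rough sketch of that data shape, the snippet below fills `{{placeholder}}` slots in a template string from one episode object. The placeholder names `episode_number` and `episode_title` are the ones the template uses; the `fillTemplate` helper, the `captions` field name, the `<h2>`/`<h1>` markup, and the sample values are assumptions made for illustration and are not part of HyperFrames.

```javascript
// Hypothetical helper: replace {{key}} slots with values from `data`.
// Unknown keys are left untouched so missing data is easy to spot.
function fillTemplate(template, data) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match
  );
}

// One object per clip: episode metadata plus captions (no waveform data).
const episode = {
  episode_number: 47,
  episode_title: 'Why deterministic rendering matters',
  captions: [
    { t: 0.2, text: 'The web platform already knows how to render.' },
    { t: 1.7, text: "What it can't do is render the same way twice." },
  ],
};

const template = '<h2>Episode {{episode_number}}</h2><h1>{{episode_title}}</h1>';
console.log(fillTemplate(template, episode));
```

The sketch does no HTML escaping; a real pipeline fed by ASR output would need to escape values before substituting them.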
Episode {{episode_number}}
{{episode_title}}
This matches the [programmatic video from data](/docs/recipes/programmatic-video-from-data) pipeline. Loop over your episode list, write one HTML file per clip, render each. For a back-catalog of 100 episodes, a single CI job produces 100 MP4s in parallel; see [batch personalized videos from CSV](/blog/batch-personalized-videos-from-csv) for the orchestration pattern.

## Render to MP4

```bash
hyperframes render audiogram.html --out clip.mp4 --duration 6 --fps 30
```

For Twitter/X and LinkedIn feeds, square 1080×1080 is the default. For Reels and TikTok, switch to 1080×1920 and increase the waveform height so it carries the vertical canvas. For Spotify Canvas (the 8-second looping artwork some podcasts use), set `--duration 8`.

The [quickstart](/docs/getting-started/quickstart) covers installation and the full CLI surface, and [deterministic rendering](/docs/concepts/deterministic-rendering) explains why every render of this template is byte-identical.

## FAQ

### Can I drive the waveform from the actual audio?

Yes, but you have to do it ahead of time. Extract amplitude buckets server-side (ffmpeg + a 32-bin FFT works), store them as an array in the JSON, and have the seek listener look up `buckets[Math.floor(t * fps)]` per bar. The visual difference from synthetic is subtle, and the operational cost is high; most teams stick with synthetic for a year before deciding it doesn't matter.

### How long should the clip be?

Six to eight seconds for a feed-native autoplay, 30-60 seconds for a dedicated "watch this clip" share. Below 6 seconds the captions don't have room to breathe; above 60 seconds the format becomes a transcript reader and the waveform is just decoration.

### Should I include audio in the MP4?

If you have the audio for the captioned clip, yes: `hyperframes render` accepts an `--audio` flag that overlays a track on the rendered video.
The waveform is still synthetic and visual-only; the audio plays underneath for the percentage of viewers who tap to unmute. ### What's the right bar count? 24-32 bars for a square format, 36-48 for a wide format. Fewer than 20 bars look chunky and read as "loading dots" rather than waveform. More than 50 bars become a fuzz at typical render resolutions — the individual bars stop being visible. ### Why a gradient cover instead of the show artwork? The example uses a gradient because it's a generic template. For your actual show, swap in the cover art as a base64 data URI or a local file referenced from the HTML. Avoid hot-linking to a remote URL — the render pipeline is deterministic only if all assets resolve identically every time. ## Related - [Animated quote cards for Twitter](/blog/animated-quote-cards-twitter) — same caption-on-card grammar for text-only quotes - [Burn subtitles into MP4](/blog/burn-subtitles-into-mp4) — the technique for full-show captions, not just clips - [Animated KPI cards](/blog/animated-kpi-stat-cards) — count-up patterns for "listen count" overlays --- # Animated job listing videos for LinkedIn URL: https://hyperframes.video/blog/animated-job-listing-video Published: 2026-05-21T12:00:00.000Z Tags: recruiting, linkedin, job-listing, animation, mp4 Author: marcus-okafor A job listing video is the LinkedIn-native format that turns a static posting into a 6-second clip: the company logo lands, the role title fades up, requirements stagger in one-by-one, and a pulsing "Apply" button parks at the bottom. Done well, it doubles the click-through over a plain-text post because the requirements appear in *sequence* rather than as a wall of bullets, which is how the eye prefers to consume a list. The polished version below runs entirely off `hf-seek`. Five requirements, five distinct fade-up beats, a final pulse.

Senior Frontend Engineer

Hyperframes · Remote
Requirements
  • 5+ years building production React apps
  • Strong CSS and animation fundamentals
  • Experience with Next.js App Router
  • Comfortable with TypeScript strict mode
  • Care for craft over surface area
Apply Now
## The recruiting case for video

Static job posts have a fundamental UX problem on social: they ask the reader to scan and decide in one motion. Video breaks the decision into a sequence: see the logo, read the title, parse the requirements one at a time, end on a clear action. The eye does not need to scan because the post scans itself.

LinkedIn's own data shows video posts get 3-5x the engagement of text posts in the recruiting category. A 6-second auto-looping clip costs the same to produce as a polished image (less, with a templating pipeline) and outperforms in feed.

## The beat sheet

Six seconds, eight beats:

| Time | Beat | Duration |
|---|---|---|
| 0.1s | Logo drops in | 500ms |
| 0.5s | Title fades up | 500ms |
| 0.7s | Company line fades up | 500ms |
| 1.0s | "Requirements" label appears | 400ms |
| 1.2-2.4s | Bullets stagger (5 bullets, 180ms apart, 500ms each) | 1.2s |
| 3.2s | Apply button enters | 500ms |
| 4.0s | Pulse begins | infinite |
| 6.0s | Loop point | n/a |

The pulse continuing past the loop is intentional: when the MP4 autoplays again, the pulse picks back up immediately from the start, which feels like continuous motion across the loop boundary.

## Driving every beat from a single seek listener

The whole animation is a function of `t`. There's one helper, `fade(el, t, start, dur, offset)`, and every element calls it with different parameters.

```javascript
// Fade and slide one element: progress p runs 0..1 over [start, start + dur].
function fade(el, t, start, dur, offset) {
  const p = Math.max(0, Math.min(1, (t - start) / dur));
  const e = 1 - Math.pow(1 - p, 3); // cubic ease-out
  el.style.opacity = e;
  el.style.transform = `translateY(${offset - offset * e}px)`; // offset -> 0
}

addEventListener('hf-seek', (event) => {
  const t = event.detail.time;
  fade(logo, t, 0.1, 0.5, -12);  // drops from above
  fade(title, t, 0.5, 0.5, 8);   // rises from below
  fade(company, t, 0.7, 0.5, 8);
  bullets.forEach((li, i) => fade(li, t, 1.2 + i * 0.18, 0.5, 8));
});
```

This is the pattern for every staggered reveal in HyperFrames.
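The timing can be checked without a DOM by pulling the math out of `fade` into a pure function. In this sketch, `fadeState` and `bulletStarts` are names introduced for illustration; the constants mirror the listener above.

```javascript
// Hypothetical DOM-free version of fade(): returns the style values
// instead of writing them to an element.
function fadeState(t, start, dur, offset) {
  const p = Math.max(0, Math.min(1, (t - start) / dur)); // linear progress 0..1
  const e = 1 - Math.pow(1 - p, 3);                      // cubic ease-out
  return { opacity: e, translateY: offset - offset * e };
}

// Start time of each of the five bullets: 1.2s plus 180ms per index.
const bulletStarts = Array.from({ length: 5 }, (_, i) => 1.2 + i * 0.18);

// The result depends only on t, so seek order does not matter.
const late = fadeState(2.0, 1.2, 0.5, 8);  // first bullet, fully shown
const early = fadeState(1.0, 1.2, 0.5, 8); // first bullet, not started yet
console.log(bulletStarts.map((s) => s.toFixed(2)), early, late);
```

With these numbers the last bullet starts at 1.92s and, given its 500ms fade, settles at 2.42s.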
The seek listener does not track state; it computes the visual state from scratch every frame. Seek to any `t` in any order and you get a deterministic result. That's what [deterministic rendering](/docs/concepts/deterministic-rendering) buys you.

## The pulse

The Apply button pulses with `Math.sin((t - 4.0) * Math.PI * 2) * 0.025 + 1`, a 1Hz sinusoid producing scale values between 0.975 and 1.025. That's a 5% peak-to-peak amplitude, just enough to read as "alive" without being distracting.

```javascript
if (t > 4.0) {
  // 1Hz sine, scale oscillates between 0.975 and 1.025
  const pulse = Math.sin((t - 4.0) * Math.PI * 2) * 0.025 + 1;
  btn.style.transform = `scale(${pulse})`;
}
```

Two mistakes to avoid:

1. **Don't pulse the opacity.** A fading-in-and-out button reads as "loading," not "ready." Pulse scale only.
2. **Don't go above 1.05.** A 10% peak-to-peak swing (0.95 to 1.05) is the threshold where pulse becomes thump. You're nudging the eye, not slapping it.

## Customize for your role

Demo knobs for the `{{role}}` card:

| Knob | Type | Default |
|---|---|---|
| `brand` | color | `#2b66ff` |
| `initial` | text | H |
| `role` | text | Senior Frontend Engineer |

## Data shape for batch rendering

For a recruiting team posting 20 roles a week, the template takes one JSON object per role and emits one MP4. The pipeline scales linearly with role count, and the output filenames map to job IDs for ATS integration.
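A minimal sketch of that mapping, assuming each role object carries a `job_id` field (a hypothetical name) alongside the `brand`, `initial`, and `role` values used by the knobs. It only builds file names and `hyperframes render` command lines, reusing the `--out`, `--duration`, and `--fps` flags from the render commands shown elsewhere in these posts; the sample records and the `renderJob` helper are made up for illustration.

```javascript
// Hypothetical role records; `job_id` is an assumed field name.
const roles = [
  { job_id: 'fe-1042', brand: '#2b66ff', initial: 'H', role: 'Senior Frontend Engineer' },
  { job_id: 'be-2077', brand: '#1f8a5b', initial: 'H', role: 'Staff Backend Engineer' },
];

// Map one role to its per-role HTML file, output MP4, and CLI command.
function renderJob(role) {
  const html = `${role.job_id}.html`;
  const out = `${role.job_id}.mp4`;
  const command = `hyperframes render ${html} --out ${out} --duration 6 --fps 30`;
  return { html, out, command };
}

const jobs = roles.map(renderJob);
jobs.forEach((job) => console.log(job.command));
```

Writing each per-role HTML file and actually running the commands is left to the surrounding script; keying both file names on the job ID is what lets an ATS match each MP4 back to its posting.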