# HeyGen

HeyGen is an AI video generation platform that turns scripts, photos, and text into finished video, without cameras, crews, or editing skills. Photorealistic AI avatars with natural lip-sync are the headline feature, but the product surface now spans text-to-video, photo-to-video, voice cloning, AI dubbing, and translation into 175+ languages and dialects. Enterprise customers such as HubSpot, Workday, JPMorgan, and Intel use it for training, marketing, sales outreach, and global content localization. The company is based in San Francisco and offers free and paid tiers plus enterprise pricing.

## Capabilities

- **AI avatars.** Photorealistic digital humans, "photo avatars" from a single still image, a public avatar library, and "Digital Twin" personal clones.
- **Voice.** Text-to-speech, voice cloning, multi-accent voices, AI dubbing with lip-sync.
- **Translation.** Localize a single source video into 175+ languages and dialects with the original speaker's voice and synced lips.
- **Video Agent.** "One-shot text-to-video": describe what you want, get the rendered output.
- **AI Studio.** Text-based video editor for finer control.
- **Generative integrations.** Generation through Sora, Veo, Kling, and Flux (image generation) from inside the studio.
- **Use-case templates.** Product placement ads, UGC-style videos, training content, sales outreach, social media shorts.

## Open-source play: HyperFrames

HeyGen open-sourced **[[HyperFrames]]** in May 2026, an HTML-based, agent-friendly video rendering framework released under Apache 2.0. The framing is significant: HeyGen sells managed AI video generation, while HyperFrames gives away the deterministic-rendering plumbing. The bet appears to be that owning the agent-authored video format is more valuable than gating the renderer.

HyperFrames competes head-on with [[Remotion]] on licensing (Apache 2.0 vs Remotion's source-available custom license) and with the existing video-template ecosystem on authoring (HTML+CSS+GSAP vs React/TSX).

## Where it fits

HeyGen sits in the **AI avatar / talking-head video** generation tier: the "I want a polished spokesperson video without filming one" segment. Adjacent products in the broader AI video space include [[Sora]] (generative scenes from text), [[Veo 3]] (Google's text-to-video, also generative scenes), and image-foundation models like [[FLUX.1]] / [[FLUX.2]] for stills. They overlap less with HeyGen than they appear to: HeyGen is the *avatar performance* layer, not raw scene synthesis.

## References

- Official website: <https://www.heygen.com/>
- X / Twitter: <https://x.com/HeyGen>
- HyperFrames (open-source by HeyGen): <https://github.com/heygen-com/hyperframes> (see [[HyperFrames]])

## Related

- [[HyperFrames]]
- [[Sora]]
- [[Veo 3]]
- [[FLUX.1]]
- [[FLUX.2]]
- [[Voice Cloning]]
- [[AI Multimodal]]
- [[Remotion]]