# HeyGen
HeyGen is an AI video generation platform that turns scripts, photos, and text into finished video without cameras, crews, or editing skills. Photorealistic AI avatars with natural lip-sync are the headline feature, but the product now also covers text-to-video, photo-to-video, voice cloning, AI dubbing, and translation into 175+ languages and dialects.
Enterprise customers such as HubSpot, Workday, JPMorgan, and Intel use it for training, marketing, sales outreach, and global content localization. The company is San Francisco-based and offers free and paid tiers plus enterprise pricing.
## Capabilities
- **AI avatars.** Photorealistic digital humans, "photo avatars" from a single still image, a public avatar library, and "Digital Twin" personal clones.
- **Voice.** Text-to-speech, voice cloning, multi-accent voices, AI dubbing with lip-sync.
- **Translation.** Localize a single source video into 175+ languages and dialects with the original speaker's voice and synced lips.
- **Video Agent.** "One-shot text-to-video" — describe what you want, get the rendered output.
- **AI Studio.** Text-based video editor for finer control.
- **Generative integrations.** Access to Sora, Veo, Kling, and Flux (image generation) from inside the studio.
- **Use-case templates.** Product placement ads, UGC-style videos, training content, sales outreach, social media shorts.
## Open-source play: HyperFrames
HeyGen open-sourced **[[HyperFrames]]**, an HTML-based, agent-friendly video rendering framework, in May 2026 under Apache 2.0. The move is notable because HeyGen sells managed AI video generation while HyperFrames gives away the deterministic-rendering plumbing. The bet appears to be that owning the agent-authored video format is more valuable than gating the renderer.
HyperFrames competes head-on with [[Remotion]] on licensing — Apache 2.0 vs Remotion's source-available custom license — and with the existing video-template ecosystem on authoring (HTML+CSS+GSAP vs React/TSX).
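The "deterministic rendering" idea behind an HTML-based video format can be illustrated with a minimal Python sketch. This is not HyperFrames' API: the function names (`frame_time`, `render_frame_html`), the 30 fps default, and the fade-in effect are all hypothetical, chosen only to show that each frame's markup is a pure function of its index, so re-rendering a frame always yields the same output.

```python
from html import escape


def frame_time(frame: int, fps: int = 30) -> float:
    """Map a frame index to a timestamp in seconds (hypothetical helper)."""
    return frame / fps


def render_frame_html(
    frame: int, text: str, fps: int = 30, fade_seconds: float = 1.0
) -> str:
    """Return the HTML for one frame; the output depends only on the arguments."""
    t = frame_time(frame, fps)
    # Linear fade-in clamped to [0, 1]; no wall clock or randomness is involved.
    opacity = min(max(t / fade_seconds, 0.0), 1.0)
    return (
        f'<div class="title" style="opacity:{opacity:.3f}">'
        f"{escape(text)}</div>"
    )


if __name__ == "__main__":
    # Print the markup for the start, midpoint, and end of the fade.
    for frame in (0, 15, 30):
        print(frame, render_frame_html(frame, "Hello & welcome"))
```

A real renderer would presumably load such markup in a browser engine and capture each frame as an image; that step is omitted here.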
## Where it fits
HeyGen sits in the **AI avatar / talking-head video** generation tier — the "I want a polished spokesperson video without filming one" segment. Adjacent products in the broader AI video space include [[Sora]] (generative scenes from text), [[Veo 3]] (Google's text-to-video, also generative scenes), and image-foundation models like [[FLUX.1]] / [[FLUX.2]] for stills. They overlap less with HeyGen than they appear to: HeyGen is the *avatar performance* layer, not raw scene synthesis.
## References
- Official website: <https://www.heygen.com/>
- X / Twitter: <https://x.com/HeyGen>
- HyperFrames (open-sourced by HeyGen): <https://github.com/heygen-com/hyperframes>; see [[HyperFrames]]
## Related
- [[HyperFrames]]
- [[Sora]]
- [[Veo 3]]
- [[FLUX.1]]
- [[FLUX.2]]
- [[Voice Cloning]]
- [[AI Multimodal]]
- [[Remotion]]