# Cloudflare Workers AI

[[Cloudflare]] Workers AI is a serverless inference platform that runs open-source models (Llama, Mistral, Whisper, Stable Diffusion, BGE embeddings, and dozens more) on Cloudflare's GPU fleet, accessible from any [[Cloudflare Workers]] script via a binding or from outside via REST.

It's not a model lab; Cloudflare doesn't train these models. The product is operational: take well-known open models, host them on GPUs at the edge, expose them under a uniform API, charge per token or per request. No quotas to request, no GPU provisioning, no batch endpoints to wrangle.

## Why It Matters

For AI features in app code, Workers AI removes the "where do I run the model" question. Embedding generation, summarization, transcription, and image generation all become single function calls from a Worker. Paired with [[Cloudflare Vectorize]], it forms a complete RAG stack with three bindings: AI, Vectorize, and your data store.

## Model Catalog (representative)

- **LLMs**: Llama 3.x family, Mistral, Gemma, Qwen
- **Embeddings**: BGE base/large, M2-BERT
- **Speech**: Whisper (transcription)
- **Vision**: LLaVA (image understanding), Stable Diffusion (generation)
- **Translation**: M2M-100
- **Classification, summarization, code completion**: rotating catalog

## Pricing Shape

- Free tier: 10K neurons/day (sufficient for hobby projects)
- Paid: $0.011 per 1K neurons (a "neuron" abstracts over token/image/second costs)
- LoRA fine-tuning supported on a subset of models

## Common Use Cases

- **In-Worker AI features**: summarize, classify, embed without leaving the request path
- **RAG pipelines** with [[Cloudflare Vectorize]]
- **Voice-to-text** workflows via Whisper
- **Image generation** for thumbnails, avatars, content
- **Cheap inference layer** in front of more expensive frontier models

## References

- Workers AI home: https://ai.cloudflare.com/
- Model catalog: https://developers.cloudflare.com/workers-ai/models/
- Pricing: https://developers.cloudflare.com/workers-ai/platform/pricing/

## Related

- [[Cloudflare]]
- [[Cloudflare Workers]]
- [[Cloudflare Vectorize]]
- [[Cloudflare Agents SDK]]
- [[Cloudflare Sandbox SDK]]
- [[Wrangler]]
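
## Sketch: Calling a Model Through the Binding

The overview describes reaching models from a Worker via a binding, with summarization as one of the single-call features. A minimal sketch of that path follows. It assumes the binding is named `AI` in the Wrangler configuration and that the model ID `@cf/meta/llama-3.1-8b-instruct` is currently in the catalog; `handleRequest`, `buildSummaryPrompt`, and the locally declared `AiBinding` type are illustrative names, not part of the platform, and the exact response shape depends on the model.

```typescript
// Minimal structural type for the Workers AI binding, declared locally so the
// sketch compiles without @cloudflare/workers-types (illustrative, not official).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding; // assumed binding name, set in the Wrangler configuration
}

// Assumed model ID; check the model catalog before relying on it.
const SUMMARY_MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Hypothetical helper: wraps the caller's text in a summarization instruction.
function buildSummaryPrompt(text: string): string {
  return `Summarize the following text in one sentence:\n\n${text}`;
}

// Reads the request body, runs one inference call, returns the raw result as JSON.
async function handleRequest(request: Request, env: Env): Promise<Response> {
  const text = await request.text();
  if (text.trim() === "") {
    return new Response("Send the text to summarize in the request body.", {
      status: 400,
    });
  }
  const result = await env.AI.run(SUMMARY_MODEL, {
    prompt: buildSummaryPrompt(text),
  });
  return new Response(JSON.stringify(result), {
    headers: { "content-type": "application/json" },
  });
}

// Worker entry point in module syntax.
export default { fetch: handleRequest };
```

Keeping the handler as a plain function that receives `env` makes it easy to exercise locally with a stub binding before deploying.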
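
## Sketch: Estimating Daily Cost From Neurons

The figures under Pricing Shape can be turned into a rough daily estimate. The sketch below uses only the two numbers quoted there (10K free neurons per day, $0.011 per 1K neurons) and assumes the free daily allocation is deducted before paid billing starts; that assumption, and the helper name `estimateDailyCostUsd`, are illustrative and should be checked against the pricing page.

```typescript
// Rates copied from the Pricing Shape section; verify against the pricing page.
const FREE_NEURONS_PER_DAY = 10_000;
const USD_PER_1K_NEURONS = 0.011;

// Hypothetical helper: estimated cost in USD for one day's neuron usage,
// assuming the free allocation is subtracted before billing.
function estimateDailyCostUsd(neuronsUsed: number): number {
  const billable = Math.max(0, neuronsUsed - FREE_NEURONS_PER_DAY);
  return (billable / 1000) * USD_PER_1K_NEURONS;
}

console.log(estimateDailyCostUsd(8_000).toFixed(2)); // "0.00" (inside the free tier)
console.log(estimateDailyCostUsd(250_000).toFixed(2)); // "2.64" (240K billable neurons)
```

How many neurons a given request consumes varies by model and input size, so the neuron count itself has to come from observed usage.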