# Gemini

[[Google DeepMind]]'s main AI model series, "for the agentic era".

Gemini 2+ are multi-modal [[Large Language Models (LLMs)]]. They can handle a full range of multi-modal inputs: text, documents, images, video, and audio. They also support streaming, which enables them to receive, analyze, understand, and react to various kinds of input (including live video!). They can also return bounding boxes for objects within an image, which enables spatial-understanding scenarios.

They can also read, write, and execute code. Gemini 2.5 Pro is especially good at it. Importantly, Gemini 2.5 Pro also has a HUGE context window (1M tokens!), which sets it apart.

There are different variants of Gemini available:

- **Gemini 2.5 Pro/Flash**: text-focused models, 1M token context. Pro excels at code and complex reasoning
- **[[Gemini 3]]**: current-generation frontier family. Umbrella for the 3.x minor releases below
- **Gemini 3.1 Flash Live**: highest-quality real-time audio/voice model. See [[Gemini 3.1 Flash Live]]
- **Gemini 3.1 Flash TTS**: controllable, expressive text-to-speech with inline audio tags. See [[Gemini 3.1 Flash TTS]]
- **Gemini 3.5 Flash**: fastest agentic/coding model in the family, priced well above earlier Flash tiers. See [[Gemini 3.5 Flash]]

Gemini is the LLM behind [[NotebookLM]].

The [[Gemini Mobile App]] provides access to Gemini models on Android and iOS, including real-time voice via Gemini Live.

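The bounding boxes mentioned above usually need converting to pixel coordinates before they can be drawn on an image. Here is a minimal sketch, assuming each box comes back as `[ymin, xmin, ymax, xmax]` normalized to a 0-1000 range (an assumption in this note; verify it against the current API docs). `PixelBox` and `to_pixel_box` are hypothetical helper names, not part of any Gemini SDK.

```python
from typing import NamedTuple, Sequence


class PixelBox(NamedTuple):
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int


def to_pixel_box(
    box_2d: Sequence[float],
    image_width: int,
    image_height: int,
    scale: int = 1000,  # assumed normalization range; check the API docs
) -> PixelBox:
    """Convert an assumed [ymin, xmin, ymax, xmax] normalized box to pixels."""
    if len(box_2d) != 4:
        raise ValueError("expected four values: [ymin, xmin, ymax, xmax]")
    ymin, xmin, ymax, xmax = box_2d
    return PixelBox(
        left=round(xmin / scale * image_width),
        top=round(ymin / scale * image_height),
        right=round(xmax / scale * image_width),
        bottom=round(ymax / scale * image_height),
    )


if __name__ == "__main__":
    # Example values are made up for illustration, not real model output.
    print(to_pixel_box([100, 200, 500, 800], image_width=640, image_height=480))
```

The result is a plain tuple of `(left, top, right, bottom)`, the order most drawing libraries expect for rectangles.
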
## References

- Model variants: https://ai.google.dev/gemini-api/docs/models
- AI Studio: https://aistudio.google.com
- Realtime streaming: https://aistudio.google.com/live
- Simon Willison's first insights: https://simonwillison.net/2024/Dec/11/gemini-2/
- Introduction videos
  - https://www.youtube.com/watch?v=7RqFLp0TqV0
  - https://www.youtube.com/watch?v=qE673AY-WEI
- Multi-modal Live API demos
  - https://www.youtube.com/watch?v=J_q7JY1XxFE
  - https://www.youtube.com/watch?v=J62TUCRapR8
  - https://www.youtube.com/watch?v=n8Dz2GA2hDc
  - https://www.youtube.com/watch?v=9hE5-98ZeCg
- Native tool use: https://www.youtube.com/watch?v=EVzeutiojWs
- Native audio output: https://www.youtube.com/watch?v=qE673AY-WEI
- Spatial understanding: https://www.youtube.com/watch?v=-XmoDzDMqj4
- Behind the scenes: https://www.youtube.com/watch?v=L7dw799vu5o

## Related

- [[Gemini 3]]
- [[Gemini 3.1 Flash Live]]
- [[Gemini 3.5 Flash]]
- [[Gemini 3.5 Pro]]
- [[Gemini Omni]]
- [[Nano Banana]]
- [[Gemini Spark]]
- [[Gemini Mobile App]]
- [[Gemini CLI]]
- [[Gemma]]
- [[Google AI Studio]]
- [[Vertex AI]]
- [[NotebookLM]]