# Parakeet V3
Parakeet V3 (parakeet-tdt-0.6b-v3) is a 600-million-parameter multilingual automatic speech recognition (ASR) model from NVIDIA, part of the NeMo Parakeet series. It is licensed under CC-BY-4.0 and extends v2 from English-only to 25 European languages, with automatic language detection.
## Key features
- 25 European languages with automatic language detection (no prompting needed)
- Throughput above 2000x real time (RTFx) on the Open ASR leaderboard, which places it among the fastest ASR models available
- Long audio support: up to 24 minutes with full attention (A100 80GB), up to 3 hours with local attention
- Transducer architecture (TDT, Token-and-Duration Transducer, an extension of the RNN-Transducer) that suits low-latency streaming recognition
- Trained on the Granary multilingual corpus
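
The language detection and long-audio limits listed above can be sketched with NVIDIA's NeMo toolkit. This is a minimal sketch, not a reference implementation: it assumes a recent `nemo_toolkit[asr]` install in which `transcribe` returns objects with a `.text` field, the helper names `pick_attention` and `transcribe` are hypothetical, and the local-attention context size `[256, 256]` is an assumed starting value rather than a tuned one.

```python
FULL_ATTENTION_MAX_S = 24 * 60        # 24 minutes, the full-attention figure for an A100 80GB
LOCAL_ATTENTION_MAX_S = 3 * 60 * 60   # 3 hours, the local-attention figure


def pick_attention(duration_s: float) -> str:
    """Return 'full' or 'local' for a clip of the given length in seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    if duration_s <= FULL_ATTENTION_MAX_S:
        return "full"
    if duration_s <= LOCAL_ATTENTION_MAX_S:
        return "local"
    raise ValueError("clip exceeds the 3-hour limit; split it first")


def transcribe(path: str, duration_s: float) -> str:
    """Transcribe one audio file (hypothetical helper, needs NeMo installed)."""
    # Imported lazily so pick_attention can be used without NeMo.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(
        model_name="nvidia/parakeet-tdt-0.6b-v3"
    )
    if pick_attention(duration_s) == "local":
        # Switch the encoder to limited-context attention for long audio.
        # The context size is an assumption, not a recommended setting.
        model.change_attention_model(
            self_attention_model="rel_pos_local_attn",
            att_context_size=[256, 256],
        )
    # No language argument is passed: the model detects the language itself.
    result = model.transcribe([path])
    return result[0].text
```

For example, `pick_attention(40 * 60)` returns `"local"`, so a 40-minute recording would be transcribed with local attention. The 24-minute threshold is tied to the A100 80GB figure, so a smaller GPU would likely need a lower cutoff.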
## Model variants
- **parakeet-tdt-0.6b-v3**: 600M parameters, 25 languages
- **parakeet-tdt-1.1b**: 1.1B parameters, English only; a larger model from an earlier Parakeet release
## How it compares to Whisper
Parakeet V3 is significantly faster than [[Whisper]] variants while achieving competitive or superior accuracy on English benchmarks. Its transducer (TDT) decoder emits tokens as audio frames arrive, which suits streaming use cases that Whisper's encoder-decoder design, built around fixed 30-second windows, does not natively support.
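
To make the speed claim concrete, RTFx (inverse real-time factor) is the audio duration divided by the processing time. The sketch below shows that arithmetic; the function names are hypothetical. Leaderboard figures come from a specific GPU and batching setup, so the result is a rough estimate rather than a latency guarantee.

```python
def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of compute."""
    if audio_seconds <= 0 or processing_seconds <= 0:
        raise ValueError("durations must be positive")
    return audio_seconds / processing_seconds


def processing_time(audio_seconds: float, rtfx_value: float) -> float:
    """Estimated compute time for a clip at a given RTFx."""
    if audio_seconds <= 0 or rtfx_value <= 0:
        raise ValueError("inputs must be positive")
    return audio_seconds / rtfx_value


# One hour of audio at an RTFx of 2000 needs about 1.8 seconds of compute.
print(processing_time(3600, 2000))  # 1.8
```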
## References
- HuggingFace: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3
- NVIDIA blog: https://developer.nvidia.com/blog/pushing-the-boundaries-of-speech-recognition-with-nemo-parakeet-asr-models/
## Related
- [[Whisper]]
- [[Speech-to-Text (STT)]]
- [[NeMo]]