Meta · Active · Open Source
Llama 4 Maverick
llama-4-maverick
Meta's mixture-of-experts Llama 4 model with 17B active / 400B total parameters.
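The 17B active / 400B total split comes from mixture-of-experts routing: only a small number of expert sub-networks run for each token, so the per-token compute tracks the active count, not the total. A minimal sketch of that arithmetic, using an assumed breakdown (~14B shared parameters, ~3B per expert, one expert routed per token) that roughly fits the published figures — not Meta's actual architecture:

```python
# Toy illustration of active vs. total parameters in a mixture-of-experts model.
# The shared/per-expert split below is an assumption chosen to approximate the
# published 17B active / 400B total figures for 128 experts.

def moe_param_counts(shared_params, expert_params, num_experts, active_experts):
    """Return (total, active) parameter counts for a simple MoE layer stack."""
    total = shared_params + expert_params * num_experts
    active = shared_params + expert_params * active_experts
    return total, active

total, active = moe_param_counts(shared_params=14e9, expert_params=3e9,
                                 num_experts=128, active_experts=1)
print(f"total = {total / 1e9:.0f}B, active = {active / 1e9:.0f}B")
# → total = 398B, active = 17B
```

Serving cost and latency scale with the active count, while memory footprint scales with the total — which is why a 400B-parameter model can decode at 17B-model speeds.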
Context Window: 1.0M tokens
Max Output: 32.8K tokens
Input Price: — per 1M tokens
Output Price: — per 1M tokens
Details
Family: llama-4
Parameters: 17Bx128E
Training Cutoff: 2025-03-01
Released: April 5, 2025
Aliases: meta-llama/Llama-4-Maverick-17B-128E-Instruct
Evaluation Scores (4 benchmarks)
HumanEval: 78.5%
MMLU-Pro: 73.4%
MATH-500: 69.0%
GPQA Diamond: 49.5%
Quick Access
curl pikaainews.com/api/models/meta-llama-4-maverick
npx pika-models info meta-llama-4-maverick
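The same endpoint can be queried programmatically. A minimal Python sketch, assuming the endpoint serves JSON over HTTPS (the response schema is not documented here):

```python
# Fetch this model's card from the API shown above. The https scheme and the
# JSON response body are assumptions; adjust to the actual API behavior.
import json
from urllib.request import urlopen

MODEL_URL = "https://pikaainews.com/api/models/meta-llama-4-maverick"

def fetch_model_info(url: str = MODEL_URL) -> dict:
    """Download the model card and parse it as JSON."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Field names in the returned dictionary (context window, pricing, benchmarks) would need to be checked against a live response.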
Third-Party Providers & Aggregators
Cerebras
Wafer-scale inference. 1000+ tokens/sec for select models.
DeepInfra
Lowest per-token rates for open-source models.
Fireworks AI
Fastest inference engine. Multimodal support, HIPAA/SOC2.
Groq
Ultra-fast LPU inference. Best latency for real-time apps.
OpenRouter
500+ models, one API key. Pay-per-token, no minimums.
SiliconFlow
China-optimized inference. Strong Qwen/DeepSeek support.
Together AI
Fast open-source model inference. Sub-100ms latency.
