Meta · Active · Open Source
Llama 4 Maverick
llama-4-maverick
Meta's mixture-of-experts Llama 4 model with 17B active / 400B total parameters.
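The 17B active / 400B total split comes from mixture-of-experts routing: only a small number of expert sub-networks run for each token, so the per-token compute tracks the active count, not the total. A minimal sketch of that arithmetic, using an assumed breakdown (~14B shared parameters, ~3B per expert, one expert routed per token) that roughly fits the published figures — not Meta's actual architecture:

```python
# Toy illustration of active vs. total parameters in a mixture-of-experts model.
# The shared/per-expert split below is an assumption chosen to approximate the
# published 17B active / 400B total figures for 128 experts.

def moe_param_counts(shared_params, expert_params, num_experts, active_experts):
    """Return (total, active) parameter counts for a simple MoE layer stack."""
    total = shared_params + expert_params * num_experts
    active = shared_params + expert_params * active_experts
    return total, active

total, active = moe_param_counts(shared_params=14e9, expert_params=3e9,
                                 num_experts=128, active_experts=1)
print(f"total = {total / 1e9:.0f}B, active = {active / 1e9:.0f}B")
# → total = 398B, active = 17B
```

Serving cost and latency scale with the active count, while memory footprint scales with the total — which is why a 400B-parameter model can decode at 17B-model speeds.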
Context Window: 1.0M tokens
Max Output: 32.8K tokens
Input Price: — per 1M tokens
Output Price: — per 1M tokens
Details
Family: llama-4
Parameters: 17Bx128E
Training Cutoff: 2025-03-01
Released: April 5, 2025
Aliases: meta-llama/Llama-4-Maverick-17B-128E-Instruct
Evaluation Scores (4 benchmarks)
HumanEval: 78.5%
MMLU-Pro: 73.4%
MATH-500: 69.0%
GPQA Diamond: 49.5%
Quick Access
curl pikaainews.com/api/models/meta-llama-4-maverick
npx pika-models info meta-llama-4-maverick
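The same endpoint can be queried programmatically. A minimal Python sketch, assuming the endpoint serves JSON over HTTPS (the response schema is not documented here):

```python
# Fetch this model's card from the API shown above. The https scheme and the
# JSON response body are assumptions; adjust to the actual API behavior.
import json
from urllib.request import urlopen

MODEL_URL = "https://pikaainews.com/api/models/meta-llama-4-maverick"

def fetch_model_info(url: str = MODEL_URL) -> dict:
    """Download the model card and parse it as JSON."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Field names in the returned dictionary (context window, pricing, benchmarks) would need to be checked against a live response.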
Third-Party Providers & Aggregators
Cerebras
Wafer-scale inference. 1000+ tokens/sec for select models.
DeepInfra
Lowest per-token rates for open-source models.
Fireworks AI
Fastest inference engine. Multimodal support, HIPAA/SOC2.
Groq
Ultra-fast LPU inference. Best latency for real-time apps.
OpenRouter
500+ models, one API key. Pay-per-token, no minimums.
SiliconFlow
China-optimized inference. Strong Qwen/DeepSeek support.
Together AI
Fast open-source model inference. Sub-100ms latency.
