Alibaba QwenactiveOpen Source

Qwen3-8B

qwen3-8b

Efficient 8B model for cost-sensitive deployments.

Context Window

131.1K

tokens

Max Output

8.2K

tokens

Input Price

$0.06

per 1M tokens

Output Price

$0.24

per 1M tokens

Details

Familyqwen3

Parameters8B

Training Cutoff2025-03-01

ReleasedApril 29, 2025

Capabilities

FunctionsStreamingJSON ModeCodeReasoningTool Use

Documentation

https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions

Evaluation Scores(4 benchmarks)

HumanEvalFunction-level Python code generation

70.2%

MATH-500Competition-style math

64.5%

MMLU-ProHarder successor to MMLU

57%

GPQA DiamondPhD-level science questions

38.2%

Quick Access

curl pikaainews.com/api/models/qwen-qwen3-8b

npx pika-models info qwen-qwen3-8b

Get API Access

Official

DashScope (Alibaba)

Official Qwen API via Alibaba Cloud.

Third-Party Providers & Aggregators

Cerebras

Wafer-scale inference. 1000+ tokens/sec for select models.

DeepInfra

Lowest per-token rates for open-source models.

Fireworks AI

Fastest inference engine. Multimodal support, HIPAA/SOC2.

Groq

Ultra-fast LPU inference. Best latency for real-time apps.

OpenRouter

500+ models, one API key. Pay-per-token, no minimums.

SiliconFlow

China-optimized inference. Strong Qwen/DeepSeek support.

Together AI

Fast open-source model inference. Sub-100ms latency.

Other qwen3 models

Qwen

Qwen3-Max-Thinking

qwen3-max-thinking

131.1K ctx$2.40/1M

Qwen

Qwen3-Next

qwen3-next

131.1K ctx

Qwen

Qwen3-Max

qwen3-max

131.1K ctx$2.40/1M

Qwen

Qwen3-32B

qwen3-32b

131.1K ctx$0.30/1M

Qwen

Qwen3-4B

qwen3-4b

131.1K ctx$0.02/1M

Qwen

Qwen3-1.7B

qwen3-1.7b

32.8K ctx$0.01/1M

Qwen

Qwen3-14B

qwen3-14b

131.1K ctx$0.15/1M

Qwen

Qwen3-0.6B

qwen3-0.6b

32.8K ctx

Qwen

Qwen3-235B-A22B

qwen3-235b-a22b

131.1K ctx$0.80/1M