IBM: Granite 4.1 8B

IBM · Released 2026-04-30
Open Source · 131K ctx

About

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...

Pricing

Input: $0.05 per 1M tokens
Output: $0.10 per 1M tokens
Blended: $0.06 per 1M tokens

Cheaper than 88% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily: $0.06
Monthly: $1.88
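
The calculator figures above can be reproduced with a short sketch. It rests on two assumptions that the page does not state: the blended rate weights input and output tokens 3:1, and a month is 30 days. The function names are hypothetical. Under those assumptions the sketch yields a blended rate of $0.0625 per 1M tokens, which rounds to the $0.06 daily and $1.88 monthly figures shown.

```python
# Prices in USD per 1M tokens, taken from the Pricing section.
INPUT_PRICE = 0.05
OUTPUT_PRICE = 0.10


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price. The 3:1 input:output weighting is an assumption."""
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


def estimate_cost(tokens_per_day, price_per_million, days_per_month=30):
    """Return (daily, monthly) cost in USD. The 30-day month is an assumption."""
    daily = tokens_per_day / 1_000_000 * price_per_million
    return daily, daily * days_per_month


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
daily, monthly = estimate_cost(1_000_000, blended)
print(f"blended ${blended:.4f}/1M, daily ${daily:.2f}, monthly ${monthly:.2f}")
```

Changing `tokens_per_day` to any value in the calculator's 100K to 100M range scales both costs linearly.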

vs. Similar Models

Llama 3.1 Instruct 70B (Q: +0.1): $0.56 (+796%)
Ministral 3 3B (Q: +0.1): $0.10 (+60%)
Qwen3 30B A3B (Non-reasoning) (Q: +0.1): $0.13 (+113%)
Qwen3 4B (Non-reasoning) (Q: +0.1): $0.19 (+201%)
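
The percentage next to each price appears to be that model's premium over Granite 4.1 8B's blended rate. The sketch below assumes an unrounded baseline of $0.0625 per 1M tokens (a 3:1 input:output blend, which the page does not state); `percent_premium` is a hypothetical helper.

```python
# Assumed unrounded blended price for Granite 4.1 8B, USD per 1M tokens.
BASELINE = 0.0625

# Displayed (rounded) prices of the comparison models, USD per 1M tokens.
SIMILAR_MODELS = {
    "Llama 3.1 Instruct 70B": 0.56,
    "Ministral 3 3B": 0.10,
    "Qwen3 30B A3B (Non-reasoning)": 0.13,
    "Qwen3 4B (Non-reasoning)": 0.19,
}


def percent_premium(other_price, baseline_price):
    """Percent by which other_price exceeds baseline_price."""
    return (other_price / baseline_price - 1) * 100


for name, price in SIMILAR_MODELS.items():
    print(f"{name}: ${price:.2f} ({percent_premium(price, BASELINE):+.0f}%)")
```

With the displayed prices this gives roughly +796%, +60%, +108%, and +204%. The first two match the page; the last two differ slightly from the listed +113% and +201%, presumably because the page computes from unrounded prices.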

Performance

119 tokens/sec (faster than 62% of models)
0.47 seconds (faster than 89% of models)
0.47 seconds (faster than 93% of models)

Market Median: 94 tok/s (this model is 27% faster)
Median TTFT: 1.10s (this model is 58% faster)
Throughput/Dollar: 1897 tok/s per $/1M
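
The relative figures above can be approximated from the displayed numbers. This is a sketch under assumptions: the function names are hypothetical, the 0.47 s figure is taken to be time to first token, and the price used for throughput per dollar is an unrounded blended rate of $0.0625 per 1M tokens, none of which the page states.

```python
def percent_faster_throughput(model_tps, median_tps):
    """Percent by which the model's tokens/sec exceeds the median."""
    return (model_tps / median_tps - 1) * 100


def percent_lower_latency(model_seconds, median_seconds):
    """Percent by which the model's latency is below the median."""
    return (1 - model_seconds / median_seconds) * 100


def throughput_per_dollar(model_tps, price_per_million):
    """Tokens/sec divided by price in USD per 1M tokens."""
    return model_tps / price_per_million


speed_gain = percent_faster_throughput(119, 94)
ttft_gain = percent_lower_latency(0.47, 1.10)
tps_per_dollar = throughput_per_dollar(119, 0.0625)
print(f"{speed_gain:.1f}% faster, {ttft_gain:.1f}% lower TTFT, {tps_per_dollar:.0f} tok/s per $/1M")
```

From the rounded inputs this gives about 27%, 57%, and 1904, close to the displayed 27%, 58%, and 1897. The small gaps are presumably due to the page using unrounded measurements.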

Speed Comparison

Hy3-preview (Non-reasoning): 119 tok/s (+0%)
Z.ai: GLM 5.2: 119 tok/s (+0%)
GLM-4.7 (Non-reasoning): 117 tok/s (-1%)

Context Window

131K tokens (larger than 27% of models)

Max Output

131K tokens (100% of context)

Benchmarks

MMLU-Pro: Not evaluated
GPQA Diamond: 43.3%
HLE: 3.8%
LiveCodeBench: Not evaluated
SciCode: 21.8%
TerminalBench Hard: 0.0%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 38.6%
Long Context Recall: 12.0%
Tau2: 27.8%