IBM: Granite 4.1 8B

IBM · Released 2026-04-30
Open Source · 131K ctx

About

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It supports a 131K-token context window and is designed for enterprise tasks...

Pricing

Input: $0.05 per 1M tokens
Output: $0.10 per 1M tokens
Blended: $0.06 per 1M tokens

Cheaper than 88% of models. Median price is $0.54/1M tokens.

Cost Calculator

Tokens per day: 1M (range 100K to 100M)

Daily: $0.06
Monthly: $1.88
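The calculator figures can be reproduced with simple arithmetic. The sketch below is not the site's actual formula: it assumes the blended price is a 3:1 input-to-output weighted average and that a month is 30 days. Neither assumption is stated on the page, but both are consistent with the displayed $0.06 and $1.88. The function names are hypothetical.

```python
# Prices from the listing, in USD per 1M tokens.
INPUT_PRICE = 0.05
OUTPUT_PRICE = 0.10


def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price. The 3:1 input:output mix is an assumption."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def cost(tokens_per_day, price_per_million, days=1):
    """USD cost of a daily token volume at a per-1M-token price."""
    return tokens_per_day / 1_000_000 * price_per_million * days


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
daily = cost(1_000_000, blended)
monthly = cost(1_000_000, blended, days=30)  # 30-day month is an assumption

print(f"blended ${blended:.4f}, daily ${daily:.4f}, monthly ${monthly:.3f}")
```

Under these assumptions the unrounded blended price is $0.0625 per 1M tokens, which the page displays as $0.06, and 30 days at 1M tokens per day comes to $1.875, displayed as $1.88.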

vs. Similar Models

Llama 3.1 Instruct 70B (Q: +0.1): $0.56, +796%
Ministral 3 3B (Q: +0.1): $0.10, +60%
Qwen3 30B A3B (Non-reasoning) (Q: +0.1): $0.13, +113%
Qwen3 4B (Non-reasoning) (Q: +0.1): $0.19, +201%
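The percentages in this comparison appear to be price differences relative to Granite 4.1 8B. A minimal sketch, assuming the baseline is an unrounded blended price of $0.0625 per 1M tokens (an assumption, the page only shows $0.06) and using a hypothetical helper name:

```python
def percent_more_expensive(other_price, base_price):
    """Relative price difference of another model versus the baseline, in percent."""
    return (other_price / base_price - 1) * 100


BASE_PRICE = 0.0625  # assumed unrounded blended price for Granite 4.1 8B

# Displayed (rounded) blended prices from the listing, USD per 1M tokens.
similar_models = {
    "Llama 3.1 Instruct 70B": 0.56,
    "Ministral 3 3B": 0.10,
    "Qwen3 30B A3B (Non-reasoning)": 0.13,
    "Qwen3 4B (Non-reasoning)": 0.19,
}

for name, price in similar_models.items():
    print(f"{name}: {percent_more_expensive(price, BASE_PRICE):+.0f}%")
```

With these inputs the first two rows match the page (+796% and +60%). The two Qwen3 rows land a few points away from the displayed values, presumably because the page computes them from unrounded prices.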

Performance

132 tokens/sec (faster than 65% of models)

0.48 seconds (faster than 90% of models)

0.48 seconds (faster than 94% of models)

Market Median: 94 tok/s (this model is 40% faster)

Median TTFT: 1.11s (this model is 57% faster)

Throughput/Dollar: 2119 tok/s per $/1M
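The relative figures above follow from the raw numbers. The sketch below is an illustration under stated assumptions, not the site's documented method: it treats "faster" for throughput as a ratio above the median, "faster" for TTFT as a reduction below the median, and throughput per dollar as tokens/sec divided by an assumed unrounded blended price of $0.0625 per 1M tokens. The function names are hypothetical.

```python
def percent_faster_throughput(model_tps, median_tps):
    """How much higher the model's tokens/sec is than the median, in percent."""
    return (model_tps / median_tps - 1) * 100


def percent_lower_latency(model_seconds, median_seconds):
    """How much lower the model's latency is than the median, in percent."""
    return (1 - model_seconds / median_seconds) * 100


def throughput_per_dollar(tps, price_per_million):
    """Tokens/sec per USD-per-1M-tokens, the unit used by the listing."""
    return tps / price_per_million


speed_gain = percent_faster_throughput(132, 94)
ttft_gain = percent_lower_latency(0.48, 1.11)
value = throughput_per_dollar(132, 0.0625)  # 0.0625 is an assumed blended price

print(f"throughput: {speed_gain:.0f}% faster than median")
print(f"TTFT: {ttft_gain:.0f}% faster than median")
print(f"throughput per dollar: {value:.0f}")
```

The two percentages round to the 40% and 57% shown on the page. The throughput-per-dollar result is 2112, close to but not exactly the listed 2119; the gap is plausibly due to the page using an unrounded tokens/sec value.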

Speed Comparison

MoonshotAI: Kimi K2 Thinking, 132 tok/s (-1%)
Qwen: Qwen3 Coder Next, 133 tok/s (+1%)
Tiny Aya Global, 133 tok/s (+1%)

Context Window

131K tokens (larger than 27% of models)

Max Output: 131K tokens (100% of context)

Benchmarks

MMLU-Pro: Not evaluated
GPQA Diamond: 43.3%
HLE: 3.8%
LiveCodeBench: Not evaluated
SciCode: 21.8%
TerminalBench Hard: 0.0%
MATH-500: Not evaluated
AIME: Not evaluated
AIME 2025: Not evaluated
IFBench: 38.6%
Long Context Recall: 12.0%
Tau2: 27.8%
