About

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

Related Models

Qwen: Qwen3.6 27B (2026-04-27)
Qwen: Qwen3.6 35B A3B (2026-04-27)
Qwen: Qwen3.5 Plus 2026-04-20 (2026-04-27)
Qwen: Qwen3.6 Flash (2026-04-27)
Qwen: Qwen3.6 Max Preview (2026-04-27)
Qwen3.6 27B (Non-reasoning) (2026-04-22)
Qwen3.6 Max Preview (2026-04-20)
Qwen3.6 35B A3B (Non-reasoning) (2026-04-16)

Benchmarks

MMLU-Pro

68.6%

GPQA Diamond

42.7%

HLE

2.9%

LiveCodeBench

33.2%

SciCode

17.4%

TerminalBench Hard

2.3%

MATH-500

Not evaluated

AIME

Not evaluated

AIME 2025

27.3%

IFBench

32.3%

Long Context Recall

15.3%

Tau2

29.2%

Open Source

HuggingFace

License: apache-2.0

Parameters: 8B

Formats: GGUF / GPTQ / AWQ

Downloads

5.4M

Likes

897

VRAM (FP16)

8-16 GB

GPU

RTX 4070 / M2 Pro
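
The VRAM figure above can be sanity-checked with a back-of-the-envelope estimate of weight size: parameter count times bytes per parameter. The sketch below is illustrative only. It assumes a nominal 8 billion parameters (the "8B" in the listing; the exact count may differ), counts weights alone, and ignores the KV cache, activations, and the vision encoder's working memory, so real usage will be higher. The helper name `weight_memory_gb` is hypothetical. Reading the listed 8-16 GB range as spanning 8-bit to FP16 weights is one plausible interpretation, not something the listing states.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# "8B" taken from the listing; the exact parameter count is an assumption.
NOMINAL_PARAMS = 8e9

if __name__ == "__main__":
    # Compare full FP16 weights with common quantized widths.
    for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
        gb = weight_memory_gb(NOMINAL_PARAMS, bits)
        print(f"{label}: ~{gb:.0f} GB for weights only")
```

Under these assumptions FP16 weights come to roughly 16 GB and 8-bit weights to roughly 8 GB, which suggests that quantized formats such as GGUF, GPTQ, or AWQ are the likelier fit for a 12 GB consumer GPU.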
