Alibaba has released Qwen3.5, an open-source vision-language model with 397B parameters, built for native multimodal agents and suited to tasks like coding, visual reasoning, and complex search.
Why it matters
Qwen3.5 is a significant milestone in multimodal AI, bringing stronger capabilities to native multimodal agents across a wide range of applications.
Community talk
Qwen 3.5 35B A3B is better than free-tier ChatGPT and Gemini
Qwen3.5 27B scores 42 on the Intelligence Index, the most intelligent model under 230B; the nearest model, GLM-4.7-Flash 31B-A3B, scores 30
Qwen3.5 is dominating the charts on HF
Alibaba CEO: Qwen will remain open-source
Qwen3.5 9B is the first local model I've tried that can make an adequate Flappy Bird version
Qwen3.5-9B abliterated — 0% refusals + vision
Qwen3.5-4B Uncensored Aggressive Release (GGUF)
Finished a Qwen 3.5 9B Opus 4.5 Distill!
Unsloth's fixed version of Qwen3.5-35B-A3B is incredible at research tasks.
Qwen 3.5 27b: a testament to the transformer architecture
Breaking: Today, Qwen 3.5 Small
Qwen3.5 Small Dense model release seems imminent.
What if LLM agents passed KV-cache to each other instead of text? I tried it -- 73-78% token savings across Qwen, Llama, and DeepSeek
Qwen 3.5-35B-A3B is beyond expectations. It's replaced GPT-OSS-120B as my daily driver and it's 1/3 the size.
Qwen 3.5-27B punches waaaaay above its weight (with a slightly different prompt) -- very impressed
New Qwen3.5-35B-A3B Unsloth Dynamic GGUFs + Benchmarks
PewDiePie fine-tuned Qwen2.5-Coder-32B to beat ChatGPT 4o on coding benchmarks.
Qwen3.5 feels ready for production use - Never been this excited
Qwen3 9B can run fine on Android phones at q4_0
Qwen3.5-0.8B - Who needs GPUs?
Qwen3.5-35B-A3B hits 37.8% on SWE-bench Verified Hard — nearly matching Claude Opus 4.6 (40%) with the right verification strategy
Qwen3.5-27B Q4 Quantization Comparison
Running Qwen 3.5 0.8B locally in the browser on WebGPU w/ Transformers.js
Visualizing All Qwen 3.5 vs Qwen 3 Benchmarks
Qwen 3.5 2B on Android
PSA: Qwen 3.5 requires bf16 KV cache, NOT f16!!
Running Qwen3.5 27b dense with 170k context at 100+t/s decode and ~1500t/s prefill on 2x3090 (with 585t/s throughput for 8 simultaneous requests)
The latest AMD GPU firmware update, together with the latest llama.cpp build, significantly accelerated Vulkan! Strix Halo, GNU/Linux Debian, Qwen3.5-35B-A3B CTX<=131k, llama.cpp@Vulkan&ROCm, Power & Efficiency
Qwen3.5 35B-A3B is the first small model that doesn't hallucinate when summarising a 50k-token text
Dense (non-thinking) > MoE? Qwen-3.5-27B is blowing me away in coding
Qwen 3.5 27B is the best Chinese translation model under 70B
How to switch Qwen 3.5 thinking on/off without reloading the model
Qwen3 Coder Next | Qwen3.5 27B | Devstral Small 2 | Rust & Next.js Benchmark
Qwen3.5 35B-A3B replaced my 2-model agentic setup on M1 64GB
Little Qwen 3.5 27B and Qwen 35B-A3B models did very well in my logical reasoning benchmark
Qwen3.5-35B-A3B running on a Raspberry Pi 5 (16GB and 8GB variants)
Follow-up: Qwen3.5-35B-A3B — 7 community-requested experiments on RTX 5080 16GB
Qwen 3.5 Architecture Analysis: Parameter Distribution in the Dense 27B vs. 122B/35B MoE Models
Qwen3.5 27B vs Devstral Small 2 - Next.js & Solidity (Hardhat)
Qwen 3.5 Family Comparison by ArtificialAnalysis.ai
Qwen3.5-35B-A3B Q4 Quantization Comparison
Introducing FasterQwenTTS
speed of GLM-4.7-Flash vs Qwen3.5-35B-A3B
Strix Halo, GNU/Linux Debian, Qwen3.5-(27,35,122B) CTX<=131k, llama.cpp@ROCm, Power & Efficiency
Qwen3.5-27B-heretic-gguf
Qwen3.5-35B-A3B is awesome
Qwen3.5 122B in 72GB VRAM (3x3090) is the best model available at this time — also it nails the “car wash test”
Qwen/Qwen3.5-35B-A3B creates FlappyBird
[P] I trained Qwen2.5-1.5b with RLVR (GRPO) vs SFT and compared benchmark performance
Running Qwen3.5-0.8B on my 7-year-old Samsung S10E
Qwen 3.5 Small, soon
qwen3.5 35b-a3b evaded the zero-reasoning budget by doing its thinking in the comments
Is Qwen3.5 a coding game changer for anyone else?
System prompt for Qwen3.5 (27B/35BA3B) to reduce overthinking?
What's Next for Qwen After Junyang Lin's Departure?