Llama News & Updates
Your central hub for AI news and updates on Llama. We're tracking the latest articles, discussions, tools, and videos from the last 7 days.
17 Apr, 16 Apr, 15 Apr, 14 Apr, 13 Apr, 12 Apr, 11 Apr
Community talk
Hot Experts in your VRAM! Dynamic expert cache in llama.cpp for 27% faster CPU+GPU token generation with Qwen3.5-122B-A10B compared to layer-based single-GPU partial offload
24/7 Headless AI Server on Xiaomi 12 Pro (Snapdragon 8 Gen 1 + Ollama/Gemma4)
Audio processing landed in llama-server with Gemma-4
I got tired of Claude API anxiety. Here’s my 5-min Gemma 4 + Ollama setup for Mac (and a realistic look at what it actually sucks at)
huge improvement after moving from ollama to llama.cpp