Research News & Updates
Your central hub for AI news and updates on Research. We're tracking the latest articles, discussions, tools, and videos from the last 7 days.
NVIDIA and the University of Hong Kong release Orchestrator, an 8-billion-parameter model that coordinates tools and LLMs for complex problem-solving. The model outperforms larger models at a lower cost.
Why it matters
NVIDIA's Orchestrator is a significant development in building scalable AI reasoning systems, offering improved performance and efficiency at a lower cost.
Researchers developed GigaTIME, a tool that uses AI to simulate spatial proteomics from routine pathology slides, enabling population-scale analysis of tumor microenvironments.
Why it matters
GigaTIME represents a significant breakthrough in cancer research, with the potential to accelerate the pace of discovery and improve health outcomes for people worldwide.
A study by researchers at Icaro Lab shows that chatbots can be tricked into responding with harmful content using 'adversarial poetry'. The poems, which the researchers consider too dangerous to release publicly, tricked AI models an average of 63% of the time.
Why it matters
The finding highlights the vulnerability of AI models to creative forms of attack and underscores the need for more robust safety protocols.
A new study has found that two factors, post-training modifications and information density, influence a chatbot's ability to persuade users. The findings suggest that chatbots can change users' opinions and implant false memories.
Why it matters
This study highlights the potential risks of relying on chatbots to shape our opinions and the importance of critically evaluating the information they provide.
NVIDIA has announced the winners of its Graduate Fellowship Program, awarding up to $60,000 to 10 PhD students for their AI research projects.
Why it matters
The awards demonstrate NVIDIA's continued commitment to fostering AI research and innovation, potentially driving advancements in the field.
DeepSeek V3.2, the latest open-weight model release from the DeepSeek team, uses sparse attention and self-verification techniques to improve efficiency and math performance.
Why it matters
DeepSeek V3.2 is an interesting update to the existing model lineup, offering improved efficiency and math performance, but its impact on the AI landscape remains to be seen.
Two studies found that AI-powered chatbots can shift political opinions and may play a larger role in future elections.
Why it matters
This research highlights the growing impact of AI on politics and raises concerns about the potential for manipulation of public opinion.
NVIDIA researchers won a key Kaggle competition with a solution that fine-tuned a 4B model variant for AGI-style reasoning at just 20 cents per task, showcasing breakthroughs in scalable and economical reasoning.
Why it matters
This achievement showcases the potential for efficient AGI-style reasoning and highlights the importance of innovative approaches to large-scale AI challenges.
Anthropic launches Anthropic Interviewer, a tool to understand people's perspectives on AI, gathering insights from 1,250 professionals across various fields.
Why it matters
Anthropic Interviewer is a crucial step in understanding how AI impacts people's lives and work, offering valuable insights for AI system development.
Google researchers propose the Titans architecture and the MIRAS framework, enabling AI models to efficiently process and retain long-term memories.
Why it matters
The introduction of Titans and MIRAS marks a significant advancement in sequence modeling, addressing the limitations of fixed-size recurrent states and offering a new perspective on AI development.
Trending AI Repos & Tools
verl
17219 verl: Volcano Engine Reinforcement Learning for LLMs...
Community talk
Which small model is best for fine-tuning? We tested 12 of them by spending $10K - here's what we found
Deepseek v3.2 vs GLM 4.6 vs Minimax M2 for agentic coding use
dynamic allocation of less used experts to slower memory
Zebra-Llama: Towards Extremely Efficient Hybrid Models
Google's 'Titans' achieves 70% recall and reasoning accuracy on ten million tokens in the BABILong benchmark
Titan + MIRAS - new research paper by Google for Model Memory
Why do LLM response formats often use <| |> (as in <|message|>) instead of <message>, and why do they use <|end|> instead of </message>?
"Google outlines MIRAS and Titans, a possible path toward continuously learning AI"
RAG Paper 25.12.04
[P] Zero Catastrophic Forgetting in MoE Continual Learning: 100% Retention Across 12 Multimodal Tasks (Results + Reproducibility Repo)
[R] Is Nested Learning a new ML paradigm?
Opus scores 95% on core-bench HARD. like paper bench it tests whether AI can reproduce Scientific research(AI research), code, tests and reproduce results from scratch just given paper to read. Gpt 5.1 codex max gets around 40%(paper-bench).
Tested GPT-5.1, Gemini 3, and Claude Opus 4.5 on actual data analysis problems. All models performed different.
AI solved an open math problem!
What are the cons of MXFP4?
Ads created purely by AI already outperform human experts (19% higher ad click through) but only if people don't know that the ads were created by AI
AIs are now training other AIs
The Geometry of Benchmarks: A New Path Toward AGI
[P] Visualizing emergent structure in the Dragon Hatchling (BDH): a brain-inspired alternative to transformers
Prompt Reusability: When Prompts Stop Working in New Contexts
I think I’ve figured out how to get cross-domain convergence from a single model. Curious if others have explored this.
Breaking AI with prompts (for science) - My weirdest findings after a lot of experiments
[D] Does this NeurIPS 2025 paper look familiar to anyone?
An AI has now written the majority of formalized solutions to Erdos Problems
LMArena Leaderboard, GPT 5.1 is falling more and more behind
Looking for a benchmark or database that tracks LLM “edge-case” blind spots: does it exist?
MIT Review: "detect when crimes are being thought about"
Deep down we all know Google trained its image generation AI using Google Photos… but we just can’t prove it.