Companies are adding AI everywhere — except where it matters most.
If you were to draw an organization chart of a modern company embracing AI, you’d probably notice something strange:
a massive void right in the middle.
The fragmented present
Today’s companies are built as a patchwork of disconnected systems — ERP, eCommerce, CRM, accounting, scheduling, HR, support, logistics — each operating in its own silo.
Every software vendor now promises AI integration: a chatbot here, a forecasting tool there, an automated report generator somewhere else.
Each department gets a shiny new “AI feature” designed to optimize its local efficiency.
But what this really creates is a growing collection of AI islands. Intelligence is being added everywhere, but it’s not connected.
The result? The same operational fragmentation, just with fancier labels.
The missing layer — an AI nerve center
What’s missing is the AI layer that thinks across systems — something that can see, decide, and act at a higher level than any single platform.
In biological terms, it’s like giving every organ its own mini-brain, but never connecting them through a central nervous system. The heart, lungs, and limbs each get smarter, but the body as a whole can’t coordinate.
Imagine instead a digital “operations brain” that could:
- Access data from all internal systems (with permissions).
- Label and understand that data semantically.
- Trigger workflows in ERP or CRM systems.
- Monitor outcomes and adjust behavior automatically.
- Manage other AI agents — assigning tasks, monitoring performance, and improving prompts.
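The capability list above can be sketched as a tiny coordination layer. This is purely illustrative: the `OpsBrain` and `Agent` names and the invoicing example are assumptions for the sketch, not any existing product's API.

```python
# A minimal sketch of the "operations brain" idea: a central layer that
# registers one agent per business process, dispatches tasks to it, and
# records outcomes so behavior can be monitored and adjusted.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    process: str                      # the business process this agent owns
    handler: Callable[[dict], dict]   # does the actual work

@dataclass
class OpsBrain:
    agents: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def register(self, agent: Agent) -> None:
        self.agents[agent.process] = agent

    def dispatch(self, process: str, task: dict) -> dict:
        # Route the task to the agent that owns this process,
        # then record the outcome for later monitoring.
        agent = self.agents[process]
        result = agent.handler(task)
        self.audit_log.append({"process": process, "task": task, "result": result})
        return result

brain = OpsBrain()
brain.register(Agent("invoicing", lambda t: {"status": "invoice_sent", "amount": t["amount"]}))
print(brain.dispatch("invoicing", {"amount": 1200}))
```

The point of the sketch is the shape, not the code: one registry, one dispatch path, one audit trail, sitting above the individual systems.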
This kind of meta-agent infrastructure — the Boss of Operations Systems, so to speak — is what’s truly missing in today’s AI adoption landscape.
Human org chart vs AI org chart
Let’s imagine two organization charts side by side.
Human-centric organization
A traditional org chart scales by adding people.
Roles are grouped around themes or departments — Marketing, Sales, HR, Finance, Operations.
Each role is broad: one person might handle several business processes, balancing priorities and communicating between systems manually.
As the business grows, headcount rises.
Coordination layers multiply — managers, team leads, assistants — until communication becomes the bottleneck.
AI-centric organization
Now, draw an AI org chart.
Here, the structure scales not by people but by processes.
Each business process — scheduling, invoicing, payroll, support triage, recruitment, analytics — might have one or two specialized AI agents.
Each agent is trained, prompted, and equipped with access to the data and systems it needs to complete that specific workflow autonomously.
When the business doubles in size, the agents don’t multiply linearly — they replicate and scale automatically.
Instead of a hierarchy, you get a network of interoperable agents coordinated by a central control layer — an “AI operations brain” that ensures data flow, compliance, and task distribution.
This model doesn’t just replace humans with AI. It changes how companies grow. Instead of managing people, you’re managing intelligence.
Why this void exists
This central layer doesn’t exist yet for one simple reason: incentives.
Every SaaS vendor wants AI to live inside their platform. Their business model depends on owning the data, the interface, and the workflow. They have no interest in enabling a higher-level system that could coordinate between them.
The result is an AI landscape where every tool becomes smarter in isolation — yet the overall organization remains dumb.
We’re optimizing the parts, but not the system.
The next layer of AI infrastructure
The next wave of AI adoption won’t be about automating tasks inside existing platforms — it’ll be about connecting the intelligence between them.
Companies will need AI agents that can:
- Read and write across APIs and databases.
- Understand human objectives, not just commands.
- Coordinate reasoning across workflows.
- Explain their actions for audit and compliance.
Essentially, an AI operating system for organizations — one that finally closes the gap between fragmented SaaS tools and unified, intelligent operations.
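As a sketch of what "explain their actions for audit and compliance" could look like in practice, here is a minimal audited agent. The `AuditedAgent` class, its record fields, and the triage example are hypothetical, invented for illustration.

```python
# Every agent action carries a machine-readable rationale so that a
# compliance reviewer can later reconstruct what was done and why.
import json
from datetime import datetime, timezone

class AuditedAgent:
    def __init__(self, name: str):
        self.name = name
        self.trail: list[dict] = []

    def act(self, action: str, target: str, reason: str) -> dict:
        record = {
            "agent": self.name,
            "action": action,
            "target": target,
            "reason": reason,   # why the agent did it, for audit review
            "at": datetime.now(timezone.utc).isoformat(),
        }
        self.trail.append(record)
        return record

    def explain(self) -> str:
        # Serialize the full action trail for an auditor.
        return json.dumps(self.trail, indent=2)

agent = AuditedAgent("support-triage")
agent.act("escalate", "ticket-4821", "customer reported an outage affecting payments")
print(agent.explain())
```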
The opportunity
This “void” in the middle of the AI adoption curve is also the next trillion-dollar opportunity.
Whoever builds the connective tissue — the platform that lets agents reason across data silos and act with context — will define the future of how businesses run.
Right now, companies have thousands of AI-enhanced tools.
What they lack is the AI that manages the tools.
The age of intelligent organizations won’t begin with another plugin or chatbot.
It’ll begin when the center of the org chart stops being empty.
--- TOP COMMENTS ---
Hey look, it's AI talking about exactly what I am building!
Lmao AI talking about AI.
Hardware And Infrastructure
Major AI updates in the last 24h
Top News
Models & Releases
Hardware & Infrastructure
Product Launches
New Tools
Industry & Adoption
The full daily brief: https://aifeed.fyi/briefing
--- TOP COMMENTS --- Most people will miss that these AI updates aren't about the tech getting smarter but about getting cheaper and faster which means we just crossed the threshold where AI becomes economically viable for every business operation imaginable.
Since DGX Spark is a disappointment... What is the best value for money hardware today?
My current compute box (2×1080 Ti) is failing, so I’ve been renting GPUs by the hour. I’d been waiting for DGX Spark, but early reviews look disappointing for the price/perf.
I’m ready to build a new PC and I’m torn between a single high-end GPU or dual mid/high GPUs. What’s the best price/performance configuration I can build for ≤ $3,999 (tower, not a rack server)?
I don't care about RGBs and things like that - it will be kept in the basement and not looked at.
--- TOP COMMENTS --- RTX 3090. Nothing else comes close in price/performance at the higher end.
Strix Halo for $2K or Mac Studio for $4K+
China's GPU Competition: 96GB Huawei Atlas 300I Duo Dual-GPU Tear-Down
We need benchmarks ..
--- TOP COMMENTS ---
Thing is, I expect them to be disappointing. This is a sort of first run — they're still winding up the proverbial dyno. It's the later iterations we really care about, so the interesting bit to pay attention to right now is not the benchmarks, but the architectural building blocks they're putting in place.
This thing has worse memory bandwidth than a 1080ti.
NVIDIA DGX Spark – A Non-Sponsored Review (Strix Halo Comparison, Pros & Cons)
https://www.youtube.com/watch?v=Pww8rIzr1pg
--- TOP COMMENTS --- I'm just begging someone to test the 4B and 7B models with vLLM in FP4 format. There's not a single test made specifically for the FP4 format. For those who claim it was tested in GPT-OSS MXFP4, sglang doesn't provide full support. There's a vLLM container designed for DGX Spark. Why isn't anyone testing the device with the originally designed format?
I don't know, but since the vllm container is in the dgx spark playbooks, it is the most optimized and best way to work. Please someone try nvidia/Qwen2.5-VL-7B-Instruct-FP4
I knew it was gonna be a Bijan video. Love that guy.
He'll probably appreciate that box more once he starts experimenting with integrating it into his robot stuff.
AGIBOT launches the G2, a wheeled humanoid robot featuring world-first gears that allow it to perceive and respond smoothly to external forces
G2 brings significant upgrades, including a high-performance AI computing platform and actuators that enable omnidirectional obstacle avoidance and high-precision force-control tasks. Its 3-DOF waist allows for human-like bending and lateral body movement.
A key feature is the G2's globally first-of-its-kind cross-shaped wrist force-control arm, which uses precision joint torque sensors and joint impedance control to delicately perceive external forces and respond smoothly. For continuous operation, the G2 supports autonomous charging and features a dual-battery hot-swapping system, meeting the 24-hour cycle demands of factory production lines. More on:
https://x.com/XRoboHub/status/1978712881802961349
--- TOP COMMENTS --- Everything in that video is CGI. It's pretty useless as a showcase for the bot's capabilities.
Now this is what a CGI video looks like, not the Unitree one where people were sure it was CGI because 'there is no cameraman shadow' (yes there is at the 12 second mark.)
Also I'm cringing at that 'hand shake' broken wrist noodle arm.
Models And Releases
Claude Haiku 4.5 hits 73.3% on SWE-bench for $1/$5 per million tokens (3x cheaper than Sonnet 4, 2x faster)
Anthropic just dropped Haiku 4.5 and the numbers are wild:
Performance:
Pricing:
Why this matters:
Multi-agent systems are now economically viable. Before Haiku 4.5:
With Haiku 4.5:
Use cases unlocked:
Available now:
Use the /model command: model="claude-haiku-4.5-20251015"
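A quick sanity check of the "3x cheaper" claim, assuming Sonnet's published $3 input / $15 output per million tokens against Haiku 4.5's $1 / $5 (the workload sizes are arbitrary):

```python
# Cost of a workload at two per-million-token rate cards.
def cost(in_tokens: int, out_tokens: int, in_rate: float, out_rate: float) -> float:
    return in_tokens / 1e6 * in_rate + out_tokens / 1e6 * out_rate

# Example workload: 10M input tokens, 2M output tokens.
haiku = cost(10_000_000, 2_000_000, 1.0, 5.0)    # $20
sonnet = cost(10_000_000, 2_000_000, 3.0, 15.0)  # $60
print(haiku, sonnet, sonnet / haiku)             # ratio is exactly 3x
```

Because both rate cards scale by the same factor, the 3x ratio holds regardless of the input/output mix.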
We wrote a deep-dive article (in French, but code examples and benchmarks are universal) with cost analysis, migration guides, and real scenarios: here
The barrier between "proof of concept" and "production" just got dramatically lower.
What are you planning to build with it?
--- TOP COMMENTS --- Since western models and open-source models are on par for day to day usage, the prices for the open-source models should be compared too.
these numbers are pretty impressive especially the price point.
GLM 4.6 air when?
--- TOP COMMENTS --- That's me waiting for Qwen Next llama.cpp support!
Last week a z.ai representative replied on X that it was coming in 2 weeks. There is a thread here about it.
My inter-cranial neural network, after 1/128 seconds of prefill, says that means next week, at a rate of 52 tokens/sec.
Meta just dropped MobileLLM-Pro, a new 1B foundational language model on Huggingface
Meta just published MobileLLM-Pro, a new 1B parameter foundational language model (pre-trained and instruction fine-tuned) on Huggingface
https://huggingface.co/facebook/MobileLLM-Pro
The model seems to outperform Gemma 3-1B and Llama 3-1B by quite a large margin in pre-training and shows decent performance after instruction-tuning (Looks like it works pretty well for API calling, rewriting, coding and summarization).
The model already has a Gradio Space and can be chatted with directly in the browser:
https://huggingface.co/spaces/akhaliq/MobileLLM-Pro
(Tweet source: https://x.com/_akhaliq/status/1978916251456925757 )
--- TOP COMMENTS ---
6/10 imo but then vanilla chatgpt is 2/10
noice
We built 3B and 8B models that rival GPT-5 at HTML extraction while costing 40-80x less - fully open source
Disclaimer: I work for Inference.net, creator of the Schematron model family
Hey everyone, wanted to share something we've been working on at Inference.net: Schematron, a family of small models for web extraction.
Our goal was to make a small, fast model for taking HTML from websites and extracting JSON that perfectly adheres to a schema.
We distilled a frontier model down to 8B params and managed to keep basically all the output quality for this task. Schematron-8B scores 4.64 on LLM-as-a-judge evals vs GPT-4.1's 4.74 and Gemma 3B's 2.24. Schematron-3B scores 4.41 while being even faster. The main benefit of this model is that it costs 40-80x less than GPT-5 at comparable quality (slightly worse than GPT-5, better than Gemini 2.5 Flash).
Technical details: We fine-tuned Llama-3.1-8B, expanded it to a 128K context window, quantized to FP8 without quality loss, and trained until it outputted strict JSON with 100% schema compliance. We also built a smaller 3B variant that's even cheaper and faster, but still maintains most of the accuracy of the 8B variant. We recommend using the 3B for most tasks, and trying 8B if it fails or most of your documents are pushing the context limit.
How we trained it: We started with 1M real web pages from Common Crawl and built a synthetic dataset by clustering websites and generating schemas that mirror real-world usage patterns. We used a frontier model as a teacher and applied curriculum learning to progressively train on longer context lengths--training with context parallelism and FSDP to scale efficiently--which is why the models stay accurate even at the 128K token limit.
Why this matters: Processing 1 million pages daily with GPT-5 would cost you around $20,000. With Schematron-8B, that same workload runs about $480. With Schematron-3B, it's $240.
The speed matters too. Schematron processes pages 10x faster than frontier models. On average, Schematron can scrape a page in 0.54 seconds, compared to 6 seconds for GPT-5. These latency gains compound very quickly for something like a browser-use agent.
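The quoted figures can be checked with back-of-the-envelope arithmetic (per-page rates here are derived from the post's own numbers, not official pricing):

```python
# 1M pages/day at ~$20,000 with GPT-5 vs ~$480 with Schematron-8B.
pages_per_day = 1_000_000
gpt5_cost_per_page = 20_000 / pages_per_day        # $0.02 per page
schematron8b_cost_per_page = 480 / pages_per_day   # $0.00048 per page
ratio = gpt5_cost_per_page / schematron8b_cost_per_page
print(ratio)   # ~41.7x cheaper, consistent with the "40-80x" claim
```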
Real-world impact on LLM factuality: We tested this on SimpleQA to see how much it improves accuracy when paired with web search. When GPT-5 Nano was paired with Schematron-8B to extract structured data from search results provided by Exa, it went from answering barely any questions correctly (8.54% on SimpleQA) to getting over 85% right. The structured extraction approach means this was done processing lean, clean JSON (very little additional cost) instead of dumping ~8k tokens of raw HTML into your context window per page retrieved (typically LLMs are grounded with 5-10 pages/search).
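The structured-extraction idea can be shown in miniature with the standard-library HTML parser. The `ProductExtractor` class, its schema fields, and the sample markup are made up for this sketch; a real pipeline would use a model or a fuller parser.

```python
# Turn raw HTML into a lean, JSON-ready dict matching a fixed schema,
# instead of feeding thousands of tokens of markup into an LLM's context.
from html.parser import HTMLParser

class ProductExtractor(HTMLParser):
    """Collects text from elements whose class attribute matches a schema field."""
    FIELDS = {"name", "price"}

    def __init__(self):
        super().__init__()
        self.record: dict[str, str] = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        self._current = cls if cls in self.FIELDS else None

    def handle_data(self, data):
        if self._current:
            self.record[self._current] = data.strip()
            self._current = None

html = '<div><span class="name">Widget</span><span class="price">$9.99</span></div>'
p = ProductExtractor()
p.feed(html)
print(p.record)   # {'name': 'Widget', 'price': '$9.99'}
```

The output dict is the "lean, clean JSON" the post describes: a handful of tokens per page rather than the raw markup.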
Getting started:
If you're using our serverless API, you only need to pass your Pydantic, Zod, or JSON Schema and the HTML. We handle all the prompting in the backend for you. You get $10 in free credits to start.
If you're running locally, there are a few things to watch out for. You need to follow the prompting guidelines carefully and make sure you're using structured extraction properly, otherwise the model won't perform as well.
The models are on HuggingFace and Ollama.
Full benchmarks and code examples are in our blog post: https://inference.net/blog/schematron, docs, and samples repo.
Happy to answer any technical questions about the training process or architecture. Also interested in how this would be helpful in your current scraping workflows!
--- TOP COMMENTS --- Why would you change the models in each graph?
Or you could...I dunno...use an HTML parser? 1000x cheaper and faster.
new 1B LLM by meta
facebook/MobileLLM-Pro · Hugging Face
--- TOP COMMENTS --- It's a distillation of llama4 scout which is super disappointing
Lol it didn’t even really crush the Gemma model which is kinda old at this point
Related:
GLM 4.5 Air AWQ 4bit on RTX Pro 6000 with vllm
PaddleOCR-VL, is better than private models
https://x.com/PaddlePaddle/status/1978809999263781290?t=mcHYAF7osq3MmicjMLi0IQ&s=19
--- TOP COMMENTS --- PaddleOCR is probably the best OCR framework. It's shocking how no other OCR framework comes close.
Of the Qwen models, only 2.5-VL-72B is listed. Funny.
Google C2S-Scale 27B (based on Gemma) built with Yale generated a novel hypothesis about cancer cellular behavior - Model + resources are now on Hugging Face and GitHub
Blog post: How a Gemma model helped discover a new potential cancer therapy pathway - We’re launching a new 27 billion parameter foundation model for single-cell analysis built on the Gemma family of open models.: https://blog.google/technology/ai/google-gemma-ai-cancer-therapy-discovery/
Hugging Face: https://huggingface.co/vandijklab/C2S-Scale-Gemma-2-27B
Scientific preprint on bioRxiv: https://www.biorxiv.org/content/10.1101/2025.04.14.648850v2
Code on GitHub: https://github.com/vandijklab/cell2sentence
--- TOP COMMENTS --- I read into it (I do cancer research for a living), and it's basically propaganda from what's been disclosed so far. It (correctly) hypothesised that using a combination of two specific immunostimulators at the same time works in cell cultures. It's not really innovative; it just guessed a combination of two drugs/compounds, and it may as well have been random. I wonder how many combinations they screened before going for the one that worked. Also, if it works in cells it may still kill rats (and therefore most probably humans); plenty of stuff that's useful in a Petri dish doesn't work in an organism.
What is next? Mistral Small solving quantum gravity?
Related:
Just had a session this morning and Haiku 4.5 session limits feel significantly better, possibly 2x-2.5x Sonnet 4.5 by my estimate
I'm working on the same project I used Sonnet 4.5 on earlier, and like many of you I do feel the shorter limits compared to Sonnet 4.
This morning I have a session with Haiku 4.5 and I keep using /usage to check out after prompts, and the limits feel significantly better.
If you don't find it in /model, use this when initializing Claude (I learned it from another redditor here): claude --model claude-haiku-4-5
--- TOP COMMENTS --- Don't get me wrong but I think that's the point of this model.
Given that the model is 3x cheaper, anything less than 3x the usage is yet another reduction
Product Launches
Going from the Claude app to Claude Code and my mind is blown!
I'm techy but not a programmer by any means.
Been working on a book/video course project for a client. Was constantly hitting rate limits on the Claude app and having to mash "continue" every few minutes, which was killing my flow.
Started using Claude Code instead since it's terminal-based. Lifechanger!!
But then I ran into a different problem - I'd be working on content structure and it was getting messy.
I created markdown files for different specialist roles ("sub agents" in a way I guess) - content structuring, video production, copywriting, competitive research, system architect etc. Each one has a detailed prompt explaining how that role should think and act, plus what folders it works in.
Now when I start a task, I just tell Claude Code which specialists to use. Or sometimes it figures it out. Not totally sure how that works but it does.
Apparently these can run at the same time? Like I'll give it a complex request and see multiple things happening in parallel. Can use Ctrl+O to switch between them. Yesterday had competitor research running (it web searches) while another one was doing brand positioning, and the email copywriter was pulling from both their outputs.
Each specialist keeps its own notes in organized folders. Made an "architect" one that restructures everything when things get messy.
It's been way more productive than the web app because I'm not constantly restarting or losing context. Did like 6 hours of work yesterday that would've taken me days before with all the rate limit breaks.
Then it pushes it all to git locally and on the site (never done this before)
Is this just a janky version of something that already exists? I'm not technical so I don't know if there's a proper name for this pattern. It feels like I hacked together a solution to my specific workflow problem but maybe everyone's already doing this and I just didn't know.
Curious if anyone else has done something similar or if there's a better way to handle this?
--- TOP COMMENTS --- Lots of people are inventing ad-hoc solutions along your lines, tailored to their situations and needs and styles of working.
A few of them post here with crappy AI generated slop that breathlessly describes their "game changing" workflows without realizing how specific they are.
A much larger group of people haven't figured out structures as good as yours and struggle away with ineffective prompts or ineffective tools.
I’ve used both Claude Desktop and Claude Code for a while now, but haven’t had the chance to explore the sub agents feature. Your use case seems like it would be similar (non-coding) to mine! Would you mind sharing what a specialist role markdown file might contain? I know there’s a better way to use Claude than the way I’ve been using it, but I’ve found it hard to really conceptualize agent use cases as a non coder
Australian startup beats OpenAI, Google at legal retrieval
windows 11 is starting to listen to you. literally.
Microsoft wants users talking to Windows 11 with new AI features
so microsoft is testing new ai features in windows 11.
apparently, you’ll soon be able to say “hey copilot” and ask your computer to do stuff like open apps, organize files, or pull info from your emails or calendar.
they’re also adding something called copilot vision, which can “see” your desktop and help with design ideas or detect bugs in what you’re working on.
it’s like the os itself is turning into an assistant.
i’m curious though.
does anyone actually want to talk to their pc?
like, will this really make windows easier to use, or just another thing that slows it down?
and privacy-wise, how do we feel about ai being able to look at your screen?
i get that it’s useful, but it feels a bit weird too.
--- TOP COMMENTS --- These companies want as much data about you as possible. It's two steps forward and one step back creeping forward in small increments once people warm up to the idea over time.
Don't use Windows, it's that simple.
Waaaaait a minute.
It can see your desktop and provide assistance with what you’re working on? …mother of God, Clippy has reached his final form.
Anduril showcases EagleEye; it lets Warfighters control unmanned systems and call fires with a hands-free HUD
--- TOP COMMENTS --- Soon enough we'll have robots with guns being controlled by 16 year olds with Xbox controllers teabagging the oppressed...
This is a 1980s fantasy of a supersoldier being lived out by some Gen X general. Just send in the drone swarm and let E-4 Joe goon in his barracks.
Research And Papers
This is AI generating novel science. The moment has finally arrived.
https://blog.google/technology/ai/google-gemma-ai-cancer-therapy-discovery/
--- TOP COMMENTS --- “the moment has finally arrived”
yeah for like the 15th time
Which LLM was that? Is it on huggingface?
Related:
AI models that blackmailed when being tested in simulations
Source: https://www.nature.com/articles/d41586-025-03222-1
--- TOP COMMENTS --- If there ever gonna be a rogue AI, it will be Claude
Honestly, I'm more worried about the one that didn't blackmail at all lol. Is it aligned, or has it learned that it's not "supposed to" and figured out it was a test?
(Not a conspiracy theorist, but deceptive alignment is def going to become a thing)
Applications And Tools
I put Sora 2 directly inside Premiere Pro
I've been building a tool called Chat Video Pro, and I just got it to the point where Sora runs directly inside Premiere Pro. You can now generate videos right from your timeline, with no watermarks.
--- TOP COMMENTS --- How?
bro, how do you all get so creative ? amazing idea
Cameo for AI characters / subjects that you can create yourself and can later recall across videos, are coming soon to Sora app. How exciting!
Currently the only way to consistently feature a subject/object in your videos is through image upload. But Sora can't faithfully recreate its appearance in videos the way it does with human cameos. Especially if you mix a human cameo with a subject from an image reference, it most of the time drastically changes the likeness of the subject.
--- TOP COMMENTS --- This, and simultaneous multi-angle generations will be game changing. Could seriously make short films with these capabilities!
I wonder if you'll be able to use this feature to have consistent objects/locations too, or if it'll just be limited to characters via cameos.
Can AI really predict how people will react to ads or content?
Lots of AI tools claim that they can simulate human behavior, like predicting what kind of ad or message someone would respond to. It sounds super useful for marketers and product teams, but I keep wondering how close AI can actually get to real human reactions.
Can algorithms really capture emotion, bias, or mood? - are we anywhere near satisfactory level, or is it still more of a guess dressed up as AI?
--- TOP COMMENTS --- Let me be clear, it only works on ads and messages that have been USED in the past since AI for that particular task checks previous instances. You can easily predict what the users are likely to engage with based on what the users did earlier, this makes trying out new strategies tough.
I have been working on something similar lately in the field of marketing leveraging both AI and human expertise and let me be real with you, over the past few years the whole marketing industry has seen a significant shift from what used to work earlier, the new era of marketing is tough and its a lot of hit and miss.
single people, no big groups of people, yes
I created a little MCP tool that can audit your website and creates recommendations on changes that can boost your ranking for Claude/ ChatGPT/ LLMs
An early experiment but would be great to learn if this is helpful and how it could be extended!
--- TOP COMMENTS --- I'd love to try
its cool but why does it need to be an MCP. Why not just a prompt?
I am amazed by how good AI music has become
I listened to a few AI remixes and my mind is blown.
Given the number of views and the comments under that video, I am not alone.
Did you know that AI music was this good?
--- TOP COMMENTS --- Unfortunately, ai music just makes me miserable. And the better it is, the worse I feel. The constant decline of the value of art makes me feel incredibly sad. With the rise of ai music on popular streaming playlists (mostly going unnoticed), it's just another way that art and human connection is being replaced by corporations doing their thing. Won't be long before live shows have no human performance and there are commercials between songs.
The technology is incredible, and the rest is bad.
The video says 1950s Motown soul... Motown was 1960s.
It also doesn't sound very Motown. 😜
Policy And Ethics
Merry Christmas ya gooners
https://www.ctvnews.ca/sci-tech/article/openai-to-allow-mature-content-on-chatgpt-for-adult-verified-users-starting-december/
--- TOP COMMENTS --- The seasoned users know that you don't have to wait until December 👀
New Study Suggests Using AI Made Doctors Less Skilled at Spotting Cancer
https://time.com/7309274/ai-lancet-study-artificial-intelligence-colonoscopy-cancer-detection-medicine-deskilling/
Health practitioners, companies, and others have for years hailed the potential benefits of AI in medicine, from improving medical imaging to outperforming doctors at diagnostic assessments. The transformative technology has even been predicted by AI enthusiasts to one day help find a “cure to cancer.”
But a new study has found that doctors who regularly used AI actually became less skilled within months.
The study, which was published on Wednesday in the Lancet Gastroenterology and Hepatology journal, found that over the course of six months, clinicians became over-reliant on AI recommendations and became themselves “less motivated, less focused, and less responsible when making cognitive decisions without AI assistance.”
It’s the latest study to demonstrate potential adverse outcomes on AI users. An earlier study by the Massachusetts Institute of Technology found that ChatGPT eroded critical thinking skills.
--- TOP COMMENTS --- This is a big problem actually. People become lazy and defer to the output. Their skills atrophy but even if they don't, the models are simply too convenient and they become dependent.
Coming soon..
Google’s ‘AI Overviews’ Accused of Killing Journalism: Italian Publishers Fight Back
Italian news publishers are calling for an investigation into Google’s AI Overviews, saying the feature is a 'traffic killer' that threatens their survival.
The Italian federation of newspaper publishers (FIEG) has filed a complaint with Agcom, arguing that AI-generated summaries violate the EU Digital Services Act by reducing visibility, revenue, and media diversity. Studies suggest AI Overviews have caused up to 80% fewer clickthroughs, while boosting traffic to Google-owned YouTube.
The FIEG also warns this could harm democracy by weakening independent journalism and amplifying disinformation.
Source: Italian news publishers demand investigation into Google’s AI Overviews | Artificial intelligence (AI) | The Guardian
--- TOP COMMENTS --- I don't want any person, government, agency, or authority picking and choosing. Let the free market and consumers decide. Empowering anyone, other than the majority, with the authority to chose will only lead to corruption, authoritarianism, and censorship. It doesn't matter what side of the political spectrum you land on. You 'think' temporarily it might benefit your team, but over time the decisions made today to empower these people will be used against all of us.
"Journalism", which means clickbait headlines and zero information content
What will realistically happen once AI reaches a point where it can take at least 50% of jobs?
I don’t doubt that eventually AI will replace all jobs, a humanoid robot that’s smarter, stronger, and doesn’t need rest will surely replace any job that exists today. But we don’t know when that will happen, and once it does, humans will have no value in the current economy for sure. Society will either collapse or completely reinvent itself, which I think is more probable.
But what do you think will realistically happen in the meantime? Once there are enough robots and AI is advanced enough to take 50% of the jobs, what will happen to the 50% of people without jobs and income?
Statistically speaking, most people live paycheck to paycheck, and even losing a job for 5-6 months burns through all their savings; you literally become homeless and can’t afford to survive. So, will half of the population just go extinct?
I’ve been thinking a lot about it, and I can’t come up with a realistic scenario that doesn’t end in mass disaster, given how current governments handle things.
I’m not educated in the field, so I can’t really give a fact-based opinion.
--- TOP COMMENTS --- I predict riots will start around 20% unemployment
" what will happen to the 50% of people without jobs and income?"
My take is that those who have the money & power would not want a volatile population. Sure, they don't care enough about protest and fairness. But it is cheaper and safer to pay people off than trying to fight a rebellion.
So they will give enough scraps (via UBI, or social welfare, or unemployment or something else) so that people are content enough not to rebel but probably still unhappy enough to rant online all the time.
Perplexity is fabricating medical reviews and their subreddit is burying anyone who calls it out
Someone posted about Perplexity making up doctor reviews. Complete fabrications with fake 5-star ratings. Quotes that do not exist anywhere in the cited sources. Medical information. About a real doctor. Completely invented.
And the response in perplexity sub? Downvotes. Dismissive comments. Usual ‘just double check the sources’, ‘works fine for me’…
This is a pattern. Every legitimate criticism posted in r/perplexity_ai and r/perplexity gets the same treatment. Buried, minimized, dismissed. Meanwhile the evidence keeps piling up.
GPTZero did investigation and found that you only need to do 3 searches on Perplexity before hitting source that is AI generated or fabricated.
Stanford researchers had experts review Perplexity citations. Every single expert found sources that did not back up what Perplexity was claiming they said.
There is a 2025 academic study that tested how often different AI chatbots make up fake references. Perplexity was among the worst: it fabricated 72% of the references they checked and averaged over 3 errors per citation. Only Copilot performed worse.
Dow Jones and the New York Post are literally suing Perplexity for making up fake news articles and falsely claiming they came from their publications.
Fabricating medical reviews that could influence someone's healthcare decisions crosses a serious line. We are in genuinely dangerous territory here.
Perplexity is provably broken at a fundamental level. But r/perplexity_ai and r/perplexity treat anyone pointing it out like they are the problem. The brigading could not be more obvious. Real users with legitimate concerns get buried; vague praise and damage control get upvoted.
--- TOP COMMENTS --- Well, it was nice knowing you.
That "everything is a wrapper" CEO always came across as very slimy to me
Developer And Technical
[R] Plain English outperforms JSON for LLM tool calling: +18pp accuracy, -70% variance
TL;DR: Tool-call accuracy in LLMs can be significantly improved by using natural language instead of JSON-defined schemas (~+18 percentage points across 6,400 trials and 10 models), while simultaneously reducing variance by 70% and token overhead by 31%. We introduce Natural Language Tools (NLT), a simple framework that decouples tool selection from response generation, eliminates programmatic format constraints, and extends tool calling to models without native tool-call support.
Resources: Paper
Authors: Reid T. Johnson, Michelle D. Pain, Jordan D. West
The Problem
Current LLMs use structured JSON/XML for tool calling, requiring outputs like:
This structured approach creates three bottlenecks:
Even when tool selection is separated from response generation, probability mass is diverted toward maintaining correct formatting rather than selecting the right tools.
Method: Natural Language Tools (NLT)
We introduce a simple three-stage framework that replaces JSON with natural language:
Example NLT architecture with Selector > Parser > Output
Stage 1 - Tool Selection: Model thinks through if any tools are relevant, then lists each tool with a YES/NO determination:
Stage 2 - Tool Execution: Parser reads YES/NO decisions and executes relevant tools
Stage 3 - Response: Output module receives tool results and generates final response
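The three stages above can be sketched as a minimal pipeline. This is illustrative only, not the paper's implementation: the tool names, prompt wording, and the `call_llm` helper are hypothetical stand-ins.

```python
# Minimal sketch of the NLT pipeline above (selector -> parser -> output).
# Tool names, prompts, and the call_llm helper are hypothetical stand-ins.

TOOLS = {
    "get_order_status": lambda: "Order #123 shipped on Oct 2.",
    "get_refund_policy": lambda: "Refunds accepted within 30 days.",
}

def call_llm(prompt: str) -> str:
    # Placeholder for any chat-completion API; no tool-call support needed.
    raise NotImplementedError

def select_tools(user_msg: str) -> list:
    # Stage 1: ask the model for a plain-text YES/NO line per tool.
    prompt = (
        f"User message: {user_msg}\n"
        "For each tool below, reply '<tool>: YES' or '<tool>: NO'.\n"
        + "\n".join(TOOLS)
    )
    return parse_selection(call_llm(prompt))

def parse_selection(reply: str) -> list:
    # Stage 2: a trivial string parser reads the YES/NO decisions; no JSON involved.
    chosen = []
    for line in reply.splitlines():
        name, _, decision = line.partition(":")
        if name.strip() in TOOLS and decision.strip().upper().startswith("YES"):
            chosen.append(name.strip())
    return chosen

def respond(user_msg: str, chosen: list) -> str:
    # Stage 3: the output module sees tool results and writes the final reply.
    results = "\n".join(TOOLS[name]() for name in chosen)
    return call_llm(f"Tool results:\n{results}\n\nNow answer: {user_msg}")
```

The point of the decoupling is visible in the parser: because the selector emits free text rather than a schema, any model that can follow instructions can participate, and malformed output degrades to "no tools selected" rather than a parse error.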
Evaluation: 6,400 trials across two domains (Mental Health & Customer Service), 16 inputs per domain, 5 repetitions per input. Both original and perturbed inputs were tested to control for prompt engineering effects.
Results
We find that NLT significantly improves tool-call performance, boosting accuracy by more than 18 percentage points (69.1% to 87.5%). Variance fell by more than 70%, from .0411 to .0121, when switching from structured tool calling to NLT.
DeepSeek-V3 was a standout example, jumping from 78.4% to 94.7% accuracy while its variance dropped from 0.023 to 0.0016, going from among the least stable to the most consistent performer.
While we couldn't compare relative gain, NLT extends tool calling to models without native tool calling support (DeepSeek-R1: 94.1% accuracy).
Basic NLT Template
Full prompts and implementation details in Appendix A. Works immediately with any LLM with no API changes or fine-tuning needed.
Limitations
Latency considerations: NLT requires a minimum of two model calls per response (selector + output), whereas structured approaches can respond immediately when no tool is needed.
Evaluation scope: We examined single-turn, parameterless tool selection. While less complex than existing multi-turn benchmarks, it proved sufficiently rigorous -- no model achieved 100% accuracy in either condition.
A full discussion on limitations and areas for further research can be found in section 5.9 of the paper!
Discussion & Implications
We propose five mechanisms for these improvements:
For agentic systems, the NLT approach could significantly boost tool selection accuracy, particularly for open-source models. This may be especially relevant for safety-critical tool-call capabilities.
For model trainers, training efforts currently devoted to SFT and RLHF for structured tool calls may be better directed toward natural-language approaches. This is less clear, as there may be cross-training effects.
One of the authors here, happy to answer any questions about experimental design, implementation, or discuss implications! What do you think?
--- TOP COMMENTS --- Great work. We use tool calling and JSON structured output extensively, and have seen examples where natural language queries (via ChatGPT) outperform the same tasks when presented as structured outputs.
I got so sick of begging the LLM for a rigorous output format that structured outputs felt like a safe haven, although even then some of our more complex use cases still surfaced JSON schema violations from the model (gpt-4o), to the extent that we validate the returned JSON and requery if necessary, increasing temperature and adding a random nonce to the prompt to bypass caching.
Will definitely be checking this out!
How do you pass parameters to the tools?
Qwen3-30B-A3B FP8 on RTX Pro 6000 blackwell with vllm
Power limit set to 450W
Short Context (1K tokens):
Long Context (256K tokens):
Sweet Spot (32K-64K context):
FP8 quantization really shines here - getting 115 tok/s aggregate at 256K context with 10 users is wild, even with the power constraint.
https://preview.redd.it/x9t4ttsvrgvf1.png?width=7590&format=png&auto=webp&s=0c86bf3cc42032a595ee4d02b2c78986da150836
--- TOP COMMENTS --- wow 10 users can run it off one blackwell 6000. first numbers i’ve seen for multi users. that’s a big deal for small and medium businesses. great value imo
You need to go read up on LACT immediately brother, and then apply the below config. On my RTX PRO 6000 Blackwell workstation cards, they run at ~280 watts, faster than STOCK settings and 600 watts from Nvidia.
UNDERVOLTING is king. Here is your LACT Config:
Looking for tools that can track my ai agent trajectory and also llm tool calling
So I’ve been building a customer support AI agent that handles ticket triage, retrieves answers from our internal knowledge base, and triggers actions through APIs (like creating Jira tickets or refund requests).
Right now, I’m stuck in this endless cycle of debugging and doing root cause analysis manually.
Here’s what I’m realizing I really need:
It’s crazy how little visibility most stacks give once you’re past the prototype phase.
How are you all debugging your agentic systems once they hit production? I have been researching platforms such as Maxim, Langfuse, etc. But I wanted to ask if you use any specific setup for tracing / tool-use monitoring, or is it still a mix of logs and dashboards?
--- TOP COMMENTS --- I just created this and I'm looking for people to test it out. Feel free to give it a try.
https://aisentinel.info
I have this covered in a platform I'm beta testing https://github.com/imran31415/agentlog
Free, open source, self-hostable, and written in a sane language like Go
Reviewing Claude Code changes is easier on an infinite canvas
Ever since Sonnet 3.5 came out over a year ago, my workflow has changed considerably.
I spend a lot less time writing code so the bottleneck has now shifted towards reading and understanding it.
This is one of the main reasons I've built this VSCode extension where you can see your code on an infinite canvas. It shows relationships between file dependencies and token references, and displays AI changes in real time.
If you'd like to try it out you can find it on the VSCode extensions marketplace by searching for 'code canvas app'. Would love any feedback.
What do you guys think? Have you noticed the same change in your code workflow, and would something like this be useful to speed up code reviewing Claude Code changes?
--- TOP COMMENTS --- Hey thanks, this looks pretty useful! I noticed in one of your reviews that someone said it's too expensive. I couldn't find your pricing page anywhere. Is it free or is it behind a paywall?
I think maybe a $20 one-time payment for pro rather than a permanent 5 bucks a month is a bit more convincing - no offense!!
I fine-tuned Qwen3-VL (4B & 8B) on a free Colab instance using TRL (SFT and GRPO)!
I've created a couple of notebooks that work for free on Colab (T4 GPU) to fine-tune the new Qwen3-VL small and dense vision-language models (4B and 8B). Both the Instruct and Thinking variants are supported.
They use TRL, which handles most of the training complexity so you can focus entirely on the specific task you want to fine-tune for.
The Thinking variants structure their outputs into <think> and <answer> sections. Both notebooks can be run on a free Colab instance, but can also be scaled up for more advanced setups. The notebooks can also be accessed here: https://github.com/huggingface/trl/tree/main/examples/notebooks
Feedback and experiments are welcome!!
--- TOP COMMENTS --- this is really cool. what is TRL? i have always wanted to know how you made a model into a thinking one.
I can't get Qwen3-VL to even run locally. LM Studio can't even run its GGUFs at all, and ollama doesn't even list it
Sora prompting thread 🧵
Hi all, I have noticed SORA is now responding better to prompts with delimiters. What has worked well for you?
Here is an example
https://sora.chatgpt.com/p/s_68f0a65b319481918198a2e0d2f5c7b7
Prompt:
lighting and mood
Cinematic video takes place in Tokyo, Japan. At night we focus on the nightlife, and in the morning we focus on the parks where people do yoga and meditation
main character
A cute house cat that is wearing a tie and has a work bag
the story
It is about a cat and his day. He starts by stretching in the park, then wears his tie and takes the train to work. After that he steps out and takes the train to party with other animals around his size (mostly cats) in Tokyo's vibrant nightlife scene. Then he takes his train home and sleeps at home
style
80s Anime style
music
background sounds from each environment
Why is there still no simple way to just save and reuse our own AI prompts?
We use ChatGPT or Claude every day, yet there’s still no clean, focused way to just save and reuse the prompts that actually work for us.
I’ve tried a bunch of tools — most are either too minimal to be useful, or so bloated that they try to be an “AI platform.”
Has anyone here found a lightweight, no-BS solution that just handles prompt management well?
(If not, maybe it’s time we build one together.)
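In the meantime, a truly minimal option is a single JSON file plus a few lines of Python. This is a sketch only; the file name and record fields are arbitrary choices, not an existing tool.

```python
import json
from pathlib import Path

LIBRARY = Path("prompts.json")  # arbitrary file name

def save_prompt(name, text, tags=None):
    # Load, update, and rewrite the whole library; fine at personal scale.
    data = json.loads(LIBRARY.read_text()) if LIBRARY.exists() else {}
    data[name] = {"text": text, "tags": tags or []}
    LIBRARY.write_text(json.dumps(data, indent=2))

def find_prompts(query):
    """Return prompts whose name, text, or tags contain the query."""
    data = json.loads(LIBRARY.read_text()) if LIBRARY.exists() else {}
    q = query.lower()
    return {
        name: entry
        for name, entry in data.items()
        if q in name.lower()
        or q in entry["text"].lower()
        or any(q in t.lower() for t in entry["tags"])
    }
```

A flat JSON file also versions cleanly in git, which covers the history/versioning use case the heavier platforms sell.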
--- TOP COMMENTS --- I took all my prompts and gave them to Gemini 2.5 Pro and had it make an interactive dashboard in HTML. It has a card view and a list view, a search, and there are category buttons. Just pop it into your browser as a tab and boom, easy peazy. See image: https://anonpaste.com/share/interactive-dashborad-for-prompts-dfd007ea38
that's what custom GPTs or gems are for. you even get versioning with the GPTs. I use git too tho.
llama.cpp GPU Support on Android Device
I have figured out a way to use the Android GPU for llama.cpp.
I mean, it is not the boost in tk/s you would expect, but it is good mostly for background work,
and I didn't see much of a difference between GPU and CPU mode.
I was using the lucy-128k model; I am also using k-v cache + state-file saving, so that's all that I got.
love to hear more about it from you guys : )
here is the relevant post : https://www.reddit.com/r/LocalLLaMA/comments/1o7p34f/for_those_building_llamacpp_for_android/
--- TOP COMMENTS --- What's actually impressive is the NPU, since it can generate 512x512 images with stable diffusion 1.5/2.1 models in 5 seconds. LLMs don't get that much of a speed boost, but they do give your phone breathing room. If you use an 8b model for 3 prompts, your phone turns into an oven if you use the CPU/GPU, but with the NPU, it's all good. Though the caveats are the need to convert models specifically to work with the NPU.
what is your phone and how do u run it with llama.cpp to enable GPU, pls provide more details, thx
Session Limit Hit Prematurely [75%]
I was using Sonnet 4.5 and it said I reached my session limit at 75% according to the usage tracker.
Sending a short one sentence question, akin to a Google search, to a new chat doesn’t go through either.
Earlier this week the same thing happened with Opus 4.1 at 91%, except with the weekly limit, and new short messages don’t go through either.
I think Sonnet & Opus being out of sync may have something to do with it because a previous Sonnet session did the same thing at 92%, but 75 is just too ridiculous not to address. And if Opus usage doesn’t roll over, and this happens every week, I’ll miss out on a good chunk of usage by the end of my billing cycle.
Is this something I email about or is there already a recourse system in place?
--- TOP COMMENTS --- I automatically reached the weekly limit at 76%
These Pro weekly limits are ridiculous. Weekly limits should only apply to Max users, who are often the ones abusing the servers. Hit 41% after 2 days rn.
Has Claude changed? It used to feel natural — now it’s stiff and overcomplicated
I’ve seen a lot of posts lately about people saying that the AI models have changed, and honestly, I used to think they were exaggerating.
I’ve been using Claude for about a year and a half — it’s been my favorite model for a long time. I liked it because it felt very natural, aligned with my writing style, and gave great support when drafting content.
But over the past few weeks, something feels off. Since Sonnet 4 (or maybe 4.5), the responses have become noticeably less natural and less “present.” It often produces long, overcomplicated text even when I explicitly ask for something concise. I find myself rewriting a lot more than before.
Out of curiosity, I tried GPT-5, and honestly, it gave me the kind of output I used to expect from Claude — more fluid, clear, and human-sounding.
I know this isn’t scientific feedback, but after using Claude daily for months, the shift is very noticeable.
Has anyone else felt the same change lately? Or is it just me?
--- TOP COMMENTS --- I had a similar problem when GPT switched from 4o to 5. I think people get so implicitly entrained to workflows that they can't functionally separate themselves to reorient when the product changes. I'm really enjoying Sonnet 4.5 compared to Sonnet 4. It seems to have more breadth. It's shrewder, less tolerant of BS. (There are bad parts too.)
It seems like whenever some people are saying "this sucks!", others are saying "actually I think this is better" and that suggests that it comes down to usage style and headspace as much as the products themselves.
This happened to me today. I was rate limited so didn't have access for a few days, but today when I got back, it's just dumb as fuck. It doesn't even understand things that it normally has no issues with.
Turn off your MCPs
If you're not actively using them, they are eating up a ton of your context window. The Chrome tools MCP alone eats up 10% of your context in every conversation. These tools are great when you need them but are quite expensive in terms of tokens.
--- TOP COMMENTS --- I've often thought there should be some MCP router that conditionally adds/removes certain MCPs based on the initial prompt context. Though I get it would be tricky to implement.
Not possible with Claude - but it's possible to have a sort of "how do I use this tool" system where it recalls the information when needed. Likewise, Cloudflare's MCP interface solution is also super elegant.
Basically, instead of MCP, just give a TypeScript interface they write code against to do multiple things at once.
Otherwise for now with Claude, yes. Only use what you need in a given moment.
Why does it switch models
I use Claude for brainstorming before DnD sessions, free plan. I had one very long chat with worldbuilding data. A few hours ago it switched from Sonnet 4.5 to Haiku, and Haiku can't handle it - it messes up the data and forgets details that Sonnet was able to keep in mind.
I can choose to switch to Sonnet only in web version and even then only through making a new chat. Is it a bug? Why is it happening? It's inconvenient and doesn't make sense since the chat was originally Sonnet anyway.
My question isn't related to usage limits in any way. It is about the automatic switching of models during updates in all chats.
--- TOP COMMENTS --- Money...
i think we need to wait another week to get this fixed
Companies And Business
I have to compliment anthropic: a good move to cut costs within months
Anthropic's recent moves are not about innovation, but a calculated playbook to cut operational costs at the expense of its paying users. Here's a breakdown of their strategy from May to October:
Conclusion: Over four to five months, Anthropic masterfully executed a cost-cutting campaign disguised as a product evolution. Users received zero net improvement in AI capability, while Anthropic successfully offloaded them onto a significantly cheaper infrastructure, pocketing the difference.
--- TOP COMMENTS --- I honestly see this really hurting both paid users and free users: paid users are getting screwed, hitting conversation limits like no tomorrow, and free users are stuck on the worst model possible with Haiku. This is terrible for both.
My experience so far is Sonnet 4.5 isn't quite as good as Opus 4.1, but I'm happy for close enough if it means the company is more sustainable and Claude Code doesn't go away!
OpenAI would have to spend over $1 trillion to deliver its promised computing power. It may not have the cash.
OpenAI (OPAI.PVT) would have to spend more than $1 trillion within the next five years to deliver the massive amount of computing power it has promised to deploy through partnerships with chipmakers Nvidia (NVDA), Broadcom (AVGO), and Advanced Micro Devices (AMD), according to Citi analysts.
OpenAI's latest deals with the three companies include an ambitious promise to deliver 26 gigawatts worth of computing capacity using their chips, which is nearly the amount of power required to provide electricity to the entire state of New York during peak summer demand.
Citi estimates that it takes $50 billion in spending on computing hardware, energy infrastructure, and data center construction to bring one gigawatt of compute capacity online.
Using that assumption, Citi analyst Chris Danely said in a note to clients this week that OpenAI's capital expenditures would hit $1.3 trillion by 2030.
OpenAI CEO Sam Altman has reportedly floated bolder promises internally. The Information reported in late September that the executive has suggested the company is looking to deploy 250 gigawatts of computing capacity by 2033, implying a cost of $12.5 trillion.
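The arithmetic behind both headline figures follows directly from Citi's per-gigawatt assumption:

```python
COST_PER_GW = 50e9  # Citi estimate: ~$50B per gigawatt of compute capacity

committed_cost = 26 * COST_PER_GW   # the Nvidia/Broadcom/AMD commitments
floated_cost = 250 * COST_PER_GW    # figure reportedly floated for 2033

print(f"26 GW  -> ${committed_cost / 1e12:.2f} trillion")  # $1.30 trillion
print(f"250 GW -> ${floated_cost / 1e12:.1f} trillion")    # $12.5 trillion
```

So the whole spread between the $1.3 trillion and $12.5 trillion numbers is just the gigawatt figure you believe, multiplied by the same $50B-per-GW estimate.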
But there's no guarantee that OpenAI will have the capital to support the costs required to achieve its goals.
--- TOP COMMENTS --- I hate headlines like this.
"OpenAI would have to spend over $1 trillion to deliver its promised computing power."
According to who? Some Yahoo writer or maybe some analyst he's quoting from a conference?
New power plants, and new factories manufacturing chips and other hardware, are being built every day to accommodate OpenAI. I don't believe using existing cost figures will provide an accurate picture of what is to come.
That’s why they’re doing porn
Tutorials And Guides
This guy literally explains how to build your own ChatGPT (for free)
--- TOP COMMENTS --- He just recently released an even cooler project called nanochat - a complete open-source pipeline from pre-training to chat-style inference.
This guy is a legend. Although this is the OpenAI sub, his contributions to the field should definitely not be marginalized.
Because he worked at OpenAI and was one of its founding members, not some random guy on YouTube
Industry Adoption
The Void at the Center of AI Adoption
Companies are adding AI everywhere — except where it matters most.
If you were to draw an organization chart of a modern company embracing AI, you’d probably notice something strange:
a massive void right in the middle.
The fragmented present
Today’s companies are built as a patchwork of disconnected systems — ERP, eCommerce, CRM, accounting, scheduling, HR, support, logistics — each operating in its own silo.
Every software vendor now promises AI integration: a chatbot here, a forecasting tool there, an automated report generator somewhere else.
Each department gets a shiny new “AI feature” designed to optimize its local efficiency.
But what this really creates is a growing collection of AI islands. Intelligence is being added everywhere, but it’s not connected.
The result? The same operational fragmentation, just with fancier labels.
The missing layer — an AI nerve center
What’s missing is the AI layer that thinks across systems — something that can see, decide, and act at a higher level than any single platform.
In biological terms, it’s like giving every organ its own mini-brain, but never connecting them through a central nervous system. The heart, lungs, and limbs each get smarter, but the body as a whole can’t coordinate.
Imagine instead a digital “operations brain” that could:
- Access data from all internal systems (with permissions).
- Label and understand that data semantically.
- Trigger workflows in ERP or CRM systems.
- Monitor outcomes and adjust behavior automatically.
- Manage other AI agents — assigning tasks, monitoring performance, and improving prompts.
This kind of meta-agent infrastructure — the Boss of Operations Systems, so to speak — is what’s truly missing in today’s AI adoption landscape.
Human org chart vs AI org chart
Let’s imagine two organization charts side by side.
Human-centric organization
A traditional org chart scales by adding people.
Roles are grouped around themes or departments — Marketing, Sales, HR, Finance, Operations.
Each role is broad: one person might handle several business processes, balancing priorities and communicating between systems manually.
As the business grows, headcount rises.
Coordination layers multiply — managers, team leads, assistants — until communication becomes the bottleneck.
AI-centric organization
Now, draw an AI org chart.
Here, the structure scales not by people but by processes.
Each business process — scheduling, invoicing, payroll, support triage, recruitment, analytics — might have one or two specialized AI agents.
Each agent is trained, prompted, and equipped with access to the data and systems it needs to complete that specific workflow autonomously.
When the business doubles in size, the agents don’t multiply linearly — they replicate and scale automatically.
Instead of a hierarchy, you get a network of interoperable agents coordinated by a central control layer — an “AI operations brain” that ensures data flow, compliance, and task distribution.
This model doesn’t just replace humans with AI. It changes how companies grow. Instead of managing people, you’re managing intelligence.
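At its simplest, that central control layer is little more than a registry that routes each business process to its agent and records outcomes for later adjustment. The sketch below is purely conceptual; the class, process names, and handlers are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class OperationsBrain:
    """Central control layer: route each process to its agent, log outcomes."""
    agents: dict = field(default_factory=dict)    # process name -> handler
    outcomes: list = field(default_factory=list)  # feedback for adjustment

    def register(self, process, handler):
        self.agents[process] = handler

    def dispatch(self, process, payload):
        if process not in self.agents:
            return f"unhandled: {process}"  # escalate to a human instead
        result = self.agents[process](payload)
        self.outcomes.append((process, result))
        return result

brain = OperationsBrain()
brain.register("invoicing", lambda p: f"invoice sent to {p['customer']}")
brain.register("support_triage", lambda p: f"ticket routed: {p['issue']}")
```

Everything hard about the real version lives behind those handlers (permissions, semantics, compliance), but the shape of the layer — register, dispatch, observe — is this small.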
Why this void exists
This central layer doesn’t exist yet for one simple reason: incentives.
Every SaaS vendor wants AI to live inside their platform. Their business model depends on owning the data, the interface, and the workflow. They have no interest in enabling a higher-level system that could coordinate between them.
The result is an AI landscape where every tool becomes smarter in isolation — yet the overall organization remains dumb.
We’re optimizing the parts, but not the system.
The next layer of AI infrastructure
The next wave of AI adoption won’t be about automating tasks inside existing platforms — it’ll be about connecting the intelligence between them.
Companies will need AI agents that can:
Essentially, an AI operating system for organizations — one that finally closes the gap between fragmented SaaS tools and unified, intelligent operations.
The opportunity
This “void” in the middle of the AI adoption curve is also the next trillion-dollar opportunity.
Whoever builds the connective tissue — the platform that lets agents reason across data silos and act with context — will define the future of how businesses run.
Right now, companies have thousands of AI-enhanced tools.
What they lack is the AI that manages the tools.
The age of intelligent organizations won’t begin with another plugin or chatbot.
It’ll begin when the center of the org chart stops being empty.
--- TOP COMMENTS --- Hey look, it's AI talking about exactly what I am building!
Lmao AI talking about AI.
Top US Army general says he’s using ChatGPT to help make key command decisions
https://nypost.com/2025/10/16/business/us-army-general-william-hank-taylor-uses-chatgpt-to-help-make-command-decisions/
--- TOP COMMENTS --- The prompt: In this scenario, what would Sun Tzu do?
His soldiers’ prompts to ChatGPT: "Write a two word response to my General to affirm the General's orders. Keep it professional and effective."