Topic: Hardware And Infrastructure

Nvidia’s $46.7B Q2 proves the platform, but its next fight is ASIC economics on inference
Behind Nvidia's strong quarterly results are ASICs gaining ground in key Nvidia segments, challenging its growth in the quarters to come....

Key Takeaways:
- Nvidia's data center revenue reached $41.1 billion, up 56% year over year.
- Custom ASICs, led by Broadcom, are gaining ground in key Nvidia segments and challenging its growth in the quarters to come.
- Nvidia's platform advantage and ecosystem lock-in are strategic strengths, but may not be enough to counter ASIC competitors' price and performance advantages.
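
As a rough check on the figures above, the arithmetic can be sketched as follows. It uses only the numbers quoted in the headline and takeaways; the variable names are illustrative, and the results are derived estimates rather than reported figures.

```python
# Figures quoted in the headline and takeaways, in billions of US dollars.
total_revenue = 46.7          # Q2 total revenue (headline)
data_center_revenue = 41.1    # Q2 data center revenue
yoy_growth = 0.56             # 56% year-over-year growth

# Implied data center revenue in the year-ago quarter.
prior_year_data_center = data_center_revenue / (1 + yoy_growth)

# Share of total revenue that came from the data center segment.
data_center_share = data_center_revenue / total_revenue

print(f"Implied year-ago data center revenue: ${prior_year_data_center:.1f}B")
print(f"Data center share of Q2 revenue: {data_center_share:.0%}")
```

This works out to roughly $26.3B for the year-ago quarter and a data center share of about 88%, which is why inference economics in that one segment matter so much to the overall picture.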

Google Debuts Device-Bound Session Credentials Against Session Hijacking
Article URL: https://www.feistyduck.com/newsletter/issue_128_google_debuts_device_bound_session_credentials_against_session_hijacking Comments URL: ht...

Key Takeaways:
- DBSC uses public-key cryptography to bind session credentials to a device, making them inaccessible on other devices.
- Google has announced a beta of DBSC in Google Workspace for users running Chrome on Windows.
- DBSC has the potential to make session hijacking a thing of the past if adopted by other browser vendors.
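
The general idea of binding a session to a device key can be illustrated with a toy challenge-response sketch. This is not the DBSC protocol or any browser API: it uses textbook RSA with two fixed, well-known primes purely so the example runs with the standard library, and the function names (`device_sign`, `server_verify`) are hypothetical. A real deployment would keep a per-device key in hardware such as a TPM.

```python
import hashlib
import secrets

# Toy RSA key pair built from two known Mersenne primes. NOT secure: this
# only illustrates the flow, and real keys are generated per device.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, stays on the device


def digest(challenge: bytes) -> int:
    """Hash the challenge and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def device_sign(challenge: bytes) -> int:
    """Device side: prove possession of the private key by signing."""
    return pow(digest(challenge), D, N)


def server_verify(challenge: bytes, signature: int) -> bool:
    """Server side: check the signature against the registered public key (N, E)."""
    return pow(signature, E, N) == digest(challenge)


# The server issues a fresh challenge when the session needs refreshing.
challenge = secrets.token_bytes(32)
signature = device_sign(challenge)
print(server_verify(challenge, signature))   # the bound device passes

# Someone holding only a stolen cookie cannot produce the right signature;
# any other value is rejected.
forged = (signature + 1) % N
print(server_verify(challenge, forged))
```

The point of the design is that a stolen cookie alone is not enough: refreshing the session requires a signature that only the device holding the private key can produce.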

NVIDIA Jetson Thor Unlocks Real-Time Reasoning for General Robotics and Physical AI
Robots around the world are about to get a lot smarter as physical AI developers plug in NVIDIA Jetson Thor modules — new robotics computers that can ...

Key Takeaways:
- Jetson Thor modules offer 7.5x more AI compute, 3.1x more CPU performance, and 2x more memory than their predecessor, enabling real-time processing of high-speed sensor data.
- The modules are being adopted by companies like Agility Robotics and Boston Dynamics to enhance their humanoid robots' real-time perception and decision-making capabilities.
- Jetson Thor supports all popular generative AI frameworks and AI reasoning models with unmatched real-time performance, empowering developers to easily experiment and run inference locally.

AI models need a virtual machine
Article URL: https://blog.sigplan.org/2025/08/29/ai-models-need-a-virtual-machine/ Comments URL: https://news.ycombinator.com/item?id=45074467 Points:...

Key Takeaways:
- A well-specified AI VM would enforce a clean separation between model logic and integration logic, making models interchangeable components.
- A VM specification can enforce safety by design, routing all tool usage and external access through a well-defined interface, and providing built-in access control, audit logs, and fail-safes.
- A VM specification could provide transparent performance and resource tracking, verifiability of model output, and enable potential formal proof capabilities for trust.
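
The takeaways above describe routing all tool usage through one interface with access control and audit logs. A minimal sketch of that idea follows. The `ToolVM` class, its fields, and the example tools are all hypothetical and are not taken from the article or from any existing specification.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolVM:
    """Hypothetical minimal 'VM' that mediates every tool call a model makes."""
    tools: dict[str, Callable[..., Any]]
    allowed: set[str]  # access-control policy: tool names models may call
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(self, model_id: str, tool: str, **kwargs: Any) -> Any:
        permitted = tool in self.allowed and tool in self.tools
        # Record every attempt, permitted or not, before doing anything else.
        self.audit_log.append(
            {"model": model_id, "tool": tool, "args": kwargs, "permitted": permitted}
        )
        if not permitted:
            raise PermissionError(f"{model_id} may not call {tool!r}")
        return self.tools[tool](**kwargs)


def add(a: int, b: int) -> int:
    return a + b


def delete_files(path: str) -> str:
    return f"deleted {path}"


vm = ToolVM(tools={"add": add, "delete_files": delete_files}, allowed={"add"})

print(vm.call("model-a", "add", a=2, b=3))
try:
    vm.call("model-b", "delete_files", path="/tmp/data")
except PermissionError as err:
    print("blocked:", err)
print(len(vm.audit_log), "calls recorded")
```

Because the model only ever sees `vm.call`, swapping one model for another does not change the integration code, and the policy and log live in one place.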

Framework actually did it: I upgraded a laptop’s entire GPU in just three minutes
On Tuesday, I told you how the modular computer company Framework was finally fulfilling its promise of the "holy grail for gamers" - a laptop with mo...

Key Takeaways:
- The modular system lets users swap out a laptop's GPU with no technical expertise required.
- Framework partnered with Nvidia to create an upgrade that fits and works in an existing laptop, a first for the industry.
- The system is expected to become more mainstream, with Framework aiming to deliver future upgrades that are not limited to a niche audience.

Dissecting the Apple M1 GPU, the end
Article URL: https://rosenzweig.io/blog/asahi-gpu-part-n.html Comments URL: https://news.ycombinator.com/item?id=45034537 Points: 541 # Comments: 110...

Key Takeaways:
- Open-source developers on the Asahi Linux project, which was founded by Hector Martin, reverse-engineered the Apple M1 GPU, paving the way for Linux to run natively on Apple devices.
- The project, Asahi Linux, now offers full graphics acceleration alongside working wireless and audio, and can run games through Proton with Direct3D 12 support.
- The work done on the M1 GPU has demonstrated the possibility of running conformant OpenGL, OpenGL ES, OpenCL, and Vulkan drivers on Apple platforms.

Framework is working on a giant haptic touchpad, Trackpoint nub, and eGPU for its laptops
Today, Framework announced the second-gen Framework Laptop 16 with two industry firsts: the first Nvidia graphics card upgrade you can perform at home...

Key Takeaways:
- Framework is working on a wide haptic touchpad similar to Apple's MacBooks.
- An eGPU for reuse of GPU modules is in development, targeting makers and potentially requiring 3D printing.
- The second-gen Framework Laptop 16 includes upgradeable Nvidia graphics and 240W laptop charging over USB-C.

Framework is now selling the first gaming laptop that lets you easily upgrade its GPU — with Nvidia’s blessing
Framework CEO Nirav Patel said he would deliver "the holy grail for gamers" with the Framework Laptop 16. In 2023, he suggested it'd be the first cons...

Key Takeaways:
- The new Framework Laptop 16 will ship with a mobile Nvidia GeForce RTX 5070 8GB that can be swapped in as little as two minutes, with a 30 to 40 percent uplift in performance compared to the original AMD Radeon RX 7700S.
- The laptop will also support up to four simultaneous displays, including the internal screen, and has four USB-C ports that can support 240W power input.
- Framework is taking preorders for the new laptop starting at $1,499 and will also release the new GPU and other upgrades as individual components for the existing Framework Laptop 16.

IBM and AMD Join Forces to Build the Future of Computing
Companies aim to merge AI accelerators, quantum computers, and high-performance computing to help solve a wide range of the world's most difficult pro...

Key Takeaways:
- IBM and AMD are collaborating on scalable, open-source platforms for quantum-centric supercomputing, leveraging IBM's quantum computers and AMD's high-performance computing and AI accelerators.
- The joint effort aims to tackle real-world problems at unprecedented speed and scale by combining the strengths of quantum and classical computing.
- The partnership could help progress IBM's vision to deliver fault-tolerant quantum computers by the end of this decade, leveraging AMD's real-time error correction capabilities.

Plaud upgrades its card-sized AI note-taker with better range
Plaud, the company behind an AI wearable that actually works, is launching an upgraded version of its credit card-sized note-taking device. Just like ...

The future of AI hardware isn’t one device — it’s an entire ecosystem
I dream of a gadget that can do it all. Instead, when I leave for the office, I pack one or two phones, a portable battery bank, a laptop, a Kindle, a...

Key Takeaways:
- Google views the future of AI hardware as a diverse set of accessories that work together in a personalized way, rather than a single dominant device.
- The company is experimenting with various form factors, including wearables, earbuds, and smart glasses, to see what works best.
- The goal is to create a seamless, ambient computing experience that anticipates users' needs and makes their lives easier, but this approach may lead to increased gadget clutter.

Community talk
Rising Tools
Nemotron-H family of models is (finally!) supported by llama.cpp
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and desig..
GPUPrefixSums – state of the art GPU prefix sum algorithms
Article URL: https://github.com/b0nes164/GPUPrefixSums Comments URL: https://news.ycombinator.com/it..
New AMD unified memory product - 512 bit bus = ~512GB/s memory bandwidth
Finally China entering the GPU market to destroy the unchallenged monopoly abuse. 96 GB VRAM GPUs under 2000 USD, meanwhile NVIDIA sells from 10000+ (RTX 6000 PRO)
[D] Huawei’s 96GB GPU under $2k – what does this mean for inference?
The Huawei GPU is not equivalent to an RTX 6000 Pro whatsoever
Patched P2P NVIDIA driver now works with multiple 5090s (and possibly blackwell 2.0 in general). Also works for 4090/3090.
Qwen3-coder is mind blowing on local hardware (tutorial linked)
Alibaba Creates AI Chip to Help China Fill Nvidia Void
GPT-OSS 120B on a 3060Ti (25T/s!) vs 3090
Deepseek r1 671b on a $500 server. Interesting lol but you guessed it. 1 tps. If only we can get hardware that cheap to produce 60 tps at a minimum.
GPT-OSS-120B on Single RTX 6000 PRO
128GB GDDR6, 3PFLOP FP8, Tb/s of interconnect, $6000 total. Build instructions/blog tomorrow.
Why do data centres consume so much water instead of using dielectric immersion cooling/closed loop systems?
Can 2 RTX 6000 Pros (2X98GB vram) rival Sonnet 4 or Opus 4?
Making progress on my standalone air cooler for Tesla GPUs
"NVIDIA Jetson Thor Unlocks Real-Time Reasoning for General Robotics and Physical AI"
3090 vs 5090 taking turns on inference loads answering the same prompts - pretty cool visual story being told here about performance
56GB VRAM achieved: Gigabyte 5090 Windforce OC (65mm width!!) + Galax HOF 3090 barely fit but both running x8/x8 and I just really want to share :)
Digit and Aimoga humanoid robots seems prepping for supermarkets
Training a 11M language model for Raspberry Pi Pico - progress
Automated microgreens mini-farm ran by Claude Code
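
As a rough check on the "512 bit bus = ~512GB/s" post title in the list above, theoretical peak memory bandwidth is bus width in bytes multiplied by the transfer rate. The sketch below assumes a transfer rate of 8000 MT/s, which is not stated in the post and is chosen only because it reproduces the quoted figure; the function name is illustrative.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_mt_s: float) -> float:
    """Theoretical peak bandwidth: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s times bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return bytes_per_transfer * transfers_mt_s / 1000


# Assumed transfer rate of 8000 MT/s (hypothetical, not from the post).
print(peak_bandwidth_gb_s(512, 8000))
```

Under that assumption the result is 512 GB/s. Sustained bandwidth in practice is lower than this theoretical peak.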