What Sovereign AI Means for Devs
Rumor has it, it wasn't reported to protect user data, but as an act of mercy to spare us prompts about passive-aggressive vulnerability, fishing for praise, and humblebragging.
PODCAST
The Rise of Sovereign AI and Global AI Innovation in a World of US Protectionism
Everyone worried about the rise of Skynet and the Terminator - nobody predicted the rise of MechaHitler.
Still, the episode flagged how sovereign AI is becoming a national priority as governments look to secure sensitive data and reduce reliance on US infrastructure. Frank outlined how geopolitics, power availability, and urgency are reshaping deployment strategies: some countries, like France, are building large-scale, state-of-the-art facilities, while others are moving faster with more modest, local setups.
Indonesia offers a practical example of this approach:
Repurposed infrastructure: An existing telecom data center was upgraded with H100s for inference use.
Rapid deployment: The entire setup went live in just four months.
Click below to listen - it's what John Connor would do.
HIDDEN GEMS
The Safety Dance // Gem // Song
A curated collection of resources on building safety-critical AI systems, focused on reliability, fault tolerance, and real-world deployment challenges.
Memories Are Made of This // Gem // Song
Demonstrating how to add persistent long‑term memory to a Gemini 2.5 chatbot using the Gemini API and Mem0, enabling the model to recall past conversations and deliver more personalized responses (a rough sketch of the pattern follows this list).
All For Me Grog // Gem // Song
An open-source LLM compiler called Grog for building modular, type-safe pipelines by composing functions into graphs, with built-in caching, observability, and cost tracking for production use.
An open-source GitHub project, Pangolin, offering a self-hosted alternative to Cloudflare Tunnels, combining a WireGuard-based proxy with a dashboard UI, identity access control, and built-in routing via Traefik.
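If you want the flavor of the Mem0 gem before clicking through, here's a minimal sketch of the recall-generate-remember loop it describes. It assumes the mem0 Memory.add/Memory.search interface and the google-genai client; exact signatures and return shapes vary by version, so treat it as a sketch rather than the gem's own code.

```python
# Minimal sketch of the recall-generate-remember loop from the Mem0 gem.
# Assumes `pip install mem0ai google-genai` plus the relevant API keys;
# exact signatures and return shapes may differ across library versions.
from mem0 import Memory
from google import genai

memory = Memory()        # Mem0's default memory store
client = genai.Client()  # reads the Gemini API key from the environment

def chat_turn(user_id: str, user_msg: str) -> str:
    # 1) Recall: fetch memories relevant to the new message.
    recalled = memory.search(query=user_msg, user_id=user_id)
    # Return shape varies by mem0 version: a list, or a dict with "results".
    items = recalled.get("results", []) if isinstance(recalled, dict) else recalled
    context = "\n".join(m["memory"] if isinstance(m, dict) else str(m) for m in items)

    # 2) Generate: prepend recalled facts so the model can personalize its reply.
    prompt = f"Known facts about this user:\n{context}\n\nUser: {user_msg}"
    reply = client.models.generate_content(model="gemini-2.5-flash", contents=prompt).text

    # 3) Remember: store the exchange so future turns can recall it.
    memory.add([{"role": "user", "content": user_msg},
                {"role": "assistant", "content": reply}], user_id=user_id)
    return reply
```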
PODCAST
From the Legal Trenches to Tech
It was nice talking to a lawyer and not being referred to as the accused for once.
Nick shared how his legal background shaped the AI tools he's building to help disability attorneys work faster and more fairly. He's using LLMs and structured logic to automate complex parts of Social Security disability law, especially parsing medical records and auditing testimony. By encoding regulatory frameworks into decision trees and layering in RAG, his team surfaces the specific medical and legal evidence needed to support or appeal a claim.
Parsing medical records is a key use case where LLMs outperform keyword search (a toy sketch of the temporal-check side follows the list):
They interpret inconsistent clinical phrasing (e.g. “limited mobility” vs “reduced range”) against regulatory criteria
ICD-10 mapping and temporal logic let the models track whether all conditions are met within required timeframes
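To make that second bullet concrete: an LLM would turn free-text clinical notes into dated, ICD-10-coded findings (that extraction step is omitted here), and plain Python can then check whether a condition stays documented across the required timeframe. This is a toy sketch, not Nick's actual system - the names, codes, and thresholds are made up for illustration.

```python
# Toy sketch of the "temporal logic" layer: given dated, ICD-10-coded findings
# (the kind an LLM would extract from free-text medical records), check whether
# a condition is documented across a 12-month durational requirement without
# long gaps. Names, codes, and thresholds are illustrative only.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Finding:
    icd10: str      # e.g. "M54.16" - the LLM maps phrasing like "limited mobility" to a code
    noted_on: date  # date of the clinical note the finding came from

def meets_duration(findings: list[Finding], icd10_prefix: str,
                   required_days: int = 365, max_gap_days: int = 180) -> bool:
    """True if findings matching the ICD-10 prefix span at least `required_days`
    with no documentation gap longer than `max_gap_days`."""
    dates = sorted(f.noted_on for f in findings if f.icd10.startswith(icd10_prefix))
    if not dates or (dates[-1] - dates[0]).days < required_days:
        return False
    return all(b - a <= timedelta(days=max_gap_days) for a, b in zip(dates, dates[1:]))

# Four notes describing the same impairment in different words, already
# normalized to one code by the (hypothetical) LLM extraction step.
notes = [Finding("M54.16", date(2023, 1, 10)),
         Finding("M54.16", date(2023, 6, 2)),
         Finding("M54.16", date(2023, 10, 15)),
         Finding("M54.16", date(2024, 2, 20))]
print(meets_duration(notes, "M54.1"))  # True: documented continuously for over 12 months
```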
Click below - this episode is now in session.
MEME OF THE WEEK
UPCOMING READING GROUP - JULY 24
Small Language Models are the Future of Agentic AI
Are LLMs overkill for most agentic AI tasks?
This month's MLOps Reading Group is tackling that head-on with “Small Language Models are the Future of Agentic AI.”
The authors argue that small models can be faster, cheaper, easier to deploy - and actually better for many use cases. Expect debate around tradeoffs, practical deployment tips, and a look at their proposed LLM-to-SLM conversion framework.
Join us for a sharp, technical discussion on where small models fit in modern agent design. It’s happening Thursday, July 24 - grab the paper, bring your questions, and come argue with smart people.
BLOG
The Impact of Prompt Bloat on LLM Output Quality
Like many, I learned the hard way that just because I could handle a lot, doesn't mean I should.
It’s the same with LLMs. Overloading prompts with unnecessary context can reduce output quality, even when you're within token limits. Longer isn’t always better - irrelevant or poorly placed information can distract the model or throw off its reasoning.
This is especially true with chain-of-thought prompting:
Even when LLMs detect irrelevant input, they often can’t ignore it, leading to off-track or incorrect reasoning.
Semantically similar distractions are particularly damaging, as they’re harder for models to distinguish from useful context.
Tools like ScaleDown and DSPy help clean prompts automatically; a hand-rolled sketch of the basic idea follows below.
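For a feel of the idea at its crudest, here's a minimal sketch - not ScaleDown or DSPy, just TF-IDF cosine similarity from scikit-learn - that drops context snippets unrelated to the question before they ever reach the prompt. The ranking method and cutoff are arbitrary; the point is only that trimming semantically unrelated text is cheap.

```python
# Minimal sketch of de-bloating a prompt: rank candidate context snippets by
# TF-IDF cosine similarity to the question and keep only the best matches.
# Hand-rolled for illustration - this is not how ScaleDown or DSPy work internally.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def prune_context(question: str, snippets: list[str], top_k: int = 2) -> list[str]:
    vec = TfidfVectorizer().fit([question] + snippets)
    scores = cosine_similarity(vec.transform([question]), vec.transform(snippets))[0]
    ranked = sorted(zip(scores, snippets), reverse=True)
    # Keep at most top_k snippets, and only those with some relevance to the question.
    return [s for score, s in ranked[:top_k] if score > 0]

snippets = [
    "Quarterly OKRs are reviewed by the leadership team.",
    "The office plants are watered on Fridays.",
    "Failed invoice runs are retried twice before paging on-call.",
]
question = "Why did last night's invoice run fail?"
kept = prune_context(question, snippets)   # only the invoice-related snippet survives
prompt = "Context:\n" + "\n".join(kept) + f"\n\nQuestion: {question}"
```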
Click below to read - it’s not too much to handle.
ML CONFESSIONS
We added a few new features to improve churn prediction. One of them relied on a pretty brittle extraction script - nothing crazy, just some light parsing from raw logs.
A few weeks later, performance tanked. I went through everything - schema, pipeline, even retrained the old model. No difference.
Eventually noticed the new feature was always zero. Turns out the extraction was quietly failing, returning None, and then getting imputed as zero downstream. It had been like that during training too, so the model just learned with that garbage baked in.
I fixed the extractor. Accuracy got worse.
Rolled it back, flagged the bug as “low priority,” and haven’t touched it since.
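Not part of the confession, but the cheap guardrail for this failure mode is a feature-health check that fails loudly when a column is unexpectedly all-null or constant. A minimal sketch, assuming features arrive as a pandas DataFrame (column names are hypothetical):

```python
# Minimal sketch of a feature-health check: fail loudly when a feature is
# silently all-null or constant (e.g. an extractor returning None that gets
# imputed to zero downstream). Column names are hypothetical.
import pandas as pd

def check_feature_health(df: pd.DataFrame, max_null_rate: float = 0.2) -> None:
    problems = []
    for col in df.columns:
        null_rate = df[col].isna().mean()
        if null_rate > max_null_rate:
            problems.append(f"{col}: {null_rate:.0%} null")
        elif df[col].nunique(dropna=True) <= 1:
            problems.append(f"{col}: effectively constant")
    if problems:
        raise ValueError("Unhealthy features - fix extraction before training:\n  "
                         + "\n  ".join(problems))

# Example: the 'support_tickets_30d' extractor quietly failed, so the column is all None.
features = pd.DataFrame({
    "tenure_months": [3, 18, 7],
    "support_tickets_30d": [None, None, None],
})
check_feature_health(features)  # raises instead of letting zero-imputation hide the bug
```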
Share your confession here.
HOT TAKE
Most AI coding startups are just thin wrappers on VSCode and GPT‑4 with no moat, shaky licensing, and a burn rate that only makes sense if Google buys them.
Too harsh? Or exactly right?