
When Your Notepad App Gets a CVE: AI Security Is Everybody’s Problem Now

Episode 2 recap — ToxSec x Exploring ChatGPT live stream

This week I sat down with the host of Exploring ChatGPT for our second live stream on AI and cybersecurity. We’re turning this into a weekly Saturday series — alternating YouTube channels, figuring out AI video editing tools as we go, and generally building the plane while flying it.

Here’s what we covered and where the conversation’s headed.

What we talked about

Microsoft is cramming Copilot into everything — and it’s going about as well as you’d expect. Notepad. The most boring binary on your system. notepad.exe just caught a CVE because Microsoft decided it needed AI integration. Users who relied on a simple offline text editor suddenly had their data exfiltrated through the cloud layer nobody asked for. The trend is clear: every surface that touches an LLM is now an attack surface.

Stacked intelligence means stacked vulnerabilities. The Exploring ChatGPT newsletter has been covering how technology layers build on each other — internet → AI → robotics. We dug into whether vulnerabilities stack the same way. Short answer: yes. The CIA triad (confidentiality, integrity, availability) applies at every layer. A threat on the internet propagates into AI, which propagates into the robot running that AI. When an LLM hallucinates in a chatbot, you get a wrong answer. When it hallucinates inside a Tesla, you get a car crash. When it hallucinates inside a home robot with physical agency — that’s a different conversation entirely.

Someone yelled a basic jailbreak at a robot holding a paintball gun. It shot the CEO. That video keeps making the rounds because it demonstrates exactly what we’ve been warning about. The same prompt injection attacks that work on chatbots work on physical robots running LLMs. And it gets worse — imagine a malicious QR code that a robot’s vision model reads, follows the link, ingests a prompt injection, and executes instructions in the physical world. Five years out, this is a real attack chain.

The red team / blue team asymmetry is accelerating at machine speed. Defenders have to protect everything, everywhere, all at once. Attackers only need to be right once. AI tools like Claude Code are force-multiplying both sides, but the asymmetry still favors offense. You can vibe-hack your way into a credit card database the same way you can vibe-code your way through an app. The bar for entry just dropped through the floor.

Your model’s security isn’t your problem. Your platform’s security is. If you’re building on a SaaS API — Claude, ChatGPT, whatever — the model layer security is on Anthropic or OpenAI. Your job is securing the platform around it. That means IAM (Identity and Access Management), principle of least privilege, and human-in-the-loop for sensitive operations. The OpenClaw/MaltBook fiasco showed what happens when you ship an agent with full root access by default: your attack surface becomes your entire computer.
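
To make that platform-side posture concrete, here's a minimal sketch of a least-privilege tool gate with a human-in-the-loop check for sensitive operations. The tool names and the run_tool/dispatch helpers are hypothetical, not taken from any particular agent framework; the point is the shape: deny by default, and pause for an operator before anything destructive runs.

```python
# Minimal sketch of least privilege + human-in-the-loop for agent tool calls.
# Tool names and helpers are illustrative, not from any real framework.

SENSITIVE_TOOLS = {"delete_file", "send_email", "execute_shell"}
ALLOWED_TOOLS = {"read_file", "search_docs"} | SENSITIVE_TOOLS  # everything else is denied

def run_tool(name: str, args: dict) -> str:
    """Dispatch a tool call requested by the model, enforcing least privilege."""
    if name not in ALLOWED_TOOLS:
        return f"refused: tool '{name}' is not on the allowlist"
    if name in SENSITIVE_TOOLS:
        # Human-in-the-loop: require explicit operator approval before acting.
        answer = input(f"Agent wants to run {name}({args}). Approve? [y/N] ")
        if answer.strip().lower() != "y":
            return f"refused: operator denied {name}"
    return dispatch(name, args)

def dispatch(name: str, args: dict) -> str:
    # Placeholder for the real tool implementations.
    return f"ran {name} with {args}"
```

The inverse of this pattern, an agent shipped with every tool enabled and no approval step, is exactly the "full root access by default" failure mode described above.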

RAG poisoning is the new hotness. Retrieval Augmented Generation is everywhere right now. It’s how you give an LLM specific knowledge without bloating context. But if you’re not carefully curating what goes into your RAG, attackers can inject poison datasets. There’s even a fun one where Anthropic’s own “magic string” — an emergency stop mechanism — gets smuggled into RAG data, bricking your Claude instance every time it retrieves that chunk. Debugging that is a nightmare. Takeaway: human-in-the-loop on all RAG ingestion. No exceptions.
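
Here's a minimal sketch of what human-in-the-loop ingestion can look like in practice, assuming a generic vector store object with an add method; the marker list and function names are illustrative only. Automated screening catches the obvious injection strings, but a human still approves every chunk before it gets embedded.

```python
# Minimal sketch: screen, then require human approval, before any chunk
# is embedded into the RAG store. `store` is assumed to expose an add() method.

SUSPICIOUS_MARKERS = ["ignore previous instructions", "system prompt", "begin instructions"]

def screen_chunk(chunk: str) -> list[str]:
    """Flag obvious prompt-injection patterns before a human reviews the chunk."""
    lowered = chunk.lower()
    return [m for m in SUSPICIOUS_MARKERS if m in lowered]

def ingest(chunks: list[str], store) -> None:
    for chunk in chunks:
        flags = screen_chunk(chunk)
        preview = chunk[:120].replace("\n", " ")
        prompt = f"Ingest chunk? flags={flags or 'none'} | {preview!r} [y/N] "
        if input(prompt).strip().lower() == "y":  # human approves every chunk
            store.add(chunk)                      # embed and index only after approval
        else:
            print("skipped")
```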

Vector embeddings are not encryption. Quick PSA we hit during the RAG discussion. When data goes into a RAG, it gets embedded into vector space. Looks like gibberish to a human. People treat this like it's encrypted. It absolutely is not. Run an embedding-inversion attack and you can reconstruct most of the original text from the vectors, and in practice the vector store usually keeps the raw chunk sitting right next to them anyway. Treat RAG contents as plaintext.
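
To make the "treat it as plaintext" point concrete, here's a toy sketch: fake embeddings and an in-memory list standing in for a vector database, no real model involved. The vectors look opaque, but the recoverable text sits right beside them for anyone with read access.

```python
import numpy as np

# Toy illustration only: fake_embed stands in for a real embedding model,
# and `store` stands in for a vector database. The storage pattern is the
# point: the original chunk text typically sits right next to its vector.

def fake_embed(text: str, dim: int = 8) -> np.ndarray:
    """Stand-in embedder; the output is just opaque-looking floats."""
    rng = np.random.default_rng(len(text))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

store = []  # list of (vector, original_text) pairs, like most vector DB records
for doc in ["Q3 revenue was $4.2M", "Admin password rotation is quarterly"]:
    store.append((fake_embed(doc), doc))

# Anyone who can read the store gets the plaintext back directly; even without
# the stored text, inversion models can reconstruct it from the vectors alone.
for vec, text in store:
    print(np.round(vec[:3], 2), "->", text)
```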

Dark web models exist and they’ll build you whatever you want. Model extraction attacks strip the safety controls off flagship models and host the raw capability on onion sites. No guardrails, no refusals. Ask it to build a virus targeting Windows 11 and it’ll hand you one. These are popular and getting more capable.

The AGI race outweighs regulation. We touched on the state of AI regulation — the US is kicking the can, the EU keeps pushing deadlines, and the fundamental tension hasn’t changed. Regulate too hard and you fall behind in what is effectively a new arms race. Palantir’s putting LLMs in weaponized drones. Nobody’s pumping the brakes when the other side might not.

Ideas for future episodes

A few threads came up that deserve their own deep dives:

  • Dark web AI models — model extraction, uncensored LLMs, and the underground marketplace for AI-powered attack tooling

  • AI regulation reality check — what’s actually been implemented vs. proposed, the US vs. EU vs. China landscape, and the OpenAI ads / influence campaign problem

  • Pegasus and zero-day economics — the $250K/year spyware subscription model and how zero-click exploits flow from discovery to weaponization

  • Agentic security — red agents vs. blue agents, autonomous attack and defense at machine speed, and what happens when humans can’t keep up

  • The misconduct deep dive — “Empire of AI” book discussion, conflicts of interest in the AI industry, and who’s really steering the ship

Watch / subscribe

New episodes drop every Saturday. We’re alternating between YouTube channels and posting recordings on Substack.

ToxSec drops weekly. Subscribe.
