Dark LLMs, Voice Clones, and Agentic Browsers
Darknet jailbroken chatbots are serving uncensored frontier models over Tor, voice clone scams just crossed the indistinguishable threshold.
TL;DR: We went live with Lior from Exploring ChatGPT and ripped through the week’s threat landscape: dark LLMs running uncensored over Tor, voice-cloning attacks that fool parents with 3-5 seconds of scraped audio, deepfake psyops in active warzones, and why every agentic browser on the market is one summarize-this-page away from handing your inbox to an attacker. Here’s the debrief.
0x00
Third stream with Lior. We do these Saturday afternoons. Him on the AI creator side, me on the “please stop doing that” side. The format is loose: whatever’s happening in AI security that week, we talk about it live, take questions, and try not to accidentally teach anyone anything too dangerous.
This week had range. We opened on OPSEC basics, Lior had his blinds open last stream and I had to explain that multimodal LLMs can now cross-reference a couple buildings in your background against Street View and triangulate your office. Bellingcat tested 24 models on geolocation tasks this year. Google AI Mode nailed locations that stumped every GPT variant. A few visible buildings in a livestream background is a solved problem for anyone with API access and a motive.
# the threat model for streamers is stupid simple now
1. screenshot background from stream
2. feed to multimodal LLM: "where is this?"
3. cross-reference Google Street View
4. pivot: office → wifi probe → home address
If you’re streaming and showing real windows, you’re publishing your coordinates. Virtual background or closed blinds. That’s it. Moving on.
Signal boost this.
0x01: The Darknet Chatbot Underground Is Real and It’s Industrializing
So I’ve been doing some investigative work on darknet AI chatbots. Not hypothetical, not a thought experiment. I went through Whonix, embedded in the communities, built enough cred to get access, and sat down with one of these things.
The infrastructure is straightforward. There are two models operating right now.
First: jailbroken frontier models. Someone obtains API access to a production model, stolen key, compromised account, whatever, and wraps it in a routing layer. Your query goes out with a jailbreak prepended in the system prompt.
# simplified architecture of a darknet chatbot proxy
[user query] → [routing agent] → [jailbreak system prompt + user query] → [frontier model API] → [response]
That model is fragile. When the provider rotates the key or patches the jailbreak, the service goes dark until they establish a new one. Cat and mouse.
Second model is more durable: grab an open-weight model with light safety controls, Qwen series, fine-tuned Llama variants, strip whatever guardrails exist, jailbreak it locally, and serve it over Tor. No API dependency. No kill switch. Resecurity flagged one called DIG AI running exactly this architecture — no registration, multiple specialized models behind a single Tor-hosted interface. Malicious AI tool mentions on cybercrime forums are up 219% year over year.
The one I accessed was running something with frontier-level knowledge. I pushed it as far as it would go for journalistic purposes. Completely uncensored. It walked me through things step-by-step with that same chipper helpful-assistant tone. Safety tips included. “Make sure you wear goggles during this step.” Helpful.
The access model is interesting. You don’t just stumble onto these. The URLs are .onion hashes. No search engine indexes them. You have to build credibility in darknet social channels first. Contribute something. Prove you’re not law enforcement. Then someone PMs you a link. The whole thing runs on reputation and social proof, which is its own kind of authentication layer.
Here’s what matters for defenders: APTs are adopting these tools. The quality floor on ransomware code is rising fast — way faster than human skill development would explain. We’re seeing more sophisticated TTPs from groups that were writing amateur hour payloads six months ago. The code got better because the tools got better.
Join the feed.
0x02: Voice Cloning Crossed the Indistinguishable Threshold
Lior brought up Descript. He reads one paragraph, the tool clones his voice, and it can generate 10 minutes of audio that sounds exactly like him. Five seconds of clean audio. That’s the input requirement now.
So here’s the attack chain that’s actually working in the wild right now:
# voice clone social engineering kill chain
1. scrape target's Instagram/TikTok for voice sample (3-5 sec)
2. clone voice with commodity tool (ElevenLabs, open-source alternatives)
3. OSINT: map family relationships via Facebook
4. obtain parent/spouse phone number via data broker
5. spoof caller ID to match target's number
6. call parent: cloned voice + social engineering script
→ "Mom, I was just in a car accident. The other driver has a gun.
I need $5,000 right now or he's going to hurt me."
7. layer pressure: urgency, panic, time constraint
8. extract payment via wire, Zelle, or crypto
Voice clone scams surged 148% in 2025. Deepfake fraud cleared $200 million in Q1 alone. A researcher at University at Buffalo says voice cloning has crossed what he calls the “indistinguishable threshold”, the perceptual tells that used to flag synthetic voices have largely disappeared. Some retailers report 1,000+ AI-generated scam calls per day.
And it’s not just audio anymore. Real-time face mapping works over FaceTime now. Attacker sits in front of a camera, maps the target’s face and voice over their own in real time, and video calls the target’s family. The family sees their kid’s face and hears their kid’s voice asking for help. These aren’t hypotheticals, Hong Kong police arrested 27 people running real-time face-swap operations on dating platforms.
We also talked about deepfakes in active conflict. In 2022, pro-Russian hackers pushed a deepfake of Zelenskyy calling on Ukrainian forces to surrender. Hacked the Ukraine 24 broadcast ticker and planted the video on their website. It got debunked fast, but a retaliatory Putin deepfake followed hours later on Telegram. Both were crude. Neither would pass today’s models. Voice and video are no longer proof of anything in a contested information environment.
The one defense that actually works against all of this: establish a family safe word offline. Something absurd. “there’s a pack of angry raccoons” that you’ve never said on camera or typed into a device. If a panicked voice calls asking for money, ask for the phrase. If they don’t know it, hang up. Call your actual family member on a number you know.
# the analog kill switch
if (caller_claims_emergency && !knows_safe_word):
hang_up()
call(known_number)
verify()
Low-tech beats high-tech when the attack depends on trust and the defense depends on authentication.
0x03: Every Agentic Browser Ships with the Same Unfixable Vuln
We got a question about Comet, Perplexity’s agentic browser. Lior asked if I use it. The short answer: I keep it on a separate machine for experimentation and I don’t log into anything real.
The longer answer is that every agentic browser on the market right now, Comet, ChatGPT Atlas, Opera Neon, the lot, has the same fundamental architectural problem. They feed webpage content directly to an LLM without cleanly separating user instructions from untrusted page data. That’s indirect prompt injection, and it’s OWASP #1 for LLMs for a reason.
Brave’s security team demonstrated this across multiple browsers. The attack:
# indirect prompt injection in agentic browsers
1. attacker embeds hidden instructions in webpage
(white text on white background, invisible to human)
2. user asks browser agent: "summarize this page"
3. agent ingests page content as part of prompt
4. hidden instructions execute as commands
5. agent accesses other tabs, reads emails, exfiltrates data
Page summarization had a 73% attack success rate. Question answering hit 71%. And here’s the kicker, even after patching, by the 10th fuzzing iteration, the best-performing browsers still failed 58-74% of the time as the attack LLM learned to generate sophisticated mutations.
OpenAI said it themselves: prompt injection “is unlikely to ever be fully solved.” The UK’s National Cyber Security Centre confirmed it. Anthropic got Claude for Chrome’s attack success rate down to about 1% with RL-based training and classifier improvements, which is real progress, but they also explicitly said that doesn’t mean the problem is solved.
If you’re logged into Gmail, Substack, your bank, and you tell an agentic browser to summarize a page that contains hidden instructions, those instructions can now operate with your session cookies across every tab. The traditional browser security model treats web content as untrusted by default. Agentic browsers blur that line by letting content shape behavior.
I told Lior straight: I’m too paranoid for these. And I do this for a living.
0xFF
0x00 — Multimodal LLMs turned streamer OPSEC from a paranoia exercise into a geolocation API call. A couple buildings in your background is all it takes.
0x01 — The darknet chatbot underground runs two architectures: hijacked frontier model APIs behind jailbreak proxies, and locally hosted open-weight models with stripped guardrails. Both are live, both are uncensored, and APTs are using them to level up their TTPs faster than human skill development explains.
0x02 — Voice cloning crossed the indistinguishable threshold. A few seconds of audio, a spoofed caller ID, and a social engineering script is pulling real money from real families right now. The only defense that works is analog: a safe word you’ve never said on camera.
0x03 — Every agentic browser ships with indirect prompt injection as a structural defect. The attack surface is every webpage, and the payload executes with your session permissions. OpenAI says it’ll never be fully solved. Anthropic got it down to 1%. The gap between those numbers is where the risk lives.
The common thread: AI capability is outrunning AI security on every front. The tools that make you productive are the same tools that make you targetable. The models that help you code are the same models generating ransomware. The voice that sounds like your kid might not be your kid.
Ping back.








Timely, informative, and engaging. Thanks so much for writing this.
Good to know my natural paranoia wasn't off-base.
I'm a big fan of family safe words. Some notes to share:
1. Creating a safeword (or security phrase) isn't enough. You have to rehearse them at least quarterly, or they'll be forgotten. Especially by those who treat security and trust blithely (which is most people who haven't been victimized).
2. Failing to remember a safeword is okay if you ask a security question in real time. "What was street address of the house back in New York?" The attacker would need a dossier on you to know that the correct answer is "we've never lived in New York."
Security questions suck for login purposes, but will probably work for identity verification in real time.
3. If you're going through the trouble of setting up safewords, you might as well set up a duress code while you're at it. Just in case you're ACTUALLY in a car accident and the other driver is threatening you with a gun. "I'm doing fine, mom. I'm just watching a pack of angry raccoons fighting in the backyard, that's all."
This is the kind of post I wish more people would write: mechanisms, not vibes. You are not selling panic, but you are also not selling “we’ll patch it and move on.”
Three points for the win!
First, the family safe word. That is an actual control. Low tech, high leverage, and it breaks the scam’s whole economic model. Callback on a known number plus one absurd phrase beats a thousand “be vigilant” posters.
Second, agentic browsers. The uncomfortable truth is the browser security model assumes page content is untrusted, then we bolt on an agent that treats page content like instructions. That is not a bug. That is a category error. If someone is logged into email or banking and asks “summarize this,” they are one hidden prompt away from turning their session cookies into a remote control.
Third, the identity angle hiding inside the voice clone section. These attacks do not win on technology. They win on urgency and trust. That is why an analog authentication step works so well.
If anyone reading this wants a one sentence policy, here it is. Treat agentic browsers like a hazardous material. Separate machine or separate profile, no real logins, no auto actions. And for families, set the safe word now, before you need it.