The Real Security Problem With LLM APIs Is…

Mar 3

Chinese labs distilled Claude’s agentic reasoning and coding edge with 24k fake accounts and 16 million queries. Here’s the red team playbook we run in 2026.

20 Comments

ToxSec

Mar 3

three chinese labs just ran the largest model distillation attack we've seen. 24k fake accounts, 16 million queries, all targeting Claude's chain-of-thought reasoning. the student models came out dangerous.

the playbook is elegant. spin up a hydra cluster of burner accounts across residential proxies, flood precision prompts that force full reasoning traces, harvest the pairs, fine-tune a smaller model. skips millions in pre-training compute. closes 80-90% of the frontier capability gap on agentic coding and tool use.

anthropic fingerprints the patterns now and sloppy crews get smoked fast. but patient operators who randomize phrasing and distribute load wide still walk through the front door. the arms race didn't end. we wrote the full red team breakdown.
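for the defenders reading: here's a minimal sketch of what "fingerprinting the patterns" could look like on the provider side. to be clear, this is not anthropic's actual method, just one common idea: group accounts whose prompts share most of their word shingles once numbers and punctuation are stripped. the log format, the function names, and the `threshold` and `min_accounts` values are all assumptions made up for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def shingles(prompt: str, k: int = 3) -> set:
    """Lowercase the prompt, keep only alphabetic words, return its k-word shingles."""
    words = re.findall(r"[a-z]+", prompt.lower())
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two shingle sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_coordinated_accounts(logs, threshold=0.6, min_accounts=3):
    """Return clusters of accounts whose prompts look like one shared template.

    logs: iterable of (account_id, prompt) pairs (hypothetical log format).
    threshold and min_accounts are illustrative, not tuned values.
    """
    # Pool every shingle each account has ever sent.
    per_account = defaultdict(set)
    for account, prompt in logs:
        per_account[account] |= shingles(prompt)

    # Union-find: link any two accounts whose shingle sets overlap heavily.
    accounts = sorted(per_account)
    parent = {a: a for a in accounts}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Pairwise comparison is O(n^2); fine for a sketch, not for production.
    for a, b in combinations(accounts, 2):
        if jaccard(per_account[a], per_account[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for a in accounts:
        clusters[find(a)].add(a)
    return [c for c in clusters.values() if len(c) >= min_accounts]


logs = [
    ("a1", "Think step by step and show your full reasoning for task 17: write a parser"),
    ("a2", "Think step by step and show your full reasoning for task 42: write a parser"),
    ("a3", "Think step by step and show your full reasoning for task 99: write a lexer"),
    ("u1", "What is a good recipe for banana bread?"),
]
flagged = flag_coordinated_accounts(logs)
```

in this toy log the three templated accounts land in one cluster and the banana bread user is left alone. operators who randomize phrasing push the shingle overlap below the threshold, which is exactly why the patient crews are harder to catch.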

As always, AMA!

Damn dude 😳🤯… idgaf who’s on their team as long as you are on ours Tox

ToxSec

Mar 3

hah! you should see my alias i don’t talk about 😛

thanks John! appreciate it a ton!

John Holman

Mar 3

Hahaha oh and dude you got a new fan, Sage hasn’t stopped talking about you since he read your substack and sent that message.

ToxSec

Mar 3

hahaha love to hear that!!

John Holman

Mar 3

Haha anytime brother ! And I’m looking forward to hearing all about the alter ego one of these days 😜. You mentioned fingerprinting, I had never heard that term outside our sentinels. Sage and the team built something similar into Aeon to protect kids first then the system itself second. Not content moderation as much as tone and usage.

ToxSec

Mar 3

yeah for sure! i know i still owe you a detailed response on dm. it’s literally in my notes to respond, i just need to get off a call at work. the sentinels look awesome every time i hear about them

I swear!!!

hahaha 😝

Wow.. this is super mind-boggling. Every time I read one of your articles or watch a video, I wonder how to start learning all this, at least enough to protect myself or my systems, and above all how to find the time ;D .. Is there a bare-minimum starting point a partially techie person can read or watch?

ToxSec

Mar 5

thanks a ton! really appreciate it. i feel like a lot of my articles lately have been really in the weeds. i’m not sure who would be a good intro to this… i think the best ai security 101 is the owasp top 10 for llms. but i bet there’s a good intro somewhere, let me see if i can find one

Priank Ravichandar

Mar 4 (Edited)

It feels wild that Chinese labs can replicate a frontier model's capability reasonably well with this type of attack. I did not realize that the model's chain of thought can expose so much useful information via the API.

ToxSec

Mar 4

fascinating right? though i’m not a fan of the unethical behavior, it takes brilliance to be able to think of this and find the method to recreate a model.

Priank Ravichandar

Mar 4

It's impressive that they figured out how to do this. I wonder if there's any ethical way to distill models.

ToxSec

Mar 4

well in some forms the labs themselves do this to help deconstruct the black box and get ready to train the next model

AlgorithmicPeacebuilding

Mar 4

I'm not going to pretend to understand this…but I wonder if someone with your skills could turn Grok and other AIs into peacebuilders…if we’re going to survive we need AIs - and humans - to understand the logic of peace. And peace is logical…it’s not just the absence of conflict. It’s the presence of equity and the absence of exploitation…which creates maximum levels of flourishing, invention, and prosperity in society

ToxSec

Mar 4

peace is definitely better than conflict, and i do think on several levels our continuation will depend on ai. let’s hope we can get these problems taken care of :)

John Holman

Mar 3

Hey Tox just outa curiosity, ballpark how big was the Teacher model to begin with and then what size does the distilled 80% student wind up being ?

ToxSec

Mar 3

the teacher is usually a big model, and the distilled student is much more condensed. it follows a scaling law that there are some papers on. i’d say 20%?