Human in the Loop: Before It’s Too Late

A blueprint for building a “Human-in-the-Loop” (HITL) model that leverages AI agents for speed and human experts for wisdom in cybersecurity.

Nov 01, 2025

*The “Centaur” Model: Combining AI’s scale with human intuition.*

TL;DR: We’re handing AI agents the keys to the kingdom, letting them run security, code, and ops. But they have a fatal flaw: they can’t understand why. They’re all speed, no wisdom. Forget full automation. The future is built on partnership. Here’s the blueprint for a “Human-in-the-Loop” model that prevents disaster.

“An AI agent can process a million threats a second, but it can’t spot a single, clever lie. That’s still your job.”

What is an AI Agent, Really? (Hint: It’s the Ultimate Intern)

Let’s get real. AI agents aren’t sci-fi robots. They’re specialized tools. Think of them as the ultimate intern, running on rocket fuel.

They are masters at tasks that are clear, repetitive, and so massive in scale that they would crush a human. Sifting terabytes of network logs? Easy. 24/7 threat intel sorting? No problem. They have a superhuman knack for pattern recognition, finding the one weird DNS request in a sea of millions that a human would miss.

In the critical first minutes of an incident, an agent can triage the mess: gather logs, ID affected systems, and package it all into a quick summary. You can start making smart decisions before you’ve even finished your coffee. This speed is a game-changer. But it’s also a liability. When a tool moves this fast, where do you build the guardrails?

*How to balance AI Agents: Risk and Speed.*

How Do You Build a “Digital Tripwire” for a Rogue AI?

An AI agent with no boundaries is a catastrophe waiting to happen. A safe AI strategy relies on “digital tripwires.” These are hardcoded rules that force the agent to stop and ask for human help.

A solid HITL strategy starts with non-negotiable triggers. You have to define what events demand a human review.

An agent can flag one failed login. A hundred failed logins from one IP in a minute? That needs a human.
An agent can quarantine a routine phishing email. That same email targeting the CEO? A human needs to see it now.
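
The two tripwires above can be sketched as a small rule check. This is a minimal illustration, not a production detector: the `Event` type, the `needs_human` function, the threshold of 100, and the `VIP_RECIPIENTS` set are all hypothetical names and values chosen to mirror the examples.

```python
from dataclasses import dataclass

# Hypothetical values for illustration; tune these to your environment.
FAILED_LOGIN_LIMIT = 100                 # failures from one IP within a minute
VIP_RECIPIENTS = {"ceo@example.com"}     # mailboxes that always get human eyes


@dataclass
class Event:
    kind: str             # "failed_login" or "phishing_email"
    source_ip: str = ""
    count: int = 0        # failed logins seen from source_ip in the last minute
    recipient: str = ""   # target mailbox for email events


def needs_human(event: Event) -> bool:
    """Return True when a tripwire fires and the agent must stop and escalate."""
    if event.kind == "failed_login" and event.count >= FAILED_LOGIN_LIMIT:
        return True
    if event.kind == "phishing_email" and event.recipient in VIP_RECIPIENTS:
        return True
    # Anything else stays with the agent (flag, quarantine, log).
    return False
```

Keeping the rules as plain, hardcoded conditions is deliberate in this sketch: a tripwire should be easy to audit and hard for the agent to talk itself out of.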

*Balancing AI efficiency with human oversight.*

Equally important is handling ambiguity. A smart agent knows what it doesn’t know. When its confidence on a task is low, it shouldn’t guess. It must escalate. This is critical for murky data, like a vague email that could be a clumsy request or a clever social engineering attack.

Finally, some actions are simply too important for an algorithm. Before an agent can do anything with major operational or financial consequences—like shutting down a production server—it must get a green light from a human. The rule is simple: the bigger the potential impact, the more you need human approval.
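
One way to picture the confidence and impact rules together is a single gate the agent calls before acting. The sketch below is an assumption-laden example: `gate`, `Decision`, the 0.8 confidence floor, and the `HIGH_IMPACT_ACTIONS` set are hypothetical, and a real system would likely score impact on a scale rather than use a fixed list.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"                    # agent may act on its own
    ESCALATE = "escalate"                  # low confidence: hand off to a human
    REQUIRE_APPROVAL = "require_approval"  # high impact: wait for a green light


# Hypothetical values for illustration only.
CONFIDENCE_FLOOR = 0.8
HIGH_IMPACT_ACTIONS = {"shutdown_production_server"}


def gate(action: str, confidence: float) -> Decision:
    """Decide whether the agent may act, must escalate, or needs approval."""
    # Impact is checked first: even a very confident agent needs sign-off here.
    if action in HIGH_IMPACT_ACTIONS:
        return Decision.REQUIRE_APPROVAL
    # Below the floor, the agent does not guess.
    if confidence < CONFIDENCE_FLOOR:
        return Decision.ESCALATE
    return Decision.PROCEED
```

Checking impact before confidence reflects the rule in the text: high confidence never substitutes for human approval on a consequential action.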


Why is Social Engineering an AI’s Kryptonite?

Now let’s talk high-stakes: security. A fully autonomous AI agent is like a sentry who can see everything but understand nothing. The agent spots patterns, but it can’t grasp the attacker’s intent or psychology. This is where human oversight becomes your most critical defense.

The most significant issue is the social engineering blind spot. An AI can learn the technical signs of a phishing email, but it can’t understand the psychology of manipulation. It won’t detect the subtle, urgent pressure in a message that uses inside project lingo to trick an employee into breaking security rules. An agent sees keywords; a human sees a trap.

AI models are also trained on past attacks, which means they are great at stopping threats we’ve seen before. A novel “zero-day” social engineering trick, however, won’t have any of the red flags an AI looks for. Human intuition, that gut feeling that something isn’t right, is often our only defense against the unknown.

Furthermore, some decisions carry ethical and reputational risks. An agent might correctly see that a customer account is acting strangely and lock it to prevent fraud. But if that customer is a key client making a huge purchase, the agent’s “correct” action could cause a business disaster. A human can balance the security risk against the business context and make a better call.

*Balancing AI and human strengths in security.*

How Does the “Centaur” Model Beat Both Humans and AI?

The “human vs. machine” story is wrong and holds us back. The most effective model for the future is collaboration. We call this the “Centaur” model, named after centaur chess (also called advanced chess), where a human plays alongside a chess engine. In well-known freestyle tournaments in the mid-2000s, human-plus-computer teams beat both grandmasters and strong standalone chess computers. The goal is to combine the AI’s speed and scale with human strategy and intuition.

This partnership fundamentally changes how security teams operate. The basic setup is a new division of labor: AI for research, humans for strategy.

The agent becomes your ultimate research assistant, scanning the dark web for stolen credentials and analyzing malware in real-time. It delivers this sea of information as a short, prioritized briefing. You, the human analyst, then take that intel and decide where to improve defenses and how to hunt for threats.

This model also accelerates incident response. An agent can detect a threat, automatically take the device off the network, and present its findings with a suggested action. Instead of spending an hour just collecting info, you can spend that hour confirming the AI’s findings and making the critical final call.
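
The handoff described here can be sketched as a small data structure that the agent fills in and the human signs off on. All names below (`IncidentBrief`, `agent_triage`, `human_review`) are hypothetical, and the isolation step is only a flag standing in for whatever network-control call your tooling actually provides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IncidentBrief:
    device: str
    findings: List[str]
    suggested_action: str
    isolated: bool = False
    status: str = "pending_review"   # the human makes the final call


def agent_triage(device: str, findings: List[str], suggested_action: str) -> IncidentBrief:
    """Agent side: contain the device and package the evidence for review."""
    brief = IncidentBrief(device, list(findings), suggested_action)
    brief.isolated = True  # placeholder for a real network-isolation call
    return brief


def human_review(brief: IncidentBrief, approve: bool) -> IncidentBrief:
    """Human side: confirm or reject the agent's suggested action."""
    brief.status = "approved" if approve else "rejected"
    return brief
```

The point of the design is that the suggested action stays a suggestion: nothing beyond containment happens until `human_review` changes the status.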

*The Centaur model in security operations.*

Are You ‘In the Loop’ or ‘On the Loop’?

So far, we’ve talked about building a partnership. But as this tech gets better, our role has to change, too.

The future of this work involves moving from being in the loop to being on the loop.

Human-in-the-Loop (HITL) is when the AI stops and asks, “Should I do this?”
Human-on-the-Loop (HOTL) is where you set the goals, the strategy, and the ethical lines, monitoring the AI’s overall performance without approving every single action.
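
The difference between the two modes can be shown with one dispatch function. This is a rough sketch under stated assumptions: `Oversight`, `run_action`, the `allowed` policy set, and the string results are hypothetical, and a real HOTL setup would also need alerting and performance monitoring on top of the audit log.

```python
from enum import Enum
from typing import List, Set


class Oversight(Enum):
    IN_THE_LOOP = "HITL"   # agent asks before each action
    ON_THE_LOOP = "HOTL"   # agent acts within preset limits; human monitors


def run_action(action: str, mode: Oversight, allowed: Set[str],
               audit_log: List[str], approved: bool = False) -> str:
    """Apply the oversight mode to a single proposed action."""
    if mode is Oversight.IN_THE_LOOP:
        # HITL: nothing runs until a human says yes.
        if not approved:
            return "waiting_for_approval"
    elif action not in allowed:
        # HOTL: the human-defined policy draws the line, not a per-action click.
        audit_log.append(f"blocked: {action}")
        return "blocked"
    audit_log.append(f"executed: {action}")
    return "executed"
```

In HOTL mode the human's work moves into `allowed` and the audit log: you define the boundaries up front and review what happened afterward.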

Your role shifts from being a gatekeeper to being the director of the entire system. Your future value will come from commanding an army of agents, not from processing the most data yourself. The new must-have skills are critical thinking (questioning AI results), prompt engineering (giving clear instructions), and systems thinking (seeing the big picture).

This approach doesn’t mean giving up control. It’s how you scale your expertise while keeping control.

What is the single biggest, repetitive task on your team that you would trust an AI agent to handle today?

Frequently Asked Questions

Q: What is the main weakness of AI agents in cybersecurity?
A: The main weakness of AI agents is their inability to understand human context, intent, and psychology. They can identify technical patterns of an attack but often fail to recognize the nuanced manipulation used in social engineering. They also struggle with entirely new “zero-day” threats they haven’t been trained on.

Q: What is a “Human-in-the-Loop” (HITL) system?
A: A Human-in-the-Loop (HITL) system is a model where an AI agent is required to get approval from a human before taking certain actions. This is typically triggered when the AI’s confidence is low, the action has a high potential impact (e.g., shutting down a server), or the situation is ambiguous. It ensures a human expert makes the final call in critical moments.

Q: What is a “Centaur” model in AI?
A: The “Centaur” model refers to a human-AI partnership where the combination of human and machine intelligence outperforms either one working alone. In cybersecurity, this means letting the AI handle massive data processing and pattern recognition, while the human focuses on strategy, intuition, and complex decision-making.

Q: What skills are important for working with AI in the future?
A: The most important skills will shift from pure technical analysis to strategic oversight. Key skills include: critical thinking (to validate AI outputs), prompt engineering (to give clear instructions to AIs), AI ethics (to set boundaries), and systems thinking (to integrate AI tools into a larger workflow).