The conversation around AI in cybersecurity has shifted rapidly from “what if” to “how much.” Red Canary experts across data science, security research, and product leadership recently came together for a four-part SecOps Weekly (formerly known as Office Hours) miniseries to demystify the reality of AI as it relates to security operations.
If you missed the live sessions, here is your recap of the journey from building agents to defending against AI-powered threats and measuring real-world ROI.
Part 1: Taming the agent – From autonomy to reliability
The series kicked off with a fundamental truth: Autonomy is the enemy of reliability. While the cool factor of a fully autonomous AI agent is high, the modern security operations center (SOC) requires determinism and consistency.
Key takeaways
- The workflow is the agent: Don’t think of an agent as a single magic box. Think of it as a workflow.
- Rein in the loop: When building their email phishing triage agent, the team learned that moving tools (like IP reputation lookups) out of a non-deterministic LLM loop and into structured, deterministic code nodes dramatically increased accuracy and consistency.
- Measurement is mandatory: You can’t ship an agent based on a few successful chats. The team runs simulations thousands of times against labeled datasets to identify where the probabilistic nature of LLMs might lead to errors.
“The single hardest part in building agents isn’t plugging graphs together… it’s the data retrieval, doing that in a deterministic way that’s reliable, and making sure you’re not overloading context windows.” – Jimmy Astle
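The design pattern from Part 1 can be sketched in a few lines. This is a hypothetical illustration, not Red Canary's implementation: `lookup_ip_reputation` stands in for any deterministic enrichment service, and `llm_classify` is a stub for the probabilistic LLM step. The point is structural: enrichment runs as a deterministic code node *before* the model is consulted, rather than as a tool the LLM calls mid-loop.

```python
from dataclasses import dataclass


@dataclass
class TriageResult:
    verdict: str
    evidence: dict


def lookup_ip_reputation(ip: str) -> str:
    """Deterministic code node: same input always yields the same output.
    In practice this would call a reputation service; stubbed here with
    RFC 5737 documentation IPs."""
    known_bad = {"203.0.113.7", "198.51.100.42"}
    return "malicious" if ip in known_bad else "unknown"


def llm_classify(subject: str, body: str, enrichment: dict) -> str:
    """Placeholder for the probabilistic LLM call. It receives enrichment
    as input rather than fetching it itself, keeping retrieval out of the
    non-deterministic loop."""
    if enrichment["sender_ip_reputation"] == "malicious":
        return "phish"
    return "suspicious" if "password" in body.lower() else "benign"


def triage_email(subject: str, body: str, sender_ip: str) -> TriageResult:
    # Deterministic enrichment happens first, then a single LLM decision.
    enrichment = {"sender_ip_reputation": lookup_ip_reputation(sender_ip)}
    verdict = llm_classify(subject, body, enrichment)
    return TriageResult(verdict=verdict, evidence=enrichment)
```

Because the evidence is assembled outside the model, every `TriageResult` carries an auditable record of what the agent knew when it decided.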
Part 2: Defending against AI-powered threats
Are we facing a Terminator scenario? Not exactly. The consensus—at least here at Red Canary—is that AI represents an evolution in tooling, but not a revolution in attack techniques.
Key takeaways
- Adversaries use AI for efficiency: Attackers use LLMs for the same reasons we do: to write code faster, localize phishing messages, and process data. They also make the same kinds of mistakes defenders do.
- The defender’s paradox (flipped): While agents allow attackers to move faster, they also make them noisier. An agent “stomping around” an environment makes more mistakes than a precise human operator, giving defenders more opportunities to detect them.
- Protect your AI infrastructure: Your internal LLMs and agents are new attack surfaces. Defend them with “block and tackle” security using least privilege, input sanitization (to prevent prompt injection), and visibility.
- Home field advantage: Use honeytokens and deception. If an agentic attacker sees a juicy (but fake) login page, it will likely take the bait.
“All of the stuff that works really well in defending against regular threats—defense in depth, privilege management, and good visibility—is going to be really effective.” – Brian Donohue
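The honeytoken idea above can be sketched as a toy monitor. Everything here is illustrative, not a real product API: `HONEYTOKENS` would be decoy credentials or resource paths planted in the environment, and any event touching one is a high-confidence alert, because no legitimate workflow should ever use them.

```python
# Decoy identifiers planted in the environment (hypothetical examples).
HONEYTOKENS = {
    "svc-backup-admin",       # decoy service account, never used legitimately
    "s3://corp-payroll-bak",  # decoy bucket path
}


def check_event(event: dict) -> bool:
    """Return True if an event touched a honeytoken and should alert."""
    fields = (event.get("user", ""), event.get("resource", ""))
    return any(f in HONEYTOKENS for f in fields)


# Simulated auth/access log stream.
events = [
    {"user": "alice", "resource": "s3://corp-public-site"},
    {"user": "svc-backup-admin", "resource": "s3://corp-payroll-bak"},
]
alerts = [e for e in events if check_event(e)]
```

An agentic attacker enumerating accounts and buckets at machine speed is exactly the kind of "stomping around" that trips a tripwire like this almost immediately.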
Part 3: Measuring impact – The three pillars of ROI
How do you prove to a CFO that AI is worth the investment? It comes down to three measurable areas: Alert reduction, investigation quality, and analyst happiness.
Key takeaways
- Speed vs. accuracy: Our panelists highlighted an identity agent that completed, in just three minutes, an investigation that would take a human analyst 30 minutes. That reduction in mean time to resolve (MTTR) shows how these technologies can genuinely change security for the better. Astonishment aside, however, the goal isn’t just speed; it’s consistency, and that consistency requires human intervention.
- The snowball effect: You don’t need a 90 percent efficiency gain on day one. A series of 5–7 percent improvements across hundreds of SOC processes creates a massive cumulative impact.
- The rise of the DAE: The age of the detection automation engineer (DAE) is upon us. These are specialists who no longer just write static rules, but manage the instruction sets and context engineering for the agents that handle the frontline triage.
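The snowball effect is simple arithmetic. The numbers below are made up for illustration (they are not figures from the session), but they show how modest per-process gains aggregate across a large SOC:

```python
# Illustrative, back-of-the-envelope numbers only.
processes = 200                    # distinct SOC workflows being improved
hours_per_process_per_month = 10   # analyst time each currently consumes
gain = 0.06                        # a modest 6% efficiency gain per process

baseline = processes * hours_per_process_per_month  # 2,000 analyst-hours
saved = baseline * gain
print(f"{saved:.0f} analyst-hours saved per month")  # prints "120 analyst-hours saved per month"
```

No single 6 percent improvement would justify a program on its own; two hundred of them recover roughly three analyst-weeks every month.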
“If I ask the LLM the same question ten times, I will get roughly the same answer probably nine times. But one time, it’s gonna be completely off base. That is the fundamental nature of this technology, and one of the hardest parts about building these agents is taming that autonomy and consistency.” – Jimmy Astle
Part 4: Strategic choices – Build vs. buy
The final session addressed the executive dilemma facing nearly every modern organization right now: Should you build your own AI stack or buy from a partner?
Key takeaways
- When to build: Build in-house when the use case is highly strategic, tied to your unique business data, and you have the specialized machine learning (ML) talent to maintain it.
- When to buy: Buy when you need speed, lower costs, and want to leverage a vendor’s “data moat.” Our agents are trained on millions of labeled events that a single enterprise simply wouldn’t have.
- Leave space for magic: Mary Writz emphasized that because AI technology moves faster than quarterly roadmaps, leaders must leave room for research and development (R&D) teams to continue experimenting with “what if” scenarios.
- Transparency as trust: The “black box” era is over. To trust AI, defenders need to see the agent’s work: what questions it asked, what tools it called, and why it reached its conclusion.
“The technology is moving faster than I could even imagine…things we launched last month, I couldn’t have even conceived a year ago.” – Mary Writz
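The "show your work" transparency described in Part 4 can be sketched as a thin audit wrapper around an agent's tools. This is a hypothetical pattern, not a vendor API: the tool names are invented stubs, and the trace format is illustrative. The idea is that every tool invocation is recorded so an analyst can replay exactly what the agent did.

```python
import time
from functools import wraps

TRACE: list[dict] = []  # audit trail of every tool call the agent makes


def traced(tool):
    """Wrap a tool so each invocation is appended to the audit trail."""
    @wraps(tool)
    def wrapper(*args, **kwargs):
        result = tool(*args, **kwargs)
        TRACE.append({
            "tool": tool.__name__,
            "args": args,
            "result": result,
            "ts": time.time(),
        })
        return result
    return wrapper


@traced
def whois_lookup(domain: str) -> str:
    return f"registrar data for {domain}"  # stubbed lookup


@traced
def hash_reputation(sha256: str) -> str:
    return "unknown"  # stubbed reputation check


# An agent run produces a reviewable trace as a side effect.
whois_lookup("example.com")
hash_reputation("e3b0c442...")
```

After the run, `TRACE` answers the questions the panel raised: which tools the agent called, in what order, with what inputs, and what it learned from each.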
The path forward
The miniseries concluded with a powerful sentiment: We are in the fastest-moving disruptive technology cycle in history. Jobs are changing, threats are evolving, but the core mission of the SOC remains the same—never miss a threat.
By keeping agents narrowly scoped, measuring them relentlessly, and prioritizing human checkpoints, the transition from AI hype to SOC outcome isn’t just possible, it’s already happening.