Comparing open source attack simulation platforms for red teams

Does it make sense to use both Red Canary’s Atomic Red Team and MITRE’s CALDERA for adversary simulation? This exact question came up in the Atomic Red Team Slack channel recently, and the lack of recent, available resources comparing these and other open source attack emulation platforms was immediately apparent.

In today’s post, we’ll look at technical differences, coverage disparities, and other points of comparison between MITRE’s CALDERA, Red Canary’s Atomic Red Team, and Hunters Forge’s Mordor. For the purposes of this comparison, we’ll judge coverage against MITRE ATT&CK (enterprise), but we’re also going to compare these tools on usability, respective features, and more.

All of these projects were clearly inspired by one another—and each is a great and useful contribution to the security community. Despite their apparent similarities, Atomic Red Team, CALDERA, and Mordor are different in both subtle and overt ways. As such, this article shouldn’t be considered a competitive analysis but an exploration and endorsement of each.

Comparing coverage

As you can see from the matrices below, Atomic Red Team has the broadest coverage among the three toolkits. As of this writing, there are 92 individual contributors to Atomic Red Team on GitHub, making it a very active community endeavor. The framework includes more than 500 individual tests covering roughly 159 ATT&CK techniques.

What follows are simple ATT&CK Navigator layers showing relative coverage across the three tools:

*Atomic Red Team coverage against ATT&CK*

Here’s the full, interactive version of the matrix, which includes live links to the current Atomic Red Team and Mordor coverage .json files—in case anyone wants to play along at home.
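Because the coverage files are ATT&CK Navigator layers stored as JSON, comparing two tools can be reduced to comparing two sets of technique IDs. Here is a minimal Python sketch of that idea. It assumes each layer has a top-level "techniques" list whose entries carry a "techniqueID" and an optional "score", and the two tiny layers below are made up for illustration; they are not the real coverage files.

```python
from __future__ import annotations

import json


def covered_techniques(layer_json: str) -> set[str]:
    """Return the technique IDs that a Navigator layer marks as covered."""
    layer = json.loads(layer_json)
    return {
        entry["techniqueID"]
        for entry in layer.get("techniques", [])
        # Assumption: an entry explicitly scored 0 means "not covered".
        if entry.get("score") != 0
    }


def compare_coverage(layer_a: str, layer_b: str) -> dict[str, list[str]]:
    """Split technique IDs into shared coverage and coverage unique to each layer."""
    a, b = covered_techniques(layer_a), covered_techniques(layer_b)
    return {
        "both": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Made-up sample layers, standing in for the downloadable .json files.
ATOMIC_SAMPLE = json.dumps({
    "name": "atomic (sample)",
    "techniques": [
        {"techniqueID": "T1117", "score": 1},
        {"techniqueID": "T1003", "score": 1},
        {"techniqueID": "T1055", "score": 0},
    ],
})
MORDOR_SAMPLE = json.dumps({
    "name": "mordor (sample)",
    "techniques": [
        {"techniqueID": "T1003", "score": 1},
        {"techniqueID": "T1047", "score": 1},
    ],
})

if __name__ == "__main__":
    print(compare_coverage(ATOMIC_SAMPLE, MORDOR_SAMPLE))
```

To try this against real layers, read each file's contents into a string and pass the two strings to compare_coverage.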

Beyond coverage

While these aren’t apples-to-apples comparisons (CALDERA, for example, is a browser-based application, whereas Atomic Red Team and Mordor are not), they are a fair representation of what a researcher has access to with each tool right out of the box, so to speak.

Note that you might need to practice patience when you download these tools, as a proxy, your network stack, or other security tools may block them entirely or in part without you realizing it. If you’re using these in your employer’s environment, make sure you have permission to run these tests and confirm that your security controls aren’t interfering with your ability to do so.

Atomic Red Team

Atomic Red Team is a collection of lightweight tests that emulate a wide variety of known adversary techniques. It’s used for many purposes, including but not limited to:

Validating assumptions about security controls (e.g., is my EDR sensor generating the telemetry it is supposed to?)
Testing detection coverage
Learning what malicious activity looks like

We’re not going to go too deep into the weeds with Atomic Red Team because there are loads of resources just a few clicks away on the Atomic Red Team page, and there is ample documentation outlining how to contribute to Atomic Red Team. However, the new and improved Invoke-Atomic is worth exploring on its own in some detail.

Invoke-Atomic is a PowerShell execution framework that vastly simplifies the otherwise manual (command-line intensive) process for running one or many atomic tests. In fact, it has improved so much as of late that the Atomic Red Team maintainers decided to spin the PowerShell framework out as its own open source project.

Moving Invoke-Atomic to its own GitHub repo noticeably improved the usability of Atomic Red Team by allowing testers to download Invoke-Atomic independently. Previously, you’d have to clone the entire Atomic Red Team repo onto a desktop with pesky security controls that, in most cases, would fire off a bunch of false positive alerts. To be clear, there is nothing outright malicious in Atomic Red Team, but there’s a whole lot of activity in it that is surely suspicious (and that many security teams would, and probably should, want to detect).

To use Invoke-Atomic, you’ll need to open PowerShell (or first install PowerShell Core, which also works on macOS and Linux machines), and then follow the instructions on the Invoke-Atomic GitHub wiki. There is an installation script in the repo, but you can manually install Invoke-Atomic as well. Make sure you install both the Invoke-Atomic module and the YAML module.

Some new features of Invoke-Atomic enable you to:

Display only test names and numbers: Invoke-AtomicTest All -ShowDetailsBrief
Execute all tests for a given technique: Invoke-AtomicTest T1117
Execute specific tests for a given technique (by number): Invoke-AtomicTest T1117 -TestNumbers 1, 2
Execute specific tests for a given technique (by name): Invoke-AtomicTest T1117 -TestNames "Regsvr32 remote COM scriptlet execution","Regsvr32 local DLL execution"
Get prerequisites: Invoke-AtomicTest T1117 -TestNumbers 1 -GetPrereqs

That last bullet runs the get prerequisite commands listed in the dependencies section for the test, empowering a tester to make any necessary configuration or installation changes before they begin testing.
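To make the prerequisite idea concrete, here is a rough Python sketch of the check-then-get logic. This is not Invoke-Atomic’s code. It assumes an atomic test has already been parsed from YAML into a dict whose "dependencies" entries carry "prereq_command" and "get_prereq_command" fields, and it assumes the get command only needs to run when the check fails. The sample test and the is_satisfied callback are made up for illustration, and nothing here executes a command.

```python
from __future__ import annotations

from typing import Callable

# Made-up atomic test, shaped like a parsed YAML definition (assumed field names).
SAMPLE_TEST = {
    "name": "Example test (hypothetical)",
    "dependencies": [
        {
            "description": "Example payload file must exist",
            "prereq_command": "test -f /tmp/example-payload.txt",
            "get_prereq_command": "touch /tmp/example-payload.txt",
        },
        {
            "description": "Example tool must be on the PATH",
            "prereq_command": "command -v example-tool",
            # No get command: this one would have to be fixed by hand.
        },
    ],
}


def get_prereq_plan(test: dict, is_satisfied: Callable[[str], bool]) -> list[str]:
    """Return the get-prerequisite commands for every dependency whose check fails.

    is_satisfied is a caller-supplied function that decides whether a
    prerequisite check passes; a real runner would execute the check command.
    """
    plan = []
    for dep in test.get("dependencies", []):
        if is_satisfied(dep["prereq_command"]):
            continue  # Already in place, nothing to fetch or configure.
        if "get_prereq_command" in dep:
            plan.append(dep["get_prereq_command"])
    return plan


if __name__ == "__main__":
    # Pretend every check fails, so every available fix is listed.
    for command in get_prereq_plan(SAMPLE_TEST, lambda check: False):
        print("would run:", command)
```

Keeping the decision separate from execution makes it easy to review what a prerequisite step would change on a machine before anything runs.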

This might seem like an obscure feature, but in the testing I did on CALDERA, I ran into a lot of dependency issues. Speaking of CALDERA…

CALDERA

CALDERA, or Cyber Adversary Language and Decision Engine for Red Team Automation, is a browser-based application that you operate through a web interface in Chrome. Perhaps the most important distinction between CALDERA and the other two platforms we’re examining is that CALDERA is not agentless. To that point, 54ndc47 (that’s leetspeak for “Sandcat,” if you didn’t catch it) is the default agent that receives and executes instructions from CALDERA.

Simply put, you connect the machines you want to test to CALDERA by way of Sandcat. From there you can instruct CALDERA to run a variety of tests on the group of machines you’ve connected to it. Each individual ATT&CK technique in CALDERA is called an “ability,” and groups of abilities are called “adversaries.” In this way, CALDERA is a useful framework both for testing detection coverage and as an educational resource for defensive security professionals who want to learn more about threat detection.
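The relationship between abilities and adversaries can be pictured as a small data model. The Python sketch below is only an illustration of that grouping. The class names, fields, and sample entries are hypothetical and do not reflect CALDERA’s actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ability:
    """One concrete way to exercise an ATT&CK technique on a platform."""
    technique_id: str
    name: str
    platform: str
    command: str


@dataclass
class Adversary:
    """A named group of abilities that can be run together."""
    name: str
    abilities: list[Ability] = field(default_factory=list)

    def techniques(self) -> list[str]:
        """Distinct ATT&CK technique IDs this adversary exercises."""
        return sorted({ability.technique_id for ability in self.abilities})

    def for_platform(self, platform: str) -> list[Ability]:
        """Abilities that apply to agents running on the given platform."""
        return [a for a in self.abilities if a.platform == platform]


# Hypothetical example profile, for illustration only.
DISCOVERY = Adversary(
    name="Example discovery profile",
    abilities=[
        Ability("T1057", "List processes", "windows", "Get-Process"),
        Ability("T1057", "List processes", "linux", "ps aux"),
        Ability("T1033", "Identify current user", "linux", "whoami"),
    ],
)

if __name__ == "__main__":
    print(DISCOVERY.techniques())
    print([a.command for a in DISCOVERY.for_platform("linux")])
```

Modeling it this way shows why one technique can appear more than once in an adversary: each platform may need its own command.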

CALDERA certainly offers the ability to run tests (like Atomic Red Team), but the platform is also a fully fledged testing suite that teams can customize to meet a long list of needs. It allows users to natively map adversary emulation efforts back to MITRE ATT&CK and can help facilitate many other types of adversarial research. For example:

Support for other MITRE tools like Compass, which creates visualizations like the ATT&CK coverage map I included earlier
The option to build your own agents instead of using Sandcat
A somewhat outdated “Atomic tool” that you can use to merge Atomic Red Team tests
Stockpile, a component that collects and lists TTPs for CALDERA

Those elements and more are available in the web interface when you install CALDERA.

While Atomic Red Team’s Invoke-Atomic framework lets you test your detection capabilities against very specific “atomic” tests (or combinations thereof) on Windows, macOS, and Linux, CALDERA offers you the starting point for much more involved adversary emulation. That said, digging deeper with CALDERA is a lot of work, and, in practice, it may be more than you’re looking for. Either way, take a look at the documentation overview to learn more about the numerous possibilities available in CALDERA.

If you like the idea of having a framework to replay adversarial actions and want more reporting than Atomic Red Team offers out of the box, but feel like a lot of the above is more than you need, then Mordor may be exactly the middle ground that you’re looking for. Plus, it is a young project that welcomes contributors to develop more datasets.

Mordor

Roberto Rodriguez created Mordor to help analysts who might not have extensive testing or red team experience simulate adversary behaviors and test their ability to detect or prevent them. Mordor has really great documentation and excellent artwork, both solid additions. However, among the platforms analyzed here, Mordor is probably the most distinct. While Atomic Red Team and CALDERA offer you the ability to test your security tooling or detection coverage with simulated attacks, Mordor focuses on the telemetry generated by simulated and real-world attacks.

From GitHub, in the maintainers’ own words:

“The Mordor project provides pre-recorded security events generated by simulated adversarial techniques in the form of JavaScript Object Notation (JSON) files for easy consumption. The pre-recorded data is categorized by platforms, adversary groups, tactics and techniques defined by the Mitre ATT&CK Framework. The pre-recorded data represents not only specific known malicious events but additional context/events that occur around it. This is done on purpose so that you can test creative correlations across diverse data sources, enhancing your detection strategy and potentially reducing the number of false positives in your own environment.”
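Since the datasets are JSON, a first pass at exploring one can be as simple as counting which event types it contains. The Python sketch below shows one way to do that. It assumes one JSON record per line and assumes field names such as "Channel" and "EventID"; actual field names may differ between datasets, and the sample records are made up for illustration.

```python
from __future__ import annotations

import json
from collections import Counter

# Made-up records standing in for a pre-recorded dataset (assumed field names).
SAMPLE_EVENTS = "\n".join(
    json.dumps(event)
    for event in [
        {"Channel": "Security", "EventID": 4688, "Hostname": "WKSTN-01"},
        {"Channel": "Security", "EventID": 4688, "Hostname": "WKSTN-02"},
        {"Channel": "Microsoft-Windows-Sysmon/Operational", "EventID": 1,
         "Hostname": "WKSTN-01"},
    ]
)


def count_events(json_lines: str) -> Counter:
    """Count events per (channel, event ID) pair in newline-delimited JSON."""
    counts: Counter = Counter()
    for line in json_lines.splitlines():
        line = line.strip()
        if not line:
            continue  # Tolerate blank lines between records.
        event = json.loads(line)
        key = (event.get("Channel", "unknown"), event.get("EventID"))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    for (channel, event_id), total in count_events(SAMPLE_EVENTS).most_common():
        print(f"{channel} / {event_id}: {total}")
```

A summary like this is a quick way to see which data sources a recorded technique touches before writing any detection logic against it.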

Roberto and his brother Jose have deeply incorporated reporting via Jupyter Notebooks into their adversarial simulations. The result is a very different approach to the same “atomic” type of data, paired with real-time reporting in a format with which many of you in the community may already be familiar. They have also compiled all of this into the beautifully documented Threat Hunter Playbook.

Roberto was recently on “The Brakeing Down Security” podcast, where in a two-part interview he explained the inspiration behind Mordor and some of its use cases. He explained that he especially wanted to ensure coverage of common privilege escalation and lateral movement techniques in Windows, which is why those are among the first and only tests currently available. In the second part of the interview, he dives deep into the data science aspects of Mordor and explains the benefits of using Jupyter Notebooks to share documentation.

Honorable mentions

We deliberately decided not to include Praetorian’s Purple Team Attack Automation tool in this comparison because it is entirely reliant on Metasploit. That said, it offers a great level of coverage and is certainly worth consideration.

Another noteworthy omission from this comparison is Endgame’s Red Team Automation (RTA). It doesn’t seem to have been updated in some time, but it’s still a good collection of 50 or so tests for techniques like file timestomping, process injection, and beacon simulation.

The bottom line

One of the great strengths of each of these tools is that they are organized within the taxonomy of ATT&CK, thereby providing a common language and framework that has already been adopted by many organizations and other tools.

There is some overlap between Atomic Red Team and CALDERA, but the benefit of Atomic Red Team is clear: it’s fairly simple to set up and use, especially with Invoke-Atomic and its detailed wiki page.

CALDERA offers a clear core strength too: it is extensible and, given sufficient time, attention, and resources, offers a free alternative to commercial testing and simulation platforms. Mordor, on the other hand, is entirely complementary to either CALDERA or Atomic Red Team.

Mordor offers unique reporting and the ability to examine how malicious activity materializes in different kinds of security telemetry. Further, the project is new and offers security professionals a great opportunity to contribute to an open source project that, over time, promises to become increasingly popular and helpful for security teams that are looking to improve their detection capabilities.

Many great open source projects are available to the red team and broader security community, and they don’t need to be mutually exclusive. For example, there’s an awesome opportunity for someone to bring the Threat Hunter Playbook’s Jupyter Notebook approach into Atomic Red Team to create a new feature that adds that same real-time reporting. Similarly, we can build tools for automatically running atomic tests in Invoke-Atomic or CALDERA, and we can contribute the output of these tests back into Mordor.

No substitute for a skilled red team

Ultimately, these frameworks are no substitute for an experienced red team. The most skilled adversaries find exploitable gaps in tech or creative ways of using that tech to conduct attacks in ways that aren’t yet documented within frameworks like ATT&CK and Atomic Red Team.

A tool like Atomic Red Team will help validate that everything is working as it is supposed to, but a human red team will come up with clever attacks designed to subvert your organization’s specific defensive controls. In other words, a human red team ensures that you have coverage in between the frames of even the best frameworks.

Red teams should also work closely with trusted agents in a company, not only to execute creative and innovative testing that finds (and helps remediate) security gaps, but also to highlight specific areas of testing that have been predetermined to represent weaknesses in a company’s defenses. In this latter sense, there may be some overlap with framework-based automated testing, especially if the framework is actively maintained by a large, cross-enterprise group of talented individuals.

A good starting point to read up on red teams is Tim MalcomVetter’s blog. I’ll highlight some other teams in upcoming weeks on my own blog as well.

JB (@Cherokeejb_) works in incident management, detection, and monitoring for a global organization. JB contributes as much as possible to the open source community, including the Atomic Red Team project, Internet Storm Center, and the Brakeing Down Security podcast. He has a passion for music, family, and nature; and his favorite working areas of research are currently macOS, forensics, and hunting for new, creative, and advanced defense tactics.
