Defense validation and testing

Confirmed testing comprised almost one quarter of our detections in 2021, with many coming from open source tools.

We see a lot of testing. In fact, 23.4 percent of all the threats we detected in 2021 were confirmed by customers to be testing. We’re all for testing (as you can hopefully tell by our work with Atomic Red Team), and we wanted to share what we’ve observed about testing when compared to “proper villains.” We also have some suggestions for how to make testing more effective.

In aggregate, confirmed testing behaviors we observed in 2021 differed significantly when compared to non-testing behaviors. When comparing the top 10 detection analytics that appeared in detections marked by customers as testing to those that fired in detections not marked as testing, only three analytics overlapped. Here are some patterns we observed in testing detections during 2021.

Common testing tools

Unsurprisingly, a large volume of the testing detections we observed came from common breach and intrusion simulation tools and open source testing tools. For example, a detection analytic that identifies CrackMapExec execution from cmd.exe appeared among our top testing techniques but not in our non-testing detections, suggesting that CrackMapExec is far more widely used by testers than by actual adversaries in the environments we monitor. CrackMapExec is a post-exploitation tool for auditing and assessing security in Active Directory environments, so it is a natural choice for testing.
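As an illustration (this is not Red Canary's actual detection logic), an analytic like the one described can be sketched as a simple predicate over process-start telemetry; the event field names here are assumptions:

```python
def flag_crackmapexec_from_cmd(event):
    """Flag process starts where CrackMapExec is launched from cmd.exe.

    Illustrative only: real analytics would also weigh command lines,
    hashes, and renamed binaries, since process names are trivial to change.
    """
    return (
        event["parent_name"].lower() == "cmd.exe"
        and "crackmapexec" in event["process_name"].lower()
    )

# Example telemetry (hypothetical records)
events = [
    {"process_name": "CrackMapExec.exe", "parent_name": "cmd.exe"},
    {"process_name": "notepad.exe", "parent_name": "explorer.exe"},
]
print([e["process_name"] for e in events if flag_crackmapexec_from_cmd(e)])
# → ['CrackMapExec.exe']
```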

Throughout 2021, we also frequently observed Mimikatz, BloodHound, Impacket, Cobalt Strike, and Metasploit in testing—so much so that testing detections involving these tools helped all of them make it into our top 10 threats this year. We consider all of these tools to be “dual-use”—they are used by both adversaries and legitimate users. These dual-use tools present a challenge because it can be difficult to determine if their use is malicious or benign without additional context and understanding of what is normal in an environment. We recommend all organizations have a clear understanding of authorized use of these tools in their environments and treat unconfirmed testing as malicious activity until proven otherwise.
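One lightweight way to act on that recommendation is to keep a register of pre-approved testing activity and classify anything outside it as malicious by default. A minimal sketch, with hypothetical host and tool names:

```python
# Hypothetical register of authorized testing: (hostname, tool) pairs
AUTHORIZED_TESTING = {
    ("pentest-ws01", "mimikatz"),
    ("pentest-ws01", "bloodhound"),
}

def triage_dual_use(hostname: str, tool: str) -> str:
    """Classify dual-use tool activity: it counts as testing only if it
    matches the register; otherwise treat it as malicious until proven
    otherwise, per the recommendation above."""
    if (hostname.lower(), tool.lower()) in AUTHORIZED_TESTING:
        return "authorized testing"
    return "treat as malicious"

print(triage_dual_use("PENTEST-WS01", "Mimikatz"))  # → authorized testing
print(triage_dual_use("hr-laptop-07", "Mimikatz"))  # → treat as malicious
```

The point of the design is the default: unknown use of a dual-use tool falls through to the malicious branch rather than being quietly ignored.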

Credential theft methods

We frequently observe credential theft during testing, which is a positive sign, since adversaries frequently do this as well. However, we’ve noticed that testers often focus narrowly on two approaches to credential dumping. One analytic that fires frequently in testing detections identifies cross-process injection or access activity from rundll32.exe into lsass.exe. Another identifies instances of rundll32.exe dumping process memory via the MiniDump function exported by comsvcs.dll, a built-in Windows library. Part of the reason we observe these behaviors so frequently is that they are integrated into multiple automated breach and intrusion simulation tools, making it more likely for this behavior to occur at scale.
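The second behavior typically shows up in command-line telemetry as rundll32.exe invoking comsvcs.dll's MiniDump export. A hedged sketch of matching that pattern (field names are assumptions; production analytics also use process-access telemetry, such as handles opened to lsass.exe, rather than command lines alone):

```python
def flag_rundll32_minidump(event):
    """Flag rundll32.exe command lines that invoke the MiniDump export of
    comsvcs.dll, a common way to dump LSASS process memory."""
    cmdline = event["command_line"].lower()
    return (
        event["process_name"].lower() == "rundll32.exe"
        and "comsvcs" in cmdline
        and "minidump" in cmdline
    )

sample = {
    "process_name": "rundll32.exe",
    "command_line": r"rundll32.exe C:\Windows\System32\comsvcs.dll, MiniDump 652 C:\temp\l.dmp full",
}
print(flag_rundll32_minidump(sample))  # → True
```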

Noisy discovery commands

Another pattern in our testing detections is the quick execution of a series of discovery commands such as ipconfig, whoami, and others. This contrasts with what we see from many adversaries, who often run fewer discovery commands in a more targeted way. For example, one of the top analytics appearing in testing detections identifies enumeration of Windows Domain Administrator accounts with commands like net group "Domain Admins" /domain. While non-testers use this command as well, we found that testers use it more frequently.
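The rapid-fire pattern described above can be approximated with a sliding-window count of distinct discovery commands per host. The command list, window, and threshold below are illustrative assumptions, not tuned values:

```python
from collections import deque

# Illustrative (not exhaustive) set of discovery commands
DISCOVERY = {"ipconfig.exe", "whoami.exe", "net.exe", "systeminfo.exe", "nltest.exe"}

def discovery_bursts(events, window=60, threshold=4):
    """Return timestamps at which >= threshold distinct discovery commands
    executed within `window` seconds on one host — the quick back-to-back
    sequence more typical of testing than of targeted adversary discovery.
    `events` is an iterable of (timestamp_seconds, process_name).
    """
    recent = deque()
    hits = []
    for ts, name in sorted(events):
        if name.lower() not in DISCOVERY:
            continue
        recent.append((ts, name.lower()))
        while ts - recent[0][0] > window:
            recent.popleft()
        if len({n for _, n in recent}) >= threshold:
            hits.append(ts)
    return hits

events = [(0, "whoami.exe"), (2, "ipconfig.exe"), (4, "net.exe"),
          (6, "systeminfo.exe"), (900, "whoami.exe")]
print(discovery_bursts(events))  # → [6]
```

The lone command at second 900 never triggers, which mirrors the observation that sparse, targeted discovery looks different from a burst.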

Based on our findings, we encourage organizations to be thoughtful about their testing goals. One approach is to test atomic behaviors without considering the surrounding behavior, which can help you determine whether you are able to detect that behavior at all. However, consider also setting a goal of testing a full intrusion chain. This may look different than testing atomic behaviors. For example, instead of executing 20 discovery commands in quick sequence, you could run one or two discovery commands, pivot to other activity, then return to additional discovery commands later.
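One simple way to pace a test plan in that spirit is to interleave discovery steps with other activity rather than running them back to back. A sketch with hypothetical step names:

```python
from itertools import zip_longest

def interleave(discovery_steps, other_steps):
    """Alternate discovery commands with other phases of the test so the
    resulting sequence looks less like one long discovery burst."""
    plan = []
    for d, o in zip_longest(discovery_steps, other_steps):
        if d is not None:
            plan.append(d)
        if o is not None:
            plan.append(o)
    return plan

plan = interleave(
    ["whoami /all", "ipconfig /all"],
    ["stage tooling", "lateral movement", "collection"],
)
print(plan)
# → ['whoami /all', 'stage tooling', 'ipconfig /all', 'lateral movement', 'collection']
```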

One approach that can help ensure you’re testing based on real-world threats that matter is to enhance testing with threat intelligence. Adversary emulation, in which testers use threat intelligence to try to carefully mimic threats of concern as closely as possible, is a widespread methodology that can provide significant value and help organizations improve testing. MITRE’s adversary emulation plans provide a helpful starting point.

We also recommend changing up your toolset. Automated red teaming and testing tools are powerful, but they are often easier for defenders to detect. To ensure your organization has robust detection capabilities for a range of behaviors, consider different ways you could test the same techniques. For example, instead of just using Mimikatz for credential dumping, try using Gsecdump, NPPSpy, or other tests from Atomic Red Team.