Behind the scenes

Methodology

Since 2013, Red Canary has delivered high-quality threat detection to organizations of all sizes. Our platform collects hundreds of terabytes of endpoint telemetry every day, surfacing evidence of threats that are analyzed by our Cyber Incident Response Team (CIRT). Confirmed threats are tied to corresponding MITRE ATT&CK® techniques to help our customers clearly understand what is happening in their environments. This report is a summary of confirmed threats derived from this data.

Creating metrics around techniques and threats is a challenge for any organization. To help you better understand the data behind this report and to serve as a guide for how you can create your own metrics, we want to share some details about our methodology.

Behind the data

To understand our data, you need to understand how we detect malicious and suspicious behavior in the first place. We gather telemetry from our customers’ endpoints and feed it through a constantly evolving library of detection analytics. Each detection analytic is mapped to one or more ATT&CK techniques and sub-techniques, as appropriate. When telemetry matches the logic in one of our detection analytics, an event is generated for review by our detection engineers.
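To make that flow concrete, here is a minimal sketch of how a detection analytic with ATT&CK mappings might be modeled and matched against telemetry. Every name, field, and matching rule below is hypothetical and greatly simplified; it is an illustration of the concept, not Red Canary's implementation.

```python
from dataclasses import dataclass


@dataclass
class DetectionAnalytic:
    """A detection analytic and the ATT&CK (sub-)techniques it maps to."""
    name: str
    techniques: set[str]  # e.g., {"T1059.001"} for PowerShell

    def matches(self, telemetry: dict) -> bool:
        # Placeholder matching logic; real analytics evaluate much richer
        # process, network, and other endpoint telemetry.
        return (telemetry.get("process_name") == "powershell.exe"
                and "-EncodedCommand" in telemetry.get("command_line", ""))


@dataclass
class Event:
    """Generated for detection engineer review when telemetry matches an analytic."""
    endpoint_id: str
    analytic: DetectionAnalytic


def evaluate(telemetry: dict, analytics: list[DetectionAnalytic]) -> list[Event]:
    """Run one telemetry record through the analytic library and emit events."""
    return [Event(telemetry["endpoint_id"], a) for a in analytics if a.matches(telemetry)]


# Hypothetical usage
library = [DetectionAnalytic("encoded_powershell", {"T1059.001", "T1027"})]
record = {
    "endpoint_id": "host-42",
    "process_name": "powershell.exe",
    "command_line": "powershell.exe -EncodedCommand SQBFAFgA...",
}
events = evaluate(record, library)
print([(e.endpoint_id, sorted(e.analytic.techniques)) for e in events])
```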

When a detection engineer determines that one or more events for a specific endpoint surpass the threshold of suspicious or malicious behavior, a confirmed threat detection documenting the activity is created for that endpoint. These confirmed threat detections inherit the ATT&CK techniques that were mapped to the analytics that alerted us to the malicious or suspicious behaviors in the first place.
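In code terms, that inheritance amounts to taking the union of the techniques carried by the contributing events. The sketch below assumes events are simple records with hypothetical field names; it is illustrative only.

```python
def confirm_threat(endpoint_id: str, events: list[dict]) -> dict:
    """Roll the reviewed events for one endpoint into a confirmed threat detection.

    Each event carries the techniques mapped to the analytic that produced it;
    the confirmed detection simply inherits their union.
    """
    techniques: set[str] = set()
    for event in events:
        if event["endpoint_id"] == endpoint_id:
            techniques |= set(event["techniques"])
    return {"endpoint_id": endpoint_id, "techniques": sorted(techniques)}


# Hypothetical usage: two events from different analytics on the same endpoint
events = [
    {"endpoint_id": "host-42", "techniques": ["T1059.001"]},
    {"endpoint_id": "host-42", "techniques": ["T1059.001", "T1027"]},
]
print(confirm_threat("host-42", events))
# {'endpoint_id': 'host-42', 'techniques': ['T1027', 'T1059.001']}
```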

It’s important to understand that the techniques and sub-techniques we’re counting are based on our analytics, not on the review performed by our detection engineers, during which they add more context to detections. We’ve chosen this approach for the sake of efficiency and consistency. However, the limitation of this approach is that context gleaned during the investigation of a threat does not contribute to its technique mapping, and, by extension, some small percentage of threats may be mapped incorrectly or incompletely. That said, we continually review these confirmed threats, and we do not believe there are a significant number of mapping errors in our dataset.

Changes in ATT&CK

In 2020, MITRE released a version of ATT&CK that effectively added a new dimension to the matrix in the form of sub-techniques. We took this change as an opportunity to comprehensively review the thousands of detection analytics we’d created over the years. In addition to realigning our analytics so that they map to sub-techniques, we were also able to standardize how we mapped our analytics to ATT&CK in general. This sort of mapping may seem straightforward, but it really isn’t. Over a period of years, we had many different people interpreting the framework in many different ways. Naturally, this led to a level of inconsistency that we wanted to fix. We implemented new guidelines for mapping detection analytics to techniques and applied them across our entire library.

We recommend that any organization mapping to ATT&CK (or any framework) create a set of standard guidelines for analysts. While frameworks seem simple, the choice of how to map information is a subjective human decision, and guidelines help keep everyone aligned.

The changes we made in mapping our detection analytics resulted in a more accurate representation of the techniques being used. However, the remapping to sub-techniques makes it difficult to compare our 2021 Threat Detection Report to last year’s report. While we realize this causes some confusion, we believe updating to the latest ATT&CK version gives the data underlying our report a more solid foundation.

Okay, so how do you count?

Now that we’ve explained how we map to ATT&CK, you may be wondering how we tally the scores for the Threat Detection Report. Our methodology for counting technique prevalence has remained largely consistent since the original report in 2019. For each malicious or suspicious detection we published during the year, we incremented the count for every technique reflected by a detection analytic that contributed to that detection. (We excluded detections of unwanted software from these results.) If a detection was remediated and the host was reinfected at a later date, a new detection would be created, incrementing the counts again. While this method of counting tends to overemphasize techniques that get reused across multiple hosts in a single environment (such as when a laterally moving adversary generates multiple detections within one environment), we feel it gives appropriate weight to the techniques you are most likely to encounter as a defender.
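A minimal sketch of that tally, assuming each confirmed detection is a record listing its mapped techniques and a classification field (the field names are hypothetical):

```python
from collections import Counter


def count_technique_prevalence(detections: list[dict]) -> Counter:
    """Increment each technique once per confirmed detection that reflects it.

    Detections of unwanted software are excluded; a reinfection that produces
    a new detection increments the counts again.
    """
    counts: Counter = Counter()
    for detection in detections:
        if detection.get("classification") == "unwanted_software":
            continue
        counts.update(set(detection["techniques"]))  # once per technique per detection
    return counts


# Hypothetical usage
detections = [
    {"classification": "malicious", "techniques": ["T1059.001", "T1027"]},
    {"classification": "malicious", "techniques": ["T1059.001"]},       # reinfection, counted again
    {"classification": "unwanted_software", "techniques": ["T1176"]},   # excluded
]
print(count_technique_prevalence(detections))  # Counter({'T1059.001': 2, 'T1027': 1})
```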

For the purposes of this report, we decided to set our rankings based on techniques, even though the majority of our analysis and detection guidance will be based on sub-techniques. This seemed to be the most reasonable approach, considering the following:

  • Sometimes we map to a technique that doesn’t have sub-techniques
  • Sometimes we map to sub-techniques
  • Sometimes we map generally to a technique but not to its subs

We acknowledge the imperfection of this solution, but we also accept that this is a transition year for both ATT&CK and Red Canary. In cases where a parent technique has no sub-techniques, or has sub-techniques that we don’t map to, we analyze the parent technique on its own and provide detection guidance for it. However, in cases where sub-technique detections dominate a given parent technique, we focus our analysis and detection guidance entirely on the sub-techniques that meet our minimum detection volume requirement. Specifically, we analyzed sub-techniques that represented at least 20 percent of the total detection volume for a given technique; if no sub-technique reached the 20 percent mark, we analyzed the parent technique.
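That selection rule can be expressed roughly as follows. The 20 percent threshold is the one described above; the function, its inputs, and the example counts are hypothetical, meant only to illustrate the logic.

```python
def select_for_analysis(parent: str, technique_total: int,
                        sub_counts: dict[str, int], threshold: float = 0.20) -> list[str]:
    """Pick the (sub-)techniques to analyze for one parent technique.

    A sub-technique earns its own analysis when it represents at least
    `threshold` of the parent technique's total detection volume; otherwise
    the parent technique is analyzed on its own.
    """
    if technique_total == 0 or not sub_counts:
        return [parent]
    qualifying = [sub for sub, n in sub_counts.items() if n / technique_total >= threshold]
    return qualifying or [parent]


# Hypothetical usage: 1,000 detections for T1059, split across sub-techniques
print(select_for_analysis("T1059", 1000,
                          {"T1059.001": 600, "T1059.003": 250, "T1059.005": 50}))
# ['T1059.001', 'T1059.003']
```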

What about threats?

New to this year’s report is a ranking of the 10 most prevalent threats we encountered in 2020. The Red Canary Intelligence Team seeks to provide additional context about threats to help improve decision-making. By understanding which threats are present in a detection, customers can better understand how they should respond. Throughout 2020, the Intelligence Team worked to improve how we identify and associate threats in detections. We chose to define “threats” broadly to include malware, threat groups, activity clusters, and other categories of malicious activity. We took two main approaches to associating a detection with a threat: automatically associating them based on patterns identified for each specific threat, and manually associating them based on intelligence analyst assessments conducted while reviewing each detection.
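As a rough illustration of the automated half of that process, the sketch below associates a detection with known threats whenever a simple indicator pattern matches. Real intelligence profiles are far richer; the threat names, patterns, and fields here are entirely made up.

```python
import re

# Hypothetical threat profiles: threat name -> compiled command-line pattern
THREAT_PATTERNS = {
    "ExampleStealer": re.compile(r"example_stealer\.exe", re.IGNORECASE),
    "FakeLoader": re.compile(r"rundll32\.exe .*fakeloader", re.IGNORECASE),
}


def associate_threats(detection: dict) -> set[str]:
    """Return the known threats whose patterns match this detection.

    Detections with no automatic match fall back to manual analyst review.
    """
    command_lines = " ".join(detection.get("command_lines", []))
    return {name for name, pattern in THREAT_PATTERNS.items()
            if pattern.search(command_lines)}


# Hypothetical usage
detection = {"command_lines": ["rundll32.exe C:\\temp\\fakeloader.dll,Start"]}
print(associate_threats(detection))  # {'FakeLoader'}
```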

All that said, how did we tally the numbers for the most prevalent threats? In contrast to our technique methodology, we counted threats by the unique environments affected. Whereas for techniques we counted multiple detections within the same customer environment as distinct tallies, for threats we counted only the number of customers who encountered that threat during 2020. This is due to the heavy skew introduced by incident response engagements for laterally moving threats that affect nearly every endpoint in an environment (think ransomware).

Had we counted threats by individual detections, ransomware and the laterally moving threats that lead up to it (e.g., Cobalt Strike) would have been disproportionately represented in our data. We believe counting in this way gives an appropriate measure of how likely each threat is to affect any given organization, absent more specific threat modeling details for that organization. It also serves as a check against the acknowledged bias in the way we count technique prevalence.
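A minimal sketch of that counting rule, assuming each detection record carries an environment identifier and any associated threat names (the field names are hypothetical):

```python
from collections import Counter


def count_threat_prevalence(detections: list[dict]) -> Counter:
    """Count each threat once per unique customer environment it affected.

    Unlike the technique tally, repeated detections of the same threat within
    one environment (e.g., ransomware touching hundreds of endpoints) count
    only once.
    """
    seen: set[tuple[str, str]] = set()  # (threat, environment) pairs already counted
    counts: Counter = Counter()
    for detection in detections:
        for threat in detection.get("threats", []):
            key = (threat, detection["environment_id"])
            if key not in seen:
                seen.add(key)
                counts[threat] += 1
    return counts


# Hypothetical usage: three detections, two environments
detections = [
    {"environment_id": "org-1", "threats": ["Cobalt Strike"]},
    {"environment_id": "org-1", "threats": ["Cobalt Strike"]},  # same org, not re-counted
    {"environment_id": "org-2", "threats": ["Cobalt Strike"]},
]
print(count_threat_prevalence(detections))  # Counter({'Cobalt Strike': 2})
```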

Limitations

There are a few limitations to our methodology for counting threats, as there are with any approach. Due to the nature of our visibility (i.e., we predominantly leverage endpoint detection and response data), our perspective weighs more heavily toward threats that made it through external defenses, such as email and firewall gateways, and gained some level of execution on victim machines. As such, our results likely differ from what you may see from other vendors focused more on network- or email-based detection. For example, though phishing is a generally common technique, it didn’t make it into our top 10.

Another important limitation to our counting method may seem obvious: we identify threats we already know about. Our Intelligence Team is relatively nascent—it formed in 2019—and it wasn’t until mid-2020 that we began reviewing all malicious detections in earnest. And while we have built a considerable knowledge base of intelligence profiles, the vast and ever-changing threat landscape presents many unique threats that we are unable to associate (though in some cases we have been able to cluster these under new monikers such as Blue Mockingbird or Silver Sparrow). If we are able to identify a repeatable pattern for a certain threat and automate its association, we observe that threat more often.

This means that while the top 10 threats are worth focusing on, they are not the only threats that analysts should focus on, since there may be other impactful ones that are unidentified and therefore underreported. Despite these flaws, we believe that the analysis and detection guidance across the threats and techniques in this report is reflective of the overall landscape, and, if implemented, offers a great deal of defense-in-depth against the threats that most organizations are likely to encounter.

Knowing the limitations of any methodology is important as you determine what threats your team should focus on. While we hope our top 10 threats and detection opportunities help prioritize threats to focus on, we recommend building out your own threat model by comparing the top threats we share in our report with what other teams publish and what you observe in your own environment.