The art and science of effective security storytelling

Three criteria for assessing quality security data and telling the “minimum viable story”

Matt Graeber

As defenders, detection engineers, and SOC analysts, it’s easy for us to label an alert or a security event as “garbage” when it lacks the context we implicitly expect. It’s true; there is no shortage of security event data that truly is garbage. But if a program manager, an executive leader, or a customer asked you why it’s not good, to what extent are you willing to offer an explanation beyond, “it just lacks context”?

As those who are considered subject matter experts in threats, techniques, and event data, we need to hold ourselves to a higher standard and be able to articulate objectively what constitutes a quality data source and, if it doesn’t meet a certain quality bar, to be able to clearly identify what is required to improve it. This post aims to establish a methodology for assessing data quality so that the data we spend copious amounts of money ingesting and querying gives us our expected return on investment, namely:

  1. It makes clear what happened based on the information present.
  2. It contains sufficient information to remediate the action.
  3. It can be correlated to other relevant data sources.

Let’s break down these three attributes that constitute a quality data source.

1. The data makes clear what happened based on the information present

At a high level, this assessment criterion considers the following question: Can you explain succinctly and confidently what actually happened to a technical audience, but one who is not necessarily a subject matter expert?

To answer that question, the event should tell a brief story that ideally covers the following criteria, an approach the Red Canary Threat Research team has coined “minimum viable storytelling.”

1. Who

This refers to the actor/identity that performed the action. This is not to be confused with an actor/identity that was affected by the action.

Examples:

  • a user identity – i.e., an identity associated with a person
  • a workload identity – e.g., service principal
  • a local account

2. What

Using grammatical terminology, this refers to the following:

  1. Verb: the action that was performed
  2. Direct object: the resource that was affected by the action
  3. Indirect object: the resource to which the action was applied, e.g., the role that a member was added to

We’ll demonstrate what this looks like in the example that follows.

3. When

The date and time that the action occurred. This should be clearly distinguished from when the event data was populated.

4. Where

The environment in which the action occurred.

Examples:

  • Entra tenant ID
  • device ID
  • computer hostname

5. Whence

The source from which the action originated.

Examples:

  • IP address
  • IP geolocation

6. How

The means by which the action occurred.

Examples:

  • user agent string
  • application ID/name

What about “why?”

You may have noticed that “why” is conspicuously absent. That’s because an event on its own can’t tell us the intent behind the action. Determining why an event happened is the defender’s job when deciding whether it is benign or suspicious. Without quality data, however, a defender cannot make that determination with confidence.

Piecing together the story

An event that includes all of the above criteria is well-suited to tell a succinct and clear story of what happened in no more than two sentences. This should also be the litmus test for any defender: Can you clearly explain to someone what occurred in an event? If so, you’ve demonstrated that not only is the data sufficient to tell a story but that you are comfortable enough translating the event into a format that is accessible to non-subject matter experts.

Let’s look at an example event and see if we can extract the criteria necessary to articulate a clear narrative, i.e., the “minimum viable story.” In this example, let us consider an Entra identity being added to a role. Here’s the raw AuditLogs event that we will assess:

{
  "TenantId": "8b71734b-35c8-4e06-b72a-c23f700bf0dd",
  "SourceSystem": "Azure AD",
  "TimeGenerated": "2025-10-16T16:49:52.4205655Z",
  "ResourceId": "/tenants/08cdc03a-f392-4da8-ba8e-872297df4c7f/providers/Microsoft.aadiam",
  "OperationName": "Add member to role",
  "OperationVersion": "1.0",
  "Category": "RoleManagement",
  "ResultType": "",
  "ResultSignature": "None",
  "ResultDescription": "",
  "DurationMs": "0",
  "CorrelationId": "90f4cd25-d149-4239-bd38-809c8344da8b",
  "Resource": "Microsoft.aadiam",
  "ResourceGroup": "Microsoft.aadiam",
  "ResourceProvider": "",
  "Identity": "Microsoft Graph Command Line Tools",
  "Level": "4",
  "Location": "",
  "AdditionalDetails": {
    "key": "User-Agent",
    "value": "Mozilla/5.0 (Macintosh; Darwin 24.6.0 Darwin Kernel Version 24.6.0: Mon Aug 11 21:16:34 PDT 2025; root:xnu-11417.140.69.701.11~1/RELEASE_ARM64_T6020; en-US) PowerShell/7.5.3"
  },
  "Id": "Directory_90f4cd25-d149-4239-bd38-809c8344da8b_S4P51_17590570",
  "InitiatedBy": {
    "user": {
      "id": "14b4f3a8-609d-4523-9997-e7557eea3d39",
      "displayName": "Microsoft Graph Command Line Tools",
      "userPrincipalName": "Matt@ContosoCorp.onmicrosoft.com",
      "ipAddress": "51.2.72.192",
      "roles": []
    }
  },
  "LoggedByService": "Core Directory",
  "Result": "success",
  "ResultReason": "",
  "TargetResources": [
    {
      "id": "5705b3c2-0eea-4a49-9a36-469001e400c3",
      "displayName": null,
      "type": "User",
      "userPrincipalName": "TestUser@ContosoCorp.onmicrosoft.com",
      "modifiedProperties": [
        {
          "displayName": "Role.ObjectID",
          "oldValue": null,
          "newValue": "\"c2166a73-164c-4f18-903c-c5f3234d3930\""
        },
        {
          "displayName": "Role.DisplayName",
          "oldValue": null,
          "newValue": "\"Privileged Role Administrator\""
        },
        {
          "displayName": "Role.TemplateId",
          "oldValue": null,
          "newValue": "\"e8611ab8-c189-46e8-94e1-60213ab1f814\""
        },
        {
          "displayName": "Role.WellKnownObjectName",
          "oldValue": null,
          "newValue": "\"PrivilegedRoleAdmins\""
        }
      ],
      "administrativeUnits": []
    },
    {
      "id": "c2166a73-164c-4f18-903c-c5f3234d3930",
      "displayName": null,
      "type": "Role",
      "modifiedProperties": [],
      "administrativeUnits": []
    }
  ],
  "AADTenantId": "08cdc03a-f392-4da8-ba8e-872297df4c7f",
  "ActivityDisplayName": "Add member to role",
  "ActivityDateTime": "2025-10-16T16:49:52.4205655Z",
  "AADOperationType": "Assign",
  "Type": "AuditLogs"
}

Now, using the raw event, let’s gauge the extent to which the who, what, when, where, whence, and how can be populated.

Question: Who

Data points:
  • A user identity performed the action. We know this because InitiatedBy.user exists. If a service principal performed the action, InitiatedBy.app would be present instead.
  • Matt@ContosoCorp.onmicrosoft.com is the identity name that performed the action. This is populated in the InitiatedBy.user.userPrincipalName field.

Question: What

Data points:
  • Verb: A member was added to a role. We know this because the OperationName field is Add member to role, and we know the action succeeded because the Result field is success.
  • Direct object: The member added to the role was a user identity. We know this because TargetResources[0].type is User.
  • Direct object: The member added to the role was TestUser@ContosoCorp.onmicrosoft.com. This is populated in TargetResources[0].userPrincipalName.
  • Indirect object: The role to which the member was added was the Privileged Role Administrator role. This is populated in TargetResources[0].modifiedProperties['Role.DisplayName'].newValue.

Question: When

Data points:
  • The action was performed at 2025-10-16T16:49:52.4205655Z. This is populated in the ActivityDateTime field.

Question: Where

Data points:
  • The action was performed in Entra ID tenant ID 08cdc03a-f392-4da8-ba8e-872297df4c7f. This is populated in the AADTenantId field.

Question: Whence

Data points:
  • The action was performed from IP address 51.2.72.192. This is populated in the InitiatedBy.user.ipAddress field.

Question: How

Data points:
  • The action was performed by the Microsoft Graph Command Line Tools application. This is populated in the Identity field.
  • The client user agent string is: Mozilla/5.0 (Macintosh; Darwin 24.6.0 Darwin Kernel Version 24.6.0: Mon Aug 11 21:16:34 PDT 2025; root:xnu-11417.140.69.701.11~1/RELEASE_ARM64_T6020; en-US) PowerShell/7.5.3. This is populated in the AdditionalDetails['User-Agent'] field.
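The field lookups in the table above can be sketched in code. The following is a minimal Python sketch, not part of the original analysis: the function names (extract_story, modified_property) and the output keys are hypothetical, and EVENT is a trimmed copy of the AuditLogs event above that keeps only the fields the story needs. Accepting a list for AdditionalDetails is an assumption about how other exports might shape that field.

```python
# Trimmed copy of the example AuditLogs event; values are taken from the event above.
EVENT = {
    "OperationName": "Add member to role",
    "Result": "success",
    "Identity": "Microsoft Graph Command Line Tools",
    "ActivityDateTime": "2025-10-16T16:49:52.4205655Z",
    "AADTenantId": "08cdc03a-f392-4da8-ba8e-872297df4c7f",
    "AdditionalDetails": {
        "key": "User-Agent",
        "value": "Mozilla/5.0 (Macintosh; Darwin 24.6.0 Darwin Kernel Version 24.6.0: "
                 "Mon Aug 11 21:16:34 PDT 2025; root:xnu-11417.140.69.701.11~1/"
                 "RELEASE_ARM64_T6020; en-US) PowerShell/7.5.3",
    },
    "InitiatedBy": {
        "user": {
            "userPrincipalName": "Matt@ContosoCorp.onmicrosoft.com",
            "ipAddress": "51.2.72.192",
        }
    },
    "TargetResources": [
        {
            "id": "5705b3c2-0eea-4a49-9a36-469001e400c3",
            "type": "User",
            "userPrincipalName": "TestUser@ContosoCorp.onmicrosoft.com",
            "modifiedProperties": [
                {
                    "displayName": "Role.DisplayName",
                    "oldValue": None,
                    "newValue": '"Privileged Role Administrator"',
                }
            ],
        }
    ],
}


def modified_property(target, name):
    """Return a modifiedProperties newValue by displayName, minus its embedded quotes."""
    for prop in target.get("modifiedProperties", []):
        if prop.get("displayName") == name:
            value = prop.get("newValue")
            return value.strip('"') if value else None
    return None


def extract_story(event):
    """Map an 'Add member to role' event onto who/what/when/where/whence/how."""
    initiated_by = event.get("InitiatedBy", {})
    # InitiatedBy.user means a user identity acted; InitiatedBy.app means a service principal.
    actor_kind = "user" if "user" in initiated_by else "app" if "app" in initiated_by else None
    actor = initiated_by.get(actor_kind, {}) if actor_kind else {}
    targets = event.get("TargetResources", [])
    target = targets[0] if targets else {}
    details = event.get("AdditionalDetails", [])
    if isinstance(details, dict):  # the example event holds a single key/value object
        details = [details]
    user_agent = next((d.get("value") for d in details if d.get("key") == "User-Agent"), None)
    return {
        "who": actor.get("userPrincipalName") or actor.get("displayName"),
        "who_type": actor_kind,
        "verb": event.get("OperationName"),
        "succeeded": event.get("Result") == "success",
        "direct_object": target.get("userPrincipalName"),
        "direct_object_type": target.get("type"),
        "indirect_object": modified_property(target, "Role.DisplayName"),
        "when": event.get("ActivityDateTime"),
        "where": event.get("AADTenantId"),
        "whence": actor.get("ipAddress"),
        "how_app": event.get("Identity"),
        "how_user_agent": user_agent,
    }


story = extract_story(EVENT)
# Any element listed here is a gap in the minimum viable story.
missing = [key for key, value in story.items() if value in (None, "")]
print(story)
print("missing elements:", missing)
```

Listing the unpopulated elements is one way to turn “it just lacks context” into a concrete statement of which part of the story a data source fails to tell.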

Event narrative

Fortunately, all of the above criteria were populated with this event, allowing us to craft a coherent and succinct human-readable narrative using the following template:

At WHEN, within WHERE, WHO did WHAT (direct object) to WHAT (indirect object) from WHENCE using HOW.

Applying this template, we get the following narrative:

At 2025-10-16T16:49:52Z, within Entra ID tenant ID 08cdc03a-f392-4da8-ba8e-872297df4c7f, the user Matt@ContosoCorp.onmicrosoft.com added TestUser@ContosoCorp.onmicrosoft.com to the Privileged Role Administrator role from IP address 51.2.72.192 using Microsoft Graph Command Line Tools with the following user agent: Mozilla/5.0 (Macintosh; Darwin 24.6.0 Darwin Kernel Version 24.6.0: Mon Aug 11 21:16:34 PDT 2025; root:xnu-11417.140.69.701.11~1/RELEASE_ARM64_T6020; en-US) PowerShell/7.5.3.
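Filling in the template can also be sketched in code. The Python below is an illustrative sketch only: STORY hard-codes the values established in the assessment above, the key names and render_narrative are hypothetical, and the sentence template is specific to the “Add member to role” operation rather than a general-purpose renderer. It refuses to produce a narrative when any element is missing, since a partial story would hide the gap.

```python
import re

# Story elements taken from the assessment of the example event above.
STORY = {
    "when": "2025-10-16T16:49:52.4205655Z",
    "where": "08cdc03a-f392-4da8-ba8e-872297df4c7f",
    "who": "Matt@ContosoCorp.onmicrosoft.com",
    "direct_object": "TestUser@ContosoCorp.onmicrosoft.com",
    "indirect_object": "Privileged Role Administrator",
    "whence": "51.2.72.192",
    "how_app": "Microsoft Graph Command Line Tools",
    "how_user_agent": (
        "Mozilla/5.0 (Macintosh; Darwin 24.6.0 Darwin Kernel Version 24.6.0: "
        "Mon Aug 11 21:16:34 PDT 2025; root:xnu-11417.140.69.701.11~1/"
        "RELEASE_ARM64_T6020; en-US) PowerShell/7.5.3"
    ),
}

REQUIRED = (
    "when", "where", "who", "direct_object",
    "indirect_object", "whence", "how_app", "how_user_agent",
)

# At WHEN, within WHERE, WHO did WHAT (direct object) to WHAT (indirect object)
# from WHENCE using HOW, specialized for a role assignment.
TEMPLATE = (
    "At {when}, within Entra ID tenant ID {where}, the user {who} added "
    "{direct_object} to the {indirect_object} role from IP address {whence} "
    "using {how_app} with the following user agent: {how_user_agent}."
)


def render_narrative(story):
    """Render the one-sentence narrative, or raise if the story has gaps."""
    missing = [key for key in REQUIRED if not story.get(key)]
    if missing:
        raise ValueError("cannot tell the minimum viable story; missing: " + ", ".join(missing))
    fields = dict(story)
    # Drop fractional seconds for readability, as the narrative in the text does.
    fields["when"] = re.sub(r"\.\d+Z$", "Z", story["when"])
    return TEMPLATE.format(**fields)


narrative = render_narrative(STORY)
print(narrative)
```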

2. The data contains sufficient information to remediate the action

Quality data supplies an incident responder with enough information to remediate the action, either manually or with help from automation. Using the above example of the member being added to a role, a responder who wanted to remove the role assignment could use the Remove-MgBetaRoleManagementDirectoryRoleAssignment Microsoft Graph PowerShell cmdlet. The command requires a UnifiedRoleAssignmentId argument, which the AuditLogs event above doesn’t supply. The event does, however, supply enough information to retrieve the relevant UnifiedRoleAssignmentId value, namely, the following:

  1. TargetResources[0].id: 5705b3c2-0eea-4a49-9a36-469001e400c3
  2. TargetResources[0].modifiedProperties['Role.TemplateId'].newValue: e8611ab8-c189-46e8-94e1-60213ab1f814

So we can retrieve the UnifiedRoleAssignmentId value with the following command:

 

$TargetRoleAssignment = Get-MgBetaRoleManagementDirectoryRoleAssignment -Filter "principalId eq '5705b3c2-0eea-4a49-9a36-469001e400c3' and roleDefinitionId eq 'e8611ab8-c189-46e8-94e1-60213ab1f814'"

The suspect role assignment can now be remediated (i.e., removed) with the Remove-MgBetaRoleManagementDirectoryRoleAssignment command:

 

Remove-MgBetaRoleManagementDirectoryRoleAssignment -UnifiedRoleAssignmentId $TargetRoleAssignment.Id

So we were able to successfully remediate the action using the data present in the event.
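For automated remediation, the filter string passed to Get-MgBetaRoleManagementDirectoryRoleAssignment has to be built from the event itself. The Python sketch below shows one way that could look, under stated assumptions: role_assignment_filter is a hypothetical helper, EVENT is a trimmed copy of the example event, and the sketch follows the commands above in using Role.TemplateId as the roleDefinitionId. It only builds the filter string and does not call Microsoft Graph. Note that newValue carries embedded double quotes that have to be stripped, and that both values are validated as GUIDs before being placed into the filter.

```python
import uuid

# Trimmed copy of the example AuditLogs event; only the fields used for remediation.
EVENT = {
    "TargetResources": [
        {
            "id": "5705b3c2-0eea-4a49-9a36-469001e400c3",
            "type": "User",
            "modifiedProperties": [
                {
                    "displayName": "Role.TemplateId",
                    "oldValue": None,
                    "newValue": '"e8611ab8-c189-46e8-94e1-60213ab1f814"',
                }
            ],
        }
    ]
}


def role_assignment_filter(event):
    """Build the filter that locates the role assignment created by the event."""
    target = event["TargetResources"][0]
    template_id = None
    for prop in target.get("modifiedProperties", []):
        if prop.get("displayName") == "Role.TemplateId":
            # newValue is stored with embedded quotes, e.g. "\"e8611ab8-...\"".
            template_id = (prop.get("newValue") or "").strip('"')
    if not template_id:
        raise ValueError("event does not carry Role.TemplateId")
    # uuid.UUID raises ValueError on anything that is not a GUID, which keeps
    # unexpected characters out of the filter expression.
    principal_id = str(uuid.UUID(target["id"]))
    role_definition_id = str(uuid.UUID(template_id))
    return f"principalId eq '{principal_id}' and roleDefinitionId eq '{role_definition_id}'"


odata_filter = role_assignment_filter(EVENT)
print(odata_filter)
```

The resulting string is the same value supplied to the -Filter argument in the retrieval command above.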

3. The data can be correlated to other relevant data sources

A single event in isolation only tells a minor portion of an overall threat story. The ability to understand a relevant event in the context of an overall threat storyline requires the correlation of other, relevant events performed by the actor.

In the example above, it would be valuable to know all actions performed by Matt@ContosoCorp.onmicrosoft.com that used the same access token as the one used to perform the role assignment. Correlation to the sign-in event that issued the token could be performed by referencing the unique token identifier or session ID values. Unfortunately, AuditLogs events have neither value populated, which makes direct correlation to a sign-in event and its related activity impossible. Related events can still be inferred, however, from the activity datetime, user identity, IP address, and user agent.
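Inferred correlation of this kind can be sketched as a simple match on user identity, IP address, and user agent within a time window. The Python below is a rough illustration, not a recommended detection: the CANDIDATES records are made up for the example, the flattened key names and related_events are hypothetical, the user agent is shortened to a placeholder value, and the one-hour window is an arbitrary assumption rather than anything tied to token lifetime.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp):
    """Parse timestamps such as 2025-10-16T16:49:52.4205655Z (seven fractional digits)."""
    match = re.fullmatch(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z", timestamp)
    if not match:
        raise ValueError(f"unexpected timestamp format: {timestamp}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # datetime holds microseconds only, so truncate the fraction to six digits.
    fraction = (match.group(2) or "")[:6].ljust(6, "0")
    return base + timedelta(microseconds=int(fraction))


def related_events(anchor, candidates, window=timedelta(hours=1)):
    """Return candidates sharing the anchor's user, IP, and user agent within the window."""
    anchor_time = parse_utc(anchor["when"])
    keys = ("who", "whence", "how_user_agent")
    related = [
        event for event in candidates
        if all(event.get(key) == anchor.get(key) for key in keys)
        and abs(parse_utc(event["when"]) - anchor_time) <= window
    ]
    return sorted(related, key=lambda event: parse_utc(event["when"]))


UA = "PowerShell/7.5.3"  # placeholder standing in for the full user agent string

# The role assignment from the example event, flattened to the story elements.
ANCHOR = {"operation": "Add member to role", "when": "2025-10-16T16:49:52.4205655Z",
          "who": "Matt@ContosoCorp.onmicrosoft.com", "whence": "51.2.72.192",
          "how_user_agent": UA}

# Hypothetical neighboring events, invented purely for illustration.
CANDIDATES = [
    {"operation": "Add user", "when": "2025-10-16T16:41:10.0000000Z",
     "who": "Matt@ContosoCorp.onmicrosoft.com", "whence": "51.2.72.192",
     "how_user_agent": UA},
    {"operation": "Update user", "when": "2025-10-16T16:45:00Z",
     "who": "Matt@ContosoCorp.onmicrosoft.com", "whence": "203.0.113.7",
     "how_user_agent": UA},  # different IP address: not matched
    {"operation": "Update user", "when": "2025-10-17T09:00:00Z",
     "who": "Matt@ContosoCorp.onmicrosoft.com", "whence": "51.2.72.192",
     "how_user_agent": UA},  # outside the window: not matched
]

matches = related_events(ANCHOR, CANDIDATES)
print([event["operation"] for event in matches])
```

Because none of these fields uniquely identifies a token or session, matches found this way are leads to investigate rather than proof that the same session performed the activity.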

Is this quality data?

Based on the three criteria established for data quality, the highlighted AuditLogs entry example could be assessed as follows:

 

Criterion: Makes clear what happened based on the information present
Assessment: High quality
Justification: All aspects of the who, what, when, where, whence, and how were populated using available event data.

Criterion: Contains sufficient information to remediate the action
Assessment: High quality
Justification: The event supplied all relevant fields necessary to automate remediation.

Criterion: Can be correlated to other relevant data sources
Assessment: Medium/low quality
Justification: Direct correlation to related activity is not possible, although indirect correlation can be inferred based on other relevant fields. If Microsoft supplied both unique token identifier and session ID values, as it does with other log sources, direct correlation would be possible.

Overall quality assessment: High

While the event doesn’t support direct correlation, it is populated with enough information that would allow threat hunters, detection engineers, and incident responders to ask any of the following questions that would lead to a benign/suspicious determination:

  • Does Matt@ContosoCorp.onmicrosoft.com perform role assignments often? Privileged Role Administrator is a high-privilege role. Is Matt@ContosoCorp.onmicrosoft.com expected to perform high-privilege role assignments?
  • Is it common for the Microsoft Graph Command Line Tools application to perform role assignments versus the Azure Portal or other sanctioned applications? Is there a reason why privileged identity management (PIM) wasn’t used to perform the role assignment?
  • Is the IP address common for the tenant and/or Matt@ContosoCorp.onmicrosoft.com?
  • Is the user agent string common for the tenant and/or Matt@ContosoCorp.onmicrosoft.com?
  • Does TestUser@ContosoCorp.onmicrosoft.com have an actual justification for the Privileged Role Administrator role?
  • What actions, if any, did TestUser@ContosoCorp.onmicrosoft.com perform after the role assignment?

Applying the narrative

As defenders and threat subject matter experts, we should be able to take ownership of establishing data requirements for detection and response. We should be able to clearly articulate the return on investment for the data sources we either claim we need for detection or the ones we claim lack value and can be dropped.

We hope that this assessment methodology can serve as a foundation for extracting the most value from the data sources needed for detection and response and can also be used to hold vendors accountable for supplying the data necessary to make informed, confident decisions.

 
