Red Canary Co-founder and CTO Chris Rothe joined the Risky Business podcast to talk about Red Canary’s expanded cloud protection capabilities and how we help customers monitor their cloud infrastructure. We’ve embedded the entire podcast, as it’s worth a listen (the Risky Business podcast always is), but the interview with Chris starts at 47:20.
You can read Chris’s interview below, which has been edited for clarity:
Patrick Gray
We’re going to chat now with Chris Rothe. He is the cofounder of Red Canary and is its CTO, and he’s joining me to talk about how managed detection and response firms like Red Canary are now helping customers monitor their cloud infrastructure. So as you’ll hear, when I recorded this interview, I was actually staying at a friend’s house in Melbourne, who happens to be an Azure cloud infrastructure guy.
So I wound up discussing this interview with him after I recorded it. And essentially, yeah, he told me the same thing that Chris did, as you’re about to hear, which is that out-of-the-box signals from services like Azure are actually pretty good these days, right? So you can plug an MDR provider into this telemetry and they can actually tell you useful things.
Anyway, here’s Chris Rothe talking about how MDR companies are tackling cloud infrastructure monitoring and response.
Chris Rothe
If you think about MDR, what are the core features of MDR that inform what telemetry you need? From our perspective, there’s sort of five main things: There’s 24/7/365 expert investigation of any potential threat. There’s advanced threat detection (detection engineering would be another term for that) that you need to apply to the right data sources.
There’s having a great global threat intelligence team that’s able to collect different pieces of intelligence and bring them to bear in your detection engineering pipeline. There’s threat hunting continuously to apply new intel to the old data. And then the response side, the R side, is proactive response and remediation, right? Being able to take action on threats and shut them down.
So if you accept that as sort of the core “What is MDR?” in order to deliver the outcomes that companies need, which is detecting threats and shutting them down before they cause damage, then the question starts to be, what types of data do you need to do that job in different environments? For endpoints, you know, largely for the last decade, that’s where most of the action was.
That’s where the most threat actors were landing. The best source we ever found for that was endpoint detection and response data. That telemetry that’s telling you every process, and what every process did, is the perfect set of data to do detection engineering on top of and find those threats in a really robust behavior-analytic type way.
Patrick Gray
I think the words that describe it best are execution events. I always thought it was just execution events. That’s it. And look at them. Find the funny ones, right?
Chris Rothe
Yeah, exactly. And that is core to our view of the world: you don’t want to convict it. You’re just looking for things that are interesting, that need a human to look at them because products and tools can’t convict things. Right? If we could write perfect analytics and say that’s definitely bad, then you don’t need a security team and you don’t need an MDR, right?
It’s that gray space where, hey, this thing looks like normal user behavior, but it’s actually an adversary doing it. That’s where MDR really is critical. So as you go beyond the endpoint, you say, hey, now we have users that are using mostly SaaS tools. And so the identity is sort of the center of their world.
You have cloud infrastructure where maybe EDR is in place on the workloads, but you also have this cloud control plane with all these different service primitives that you can use. What are the right telemetry sources in those environments? And so through our last couple of years of learning and growing in those areas, we’ve sort of zeroed in on in the cloud space, it’s really the cloud API telemetry, right?
So in the AWS world, that would be CloudTrail. Similar to the EDR analogy, it’s everything everyone did to the cloud control plane: every resource they created, every resource they stopped, every security group they modified. That’s what’s in that telemetry. And ultimately, that’s the same level of detail that you need in order to then build detection analytics on top of it. It’s similar in the identity space, like from the Oktas of the world and Azure AD: getting that fine-grained login telemetry, and then applying that into the email and productivity space when we’re talking O365 and the Unified Audit Log and all that kind of stuff. Those are the prime telemetry sources in sort of the modern conglomerated IT world.
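To make the "find the interesting ones" idea concrete, here is a minimal sketch in Python of surfacing control plane events for a human to review. It is an illustration under stated assumptions, not how Red Canary builds its detections. The watchlist, the function name `flag_interesting`, and the sample records are all hypothetical. The field names (`eventName`, `eventTime`, `userIdentity`, `sourceIPAddress`) and the top-level `Records` array follow the CloudTrail log file format.

```python
import json

# Hypothetical watchlist of control plane actions that deserve a second look.
# A real analytic would be far more selective than a flat list of names.
INTERESTING_EVENTS = {
    "AuthorizeSecurityGroupIngress",  # a security group was opened up
    "StopLogging",                    # a CloudTrail trail was paused
    "DeleteTrail",                    # a CloudTrail trail was removed
}


def flag_interesting(records):
    """Return a summary of records worth a human look. Nothing is convicted here."""
    flagged = []
    for record in records:
        if record.get("eventName") in INTERESTING_EVENTS:
            flagged.append({
                "time": record.get("eventTime"),
                "event": record.get("eventName"),
                "actor": record.get("userIdentity", {}).get("arn"),
                "source_ip": record.get("sourceIPAddress"),
            })
    return flagged


# Made-up sample in the shape of a CloudTrail log file.
SAMPLE_LOG = json.loads("""
{"Records": [
  {"eventTime": "2023-01-01T10:00:00Z", "eventName": "RunInstances",
   "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
   "sourceIPAddress": "198.51.100.7"},
  {"eventTime": "2023-01-01T10:05:00Z", "eventName": "StopLogging",
   "userIdentity": {"arn": "arn:aws:iam::123456789012:user/bob"},
   "sourceIPAddress": "203.0.113.9"}
]}
""")

if __name__ == "__main__":
    for hit in flag_interesting(SAMPLE_LOG["Records"]):
        print(hit)
```

The output is a queue of candidates rather than verdicts, which matches the point above: the analytic narrows the data, and a person decides what it means.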
Patrick Gray
You know, is one of the reasons that this wasn’t really possible earlier and is possible now because everybody, through the first decade of cloud, had a different approach, right? So every cloud environment was just such a snowflake that trying to get, you know, some managed detection and response company to look at the logs and even know what was going on was basically impossible. Whereas now, you know, it seems like there are more standard approaches to how people spin up these cloud environments. So, the badness kind of looks a little bit more uniform. Am I making sense here? Like, is that something that’s happened?
Chris Rothe
I think there’s a couple of ways to look at it. One is sort of an adversary’s view of it, and another would be “what are the tools that are actually available?” view of it. So from an adversary’s point of view, when it was so easy to compromise endpoints because everyone would click the link, why would I mess around trying to break into, you know, something in a cloud service provider?
An analogy would be like if I’m a sales rep who says I’ve got my tactics of sending a cold email or sending packages in the mail and I get way better response rate on one versus the other, let me just go with what works and then play the hits, right?
And so that was sort of the temporal angle of it, which is like, if I’m an adversary, why bother trying to attack a cloud infrastructure here when I can just get access to endpoints and do my thing from there and take advantage of it by delivering ransomware or whatever else they were using to monetize? So that’s one angle of it.
The other is, to your point, the cloud platforms, the cloud control planes, the cloud service providers have all sort of matured to—and I’m really talking about the big three here, AWS, GCP, Azure—have all matured to have a relatively similar portfolio of services, right? There’s nuances, there’s differences between them…
Patrick Gray
But it’s not like it was because people forget that AWS, you know, 10-15 years ago looked like what Digital Ocean looks like now, right? You could run your own Linux machines on a hypervisor and there was no, you know, telemetry source. So there was no standard way to do things either because you could bring your own VMs and you can run them in the cloud. But that’s not for sure.
Chris Rothe
Yeah. And I think the other, you know, evolution over time is the shared responsibility model that exists with the cloud service providers now. And if you’ve never seen that, the concept is there’s effectively an above-the-line and a below-the-line and all the things below the line are effectively the cloud service provider’s responsibility.
So when you think about that from if you’re mapping your traditional on-prem security thinking to a cloud environment—how many SOCs have I been in where somebody is like “one of my dreams is to take our badge reader data and correlate it with, you know, event logins to the computers.” That stuff’s gone, man.
There’s no physical security. Like, that’s all below the line, that’s out of your purview. That means from a security perspective, there’s lots of good things about not having to worry about that as a user of the cloud. Maybe the negative of it is that everything that’s above the line is your responsibility.
And that, in some cases, is stuff that you’ve never had to think about before, right? Because it wasn’t part of the universe when you were in an on-prem data-center-type environment. So I think that’s the other angle: that shared security or shared responsibility model has matured, and security teams are realizing how much they have to take on in terms of securing things above the line.
And that’s where detection and response starts to apply, where it’s like, oh, wow, now that we have visibility, now, to your point, that we have this common set of like activity telemetry that’s coming out from the different CSPs, now we’ve got to do something with it, right? Otherwise we’re negligent in finding those attacks.
Patrick Gray
Yeah. I mean, you know, some of the early approaches around cloud-based stuff was like you’d shim in a network sensor and sort of put some things together. So you would have basically like a network IDS in your cloud instance. And then maybe you’d do some endpoint telemetry. You know, if you’re running a bunch of Linux things, you’d throw in some sort of EDR-like security agent to send logs back. But that’s not really what we’re doing anymore, is it? I mean, that’s still a part of it.
Chris Rothe
The way we like to categorize it is you’ve got the MDR in our world, MDR for the cloud control plane. So you need to detect threats in the control plane. Again, use the analogy of the control plane as an OS.
Patrick Gray
But I guess that’s what I’m getting at, that’s the new part, right? We’ve said this on the show too, that something like AWS is essentially like a server operating system, now it is like an OS.
Chris Rothe
Yeah, absolutely. So there’s that piece and then there’s, you know, MDR for the cloud instances. Those have some different flavors now with containerization and serverless functions and things like that that aren’t on-prem native. There’s no mindset of how to monitor a serverless function. So those are new things that we have to figure out, and figure out what it means to detect threats to them.
Patrick Gray
How are people handling serverless? Because it’s not super common. I mean, as you know, I’ll just tell the audience I’m traveling at the moment. I’m in Melbourne, I’m staying at a friend’s house. This friend happened to have developed serverless Azure apps some years ago. Had an opportunity to do some development, and it was incredible what he was able to spin up in an incredibly short period of time using serverless. But then you think, okay, well, how do you get insight into what it’s doing?
Like, how are people doing that? Do you have to basically build your own logging in serverless apps, or do the cloud providers basically extract out some generic telemetry for you?
Chris Rothe
I think that to add on to your question, what’s actually relevant? What does it mean to compromise a serverless application? If the thing just spins up in response to an API request, does its job and then shuts down, what is the actual vector there? So you have traditional things like web application attacks, SQL injection, if what your serverless app is doing is serving a web page or whatever. And so the database that probably underlies your serverless application needs to be protected from that standpoint. So any app you’re building serverless needs to implement the same types of safeguards on the front end of it to make sure you’re not vulnerable to those types of attacks.
But in terms of what we think of as a compromise, to compromise something serverless like that, you have to get in through the control plane and inject code. And so that comes back to: what are we monitoring for there? Monitoring for changes to those applications that maybe were made outside of the CI/CD pipeline? They were hand-poked in there, you know, by who or by some API.
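One way to picture "changes made outside of the CI/CD pipeline" is a check like the Python sketch below. It is a rough illustration, not a description of any product. The role marker `ci-deploy`, the function name `changes_outside_pipeline`, and the sample records are hypothetical. Matching on an event name prefix is also an assumption, made because CloudTrail event names for Lambda carry an API version suffix (for example `UpdateFunctionCode20150331v2`).

```python
# Hypothetical: the only principal expected to change function code is a
# deploy role assumed by the CI/CD pipeline.
PIPELINE_ROLE_MARKER = ":assumed-role/ci-deploy/"

# Prefixes of Lambda event names that change what a function runs.
CODE_CHANGE_PREFIXES = ("UpdateFunctionCode", "UpdateFunctionConfiguration")


def changes_outside_pipeline(records):
    """Return (event name, actor ARN) pairs for code changes not made by the pipeline role."""
    suspicious = []
    for record in records:
        name = record.get("eventName", "")
        if not name.startswith(CODE_CHANGE_PREFIXES):
            continue
        arn = record.get("userIdentity", {}).get("arn", "")
        if PIPELINE_ROLE_MARKER not in arn:
            suspicious.append((name, arn))
    return suspicious


# Made-up records: one deploy from the pipeline, one hand-poked change.
SAMPLE_RECORDS = [
    {"eventName": "UpdateFunctionCode20150331v2",
     "userIdentity": {"arn": "arn:aws:sts::123456789012:assumed-role/ci-deploy/build-42"}},
    {"eventName": "UpdateFunctionCode20150331v2",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/mallory"}},
    {"eventName": "Invoke",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/mallory"}},
]

if __name__ == "__main__":
    for name, arn in changes_outside_pipeline(SAMPLE_RECORDS):
        print(f"review: {name} by {arn}")
```

A hit here is only a lead. An engineer applying an emergency fix by hand would trip the same check, which is why a person still has to look at it.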
Patrick Gray
I mean, you keep coming back to the same thing, right? Which makes a lot of sense, which is from an MDR perspective, the one generic info source that you can make best use of is going to be that control plane logging. Right. And that’s something that you can just plug in. Doesn’t matter how diverse the environments are that you’re having to monitor, there’s going to be some stuff that just sticks out like a sore thumb. That’s essentially what you’re saying here, right?
Chris Rothe
Yeah, absolutely. And it’s not the one, it’s not everything, right? As always in security, it’s part of a solution. You also need what we would historically call a CSPM, cloud security posture management (like a Wiz, like a Lacework), to look at those configuration changes and help to highlight vulnerability-type activity. Which is another kind of interesting thing I think about cloud security: the definition of a threat seems to move a little bit more to the left.
Patrick Gray
Yes, absolutely.
Chris Rothe
Vulnerabilities are more like threats. How many vulnerabilities are on your laptop at any given time? Probably lots. So what? They’re not accessible. There’s nothing anyone could do with them. In the cloud, you can’t have that same attitude.
Patrick Gray
About what percentage of your customers have you doing this? Because I’d imagine that, you know, the market has only recently kind of wrapped its head around the idea that MDR can be trusted, right? Like, that’s new. And I’m guessing that this is a small but growing business line for you. Is that about right?
Chris Rothe
Yeah, correct. And it’s really about the profile of the company. A lot of especially cloud-native type companies never had an on-prem infrastructure that they lifted into the cloud. Those are sort of the early adopters in this space. Maybe we were monitoring their corporate environments and they were the ones who were asking us, hey, can we apply some of this similar stuff?
Patrick Gray
Well, the people who did lift and shift would have been the ones who did the network sensors in the cloud and who plugged in the EDR-like telemetry, and they have it feeding back to their SOC or their existing way of doing monitoring. So that’s why I’m curious about who’s embracing this. And it makes sense that it’s the cloud first, you know?
Chris Rothe
Think about the profile of those companies in terms of things to monitor. Let’s pick a company like Red Canary. We’ve got, you know, somewhere between 500-1,000 employees or something like that. We’ve got thousands of machines running in AWS at any given time, scaling up, scaling down databases, data storage pipelines, like all this stuff, all the time.
It’s a much bigger environment than our user population. That’s typical of a SaaS company or a cloud-native company. So those are the early adopters in terms of MDR for cloud. But as more and more people get out of the on-prem infrastructure business, we expect it to grow there.
Patrick Gray
All right. Well, Chris Rothe, thank you so much for joining us on the show to talk through all of that. Let’s see where all this goes.
Chris Rothe
Thanks, Patrick. Appreciate you having me.