One great thing about working in information security is that there is no normal. We routinely encounter strange and novel things, and that’s exactly what turned up after some roving on Pastebin over the last few weeks.
As a bit of context, it’s pretty routine for adversaries to host shells, executables, and other attack infrastructure on Pastebin. If you scrape the site for the right indicators or patterns, then you’re bound to find something. Of course, a lot of the attack tooling on Pastebin is pretty mundane, but sometimes—if you’re willing to examine a long and convoluted sequence of scripts—you can find a persistent Linux backdoor concealing itself with steganography.
Let’s start at the beginning
I ran some Pastebin searches for evidence of a classic shell trick, where adversaries use wget to download and execute a later-stage payload, and they yielded seemingly negligible results:
However, since it would only take a minute to see if the download was available, it made sense to check. Of course, it was available, and what followed was ultimately fascinating.
Here’s what turned up
Take a look at line 12:
While not a completely new idea, it’s certainly unusual to see this use of the Unix dd command. Normally, dd is used to copy and convert data. Security professionals will often encounter it when making forensic disk copies and overwriting data. In this case, it looked like they were using dd with the skip option to extract an executable from an image, suggesting that this wasn’t your everyday bash shell malware.
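To make the dd trick concrete, here is a minimal Python sketch of what skipping into a file amounts to. It assumes the hidden data simply sits after the end of the PNG stream. The original script only shows that a skip offset was used, so that layout, the function names, and the synthetic "image" below are all illustrative assumptions.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# The IEND chunk type followed by its CRC, which is constant because the chunk has no data.
IEND_TRAILER = b"IEND\xaeB`\x82"


def dd_skip(data: bytes, skip: int, block_size: int = 1) -> bytes:
    """Mimic `dd bs=<block_size> skip=<skip>`: drop the first `skip` input blocks."""
    return data[skip * block_size:]


def bytes_after_png_end(data: bytes) -> bytes:
    """Return any bytes that follow the first IEND chunk, or empty bytes if none."""
    if not data.startswith(PNG_SIGNATURE):
        return b""
    end = data.find(IEND_TRAILER)
    if end == -1:
        return b""
    return data[end + len(IEND_TRAILER):]


# Synthetic stand-in for the image: a header-only "PNG" plus a harmless payload.
fake_png = PNG_SIGNATURE + b"\x00\x00\x00\x00" + IEND_TRAILER
payload = b"\x7fELF placeholder, not a real binary"
blob = fake_png + payload

extracted = dd_skip(blob, skip=len(fake_png))  # what the attacker's dd call does
trailing = bytes_after_png_end(blob)           # what a defender could look for
```

The second helper is the defensive mirror image of the first: an image file that carries a large amount of data after its final chunk is worth a closer look.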
Now we follow the breadcrumbs
Luckily, the website hosting the image hadn’t been taken down, and I was able to retrieve it (see below). Better yet, the image was still concealing the executable, and a Google search for the image turned up no matches.
The dd command worked as expected and, upon extraction, revealed an executable program named rcu_bh. The file command showed that the extracted program was an Executable and Linkable Format (ELF) binary, which is basically the Linux equivalent of Windows’ portable executable (PE) binary format.
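As a rough illustration of the kind of check the file command performs here, the sketch below looks at the first few bytes of a header. The four-byte ELF magic and the class byte that follows it are part of the ELF format. The function name and sample headers are made up for the example.

```python
from typing import Optional

ELF_MAGIC = b"\x7fELF"
# The fifth byte of an ELF header (EI_CLASS) says whether the file is 32- or 64-bit.
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}


def describe_elf(header: bytes) -> Optional[str]:
    """Return the ELF class of a header, or None if it is not an ELF file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None
    return ELF_CLASSES.get(header[4], "unknown class")


elf64_header = b"\x7fELF\x02\x01\x01" + b"\x00" * 9  # hand-built sample header
pe_header = b"MZ\x90\x00"                            # start of a Windows PE file

elf64_result = describe_elf(elf64_header)
pe_result = describe_elf(pe_header)
```

In practice you would read the first 16 bytes of the extracted file and pass them in. The real file utility checks far more than this.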
At the time, no one had uploaded the file to VirusTotal. Once it was uploaded, zero out of 59 anti-malware engines marked it as malicious, and as of this writing it has the same score. VirusTotal did note that it was issuing a curl command to Pastebin, but nothing else. The Hybrid-Analysis sandbox system also failed to find anything interesting about the file, determining that it posed “No Specific Threat.”
Beyond sandboxes and malware repositories
At this point, regular text-based analysis was going nowhere, and VirusTotal and Hybrid-Analysis failed to report back the presence of any malicious activity. The logical next step was to carry out a deeper analysis by executing the binary in an isolated and instrumented virtual machine, which ultimately revealed its purpose: to download and execute another script from Pastebin.
This follow-on script executed a command that was identical to the first command shown in this report, only with a different target URL on Pastebin.
What’s going on at this URL?
Let’s break this down a bit because there’s some interesting stuff going on here. First, the script ensures that a /tmp directory is available (see line four), suggesting that the attack requires root-level access, since root privileges are typically required to create a directory in /. Then, starting at line seven, it checks for the presence of an unusual file that wouldn’t normally show up on a Linux system: lsx. If that program is indeed on the system, the attackers run it and leave. If lsx isn’t already installed on the system, the attackers carry out the following actions:
- Try to install git
- Try to install gcc (a C compiler)
- Download another, different image (from the same location as before)
- Extract a program from the image via dd and execute it (same as before)
- Run lsx (which suggests that lsx is supposed to be installed by the program that is run in step four)
- Clean up and exit
But wait, there’s more!
The second image ended up looking identical to the first, and the program it extracted was a 64-bit ELF executable. Yet again, zero antivirus vendors marked it as malicious on VirusTotal, and the behavioral analysis report showed some git-related activity—but nothing significant beyond that. So once more we needed to execute the program in a controlled environment. This time, instead of getting the program to run from Pastebin, or extracting it from an image, it used git to clone a code repository and execute a script from the resulting directory tree. Specifically, it executed the following command:
Well, now we know why the previous script was trying to install git. Here we see init.sh, which goes on to perform a variety of functions, which we’re about to break down.
First, it locates the ls program; then it invokes a new Python script (cpl.py) in that location. Most of the contents of the Python script are dead code or Chinese-language notes, so we only need to examine a few pieces of it:
This routine takes the ls command that was located by the init.sh script and renames it to ls. (note that the period on the end is part of the file’s new name).
It also creates a file in /etc named systs.conf, a perfectly plausible name for a file in /etc, but not a legitimate configuration file. Next, the program writes a dictionary (formatted as JSON) with a number and a date to that file.
And we can see here where cpl.py runs and executes those two routines.
However, there is another bit of init.sh that we need to look at:
Here, the ls command is being replaced by the following shell script: ls.sh. Naturally, we need to see what ls.sh does.
So ls was replaced by a shell script that invokes the original ls (remember: in cpl.py, ls was renamed to ls.). In addition, the script passes to ls. all the arguments that it was originally invoked with ("$@" expands to all the arguments that the script was invoked with). After the script has run the original ls (now ls.), the script then runs some other program named lsx. In other words, the script executes the legitimate copy of ls in the way the user invoking the script will expect the real ls utility to run. The unexpected part is the extra execution of lsx.
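This rename-and-wrap pattern leaves a recognizable footprint: a file such as ls sitting next to a sibling named ls. with a trailing period, where the un-dotted file is a script rather than a binary. The sketch below is one hypothetical way to hunt for that footprint. It assumes the wrapper starts with a `#!` line, which the write-up does not confirm, and the function name and demo directory are invented for illustration.

```python
import os
import tempfile
from typing import List


def find_dot_wrapped(directory: str) -> List[str]:
    """List files that have a '<name>.' sibling and themselves start with '#!'."""
    names = set(os.listdir(directory))
    hits = []
    for name in sorted(names):
        path = os.path.join(directory, name)
        if name.endswith(".") or name + "." not in names or not os.path.isfile(path):
            continue
        with open(path, "rb") as fh:
            if fh.read(2) == b"#!":  # a script where a binary is expected
                hits.append(name)
    return hits


# Demo against a throwaway directory that imitates the tampered layout.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "ls."), "wb") as fh:
        fh.write(b"\x7fELF stand-in for the renamed original")
    with open(os.path.join(demo_dir, "ls"), "wb") as fh:
        fh.write(b"#!/bin/sh\n# stand-in for the wrapper script\n")
    with open(os.path.join(demo_dir, "cat"), "wb") as fh:
        fh.write(b"\x7fELF stand-in for an untouched binary")
    found = find_dot_wrapped(demo_dir)
```

On a real system you would point this at the directories on root’s PATH. A hit is only a lead to investigate, not proof of compromise.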
The plot thickens
The following piece of init.sh is the next important thing we need to examine:
And now we know why gcc was installed earlier. The init.sh script compiles x.c and creates an executable program named lsx. Then lsx is copied to the directory where ls lives, effectively allowing the shell script ls.sh, which is replacing the ls command, to run lsx after running the real ls (which had been renamed to ls.).
Let’s look at the source code for lsx:
lsx invokes a Python program named key, and this brings us back to init.sh for one last important detail:
This merely installs key.py, and some auxiliary Python code, in the system Python library. Since it’s in the syskey directory, the easiest way to refer to key.py is to treat it as a part of the syskey module (e.g. syskey.key, as we saw in the source code for lsx).
This seems like a lot of effort for someone who is simply trying to run key.py every time ls is run.
Making it worth the wait
VirusTotal had not seen key.py before, and zero of 58 antivirus engines are currently alerting on it.
Let’s walk through key.py in reorganized pieces in order to clearly explain what it’s up to.
Well, now we’re getting an idea of what the /etc/systs.conf file is for. The variable ts ends up getting the date stored as start in that file, represented as the number of seconds since the epoch. time.time() is the current time, represented as seconds since the epoch. In the if block, we can see that if the current date is before the time stored in /etc/systs.conf, then this function simply returns without doing anything. Otherwise, the variable latency_time is set to the value of the delay value stored in /etc/systs.conf and execution continues in this function. This would appear to prevent the program from running before some start date.
In the sample I have, that start date is 2018-01-01, so the program is currently free to run. Besides delaying execution until a start date, this check could also function as a type of kill switch, by specifying a date far in the future.
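The start-date gate described above can be sketched roughly as follows. This is a reconstruction from the description, not the attacker’s code. The key names start and delay come from the text, while the date format, the use of UTC, the delay value, and the function name are assumptions made so the example runs on its own.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def read_latency(conf_text: str, now: float) -> Optional[int]:
    """Return the configured delay, or None if `now` is before the start date."""
    conf = json.loads(conf_text)
    # Assumed format: the date is stored as YYYY-MM-DD and treated as UTC.
    start = datetime.strptime(conf["start"], "%Y-%m-%d").replace(tzinfo=timezone.utc)
    ts = start.timestamp()  # seconds since the epoch, like time.time()
    if now < ts:
        return None  # too early: do nothing
    return conf["delay"]


# Stand-in for the contents of the config file; the delay value is invented.
sample_conf = '{"start": "2018-01-01", "delay": 3600}'

before_start = read_latency(sample_conf, now=1400000000.0)  # a moment in 2014
after_start = read_latency(sample_conf, now=1600000000.0)   # a moment in 2020
```

Setting start to a date far in the future makes the first branch fire every time, which is the kill-switch behavior mentioned above.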
Here’s the other piece of the timing logic:
This routine (is_running) uses the file /tmp/sys.ts to track the last time this program was run. If the program was run less than latency_time seconds ago, it returns True. Otherwise it writes the current time (in seconds since the epoch) to that file and returns False. This routine is used as a check to ensure that key.py isn’t run too often, by causing key.py to abort if it has been run less than latency_time seconds ago.
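Here is a small sketch of that timestamp-file throttle, written from the description rather than copied from the sample. The function name and the True/False behavior match the text. The explicit parameters, the error handling, and the temporary file used in place of /tmp/sys.ts are assumptions to keep the example self-contained.

```python
import os
import tempfile


def is_running(stamp_path: str, latency_time: float, now: float) -> bool:
    """Return True if the last recorded run was under `latency_time` seconds ago."""
    try:
        with open(stamp_path) as fh:
            last_run = float(fh.read().strip())
    except (OSError, ValueError):
        last_run = None  # no usable timestamp yet
    if last_run is not None and now - last_run < latency_time:
        return True
    # Otherwise record this run and let the caller proceed.
    with open(stamp_path, "w") as fh:
        fh.write(str(now))
    return False


with tempfile.TemporaryDirectory() as demo_dir:
    stamp = os.path.join(demo_dir, "sys.ts")  # stand-in for /tmp/sys.ts
    first = is_running(stamp, latency_time=60, now=1000.0)   # no file yet
    second = is_running(stamp, latency_time=60, now=1030.0)  # 30 seconds later
    third = is_running(stamp, latency_time=60, now=1100.0)   # 100 seconds later
```

A side effect worth noting for defenders: the timestamp file itself is an indicator, since it is rewritten whenever the throttle lets a run through.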
As you can see, there’s more going on in key.py.
This is interesting because there is an SSH public authentication key saved in a variable.
And in the above image you can see that the script adds the authentication key to the root account’s authorized_keys file and ensures that SSH is running. At this point, anybody with knowledge of the corresponding SSH private key can log in to the root account over SSH.
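From the defender’s side, the obvious follow-up is to compare root’s authorized_keys against the keys you expect to be there. The sketch below is a simplified, hypothetical audit helper. It ignores edge cases such as quoted options that contain spaces, and the sample keys are placeholders rather than real key material.

```python
from typing import Iterable, List

# Prefixes that introduce the key-type field in an authorized_keys entry.
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-sha2-", "sk-")


def key_blobs(authorized_keys_text: str) -> List[str]:
    """Extract the base64 key field from each non-comment line."""
    blobs = []
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        for i, token in enumerate(tokens[:-1]):
            if token.startswith(KEY_TYPE_PREFIXES):
                blobs.append(tokens[i + 1])  # the key follows its type
                break
    return blobs


def unexpected_keys(authorized_keys_text: str, allowed: Iterable[str]) -> List[str]:
    """Return key blobs that are present in the file but not in the allow list."""
    allowed_set = set(allowed)
    return [blob for blob in key_blobs(authorized_keys_text) if blob not in allowed_set]


sample_file = (
    "# admin laptop\n"
    "ssh-ed25519 AAAAknownPlaceholder admin@example\n"
    "ssh-rsa AAAAunknownPlaceholder\n"
)
suspicious = unexpected_keys(sample_file, allowed=["AAAAknownPlaceholder"])
```

Because the backdoor reinstalls its key whenever ls is run, removing the key without also removing the wrapper would only be a temporary fix.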
Finally, it’s clear what all this effort was for: to provide persistent, root-level access. Every time root on this computer runs the ‘ls’ command, the program ‘key.py’ will be run by root and reinstall this backdoor to the root account.
But wait again… there’s still more!
After decoding and decompressing the string in url, the variable u contains the following string:
The platform.platform() call returns a string identifying the operating system and version that is running.
This all means that after u.strip('www.qq.com') is run, this program will connect to a web server whose name is hidden by being compressed, base64-encoded, and embedded within distractor hostnames. When it connects to the web server, in addition to revealing its IP address simply by connecting, it will upload details about what operating system is running (on the computer that just had a root backdoor installed on it) to the server.
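It helps to see why the strip call peels the distractors away. Python’s str.strip with an argument removes any leading and trailing characters found in that set; it does not remove a substring. The round trip below illustrates the idea under several assumptions: zlib as the compression scheme, a distractor placed on each side of the real value, and an invented placeholder URL. None of these are taken from the sample.

```python
import base64
import zlib

DISTRACTOR = "www.qq.com"
# Hypothetical placeholder; the real address is deliberately not reproduced.
HIDDEN = "http://collector.example.invalid/p"


def wrap(target: str) -> str:
    """Surround the target with distractors, then compress and base64-encode it."""
    padded = DISTRACTOR + target + DISTRACTOR
    return base64.b64encode(zlib.compress(padded.encode())).decode()


def unwrap(blob: str) -> str:
    """Reverse wrap(): decode, decompress, then strip the distractor characters."""
    u = zlib.decompress(base64.b64decode(blob)).decode()
    return u.strip(DISTRACTOR)


recovered = unwrap(wrap(HIDDEN))

# strip() works on a character set, so it can eat more than the distractor:
over_stripped = (DISTRACTOR + "example.com" + DISTRACTOR).strip(DISTRACTOR)
```

Here recovered equals the placeholder URL, while over_stripped comes out as "example" because the characters in ".com" are all in the strip set. The trick only works cleanly when the hidden value begins and ends with characters that are not in the distractor.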
Tying up the loose ends
check() does the bulk of the work for key.py. When check() exits, one of the following has happened. Either the backdoor is in place:
- authorized_keys for the root account has the key installed, which gives the attacker root access via SSH
- SSH is running
- A connection has been made to the attacker’s website, notifying them that this machine has the backdoor installed and what operating system the machine is running
Or the latency_time check might have failed, meaning that the machine is already compromised, and it’s too soon to run this program again. It’s also possible that the current date is before the start time, which would prevent the program from running altogether.
It’s easy to get lost in this admittedly convoluted maze of scripts, so here’s the high-level version:
- An already compromised machine downloads and executes a one-line script from Pastebin
- That one-line script downloads and executes a more complex script from Pastebin
- The more complex script downloads a .png file from a Chinese host, extracts a hidden executable (rcu_bh), and runs it
- rcu_bh downloads and executes yet another one-line script from Pastebin
- Again, the one-line script downloads and executes a more complex script from Pastebin
- The more complex script downloads a different .png, extracts another hidden executable (rcu_gp), and runs it
- rcu_gp performs a git clone from a Chinese site, effectively downloading a directory tree, and executes a bash script (init.sh) in that downloaded directory tree
- The bash script then coordinates the replacement of the standard Unix ls command with a script that will run the original ls command, before running a malicious Python script named key.py. It also installs a file in /etc, which controls when and how often this malware will run.
- The malicious Python script (key.py) modifies the root account’s authorized_keys file, effectively providing a backdoor to the root account via SSH. It also contacts a remote server, notifying it that the backdoor has been successfully installed on the local machine, and providing information about the operating system it’s running.
Attempts to connect to the web server in the post_info() routine returned an HTTP “500 Internal Server Error.” Further, while running this program, there weren’t any subsequent SSH connections trying to access the root account. This could mean that the server has already been taken down, or perhaps it has not entered production yet.
This tooling is all about establishing persistence on an already compromised machine. It requires root access to run. Once an attacker has root, downloading and running the initial script is trivial. The names of the two executable programs, rcu_bh and rcu_gp, are the same as standard parts of the Linux operating system, which is pretty clearly an attempt at defense evasion.
I’ve obscured many of the details here since all the links are still live and dangerous. It appears likely that the shc shell script compiler was used to generate the two executables. This conclusion is based on some of the dead code in the scripts, files found in the downloaded git repository, and the behavior of the two executables.