A few days ago, a case came in which had some rather odd symptoms, such as
processes using high amounts of CPU and memory, and running from the /tmp
directory.
After asking for some logs, and some samples of the binaries, it became obvious that the system was compromised, and was now running some interesting malware.
In this post, we are going to look into the malware called dovecat, which turned out to be a cryptominer, and hy4, which is an IRC botnet malware dropper.
I’m pretty excited, as I haven’t analysed any Linux malware before, and this is real life stuff pulled directly from a production machine, so it still has its fangs intact.
Let’s get started.
Problem Description
This case caught my eye as soon as I saw it in the queue. The description mentions that a process called dovecat was using a large amount of CPU time and most of the system’s memory, and was causing the machine to run slowly.
dovecat did not seem to match any service the system was running, and there were files in the /tmp directory owned by the service account running the dovecat process. It all looked rather suspicious, and a case was filed.
Now, the description alone raises a bunch of red flags. Is the dovecat executable itself in /tmp? Are the files in /tmp configuration, or more malware? Legitimate programs do not place files in /tmp for anything other than temporary storage. Malware favours /tmp because any user has the ability to write there.
We needed more information, so we asked for a sosreport. The logs were extremely interesting. The system itself is Ubuntu 18.04, but it is massively out of date. It looks like it hasn’t been patched in 1 - 2 years. Here’s what I found:
Firstly, looking at ps aux, we can see that dovecat is indeed running from /tmp, as the system daemon user:
daemon 100394 397 29.4 2894488 2402584 ? Sl 05:34 735:24 /tmp/dovecat
The kernel logs showed that dovecat was segfaulting occasionally:
kernel: [2394416.671219] dovecat[46657]: segfault at 63 ip 00007f2be096b448 sp 00007f2be2393490 error 4 in libnss_files-2.27.so[7f2be0968000+b000]
kernel: [2424348.437406] dovecat[53028]: segfault at 63 ip 00007f45e1b60448 sp 00007f45e3588490 error 4 in libnss_files-2.27.so[7f45e1b5d000+b000]
kernel: [2431562.775108] dovecat[54622]: segfault at 63 ip 00007feec3df1448 sp 00007feec9831490 error 4 in libnss_files-2.27.so[7feec3dee000+b000]
kernel: [2467413.285152] dovecat[62803]: segfault at 63 ip 00007f803f8be448 sp 00007f80412e6490 error 4 in libnss_files-2.27.so[7f803f8bb000+b000]
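As a quick sanity check on those kernel lines, the sketch below parses each segfault report and subtracts the library's load address from the instruction pointer. The function name and regular expression are my own and only cover the x86 log format shown above. On x86, error 4 means a user-mode read of an unmapped page, and a fault address of 0x63 looks like a near-null pointer dereference.

```python
import re

# Matches the x86 kernel segfault report format seen in the log lines above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return the interesting fields of one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "error_code": int(m["err"]),
        "library": m["lib"],
        # Offset of the faulting instruction inside the mapped library.
        "offset": ip - base,
    }


if __name__ == "__main__":
    line = (
        "kernel: [2394416.671219] dovecat[46657]: segfault at 63 "
        "ip 00007f2be096b448 sp 00007f2be2393490 error 4 "
        "in libnss_files-2.27.so[7f2be0968000+b000]"
    )
    info = parse_segfault(line)
    print(info["library"], hex(info["offset"]))
```

Running this over all four lines gives the same offset each time, so dovecat is crashing at the same instruction in libnss_files on every occasion rather than failing at random.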
syslog also showed some strange and alarming cronjobs running with odd names:
CRON[105618]: (daemon) CMD (/var/lock/bash7 > /dev/null 2>&1 &^M)
CRON[105617]: (CRON) info (No MTA installed, discarding output)
CRON[105627]: (daemon) CMD (/var/tmp/sh7 > /dev/null 2>&1 &^M)
CRON[105625]: (CRON) info (No MTA installed, discarding output)
CRON[105628]: (daemon) CMD (/tmp/bash7 > /dev/null 2>&1 &^M)
CRON[105626]: (CRON) info (No MTA installed, discarding output)
CRON[105712]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
CRON[105753]: (daemon) CMD (/var/tmp/sh7 > /dev/null 2>&1 &^M)
CRON[105751]: (CRON) info (No MTA installed, discarding output)
CRON[105754]: (daemon) CMD (/dev/shm/bash7 > /dev/null 2>&1 &^M)
CRON[105758]: (daemon) CMD (/tmp/bash7 > /dev/null 2>&1 &^M)
CRON[105749]: (CRON) info (No MTA installed, discarding output)
CRON[105756]: (daemon) CMD (/var/lock/bash7 > /dev/null 2>&1 &^M)
CRON[105757]: (daemon) CMD (/tmp/init7 > /dev/null 2>&1 &^M)
CRON[105748]: (CRON) info (No MTA installed, discarding output)
CRON[105752]: (CRON) info (No MTA installed, discarding output)
CRON[105750]: (CRON) info (No MTA installed, discarding output)
Where do I even begin?
dovecat was indeed running directly from /tmp as /tmp/dovecat. The binary itself segfaulting in libnss_files-2.27.so means that dovecat was either poorly written, or that it was trying to link to a system library it was not compiled for, or if it was statically linked, something went wrong in the linker stage.
The cronjobs are particularly alarming, since there are multiple executables, all located in world writable places, such as /tmp, /var/lock, /var/tmp and /dev/shm, and all use the same discard to /dev/null string: > /dev/null 2>&1 &^M. These executables obviously want to hide their output to evade detection, and are placed throughout the disk to gain redundant persistence.
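To triage a larger syslog for the same pattern, something like the following could be used. It is only a sketch: the regular expression is written against the CRON lines shown above, and the directory list is simply the four world writable locations already mentioned.

```python
import re
from collections import Counter

# World writable locations seen in the cron entries above.
WORLD_WRITABLE = ("/tmp/", "/var/tmp/", "/var/lock/", "/dev/shm/")

# Matches lines such as: CRON[105618]: (daemon) CMD (/var/lock/bash7 > /dev/null 2>&1 &^M)
CRON_CMD_RE = re.compile(r"CRON\[\d+\]: \((?P<user>[^)]+)\) CMD \((?P<cmd>.*)\)\s*$")


def suspicious_cron_commands(lines):
    """Count cron jobs whose program lives in a world writable directory."""
    hits = Counter()
    for line in lines:
        m = CRON_CMD_RE.search(line)
        if m is None:
            continue
        parts = m["cmd"].split()
        if not parts:
            continue
        program = parts[0]
        if program.startswith(WORLD_WRITABLE):
            hits[(m["user"], program)] += 1
    return hits


if __name__ == "__main__":
    sample = [
        "CRON[105618]: (daemon) CMD (/var/lock/bash7 > /dev/null 2>&1 &^M)",
        "CRON[105712]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
    ]
    for (user, program), count in suspicious_cron_commands(sample).items():
        print(user, program, count)
```

The legitimate debian-sa1 job is left alone, while every job launched from a world writable path is counted per user and program.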
At this point, I asked for samples to be collected for the following files:
/tmp/dovecat
/var/lock/bash7
/var/tmp/sh7
/tmp/bash7
/dev/shm/bash7
/tmp/init7
They were collected and uploaded to the case, so let’s start doing some in-depth analysis, shall we?
Basic Information on the Collected Samples
If you are wanting to follow along at home, you can find the samples analysed by searching for their SHA256 hash on Google or VirusTotal. I don’t really want to host live malware on my blog, so I won’t offer the samples as a download.
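If you have a sample of your own and want to compare it against the hashes below, a minimal way to compute the SHA256 of a file is sketched here. The function name and chunk size are my own choices.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(sha256_of_file(name), name)
```

Reading in chunks keeps memory use flat, which matters little for these samples but is a good habit when hashing files of unknown size.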
Alright, let’s have a look at what we have here.
dovecat
SHA256 10c0ed6e8223e4c18475c39beec579911bb18d5e64bf33d2de051c9c59138a08
$ file dovecat
dovecat: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.32, BuildID[sha1]=5abe6768b29bdf70910880c44f79c991682b439f, stripped
Okay, nothing too surprising here. Statically linked executable built for 64 bit Linux. Let’s check VirusTotal for the hash:
It seems we have a match, and only very recently too. Currently 29 / 61 virus scanning engines detect the binary as malicious, and interestingly, it was first submitted on 2020-10-09 23:23:39, which suggests that this executable was compiled within the last month or so.
The engines seem to class this as some sort of cryptocurrency miner, so we will need to dig into this a bit further.
This is one big executable, at 7 MB. We have 6416 functions, which is a lot, although this is statically linked, so that count includes the various libraries which have been linked into the base executable.
What is interesting is the compiler: GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609. It seems the attacker compiled on Ubuntu 16.04, using the gcc-5 package at the latest version.
bash7 / init7 / sh7
These files are interesting. Of the samples that were collected:
/var/lock/bash7
/var/tmp/sh7
/tmp/bash7
/dev/shm/bash7
/tmp/init7
they all have the same hash, and are the same executable. I did a quick check, and it seems they are packed with UPX:
$ strings init7 | grep UPX
UPX!
$Info: This file is packed with the UPX executable packer http://upx.sf.net $
$Id: UPX 3.94 Copyright (C) 1996-2017 the UPX Team. All Rights Reserved.
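The same check can be scripted when strings and grep are not to hand. This is a rough sketch that only approximates strings(1): it looks for runs of printable ASCII, and the helper names are my own.

```python
import re


def printable_strings(data, min_len=4):
    """Yield runs of printable ASCII at least min_len bytes long, roughly like strings(1)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def find_strings(data, keyword):
    """Return every printable string in data that contains keyword."""
    return [s for s in printable_strings(data) if keyword in s]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for hit in find_strings(handle.read(), "UPX"):
            print(hit)
```

A hit on "UPX!" is only a hint, since the marker is trivial to remove or fake, which is why actually unpacking the file is the better confirmation.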
I installed UPX, and found they unpack with no problems. The attacker seems to be using an unmodified version of UPX.
$ upx -d init7
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2017
UPX 3.94 Markus Oberhumer, Laszlo Molnar & John Reiser May 12th 2017
File size Ratio Format Name
-------------------- ------ ----------- -----------
73227 <- 36948 50.46% linux/i386 init7
Unpacked 1 file.
Alright, now the basic stats:
SHA256 f9c3165b9634b8f0ee139905b32e396ab10b30b74a05f4f705b18e841302555
SHA256 (unpacked) 22f1c7056beb9be8acf2ca5b4185ebe422b5566af7b36052b85d35686e38b456
$ file init7
init7: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, not stripped
Not stripped? Now that’s interesting. Let’s check VirusTotal.
Interesting again, only 6 / 61 virus engines detect this as malware. It seems very new as well, with the first submission only being 4 days ago: 2020-10-22 08:40:27.
This malware has something to hide, that’s for sure. We are going to need to look deeper into this one as well.
This binary is much smaller, at 72 KB. There are still a lot of functions, 241 of them, but most are going to be library functions that have been statically linked. The compiler is a bit older, and doesn’t seem to be an Ubuntu provided one.
Advanced Static Analysis
Time to have a look into these executables from an assembly language perspective, and see if we can determine exactly what these binaries do.
Today I’ll be using radare2-cutter and Ghidra, both at the latest upstream version from their respective websites.
dovecat
The entrypoint to dovecat isn’t interesting; it seems to jump around and set up various statically linked libraries. I skipped ahead to main():
We seem to check some magic numbers, and if the check fails, we exit, otherwise we enter an infinite loop that calls three functions over and over.
Those three functions themselves aren’t interesting either. Looks like we will have to go hunting for some strings and do some x-refs to see what is going on.
With 101933 strings to go through, this is going to be tough. We might have to search. Since VirusTotal seems to think this is a cryptocurrency miner, let’s try things like “bitcoin”, “coin”, “mine”.
“bitcoin” came up empty. “coin” wasn’t useful either. “mine” was very, very useful, since it came up with this string:
{"autosave": true,
"donate-level": 0,
"cpu": true,
"opencl": false,
"cuda": false,
"pools":
[
{
"url": "pool.minexmr.com:443",
"user": "46bHvv8wD6B2PF3aiNoWq2K89GiT5QXpFYg2dP898PRwasqWYSEHzNjVznCPCDpoNa7N8QPJD94P4jK4pWKoRixB5zR3TnQ",
"rig-id": "w1",
"keepalive": true,
"tls": true
}
]
}
This seems to be some sort of configuration for this binary. It has CPU mining enabled, but opencl and cuda disabled. Weird, normally you would want to take advantage of a GPU if the system had one.
It also shows it is a member of the mining pool pool.minexmr.com:443, and supplies a user hash 46bHvv8wD6B2PF3aiNoWq2K89GiT5QXpFYg2dP898PRwasqWYSEHzNjVznCPCDpoNa7N8QPJD94P4jK4pWKoRixB5zR3TnQ.
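Once a configuration blob like this has been carved out of a binary, pulling the indicators out of it is straightforward. The sketch below assumes the blob is valid JSON with the same keys as the string above; the function name and the shape of the returned summary are my own.

```python
import json


def summarise_miner_config(text):
    """Extract the enabled backends and pool details from a JSON miner config."""
    config = json.loads(text)
    # Only list the backends whose value is truthy in the config.
    backends = [name for name in ("cpu", "opencl", "cuda") if config.get(name)]
    pools = [
        {
            "url": pool.get("url"),
            "user": pool.get("user"),
            "rig_id": pool.get("rig-id"),
            "tls": pool.get("tls"),
        }
        for pool in config.get("pools", [])
    ]
    return {"backends": backends, "pools": pools}


if __name__ == "__main__":
    import sys

    summary = summarise_miner_config(sys.stdin.read())
    print("backends:", ", ".join(summary["backends"]))
    for pool in summary["pools"]:
        print("pool:", pool["url"], "rig:", pool["rig_id"], "tls:", pool["tls"])
```

The pool URL and wallet are the two values worth feeding into firewall rules and threat intelligence searches.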
Let’s go to the mining pool website, and see if we can get some information about the user hash we have here.
Well, well, well, what have we stumbled upon.
It seems this user hash is a wallet address for the Monero cryptocurrency. Monero is one of those privacy coins with a hidden ledger. You can’t see the balance of a particular wallet. Kind of frustrating for detectives, you know?
Anyway, it seems the attacker is pulling a hashrate of 161 kH/s, over 3 “workers”. At the time of writing, they have pocketed 1.861194 XMR for their efforts, which is about $248 USD, $371 NZD or €210.
The hashrate seems to be going upward, but it goes up and down, probably as machines are infected, start mining, get discovered by their owners, and then offlined.
There seem to be 3 “workers”, although I think multiple machines are identifying themselves as a single “worker”. The configuration string we saw had "rig-id": "w1" set, which means the system was probably in the w1 worker.
Alright, we have now established that this malware is likely a Monero (XMR) cryptocurrency miner. Now we need to try and see if this program is hiding any other secrets, or if it is just an off the shelf miner.
Back to string searching in the binary, it seems we have found a man page, or the documentation for the program:
These strings indicate that this is a copy of XMRig 6.3.3, which is free and open source Monero mining software. Its upstream code repository is:
https://github.com/xmrig/xmrig
Having a further look at the binary, it looks like the attacker just cloned the repo, hard coded their configuration in, statically compiled a binary, and named it dovecat to try to make it blend into a system, so people would think it’s just dovecot, which is a mail daemon.
I don’t think we need to look at any more assembly for this executable. It is too large, and beyond the mining it is very likely going to be benign. We can always catch bad behaviour during dynamic analysis.
bash7 / init7 / sh7 aka hy4
Time to dive into the next malware sample, bash7 / init7 / sh7. This one is small enough that we should be able to cover most of its functions.
Now, what I find striking about this sample, is that it isn’t stripped. This sample has its debug symbols intact. Why? Did the attacker forget to strip the binary before pushing it to the world? Or is it intentional? Who knows.
But we are exceptionally lucky. Now we can get some serious insight into this binary.
Ghidra shows us a list of files which this executable was compiled from. There are 190 different files in total, a few of them are below:
The only one that stood out was “hy4.c”. It doesn’t seem to be a part of any standard library, and searches return no results. I suppose we will call this malware hy4 from now on.
Since we can see a list of all functions this malware calls, it shouldn’t be too hard determining what it does.
Let’s jump to main() and have a look:
The control flow graph itself isn’t too bad. We seem to have a large initialisation stage, followed by some blocks at the bottom which seem to be infinite loops that are swapped between.
The first thing that hy4 does is call rand_init(), daemonize() and bindport(). Let’s see what these do.
rand_init() seems to set x to the time, y to the xor of the process id and parent process id, and z to the clock. w seems to be the xor of the clock and the time.
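Four 32-bit words named x, y, z and w are the classic state layout of Marsaglia's xorshift128 generator, so here is a sketch of what the seeding might look like. The seeding follows the description above. The xorshift128 step is an assumption on my part, since I did not confirm which generator hy4 actually uses.

```python
import os
import time

MASK32 = 0xFFFFFFFF


def seed_state(now, pid, ppid, ticks):
    """Build the four state words as rand_init() appears to."""
    x = now & MASK32            # x: the time
    y = (pid ^ ppid) & MASK32   # y: process id xor parent process id
    z = ticks & MASK32          # z: the clock
    w = (ticks ^ now) & MASK32  # w: clock xor time
    return (x, y, z, w)


def xorshift128_next(state):
    """One step of Marsaglia's xorshift128 (assumed, not confirmed for hy4)."""
    x, y, z, w = state
    t = (x ^ (x << 11)) & MASK32
    w_new = (w ^ (w >> 19) ^ t ^ (t >> 8)) & MASK32
    return w_new, (y, z, w, w_new)


if __name__ == "__main__":
    # process_time() is only a stand-in for the C clock() call.
    ticks = int(time.process_time() * 1_000_000)
    state = seed_state(int(time.time()), os.getpid(), os.getppid(), ticks)
    value, state = xorshift128_next(state)
    print(hex(value))
```

Every input to the seed is easy to guess (wall clock time, small process ids, a clock value near zero at startup), so the "random" choices this malware makes are far from unpredictable.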
daemonize() seems to see if the process is a child, and if it isn’t, then it forks. It checks to see if fork() fails, and if it does then it exits. The parent also exits, so only the child remains running.
It then redirects the program’s file descriptors for stdin and stdout to /dev/null, and changes the signal handler for the following signals:
0x11 - SIGCHLD
0x14 - SIGTSTP
0x16 - SIGTTOU
0x15 - SIGTTIN
1 - SIGHUP
0xf - SIGTERM
The new signal handler is 0x1, which is SIG_IGN, so these signals are simply ignored. Looking at the signals changed, it seems the attacker really doesn’t want this malware to be killed or interrupted.
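Decoding raw signal numbers by hand gets tedious, so here is a small sketch that maps the values from the decompiler to names and shows what installing a handler of 1 amounts to. Signal numbers are platform specific; the names only line up with the list above on Linux x86, and the helper names are my own.

```python
import signal

# Signal numbers as they appear in the decompiled daemonize().
IGNORED_NUMBERS = [0x11, 0x14, 0x16, 0x15, 0x01, 0x0F]


def signal_name(number):
    """Map a raw signal number to its name on the current platform."""
    return signal.Signals(number).name


def ignore_signals(numbers):
    """Install SIG_IGN for each signal, as daemonize() appears to do."""
    for number in numbers:
        signal.signal(number, signal.SIG_IGN)


if __name__ == "__main__":
    # A handler value of 1 is SIG_IGN, not a boolean.
    print("SIG_IGN =", int(signal.SIG_IGN))
    for number in IGNORED_NUMBERS:
        print(hex(number), signal_name(number))
```

Note that SIGKILL cannot be ignored, so kill -9 still works on the process; the cronjobs are what bring it back afterwards.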
bindport() seems to create a socket, and bind it. To see what port, we need to look at what gets bound, which is &local_18 of type sockaddr. The compiler has laid the structure out across several locals, so:
struct sockaddr {
    sa_family_t sa_family;
    char sa_data[14];
};
sa_family is 2 (AF_INET), as per &local_18. sa_data is derived from local_14 and local_16.
We then start listening on the port.
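With sa_family set to 2, the bytes in sa_data follow the IPv4 sockaddr_in layout: a 2-byte port followed by a 4-byte address, both in network byte order. The sketch below decodes such a structure. The port 4444 in the usage example is a made-up value for illustration, not the port hy4 binds.

```python
import socket
import struct


def decode_sockaddr_in(raw):
    """Decode the first 8 bytes of a sockaddr_in taken from a little-endian Linux binary."""
    (family,) = struct.unpack_from("<H", raw, 0)  # sin_family, host byte order
    (port,) = struct.unpack_from("!H", raw, 2)    # sin_port, network byte order
    address = socket.inet_ntoa(raw[4:8])          # sin_addr, network byte order
    return family, port, address


def port_from_decompiled(value):
    """Convert a 16-bit value, as a little-endian decompiler displays it, into the real port."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")


if __name__ == "__main__":
    # Hypothetical example: AF_INET, port 4444, address 0.0.0.0.
    raw = struct.pack("<H", 2) + struct.pack("!H", 4444) + socket.inet_aton("0.0.0.0") + bytes(8)
    print(decode_sockaddr_in(raw))
    print(port_from_decompiled(0x5C11))
```

The byte swap is the easy thing to miss: a decompiler on x86 shows the port field as a little-endian constant, so the number on screen is not the port until it has been swapped.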
What happens next is kinda weird. hy4 checks to see if /share/CACHEDEV1_DATA/Web exists. If it does, we enter the if statement:
It then executes some shell commands using system(). The first tries to mount a bunch of devices in a brute force fashion to /tmp/config, with the below command:
mount $(/sbin/hal_app --get_boot_pd port_id=0)6 /tmp/config ;
mount -t ext2 /dev/mtdblock4 /tmp/config ;
mount -t ext2 /dev/mtdblock5 /tmp/config ;
mount -t ext2 /dev/sdx6 /tmp/config ;
mount -t ext2 /dev/sdc6 /tmp/config
If any of these succeed, then it runs a command to make an autorun file that is a shell script:
echo \"#!/bin/sh\n%s\" > /tmp/config/autorun.sh ;
chmod +x /tmp/config/autorun.sh
The script seems empty for now. What is this /share/CACHEDEV1_DATA/Web directory? Is it from some sort of vulnerable internet of things device? I googled it and it seems to be for QNAP devices. QNAP seems to manufacture NAS devices, video cameras and the like. Typical internet of things devices.
Moving on.
The code then attempts to access a bunch of directories to see if they are writable.
These directories look familiar…
/dev/shm/
/var/tmp/
/tmp/
/var/lock/
/var/run/
If they are writable, they get added to some sort of list. It then goes and opens a few crontabs, and does some greps.
"(crontab -l | grep -v \"/%s\" | grep -v \"/sh7\" | grep -v \"/init7\" | grep -v \"/bash7\" | grep -v \"no cron\" > %s) > /dev/null 2>&1"
Hmm. Is it checking to see if the crontab is already infected? I think it is.
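Read literally, the pipeline writes out every crontab line that does not mention one of the bot's names, which would fit either an infection check or a clean-up before fresh entries are installed. A sketch of the same filter is below. I am assuming the %s placeholder is filled with the name of the running binary, and I use plain substring matching where grep would treat the pattern as a regular expression.

```python
def strip_bot_entries(crontab_text, own_name):
    """Keep only the crontab lines that the grep -v chain would let through."""
    markers = ("/" + own_name, "/sh7", "/init7", "/bash7", "no cron")
    kept = [
        line
        for line in crontab_text.splitlines()
        if not any(marker in line for marker in markers)
    ]
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    # The backup job is a hypothetical legitimate entry for illustration.
    crontab = (
        "*/10 * * * * /var/tmp/bash7 > /dev/null 2>&1 &\n"
        "0 5 * * * /usr/local/bin/backup\n"
        "*/1 * * * * /dev/shm/sh7 > /dev/null 2>&1 &\n"
    )
    print(strip_bot_entries(crontab, "init7"), end="")
```

The same filter is handy for cleaning up: anything it removes from a user's crontab on a suspect machine deserves a closer look.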
If the system is not already infected, it calls injectbot() on the following directories:
$PWD
/dev/shm/
/var/tmp/
/tmp/
/var/lock/
/var/run/
Let’s look at injectbot():
It seems to have “init7”, “bash7” and “sh7” hard-coded, and selects between them randomly depending on gettimeofday() and a random chance. From there it calls malloc() to get a buffer, reads a copy of the running executable into it, and writes it to the new path with the randomly chosen name.
Since this happens a bunch of times, we end up with all the duplicate copies.
Once these have been run, a new cronjob is installed in the system, in this case at /var/spool/cron/crontabs/daemon.
If we look at the sosreport from the infected system, we see:
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (/var/lock/.hh21804289383 installed on Thu Oct 22 12:54:01 2020)
# (Cron version -- $Id: crontab.c,v 2.13 1994/01/17 03:20:37 vixie Exp $)
*/10 * * * * /var/tmp/bash7 > /dev/null 2>&1 &
*/2 * * * * /var/lock/init7 > /dev/null 2>&1 &
*/1 * * * * /dev/shm/sh7 > /dev/null 2>&1 &
*/10 * * * * /tmp/init7 > /dev/null 2>&1 &
We now fully understand how this malware gains persistence (cronjobs and redundant binaries), and prevents itself from being terminated (forking to daemon, re-registering signal handlers).
Now things start getting more interesting. We have reached the end of the large initialisation section, and have now entered the loops, of what seems to be IRC server communication.
We make some random numbers, call makestring(), which in turn makes a string out of the hostname or uname with some random characters added to the end:
From there, the result of makestring() becomes the system’s IRC nick. It connects to channel #XLM with pass 321:
After that, hy4 calls con(), which seems to have functionality to swap between different IRC servers. What it seems to do on the first try is connect to 5.253.84.148, using the nick, channel and pass from before, and send the string "NICK %s\nUSER K localhost localhost :2010\n".
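For anyone writing network detection rules, it helps to see exactly what goes over the wire at this point. The sketch below builds a nick the way makestring() appears to, then formats the registration string quoted above. The suffix length and alphabet are guesses on my part; only the format string comes from the binary.

```python
import random
import string


def make_nick(hostname, rng, suffix_len=4):
    """Append random characters to the hostname (suffix length and alphabet are assumed)."""
    suffix = "".join(rng.choice(string.ascii_uppercase) for _ in range(suffix_len))
    return hostname + suffix


def registration_message(nick):
    """Format the registration string hy4 sends after connecting."""
    return "NICK %s\nUSER K localhost localhost :2010\n" % nick


if __name__ == "__main__":
    nick = make_nick("webserver", random.Random())
    print(registration_message(nick), end="")
```

The fixed "USER K localhost localhost :2010" portion is constant across infections, which makes it a reasonable candidate for an IDS signature.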
After that, two main things happen:
The first is that hy4 calls recv() to receive some data, and then calls strtok() to parse it:
There isn’t any indication of what the commands we are parsing are though.
We stay in this loop forever, so hy4 always waits for instructions, then goes to execute them.
See that call ecx on the far right? It seems we load the address of a function into ecx and execute it. I’m not sure what function though.
Let’s have a look for other functions to see what functionality the IRC commands might call.
376() seems to be how hy4 joins an IRC server, and is pretty explicit:
433() seems to rotate the IRC nick.
_NICK() seems to check for a specific IRC nick.
ping() just seems to reply on IRC with “pong”.
cback() turned out to be extremely interesting. It appears to fork off a new process, which makes a socket, and connects to a remote host on a specific port and IP.
This is your classic reverse shell. It takes two parameters, “IP” and “PORT”, and if you pass any more, you get an IRC error message with "NOTICE %s :CBACK <ip> <port>\n".
When you connect to the reverse shell, you see the strings:
"NOTICE %s :Connected.\n"
"echo [-] logged at `date`"
"echo [-] `uname -a || cat /proc/version`"
If you are lucky enough, it will even check for gid 0, and print “root shell!” if you happen to be root:
It then calls execve("/bin/sh"), and a shell is spawned for the remote attacker. stdin and stdout are redirected to the socket, via the calls to dup2.
There seem to be some steps taken to prevent any commands from this shell from being logged. It also exports a normal $PATH.
I went and tracked down all the strings from the hy4 section, and found:
It seems the commands are just: CBACK, IRC, NOTICE, MODE, JOIN, PONG, PRIVMSG, PING, NICK.
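The strtok() loop and the call ecx fit the usual pattern for this kind of bot: split the incoming line into tokens, look the command name up in a table of function pointers, and call whatever is found. Here is a sketch of that pattern with only the harmless PING handler filled in. The table contents and function names are my own, and I am assuming the lookup is by command name.

```python
def handle_ping(send, args):
    """Answer a PING with a PONG carrying the same arguments."""
    send("PONG %s\n" % " ".join(args))


# Command name to handler, standing in for the function pointer loaded into ecx.
HANDLERS = {
    "PING": handle_ping,
}


def dispatch(line, send):
    """Tokenise one IRC line and call the matching handler, if any."""
    tokens = line.strip().split()
    # IRC lines may begin with a ":prefix" naming the sender; the command follows it.
    if tokens and tokens[0].startswith(":"):
        tokens = tokens[1:]
    if not tokens:
        return False
    handler = HANDLERS.get(tokens[0].upper())
    if handler is None:
        return False
    handler(send, tokens[1:])
    return True


if __name__ == "__main__":
    sent = []
    dispatch("PING :irc.example\n", sent.append)
    print(sent)
```

This also explains why the call target could not be resolved statically: the address in ecx depends on which command string arrived from the server.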
I wonder what this string is:
Playful thoughts indeed.
I think that about wraps up the analysis of hy4. What I didn’t come across was a way for a file to be downloaded and executed automatically, but the functionality could very well be there, and I just didn’t look hard enough.
Executive Summary of Malware Infection
Infection Vector
For this particular system, the initial infection vector is unknown.
My only remarks are:
- The system was out of date, and had not been patched at all in at least 18 months.
- The system was running as a desktop computer, virtualised in the cloud.
Firefox was very old, at version 68. If you run old outdated browsers, along with being out of date on other software, such as the kernel and such, you open yourself up to drive by downloads and arbitrary execution vulnerabilities.
Desktop tasks are exposed to more risks than running a standard production workload, due to web browsing and constantly executing untrusted code in the form of JavaScript. It is important to keep these systems up to date, and not forget about them when they are hidden away as virtualised appliances.
I do not believe that this malware was targeted. Quite the opposite, it seems that this malware was just opportunistic, in the right place at the right time, and was only motivated by the attacker making a quick buck.
hy4 was likely first onto the system, and was likely instructed to download and execute dovecat as a malware dropper payload.
dovecat
dovecat is a cryptocurrency miner built from a freely accessible program called XMRig, at version 6.3.3. It uses CPU and memory resources to process currency transactions for the Monero (XMR) cryptocurrency.
The executable itself is not dangerous. It does not steal data. All it does is consume computing resources for financial gain in the form of Monero.
dovecat can be removed by terminating the process and deleting the executable.
hy4
hy4 is dangerous and should be considered a threat. Due to hy4 connecting to and forming a part of an IRC botnet, and accepting commands remotely, any system found to be infected with hy4 should be considered compromised, and should be removed from production immediately.
Since an attacker has the ability to spawn a root shell, and interact with it remotely, an attacker can explore the compromised system, and can steal data with ease. All credentials on this machine should be revoked, and you should assume an attacker has constant remote access to the compromised machine.
Since hy4 gains deep persistence and is difficult to terminate, I recommend that the system be decommissioned and erased, and installed fresh in order to remove the infection.
Recommendations
I always recommend you keep your system up to date. If possible, patch daily or at least weekly, and it helps if you are running the latest Ubuntu LTS.
If you have a number of machines, you can install a program called unattended-upgrades with $ sudo apt install unattended-upgrades. It will patch the machine on a regular schedule.
If you have a large fleet of machines, then maybe a service like Landscape can be useful. It lets you view your fleet’s update status on a nice web interface, and you can patch your fleet with a few clicks in your web browser.
As always, only trust software from the official Ubuntu software archives. When you download and install software from a website to your machine, you are taking a risk that the software is malicious.
My Thoughts on the Malware and Attribution
I have reverse engineered a fair amount of malware in my time, but this was the first Linux malware I have ever looked into. On the whole it was actually pretty pleasant, due to Cutter and Ghidra being very mature tools. The only thing missing is a good debugger, and I miss being able to use x64dbg, since it’s Windows only.
The malware itself was pretty interesting. hy4 is one interesting specimen. dovecat not so much, since it is a rebuilt open source miner, just hard coded to mine Monero for the attacker.
hy4 is strange at a first glance. Not stripping debugging symbols was a huge mistake on the attacker’s part. It meant that I could read function names in the code just as they were in the source code, and the symbols also helped Ghidra’s decompiler build an accurate source code picture.
hy4 itself is also remarkably simple. It gains persistence, and joins an IRC botnet and awaits external instructions. Its functionality allows to spawn a reverse shell back to the attacker, and very likely carries functionality to download and execute malware.
It seems very basic. Someone has obviously written this as their first foray into cybercrime. The techniques used to gain persistence and prevent being terminated are entry level, but it’s complex in that it talks to a remote C2 server.
This is no teenage script kiddy. This is a semi-experienced to experienced software engineer who is likely very new to writing malware, and this is probably their first botnet.
The malware was written by hand, and the botnet is probably owned by the author of the malware. The author is probably early in their career, recently finishing University with some sort of Computer Science degree, and has taken some operating system classes to learn about fork(), dup2() and signals.
Most script kiddies could buy a quality exploit kit + botnet off of the dark net for a few hundred dollars, and it would be fully featured and be much more complex than hy4 is.
hy4 seems to be full of beginner mistakes, for example, not stripping the binary, and using a default UPX rather than modifying the UPX distribution such that normal UPX won’t be able to unpack the executable. None of the strings in the binary were encrypted, and no effort was made to hide them. There was no inserting of data bytes in code to fool disassembly algorithms.
There was no hiding domain names or IP addresses.
hy4 and dovecat seem to be compiled very recently, within the last month. dovecat also had metadata intact, and we could see what compiler was used. Possibly written by someone bored at home during COVID lockdowns? Who knows.
To the owner of hy4. Take your botnet down. If someone was sufficiently motivated, they could probably find you. You have likely made similar beginner mistakes with your IRC C2 server. The risk is not worth it for $200 of Monero.
I’m not going to come after you. I don’t care in the slightest. I only did this analysis for fun, and to see what sort of threat this malware has to the community.
But hey, your malware is also great analysis for beginner malware analysts. If anyone reading this is a beginner reverse engineer, give these samples a try. You won’t be disappointed.
Conclusion
Today, we did a full analysis of the dovecat and hy4 malware, from samples taken from a real production machine that had been infected, from a case filed about some suspicious behaviour.
We determined that dovecat is a cryptocurrency miner that mines Monero (XMR), and hy4 is an IRC botnet malware dropper, that has the ability to spawn root shells, and to execute malware payloads.
I had a lot of fun analysing this malware. It’s great to get back to reverse engineering again. I don’t get a lot of opportunities to open up Cutter and Ghidra these days. I like pulling things apart and admiring others’ hard work, and solving the puzzles that reverse engineering binaries brings.
I hope you enjoyed the writeup. If you have any questions or comments, contact me.
Matthew Ruffell