raw-data memdumps

GHIDRA scripting - Artra Downloader strings decryptor

March 25, 2019

There is not much talk yet about Ghidra scripting for malware analysis, so here we go … what follows is a 100-foot view of how to quickly put together a malware-analysis-oriented Ghidra Python snippet.

Objectives

Decrypt Artra Downloader v1 encrypted strings with:

  1. Ghidra Jython
  2. (plus - because, why not) IDAPython

Quick background about Artra Downloader

I am using the same malware label given by Palo Alto Networks' Unit 42 team to easily identify the sample family, but it might also be known as “CTF Loader”, as ironically reported by @VK_Intel ;) .

A full analysis - also covering variants 2 and 3 - can be found here.

The sample was first observed in this tweet from @malwrhunterteam. Its decryption routine is fairly simple, which makes it a good test case for the new RE tool.

To The Batmobile!

Sample information:

(sample pwd commonly used in the malware research sector for sharing samples)

Here are the main steps we have to follow to reach our goals:

  1. Overcome the string obfuscation
  2. Identify code referencing the decryption function
  3. Get function arguments - one will be the enc string offset addr
  4. Extract encrypted strings from offset address
  5. Apply string decryption
  6. Return results

Artra v1 strings decryption function

Graph view vs Pseudo-C view

Ghidra malware analysis scripting 101

First things first

1. Overcome the string obfuscation

This is straightforward: every character of the encrypted string is simply decremented by one - link

def decrypt_string(enc_str):
	# shift every character back by one to recover the plaintext
	return ''.join([chr(ord(char) - 1) for char in enc_str])
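
As a quick sanity check outside Ghidra, the routine can be exercised on a made-up string. This is only a sketch in plain Python: the test value and the encrypt_string helper are hypothetical, added just to show the round trip, and are not taken from the sample.

```python
def decrypt_string(enc_str):
    # decoding shifts each character back down by one
    return ''.join(chr(ord(char) - 1) for char in enc_str)

def encrypt_string(plain_str):
    # hypothetical inverse helper, used here only to build test input
    return ''.join(chr(ord(char) + 1) for char in plain_str)

encoded = encrypt_string("HELLO")
print(encoded)                   # IFMMP
print(decrypt_string(encoded))   # HELLO
```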

2. Identify code referencing the decryption function

For this particular sample, the decryption function is located at 0x004026b0; for this test, we will just hardcode the value.

def run():
	xrefs = getReferencesTo(toAddr(0x004026b0))
	extract_encrypted_str(xrefs)

3. and 4. Get function arguments + Extract encrypted strings from offset address

def get_function_args(addr):
	# start at the call site and walk backward one instruction at a time
	get_ins = getInstructionAt(toAddr(addr))
	# stop if we run out of instructions instead of failing on None
	while get_ins is not None:
		# check pattern: MOV EAX, <immediate address>
		op = get_ins.getMnemonicString()
		if "MOV" == op and get_ins.getDefaultOperandRepresentation(0) == "EAX" and \
			 "0x" in get_ins.getDefaultOperandRepresentation(1):
			enc_string = getDataAt(toAddr(get_ins.getDefaultOperandRepresentation(1)))
			if enc_string:
				# map encrypted string and its offset address
				mapping = (toAddr(get_ins.getDefaultOperandRepresentation(1)), enc_string)
				enc_buffer.append(mapping)
			break
		# no match: move to the previous instruction
		get_ins = getInstructionBefore(get_ins.getAddress())

def extract_encrypted_str(xrefs):
	for xref in xrefs:
		ref_addr = (xref.getFromAddress())
		get_function_args(ref_addr.toString())

	decrypt_enc_str_and_comment()

The code above scans backward from every call site returned by xref.getFromAddress() and, for each one, inspects the preceding instructions, looking for the pattern below:

...
MOV EAX, <encrypted_string_addr>
...

If the instruction we are looking for is not found on the first try, we step back to the previous instruction, and so on, until we hit the pattern we are searching for.
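
The backward scan can also be pictured without any Ghidra API. The sketch below is plain Python over a made-up instruction list; the addresses, the find_string_arg name and the max_steps bound are all assumptions for illustration, not part of the script. The bound is there because an unbounded walk would keep going if the pattern never shows up.

```python
# Hypothetical, simplified instruction stream: (address, mnemonic, operands)
instructions = [
    (0x401000, "PUSH", ["EBP"]),
    (0x401001, "MOV",  ["EAX", "0x0040c1a0"]),
    (0x401006, "PUSH", ["ESI"]),
    (0x401007, "CALL", ["0x004026b0"]),
]

def find_string_arg(instructions, call_index, max_steps=10):
    """Walk backward from the call and return the immediate loaded into EAX."""
    lowest = max(call_index - max_steps, 0)
    for idx in range(call_index - 1, lowest - 1, -1):
        _addr, mnemonic, operands = instructions[idx]
        if mnemonic == "MOV" and operands[0] == "EAX" and operands[1].startswith("0x"):
            return int(operands[1], 16)
    return None  # pattern not found within the window

call_index = 3  # position of the CALL in the list above
print(hex(find_string_arg(instructions, call_index)))  # 0x40c1a0
```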

Once we have a match, a mapping - for later use - is created considering:

  1. address where the encrypted string is located
  2. the encrypted string itself

mapping = (toAddr(get_ins.getDefaultOperandRepresentation(1)), enc_string)
enc_buffer.append(mapping)

5. Apply string decryption

Iterating through the pre-filled enc_buffer, the decryption function is called for every gathered string:

def decrypt_enc_str_and_comment():
	for enc_str_addr, enc_str in enc_buffer:
		enc_str = enc_str.toString().split()[1].strip("\"")
		dec_str = decrypt_string(enc_str)
		...
		...
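
To make the parsing step concrete, here is a small stand-alone sketch in plain Python. It assumes, as the split()[1] call above implies, that the listing text of a string data item looks like ds "..."; the mock_enc_buffer contents and the extract_quoted_value name are made up for illustration.

```python
def decrypt_string(enc_str):
    return ''.join(chr(ord(char) - 1) for char in enc_str)

def extract_quoted_value(listing_repr):
    # drop the leading mnemonic (assumed to be "ds") and the surrounding quotes
    return listing_repr.split()[1].strip('"')

# hypothetical stand-in for enc_buffer: (address, listing text) pairs
mock_enc_buffer = [
    (0x0040c1a0, 'ds "iuuq;00fybnqmf/dpn"'),
    (0x0040c1c0, 'ds "dne/fyf"'),
]

decoded = [(addr, decrypt_string(extract_quoted_value(text)))
           for addr, text in mock_enc_buffer]
for addr, dec_str in decoded:
    print("0x%08x -> %s" % (addr, dec_str))
```

Splitting on whitespace is safe enough here because a printable plaintext character never encodes to a space: the lowest printable character, the space itself (0x20), becomes "!" (0x21) once shifted up by one.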

6. Return results

Decoded strings are printed to Ghidra’s console, and comments are placed beside the encrypted strings in the listing view.

def decrypt_enc_str_and_comment():
	for enc_str_addr, enc_str in enc_buffer:
		enc_str = enc_str.toString().split()[1].strip("\"")
		dec_str = decrypt_string(enc_str)
		
		# add comments
		codeUnit = listing.getCodeUnitAt(toAddr(enc_str_addr.toString()))
		ds_string = getDataAt(toAddr(enc_str_addr.toString()))
		ds_string.setComment(codeUnit.EOL_COMMENT, dec_str)
		
		# print results to console
		print("Address: %-40s Enc string: %-40s Dec string: %-40s" % \
			(toAddr(enc_str_addr.toString()), enc_str, dec_str))

Console view

Listing view

… some strings …

Putting everything together

# ArtraDownloader v1 - strings decryptor
#
# Ref sample:
#	file name: winsvc
#   md5: 7cc0b212d1b8ceb808c250495d83bae4  
#   sha1: d2c161ce52240b61d632607a2262890327d82502
#   sha256: ef0cb0a1a29bcdf2b36622f72734aec8d38326fc8f7270f78bd956e706a5fd57
#
# Ref links: 
#	2018.12.19 https://twitter.com/malwrhunterteam/status/1075454863008382976
#	2018.12.21 https://gist.github.com/raw-data/14915eca4e5e2963a9056f935442358d
#	2019.02.25 https://unit42.paloaltonetworks.com/multiple-artradownloader\
#				-variants-used-by-bitter-to-target-pakistan/

#@author raw-data
#@category malware strings decryptor
#@keybinding 
#@menupath 
#@toolbar 

import ghidra.app.script.GhidraScript
import exceptions

enc_buffer = []

listing = currentProgram.getListing()

def decrypt_string(enc_str):
	# shift every character back by one to recover the plaintext
	return ''.join([chr(ord(char) - 1) for char in enc_str])

def get_function_args(addr):
	# start at the call site and walk backward one instruction at a time
	get_ins = getInstructionAt(toAddr(addr))
	# stop if we run out of instructions instead of failing on None
	while get_ins is not None:
		# check pattern: MOV EAX, <immediate address>
		op = get_ins.getMnemonicString()
		if "MOV" == op and get_ins.getDefaultOperandRepresentation(0) == "EAX" \
			and "0x" in get_ins.getDefaultOperandRepresentation(1):
			enc_string = getDataAt(toAddr(get_ins.getDefaultOperandRepresentation(1)))
			if enc_string:
				# map encrypted string and its offset address
				mapping = (toAddr(get_ins.getDefaultOperandRepresentation(1)), enc_string)
				enc_buffer.append(mapping)
			break
		# no match: move to the previous instruction
		get_ins = getInstructionBefore(get_ins.getAddress())

def extract_encrypted_str(xrefs):
	for xref in xrefs:
		ref_addr = (xref.getFromAddress())
		get_function_args(ref_addr.toString())

	decrypt_enc_str_and_comment()

def decrypt_enc_str_and_comment():
	for enc_str_addr, enc_str in enc_buffer:
		enc_str = enc_str.toString().split()[1].strip("\"")
		dec_str = decrypt_string(enc_str)
		
		# add comments
		codeUnit = listing.getCodeUnitAt(toAddr(enc_str_addr.toString()))
		ds_string = getDataAt(toAddr(enc_str_addr.toString()))
		ds_string.setComment(codeUnit.EOL_COMMENT, dec_str)
		
		# print results to console
		print("Address: %-40s Enc string: %-40s Dec string: %-40s" % \
			(toAddr(enc_str_addr.toString()), enc_str, dec_str))

def run():
	xrefs = getReferencesTo(toAddr(0x004026b0))
	extract_encrypted_str(xrefs)

run()

Out of curiosity, I also translated the code above to IDAPython, resulting in:

from idautils import *
from idc import *

############################################################################
# ArtraDownloader v1 - strings decryptor
#
# Ref sample:
#	file name: winsvc
#   md5: 7cc0b212d1b8ceb808c250495d83bae4  
#   sha1: d2c161ce52240b61d632607a2262890327d82502
#   sha256: ef0cb0a1a29bcdf2b36622f72734aec8d38326fc8f7270f78bd956e706a5fd57
#
# Ref links: 
#	2018.12.19 https://twitter.com/malwrhunterteam/status/1075454863008382976
#	2018.12.21 https://gist.github.com/raw-data/14915eca4e5e2963a9056f935442358d
#	2019.02.25 https://unit42.paloaltonetworks.com/multiple-artradownloader\
#				-variants-used-by-bitter-to-target-pakistan/
############################################################################

__author__ = 'raw-data'

def decrypt_string(enc_str):
	# shift every character back by one to recover the plaintext
	return ''.join([chr(ord(char) - 1) for char in enc_str])

def get_string(addr):
	return GetString(addr)

def get_function_args(addr):
	# walk backward from the call site until "mov eax, <imm>" is found
	while True:
		addr = PrevHead(addr)
		if addr == BADADDR:
			# ran out of instructions without a match
			return None
		if GetMnem(addr) == "mov" and "eax" in GetOpnd(addr, 0):
			return GetOperandValue(addr, 1)

def extract_encrypted_str(xrefs):
	for addr in xrefs:
		ref = get_function_args(addr.frm)
		if ref is None:
			continue
		enc_str = get_string(ref)
		if not enc_str:
			# no string defined at the referenced address
			continue
		dec_str = decrypt_string(enc_str)

		# add comments
		MakeComm(addr.frm, dec_str)
		MakeComm(ref, dec_str)

		# print results to console
		print("Address: %-40s Enc string: %-40s Dec string: %-40s" % \
			("0x%08x" % addr.frm, enc_str, dec_str))

def run():
	xrefs = XrefsTo(0x004026b0, flags=0)
	extract_encrypted_str(xrefs)

run()

If you quickly compare both versions of the script, you will see that, unsurprisingly, there are not many differences apart from the tool-specific API calls.

Only the backward-scanning function get_function_args is implemented slightly differently. I am sure there are more elegant ways to get the same result on the Ghidra side … but as a first try I think it is not too bad, and it did the trick!

Going the extra mile

Sample signatures

On the one hand, I am not going to reinvent the wheel, so here you can find @James_inthe_box’s #snort / #suricata and #yara signatures.

On the other hand, if you want to track the Artra Downloader v1 string decryption function (you will miss v2 and v3), I got decent results with the following:

rule memory_win_trojan_downloader_artra_v1
{
    meta:
        author = "raw-data"
        tlp = "WHITE"

        version = "1.0"
        created = "2019-03-26"
        modified = "2019-03-26"

        description = "Detects Artra string decryption routine"

        reference1 = "https://twitter.com/malwrhunterteam/status/1075454863008382976"
        reference2 = "https://gist.github.com/raw-data/14915eca4e5e2963a9056f935442358d"
        reference3 = "https://unit42.paloaltonetworks.com/multiple-artradownloader-variants-used-by-bitter-to-target-pakistan/"

        sha256_sample1 = "523a17f6892c2558ac4765959df4af938e56a94fa6ed39636b8b7315def3a1b4"
        sha256_sample2 = "ef0cb0a1a29bcdf2b36622f72734aec8d38326fc8f7270f78bd956e706a5fd57"

    strings:
        $hex1 = { 8a 08 40 84 c9 75 ?? 2b c2 8b f0 8d 46 01 50 e8 27 04 00 00
                  83 c4 04 33 c9 85 f6 7e ?? 55 8b c8 }
        $hex2 = { 8a 14 0f fe ca 88 11 41 83 ed 01 75 ?? 5d 5f c6 04 06 00 }

    condition:
        any of ($hex*)
}
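
To get a feel for how the ?? wildcards in the hex strings behave, the $hex2 pattern can be mimicked in plain Python with a bytes regex. This is only an illustration and not how YARA matches internally: the hex_pattern_to_regex helper and the test buffer are made up, and 0xf3 is an arbitrary value chosen for the wildcard byte. As a side note, the fe ca bytes in $hex2 should correspond to DEC DL, which is the same subtract-one step used by the decryptor above.

```python
import re

def hex_pattern_to_regex(pattern):
    """Translate a YARA-style hex string ('??' = any byte) into a bytes regex."""
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so that '.' also matches the 0x0a byte
    return re.compile(b"".join(parts), re.DOTALL)

HEX2 = "8a 14 0f fe ca 88 11 41 83 ed 01 75 ?? 5d 5f c6 04 06 00"
regex = hex_pattern_to_regex(HEX2)

# made-up buffer: two padding bytes, the pattern with 0xf3 as jump offset, one trailing byte
buf = b"\x90\x90" + bytes.fromhex(HEX2.replace("??", "f3")) + b"\xc3"
match = regex.search(buf)
print(match.start() if match else None)  # 2
```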
