How to Emulate Android Native Libraries Using Qiling?

During a penetration test of an Android application, or while performing malware analysis on an APK file, you will often find that important application functionality is implemented inside a native library.

Native languages such as C and C++ offer benefits like faster execution and direct access to native platform libraries. This code is compiled into machine code and shipped with the app as shared libraries. The JNI (Java Native Interface) is the interface through which the Java and native components talk to one another.

Analyzing the native libraries of an Android application is thus important, but it gets harder as the level of obfuscation in the application increases.

While analyzing a heavily obfuscated native library, a good emulation framework is a must-have tool in a reverse engineer’s arsenal. Trust me, this is going to make your life much easier unless you thrive on manually going through each assembly instruction one by one. Having said that, recently, I have been on a quest to find a good emulation framework for emulating native libraries of an APK, and while I was at it, I stumbled upon Qiling. Hence this blog. 

What is Qiling?

Qiling is an advanced binary emulation framework written in Python and based on the Unicorn engine. It supports multiple platforms (Windows, macOS, Linux, BSD, UEFI) and multiple architectures (x86, x86_64, ARM, ARM64, MIPS). Unicorn only emulates CPU instructions; Qiling builds on it as a higher-level framework that also understands the operating system. It has executable format loaders (for PE, Mach-O, and ELF at the moment), dynamic linkers (so we can load and relocate shared libraries), and syscall and I/O handlers.

One can install Qiling by simply running:

pip3 install qiling

Through this blog, I would like to demonstrate the capabilities of Qiling when it comes to emulating native Android libraries. For this, I have created a sample application that has basic root detection checks inside the native library.

After opening the native binary in a disassembler, one can see that there is a `checkRoot` function. The function defines an array of encrypted strings. In a loop, each string is passed to the `transform` function, and the return value of `transform` is then passed to the `exists` function.

checkRoot function

The `exists` function passes the argument to the `fopen` function and then validates the return value. This is done to check if the file is present or not. This is a very rudimentary root detection check.
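
To make the control flow concrete, here is a rough Python approximation of what the disassembly suggests. It is only a sketch: `xor_transform` is a made-up stand-in (a single-byte XOR), not the library's real decryption routine, and I am assuming `checkRoot` reports root as soon as any one of the files can be opened.

```python
from typing import Callable, Iterable


def exists(path: str) -> bool:
    """Approximate the native check: try to open the file and report success."""
    try:
        with open(path, "rb"):
            return True
    except OSError:
        return False


def xor_transform(data: bytes, key: int = 0x5A) -> str:
    """Hypothetical stand-in for `transform`; the real algorithm is unknown here."""
    return bytes(b ^ key for b in data).decode("ascii")


def check_root(encrypted: Iterable[bytes], transform: Callable[[bytes], str]) -> bool:
    """Decrypt each entry and report root if any of the resulting paths exists."""
    return any(exists(transform(item)) for item in encrypted)
```

The point of the rest of this blog is that we never have to write the real equivalent of `xor_transform` ourselves; the emulator runs the original routine for us.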

root detection check

Based on this information, we can assume that the `transform` function is a decryption routine that will decrypt our encrypted strings. Our goal will be to emulate the `transform` function so we don’t have to analyze and rewrite our own decryption routine to get the decrypted strings. 

How to Use Qiling?

To use Qiling, we start by importing the `Qiling` class and a few of Qiling’s constants. We will also need `UC_PROT_ALL` and `UC_MEM_WRITE` from Unicorn.

from qiling import Qiling
from qiling.const import QL_VERBOSE, QL_INTERCEPT
from qiling.os.const import STRING, SIZE_T, POINTER
from unicorn import UC_PROT_ALL, UC_MEM_WRITE
import struct

def my_sandbox(path, rootfs):
    start_addr = 0xb80
    end_addr = 0xc08
    ql = Qiling(path, rootfs, verbose=QL_VERBOSE.DEBUG)
    base_address = int(ql.profile.get("OS64", "load_address"), 16)
    prepare_for_emulation(ql, base_address)

    ql.hook_address(branch_transform, base_address + 0xbdc)


if __name__ == "__main__":
    my_sandbox(
        ["/home/krat0s/Downloads/qilingLab/libkeys.so"],
        "/home/krat0s/projects/qiling/qiling/examples/rootfs/arm64_android",
    )

 

Here, I have created a `my_sandbox` function that takes two arguments, `path` and `rootfs`. Inside `my_sandbox`, we initialize Qiling. The `Qiling` constructor accepts several arguments:

  • filename: the binary file and its arguments (named `argv` in recent Qiling releases), example: filename=["test", "-argv1", "argv2"]
  • rootfs: the virtual "/" folder, a "jail" file system used while executing Qiling; it must match the target architecture
  • env: environment variables, example: env={"SHELL": "/bin/bash", "HOME": "/tmp"}
  • verbose: a `QL_VERBOSE` level such as DEFAULT, DEBUG, DISASM, or DUMP, where DUMP = (DISASM + DEBUG)
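
If you would rather not hard-code the library and rootfs paths, a small wrapper can collect them from the command line. This is a minimal sketch using only the standard library; `parse_args` and `build_qiling_args` are hypothetical helper names of my own, not part of Qiling.

```python
import argparse
from typing import List, Optional, Tuple


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Read the library path and the rootfs path from the command line."""
    parser = argparse.ArgumentParser(description="Emulate a native library")
    parser.add_argument("library", help="path to the shared library to emulate")
    parser.add_argument("rootfs", help="path to the rootfs for the target")
    return parser.parse_args(argv)


def build_qiling_args(ns: argparse.Namespace) -> Tuple[List[str], str]:
    """Return the (path list, rootfs) pair in the shape my_sandbox expects."""
    return [ns.library], ns.rootfs
```

The returned pair can then be passed straight on, for example `my_sandbox(*build_qiling_args(parse_args()))`.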

After this, we call the `prepare_for_emulation` function by passing the Qiling object and base address to it. Here the base address is the address at which Qiling has loaded our binary in memory. The following is the code for `prepare_for_emulation` function:

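def prepare_for_emulation(ql, base_address):
    # Scratch regions the emulated snippet touches (found by reverse engineering).
    ql.mem.map(0x6d5a620000, 65536, UC_PROT_ALL, info="[challenge]")
    ql.mem.map(0x120000, 65536, UC_PROT_ALL, info="[challenge_2]")
    # The size must be a multiple of the page size, so map one full 4 KiB page.
    ql.mem.map(0x0, 0x1000, UC_PROT_ALL, info="[challenge_3]")
    # Stack and frame pointers point into the first mapped region.
    ql.arch.regs.write("sp", 0x6d5a620200)
    ql.arch.regs.write("x29", 0x6d5a620280)
    ql.arch.regs.write("x9", 0x555555554000)
    ql.arch.regs.write("x8", 0x0)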

 

How Can We Emulate Android Libraries Using Qiling?

The `prepare_for_emulation` function maps three memory regions using the `mem.map` API. Memory has to be mapped before it can be accessed. The map method binds a contiguous memory region at a specified location and sets its access protection bits. A string label may be provided for easy identification on the mapping info table.

Synopsis:

ql.mem.map(addr: int, size: int, perms: int = UC_PROT_ALL, info: Optional[str]=None) -> None

 

Arguments:

  • addr: requested mapping base address; should be aligned to page granularity (see: pagesize)
  • size: mapping size in bytes; must be a multiple of the page size
  • perms: protection bitmap; defines whether this memory range is readable, writable, and/or executable (optional, see: UC_PROT_* constants)
  • info: a string label for the mapped range, for easy identification (optional)
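
Since both the address and the size have to line up with page boundaries, it can help to round a requested range before mapping it. The helper below is a minimal sketch that assumes a 4 KiB page size; `PAGE_SIZE` and the function names are my own and are not Qiling APIs.

```python
from typing import Tuple

PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def align_down(value: int, alignment: int = PAGE_SIZE) -> int:
    """Round value down to the nearest multiple of alignment (a power of two)."""
    return value & ~(alignment - 1)


def align_up(value: int, alignment: int = PAGE_SIZE) -> int:
    """Round value up to the nearest multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def page_aligned_region(addr: int, size: int) -> Tuple[int, int]:
    """Return (base, size) of the smallest page-aligned range covering the request."""
    base = align_down(addr)
    end = align_up(addr + size)
    return base, end - base
```

For example, `page_aligned_region(0x0, 1024)` gives `(0x0, 0x1000)`, which can then be handed to `ql.mem.map`.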

Next, we initialize the SP, x29, x9, and x8 registers using the `ql.arch.regs.write('register', value)` function. Since we are only going to emulate the code snippet responsible for decrypting the strings, we need to initialize these registers with suitable values, which are identified by reverse engineering the binary.

Now we are done creating the memory mappings and initializing the registers needed for emulation. Next up, we will use the `hook_address` API to register a callback that fires when execution reaches a certain address. In our code, `branch_transform` is called first, when execution reaches `base_address + 0xbdc`.


In the disassembled code, one can see that there is a function call to `transform` at offset `0xbdc`. Qiling on its own won't be able to branch to the `transform` function and execute it. So, using the `branch_transform` function, I modify the `PC` register to point to the `transform` function when execution reaches offset `0xbdc`.

def branch_transform(ql: Qiling) -> None:
    transform = 0x555555554000 + 0xa20
    ql.arch.regs.write("pc", transform) 
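
The same redirect can be written without repeating the load address inside the callback. The factory below is a sketch; `make_branch_hook` is a hypothetical helper of my own, and it assumes the hard-coded `0x555555554000` above is the same value as `base_address`.

```python
from typing import Any, Callable


def make_branch_hook(base_address: int, target_offset: int) -> Callable[[Any], None]:
    """Build a hook_address callback that redirects execution to base + offset."""
    target = base_address + target_offset

    def hook(ql: Any) -> None:
        # Overwrite the program counter so emulation continues at the target.
        ql.arch.regs.write("pc", target)

    return hook
```

It would be registered the same way as before, for example `ql.hook_address(make_branch_hook(base_address, 0xa20), base_address + 0xbdc)`.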

Similarly, I will add two more hooks:

ql.hook_address(read_decrypt_string, base_address + 0xb30)
ql.hook_address(branch_back_to_main, base_address + 0xb38)

The `read_decrypt_string` function will read the decrypted string by hooking just before `transform` returns.

`read_decrypt_string` function

The `arch.regs.read` API is used to read the memory pointer stored in the x0 register, and then the `mem.string` API is used to read the string stored at that memory location.

def read_decrypt_string(ql: Qiling) -> None:
    x0 = ql.arch.regs.read("x0")
    if ql.mem.is_mapped(x0,1024):
        t = ql.mem.string(x0)
        ql.log.debug(f'Decrypted String: {t}')
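
If you want the decrypted strings as data rather than as log lines, the hook can append them to a list instead. This is a small variation on the function above and not part of the original script; `make_string_collector` is a hypothetical name, and it uses only the `regs.read`, `mem.is_mapped`, and `mem.string` calls already shown.

```python
from typing import Any, Callable, List


def make_string_collector(results: List[str], max_len: int = 1024) -> Callable[[Any], None]:
    """Build a hook that stores the string pointed to by x0 in `results`."""

    def hook(ql: Any) -> None:
        pointer = ql.arch.regs.read("x0")
        # Only dereference the pointer if the range behind it is mapped.
        if ql.mem.is_mapped(pointer, max_len):
            results.append(ql.mem.string(pointer))

    return hook
```

A possible use would be `decrypted = []` followed by `ql.hook_address(make_string_collector(decrypted), base_address + 0xb30)`, after which `decrypted` can be inspected once the run finishes.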

The `branch_back_to_main` function will help to return the program execution to the `checkRoot` function again.

`branch_back_to_main` function

def branch_back_to_main(ql: Qiling) -> None:
    ret = 0x555555554000 + 0xbe0
    ql.arch.regs.write("pc", ret)

Next up, we will need to hook calls to libc functions. Since this is a dynamically linked shared library, the definitions for libc functions are not inside our binary. What we can do is hook calls to these libc functions and then we can have Qiling run our custom functions that would emulate the behavior of these libc functions. 

We can hijack libc function calls using:

ql.os.set_api('libc_function', our_function_to_execute, QL_INTERCEPT.CALL)

Here, `QL_INTERCEPT.CALL` means our function runs in place of the original call; such hooks may return a value as necessary.

def my_sandbox(path, rootfs):
    start_addr = 0xb80
    end_addr = 0xc08
    ql = Qiling(path, rootfs, verbose=QL_VERBOSE.DEBUG)
    base_address = int(ql.profile.get("OS64", "load_address"), 16)
    prepare_for_emulation(ql, base_address)
    #ql.hook_mem_write(mem_write)
    ql.hook_address(branch_transform, base_address + 0xbdc)
    ql.hook_address(read_decrypt_string, base_address + 0xb30)
    ql.hook_address(branch_back_to_main, base_address + 0xb38)
    ql.os.set_api('printf', my_printf, QL_INTERCEPT.CALL)
    ql.os.set_api('__strlen_chk', my_strlen, QL_INTERCEPT.CALL)
    ql.os.set_api('__strncat_chk', my_strncat, QL_INTERCEPT.CALL)

The definitions of the `my_printf`, `my_strlen` and `my_strncat` functions are as follows:

def my_printf(ql: Qiling):
    # Swallow printf calls; the arguments are resolved only for optional logging.
    params = ql.os.resolve_fcall_params({'s1': STRING, 's2': STRING})
    s1 = params['s1']
    s2 = params['s2']
    #ql.log.info(f'my_printf: got "{s1}" and "{s2}" as an argument')

def my_strlen(ql: Qiling):
    params = ql.os.resolve_fcall_params({'s': STRING})
    s = params['s']
    #ql.log.info(f'param to strlen: {s}')
    # Place the result in x0, the AArch64 return-value register.
    ql.arch.regs.write("x0", len(s))
    return len(s)

def my_strncat(ql: Qiling):
    params = ql.os.resolve_fcall_params({'s1': STRING, 's2': POINTER, 's3': SIZE_T})
    #s1 = params['s1']
    s2 = params['s2']
    s3 = params['s3']
    # Read the first byte of the source buffer.
    ff = ql.mem.read(s2, 1)[0]
    # x0 still holds the destination pointer (the first argument).
    addr = ql.arch.regs.read("x0")
    s1 = ql.mem.string(addr)
    #ql.log.info(f'Called strncat({s1},{chr(ff)},{s3})')
    # Append that single byte and write the string back in place.
    s1 += chr(ff)
    ql.mem.string(addr, s1)
    return s1
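
Note that `my_strncat` copies only a single byte from the source, which is presumably all this particular library needs. For reference, the general behaviour of the C functions being replaced can be modelled in plain Python, with a `bytearray` standing in for emulated memory. This is an illustrative sketch only: `c_strlen` and `c_strncat` are my own names, and it assumes the destination buffer is large enough.

```python
def c_strlen(memory: bytearray, addr: int) -> int:
    """Count the bytes from addr up to, but not including, the first NUL."""
    return memory.index(0, addr) - addr


def c_strncat(memory: bytearray, dest: int, src: int, n: int) -> int:
    """Append at most n bytes of the string at src to the string at dest."""
    dest_end = dest + c_strlen(memory, dest)
    count = min(c_strlen(memory, src), n)
    # Copy the bytes, then always terminate the result with a NUL byte.
    memory[dest_end:dest_end + count] = memory[src:src + count]
    memory[dest_end + count] = 0
    return dest  # like the C function, return the destination pointer
```

A hook built on this model would read `n` bytes from the source pointer instead of one, but the single-byte version above is enough for the sample library.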

Finally, we will start the emulation by calling `ql.run()`. The run() function can also take multiple arguments:

begin: start address of emulated code
end: end address of emulated code
timeout: emulation timeout (in microseconds)
count: maximum instruction count to be emulated

def my_sandbox(path, rootfs):
    start_addr = 0xb80
    end_addr = 0xc08
    ql = Qiling(path, rootfs, verbose=QL_VERBOSE.DEBUG)
    base_address = int(ql.profile.get("OS64", "load_address"), 16)
    prepare_for_emulation(ql, base_address)
    #ql.hook_mem_write(mem_write)
    ql.hook_address(branch_transform, base_address + 0xbdc)
    ql.hook_address(read_decrypt_string, base_address + 0xb30)
    ql.hook_address(branch_back_to_main, base_address + 0xb38)
    ql.os.set_api('printf', my_printf, QL_INTERCEPT.CALL)
    ql.os.set_api('__strlen_chk', my_strlen, QL_INTERCEPT.CALL)
    ql.os.set_api('__strncat_chk', my_strncat, QL_INTERCEPT.CALL)
    ql.run(begin=base_address+start_addr, end=base_address+end_addr)

Now let’s run our emulation script.

[Screenshot: output of the emulation script]

As you can see, the script ran successfully and we got the decrypted strings. The sample application and the emulation script can be found here.

Final Thoughts

The innovative approach of emulating native libraries using Qiling not only paves the way for enhanced software testing and debugging but also deepens our understanding of how Android native applications operate. 

Leveraging Qiling's unique features allows developers and researchers to bridge the gap between dynamic analysis and execution, and with the guidelines presented, one can adeptly navigate the challenges inherent in this complex emulation task. Whether for vulnerability research or software development, mastering this emulation technique will undoubtedly provide significant advantages to developers and testers in the Android ecosystem.

Published on Sep 4, 2023
Written by Abhinav Vasisth
Abhinav Vasisth is a certified ethical hacker and the security research lead at Appknox, a mobile security suite that helps enterprises automate mobile security. Abhinav has been a critical member of Appknox for 5 years, reinventing the standards of mobile app security against evolving threats. He is highly regarded in the industry for his expertise, speaks at various security conferences like PHDays, and has collaborated with numerous enterprises to safeguard their digital assets.
When he's not outsmarting hackers, he listens to metal music or is lost in books.
