Indirect Windows Syscalls – Trampoline Jumps

How to call undocumented Windows APIs/syscalls through Trampoline Jumps.

I was watching a video by eversinc33 discussing Syscalls for Windows Malware and wanted to learn how to implement trampoline jumps with undocumented Nt API calls, and decided to do it myself. This blog is therefore a walkthrough of that process and how I implemented them.

What are Indirect Syscalls / Trampoline Jumps

Win API functions are mostly wrappers: under the hood they call lower-level, largely undocumented Nt functions in ntdll.dll, which were never intended to be called directly by applications. Security products can place userland hooks on these functions to detect malicious behavior. Because of this, people turned to bypassing the hooked functions by issuing the system call themselves, with direct syscalls.

These, however, can also be problematic for evasion. Analysis can check the call tree to identify whether the syscall is being made by something other than the standard and intended Win API functions, and direct syscalls are also easy to identify with static analysis by searching the binary for a syscall instruction.

That leads us to indirect syscalls, or Trampoline Jumps. The purpose is to execute the syscall instruction contained within the undocumented function, without calling the original Win API, without calling the undocumented function directly, and without having a syscall instruction in our own binary.

To demonstrate this I’m going to start from a basic CreateRemoteThread payload using direct Win API calls, then modify this to use direct Nt API calls and finally modify this to use indirect syscalls with Trampoline Jumps.

Identifying Nt Functions

I chose a basic CreateRemoteThread payload from Red Team Notes as a base to start from, which uses the standard Win API calls. Using the method described in y3a’s blog with API Monitor, we can execute this payload to see the call tree of each of the API calls in the CreateRemoteThread payload, i.e. OpenProcess, VirtualAllocEx, WriteProcessMemory and CreateRemoteThread.

Taking VirtualAllocEx as the example, this function calls NtAllocateVirtualMemory. Before we get indirect calls working, we can start by getting this function working using SysWhispers. The header file generated by SysWhispers should contain a declaration that looks something like this:

EXTERN_C NTSTATUS NtAllocateVirtualMemory(
	IN HANDLE ProcessHandle,
	IN OUT PVOID * BaseAddress,
	IN ULONG ZeroBits,
	IN OUT PSIZE_T RegionSize,
	IN ULONG AllocationType,
	IN ULONG Protect);

Finding a Syscall Instruction

I’m going to modify the SysWhispers function slightly and then use GetProcAddress to get the address of the NtAllocateVirtualMemory function (this could of course be improved, but for simplicity/explaining the indirect syscall method, I’m using this approach).

#include <Windows.h>
#include <stdio.h>

// Function pointer type matching the SysWhispers prototype
using jNtAllocateVirtualMemory = NTSTATUS(NTAPI*)(
    IN HANDLE ProcessHandle,
    IN OUT PVOID* BaseAddress,
    IN ULONG ZeroBits,
    IN OUT PSIZE_T RegionSize,
    IN ULONG AllocationType,
    IN ULONG Protect);

int main() {
    // Resolve the export from the copy of ntdll already loaded in this process
    auto ntAllocateVirtualMemory = reinterpret_cast<jNtAllocateVirtualMemory>(
        GetProcAddress(GetModuleHandleA("ntdll"), "NtAllocateVirtualMemory"));

    printf("Address: %p\n", reinterpret_cast<void*>(ntAllocateVirtualMemory));
    return 0;
}

Running this on my computer prints out the address 00007FFCAC64D140. If we then open up WinDbg, launch calc.exe or whichever binary you wish, go to View->Disassembly and search for the address, you should see the assembly code for the NtAllocateVirtualMemory function. This works from another process because ntdll.dll is mapped at the same base address in every process until the next reboot. Also take note of the system call number that is loaded into the eax register. The kernel uses this number, together with the syscall instruction, to decide which system-level function to call.

Now if we restart the computer and rerun our program, my address for NtAllocateVirtualMemory is 00007FFF5458D140. The base address is randomised on each boot (ASLR), so unfortunately we can’t hardcode a single address. Instead, we take the function address at runtime and add the offset of the syscall instruction within the function, which is 0x12 in the disassembly. I assume this method of getting the syscall instruction will be problematic if the functions are hooked, but I haven’t verified that yet. Still, it’s a nice POC.

    // Convert the function pointer to an integer (uintptr_t is declared in <cstdint>),
    // then add the offset of the syscall instruction within the function.
    uintptr_t converted_uint_address = reinterpret_cast<uintptr_t>(ntAllocateVirtualMemory);
    converted_uint_address += 0x12;

Inline Assembly

We need to write some custom Assembly to do our trampoline jump, but the Microsoft compiler doesn’t support inline assembly when targeting x64, so we’re going to have to write a separate Assembly file, assemble it to an object file and link it in when we compile.

Inline assembly is not supported on the ARM and x64 processors.

https://learn.microsoft.com/en-us/cpp/assembler/inline/inline-assembler?view=msvc-170

The following code demonstrates how to pass six strings into a function which will be defined in an external assembly file. The purpose is to demonstrate how to pass arguments to an Assembly function, and how to handle the arguments in that function, which will be important for later.

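#include <iostream>

// Implemented in printString.asm
extern "C" const char* printString(
    const char *one,
    const char *two,
    const char *three,
    const char *four,
    const char *five,
    const char *six);

int main() {
    // String literals are const in C++, so the pointers must be const char*
    const char *a1 = "one";
    const char *a2 = "two";
    const char *a3 = "three";
    const char *a4 = "four";
    const char *a5 = "five";
    const char *a6 = "six";
    const char *ret = printString(a1, a2, a3, a4, a5, a6);
    std::cout << ret << std::endl;
    return 0;
}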

Next we can create a separate ASM file containing our printString function. It’s worth compiling these files yourself and printing out the different args. Read up on the Windows x64 calling convention for information on where each argument lives. Most importantly, how do you print out the a1 and a6 arguments?

section .text
global printString

printString:
        mov rax, qword [rsp+40] ; return the fifth argument (a5), the first one passed on the stack
        ret

Next we can set up our build environment with vcvars, assemble our ASM file to an object file with nasm, then compile and link with cl, the MSVC compiler.

"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars64.bat"

nasm -f win64 printString.asm

"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.37.32822\bin\Hostx64\x64\cl.exe" main.cpp printString.obj

.\main.exe

Implementing the Trampoline Jump

Now that we have everything we need to create the trampoline jump, we can define a function to replace our SysWhispers function that imitates the behavior in the disassembly we looked at earlier. This will differ slightly for each function that is implemented, for example in the system call number.

We want to implement the Assembly function from eversinc33’s video on syscalls in Malware, which is basically the same ASM code as the disassembly shown previously. The system call number can also be found from the disassembly. In the example of NtAllocateVirtualMemory this is 0x18 on my machine, but these numbers can change between Windows builds, so check your own disassembly. The syscall instruction address is the address we calculated earlier from the GetProcAddress result. We now pass it in as an additional, seventh argument in the C++ code and can access it in the ASM with an offset from the stack pointer, [rsp + 56].

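section .text
global asmAllocateVirtualMemory

asmAllocateVirtualMemory:
        mov r10, rcx                        ; the kernel expects the first argument in r10
        mov eax, <syscallInstructionNumber> ; system call number from the disassembly
        mov r11, [rsp + 56]                 ; <syscallInstructionAddress>, our seventh argument
        jmp r11                             ; continue at the syscall instruction inside ntdll
        ret                                 ; never reached: the ret after the syscall in ntdll returns to our caller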

Then we replicate the function definition of our imported function from SysWhispers, but we add an additional argument for our syscall instruction address.

extern "C" NTSTATUS asmAllocateVirtualMemory(
        HANDLE ProcessHandle,
        PVOID * BaseAddress,
        ULONG ZeroBits,
        PSIZE_T RegionSize,
        ULONG AllocationType,
        ULONG Protect, 
        uint64_t Addr);

I was unsure at first what the use of r10 is in the first ASM instruction. rcx holds the first argument, the process handle, and the mov instruction was in both the disassembly and eversinc33’s video, and the call doesn’t work without it. The reason is that the syscall instruction itself overwrites rcx with the return address, so the kernel expects the first argument in r10 instead.

We can then call the ASM function the same way we would call the Nt function from SysWhispers, using the same arguments plus the syscall instruction address at the end. The C++ function call automatically places our arguments into the registers and onto the stack in the same way that the NtAllocateVirtualMemory function expects. We then read the final argument (the syscall address) from the stack, so that our jump instruction lands at the correct location with the arguments already prepared for the syscall to operate on.

