Windows x64 Shellcoding – Part 1

The first part of a three-part series on Windows x64 shellcoding, going from a basic Hello World program in Assembly to implementing guardrails in MSFVenom shellcode.

Part 1 – Assembly and WinAPIs – Hello World
Part 2 – Assembly to Shellcode – Stack Buffers and WinAPIs
Part 3 – Adding extra functionality to MSFVenom

So, to start with we’re going to write a basic Hello World program in x64 ASM to introduce some basic topics, and then use this program to demonstrate some of the difficulties faced when writing shellcode which will then be addressed in part 2.

Hello World, Calling Convention and Windows APIs

Our program is going to write to stdout with the Windows API WriteFile. The C++ function definition from the Microsoft documentation is provided below.

BOOL WriteFile(
  [in]                HANDLE       hFile,
  [in]                LPCVOID      lpBuffer,
  [in]                DWORD        nNumberOfBytesToWrite,
  [out, optional]     LPDWORD      lpNumberOfBytesWritten,
  [in, out, optional] LPOVERLAPPED lpOverlapped
);

While I am starting with Hello World, I assume you already have an elementary understanding of how Assembly works. If you’re unfamiliar with the Windows x64 calling convention, it’s worth reading about the calling convention for function calls. The main part that’s required is provided below.

By default, the x64 calling convention passes the first four arguments to a function in registers. The registers used for these arguments depend on the position and type of the argument. Remaining arguments get pushed on the stack in right-to-left order.

Integer valued arguments in the leftmost four positions are passed in left-to-right order in RCX, RDX, R8, and R9, respectively. The fifth and higher arguments are passed on the stack as previously described. All integer arguments in registers are right-justified, so the callee can ignore the upper bits of the register and access only the portion of the register necessary.

https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170

A good example to illustrate this is also provided in the documentation.

func1(int a, int b, int c, int d, int e, int f);
// a in RCX, b in RDX, c in R8, d in R9, f then e pushed on stack
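
As a rough illustration of the rule quoted above, the following Python sketch maps a list of integer arguments to their locations. The function name place_integer_args is made up for this post, and the stack offsets assume they are measured from rsp immediately before the call instruction, with the caller having reserved the 32 bytes of shadow space that the convention requires for the four register arguments.

```python
# Illustrative model of integer argument placement in the Windows x64
# calling convention. Names here are hypothetical, not part of any API.
INT_ARG_REGISTERS = ["RCX", "RDX", "R8", "R9"]
SHADOW_SPACE = 32  # bytes reserved by the caller for the register arguments


def place_integer_args(names):
    """Map each argument name to a register or a stack slot at the call site."""
    placement = {}
    for index, name in enumerate(names):
        if index < len(INT_ARG_REGISTERS):
            placement[name] = INT_ARG_REGISTERS[index]
        else:
            # Stack arguments sit directly above the shadow space, 8 bytes each.
            offset = SHADOW_SPACE + 8 * (index - len(INT_ARG_REGISTERS))
            placement[name] = f"[rsp+{offset}]"
    return placement


if __name__ == "__main__":
    for name, where in place_integer_args(list("abcdef")).items():
        print(f"{name} -> {where}")
```

For func1 this places e at [rsp+32] and f at [rsp+40], which is the same layout that pushing f and then e would produce above the shadow space.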

So back to our example. We want to call the function WriteFile(HANDLE, LPCVOID, DWORD, LPDWORD, LPOVERLAPPED). Using the calling convention, this would look something like this.

mov	rcx, <stdOutHandle>
mov	rdx, <helloWorldString>
mov	r8, <numberOfBytesToWrite>
mov	r9, <numberOfBytesWritten>
mov	qword [rsp+32], <pointerToOverlappedStructure>	; fifth argument: on the stack, above the 32 bytes of shadow space

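You can hopefully follow the ordering of these arguments based on the previous examples. The exact requirements for each argument can be found in the Windows API documentation. The final three arguments can just be an integer and two NULLs, although the documentation states that lpNumberOfBytesWritten can be NULL only when lpOverlapped is not NULL, so passing NULL for both, as we do here to keep the example short, relies on behaviour that is not guaranteed. The hello world string will for the moment be defined in a variable in the .data section as shown below: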

section .data
    hello: db "Hello, World!",0

If you’re unfamiliar with what I mean by a .data section, I’m referring to the section in an Assembly program where we can store variables. As an example, consider the layout of a Portable Executable (PE) file, the format used for Executable (EXE) and Dynamic Link Library (DLL) files. The important parts to note in that layout are the .text and .data sections.

So later, after we assemble and link our program, our hello variable will be stored in the .data section and our Assembly code will be stored in .text.

So that covers all variables except for stdOutHandle. For this, we’re going to use another Win API function, GetStdHandle. Luckily this function only requires a single argument and it’s documented nicely in the Windows documentation. The argument selects which standard handle to return: input, output or error. As we’re trying to print to standard out, we’ll use the output value, STD_OUTPUT_HANDLE, i.e. -11.

If you remember the calling convention listed earlier, we only have one argument so this will be placed in the rcx register. After calling the GetStdHandle function, the returned handle will be stored in rax, which we can then move to rcx for the first argument of the WriteFile function.
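
It can help to see what -11 looks like once it is in a register, because that is the form it takes in the disassembly later in this post. The sketch below is only an illustration: the helper names are made up, and it assumes the assembler encodes the constant as a sign-extended 32-bit immediate.

```python
import struct

STD_OUTPUT_HANDLE = -11  # value passed to GetStdHandle for standard output


def as_unsigned64(value):
    """Two's complement view of a signed value in a 64-bit register."""
    return value & 0xFFFFFFFFFFFFFFFF


def as_imm32_bytes(value):
    """Little-endian bytes of the value as a signed 32-bit immediate."""
    return struct.pack("<i", value)


if __name__ == "__main__":
    print(hex(as_unsigned64(STD_OUTPUT_HANDLE)))
    print(as_imm32_bytes(STD_OUTPUT_HANDLE).hex(" "))
```

The 64-bit view is 0xfffffffffffffff5 and the immediate bytes are f5 ff ff ff, which is what to look for when reading the machine code for mov rcx, -11.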

After printing the string to standard out, we’re going to call a final Win API function, ExitProcess. Again this only requires a single argument, and we’re going to use 0 as the argument to indicate a successful exit.

So with our understanding of the .data and .text sections, Win API functions, and calling conventions, let’s put all this together now into a final ASM file.

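extern GetStdHandle
extern WriteFile
extern ExitProcess

section .data
    hello: db "Hello, World!",0

section .text
global start
start:
    sub     rsp, 40                 ; 32 bytes shadow space + 8 bytes for the fifth argument

    mov     rcx, -11                ; STD_OUTPUT_HANDLE
    call    GetStdHandle            ; handle is returned in rax

    mov     rcx, rax                ; hFile
    mov     rdx, hello              ; lpBuffer
    mov     r8, 13                  ; nNumberOfBytesToWrite, length of "Hello, World!"
    mov     r9, 0                   ; lpNumberOfBytesWritten = NULL
    mov     qword [rsp+32], 0       ; lpOverlapped = NULL
    call    WriteFile

    xor     ecx, ecx                ; uExitCode = 0
    call    ExitProcess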

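The parts not yet explained are the initial sub instruction and the extern and global lines. The sub instruction reserves 40 bytes of stack: 32 bytes of shadow space that the callee may use to spill the four register arguments, plus 8 bytes for the fifth argument to WriteFile at [rsp+32]. Because the entry point is reached with an 8-byte return address already on the stack, subtracting 40 also leaves rsp 16-byte aligned at each call, as the calling convention requires. The extern directives declare functions that are defined in other libraries and resolved later by the linker. The global directive exports the start label so that the linker can use it as the entry point of our final EXE.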
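
To see where the value 40 comes from, the arithmetic can be sketched in a few lines of Python. This is a simplified model rather than a rule taken from the documentation: stack_reservation is a made-up helper, and it assumes the code is entered through a call (so rsp is 8 bytes off a 16-byte boundary on entry) and that nothing else is pushed before the sub.

```python
SHADOW_SPACE = 32  # bytes reserved for the four register arguments
SLOT_SIZE = 8      # each stack argument occupies one 8-byte slot
RETURN_ADDRESS = 8  # pushed by the call that entered our code


def stack_reservation(arg_count):
    """Bytes to subtract from rsp for a callee taking arg_count integer arguments."""
    stack_args = max(0, arg_count - 4)
    needed = SHADOW_SPACE + SLOT_SIZE * stack_args
    # Pad so that rsp is 16-byte aligned when the call instruction executes.
    padding = (-(needed + RETURN_ADDRESS)) % 16
    return needed + padding


if __name__ == "__main__":
    for count in (1, 4, 5, 6):
        print(count, stack_reservation(count))
```

Under these assumptions a five-argument call such as WriteFile needs exactly 40 bytes with no padding, while a call with four or fewer arguments also ends up at 40 because 8 bytes of padding are added to the 32 bytes of shadow space.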

Compilation

Now we need to assemble our file. I’m using the Netwide Assembler (NASM). Producing an object file is very simple: we provide the format win64 to select 64-bit Windows object output and we get our object file, hello.obj.

nasm -f win64 hello.asm

But we still need to tell the linker where to find the external functions. If we go back to the Microsoft documentation for WriteFile and scroll to the Requirements section at the bottom of the page, you will find the library and DLL that provide the WriteFile function. In this example these are Kernel32.lib and Kernel32.dll.

I will be creating the final Hello World executable with the linker included with Microsoft Visual Studio. To find these files you can just search the filesystem for Kernel32.lib and link.exe and it should be straightforward. We provide the linker with our obj file, the entry point, the subsystem argument and the import library required for the external function calls.

link.exe hello.obj /entry:start /subsystem:console kernel32.lib

The other functions, GetStdHandle and ExitProcess, are also in Kernel32.lib, so the linker can now produce our final EXE, which we run as .\hello.exe.

C:\Users\jord\dev\asm>.\hello.exe
Hello, World!

Now that we have our working EXE, let’s try to turn it into shellcode. This will not work, but the purpose here is to explain why, and to identify what we need to fix in part 2 of this series.

Disassembly

Open up Linux or WSL if you prefer and let’s analyse the file with objdump. First we can view the .data section, which as previously discussed holds our hello world string.

$ objdump -s -j .data hello.exe

hello.exe:     file format pei-x86-64

Contents of section .data:
 140003000 48656c6c 6f2c2057 6f726c64 2100      Hello, World!.
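
As a quick sanity check, the hex columns of that row can be decoded to confirm the string and the length of 13 that we passed to WriteFile. A minimal sketch, with made-up helper names, assuming the row is copied from the dump without the address and ASCII columns:

```python
def decode_dump_row(hex_columns):
    """Turn the hex columns of one objdump -s row into raw bytes."""
    # bytes.fromhex ignores the spaces between the four-byte groups.
    return bytes.fromhex(hex_columns)


def visible_length(data):
    """Length of the string up to, but not including, the first null byte."""
    return len(data.split(b"\x00", 1)[0])


if __name__ == "__main__":
    row = decode_dump_row("48656c6c 6f2c2057 6f726c64 2100")
    print(row)
    print(visible_length(row))
```

The decoded bytes are the string followed by its null terminator, and the visible length matches the value loaded into r8.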

We can also dump the .text section of the EXE.

$ objdump hello.exe -d

hello.exe:     file format pei-x86-64


Disassembly of section .text:

0000000140001000 <.text>:
   140001000:   48 83 ec 28             sub    $0x28,%rsp
   140001004:   48 c7 c1 f5 ff ff ff    mov    $0xfffffffffffffff5,%rcx
   14000100b:   e8 2e 00 00 00          call   0x14000103e
   140001010:   48 89 c1                mov    %rax,%rcx
   140001013:   48 ba 00 30 00 40 01    movabs $0x140003000,%rdx
   14000101a:   00 00 00
   14000101d:   41 b8 0d 00 00 00       mov    $0xd,%r8d
   140001023:   41 b9 00 00 00 00       mov    $0x0,%r9d
   140001029:   48 c7 44 24 20 00 00    movq   $0x0,0x20(%rsp)
   140001030:   00 00
   140001032:   e8 0d 00 00 00          call   0x140001044
   140001037:   31 c9                   xor    %ecx,%ecx
   140001039:   e8 0c 00 00 00          call   0x14000104a
   14000103e:   ff 25 cc 0f 00 00       jmp    *0xfcc(%rip)        # 0x140002010
   140001044:   ff 25 b6 0f 00 00       jmp    *0xfb6(%rip)        # 0x140002000
   14000104a:   ff 25 b8 0f 00 00       jmp    *0xfb8(%rip)        # 0x140002008

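The byte values, i.e. 48 83 ec 28 48 c7 c1 f5 ..., are our shellcode. This is the code that the executable will run when we execute .\hello.exe. Have a look at the original Assembly and the disassembly to see how the instructions have changed. Note that objdump prints AT&T syntax by default, so operands appear in source, destination order, and that mov r8, 13 has been encoded as the shorter mov $0xd,%r8d.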
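
Copying those bytes out by hand gets tedious, so here is a small Python sketch that pulls them from objdump -d output. It is a convenience script written for this post, not a standard tool: bytes_from_objdump and to_c_string are made-up names, and the parser assumes each instruction line has the form address, colon, two-digit hex bytes, then the mnemonic.

```python
import re

HEX_BYTE = re.compile(r"^[0-9a-f]{2}$")


def bytes_from_objdump(lines):
    """Collect the machine code bytes from lines of objdump -d output."""
    code = bytearray()
    for line in lines:
        if ":" not in line:
            continue
        # Everything after the first colon: hex bytes, then the instruction text.
        _, rest = line.split(":", 1)
        for token in rest.split():
            if not HEX_BYTE.match(token):
                break  # reached the mnemonic, or this is a header line
            code.append(int(token, 16))
    return bytes(code)


def to_c_string(code):
    """Format bytes as a \\x-escaped string, a common way to embed shellcode."""
    return "".join(f"\\x{byte:02x}" for byte in code)


if __name__ == "__main__":
    sample = [
        "hello.exe:     file format pei-x86-64",
        "   140001000:   48 83 ec 28             sub    $0x28,%rsp",
        "   140001013:   48 ba 00 30 00 40 01    movabs $0x140003000,%rdx",
        "   14000101a:   00 00 00",
    ]
    print(to_c_string(bytes_from_objdump(sample)))
```

Continuation lines such as the one at 14000101a carry only bytes, and the same loop handles them because it simply stops at the first token that is not a hex byte.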

You may have already noticed how our hello world string is being handled: movabs $0x140003000,%rdx. The address 0x140003000 is the address of the string in the dumped .data section, hard-coded as an absolute value. So when we take our shellcode (the bytes from the .text section) and use process injection techniques to inject it into a different process than the original executable, the new process will have something else, or nothing at all, located at 0x140003000. So the first problem we face when writing shellcode is that, unfortunately, we’re not able to use the .data section to store our variables. The second problem is essentially the same idea: each call goes through a jmp that reads the function’s address from a fixed location filled in by the loader (0x140002010, for example), and we don’t know where our functions are located, or whether they exist at all, within the new process.
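
Both problems can be read straight out of the instruction bytes in the disassembly. The following Python sketch decodes the relevant operands; rel32_target is a made-up helper, and the addresses and bytes are copied from the objdump listing in this post.

```python
import struct


def rel32_target(instr_addr, instr_len, disp_bytes):
    """Address reached by a RIP-relative operand with a 32-bit displacement."""
    (displacement,) = struct.unpack("<i", disp_bytes)
    # RIP-relative operands are relative to the address of the NEXT instruction.
    return instr_addr + instr_len + displacement


# movabs: the 8 bytes after "48 ba" are an absolute 64-bit address.
(hello_addr,) = struct.unpack("<Q", bytes.fromhex("0030004001000000"))

# call at 0x14000100b (e8 2e 00 00 00) lands on a jmp stub inside .text.
stub_addr = rel32_target(0x14000100B, 5, bytes.fromhex("2e000000"))

# The stub (ff 25 cc 0f 00 00) jumps through a pointer stored outside .text.
pointer_slot = rel32_target(0x14000103E, 6, bytes.fromhex("cc0f0000"))

if __name__ == "__main__":
    print(hex(hello_addr), hex(stub_addr), hex(pointer_slot))
```

The call itself is relative and would survive being moved, but the string address is absolute, and the stub depends on a pointer at 0x140002010 that only exists because the loader filled it in for hello.exe.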

The next part of this tutorial series will cover stack variables (strings and buffers), and one method for position-independent Win API function calls.

