The first part of a three-part series on Windows x64 shellcoding, going from a basic Hello World program in Assembly to implementing guardrails in MSFVenom shellcode.
Part 1 – Assembly and WinAPIs – Hello World
Part 2 – Assembly to Shellcode – Stack Buffers and WinAPIs
Part 3 – Adding extra functionality to MSFVenom
So, to start with we’re going to write a basic Hello World program in x64 ASM to introduce some basic topics, and then use this program to demonstrate some of the difficulties faced when writing shellcode which will then be addressed in part 2.
Hello World, Calling Convention and Windows APIs
Our program is going to write to stdout with the Windows API WriteFile. The C++ function definition from the Microsoft documentation is provided below.
BOOL WriteFile(
[in] HANDLE hFile,
[in] LPCVOID lpBuffer,
[in] DWORD nNumberOfBytesToWrite,
[out, optional] LPDWORD lpNumberOfBytesWritten,
[in, out, optional] LPOVERLAPPED lpOverlapped
);
While I am starting with Hello World, I assume you already have an elementary understanding of how Assembly works. If you’re unfamiliar with the Windows x64 calling convention, it’s worth reading Microsoft’s documentation on it. The main part that’s required is quoted below.
By default, the x64 calling convention passes the first four arguments to a function in registers. The registers used for these arguments depend on the position and type of the argument. Remaining arguments get pushed on the stack in right-to-left order.
Integer valued arguments in the leftmost four positions are passed in left-to-right order in RCX, RDX, R8, and R9, respectively. The fifth and higher arguments are passed on the stack as previously described. All integer arguments in registers are right-justified, so the callee can ignore the upper bits of the register and access only the portion of the register necessary
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
A good example to illustrate this is also provided in the documentation.
func1(int a, int b, int c, int d, int e, int f);
// a in RCX, b in RDX, c in R8, d in R9, f then e pushed on stack
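To make the rule concrete, here is a minimal Python sketch that maps a list of integer arguments to the locations described above. The helper name assign_integer_args is made up for illustration. The stack offsets assume the caller has already reserved the 32 bytes of shadow space that the convention requires, so the fifth argument sits at [rsp+32] immediately before the call instruction.

```python
# Sketch: where integer arguments live under the Windows x64 calling convention.
# Offsets are relative to rsp immediately before the call instruction and
# assume the caller has reserved the 32-byte shadow space at [rsp+0..31].

REGISTERS = ["RCX", "RDX", "R8", "R9"]
SHADOW_SPACE = 32  # bytes reserved for the four register arguments


def assign_integer_args(names):
    """Return {argument name: location} for a list of integer arguments."""
    locations = {}
    for index, name in enumerate(names):
        if index < len(REGISTERS):
            # The first four arguments go in registers, left to right.
            locations[name] = REGISTERS[index]
        else:
            # Later arguments take one 8-byte stack slot each.
            offset = SHADOW_SPACE + 8 * (index - len(REGISTERS))
            locations[name] = f"[rsp+{offset}]"
    return locations


for name, where in assign_integer_args(["a", "b", "c", "d", "e", "f"]).items():
    print(f"{name} -> {where}")
```

Applied to WriteFile, this puts lpOverlapped, the fifth argument, at [rsp+32].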
So back to our example. We want to call the function WriteFile(HANDLE, LPCVOID, DWORD, LPDWORD, LPOVERLAPPED). Using the calling convention, this would look something like this.
mov rcx, <stdOutHandle>
mov rdx, <helloWorldString>
mov r8, <numberOfBytesToWrite>
mov r9, <numberOfBytesWritten>
push <pointerToOverlappedStructure>
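You can hopefully identify the ordering of these arguments based on the previous examples. The exact requirements for each argument can be found in the Windows API documentation. The final three of these arguments can just be integers or NULL. (The documentation does say that lpNumberOfBytesWritten may be NULL only when lpOverlapped is not NULL, so passing NULL for both, as we do in this example, is a shortcut rather than documented behaviour.) For the moment, the hello world string will be defined in a variable in the .data section as shown below: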
section .data
hello: db "Hello, World!",0
If you’re unfamiliar with what I mean by a .data section, I’m referring to the section in an Assembly program where we can store variables. As an example, consider the following diagram of a Portable Executable (PE32) structure. PE32 is the structure that is used for Executable (EXE) and Dynamic Link Library (DLL) files (64-bit files use the PE32+ variant). The important things to note in this diagram are the .text and .data sections.
So later, after we compile our Assembly program, our hello variable will be stored in the .data section and our Assembly code will be stored in .text.
So that covers all variables except for stdOutHandle. For this, we’re going to use another Win API function, GetStdHandle. Luckily this function only requires a single argument and it’s documented nicely in the Windows documentation. The argument selects a standard device: input, output or error. As we’re trying to print to standard out, we’ll use the output value, i.e. -11 (STD_OUTPUT_HANDLE).
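As a quick sanity check, the Python sketch below shows what -11 looks like once it is stored in a register. The 64-bit pattern is the constant that appears in the disassembly later in this post. The helper name twos_complement is invented for illustration.

```python
# Sketch: how a negative constant such as -11 is represented in a register.

STD_OUTPUT_HANDLE = -11  # value documented for GetStdHandle


def twos_complement(value, bits):
    """Return the unsigned bit pattern of value in a register of the given width."""
    return value & ((1 << bits) - 1)


print(hex(twos_complement(STD_OUTPUT_HANDLE, 64)))  # full rcx
print(hex(twos_complement(STD_OUTPUT_HANDLE, 32)))  # as a 32-bit DWORD
```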
If you remember the calling convention listed earlier, we only have one argument so this will be placed in the rcx register. After calling the GetStdHandle function, the returned handle will be stored in rax, which we can then move to rcx for the first argument of the WriteFile function.
After printing the string to standard out, we’re going to call a final Win API function, ExitProcess. Again this only requires a single argument, and we’re going to use 0 as the argument to indicate a successful exit.
So with our understanding of the .data and .text sections, Win API functions, and calling conventions, let’s put all this together now into a final ASM file.
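extern GetStdHandle
extern WriteFile
extern ExitProcess
section .data
hello: db "Hello, World!",0
section .text
global start
start:
    sub rsp, 40             ; 32 bytes shadow space + 8 for the fifth argument
    mov rcx, -11            ; STD_OUTPUT_HANDLE
    call GetStdHandle       ; handle is returned in rax
    mov rcx, rax            ; hFile
    mov rdx, hello          ; lpBuffer
    mov r8, 13              ; nNumberOfBytesToWrite, length of "Hello, World!"
    mov r9, 0               ; lpNumberOfBytesWritten = NULL
    mov qword [rsp+32], 0   ; lpOverlapped = NULL (fifth argument, on the stack)
    call WriteFile
    xor ecx, ecx            ; uExitCode = 0
    call ExitProcess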
The other parts that have not been explained are the initial sub instruction and the extern and global lines. The initial sub instruction reserves the 32 bytes of shadow space that the callee expects plus 8 bytes for the fifth argument, and in doing so leaves the stack 16-byte aligned for the calls that follow. The extern declarations tell the assembler that these functions are defined externally (in other libraries) and will be resolved at link time. The global start label makes start visible to the linker, which allows us to specify it as the entry point of our final EXE.
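The 40 in sub rsp, 40 can be worked out by hand. The following Python sketch does the arithmetic under these assumptions: the function is entered with rsp 8 bytes off a 16-byte boundary (because the call that reached it pushed a return address), 32 bytes of shadow space are always reserved, and every argument past the fourth takes one 8-byte stack slot. The function name frame_size is hypothetical.

```python
# Sketch: how much to subtract from rsp before calling a Windows x64 function.

SHADOW_SPACE = 32  # home space for the four register arguments
SLOT = 8           # size of one stack argument


def frame_size(num_args):
    """Bytes to subtract from rsp so the callee sees a 16-byte aligned stack."""
    stack_args = SLOT * max(0, num_args - 4)
    size = SHADOW_SPACE + stack_args
    # On entry rsp % 16 == 8, so the amount subtracted must also be 8 mod 16.
    if size % 16 != 8:
        size += SLOT
    return size


for n in (1, 5, 6):
    print(n, "argument(s) ->", frame_size(n))
```

WriteFile takes five arguments, which gives 32 + 8 = 40 and needs no extra padding. GetStdHandle and ExitProcess take one argument each, so the same 40 bytes are enough for them too.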
Compilation
Now we need to compile our file. I’m using the Netwide Assembler (NASM). Compilation to an object file is very simple: we just provide the format win64 to define the architecture and we get our object file, hello.obj.
nasm -f win64 hello.asm
But we still need to tell the linker where to find the external functions. If we go back to the Microsoft documentation for WriteFile and scroll to the Requirements section at the bottom of the page, you will find the library and DLL that provide the WriteFile function. In this example the library is Kernel32.lib.
I will be creating the final Hello World executable with the linker included with Microsoft Visual Studio. To find these files you can just search the filesystem for Kernel32.lib and link.exe and it should be straightforward. We provide the linker with our obj file, the entry point, the subsystem argument and the external libraries required for external function calls.
link.exe hello.obj /entry:start /subsystem:console kernel32.lib
The other functions are also in Kernel32.lib, so now we should be able to link and get our final EXE, and run .\hello.exe.
C:\Users\jord\dev\asm>.\hello.exe
Hello, World!
Now that we have our working compiled EXE, let’s try and turn this into shellcode. This will not work, but the purpose here is to explain why, and to identify what we need to fix in part 2 of this series.
Disassembly
Open up Linux or WSL if you prefer and let’s analyse the file with objdump. First we can view the .data section, which as previously discussed holds our hello world string.
$ objdump -s -j .data hello.exe
hello.exe: file format pei-x86-64
Contents of section .data:
140003000 48656c6c 6f2c2057 6f726c64 2100 Hello, World!.
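The hex column in that dump can be checked by hand. As a small Python sketch (the variable names are mine), decoding the bytes gives back the string plus its terminating zero byte, and the length without the terminator is the 13 we passed to WriteFile in r8.

```python
# Sketch: decode the hex column of the .data dump back into the string.

data_hex = "48656c6c 6f2c2057 6f726c64 2100"  # copied from the objdump output
data = bytes.fromhex(data_hex.replace(" ", ""))

# Drop the trailing NUL terminator before decoding.
text = data.rstrip(b"\x00").decode("ascii")

print(repr(data))
print(text, len(text))
```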
We can also dump the .text section of the EXE.
$ objdump hello.exe -d
hello.exe: file format pei-x86-64
Disassembly of section .text:
0000000140001000 <.text>:
140001000: 48 83 ec 28 sub $0x28,%rsp
140001004: 48 c7 c1 f5 ff ff ff mov $0xfffffffffffffff5,%rcx
14000100b: e8 2e 00 00 00 call 0x14000103e
140001010: 48 89 c1 mov %rax,%rcx
140001013: 48 ba 00 30 00 40 01 movabs $0x140003000,%rdx
14000101a: 00 00 00
14000101d: 41 b8 0d 00 00 00 mov $0xd,%r8d
140001023: 41 b9 00 00 00 00 mov $0x0,%r9d
140001029: 48 c7 44 24 20 00 00 movq $0x0,0x20(%rsp)
140001030: 00 00
140001032: e8 0d 00 00 00 call 0x140001044
140001037: 31 c9 xor %ecx,%ecx
140001039: e8 0c 00 00 00 call 0x14000104a
14000103e: ff 25 cc 0f 00 00 jmp *0xfcc(%rip) # 0x140002010
140001044: ff 25 b6 0f 00 00 jmp *0xfb6(%rip) # 0x140002000
14000104a: ff 25 b8 0f 00 00 jmp *0xfb8(%rip) # 0x140002008
The byte values, i.e. 48 83 ec 28 48 c7 c1 f5 ..., are our shellcode. This is the code that the executable will run when we execute .\hello.exe. Have a look at the original Assembly and the disassembly to see how the instructions have changed.
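Copying the byte column by hand gets tedious. Here is a rough Python sketch that pulls the bytes out of objdump -d text. It assumes GNU objdump’s usual layout, where the address, the hex bytes and the mnemonic are separated by tab characters. The function name extract_bytes and the embedded sample (the first few lines of the listing above, with the tabs restored) are for illustration only.

```python
# Sketch: collect the raw instruction bytes from an objdump -d listing.
import string

SAMPLE = (
    "hello.exe:     file format pei-x86-64\n"
    "\n"
    "Disassembly of section .text:\n"
    "\n"
    "0000000140001000 <.text>:\n"
    "   140001000:\t48 83 ec 28          \tsub    $0x28,%rsp\n"
    "   140001004:\t48 c7 c1 f5 ff ff ff \tmov    $0xfffffffffffffff5,%rcx\n"
    "   14000100b:\te8 2e 00 00 00       \tcall   0x14000103e\n"
    "   140001010:\t48 89 c1             \tmov    %rax,%rcx\n"
    "   140001013:\t48 ba 00 30 00 40 01 \tmovabs $0x140003000,%rdx\n"
    "   14000101a:\t00 00 00 \n"
)


def is_hex_byte(token):
    """True for a two-character hex token such as '4c'."""
    return len(token) == 2 and all(c in string.hexdigits for c in token)


def extract_bytes(listing):
    """Return the concatenated instruction bytes found in the listing."""
    out = bytearray()
    for line in listing.splitlines():
        fields = line.split("\t")
        # Instruction lines look like "<address>:\t<hex bytes>\t<mnemonic>".
        # Continuation lines have no mnemonic field but still carry bytes.
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue
        tokens = fields[1].split()
        if all(is_hex_byte(t) for t in tokens):
            out.extend(bytes.fromhex("".join(tokens)))
    return bytes(out)


shellcode = extract_bytes(SAMPLE)
print(" ".join(f"{b:02x}" for b in shellcode))
```

Header lines contain no tab, so they are skipped, and the wrapped second line of the movabs instruction is picked up like any other line.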
You may have already noticed how our hello world string is being handled: movabs $0x140003000,%rdx. The 0x140003000 address corresponds to the address in the dumped .data section. So when we take our shellcode (the bytes from the .text section) and use process injection techniques to inject it into a different process than the original executable, the new process will have something else located at 0x140003000. So the first problem we face when writing shellcode is that, unfortunately, we’re not able to use the .data section to store our variables. The second problem is essentially the same idea: we don’t know where our functions are located, or if they exist at all, within the new process.
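To see which parts of the listing depend on where the image is loaded, it helps to decode a few operands by hand. The Python sketch below uses only bytes and addresses from the disassembly above, and the helper names are invented. The call instructions use a displacement relative to the next instruction, so they survive being moved. The movabs carries the absolute address of hello, and the jmp thunks at the end read function pointers from slots starting at 0x140002000, which is normally the import table that the Windows loader fills in for an EXE. Nothing fills in those slots for injected shellcode.

```python
# Sketch: decode operands from the disassembly to see what is position dependent.
import struct


def rel32_target(insn_addr, insn_len, disp_bytes):
    """Address referenced by a rel32 or RIP-relative operand."""
    (disp,) = struct.unpack("<i", disp_bytes)  # signed 32-bit, little-endian
    return insn_addr + insn_len + disp         # relative to the next instruction


def imm64(imm_bytes):
    """Absolute 64-bit immediate, little-endian."""
    (value,) = struct.unpack("<Q", imm_bytes)
    return value


# call at 14000100b: e8 2e 00 00 00 (5 bytes)
call_target = rel32_target(0x14000100B, 5, bytes.fromhex("2e000000"))

# movabs at 140001013: 48 ba 00 30 00 40 01 00 00 00
hello_addr = imm64(bytes.fromhex("0030004001000000"))

# jmp at 14000103e: ff 25 cc 0f 00 00 (6 bytes)
pointer_slot = rel32_target(0x14000103E, 6, bytes.fromhex("cc0f0000"))

print(hex(call_target), hex(hello_addr), hex(pointer_slot))
```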
The next part of this tutorial series will cover stack variables (strings and buffers), and one method for position-independent Win API function calls.
Windows x64 Shellcoding – Part 2