The time comes in every programmer’s life where they look back on their career and they stop and they think, “It’s time to learn some assembler.”
This time has come for me.
First I need to choose an assembler. It seems there are a few available. I decided to go with nasm for no reason other than I needed something. If the assembler bug hits me big, I’ll step back and spend some time comparing the different options somewhere.
All my assembly here will be run on a Linux-based x86_64 machine.
Let’s start with a basic Hello World.
First, install nasm. It’s likely available in every distribution’s package manager. Create a folder somewhere for the project, then create and edit a file called greetings.asm.
We’ll start with a comment:
; Here we do amazing things, like say hello.
Now, onto the code.
An executable has several sections inside it. Most compilers for other languages handle this for us, however in assembler we need to set them up ourselves.
The first section is the data section. Here we set up any static data that we may want to use while the program runs. The section is defined like:
section .data
We want to store some text in this section that we can use to offer some form of salutation to the kind person who wishes to run our program. To do this we need to use the pseudo instruction db. A pseudo instruction is not an instruction that gets turned into machine code; it is an instruction to the assembler to do something - in this case, store some bytes at this position in the executable.
greetings db "Greetings earthlings!", 10
The first word, greetings, is a label we can use to refer to the address of this text in memory.
The text we are storing is the text “Greetings earthlings!”, followed by the character 10 - which is a line feed.
The next section we need is text. This is the main part that contains the code to run.
The first thing we have to do is create a global symbol that tells the linker which part of our code should run first:
global _start
Then we need to insert the label into our code:
_start:
You can insert labels throughout your code, which is useful when you want to jump to that position and have the CPU continue running code from there. The
_start label is the only one that you absolutely have to insert - and it has to be
So how do we output text to stdout? We need to run a system call. System calls are basically instructions to the operating system to get it to do stuff. In our case we want to get it to write some text to standard out for us.
Each syscall is identified by a number. You can find the list of available syscalls together with their numbers in unistd_64.h, most likely found on your filesystem under /usr/include/asm. For example, we want to write some data, so this line is of interest:
#define __NR_write 1
We want the write syscall, which is number 1.
At this point it may be worth running man syscall to get the lowdown on syscalls, in particular the section on architecture calling conventions. Calling syscalls is slightly different depending on your architecture. For us on x86_64, this is what we need:
Arch/ABI    Instruction    System call #    Ret val    Ret val2    Error    Notes
───────────────────────────────────────────────────────────────────────────────────
x86-64      syscall        rax              rax        rdx         -        5
This tells us that we need to pass the system call number in the rax register. There are a number of different registers on the x86_64. A register is a tiny storage location that the CPU can access very quickly. If accessing main memory was a summer trip around the world, accessing a register would be popping over to your neighbour’s to feed their cat. So we try to pass as much data around as possible via registers.
So, we know we need to store the syscall number (1) in the
rax register to make our syscall. Let’s write a proper
line of code!
mov rax, 1
This line says ‘move the value 1 into the rax register’.
We want to pass more information to the write syscall. Have a look at the man syscall page again. This shows us how to pass more parameters to syscalls.
Arch/ABI    arg1    arg2    arg3    arg4    arg5    arg6    arg7    Notes
──────────────────────────────────────────────────────────────────────────
x86-64      rdi     rsi     rdx     r10     r8      r9      -
The first parameter goes in the
rdi register, second in
rsi and so on.
What is the first parameter to write? Syscalls are documented in Section 2 of man. We can look up more details with man 2 write. Here we see the following C function definition:
ssize_t write(int fd, const void *buf, size_t count);
Seeing the C definition is as good as it gets for us assembler programmers, it seems!
We can see the first parameter is the file descriptor. We know that the fd of stdout is 1. So we can put that in rdi - the register for our first parameter, as indicated above.
mov rdi, 1
The next parameter is a pointer to the buffer to write. We set this pointer up in the
data section and gave it the label
greetings. We can use that here.
mov rsi, greetings
The final parameter is the number of bytes we want to write. We can count this - “Greetings earthlings!” has 21 characters, and the trailing newline makes 22 bytes.
mov rdx, 22
Yes! All our parameters are set up. Let’s make the syscall:
syscall
We have to make one more syscall to tell the OS that we are now safely and happily exiting. This is done with the exit syscall, defined as:
#define __NR_exit 60
And from man 2 exit:
noreturn void _exit(int status);
Typically a status of 0 means we are exiting successfully. So we know what to do:
mov rax, 60
mov rdi, 0
syscall
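To see the exit status in isolation, here is a minimal sketch of a complete program that does nothing but exit. The status value 3 is an arbitrary choice made for this illustration, not something taken from the steps above.

```nasm
; Minimal sketch: a program that only exits.
section .text
global _start

_start:
    mov rax, 60     ; exit system call
    mov rdi, 3      ; arbitrary non-zero status, chosen only for illustration
    syscall
```

Once it has been built with the steps described further down, running it and then typing echo $? in the shell should report 3.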
The full code
; Here we do amazing things, like say hello.

section .data

greetings db "Greetings earthlings!", 10

section .text

global _start

_start:
    mov rax, 1          ; write system call
    mov rdi, 1          ; fd for stdout
    mov rsi, greetings  ; pointer to our text to write
    mov rdx, 22         ; 22 bytes: 21 characters plus the newline
    syscall

    mov rax, 60         ; exit system call
    mov rdi, 0          ; exit code 0
    syscall
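One possible variation, sketched here as an assumption rather than as part of the listing above: instead of counting the bytes by hand, we can let nasm work out the length. The equ pseudo instruction defines a constant, and $ stands for the current position in the assembly, so $ - greetings is the number of bytes between the label and that point. The name greetings_len is made up for this example.

```nasm
; Variation on the full listing: the assembler computes the length.
section .data

greetings     db "Greetings earthlings!", 10
greetings_len equ $ - greetings     ; hypothetical name; evaluates to 22 here

section .text

global _start

_start:
    mov rax, 1              ; write system call
    mov rdi, 1              ; fd for stdout
    mov rsi, greetings      ; pointer to our text to write
    mov rdx, greetings_len  ; length worked out by the assembler
    syscall

    mov rax, 60             ; exit system call
    mov rdi, 0              ; exit code 0
    syscall
```

The equ line has to come directly after the db line, because $ measures from wherever it appears. With this in place, changing the greeting text no longer means recounting the bytes.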
We need to build this thing. Building comes in two stages, assembling and linking. nasm handles the assembling for us.
Run this to assemble:
nasm -f elf64 -g -F stabs greetings.asm
This creates the object file, greetings.o, which is not quite executable. We need to link it. For this we can use ld:
ld greetings.o -o greetings
This should produce a beautiful executable called
greetings that we can run:
╰─$ ./greetings
Greetings earthlings!
To make things a tiny bit easier we can put the build instructions into a Makefile:
all: greetings.o
	ld greetings.o -o greetings

greetings.o: greetings.asm
	nasm -f elf64 -g -F stabs greetings.asm
Now, instead of running two commands to build your changes, you can just run make.
This assembly stuff seems pretty easy so far…