*Concepts you may want to Google beforehand: C, object code, linker, disassemble*

**Goal: Learn to write the same low-level code as we did with assembler, but in C**

Compile
-------

Let's see how the C compiler compiles our code and compare it to the machine code
generated with the assembler.

We will start by writing a simple program, `function.c`, which contains a single function.
Open the file and examine it.
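
If you do not have the file at hand, here is a minimal sketch of what `function.c` could contain. The function name `my_function` and the constant `0xbaba` are assumptions made for illustration; the actual file may differ. There is deliberately no `main`, since we only compile to an object file.

```c
/* function.c: hypothetical contents, the real file may differ. */
int my_function(void) {
    /* An easy-to-spot constant: look for 0xbaba in the disassembly. */
    return 0xbaba;
}
```

A distinctive constant like this makes it easy to find the function body later in the disassembled output.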

To compile code that does not depend on a host operating system or its standard
library, we need the flag `-ffreestanding`, so compile
`function.c` in this fashion:

`i386-elf-gcc -ffreestanding -c function.c -o function.o`
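
In a freestanding build you cannot assume that a C standard library is available, so even small helpers usually have to be written by hand. As a minimal sketch (the name `fill_bytes` is hypothetical and not part of the tutorial's files), such a helper could look like this. It relies only on `<stddef.h>`, one of the few headers the C standard guarantees in a freestanding environment.

```c
#include <stddef.h>  /* size_t; available even in freestanding mode */

/* Hypothetical helper: set `count` bytes starting at `dest` to `value`. */
void *fill_bytes(void *dest, unsigned char value, size_t count) {
    unsigned char *p = dest;
    size_t i;

    for (i = 0; i < count; i++) {
        p[i] = value;
    }
    return dest;
}
```

It can be compiled with the same command as `function.c`, replacing the file names.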

Let's examine the machine code generated by the compiler:

`i386-elf-objdump -d function.o`

Now that is something we recognize, isn't it?

Link
----

Finally, to produce a binary file, we will use the linker. An important part of this
step is to learn how high-level languages refer to function labels. At which offset
will our function be placed in memory? We don't actually know. For this
example, we'll place the offset at `0x0` and use the `binary` format, which
generates machine code without any labels or metadata:

`i386-elf-ld -o function.bin -Ttext 0x0 --oformat binary function.o`

*Note: a warning may appear when linking; you can disregard it.*

Now examine both "binary" files, `function.o` and `function.bin`, using `xxd`. You
will see that the `.bin` file is pure machine code, while the `.o` file also contains a lot
of debugging information, labels, etc.
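
One difference you can spot right at the start of the `xxd` output: an ELF object file begins with the four magic bytes `7f 45 4c 46` (`0x7f` followed by "ELF"), whereas a flat binary starts directly with the first instruction. The sketch below checks for that signature. The function name `looks_like_elf` is hypothetical, and it assumes `function.o` is an ELF object, as the `i386-elf-` toolchain prefix suggests.

```c
#include <stddef.h>

/* Returns 1 if the buffer starts with the ELF magic bytes 0x7f 'E' 'L' 'F',
 * and 0 otherwise (including when the buffer is shorter than 4 bytes). */
int looks_like_elf(const unsigned char *buf, size_t len) {
    return len >= 4 &&
           buf[0] == 0x7f &&
           buf[1] == 'E' &&
           buf[2] == 'L' &&
           buf[3] == 'F';
}
```

Fed the first bytes of each file, this check should succeed for the `.o` file and fail for the `.bin` file.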

Decompile
---------

As a curiosity, we will disassemble the machine code in `function.bin`:

`ndisasm -b 32 function.bin`

More
----

I encourage you to write more small programs, which feature:

- Local variables `localvars.c`
- Function calls `functioncalls.c`
- Pointers `pointers.c`

Then compile and disassemble them, and examine the resulting machine code. Follow
the os-guide.pdf for explanations. Try to answer this question: why does the
disassembly of `pointers.c` not resemble what you would expect? Where is
the ASCII `0x48656c6c6f` for "Hello"?
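
As a starting point, the three programs could be as small as the sketches below. All function and variable names are hypothetical, and the constants are arbitrary values chosen to be easy to find in a disassembly. They are shown in one block for brevity, but each part is meant to live in its own file.

```c
/* localvars.c (hypothetical): a value stored in a local variable */
int local_example(void) {
    int my_var = 0xbaba;
    return my_var;
}

/* functioncalls.c (hypothetical): one function calling another with an argument */
int callee(int arg) {
    return arg;
}

int caller(void) {
    return callee(0xdede);
}

/* pointers.c (hypothetical): a pointer to a string literal */
const char *pointer_example(void) {
    const char *string = "Hello";
    return string;
}
```

Compile, link and disassemble each file with the same commands used for `function.c`, replacing the file names.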