Compiling and linking
Say we have saved our Hello World program to a file main.cpp. Before we can run it, we need to build it, that is, convert our human-readable C++ code into executable machine code and combine it with other machine code that our code relies on. The build process consists of two steps:
Compilation: Translating each source file (.cpp file) into a corresponding object file (.o file) with machine code.
Linking: Combining multiple object files into a single executable, i.e. a program we can run.
Note
Informally, the entire process of compiling + linking is often referred to simply as “compiling the code”. But we should not forget that the technical term “compilation” actually only refers to the first step. To avoid confusion, it is often useful to say “building the code” when referring to the entire process, as we do here.
We will use the common C++ compiler g++ for building our code.
Compilation: To compile main.cpp, we run the following terminal command:
g++ -c main.cpp
Here the option -c means “just compile, don’t link”. This produces an object file main.o.
Linking: To do the linking step, we pass this object file (main.o) as input to g++:
g++ main.o -o main.exe
Here the option -o main.exe just means that we give the resulting program the name main.exe. (We could have picked any file name, and we didn’t have to use the .exe suffix.)
To run our fantastic program, we do
./main.exe
which should produce the familiar output
Hello, World!
Hooray, we have compiled, linked and run our first C++ program!
Note
In this simple example it may not seem like we’re doing much “linking” at all, since there’s only one object file, right? Well, not really: under the hood, our object file main.o is linked to object files from the standard C++ library.
Compiling and linking in one go
We can do compilation + linking with a single command:
g++ main.cpp -o main.exe
The compiler will then (1) compile main.cpp into a temporary object file, (2) link that object file to create main.exe, and (3) delete the temporary object file. Such shorthand commands are often useful. But it will pay off in the long run to remember that there really are two different things going on here. To quote this nice article by Alex Allain:
Knowing the difference between the compilation phase and the link phase can make it easier to hunt for bugs. Compiler errors are usually syntactic in nature — a missing semicolon, an extra parenthesis. Linking errors usually have to do with missing or multiple definitions. If you get an error that a function or variable is defined multiple times from the linker, that’s a good indication that the error is that two of your source code files have the same function or variable.
We will see this more clearly when we start creating larger projects with many files.
A peek under the hood
We have referred to “machine code” a few times. Do you want to see what it actually looks like, before it’s translated to binary form? You can do this with the compiler option -S, like this:
g++ -S main.cpp
This produces a file main.s with the code in assembly language, a very low-level type of language that deals directly with instructions for the CPU. Open main.s in your text editor and have a look. Don’t worry, we will never need to work with assembly language in this course, but it’s nice to have seen it once.
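If main.s for the Hello World program looks overwhelming, a smaller input may be easier to read. The sketch below uses a hypothetical function add_two, introduced here only for illustration. Saving it to a file of its own (say add.cpp, also an assumed name) and running g++ -S on that file should give a comparatively short assembly file, since no library headers are included.

```cpp
// add.cpp (hypothetical file name): a tiny function for inspecting assembly output

// Adds two integers. In the generated assembly, look for an instruction
// that performs the addition; the exact instruction depends on the CPU
// and on the compiler flags used.
int add_two(int a, int b) {
    return a + b;
}
```

Since this file has no main function, it can be compiled (with -c or -S) but not linked into a program on its own.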