I ended last time by saying that the dynamic linker has three main tasks:
determining and loading dependencies, relocating the application and its
dependencies, and initializing the application and its dependencies. The key to
speeding up all of these is to have fewer dependencies in the application.
Now, we're going to look at the relocation process more thoroughly. First of all,
what's going on? What does 'relocation' mean?
I'm by no means an expert in this, but I'm going to venture an attempt at an
explanation: After an ELF object has been compiled, it has an entry point
address - in other words, the memory address at which execution begins once the
object sits where it expects to be loaded. If control is transferred to that
address, the ELF object will start executing.
However, there are at least a couple of caveats here. First of all, even if your
ELF object has a fixed entry point address, that doesn't mean it will be loaded
into actual physical memory at this address. Each process gets its own
virtual memory space, a 'platonic' address space that the system maps onto
physical memory. So the application might get loaded at the entry point address
of the virtual memory space, but this address will correspond to another address
entirely in physical space.
The second point is that if we're not talking about an executable, but rather a
dynamic shared object (DSO), as we are here (or rather, we have one executable
with a potentially large number of DSOs that are to be associated with it), the
entry point address recorded in the object isn't even the address it will end up
with in the running process - it will get shifted depending on how the dynamic
linker decides to arrange the addresses of all participating DSOs.
This means that all 'internal' addresses in that object will be shifted by the
same amount as well. This is what we're currently talking about when we use the
term 'relocation'.
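To convince myself of the 'shifted by the same amount' part, here is a minimal
sketch in Python. It is a toy model, not how a real dynamic linker is written:
the symbol names, the addresses and the relocate() helper are all made up for
illustration.

```python
# Toy model: the addresses an object was built with.
# All names and values here are hypothetical, chosen only for illustration.
BUILD_TIME_BASE = 0x0000
symbols = {"entry": 0x1000, "helper": 0x1040, "table": 0x2000}


def relocate(symbols, build_base, load_base):
    """Shift every internal address by the same amount."""
    shift = load_base - build_base
    return {name: addr + shift for name, addr in symbols.items()}


# Pretend the object ended up at this (made-up) base address instead.
relocated = relocate(symbols, BUILD_TIME_BASE, 0x7F0000000000)

for name, addr in relocated.items():
    print(f"{name}: {addr:#x}")
```

The point of the sketch is that the distances between addresses inside the
object stay the same; only the starting point moves.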
So the thing we're going to talk about is how the linker accomplishes this
relocation - and especially the part where it has to synchronize all the load
addresses, etc.
First, it must be noted that there are two types of dependencies a given DSO
can have. For one, you can have dependencies that are located within
the same object - which I imagine happens when you create an object file with
two functions/subroutines and one of them depends on the other - and for
another, you can have dependencies that come from a different object.
The first kind of dependency is easy to handle, given that you know the 'new'
entry point address of the object in question. For each such dependency, you
calculate its relative offset from the entry point, and then simply add this
offset to the new entry point.
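The calculation described above can be sketched in a few lines of Python. Again,
this is only my own illustration under simple assumptions: rebase(), the
addresses and the 'f calls g' scenario are hypothetical, and a real linker works
on relocation records in the ELF file rather than on a dictionary.

```python
def rebase(address, old_entry, new_entry):
    """Return where `address` ends up once the object has been moved."""
    offset = address - old_entry   # relative offset from the old entry point
    return new_entry + offset      # same offset, applied to the new entry point


# Hypothetical object: function f refers to function g in the same object.
old_entry = 0x1000
new_entry = 0x5000
internal_refs = {"f_calls_g": 0x1200}  # address of g as seen before the move

patched = {
    name: rebase(target, old_entry, new_entry)
    for name, target in internal_refs.items()
}

print(hex(patched["f_calls_g"]))
```

Nothing outside the object needs to be consulted here, which is why this kind of
dependency is the easy one.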
The second type of dependency resolution is more involved, and I'm going to talk
more about that next time.