Friday, June 28, 2013
I ended last time by noting that the dynamic linker has three main tasks: determine and load dependencies, relocate the application and its dependencies, and initialize the application and dependencies - and that the key to speeding up all of these is to have fewer dependencies in the application.
Now, we're going to look at the relocation process more thoroughly. First of all, what's going on? What does 'relocation' mean?
I'm by no means an expert in this, but I'm going to venture an attempt at an explanation: After an ELF object has been compiled, it has an entry point address - in other words, an address at which the object is assumed to reside in memory, such that if control is transferred to that address, the ELF object will start executing.
However, there are at least a couple of caveats here. First of all: even if your ELF object has a fixed entry point address, that doesn't mean it will be loaded into actual physical memory at this address. Each process gets its own virtual address space - a 'platonic' memory space that the operating system maps onto physical memory. So the application might get loaded at its entry point address in the virtual address space, but this address will correspond to another address entirely in physical memory.
The second point is that if we're not talking about an executable but rather a dynamic shared object, as we are here (or rather, we have one executable with a potentially high number of dynamic shared objects that are to be associated with it), the entry point address it was given at link time isn't even the address it will end up at in the final process - it will get shifted depending on what the linker determines is the best way to combine the addresses of all participating DSOs. This means that all 'internal' addresses in that object will be shifted by the same amount as well. This is what we mean when we use the term 'relocation' here.
So what we're going to talk about is how the linker accomplishes this relocation - especially the part where it has to reconcile all the load addresses. First, it must be noted that there are two types of dependencies a given DSO can have. For one, you can have dependencies that are located within the same object - which I imagine happens when you create an object file with two functions/subroutines where one depends on the other - and for another, you can have dependencies that come from a different object.
The first kind of dependency is easy to handle, given that you know the 'new' entry point address of the object in question. For each such dependency, you calculate its offset relative to the old entry point, and then simply add this offset to the new entry point.
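To make the arithmetic concrete, here is a minimal sketch in C. The names are illustrative only - these are not the dynamic linker's actual data structures - and it assumes we know both the base the object was linked for and the base it actually got loaded at:

    #include <stdint.h>
    #include <stdio.h>

    /* Rebase an address that lies inside a relocated object: the offset
     * from the object's base is invariant, so we simply recompute the
     * address against the new base. */
    static uintptr_t relocate(uintptr_t old_addr,  /* address recorded at link time */
                              uintptr_t old_base,  /* base assumed at link time */
                              uintptr_t new_base)  /* base actually assigned */
    {
        return new_base + (old_addr - old_base);
    }

    int main(void)
    {
        /* An object linked for base 0x0 ends up loaded at 0x7f0000000000;
         * the internal address 0x1234 then becomes 0x7f0000001234. */
        printf("%#lx\n", (unsigned long)relocate(0x1234, 0x0, 0x7f0000000000));
        return 0;
    }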
The second type of dependency resolution is more involved, and I'm going to talk
about that more the next time.
Tuesday, June 25, 2013
Shared library writeup: Part 3
I ended last time by talking about how the dynamic linker has to run before an ELF file can be executed. In other words: at this point, the dynamic linker has been loaded into the memory space of the process that we want to run, and we're going to run the linker before we can run our actual application.
We cannot run the linker quite yet, however. The linker has to know where it should transfer control to after it has completed whatever it is supposed to do. This is accomplished by the kernel - it puts an auxiliary vector on top of the process' stack. This vector is a name-value array (or in Python terms, a dictionary), and it already contains the values I talked about last time - the PT_INTERP value, which is contained in the p_offset field of the ELF program header. In addition to these, a number of other values are added to the auxiliary vector. The values that can be added like this are defined in the elf.h header file, and have the prefix AT_. After this vector has been set up, control is finally transferred to the dynamic linker. The linker's entry point is defined in the ELF header of the linker, in the e_entry field.
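As an aside, glibc exposes the auxiliary vector to the running program itself through the getauxval() function (declared in <sys/auxv.h>, available since glibc 2.16), so you can peek at a few of these AT_ values. A minimal sketch:

    #include <stdio.h>
    #include <sys/auxv.h>

    int main(void)
    {
        /* AT_ENTRY: entry point of the executable - where the dynamic
         *           linker transfers control once it is done.
         * AT_PHDR:  address of the executable's program header table.
         * AT_BASE:  base address of the dynamic linker itself. */
        printf("AT_ENTRY = %#lx\n", getauxval(AT_ENTRY));
        printf("AT_PHDR  = %#lx\n", getauxval(AT_PHDR));
        printf("AT_BASE  = %#lx\n", getauxval(AT_BASE));
        return 0;
    }

On a system with address space layout randomization you will typically see different values from run to run.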
Trusting the linker
At this point the (dynamic) linker is supposed to do its magic. It has three tasks:
- Determine and load dependencies
- Relocate the application and all dependencies
- Initialize the application and dependencies in the correct order
Monday, June 17, 2013
Shared library writeup: Part 2
I ended last time by talking about how the program header table contains references to the segments of an ELF file. Now, these segments can have various access permissions - some parts are executable but not writable, while others are writable but not executable.
Having a lot of non-writable segments is a good thing, since it means, in addition to data being protected from unintentional or malignant modification, that these segments can be shared if there are several applications that use them.
The way the kernel knows which type each segment has is by reading the program header table, where this information is located. This table is represented by C structs called Elf32_Phdr or Elf64_Phdr.
However, the program header table is not located at a fixed place in an ELF file. The only thing that is fixed is the ELF header, which is always put at offset zero - that is, at the very beginning of the file. (The offset is how many bytes from the beginning of the file something is located.) This header is also represented by a C struct, called Elf32_Ehdr or Elf64_Ehdr (the 32 or 64 refers to whether the computer architecture is 32-bit or 64-bit, respectively - i.e., whether its registers, memory addresses and buses are 32 or 64 bits wide).
Now, the ELF header struct contains several fields that are needed to determine where the program header table is. Writing them all down would essentially mean copy-pasting the article I'm reading, so I won't go down to that level of granularity.
Once the kernel has found the program header table, it can start reading information about each segment. The first thing it needs to know is the segment's type, which is represented by the p_type field of the program header struct. If this field has the value PT_LOAD, it means that the segment is 'loadable'. (Other values this field can have are PT_DYNAMIC, which means the segment contains dynamic linking information, PT_NOTE, which means the segment contains auxiliary notes, et cetera.) If the p_type field has the value PT_LOAD, the kernel must, in addition to knowing where the segment starts, also know how big it is, which is specified in the p_filesz field. There are also a couple of fields that describe where the segment is located in virtual memory. However, the actual address in virtual memory is irrelevant for DSOs that are not linked, since they haven't yet been assigned a specific place in the virtual address space. For executables and so-called 'prelinked' DSOs (DSOs that have been bound to an executable even though they're dynamic), the address is relevant.
However, even though the address in memory is irrelevant for unlinked DSOs, the virtual memory size of the segment (the p_memsz field) is relevant. This is because the memory the segment needs can be larger than the size of the segment in the file. When the kernel loads the segment into memory and the requested memory size is larger than the file size, the extra memory is initialized with zeroes. This is practical if there are so-called BSS sections in the segment. BSS is an old name for a section of data that contains only zero bits. As long as the extra memory is initialized with zeroes, this is a good way to save space - you only need to know how large the BSS section is, add that size to the in-file size of the segment, and the kernel handles the rest. An example of a BSS section is one containing uninitialized global or static variables in C code, since such variables are guaranteed to be zero-initialized in C anyway.
Finally, each segment has a logical set of permissions that is defined in the p_flags field of the program header struct - whether the segment is readable, writable, executable or any combination of the three.
After this, the virtual address space for the ELF executable is set up. However, the executable binary at this point only contains the segments that had the PT_LOAD value in the p_type field. The dynamically linked segments are not yet loaded - they only have an address in virtual memory. Therefore, before execution can start, another program must be executed - the dynamic linker.
The dynamic linker is a program just like the executable we're trying to run, so it has to go through all of the above steps. The difference is that the linker is a complete binary, and it should also be relocatable. Which linker is used is not specified by the kernel - it is contained in a special segment of the ELF file, one whose p_type field has the value PT_INTERP. This segment is just a null-terminated string that specifies which linker to use. And the load address of the linker must not conflict with that of the executable it is run on.
This ends the second part of the writeup. And there's plenty left..
Friday, June 14, 2013
Shared library writeup: Part 1
During my daily work this week, I found myself struggling with shared libraries, linking them, and the various compiler flags needed to make the type of library you want. I decided to actually learn this stuff once and for all, and so I am currently reading "How to write shared libraries" by Ulrich Drepper. I decided this was a perfect opportunity to multitask - both write stuff for the blog and learn something! Especially since you learn much better by writing about it. Hence, this will be the first part of my writeup of Drepper's paper.
In the most abstract, libraries are collections of code gathered into one file for easy reuse. They can be static, meaning that if you want to use the code in a program, the compiler must take the code contained in the library and bake it into the program upon compilation. Alternatively, they can be shared or dynamic, meaning that they are not included in the program at compile time; instead the program contains a reference to the libraries, so that at run time it loads each library and incorporates it.
Nowadays (on Unix-like systems), libraries are handled by the so-called ELF (Executable and Linking Format), a common file format that is used not just for libraries, but for executables and other types of files as well.
Earlier, other formats such as a.out and the Common Object File Format (COFF) were used. The disadvantage with these was that they did not support relocation.
When you have a piece of compiled code (typically in what's called an object file), this file will contain a relocation table. Such a table is a list of pointers to various addresses within that object file, and these addresses are typically given relative to the beginning of the file (which is typically zero). When combining several such object files into one large executable, this object-file-specific list must typically be changed, since the object file is no longer located at 'zero', but rather at some arbitrary point within the new executable. Then, when the executable is to be executed, the addresses are again modified to reflect the actual addresses in RAM. This last part is what is not supported by the old library formats.
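As a toy illustration of that idea - with hypothetical structures far simpler than real relocation records - think of the relocation table as a list of offsets into the object's image, each marking a stored address that must be shifted when the image moves:

    #include <stdint.h>
    #include <stdio.h>

    #define NRELOC 2

    int main(void)
    {
        /* Two stored addresses inside an object image, computed as if
         * the image started at address 0. */
        uint64_t image[NRELOC] = { 0x10, 0x28 };

        /* The toy relocation table: indices of the words that hold
         * addresses and therefore need patching. */
        size_t reloc_table[NRELOC] = { 0, 1 };

        /* The image actually ends up at load_base, so every recorded
         * address is shifted by the same delta. */
        uint64_t load_base = 0x400000;
        for (size_t i = 0; i < NRELOC; i++)
            image[reloc_table[i]] += load_base;

        for (size_t i = 0; i < NRELOC; i++)
            printf("patched[%zu] = %#llx\n", i, (unsigned long long)image[i]);
        return 0;
    }

Real ELF relocation entries (Elf64_Rel/Elf64_Rela) carry more information - a type, a symbol reference, and possibly an addend - but the rebasing principle is the same.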
This essentially means that each library must be given an absolute address in virtual memory upon creation, and that some central authority must keep track of where the various shared libraries are stored. In addition: when we make additions to a library that is supposed to be shared, we don't want to have to tell all the applications that used the old version that the library has changed - as long as the new version still contains all the stuff we need for our application, it should still work for that application without having to re-link the application with the new version of the library. This means that the table that points to where the various parts of the library are located must be kept separate from the actual library, and it must actually keep track of the pointer tables of all the old versions of that library - once a function had been added to a library, its address lasted forever. New additions to a library would just append to the existing table. In short, a.out and COFF were not very practical for use as shared libraries, although they did make the program run fast, since there is no relocation of table pointers at run time.
Enter ELF
ELF is, as mentioned before, a common file type for applications, object files, libraries and more. It is therefore very easy to make a library once you know how to make an application - you just pass an additional compiler flag. The only difference between them is that applications usually have a fixed load address, that is, the (virtual) memory address into which they are loaded upon execution. There is a special class of applications, called Position Independent Executables (PIEs), that don't even have a fixed load address, and for those, the difference between applications and shared libraries is even smaller.
For an application that contains no dynamic components (no shared libraries etc.), execution is straightforward: the application is loaded into memory, then the instruction at the 'entry point' memory address is executed, which should start a chain of events that ends with the termination of the program.
For applications that do contain dynamic components, it is less straightforward: There must be another program that can coordinate the application with the DSOs (Dynamic Shared Objects) before execution of the program starts.
The ELF file structure
ELF files usually contain the following:
- the file header
- the program header table
- the section header table
- sections
The section header table is a table with references to the various sections of the file. The program header table contains references to various groupings of the sections. So you might say that the section header table describes each 'atom' of the file, whereas the program header table collects these atoms into 'molecules' - sensible chunks, called segments, made of sections that work together to form a coherent whole.
End of part 1 of the writeup! And I'm only on page 3 of the paper!