Dynamic Linker
OSv comprises many components, but the dynamic linker is probably one of the most essential ones: it interacts with and ties all other components together and is responsible for bootstrapping an application. In essence, this involves locating an ELF file on the filesystem, loading it into memory using mmap(), processing its headers and segments to relocate symbols, configuring TLS (Thread Local Storage), executing its DT_INIT/DT_INIT_ARRAY functions, loading any dependent ELF objects, and finally starting the app. Please note that unlike Linux, the dynamic linker is an integral part of the OSv kernel. Most of the dynamic linker code is located in core/app.cc, core/elf.cc, arch/x64/arch-elf.cc and arch/aarch64/arch-elf.cc.
elf::program represents the dynamic linker's view of the running program. Typically there is only one instance of it, created by elf::create_main_program(), which is called from loader.cc. The program constructor sets up the program base in memory, initializes _core (an instance of elf::memory_image that represents the ELF of the OSv kernel), and finally sets up a default set of "supplied" modules like libc.so.6, libpthread.so.0, etc. in _modules_rcu. The default main program is stored in the s_program global variable, so effectively elf::program is a singleton, but it is possible to create multiple program instances for new ELF namespaces.
The key methods:
- std::shared_ptr<object> get_library(std::string name, ..) - the main method called by the osv::application constructor and libc/dlfcn.cc:dlopen() to instantiate an elf::object representing a newly loaded ELF. The method delegates the key part of the ELF loading logic to program::load_object(..), then builds the static TLS template, if present, by calling init_static_tls() on the new object, and finally invokes the DT_INIT/DT_INIT_ARRAY functions by delegating to program::init_library().
- std::shared_ptr<elf::object> load_object(std::string name, ..) - locates an ELF file on the filesystem, creates an instance of elf::object to represent it, and finally orchestrates the initialization of the new ELF by invoking a number of key object methods on it: load_segments(), process_headers(), load_needed(), relocate() and fix_permissions(). For example, load_segments() memory-maps all PT_LOAD segments, load_needed() finds and loads all dependent child objects per DT_NEEDED, and relocate() processes all relocations. Please note that even though the new object is a member of _modules_rcu, it is NOT yet visible from the symbol-lookup perspective at this point.
- void init_library(int argc, char** argv) - invokes the DT_INIT/DT_INIT_ARRAY functions on the ELF object and its dependent children in the correct order and eventually makes the new object visible for symbol lookup (please see object::lookup_symbol(const char* name,..)).
- symbol_module lookup(const char* name) - iterates over all objects returned by program::modules_get() (of type program::modules_list) and calls object::lookup_symbol(name) on each, returning a symbol_module tuple for the first occurrence found. The modules_list holds a list of elf::objects maintained in search-priority order and managed as an RCU (Read-Copy-Update) structure.
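The first-match rule of program::lookup() and the visibility check in object::lookup_symbol() can be illustrated with a small toy model. This is only a sketch: toy_object, toy_symbol_module and the free function lookup() are hypothetical stand-ins, a plain std::vector replaces the RCU-managed modules_list, and symbols are reduced to a name-to-address map.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for elf::object; not the OSv type.
struct toy_object {
    std::string name;
    bool visible;  // false while the object is still being initialized
    std::unordered_map<std::string, std::uintptr_t> symbols;

    // Mirrors the idea of object::lookup_symbol(): an invisible object reports nothing.
    const std::uintptr_t* lookup_symbol(const std::string& sym) const {
        if (!visible) {
            return nullptr;
        }
        auto it = symbols.find(sym);
        return it == symbols.end() ? nullptr : &it->second;
    }
};

// Hypothetical stand-in for symbol_module: the defining object plus the symbol itself.
struct toy_symbol_module {
    const toy_object* object;
    const std::uintptr_t* address;
};

// Mirrors the idea of program::lookup(): the first match in search-priority order wins.
toy_symbol_module lookup(const std::vector<toy_object>& modules, const std::string& sym) {
    for (const auto& obj : modules) {
        if (const std::uintptr_t* addr = obj.lookup_symbol(sym)) {
            return {&obj, addr};
        }
    }
    return {nullptr, nullptr};
}
```

With a kernel module listed first, a symbol defined both in the kernel and in a library resolves to the kernel's definition, and a symbol that exists only in a not-yet-visible object is reported as missing.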
elf::object represents an ELF object and implements the logic to load an ELF file into memory and process its headers and relocations.
The key methods are:
- void load_segments() - memory-maps all PT_LOAD segments of the ELF file into memory
- void load_needed() - finds and loads all dependent child objects per DT_NEEDED
- void relocate() - the top-level method that processes the ELF object relocations by calling relocate_rela() and relocate_pltgot()
- void relocate_rela() - iterates over the table of relocation entries per DT_RELA and calls arch_relocate_rela() for each Elf64_Rela*, passing its relocation type (p->r_info & 0xffffffff), the index in the symbol table of the object being relocated (p->r_info >> 32), the address of the relocation (_base + p->r_offset: where to write the relocation value to) and the addend (p->r_addend)
- bool arch_relocate_rela(u32 type, u32 sym, void *addr, Elf64_Sxword addend) - based on the relocation type (type argument), determines the relocation value (the symbol's relocated address, the object module index, or the symbol's st_value for TLS) and writes it to the relocation address (addr argument):
  * R_X86_64_COPY - calls object::symbol_other(sym) to find the symbol in other objects
  * R_X86_64_64 - calls object::symbol(sym, true) to find the symbol in all objects (see below) and calculates the value as symbol.relocated_addr() + addend
  * R_X86_64_RELATIVE - calculates the value as _base + addend
  * R_X86_64_JUMP_SLOT, R_X86_64_GLOB_DAT - calls object::symbol(sym, true) to find the symbol in all objects (see below) and calculates the value as symbol.relocated_addr()
  * R_X86_64_DTPMOD64 - calls object::symbol(sym, true) to find the symbol in all objects (see below) and calculates the value as the module index of the object where the symbol was found; for STN_UNDEF uses the index of this object
  * R_X86_64_DTPOFF64 - (TLS) calculates the offset of the variable in the TLS block of the object where it lives
  * R_X86_64_TPOFF64 - (TLS) calculates the offset of the variable in the static TLS block
- void relocate_pltgot() - called from object::relocate() to process the PLT (Procedure Linkage Table) relocations, mostly for functions; it iterates over the entries in DT_JMPREL and either calls object::arch_relocate_jump_slot() if bind_now is set, or sets up the jump slots to be resolved lazily later (PLT_GOT)
- bool arch_relocate_jump_slot(u32 sym, void *addr, Elf64_Sxword addend, bool ignore_missing) - calls object::symbol(sym, true) to find the symbol in all objects (see below) and writes symbol.relocated_addr() to the relocation jump slot address (addr argument)
- void* resolve_pltgot(unsigned index) - finds the relocation entry under dynamic_ptr<Elf64_Rela>(DT_JMPREL) and its symbol index, finds the symbol by calling object::symbol(), and calls object::arch_relocate_jump_slot() to write the symbol's relocated address
- symbol_module symbol(unsigned idx, bool ignore_missing) - the entry point to symbol lookup; accepts a symbol index, finds its name in the object's symbol table (dynamic_ptr<Elf64_Sym>(DT_SYMTAB)) and searches for the symbol by name in all objects the program knows about by calling program::lookup(name); if the symbol is not found, it aborts when ignore_missing is false and otherwise just warns; returns a symbol_module, a tuple of the object where the symbol resides and the symbol definition (Elf64_Sym *); called by the following methods during the relocation phase: arch_relocate_rela(), arch_relocate_jump_slot() and resolve_pltgot()
- Elf64_Sym* lookup_symbol(const char* name) - looks up a symbol by name by delegating to either lookup_symbol_old or lookup_symbol_gnu; bails out if the object is not visible (during construction)
  * Elf64_Sym* object::lookup_symbol_old(const char* name) - uses the classic SysV hash table (DT_HASH)
  * Elf64_Sym* object::lookup_symbol_gnu(const char* name) - uses the GNU hash table (DT_GNU_HASH)
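The r_info decoding done by relocate_rela() and the value calculation for the simpler relocation types can be sketched as follows. This is a simplified illustration, not the OSv code: rela_entry is a stand-in for Elf64_Rela, the byte vector plays the role of the mapped object, and symbol_addrs is a hypothetical table of pre-resolved addresses that replaces the call to object::symbol(sym, true). The relocation type numbers are the ones defined by the x86-64 psABI.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for Elf64_Rela, with the same three fields.
struct rela_entry {
    std::uint64_t r_offset;  // where to patch, relative to the load base
    std::uint64_t r_info;    // symbol index (high 32 bits), relocation type (low 32 bits)
    std::int64_t  r_addend;
};

// Relocation type numbers from the x86-64 psABI.
constexpr std::uint32_t R_X86_64_64 = 1;
constexpr std::uint32_t R_X86_64_GLOB_DAT = 6;
constexpr std::uint32_t R_X86_64_JUMP_SLOT = 7;
constexpr std::uint32_t R_X86_64_RELATIVE = 8;

inline std::uint32_t rela_type(std::uint64_t info) {
    return static_cast<std::uint32_t>(info & 0xffffffff);
}

inline std::uint32_t rela_sym(std::uint64_t info) {
    return static_cast<std::uint32_t>(info >> 32);
}

// Applies every entry of `table` to `image`, which is assumed to be loaded at `base`.
// symbol_addrs[i] is a hypothetical, already resolved address of symbol i.
// Returns false on an unsupported type or an out-of-range offset.
bool apply_relocations(std::vector<unsigned char>& image, std::uint64_t base,
                       const std::vector<rela_entry>& table,
                       const std::vector<std::uint64_t>& symbol_addrs) {
    for (const auto& r : table) {
        const std::uint32_t sym = rela_sym(r.r_info);
        std::uint64_t value = 0;
        switch (rela_type(r.r_info)) {
        case R_X86_64_RELATIVE:
            value = base + static_cast<std::uint64_t>(r.r_addend);
            break;
        case R_X86_64_64:
            value = symbol_addrs.at(sym) + static_cast<std::uint64_t>(r.r_addend);
            break;
        case R_X86_64_GLOB_DAT:
        case R_X86_64_JUMP_SLOT:
            value = symbol_addrs.at(sym);
            break;
        default:
            return false;  // TLS and COPY relocations are left out of this sketch
        }
        if (r.r_offset + sizeof(value) > image.size()) {
            return false;
        }
        // Equivalent of writing the value to _base + r_offset.
        std::memcpy(image.data() + r.r_offset, &value, sizeof(value));
    }
    return true;
}
```

For instance, an R_X86_64_RELATIVE entry with addend 0x10 applied to an image loaded at 0x400000 stores 0x400010 at the entry's offset, without consulting the symbol table at all.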
osv::application represents a running program, and its _program member points to the program the application was created for.
program::get_library() is the critical point where the dynamic linker gets involved in instantiating new applications.
The main program (kernel?) gets instantiated by elf::create_main_program(), called from loader.cc.
application::new_program() instantiates a new program for a new ELF namespace with a new base address.
Thread local storage (TLS) is a mechanism that allows applications and shared libraries to use variables stored in a memory area specific to a given thread. These include variables marked with the __thread and C++ thread_local modifiers. For TLS variables to work correctly, the OSv dynamic linker needs to recognize TLS segments in an ELF file, construct static TLS blocks in memory, process relevant relocations and provide certain functions like __tls_get_addr, among other things.
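As a reminder of what the linker has to support, here is a minimal sketch of a thread_local variable seen from the application side. The names tls_counter, bump_counter and run_threads are made up for this example; it assumes a toolchain where std::thread is available.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Each thread gets its own copy of this variable, initialized to 0.
thread_local int tls_counter = 0;

// Increments the calling thread's private counter n times and returns its value.
int bump_counter(int n) {
    for (int i = 0; i < n; i++) {
        tls_counter++;
    }
    return tls_counter;
}

// Runs bump_counter() on several threads and collects what each thread observed.
std::vector<int> run_threads(int threads, int n) {
    std::vector<int> results(threads, -1);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; t++) {
        pool.emplace_back([&results, t, n] { results[t] = bump_counter(n); });
    }
    for (auto& th : pool) {
        th.join();
    }
    return results;
}
```

Every worker reports exactly n, and the copy owned by the calling thread stays untouched, because no thread ever sees another thread's instance of tls_counter.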
Before we delve into what OSv dynamic linker does to support TLS, it is important to understand two different formats of static TLS block layout - so-called Variant I and Variant II - and 4 different models of accessing TLS variables: local-exec, initial-exec, general-dynamic and local-dynamic.
A static TLS block is an area of memory allocated for each thread independently, intended to store thread-local variables and built from a template derived when loading the main application and its dependent libraries. The template, in essence, specifies the total size of the TLS block, the offsets in it for each ELF object including the OSv kernel, and any initial values for the TLS variables in those objects. The static TLS does not change once the thread is created and running; that is why it is called "static" after all. In Variant I (used in the AArch64 port) the data is laid out from left to right (local-exec for PIEs, kernel followed by other objects). In Variant II (used in the X86_64 port) the data is laid out from right to left, which is exactly opposite to Variant I.
- Variant I
// (1) - TLS memory area layout with app as shared library
// |------|--------------|-----|-----|-----|
// |<NONE>|KERNEL |SO_1 |SO_2 |SO_3 |
// |------|--------------|-----|-----|-----|
// (2) - TLS memory area layout with PIE or position dependent executable
// |------|--------------|-----|-----|
// | EXE |KERNEL |SO_2 |SO_3 |
// |------|--------------|-----|-----|
- Variant II
// (1) - TLS memory area layout with app as shared library
// |-----|-----|-----|--------------|------|
// |SO_3 |SO_2 |SO_1 |KERNEL |<NONE>|
// |-----|-----|-----|--------------|------|
// (2) - TLS memory area layout with PIE or position dependent executable
// |-----|-----|--------------|------|
// |SO_3 |SO_2 |KERNEL | EXE |
// |-----|-----|--------------|------|
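The difference between the two variants comes down to the direction in which per-object offsets grow. The sketch below computes them for a list of TLS sizes given in load order (kernel first). It is deliberately simplified: alignment, the slot reserved for the executable and the thread control block are ignored, and both function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Variant I: blocks are laid out left to right, so each offset is the sum of
// the sizes that precede it, measured from the start of the TLS area.
std::vector<std::ptrdiff_t> variant1_offsets(const std::vector<std::size_t>& sizes) {
    std::vector<std::ptrdiff_t> offsets;
    std::ptrdiff_t next = 0;
    for (std::size_t s : sizes) {
        offsets.push_back(next);
        next += static_cast<std::ptrdiff_t>(s);
    }
    return offsets;
}

// Variant II: blocks grow right to left from the end of the TLS area, so
// offsets are negative and the first object (the kernel) ends up closest to the end.
std::vector<std::ptrdiff_t> variant2_offsets(const std::vector<std::size_t>& sizes) {
    std::vector<std::ptrdiff_t> offsets;
    std::ptrdiff_t used = 0;
    for (std::size_t s : sizes) {
        used += static_cast<std::ptrdiff_t>(s);
        offsets.push_back(-used);
    }
    return offsets;
}
```

For sizes of 64, 16 and 32 bytes, Variant I yields offsets 0, 64 and 80, while Variant II yields -64, -80 and -112 relative to the end of the area.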
The role of the dynamic linker with respect to TLS handling is to connect the "dots", which at a high level can be divided into 4 phases:
- processing TLS program header to detect the size and other specifics of TLS data,
- processing TLS-related relocations,
- building memory blueprint for TLS - so-called template,
- and finally allocating and initializing TLS blocks for each thread before it is started.
In the 1st phase, as the OSv dynamic linker loads an ELF object in core/elf.cc:std::shared_ptr<elf::object> program::load_object(..), it first mmaps all PT_LOAD segments and then processes all headers (see core/elf.cc:void object::process_headers()) to detect any TLS segment and capture its size, alignment, and address in memory.
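Detecting the TLS segment amounts to scanning the program headers for an entry of type PT_TLS. The sketch below shows the idea on a stand-in for Elf64_Phdr; the program_header and tls_info structures and the find_tls_segment() function are hypothetical and are not the OSv definitions, although the numeric values of PT_LOAD (1) and PT_TLS (7) come from the ELF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for Elf64_Phdr, with the field order from the ELF-64 specification.
struct program_header {
    std::uint32_t p_type;
    std::uint32_t p_flags;
    std::uint64_t p_offset;
    std::uint64_t p_vaddr;
    std::uint64_t p_paddr;
    std::uint64_t p_filesz;
    std::uint64_t p_memsz;
    std::uint64_t p_align;
};

// Segment type values from the ELF specification (lower-case to avoid macro clashes).
constexpr std::uint32_t pt_load = 1;
constexpr std::uint32_t pt_tls = 7;

// What a loader needs to remember about an object's TLS segment.
struct tls_info {
    bool present;
    std::uint64_t start;      // address of the initialized data once the object is loaded
    std::uint64_t filesize;   // bytes of initialized data (.tdata)
    std::uint64_t size;       // total size, including the zero-initialized part (.tbss)
    std::uint64_t alignment;
};

// Scans the headers of an object assumed to be loaded at `base` and captures its PT_TLS entry.
tls_info find_tls_segment(const std::vector<program_header>& headers, std::uint64_t base) {
    tls_info info{false, 0, 0, 0, 0};
    for (const auto& ph : headers) {
        if (ph.p_type == pt_tls) {
            info = tls_info{true, base + ph.p_vaddr, ph.p_filesz, ph.p_memsz, ph.p_align};
        }
    }
    return info;
}
```

The gap between filesize and size is the part of the block that has no initial data in the file and must be zero-filled when a thread's TLS block is created.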
In the 2nd phase, the dynamic linker processes all relocations, including those in both the GOT (Global Offset Table, see core/elf.cc:void object::relocate_rela()) and the PLT (Procedure Linkage Table, see core/elf.cc:void object::relocate_pltgot()). Some of those relocations are TLS-specific and are processed in a specific way:
- R_X86_64_DTPMOD64: to identify the module ID (a unique ID assigned to each ELF object) where the TLS variable referenced by this relocation is located; please note that the R_X86_64_DTPMOD64 relocation may be in one ELF object while the variable's definition and location may be in another one or in the same one; in addition, if index == STN_UNDEF, the relocation applies to hidden (static) TLS variables in the given module and is used to determine its module index, passed later to __tls_get_addr() - the offset in the TLS block is known in advance,
- R_X86_64_DTPOFF64: to calculate the offset in the TLS block of the object where the variable is going to "live"; the values determined for both R_X86_64_DTPMOD64 and R_X86_64_DTPOFF64 are stored in the relocation placeholders in the GOT and then referenced as input to the __tls_get_addr() function provided by the OSv dynamic linker; please note that even though both relocations are intended for the dynamic model based on __tls_get_addr(), the variables can live in either the static TLS or a lazily allocated block if the object was dlopen()-ed,
- R_X86_64_TPOFF64/R_AARCH64_TLS_TPREL64: to calculate the offset of the TLS variable in the static TLS - this means that the variable will "live" in the statically allocated memory area; in addition, each of these relocations triggers a call to void object::alloc_static_tls() to determine the offset of the TLS block of the object owning the variable, which is eventually used to properly build the application TLS template,
- R_AARCH64_TLSDESC: to place the address of the static TLS "resolver" function and its only argument - the offset of the variable in the static TLS.
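The way the two GOT values produced by R_X86_64_DTPMOD64 and R_X86_64_DTPOFF64 are consumed can be sketched with a toy version of a __tls_get_addr()-style function. Everything here is an assumption made for illustration: tls_index_entry, toy_thread and toy_tls_get_addr() are hypothetical names, each module gets a separate block per thread, and a freshly allocated block is zero-filled instead of being copied from the module's initial TLS data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The pair of GOT slots: one filled by the DTPMOD64 relocation, one by DTPOFF64.
struct tls_index_entry {
    std::uint64_t module;  // module ID of the object that defines the variable
    std::uint64_t offset;  // offset of the variable within that object's TLS block
};

// Hypothetical per-thread state: one TLS block per module ID, empty until first use.
struct toy_thread {
    std::vector<std::vector<unsigned char>> blocks;
};

// Returns the address of the variable described by `ti` for thread `t`,
// allocating the module's block for this thread on first access.
void* toy_tls_get_addr(toy_thread& t, const tls_index_entry& ti,
                       const std::vector<std::size_t>& module_tls_sizes) {
    if (ti.module >= t.blocks.size()) {
        t.blocks.resize(ti.module + 1);
    }
    std::vector<unsigned char>& block = t.blocks[ti.module];
    if (block.empty()) {
        // Simplification: zero-fill instead of copying the module's initial TLS image.
        block.assign(module_tls_sizes.at(ti.module), 0);
    }
    assert(ti.offset < block.size());
    return block.data() + ti.offset;
}
```

Two threads asking for the same (module, offset) pair get different addresses, while repeated calls from the same thread return the same one; blocks of modules a thread never touches are never allocated.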
In the 3rd phase, after std::shared_ptr<elf::object> program::load_object(..) completes loading the ELF into memory, the OSv dynamic linker calls void object::init_static_tls() (see std::shared_ptr<object> program::get_library(..)) to build the TLS template for the main app object based on its own TLS segment and those of its dependencies. More specifically, it calculates the total TLS template size (stored in the object::_initial_tls_size member variable) and iterates over all its dependent objects to copy their TLS data to the object::_initial_tls buffer by delegating to the void object::prepare_initial_tls(..) or void object::prepare_local_tls(..) methods, which are arch-specific (AArch64 or X86_64). At the same time, it also stores the offsets of the TLS data for each object in the object::_initial_tls_offsets vector. So in the end init_static_tls() sets 3 key variables - object::_initial_tls, object::_initial_tls_size and object::_initial_tls_offsets.
Finally, for each thread created, its constructor (see core/sched.cc:thread::thread()) allocates the so-called TCB (Thread Control Block), which includes the static TLS block, by delegating to the arch-specific setup_tcb() (see arch/*/arch-switch.hh for details). The static TLS block (thread::_tcb->tls_base) is in essence an instance of the TLS template built in the previous phases. Lastly, the index (_tls) vector is populated with offsets for each module based on object::_initial_tls_offsets.
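The last two phases, building the template once and then stamping out a copy per thread, can be sketched as follows. This is a toy model under stated assumptions: toy_tls_segment, toy_tls_template, build_template() and instantiate_for_thread() are hypothetical names, segments are placed left to right without alignment, and the real arch-specific layout and TCB handling are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one object's TLS segment.
struct toy_tls_segment {
    std::vector<unsigned char> init_data;  // initialized part (.tdata)
    std::size_t memsize;                   // total size, including the zero-filled part (.tbss)
};

// Plays the role of _initial_tls (data), _initial_tls_size (data.size())
// and _initial_tls_offsets (offsets).
struct toy_tls_template {
    std::vector<unsigned char> data;
    std::vector<std::size_t> offsets;
};

// Copies each segment's initial data into the template, zero-fills the rest
// of the segment, and records where the segment starts.
toy_tls_template build_template(const std::vector<toy_tls_segment>& segments) {
    toy_tls_template tmpl;
    for (const auto& seg : segments) {
        assert(seg.init_data.size() <= seg.memsize);
        tmpl.offsets.push_back(tmpl.data.size());
        tmpl.data.insert(tmpl.data.end(), seg.init_data.begin(), seg.init_data.end());
        tmpl.data.resize(tmpl.data.size() + (seg.memsize - seg.init_data.size()), 0);
    }
    return tmpl;
}

// Each new thread receives its own private copy of the template.
std::vector<unsigned char> instantiate_for_thread(const toy_tls_template& tmpl) {
    return tmpl.data;
}
```

Because every thread starts from a copy of the same template, a write made by one thread to its block never changes the initial values that later threads will see.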
- The ELF Specification
- How To Write Shared Libraries
- The ELF format - how programs look from the inside
- ELF loading and dynamic linking
- Introduction to The ELF Format: Series
- The 101 of ELF files on Linux: Understanding and Analysis
- Understanding Symbols Relocation
- Relocations, Relocations
- Executable and Linkable Format 101 Part 3: Relocations
- Dynamic Linking
- ELF symbol resolution
- A Deep dive into (implicit) Thread Local Storage