Chapter 8

Thread-Local Storage

The compilation environment supports the declaration of thread-local data. This data is sometimes referred to as thread-specific or thread-private data, but more typically by the acronym TLS. When variables are declared thread-local, the compiler automatically arranges for them to be allocated on a per-thread basis.

The built-in support for this feature serves three purposes:

  • It provides a foundation upon which the POSIX interfaces for allocating thread-specific data are built.

  • It offers a more convenient and more efficient mechanism for direct use by applications and libraries.

  • It allows compilers to allocate TLS as necessary when performing loop-parallelizing optimizations.
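As a rough illustration of the second point, the sketch below keeps a per-thread counter twice: once through the POSIX thread-specific data interfaces, and once through the __thread keyword. The names counter_key, make_key, bump_posix, and bump_tls are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* POSIX thread-specific data: one key for the process, one value per thread. */
static pthread_key_t counter_key;
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() is called on each thread's non-NULL value when that thread exits. */
    int rc = pthread_key_create(&counter_key, free);
    assert(rc == 0);
}

/* Increments and returns the calling thread's counter (POSIX style). */
int bump_posix(void)
{
    pthread_once(&counter_once, make_key);
    int *p = pthread_getspecific(counter_key);
    if (p == NULL) {
        /* First use in this thread: allocate and register the value. */
        p = calloc(1, sizeof *p);
        assert(p != NULL);
        int rc = pthread_setspecific(counter_key, p);
        assert(rc == 0);
    }
    return ++*p;
}

/* The same per-thread counter, declared with the __thread keyword. */
static __thread int tls_counter;

/* Increments and returns the calling thread's counter (__thread style). */
int bump_tls(void)
{
    return ++tls_counter;
}
```

The __thread version needs no key management, no allocation, and no destructor, which is the convenience the list above refers to.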

C/C++ Programming Interface

Variables are declared thread-local using the __thread keyword, as in the following examples:

__thread int i;
__thread char *p;
__thread struct state s;

During loop optimizations, the compiler may choose to create thread-local temporaries as needed.

Applicability

The __thread keyword may be applied to any global, file-scoped static, or function-scoped static variable. It has no effect on automatic variables, which are always thread-local.
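The following sketch applies __thread to each of the three storage kinds named above and shows that a new thread works on its own copies. The names global_hits, file_hits, record_hit, worker, and hits_seen_by_new_thread are hypothetical, and the example assumes a POSIX threads environment.

```c
#include <assert.h>
#include <pthread.h>

__thread int global_hits;            /* global */
static __thread int file_hits;       /* file-scoped static */

/* Updates only the calling thread's copies and returns its local count. */
int record_hit(void)
{
    static __thread int local_hits;  /* function-scoped static */
    global_hits++;
    file_hits++;
    return ++local_hits;
}

/* Thread entry: calls record_hit() three times and stores the last result. */
static void *worker(void *arg)
{
    int *out = arg;
    record_hit();
    record_hit();
    *out = record_hit();
    return NULL;
}

/* Runs worker() in a new thread and returns the count that thread observed. */
int hits_seen_by_new_thread(void)
{
    pthread_t t;
    int seen = 0;
    int rc = pthread_create(&t, NULL, worker, &seen);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return seen;
}
```

If the calling thread has already called record_hit() once, the new thread still counts from one to three, and the caller's next call returns two.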

Initialization

In C++, a thread-local variable may not be initialized if the initialization requires a static constructor. Otherwise, a thread-local variable may be initialized to any value that would be legal for an ordinary static variable.

No variable, thread-local or otherwise, may be statically initialized to the address of a thread-local variable.
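A minimal C sketch of these rules follows. The identifiers retries, cur_state, retries_ptr, and get_retries_ptr are hypothetical. The rejected declaration is shown only in a comment, and the run-time assignment is one possible workaround rather than a prescribed idiom.

```c
#include <assert.h>
#include <stddef.h>

struct state { int mode; const char *name; };

/* Legal: constant initializers, exactly as for ordinary static variables. */
__thread int retries = 3;
__thread struct state cur_state = { 1, "idle" };

/*
 * Illegal: the address of a thread-local variable is not a constant,
 * because each thread has its own copy.
 *
 *     int *bad = &retries;
 */

/* Workaround: take the address at run time, once per thread. */
static __thread int *retries_ptr;    /* NULL in every new thread */

/* Returns a pointer to the calling thread's copy of retries. */
int *get_retries_ptr(void)
{
    if (retries_ptr == NULL)
        retries_ptr = &retries;
    return retries_ptr;
}
```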

Binding

Thread-local variables may be declared and referenced externally, and they are subject to the same interposition rules as normal symbols.

Dynamic loading restrictions

A shared library can be dynamically loaded during process startup, or after process startup via lazy loading, filters, or dlopen(3DL). A shared library that contains a reference to a thread-local variable may be loaded post-startup if every translation unit containing the reference is compiled with a dynamic TLS model.

Static TLS models generate faster code. However, code compiled to use a static model cannot reference thread-local variables in libraries that are dynamically loaded after startup. A dynamic TLS model can reference all TLS. These models are described in Thread-Local Storage Access Models.
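How a model is selected depends on the compiler, and this section does not specify a mechanism. As one illustration only, GCC and Clang accept a tls_model attribute on individual variables. The sketch below assumes one of those compilers, and the variable and function names are hypothetical. In GCC's terminology, "initial-exec" is a static model and "global-dynamic" is a dynamic model.

```c
#include <assert.h>

/* No attribute: the compiler picks a model from its options (for example -fPIC). */
__thread int any_object_counter;

/* Static model requested explicitly: intended for objects loaded at startup. */
__thread int startup_only_counter __attribute__((tls_model("initial-exec")));

/* Dynamic model requested explicitly: also usable from post-startup loaded libraries. */
__thread int dlopen_safe_counter __attribute__((tls_model("global-dynamic")));

/* Increments all three counters and returns their sum for the calling thread. */
int bump_all(void)
{
    return ++any_object_counter + ++startup_only_counter + ++dlopen_safe_counter;
}
```

All three variables behave identically from the program's point of view. Only the generated access sequence, and therefore the loading restriction, differs.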

Address-of operator

The address-of operator, &, can be applied to a thread-local variable. This operator is evaluated at runtime, and returns the address of the variable within the current thread. The address obtained by this operator may be used freely by any thread in the process as long as the thread that evaluated the address remains in existence. When a thread terminates, any pointers to thread-local variables in that thread become invalid.

When dlsym(3DL) is used to obtain the address of a thread-local variable, the address returned is the address of the instance of that variable in the thread that called dlsym().
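The behavior of the address-of operator can be sketched as follows, assuming a POSIX threads environment. The creating thread passes the address of its own copy to a worker, which writes through that pointer and compares it with the address of its own copy. The names slot, struct probe, worker, and probe_addresses are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static __thread int slot;

struct probe {
    int *creator_slot;   /* in: &slot as evaluated by the creating thread */
    int differs;         /* out: 1 if the worker's &slot is a different address */
};

/* Thread entry: uses the creator's pointer while the creator is still alive. */
static void *worker(void *arg)
{
    struct probe *pr = arg;
    *pr->creator_slot = 99;                     /* writes the creator's copy */
    pr->differs = (&slot != pr->creator_slot);  /* &slot here is the worker's copy */
    return NULL;
}

/* Returns 1 if a new thread sees a different address for slot than the caller. */
int probe_addresses(void)
{
    struct probe pr = { &slot, 0 };   /* & is evaluated at run time, in this thread */
    pthread_t t;
    int rc = pthread_create(&t, NULL, worker, &pr);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return pr.differs;
}
```

The comparison is made inside the worker on purpose: once the worker has terminated, any pointer to its copy of slot is invalid and should not be used.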

Thread-Local Storage Section

Separate copies of thread-local data, allocated at compile-time, must be associated with individual threads of execution. To provide this data, TLS sections are used to specify the size and initial contents.

The compilation environment allocates TLS in sections identified with the SHF_TLS flag. These sections provide initialized and uninitialized TLS based on how the storage is declared:

  • An initialized thread-local variable is allocated in a .tdata or .tdata1 section. This initialization may require relocation.

  • An uninitialized thread-local variable is defined as a COMMON symbol. The resulting allocation is made in a .tbss section.

The uninitialized section is allocated immediately following any initialized sections, subject to padding for proper alignment. Together, the combined sections form a TLS template that is used to allocate TLS whenever a new thread is created.

The initialized portion of this template is called the TLS initialization image. All relocations generated as a result of initialized thread-local variables are applied to this template. These relocated values are then used when a new thread requires the initial values.
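The effect of the template is visible from ordinary code: every new thread starts with the values from the initialization image, regardless of what other threads have done to their own copies. The sketch below assumes a POSIX threads environment, and the names budget, spent, worker, and initial_values_in_new_thread are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

__thread int budget = 42;   /* initialized: part of the TLS initialization image */
__thread int spent;         /* uninitialized: zero in every new thread */

/* Thread entry: reports the values this thread started with. */
static void *worker(void *arg)
{
    int *out = arg;
    out[0] = budget;
    out[1] = spent;
    return NULL;
}

/* Modifies the caller's copies, then records what a new thread starts with. */
void initial_values_in_new_thread(int out[2])
{
    budget = 7;             /* changes only the calling thread's copy */
    spent = 35;
    pthread_t t;
    int rc = pthread_create(&t, NULL, worker, out);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
}
```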

TLS symbols have the symbol type STT_TLS. These symbols are assigned offsets relative to the beginning of the TLS template. The actual virtual address associated with these symbols is irrelevant. The address refers only to the template, and not to the per-thread copy of each data item.

In dynamic executables and shared objects, the st_value field of a STT_TLS symbol contains the assigned offset for defined symbols, or zero for undefined symbols.
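A tool that reads symbol tables might apply these rules as in the sketch below. It assumes a 64-bit ELF object and the type and macro definitions from a Linux-style <elf.h>; on other systems the header name may differ. The helper names is_tls_symbol and tls_template_offset are hypothetical.

```c
#include <assert.h>
#include <elf.h>

/* Returns 1 if sym has symbol type STT_TLS, 0 otherwise. */
int is_tls_symbol(const Elf64_Sym *sym)
{
    return ELF64_ST_TYPE(sym->st_info) == STT_TLS;
}

/*
 * Returns the symbol's offset from the beginning of its object's TLS
 * template, or -1 if sym is not a defined TLS symbol.
 */
long tls_template_offset(const Elf64_Sym *sym)
{
    if (!is_tls_symbol(sym) || sym->st_shndx == SHN_UNDEF)
        return -1;
    /* For STT_TLS, st_value is a template offset, not a virtual address. */
    return (long)sym->st_value;
}
```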

Several relocations are defined to support access to TLS. See SPARC: Thread-Local Storage Relocation Types and x86: Thread-Local Storage Relocation Types. Symbols of type STT_TLS are only referenced by TLS relocations. TLS relocations only reference symbols of type STT_TLS.

In dynamic executables and shared objects, a PT_TLS program entry describes a TLS template, and has the following members:

Table 8-1 ELF PT_TLS Program Header Entry

Member      Value
p_offset    File offset of the TLS initialization image
p_vaddr     Virtual memory address of the TLS initialization image
p_paddr     Reserved
p_filesz    Size of the TLS initialization image
p_memsz     Total size of the TLS template
p_flags     PF_R
p_align     Alignment of the TLS template
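The members above are enough to locate the template and to split it into its initialized and zero-filled parts. The sketch below assumes 64-bit ELF structures from a Linux-style <elf.h> and a program header table that is already in memory. The helper names find_tls_segment and tls_zero_fill_size are hypothetical.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Returns the PT_TLS entry of a program header table, or NULL if there is none. */
const Elf64_Phdr *find_tls_segment(const Elf64_Phdr *phdr, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (phdr[i].p_type == PT_TLS)
            return &phdr[i];
    }
    return NULL;
}

/*
 * Returns the number of template bytes that follow the initialization
 * image: the uninitialized data plus any alignment padding.
 */
Elf64_Xword tls_zero_fill_size(const Elf64_Phdr *tls)
{
    assert(tls->p_memsz >= tls->p_filesz);
    return tls->p_memsz - tls->p_filesz;
}
```

For a template with p_filesz of 24 and p_memsz of 64, the first 24 bytes are copied from the initialization image and the remaining 40 bytes start out as zero.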

Runtime Allocation of Thread-Local Storage

TLS is created on three occasions during the lifetime of a program:

  • At program startup.

  • When a new thread is created.

  • When a thread references a TLS block for the first time after a shared library is loaded following program startup.

Thread-local data storage is laid out at runtime as illustrated in Figure 8-1.

Figure 8-1 Runtime Storage Layout of Thread-Local Storage

[Figure: runtime thread-local storage layout]

Program Startup

At program startup, the runtime system creates TLS for the main thread.

First, the runtime linker logically combines the TLS templates for all loaded dynamic objects, including the dynamic executable, into a single static template. Each dynamic object's TLS template is assigned an offset within the combined template, tlsoffset_m, as follows:

  • tlsoffset_1 = round(tlssize_1, align_1)

  • tlsoffset_{m+1} = round(tlsoffset_m + tlssize_{m+1}, align_{m+1})

Here tlssize_m and align_m are the size and alignment, respectively, of the allocation template for dynamic object m (1 <= m <= M, where M is the total number of loaded dynamic objects). The round(offset, align) function returns offset rounded up to the next multiple of align. The TLS template is placed immediately preceding the thread pointer tp_t. TLS data is accessed by subtracting offsets from tp_t.

Next, the runtime linker computes the total startup TLS allocation size, tlssize_S, which is equal to tlsoffset_M.
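The offset recurrence can be worked through with a short sketch. The function names tls_round and tls_assign_offsets are hypothetical, and the code assumes that every alignment is a nonzero power of two, as is usual for ELF segment alignments.

```c
#include <assert.h>
#include <stddef.h>

/* round(offset, align): offset rounded up to the next multiple of align. */
size_t tls_round(size_t offset, size_t align)
{
    /* Assumption: align is a nonzero power of two. */
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Fills tlsoffset[0..count-1] from tlssize[] and align[] using the
 * recurrence above. Returns the total startup allocation size, which
 * is the last offset assigned (or 0 when count is 0).
 */
size_t tls_assign_offsets(const size_t *tlssize, const size_t *align,
                          size_t *tlsoffset, size_t count)
{
    size_t prev = 0;    /* offset of the previous object; 0 before the first */
    for (size_t m = 0; m < count; m++) {
        prev = tls_round(prev + tlssize[m], align[m]);
        tlsoffset[m] = prev;
    }
    return prev;
}
```

For three objects with sizes 20, 10, and 64 and alignments 8, 4, and 16, the offsets are round(20, 8) = 24, round(24 + 10, 4) = 36, and round(36 + 64, 16) = 112, so the total startup allocation is 112 bytes.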

The runtime linker then constructs a linked list of initialization records. Each record in this list describes the TLS initialization image for one loaded dynamic object, and contains the following fields:

  • A pointer to the TLS initialization image.

  • The size of the TLS initialization image.

  • The tlsoffset_m of the object.

  • A flag indicating whether the object uses a static TLS model.

The thread library uses this information to allocate storage for the initial thread. This storage is initialized, and a dynamic TLS vector for the initial thread is created.
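A simplified model of this step is sketched below. The record layout, its field names, and the function tls_build_static_block are hypothetical; the actual structures used by the runtime linker and thread library are not described here. The sketch covers only the static template, relies on calloc() for alignment, and omits the dynamic TLS vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical initialization record: one per loaded dynamic object. */
struct tls_init_record {
    struct tls_init_record *next;
    const void *image;       /* pointer to the TLS initialization image */
    size_t image_size;       /* size of the TLS initialization image */
    size_t tlsoffset;        /* offset assigned to the object's template */
    int uses_static_model;   /* nonzero if the object uses a static TLS model */
};

/*
 * Allocates a zero-filled block of total_size bytes, copies each image to
 * (tp - tlsoffset), and returns tp, the address just past the block.
 * Returns NULL on allocation failure. The caller releases the block with
 * free(tp - total_size).
 */
unsigned char *tls_build_static_block(const struct tls_init_record *list,
                                      size_t total_size)
{
    unsigned char *base = calloc(1, total_size ? total_size : 1);
    if (base == NULL)
        return NULL;

    unsigned char *tp = base + total_size;   /* template precedes tp */
    for (const struct tls_init_record *r = list; r != NULL; r = r->next) {
        assert(r->tlsoffset <= total_size && r->image_size <= r->tlsoffset);
        if (r->image_size != 0)
            memcpy(tp - r->tlsoffset, r->image, r->image_size);
        /* Bytes after the image stay zero: the uninitialized data. */
    }
    return tp;
}
```

With two records at offsets 8 and 16 in a 16-byte block, the first image lands at tp - 8 and the second at tp - 16, and any bytes not covered by an image remain zero.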
