The descriptions below make use of the following definitions:
operator delete
) for T.
class A : public B, public C, A is walked first,
then B and its subobjects,
and then C and its subobjects.)
this) or other parts of the environment
before transferring control to the target function,
and possibly making further modifications after its return.
A thunk may contain as little as an instruction to be executed prior to
falling through to an immediately following target function,
or it may be a full function with its own stack frame that does
a full call to the target function.
longjmp_unwind; or
There are two principal reasons for this:
An implementation shall place its standard support library in a DSO
named libcxa.so on Itanium systems,
or in auxiliary DSOs automatically loaded by it.
It shall place implicit compiler support
in a library separate from the standard support library,
with any external names chosen to avoid conflicts between vendors
(e.g. by including a vendor identifier as part of the names).
This allows a program to function properly if linked with the
target's standard support library and the implicit compiler support
libraries from any implementations used to build components.
For purposes internal to the specification, we also specify:
The size and alignment of a type which is a POD for the purpose of layout is as specified by the base (C) ABI. Type bool has size and alignment 1. All of these types have data size and non-virtual size equal to their size. (We ignore tail padding for PODs because the Standard does not allow us to use it for anything else.)
A pointer to data member is an offset from the base
address of the class object containing it,
represented as a ptrdiff_t.
It has the size and alignment attributes of a ptrdiff_t.
A NULL pointer is represented as -1.
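The representation described above can be observed directly. The following is a minimal sketch, assuming a compiler that follows this ABI (for example GCC or Clang on Linux); the Point type and the raw_bits helper are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical example type; not part of the ABI.
struct Point {
    int x;
    int y;
};

// Copy the raw bits of a pointer to data member into a ptrdiff_t.
// Assumption: the compiler follows this ABI, so both have the same size.
std::ptrdiff_t raw_bits(int Point::*pm) {
    static_assert(sizeof(pm) == sizeof(std::ptrdiff_t),
                  "expected a ptrdiff_t-sized pointer to data member");
    std::ptrdiff_t raw;
    std::memcpy(&raw, &pm, sizeof raw);
    return raw;
}
```

Under these assumptions, raw_bits(&Point::y) yields the byte offset of y within Point, and a null pointer to member yields -1. The value -1 is presumably chosen because 0 is a valid offset (that of the first member).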
A pointer to member function is a pair:
ptr: ptrdiff_t.
The value zero represents a NULL pointer,
independent of the adjustment field value below.
adj: ptrdiff_t.
It has the size, data size, and alignment of a class containing those two members, in that order. (For 64-bit Itanium, that will be 16, 16, and 8 bytes respectively.)
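As an illustration of the two-field layout, the sketch below copies a pointer to member function into a struct of two ptrdiff_t values. It assumes a compiler that follows this ABI; Widget, MemFnRepr, and decode are hypothetical names, and the struct is only a mirror of the pair described above, not a type defined by the ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class with one non-virtual member function.
struct Widget {
    void plain() {}
};

// Hypothetical mirror of the {ptr, adj} pair, in that order.
struct MemFnRepr {
    std::ptrdiff_t ptr;
    std::ptrdiff_t adj;
};

// Reinterpret the bits of a pointer to member function as the pair.
// Assumption: the compiler uses the two-word representation.
MemFnRepr decode(void (Widget::*pmf)()) {
    static_assert(sizeof(pmf) == sizeof(MemFnRepr),
                  "expected a two-word pointer to member function");
    MemFnRepr repr;
    std::memcpy(&repr, &pmf, sizeof repr);
    return repr;
}
```

With such a compiler, a null pointer to member function decodes to ptr == 0, while &Widget::plain decodes to a non-zero ptr with adj == 0, since no this adjustment is needed within Widget itself.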
Case (2b) above is now considered to be an error in the design. The use of the first indirect primary base class as the derived class's primary base does not save any space in the object, and will cause some duplication of virtual function pointers in the additional copy of the base class's virtual table.
The benefit is that using the derived class virtual pointer as the base
class virtual pointer will often save a load,
and no adjustment to the this pointer will be required for
calls to its virtual functions.
It was thought that 2b would allow the compiler to avoid
adjusting this in some cases, but this was incorrect, as
the virtual function call algorithm requires that
the function be looked up through a pointer to a class that defines
the function, not one that just inherits it. Removing that
requirement would not be a good idea, as there would then no longer be
a way to emit all thunks with the functions they jump to. For
instance, consider this example:
struct A { virtual void f(); };
struct B : virtual public A { int i; };
struct C : virtual public A { int j; };
struct D : public B, public C {};
When B and C are declared, A is a primary base in each case, so although
vcall offsets are allocated in the A-in-B and A-in-C vtables, no
this adjustment is required and no thunk is generated.
However, inside D objects, A is no longer a primary base of C, so if we
allowed calls to C::f() to use the copy of A's vtable in the C
subobject, we would need to adjust this from C* to B::A*,
which would require a third-party thunk. Since we
require that a call to C::f() first convert to A*,
C-in-D's copy of A's vtable is never referenced, so this
is not necessary.
For each data component D (first the primary base of C, if any, then the non-primary, non-virtual direct base classes in declaration order, then the non-static data members in declaration order), allocate as follows:
If D is a bitfield declared as "T [b]: n;",
for some integral POD type T and bit count n:
There are two cases depending on sizeof(T) and n:
If sizeof(T)*8 >= n,
the bitfield is allocated as required by the underlying C psABI.
That is, it will be placed in the next available n bits,
subject to the constraint that it does not cross an alignment
boundary for type T.
If dsize(C) > 0, and the byte at offset dsize(C) - 1 is partially filled by a bitfield, and that bitfield is also a data member declared in C (but not in one of C's proper base classes), the next available bits are the unfilled bits at offset dsize(C) - 1. Otherwise, the next available bits are at offset dsize(C).
Update align(C) to max (align(C), align(T)).
If sizeof(T)*8 < n,
let T' be the largest integral POD type with
sizeof(T')*8 <= n.
The bitfield is allocated starting at the next offset aligned
appropriately for T', with length n bits.
The first sizeof(T)*8 bits are used to hold the
value of the bitfield,
followed by n - sizeof(T)*8 bits of padding.
Update align(C) to max (align(C), align(T')).
In either case, update dsize(C) to include the last byte containing (part of) the bitfield, and update sizeof(C) to max(sizeof(C),dsize(C)).
Start at offset dsize(C), incremented if necessary for alignment to nvalign(D) for base classes or to align(D) for data members. Place D at this offset unless doing so would result in two components (direct or indirect) of the same type having the same offset. If such a component type conflict occurs, increment the candidate offset by nvalign(D) for base classes or by align(D) for data members and try again, repeating until success occurs (which will occur no later than sizeof(C) rounded up to the required alignment).
If D is a base class, this step allocates only its non-virtual part, i.e. excluding any direct or indirect virtual bases.
If D is a base class, update sizeof(C) to max (sizeof(C), offset(D)+nvsize(D)). Otherwise, if D is a data member, update sizeof(C) to max (sizeof(C), offset(D)+sizeof(D)).
If D is a base class (not empty in this case), update dsize(C) to offset(D)+nvsize(D), and align(C) to max (align(C), nvalign(D)). If D is a data member, update dsize(C) to offset(D)+sizeof(D), align(C) to max (align(C), align(D)).
Its allocation is similar to case (2) above, except that additional candidate offsets are considered before starting at dsize(C). First, attempt to place D at offset zero. If unsuccessful (due to a component type conflict), proceed with attempts at dsize(C) as for non-empty bases. As for that case, if there is a type conflict at dsize(C) (with alignment updated as necessary), increment the candidate offset by nvalign(D), and try again, repeating until success occurs.
Once offset(D) has been chosen, update sizeof(C) to max (sizeof(C), offset(D)+sizeof(D)). Note that nvalign(D) is 1, so no update of align(C) is needed. Similarly, since D is an empty base class, no update of dsize(C) is needed.
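The empty-base placement rule can be checked with a small example. This is a sketch assuming a compiler that follows this ABI and a 4-byte int; Empty, Plain, Clash, and offset_of_e are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};

// The empty base is placed at offset zero and adds no size.
struct Plain : Empty {
    int x;
};

// The member e has the same type as the base already at offset zero,
// so the component type conflict pushes e to the next candidate offset.
struct Clash : Empty {
    Empty e;
    int x;
};

// Byte offset of the member e within a Clash object.
std::ptrdiff_t offset_of_e(const Clash& c) {
    return reinterpret_cast<const char*>(&c.e) -
           reinterpret_cast<const char*>(&c);
}
```

Under these assumptions sizeof(Plain) equals sizeof(int), while in Clash the member e lands at offset 1 so that the two Empty subobjects have distinct addresses.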
After all such components have been allocated, set nvalign(C) = align(C) and nvsize(C) = sizeof(C). The values of nvalign(C) and nvsize(C) will not change during virtual base allocation. Note that nvsize(C) need not be a multiple of nvalign(C).
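One visible consequence of nvsize(C) not being rounded up to nvalign(C) is that a derived class may place its own members in the tail padding of a non-POD base. The sketch below assumes a compiler that follows this ABI; Base, Derived, and offset_of_d are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// A dynamic class, hence not a POD for the purpose of layout.
// Its non-virtual size ends right after c; sizeof(Base) is rounded up.
struct Base {
    virtual ~Base() {}
    int i;
    char c;
};

// d is allocated starting at the end of Base's non-virtual part,
// which falls inside the tail padding of Base.
struct Derived : Base {
    char d;
};

// Byte offset of the member d within a Derived object.
std::ptrdiff_t offset_of_d(const Derived& obj) {
    return reinterpret_cast<const char*>(&obj.d) -
           reinterpret_cast<const char*>(&obj);
}
```

With such a compiler, Derived is no larger than Base, and d sits at an offset smaller than sizeof(Base).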
struct R { virtual void r (); };
struct S { virtual void s (); };
struct T : virtual public S { virtual void t (); };
struct U : public R, virtual public T { virtual void u (); };
R is the primary base class for U since it is the first direct
non-virtual dynamic base.
Then, since an inheritance-order walk of U is { U, R, T, S }
the T base is allocated next.
Since S is a primary base of T,
there is no need to allocate it separately.
However, given:
struct V : public R, virtual public S, virtual public T {
virtual void v ();
};
the inheritance-order walk of V is { V, R, S, T }.
Nevertheless, although S is considered for allocation first as a virtual base,
it is not allocated separately because it is a primary base of T,
another base.
Thus sizeof (V) == sizeof (U),
and the full layout is equivalent to the C struct:
struct X {
R r;
T t;
};
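The claim that sizeof (V) == sizeof (U) can be checked with a sketch like the following, assuming a compiler that follows this ABI. The class definitions repeat the example above with inline function bodies added so that the code links; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct R { virtual void r() {} };
struct S { virtual void s() {} };
struct T : virtual public S { virtual void t() {} };
struct U : public R, virtual public T { virtual void u() {} };
struct V : public R, virtual public S, virtual public T {
    virtual void v() {}
};

// Hypothetical helpers reporting the two sizes being compared.
std::size_t size_of_u() { return sizeof(U); }
std::size_t size_of_v() { return sizeof(V); }
```

Since R and T each contribute only a virtual pointer, and S shares T's virtual pointer as its primary base, both sizes are expected to equal two pointers.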
ptrdiff_t unless otherwise stated.
this pointer to the virtual base,
and then adds the value contained at the vcall offset
in the virtual base to its this pointer
to get the address of the derived object where the function was overridden.
These values may be positive or negative.
These are first in the virtual table if present,
ordered as specified in categories 3 and 4 of Section 2.5.3 below.
ptrdiff_t.
It is always present.
The offset provides a way to find the top of the object from any base
subobject with a virtual table pointer.
This is necessary for dynamic_cast<void*> in particular.
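The effect of the offset-to-top entry is observable from portable C++: dynamic_cast to void* applied to a pointer to a base subobject yields the address of the complete object. The following sketch uses hypothetical class names.

```cpp
#include <cassert>

struct Left  { virtual ~Left() {}  int l = 0; };
struct Right { virtual ~Right() {} int r = 0; };

// Right is a non-primary base here, so it lives at a non-zero offset.
struct Whole : Left, Right { int w = 0; };

// Recover the start of the complete object from a base subobject pointer.
// The run-time support needs the offset to the top of the object for this.
const void* most_derived(const Right* p) {
    return dynamic_cast<const void*>(p);
}
```

For a Whole object, the Right subobject pointer differs from the object's address, yet most_derived returns the address of the Whole object itself.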
Consider the following inheritance hierarchy:
struct S { virtual void f(); };
struct T : virtual public S {};
struct U : virtual public T {};
struct V : public T, virtual public U {};
struct W : public T {};
The elements of the VTT array for a class D are in this order:
This construction is applied recursively.
The order in which the virtual pointers appear in the VTT is inheritance graph preorder.
Parts (1) and (3) of a primary (not secondary, i.e. nested) VTT, that is the primary and secondary virtual pointers, are used for the final initialization of an object's virtual pointers before the full-object initialization and later use, and must therefore point to the main virtual table group for the class. Those bases which do not have secondary virtual pointers in the VTT have their virtual pointers explicitly initialized to the main virtual table group by the constructors (see Subobject Construction and Destruction).
The virtual pointers in the secondary VTTs and virtual VTTs are used for subobject construction, and may always point to special construction virtual tables laid out as described in the following subsections. However, it will sometimes be possible to use either the full-object virtual table for the subclass, or its secondary virtual table for the full class being constructed. This ABI does not specify a choice, nor does it specify names for the construction virtual tables, so the constructors must use the VTT rather than assuming that a particular construction virtual table exists.
For example, suppose we have the following hierarchy:
class A1 { int i; };
class A2 { int i; virtual void f(); };
class V1 : public A1, public A2 { int i; };
  // A2 is primary base of V1, A1 is non-polymorphic
class B1 { int i; };
class B2 { int i; };
class V2 : public B1, public B2, public virtual V1 { int i; };
  // V2 has no primary base, V1 is secondary base
class V3 {virtual void g(); };
class C1 : public virtual V1 { int i; };
  // C1 has no primary base, V1 is secondary base
class C2 : public virtual V3, virtual V2 { int i; };
  // C2 has V3 primary (nearly-empty virtual) base, V2 is secondary base
class X1 { int i; };
class C3 : public X1 { int i; };
class D : public C1, public C2, public C3 { int i; };
  // C1 is primary base, C2 is secondary base, C3 is non-polymorphic
Then the VTT for D would appear in the following order, where indenting indicates the sub-VTT structure, and asterisks (*) indicate that construction virtual tables instead of complete object virtual tables are required.
// 1. Primary virtual pointer:
[0] D has virtual bases (complete object vptr)

// 2. Secondary VTTs:
[1] C1 * (has virtual base)
[2]    V1-in-C1 in D (secondary vptr)
[3] C2 * (has virtual bases)
[4]    V3-in-C2 in D (primary vptr)
[5]    V2-in-C2 in D (secondary vptr)
[6]    V1-in-C2 in D (secondary vptr)

// 3. Secondary virtual pointers:
// (no C1-in-D -- primary base)
[7] V1-in-D (V1 is virtual)
[8] C2-in-D (preorder; has virtual bases)
[9] V3-in-D (V3 is virtual)
[10] V2-in-D (V2 is virtual)
// (For complete object D VTT, these all can point to the
// secondary vtables in the D vtable, the V3-in-D entry
// will be the same as the C2-in-D entry, as that is the active
// V3 virtual base in the complete object D. In the sub-VTT for
// D in a class derived from D, some might be construction
// virtual tables.)

// 4. Virtual VTTs:
// (V1 has no virtual bases).
[11] V2 * (V2 has virtual bases)
[12]    V1-in-V2 in D * (secondary vptr, V1 is virtual)
        (A2 is primary base of V1)
// (V3 has no virtual bases)
If A2 is a virtual base of V1, the VTT will contain more elements (exercise left to the astute reader).
The construction virtual tables for a complete object are emitted in the same object file as the virtual table. So the virtual table structures for a complete object of class C include, in no particular order:
The VTT array is referenced via its own mangled external name, and the construction virtual tables are accessed via the VTT array, so the latter do not have external names.
The construction virtual table group for a proper base class subobject B (of derived class D) does not have the same entries in the same order as the main virtual table group for a complete object B, as described in Virtual Table Layout above. Some of the base class subobjects may not need construction virtual tables, which will therefore not be present in the construction virtual table group, even though the subobject virtual tables are present in the main virtual table group for the complete object.
The values of some construction virtual table entries will differ from the corresponding entries in either the main virtual table group for B or the virtual table group for B-in-D, primarily because the virtual bases of B will be at different relative offsets in a D object than in a standalone B object, as follows:
new Cookies
new operator being used is ::operator new[](size_t, void*).
These rules have the following consequences:
Given the above, the following is pseudocode for processing
new(ARGS) T[n]:
if T has a trivial destructor (C++ standard, 12.4/3)
  padding = 0
else if we're using ::operator new[](size_t, void*)
  padding = 0
else
  padding = max(sizeof(size_t), alignof(T))

p = operator new[](n * sizeof(T) + padding, ARGS)
p1 = (T*) ( (char *)p + padding )

if padding > 0
  *( (size_t *)p1 - 1) = n

for i = [0, n)
  create a T, using the default constructor, at p1[i]

return p1
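The cookie computed by the pseudocode for new(ARGS) T[n] can be observed by giving a class its own operator new[]. The sketch below assumes a compiler that follows this ABI; Tracked, CookieReport, and inspect_cookie are hypothetical names, and reading the word in front of the array is done purely for demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical class with a non-trivial destructor, so a cookie is needed.
struct Tracked {
    int value;
    Tracked() : value(0) {}
    ~Tracked() {}

    static void* last_block;          // address returned by operator new[]
    static std::size_t last_request;  // byte count that was requested

    static void* operator new[](std::size_t bytes) {
        last_request = bytes;
        last_block = ::operator new(bytes);
        return last_block;
    }
    static void operator delete[](void* p) { ::operator delete(p); }
};
void* Tracked::last_block = nullptr;
std::size_t Tracked::last_request = 0;

struct CookieReport {
    std::size_t requested;     // n * sizeof(T) + padding
    std::size_t padding;       // distance from block start to element 0
    std::size_t stored_count;  // the size_t immediately before element 0
};

CookieReport inspect_cookie(std::size_t n) {
    Tracked* arr = new Tracked[n];
    const char* first = reinterpret_cast<const char*>(arr);
    CookieReport report;
    report.requested = Tracked::last_request;
    report.padding = static_cast<std::size_t>(
        first - static_cast<const char*>(Tracked::last_block));
    // Assumption: the element count is stored just before the array.
    std::memcpy(&report.stored_count, first - sizeof(std::size_t),
                sizeof(std::size_t));
    delete[] arr;
    return report;
}
```

With such a compiler, inspect_cookie(5) is expected to report a padding of sizeof(size_t) (because alignof(Tracked) is smaller), a stored count of 5, and a request of 5 * sizeof(Tracked) plus the padding.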
See Section 3.3.2 for the API for references to this guard variable.
typeid Operator
namespace std {
  class type_info {
  public:
    virtual ~type_info();
    bool operator==(const type_info &) const;
    bool operator!=(const type_info &) const;
    bool before(const type_info &) const;
    const char* name() const;
  private:
    type_info (const type_info& rhs);
    type_info& operator= (const type_info& rhs);
  };
}
After linking and loading, only one std::type_info structure is accessible via the external name defined by this ABI for any particular complete type symbol (see Vague Linkage). Therefore, except for direct or indirect pointers to incomplete types, the equality and inequality operators can be written as address comparisons when operating on those type_info objects: two type_info structures describe the same type if and only if they are the same structure (at the same address). However, in the case of pointer types, directly or indirectly pointing to incomplete class types, a more complex comparison is required, described below with the RTTI layout of pointer types.
The name() member function returns the address of an NTBS,
unique to the type,
containing the mangled name of the type.
It has a mangled name defined by the ABI
to allow consistent reference to it,
and the Vague Linkage section specifies how to
produce a unique copy.
In a flat address space
(such as that of the Itanium architecture),
the operator==, operator!=, and before()
members are easily implemented in terms of
an address comparison of the name NTBS.
This implies that the type information must keep a description of the public, unambiguous inheritance relationship of a type, as well as the const and volatile qualifications applied to types.
dynamic_cast Operator
Although dynamic_cast can work on pointers and references, from the point of view of representation we need only to worry about polymorphic class types. Also, some kinds of dynamic_cast operations are handled at compile time and do not need any RTTI. There are then three kinds of truly dynamic cast operations:
The most common kind of dynamic_cast is base-to-derived in a singly inherited hierarchy.
std::type_info class given below,
and do not imply anything about the member functions of these classes.
Virtual member functions of these classes may only be used within the
target systems' respective runtime libraries.
The data members must be laid out exactly as specified.
std::type_info.
This entry is located at the word preceding the location
pointed to by the virtual pointer (i.e., entry "-1").
The entry is allocated in all virtual tables;
for classes having virtual bases but no virtual functions,
the entry is zero.
class type_info {
... // See section 2.9.3
private:
const char *__type_name;
};
__type_name is a pointer to a NTBS
representing the mangled name of the type.
The possible derived types are:
abi::__fundamental_type_info
abi::__array_type_info
abi::__function_type_info
abi::__enum_type_info
abi::__class_type_info
abi::__si_class_type_info
abi::__vmi_class_type_info
abi::__pbase_type_info
abi::__pointer_type_info
abi::__pointer_to_member_type_info
abi::__fundamental_type_info adds no data members
to std::type_info;
abi::__array_type_info and abi::__function_type_info
do not add data members to std::type_info
(these types are only produced by the typeid operator;
they decay in other contexts).
abi::__enum_type_info does not add data members either.
abi::__class_type_info is used for class types having no bases,
and is also a base type for the other two class type representations.
class __class_type_info : public std::type_info {};
This RTTI class may also be used for incomplete class types when referenced by a pointer RTTI, in which case it must be prevented from preempting the RTTI for the complete class type, for instance by emitting it as a static object (without external linkage).
Two abi::__class_type_info objects can always be compared,
for equality (i.e. of the types represented) or ordering,
by comparison of their name NTBS addresses.
In addition, complete class RTTI objects
may also be compared for equality
by comparison of their type_info addresses.
abi::__si_class_type_info is used.
It adds to abi::__class_type_info
a single member pointing to the type_info structure for the base type,
declared "__class_type_info const *__base_type".
class __si_class_type_info : public __class_type_info {
public:
const __class_type_info *__base_type;
};
__si_class_type_info constraints,
abi::__vmi_class_type_info is used.
It is derived from abi::__class_type_info:
class __vmi_class_type_info : public __class_type_info {
public:
unsigned int __flags;
unsigned int __base_count;
__base_class_type_info __base_info[1];
enum __flags_masks {
__non_diamond_repeat_mask = 0x1,
__diamond_shaped_mask = 0x2
};
};
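Which of the class RTTI representations a compiler emits can be probed at run time. The sketch below assumes the GNU libstdc++ header cxxabi.h, which declares these classes in namespace abi with the data members public; other runtime libraries may not expose them. The class names Base, Other, Single, and Multi and the helpers as_si and as_vmi are hypothetical.

```cpp
#include <cassert>
#include <cxxabi.h>
#include <typeinfo>

struct Base  { virtual ~Base() {} };
struct Other { virtual ~Other() {} };
struct Single : Base {};         // one public non-virtual base
struct Multi : Base, Other {};   // two bases

// Returns non-null if the RTTI object is a single-inheritance descriptor.
const abi::__si_class_type_info* as_si(const std::type_info& ti) {
    return dynamic_cast<const abi::__si_class_type_info*>(&ti);
}

// Returns non-null if the RTTI object is the general (vmi) descriptor.
const abi::__vmi_class_type_info* as_vmi(const std::type_info& ti) {
    return dynamic_cast<const abi::__vmi_class_type_info*>(&ti);
}
```

Under these assumptions, typeid(Single) is described by an __si_class_type_info whose __base_type points at the RTTI for Base, typeid(Multi) by a __vmi_class_type_info with a __base_count of 2, and typeid(Base) by neither, since a class without bases uses plain __class_type_info.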
abi::__pbase_type_info is a base for both pointer types and
pointer-to-member types.
It adds two data members:
class __pbase_type_info : public std::type_info {
public:
unsigned int __flags;
const std::type_info *__pointee;
enum __masks {
__const_mask = 0x1,
__volatile_mask = 0x2,
__restrict_mask = 0x4,
__incomplete_mask = 0x8,
__incomplete_class_mask = 0x10
};
};
0x1: __pointee type has const qualifier
0x2: __pointee type has volatile qualifier
0x4: __pointee type has restrict qualifier
0x8: __pointee type is incomplete
0x10: class containing __pointee is incomplete (in pointer to member)
abi::__pointer_type_info is derived from
abi::__pbase_type_info with no additional data members.
The abi::__pointer_to_member_type_info type adds one field
to abi::__pbase_type_info:
class __pointer_to_member_type_info : public __pbase_type_info {
public:
const abi::__class_type_info *__context;
};
std::type_info::name()
The null-terminated byte string returned by this routine is the mangled name of the type.
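As a quick illustration, the strings returned by name() can be compared against mangled type names. This sketch assumes a compiler and runtime library that follow this ABI (for example GCC with libstdc++); Foo, ns::Bar, and name_is are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

struct Foo {};
namespace ns { struct Bar {}; }

// True if the NTBS returned by name() equals the expected mangled name.
bool name_is(const std::type_info& ti, const char* expected) {
    return std::strcmp(ti.name(), expected) == 0;
}
```

Under these assumptions, int is named "i", Foo is "3Foo", ns::Bar is "N2ns3BarE", and const char* is "PKc".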
dynamic_cast Algorithm
Dynamic casts to "void cv*" are inserted inline at compile time. So are dynamic casts of null pointers and dynamic casts that are really static.
This leaves the following test to be implemented in the run-time library for truly dynamic casts of the form "dynamic_cast<T>(v)": (see [expr.dynamic_cast] 5.2.7/8)
The first check corresponds to a "base-to-derived cast" and the second to a "cross cast". These tests are implemented by abi::__dynamic_cast:
extern "C"
void* __dynamic_cast ( const void *sub,
                       const abi::__class_type_info *src,
                       const abi::__class_type_info *dst,
                       std::ptrdiff_t src2dst_offset);
/* sub: source address to be adjusted; nonnull, and since the
 *      source object is polymorphic, *(void**)sub is a virtual
 *      pointer.
 * src: static type of the source object.
 * dst: destination type (the "T" in "dynamic_cast<T>(v)").
 * src2dst_offset: a static hint about the location of the
 *      source subobject with respect to the complete object;
 *      special negative values are:
 *         -1: no hint
 *         -2: src is not a public base of dst
 *         -3: src is a multiple public base type but never a
 *             virtual base type
 *      otherwise, the src type is a unique public nonvirtual
 *      base type of dst at offset src2dst_offset from the
 *      origin of dst.
 */
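The two run-time checks that this routine implements correspond to ordinary source-level casts. The following sketch uses only standard dynamic_cast; the class and function names are hypothetical, and the hint values mentioned in the comments are what the description of src2dst_offset suggests a compiler could pass, not something the code inspects.

```cpp
#include <cassert>

struct Animal  { virtual ~Animal() {} };
struct Swimmer { virtual ~Swimmer() {} };
struct Dog  : Animal {};
struct Duck : Animal, Swimmer {};

// Base-to-derived cast. Animal is a unique public non-virtual base of
// Dog at offset 0, so a compiler could pass 0 as the hint.
Dog* to_dog(Animal* a) { return dynamic_cast<Dog*>(a); }

// Cross cast. Animal is not a public base of Swimmer, so a compiler
// could pass -2 as the hint.
Swimmer* to_swimmer(Animal* a) { return dynamic_cast<Swimmer*>(a); }
```

A Duck passed as Animal* converts to its Swimmer subobject but not to Dog; a Dog converts to Dog but not to Swimmer.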
Rationale:
Since the RTTI related exception handling routines are "personality specific", no interfaces need to be specified in this document (beyond the layout of the RTTI data).
In general, the calling conventions for C++ in this ABI follow those specified by the underlying processor-specific ABI for C, whenever there is an analogous construct in C. This chapter specifies exceptions required by C++-specific semantics, or by features without analogues in C. It also specifies the APIs of a variety of runtime utility routines required to be part of the support library of an ABI-conforming implementation for use by compiled code. In addition, reference is made to the separate description of exception handling in this ABI, which defines a large number of runtime utility routine APIs.
3.1 Non-Virtual Function Calling Conventions
In general, C++ value parameters are handled just like C parameters. This includes class type parameters passed wholly or partially in registers. However, in the special case where the parameter type has a non-trivial copy constructor or destructor, the caller must allocate space for a temporary copy, and pass the resulting copy by reference (below). Specifically,
Reference parameters are handled by passing a pointer to the actual parameter.
Empty classes will be passed no differently from ordinary classes. If passed in registers the NaT bit must not be set on all registers that make up the class.
The contents of the single byte parameter slot are unspecified, and the callee may not depend on any particular value. On Itanium, the associated NaT bit must not be set if the parameter slot is associated with a register.
In general, C++ return values are handled just like C return values.
This includes class type results returned in registers.
However, if the return value type has a non-trivial copy constructor
or destructor,
the caller allocates space for a temporary,
and passes a pointer to the temporary as an implicit
first parameter
preceding both the this parameter and user parameters.
The callee constructs the return value into this temporary.
On Itanium, the pointer is passed in out0,
unlike other large class result buffer pointers,
which are passed in r8.
A result of an empty class type will be returned as though it were
a struct containing a single char,
i.e. struct S { char c; };.
The actual content of the return register is unspecified.
On Itanium, the associated NaT bit must not be set.
Constructors return void results.
3.2 Virtual Function Calling Conventions
This section sketches the calling convention for virtual functions, based on the above virtual table layout. See also the ABI examples document for motivating examples and potential implementations.
We explain, at a high level, what information must be present in the virtual table for a class A which declares a virtual function f in order that, given a pointer of type A*, the caller can call the virtual function f. This section does not specify exactly where that information is located (see above), nor does it specify how to convert a pointer to a class derived from A to an A*, if that is required.
When this section uses the term function pointer it is understood that this term may refer either to a traditional function pointer (i.e., a pointer to a GP/address pair) or a GP/address pair itself. Which of these alternatives is actually used is specified elsewhere in the ABI, but is independent of the description in this section.
Throughout this section, we assume that A is the class for which we are creating a virtual table, B is the most derived class in the hierarchy, and C is the class that contains C::f, the unique final overrider for A::f. This section specifies the contents of the f entry in the A-in-B virtual table. (If A is primary base in the hierarchy, then the A-in-B virtual table will be shared with the derived class virtual table -- but the contents of the A portion of that virtual table will still be as specified here.)
In all cases, the non-adjusting entry point for a virtual function expects the `this' pointer to point to an instance of the class in which the virtual function is defined. In other words, the non-adjusting entry point for C::f will expect that its `this' pointer points to a C object.
For each virtual function declared in a class C, we add an entry to its virtual table if one is not already there (i.e. if it is not overriding a function in its primary base). In particular, a declaration which overrides a function inherited from a secondary base gets a new slot in the primary virtual table. We do this to avoid useless adjustments when calling a virtual function through a pointer to the most derived class.
The content of this entry for class A is a function pointer, as determined by one of the following cases. Recall that we are dealing with a hierarchy where B is most derived, A is a direct (or indirect) base of B defining f, and C contains the unique final overrider C::f of A::f.
(In this case, we are creating either the primary virtual table for A, or the A-in-B secondary virtual table.)
The virtual table contains a function pointer pointing to the non-adjusting entry point for A::f.
In this case, we are creating the A-in-B secondary virtual table.
The virtual table contains a pointer to an entry point that performs the adjustment from an A* to a C*, and then transfers control to the non-adjusting entry point for C::f.
When a class is used as a virtual base, we add a vcall offset slot to the beginning of its virtual table for each of the virtual functions it provides, whether in its primary or secondary virtual tables. Derived classes which override these functions may use the slots to determine the adjustment necessary.
For each direct or indirect base A of C that is not a morally virtual
base of C,
the compiler must emit, in the same object file as the code for C::f,
an A-adjusting entry point for C::f.
This entry point will expect that its this pointer
points to an A*,
and will convert it to a C*
(which merely requires adding a constant offset)
before transferring control to the non-adjusting entry point for C::f.
For each direct or indirect virtual base V of C such that V declares f,
the compiler must emit, in the same object file as the code for C::f,
a V-adjusting entry point for C::f.
This entry point will expect that its this pointer
points to the unique virtual V subobject of C.
(Note that there may in general be multiple V subobjects of C,
but that only one of them will be virtual.)
This entry point must load the vcall offset corresponding to f
from the virtual table for V, obtained via its this pointer,
and add this offset to the this pointer.
(Note that, as specified in the data layout document,
when V is used as a virtual base,
its virtual table contains vcall offsets for every virtual function
declared in V or any of its bases.)
Then,
this entry point must transfer control to the non-adjusting entry point.
When calling a virtual function f, through a pointer of static type B*, the caller
this
pointer.
Note that the ABI only specifies the multiple entry points for a virtual function and its associated thunks; how those entry points are provided is unspecified. An existing compiler which uses thunks with a different means of adjusting the virtual table pointers can be made compliant with this ABI by only adding the vcall offsets -- the thunks need not use them. A more efficient implementation would be to emit all of the thunks immediately before the non-adjusting entry point to the function. Another might emit a new copy of the function for each entry point; this is a quality of implementation issue. See further discussion of implementation in the ABI examples document.
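The observable effect of an adjusting entry point can be shown in portable C++: a call made through a pointer to a non-primary base still runs the final overrider with this pointing at the overriding class. The class and function names in this sketch are hypothetical.

```cpp
#include <cassert>

struct First {
    virtual ~First() {}
    int a = 0;
};

struct Second {
    virtual ~Second() {}
    virtual const void* self() const { return this; }
    int b = 0;
};

// Second is a non-primary base of Both, so it sits at a non-zero offset.
struct Both : First, Second {
    // Final overrider: inside this function, this is a Both*.
    const void* self() const override { return this; }
};

// The call goes through Second's virtual table; the entry found there
// must adjust this from the Second subobject to the Both object.
const void* call_through_second(const Second& s) { return s.self(); }
```

For a Both object, the address of its Second subobject differs from the object's own address, but call_through_second returns the address of the Both object.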
extern "C" void __cxa_pure_virtual ();
3.3 Construction and Destruction APIs
This section describes APIs to be used for the construction and destruction of objects. This includes:
3.3.1 Subobject Construction and Destruction
The complete object constructors and destructors find the VTT, described in Section 2.6, Virtual Tables During Object Construction, via its mangled name. They pass the address of the subobject's sub-VTT entry in the VTT as a second parameter when calling the base object constructors and destructors. The base object constructors and destructors use the addresses passed to initialize the primary virtual pointer and virtual pointers that point to the classes which either have virtual bases or override virtual functions with a virtual step (have vcall offsets needing adjustment).
If a constructor calls constructors for base class subobjects that do not need construction virtual tables, e.g. because they have no virtual bases, the construction virtual table parameter is not passed to the base class subobject constructor, and the base class subobject constructors use their complete object virtual tables for initialization.
If a class has a non-virtual destructor, and a deleting destructor is
emitted for that class, the deleting destructor must correctly
handle the case that the this pointer is NULL.
All other destructors, including deleting
destructors for classes with a virtual destructor, may assume that the
this pointer is not NULL.
Suppose we have a subobject class D that needs a construction virtual table, derived from a base B that needs a construction virtual table as part of D, and possibly from others that do not need construction virtual tables. Then the sub-VTT and constructor code for D would look like the following:
// Sub-VTT for D (embedded in VTT for its derived class X):
static vtable *__VTT__1D [1+n+m] = {
    D primary vtable,
    // The sub-VTT for B-in-D in X may have further structure:
    B-in-D sub-VTT (n elements),
    // The secondary virtual pointers for D's bases have elements
    // corresponding to those in the B-in-D sub-VTT,
    // and possibly others for virtual bases of D:
    D secondary virtual pointer for B and bases (m elements)
};

D ( D *this, vtable **ctorvtbls ) {
    // (The following will be unwound, not a real loop):
    for ( each base A of D ) {
        // A "boring" base is one that does not need a ctorvtbl:
        if ( ! boring(A) ) {
            // Call subobject constructors with sub-VTT index
            // if the base needs it -- only B in our example:
            A ( (A*)this, ctorvtbls + sub-VTT-index(A) );
        } else {
            // Otherwise, just invoke the complete-object constructor:
            A ( (A*)this );
        }
    }

    // Initialize virtual pointer with primary ctorvtbls address
    // (first element):
    this->vptr = ctorvtbls+0; // primary virtual pointer

    // (The following will be unwound, not a real loop):
    for ( each subobject A of D ) {
        // Initialize virtual pointers of subobjects with ctorvtbls
        // addresses for the bases
        if ( ! boring(A) ) {
            ((A*)this)->vptr = ctorvtbls + 1+n + secondary-vptr-index(A);
            // where n is the number of elements in the sub-VTTs
        } else {
            // Otherwise, just use the complete-object vtable:
            ((A *)this)->vptr = &(A-in-D vtable);
        }
    }

    // Code for D constructor.
    ...
}
A test program for this can be found in the ABI Examples document.
As described in Section 2.8, function-scope static objects have associated guard variables used to support the requirement that they be initialized exactly once, the first time the scope declaring them is entered. An implementation that does not anticipate supporting multi-threading may simply check the first byte (i.e., the byte with lowest address) of that guard variable, initializing if and only if its value is zero, and then setting it to a non-zero value.
However, an implementation intending to support automatically thread-safe, one-time initialization (as opposed to requiring explicit user control for thread safety) may make use of the following API functions:
extern "C" int __cxa_guard_acquire ( __int64_t *guard_object );
Returns 1 if the initialization is not yet complete; 0 otherwise. This function is called before initialization takes place. If this function returns 1, either __cxa_guard_release or __cxa_guard_abort must be called with the same argument. The first byte of the guard_object is not modified by this function.
A thread-safe implementation will probably guard access to the first byte of the guard_object with a mutex. If this function returns 1, the mutex will have been acquired by the calling thread.
extern "C" void __cxa_guard_release ( __int64_t *guard_object );
Sets the first byte of the guard object to a non-zero value. This function is called after initialization is complete.
A thread-safe implementation will release the mutex acquired by __cxa_guard_acquire after setting the first byte of the guard object.
extern "C" void __cxa_guard_abort ( __int64_t *guard_object );
This function is called if the initialization terminates by throwing an exception.
A thread-safe implementation will release the mutex acquired by __cxa_guard_acquire.
The following is pseudo-code showing how these functions can be used:
if (obj_guard.first_byte == 0) {
    if ( __cxa_guard_acquire (&obj_guard) ) {
        try {
            ... initialize the object ...;
        } catch (...) {
            __cxa_guard_abort (&obj_guard);
            throw;
        }
        ... queue object destructor with __cxa_atexit() ...;
        __cxa_guard_release (&obj_guard);
    }
}
An implementation need not include the simple inline test of the initialization flag in the guard variable around the above sequence. If it does so, the cost of this scheme, when run single-threaded with minimal versions of the above functions, will be two extra function calls, each of them accessing the guard variable, the first time the scope is entered.
An implementation supporting thread-safety on multiprocessor systems must also guarantee that references to the initialized object do not occur before the load of the initialization flag. On Itanium, this can be done by using a ld1.acq operation to load the flag.
The intent of specifying an 8-byte structure for the guard variable, but only describing one byte of its contents, is to allow flexibility in the implementation of the API above. On systems with good small lock support, the second word might be used for a mutex lock. On others, it might identify (as a pointer or index) a more complex lock structure to use.
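The guard protocol described above can be sketched in portable C++. This is a minimal illustration, not a real runtime: the names in namespace demo are hypothetical stand-ins for the __cxa_guard_* entry points, and a single global std::mutex is assumed to protect every guard (a real implementation might instead use the remaining bytes of the guard variable, as noted above).

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

namespace demo {  // hypothetical stand-ins, not the real ABI symbols

using guard_t = std::int64_t;  // 8-byte guard variable

// Assumption: one global mutex serializes all guarded initializations.
inline std::mutex& guard_mutex() {
    static std::mutex m;
    return m;
}

// The byte with the lowest address is the initialization flag.
inline unsigned char& first_byte(guard_t* g) {
    return *reinterpret_cast<unsigned char*>(g);
}

// Returns 1, with the mutex held, if the caller must run the initializer.
inline int guard_acquire(guard_t* g) {
    guard_mutex().lock();
    if (first_byte(g) != 0) {  // initialization already completed
        guard_mutex().unlock();
        return 0;
    }
    return 1;
}

// Sets the flag, then releases the mutex taken by guard_acquire.
inline void guard_release(guard_t* g) {
    first_byte(g) = 1;
    guard_mutex().unlock();
}

// Initialization threw: leave the flag zero so a later entry retries.
inline void guard_abort(guard_t*) {
    guard_mutex().unlock();
}

// Runs init() at most once per guard. Returns true if this call ran it.
// The unsynchronized fast-path read is only adequate single-threaded;
// a real implementation needs an acquire load there.
template <class Init>
bool init_once(guard_t* g, Init init) {
    if (first_byte(g) != 0) return false;
    if (!guard_acquire(g)) return false;
    try {
        init();
    } catch (...) {
        guard_abort(g);
        throw;
    }
    guard_release(g);
    return true;
}

}  // namespace demo
```

Calling demo::init_once twice on the same zero-initialized guard runs the initializer only the first time; if the initializer throws, the guard stays zero and the next call tries again.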
An ABI-compliant system shall provide several runtime routines for use in array construction and destruction. They may be used by compilers, but their use is not required. The required APIs are:
extern "C" void * __cxa_vec_new ( size_t element_count, size_t element_size, size_t padding_size, void (*constructor) ( void *this ), void (*destructor) ( void *this ) );
Equivalent to
__cxa_vec_new2(element_count, element_size, padding_size, constructor, destructor, &::operator new[], &::operator delete[])
extern "C" void * __cxa_vec_new2 ( size_t element_count, size_t element_size, size_t padding_size, void (*constructor) ( void *this ), void (*destructor) ( void *this ), void* (*alloc) ( size_t size ), void (*dealloc) ( void *obj ) );
Given the number and size of elements for an array and the non-negative size of prefix padding for a cookie, allocate space (using alloc) for the array preceded by the specified padding, initialize the cookie if the padding is non-zero, and call the given constructor on each element. Return the address of the array proper, after the padding.
If alloc throws an exception, rethrow the exception. If alloc returns NULL, return NULL. If the constructor throws an exception, call destructor for any already constructed elements, and rethrow the exception. If the destructor throws an exception, call std::terminate.
The constructor may be NULL, in which case it must not be called. If the padding_size is zero, the destructor may be NULL; in that case it must not be called.
Neither alloc nor dealloc may be NULL.
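The contract of __cxa_vec_new2 can be sketched as follows. This is an illustration under stated assumptions, not the runtime's implementation: demo::vec_new2 is a hypothetical name, the cookie is assumed to be a size_t element count stored immediately before the array, and the storage is assumed to be handed back to dealloc when a constructor throws.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <exception>

namespace demo {  // hypothetical stand-in for __cxa_vec_new2

using elem_fn    = void (*)(void*);        // constructor or destructor
using alloc_fn   = void* (*)(std::size_t);
using dealloc_fn = void (*)(void*);

inline void* vec_new2(std::size_t count, std::size_t size, std::size_t padding,
                      elem_fn ctor, elem_fn dtor,
                      alloc_fn alloc, dealloc_fn dealloc) {
    assert(alloc != nullptr && dealloc != nullptr);
    // Assumption: non-zero padding is large enough to hold the cookie.
    assert(padding == 0 || padding >= sizeof(std::size_t));
    // Overflow checking of count * size is omitted in this sketch.
    char* base = static_cast<char*>(alloc(count * size + padding));
    if (base == nullptr) return nullptr;   // alloc reported failure
    char* array = base + padding;
    if (padding != 0)                      // cookie: element count
        std::memcpy(array - sizeof count, &count, sizeof count);
    if (ctor == nullptr) return array;
    std::size_t done = 0;                  // elements constructed so far
    try {
        for (; done < count; ++done) ctor(array + done * size);
    } catch (...) {
        try {
            // Destroy the already constructed elements in reverse order.
            while (dtor != nullptr && done > 0) {
                --done;
                dtor(array + done * size);
            }
        } catch (...) {
            std::terminate();              // destructor threw during cleanup
        }
        dealloc(base);                     // assumption: storage is released
        throw;                             // rethrow the constructor's exception
    }
    return array;
}

}  // namespace demo
```

A caller passes function pointers for the element constructor and destructor; the returned pointer addresses the first element, and the cookie (when requested) sits in the padding just before it.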
extern "C" void * __cxa_vec_new3 ( size_t element_count, size_t element_size, size_t padding_size, void (*constructor) ( void *this ), void (*destructor) ( void *this ), void* (*alloc) ( size_t size ), void (*dealloc) ( void *obj, size_t size ) );
Same as __cxa_vec_new2 except that the deallocation function takes both the object address and its size.
extern "C" void __cxa_vec_ctor ( void *array_address, size_t element_count, size_t element_size, void (*constructor) ( void *this ), void (*destructor) ( void *this ) );
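Given the (data) address of an array, the number of elements, and the size of its elements, call the given constructor on each element. If the constructor throws an exception, call the given destructor for any already-constructed elements, and rethrow the exception. If the destructor throws an exception, call terminate(). The constructor and/or destructor pointers may be NULL. If either is NULL, no action is taken when it would have been called.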
extern "C" void __cxa_vec_dtor ( void *array_address, size_t element_count, size_t element_size, void (*destructor) ( void *this ) );
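Given the (data) address of an array, the number of elements, and the size of its elements, call the given destructor on each element. If the destructor throws an exception, rethrow after destroying the remaining elements if possible. If the destructor throws a second exception, call terminate(). The destructor pointer may be NULL, in which case this routine does nothing.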
extern "C" void __cxa_vec_cleanup ( void *array_address, size_t element_count, size_t element_size, void (*destructor) ( void *this ) );
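Given the (data) address of an array, the number of elements, and the size of its elements, call the given destructor on each element. If the destructor throws an exception, call terminate(). The destructor pointer may be NULL, in which case this routine does nothing.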
extern "C" void __cxa_vec_delete ( void *array_address, size_t element_size, size_t padding_size, void (*destructor) ( void *this ) );
If the array_address is NULL, return immediately. Otherwise, given the (data) address of an array, the non-negative size of prefix padding for the cookie, and the size of its elements, call the given destructor on each element, using the cookie to determine the number of elements, and then delete the space. If the destructor throws an exception, rethrow after destroying the remaining elements if possible. If the destructor throws a second exception, call terminate(). If padding_size is 0, the destructor pointer must be NULL. If the destructor pointer is NULL, no destructor call is to be made.
extern "C" void __cxa_vec_delete2 ( void *array_address, size_t element_size, size_t padding_size, void (*destructor) ( void *this ), void (*dealloc) ( void *obj ) );
Same as __cxa_vec_delete, except that the given function is used for deallocation instead of the default delete function. If dealloc throws an exception, the result is undefined. The dealloc pointer may not be NULL.
extern "C" void __cxa_vec_delete3 ( void *array_address, size_t element_size, size_t padding_size, void (*destructor) ( void *this ), void (*dealloc) ( void *obj, size_t size ) );
Same as __cxa_vec_delete, except that the given function is used for deallocation instead of the default delete function. The deallocation function takes both the object address and its size. If dealloc throws an exception, the result is undefined. The dealloc pointer may not be NULL.
extern "C" void __cxa_vec_cctor ( void *dest_array, void *src_array, size_t element_count, size_t element_size, void (*constructor) (void *destination, void *source), void (*destructor) (void *));
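Given the (data) addresses of a destination and a source array, an element count and an element size, call the given copy constructor to copy each element from the source array to the destination array. The copy constructor's arguments are the address of the destination element and the address of the source element, respectively. If an exception occurs, call the given destructor (if non-NULL) on each copied element and rethrow. If the destructor throws an exception, call terminate(). The constructor and/or destructor pointers may be NULL. If either is NULL, no action is taken when it would have been called.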
A user may specify the construction priority with the pragma:
#pragma priority ( <priority> )
The <priority> parameter specifies a 32-bit signed initialization priority, with lower numbers meaning earlier initialization. The range of priorities [MIN_INT .. MIN_INT+1023] is reserved to the implementation. The pragma applies to all file scope variables in the file where it appears, from the point of appearance to the next priority pragma or the end of the file. Objects defined before any priority pragmas have a default priority of zero, as do initialization actions specified by other means, e.g. DT_INIT_ARRAY entries.
For consistency with the C++ Standard requirements on initialization order,
behavior is undefined unless the priorities appearing in a single file,
including any default zero priorities,
are in non-decreasing numeric (non-increasing priority) order.
typedef struct {
    ElfXX_Word pi_pri;
    ElfXX_Addr pi_addr;
} ElfXX_Priority_Init;
The field pi_addr is a function pointer, as defined by the base ABI (a pointer to a function descriptor on Itanium). The function takes a single unsigned int priority parameter and performs some initialization at priority pi_pri. The priority value is obtained from the signed int in the source pragma by subtracting MIN_INT, so the default priority is -MIN_INT. The section header field sh_entsize is 8 for ELF-32, or 16 for ELF-64.
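The mapping from the signed pragma value to the unsigned pi_pri value is a simple bias by MIN_INT. The following sketch illustrates it; the helper names are hypothetical, INT_MIN from <climits> plays the role of MIN_INT, and a 32-bit int is assumed.

```cpp
#include <cassert>
#include <climits>

// Converts the signed priority written in "#pragma priority" to the
// unsigned value stored in pi_pri by subtracting INT_MIN. The subtraction
// is done in unsigned arithmetic so that it cannot overflow.
inline unsigned pragma_to_pi_pri(int pragma_priority) {
    return static_cast<unsigned>(pragma_priority) -
           static_cast<unsigned>(INT_MIN);
}

// True if the pragma value lies in [INT_MIN .. INT_MIN+1023], the range
// reserved to the implementation.
inline bool is_reserved_priority(int pragma_priority) {
    return pragma_to_pi_pri(pragma_priority) <= 1023u;
}
```

Because the bias preserves order, a lower pragma value always yields a lower pi_pri value, and the default pragma priority of zero maps to the middle of the unsigned range.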
void __cxa_priority_init ( ElfXX_Priority_Init *pi, int cnt );
It will be called with the address of a cnt-element (sub-)vector of the priority initialization entries, and must call each of them in order. It will be called with the GP of the initialization entries.
extern "C" int __cxa_atexit ( void (*f)(void *), void *p, void *d );
A registration, e.g. __cxa_atexit(f,p,d), is intended to cause the call f(p) when DSO d is unloaded, before all such termination calls registered before this one. It returns zero if registration is successful, nonzero on failure. The registration function is not called from within the constructor.
User atexit calls are registered with a NULL parameter and DSO handle, i.e.
__cxa_atexit ( f, NULL, NULL );
Implementations are expected to integrate this capability into the libc atexit implementation so that C-only DSOs will nevertheless interact with C++ programs in a C++-standard-conforming manner. No user interface to __cxa_atexit is supported, so the user is not able to register an atexit function with a parameter or a home DSO.
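The registration semantics can be modeled with a simple list that is walked newest-first. This is a sketch only: the names in namespace demo are hypothetical, the registry is not thread-safe, and treating a NULL handle passed to the finalizer as "run every remaining entry" is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace demo {  // hypothetical model, not the real runtime entry points

struct Entry {
    void (*f)(void*);  // termination function
    void* p;           // its argument
    void* dso;         // home DSO handle; NULL for plain atexit calls
    bool done;         // already run?
};

inline std::vector<Entry>& registry() {
    static std::vector<Entry> r;
    return r;
}

// Models __cxa_atexit: returns zero on success, nonzero on failure.
inline int atexit_register(void (*f)(void*), void* p, void* d) {
    try {
        registry().push_back(Entry{f, p, d, false});
    } catch (...) {
        return -1;  // out of memory
    }
    return 0;
}

// Models unloading DSO d: runs its pending entries in reverse order of
// registration. Assumption: d == nullptr runs all pending entries.
inline void finalize(void* d) {
    std::vector<Entry>& r = registry();
    for (std::size_t i = r.size(); i-- > 0;) {
        if (r[i].done) continue;
        if (d != nullptr && r[i].dso != d) continue;
        r[i].done = true;
        void (*f)(void*) = r[i].f;
        void* p = r[i].p;
        f(p);  // values copied first, in case f registers more entries
    }
}

}  // namespace demo
```

Registering entries for two different handles and then finalizing one of them runs only that handle's entries, last registered first.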
extern "C" void __cxa_finalize ( void *d );
&__dso_handle
.
namespace abi { extern "C" char* __cxa_demangle (const char* mangled_name, char* buf, size_t* n, int* status); }
mangled_name is a pointer to a null-terminated array of characters. It may be either an external name, i.e. with a "_Z" prefix, or an internal NTBS mangling, e.g. of a type for type_info.
buf may be null. If it is non-null, then n must also be non-null, and buf is a pointer to an array of at least *n characters that was allocated using malloc.
status points to an int that is used as an error indicator. It is permitted to be null, in which case the user just doesn't get any detailed error information.
If buf is a null pointer, __cxa_demangle allocates a new buffer with malloc. It stores the size of the buffer in *n, if n is not NULL.
If buf is not a null pointer, it must have been allocated with malloc. If buf is not big enough to store the resulting demangled name, __cxa_demangle must either a) call free to deallocate buf and then allocate a new buffer with malloc, or b) call realloc to increase the size of the buffer. In either case, the new buffer size will be stored in *n.
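A typical use of the demangler passes a null buf and lets __cxa_demangle allocate the result. The wrapper below is a small sketch, assuming a toolchain (such as GCC or Clang) that declares abi::__cxa_demangle in <cxxabi.h> and reports success with a status of zero; the wrapper name demangle is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of `mangled`, or an empty string on failure.
inline std::string demangle(const char* mangled) {
    int status = 0;
    // buf == nullptr and n == nullptr: the demangler mallocs the buffer.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        std::free(out);  // free(nullptr) is a no-op
        return std::string();
    }
    std::string result(out);
    std::free(out);      // the buffer came from malloc, so release it with free
    return result;
}
```

For example, demangle("_Z3fooc") yields the text foo(char), matching the mangling of void foo(char) used elsewhere in this document.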
See Exception Handling document, currently just the base psABI-level material, and the HP exception handling working paper, 8 December 1999.
See the separate table summarizing the encoding characters used as terminals. Also see additional mangling examples in the separate ABI examples document.
In the various explanatory examples, we use Ret? for an unknown function return type (i.e. that is not given by the mangling), or Type? for an unknown data type.
Entities with C linkage and global namespace variables are not mangled. Mangled names have the general structure:
<mangled-name> ::= _Z <encoding>
<encoding> ::= <function name> <bare-function-type>
::= <data name>
::= <special-name>
Thus, a name is mangled by prefixing "_Z" to an encoding of its name,
and in the case of functions its type (to support overloading).
At this top level,
function types do not have the special delimiter characters required
when nested (see below).
The type is omitted for variables and static data members.
For the purposes of mangling, the name of an anonymous union is considered to be the name of the first named data member found by a pre-order, depth-first, declaration-order walk of the data members of the anonymous union. If there is no such data member (i.e., if all of the data members in the union are unnamed), then there is no way for a program to refer to the anonymous union, and there is therefore no need to mangle its name.
All of these examples:
union { int i; int j; };
union { union { int : 7 }; union { int i; }; };
union { union { int j; } i; };
are considered to have the name i for the purposes of mangling.
<name> ::= <nested-name>
::= <unscoped-name>
::= <unscoped-template-name> <template-args>
::= <local-name> # See Scope Encoding below
<unscoped-name> ::= <unqualified-name>
::= St <unqualified-name> # ::std::
<unscoped-template-name> ::= <unscoped-name>
::= <substitution>
Names of objects nested in namespaces or classes are identified as a delimited sequence of names identifying the enclosing scopes. In addition, when naming a class member function, CV-qualifiers may be prefixed to the compound name, encoding the this attributes. Note that if member function CV-qualifiers are required, the delimited form must be used even if the remainder of the name is a single substitution.
<nested-name> ::= N [<CV-qualifiers>] <prefix> <unqualified-name> E
::= N [<CV-qualifiers>] <template-prefix> <template-args> E
<prefix> ::= <prefix> <unqualified-name>
::= <template-prefix> <template-args>
::= <template-param>
::= # empty
::= <substitution>
<template-prefix> ::= <prefix> <template unqualified-name>
::= <template-param>
::= <substitution>
<unqualified-name> ::= <operator-name>
::= <ctor-dtor-name>
::= <source-name>
<source-name> ::= <positive length number> <identifier>
<number> ::= [n] <non-negative decimal integer>
<identifier> ::= <unqualified source code identifier>
<number> is a pseudo-terminal representing a decimal integer, with a leading 'n' for negative integers. It is used in <source-name> to provide the byte length of the following identifier. <number>s appearing in mangled names never have leading zeroes, except for the value zero, represented as '0'. <identifier> is a pseudo-terminal representing the unqualified identifier for the entity in the source code.
Note that <source-name> in the productions for <unqualified-name> may be either a function or data object name when derived from <name>, or a class or enum name when derived from <type>.
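The two pseudo-terminals above are simple enough to sketch directly. The helper names below are hypothetical; the code only illustrates the stated rules (decimal digits with a leading 'n' for negative values, and a byte-length prefix for identifiers).

```cpp
#include <cassert>
#include <string>

// <number>: decimal digits, with a leading 'n' for negative values and
// no leading zeroes (zero itself is "0").
inline std::string mangle_number(long long v) {
    if (v >= 0) return std::to_string(v);
    // Negate in unsigned arithmetic so the most negative value is safe.
    unsigned long long magnitude = 0ULL - static_cast<unsigned long long>(v);
    return "n" + std::to_string(magnitude);
}

// <source-name>: the byte length of the identifier, then the identifier.
inline std::string mangle_source_name(const std::string& identifier) {
    assert(!identifier.empty());  // the length must be positive
    return std::to_string(identifier.size()) + identifier;
}
```

With these helpers, the identifier foo becomes 3foo, and the value -42 becomes n42.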
<operator-name> ::= nw # new
::= na # new[]
::= dl # delete
::= da # delete[]
::= ps # + (unary)
::= ng # - (unary)
::= ad # & (unary)
::= de # * (unary)
::= co # ~
::= pl # +
::= mi # -
::= ml # *
::= dv # /
::= rm # %
::= an # &
::= or # |
::= eo # ^
::= aS # =
::= pL # +=
::= mI # -=
::= mL # *=
::= dV # /=
::= rM # %=
::= aN # &=
::= oR # |=
::= eO # ^=
::= ls # <<
::= rs # >>
::= lS # <<=
::= rS # >>=
::= eq # ==
::= ne # !=
::= lt # <
::= gt # >
::= le # <=
::= ge # >=
::= nt # !
::= aa # &&
::= oo # ||
::= pp # ++
::= mm # --
::= cm # ,
::= pm # ->*
::= pt # ->
::= cl # ()
::= ix # []
::= qu # ?
::= st # sizeof (a type)
::= sz # sizeof (an expression)
::= cv <type> # (cast)
::= v <digit> <source-name> # vendor extended operator
<special-name> ::= TV <type> # virtual table
::= TT <type> # VTT structure (construction vtable index)
::= TI <type> # typeinfo structure
::= TS <type> # typeinfo name (null-terminated byte string)
<special-name> ::= GV <object name> # Guard variable for one-time initialization
# No <type>
<special-name> ::= T <call-offset> <base encoding>
# base is the nominal target function of thunk
<call-offset> ::= h <nv-offset> _
::= v <v-offset> _
<nv-offset> ::= <offset number>
# non-virtual base override
<v-offset> ::= <offset number> _ <virtual offset number>
# virtual base override, with vcall offset
<special-name> ::= Tc <call-offset> <call-offset> <base encoding>
# base is the nominal target function of thunk
# first call-offset is 'this' adjustment
# second call-offset is result adjustment
<ctor-dtor-name> ::= C1 # complete object constructor
::= C2 # base object constructor
::= C3 # complete object allocating constructor
::= D0 # deleting destructor
::= D1 # complete object destructor
::= D2 # base object destructor
<type> ::= <builtin-type>
::= <function-type>
::= <class-enum-type>
::= <array-type>
::= <pointer-to-member-type>
::= <template-param>
::= <template-template-param> <template-args>
::= <substitution> # See Compression below
Types are qualified (optionally) by single-character prefixes encoding cv-qualifiers and/or pointer, reference, complex, or imaginary types:
<type> ::= <CV-qualifiers> <type>
::= P <type> # pointer-to
::= R <type> # reference-to
::= C <type> # complex pair (C 2000)
::= G <type> # imaginary (C 2000)
::= U <source-name> <type> # vendor extended type qualifier
<CV-qualifiers> ::= [r] [V] [K] # restrict (C99), volatile, const
Vendors who define extended type qualifiers (e.g. _near, _far for pointers) shall encode them as a 'U' prefix followed by the name in <length,ID> form.
In cases where multiple order-insensitive qualifiers are present,
they should be ordered 'K' (closest to the base type), 'V', 'r', and
'U' (farthest from the base type), with the 'U' qualifiers in
alphabetical order by the vendor name
(with alphabetically earlier names closer to the base type).
For example, int* volatile const restrict _far p has mangled type name U4_farrVKPi.
Vendors must therefore specify which of their extended qualifiers are considered order-insensitive, not necessarily on the basis of whether their language translators impose an order in source code. They are encouraged to resolve questionable cases as being order-insensitive to maximize consistency in mangling.
For purposes of substitution, given a CV-qualified type, the base type is substitutable, and the type with all the K, V, and r qualifiers plus any vendor extended types in the same order-insensitive set is substitutable; any type with a subset of those qualifiers is not. That is, given a type const volatile foo, the fully qualified type or foo may be substituted, but not volatile foo nor const foo.
Also, note that the grammar above is written with the assumption that
vendor extended type qualifiers will be in the order-sensitive (not CV)
set. An appropriate grammar modification would be necessitated by an
order-insensitive vendor extended type qualifier like const or volatile.
The restrict qualifier is part of the C99 standard, but is strictly an extension to C++ at this time. There is no standard specification of whether the restrict attribute is part of the type for overloading purposes. An implementation should include its encoding in the mangled name if and only if it also treats it as a distinguishing attribute for overloading purposes. This ABI does not specify that choice.
Builtin types are represented by single-letter codes:
<builtin-type> ::= v # void
::= w # wchar_t
::= b # bool
::= c # char
::= a # signed char
::= h # unsigned char
::= s # short
::= t # unsigned short
::= i # int
::= j # unsigned int
::= l # long
::= m # unsigned long
::= x # long long, __int64
::= y # unsigned long long, unsigned __int64
::= n # __int128
::= o # unsigned __int128
::= f # float
::= d # double
::= e # long double, __float80
::= g # __float128
::= z # ellipsis
::= u <source-name> # vendor extended type
Function types are composed from their parameter types and possibly the result type. Except at the outer level type of an <encoding>, or in the <encoding> of an otherwise delimited external name in a <template-parameter> or <local-name> function encoding, these types are delimited by an "F..E" pair. For purposes of substitution (see Compression below), delimited and undelimited function types are considered the same.
Whether the mangling of a function type includes the return type depends on the context and the nature of the function. The rules for deciding whether the return type is included are:
operator int
.
Empty parameter lists, whether declared as () or conventionally as (void), are encoded with a void parameter specifier (v). Therefore function types always encode at least one parameter type, and function manglings can always be distinguished from data manglings by the presence of the type. Member functions do not encode the types of implicit parameters, either this or the VTT parameter.
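For the simplest case, the rules so far are enough to mangle a function name by hand. The sketch below is deliberately narrow: it assumes a non-template function at global scope whose parameters are builtin types, supplied by the caller as their one-letter codes, and the function name mangle_global_function is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds "_Z" <source-name> <parameter codes> for a non-template function
// at global scope. Each parameter is given as its builtin-type code, e.g.
// "c" for char or "i" for int. Return types are not encoded in this case.
inline std::string mangle_global_function(
        const std::string& name, const std::vector<std::string>& params) {
    assert(!name.empty());
    std::string out = "_Z" + std::to_string(name.size()) + name;
    if (params.empty()) return out + "v";  // () and (void) both encode as v
    for (const std::string& code : params) out += code;
    return out;
}
```

For instance, void foo(char) mangles to _Z3fooc, and a function f taking no parameters mangles to _Z1fv.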
A "Y" prefix for the bare function type encodes extern "C". This affects only type mangling, since extern "C" function objects have unmangled names. If there are any cv-qualifiers of this, they are encoded at the beginning of the <nested-name> as described above.
<function-type> ::= F [Y] <bare-function-type> E
<bare-function-type> ::= <signature type>+
# types are possible return type, then parameter types
A class, union, or enum type is simply a name. It may be a simple <unqualified-name>, with or without a template argument list, or a more complex <nested-name>. Thus, it is encoded like a function name, except that no CV-qualifiers are present in a nested name specification.
<class-enum-type> ::= <name>
Array types encode the dimension (number of elements) and the element type. Note that "array" parameters to functions are encoded as pointer types. For variable length arrays (C99 VLAs), the dimension (but not the '_' separator) is omitted.
<array-type> ::= A <positive dimension number> _ <element type>
::= A [<dimension expression>] _ <element type>
When the dimension is an expression involving template parameters, the second production is used. Thus, the declarations:
template<int I> void foo (int (&)[I + 1]) { }
template void foo<2> (int (&)[3]);
produce the mangled name "_Z3fooILi2EEvRAplT_Li1E_i".
Pointer-to-member types encode the class and member types.
<pointer-to-member-type> ::= M <class type> <member type>
Note that for a pointer to cv-qualified member function, the qualifiers are attached to the function type, so
struct A;
void f (void (A::*)() const) {}
produces the mangled name "_Z1fM1AKFvvE".
When function and member function template instantiations reference the template parameters in their parameter/result types, the template parameter number is encoded, with the sequence T_, T0_, ... Class template parameter references are mangled using the standard mangling for the actual parameter type, typically a substitution. Note that a template parameter reference is a substitution candidate, distinct from the type (or other substitutable entity) that is the actual parameter.
<template-param> ::= T_ # first template parameter
::= T <parameter-2 non-negative number> _
<template-template-param> ::= <template-param>
::= <substitution>
Template argument lists appear after the unqualified template name, and are bracketed by I/E. This is used in names for specializations in particular, but also in types and scope identification.
<template-args> ::= I <template-arg>+ E
<template-arg> ::= <type> # type or template
::= X <expression> E # expression
::= <expr-primary> # simple expressions
<expression> ::= <unary operator-name> <expression>
::= <binary operator-name> <expression> <expression>
::= <trinary operator-name> <expression> <expression> <expression>
::= st <type>
::= <template-param>
::= sr <type> <unqualified-name> # dependent name
::= sr <type> <unqualified-name> <template-args> # dependent template-id
::= <expr-primary>
<expr-primary> ::= L <type> <value number> E # integer literal
::= L <type> <value float> E # floating literal
::= L <mangled-name> E # external name
Type arguments appear using their regular encoding. For example, the template class "A<char, float>" is encoded as "1AIcfE". A slightly more involved example is a dependent function parameter type "A<T2>::X" (T2 is the second template parameter) which is encoded as "N1AIT0_E1XE", where the "N...E" construct is used to describe a qualified name.
Literal arguments, e.g. "A<42L>", are encoded with their type and value. Negative integer values are preceded with "n"; for example, "A<-42L>" becomes "1AILln42EE". The bool value false is encoded as 0, true as 1.
Floating-point literals are encoded using a fixed-length lowercase hexadecimal string corresponding to the internal representation (IEEE on Itanium), high-order bytes first, without leading zeroes. For example: "Lf bf800000 E" is -1.0f on Itanium.
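The floating-point encoding can be reproduced by reading the bytes of the value. The sketch below assumes that float is a 32-bit IEEE 754 type and follows the fixed-length reading of the rule, always emitting eight hexadecimal digits; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Encodes a float literal as "Lf" + 8 lowercase hex digits + "E".
// Assumption: float is a 32-bit IEEE 754 value.
inline std::string mangle_float_literal(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // read the representation
    char hex[9];
    // Printing the integer value puts the high-order byte first.
    std::snprintf(hex, sizeof hex, "%08x", static_cast<unsigned>(bits));
    return std::string("Lf") + hex + "E";
}
```

Under these assumptions, -1.0f encodes as Lfbf800000E, which agrees with the example above once the blanks are removed.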
The encoding for a literal of an enumerated type is the encoding of the type name followed by the encoding of the numeric value of the literal in its base integral type (which deals with values that don't have names declared in the type).
A reference to an entity with external linkage is encoded with "L<mangled name>E". For example:
void foo(char); // mangled as _Z3fooc
template<void (&)(char)> struct CB; // CB<foo> is mangled as "2CBIL_Z3foocEE"
The <encoding> of an extern "C" function is treated like global-scope data, i.e. as its <source-name> without a type. For example:
extern "C" bool IsEmpty(char *); // (un)mangled as IsEmpty
template<void (&)(char *)> struct CB; // CB<IsEmpty> is mangled as "2CBIL_Z7IsEmptyEE"
An expression, e.g., "B<(J+1)/2>", is encoded with a prefix traversal of the operators involved, delimited by "X...E". The operators are encoded using their two letter mangled names. For example, "B<(J+1)/2>", if J is the third template parameter, becomes "1BI Xdv pl T1_ Li1E Li2E E E" (the blanks are present only to visualize the decomposition). Note that the expression is mangled without constant folding or other simplification, and without parentheses, which are implicit in the prefix representation. Except for the parentheses, therefore, it represents the source token stream. (C++ Standard reference 14.5.5.1 p. 5.)
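The prefix traversal can be sketched with a small expression tree. The Expr type and function names below are hypothetical; leaves carry an already-encoded operand (such as T1_ or Li1E) and inner nodes carry a two-letter operator code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A node is either an operator code with operands, or a leaf whose code
// is a complete encoding (template parameter or literal).
struct Expr {
    std::string code;
    std::vector<Expr> operands;  // empty for leaves
};

// Prefix traversal: the operator first, then each operand in order.
inline std::string mangle_prefix(const Expr& e) {
    std::string out = e.code;
    for (const Expr& operand : e.operands) out += mangle_prefix(operand);
    return out;
}

// An expression used as a template argument is delimited by X...E.
inline std::string mangle_expression_arg(const Expr& e) {
    return "X" + mangle_prefix(e) + "E";
}
```

Building the tree for (J+1)/2, with J as the third template parameter, and wrapping the result in the argument list of B reproduces the encoding 1BIXdvplT1_Li1ELi2EEE from the example above.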
If an expression is a qualified-name, and the qualifying scope is a dependent type, one of the sr productions is used, rather than the <mangled-name> production. If the qualified name refers to an operator for which both unary and binary manglings are available, the mangling chosen is the mangling for the binary version.
N <qual 1> ... <qual N> <unqual name> E
where each <qual K> is the encoding of a namespace name or a class name (with the latter possibly including a template argument list).
<local-name> ::= Z <function encoding> E <entity name> [<discriminator>]
::= Z <function encoding> E s [<discriminator>]
<discriminator> ::= _ <non-negative number>
namespace N {
    inline char* f(int i) {
        static char *p = "Itanium C++ ABI"; // p = 1, "..." = 2
        {
            struct X { // X = 3
                void g() {}
            };
        }
        return &p[i]; // address of the i-th character, matching char*
    }
}
"_ZZN1N1fEiE1p": encoding of N::f::p (first local mangled entity)
"_ZZN1N1fEiEs": encoding of N::f::"Itanium C++ ABI" (no discriminator)
"_ZNZN1N1fEiE1X1gE": encoding of N::f::X::g() (third local mangled entity used as a class-qualifier)
Given:
typedef void T();
struct S {};
void f(T*, T (S::*)) {}
the function f is mangled as _Z1fPFvvEM1SFvvE; the type of the member function pointed to by the second parameter is not considered the same as the type of the function pointed to by the first parameter. Both function types are, however, entered into the substitution table; subsequent references to either variant of the function type will result in the use of substitutions.
Substitution is according to the production:
<substitution> ::= S <seq-id> _
::= S_
The <seq-id> is a sequence number in base 36,
using digits and upper case letters,
and identifies the <seq-id>-th
encoded component,
in left-to-right order,
starting at "0".
As a special case,
the first substitutable entity is encoded as "S_",
i.e. with no number,
so the numbered entities are the second one as "S0_",
the third as "S1_", the twelfth as "SA_", the thirty-eighth as "S10_",
etc.
All substitutable components are so numbered,
except those that have already been numbered for substitution.
A component is earlier in the substitution dictionary
than the structure of which it is a part.
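The numbering rule can be checked with a short helper. The function name is hypothetical; it takes the zero-based position of an entity in the substitution dictionary and produces the corresponding <substitution> string, with the first entity as the unnumbered special case.

```cpp
#include <cassert>
#include <string>

// Returns the <substitution> for the entity at zero-based dictionary
// position `index`. Position 0 is "S_"; position k >= 1 is "S", then
// k-1 written in base 36 with digits and upper case letters, then "_".
inline std::string substitution(unsigned index) {
    if (index == 0) return "S_";
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    unsigned seq_id = index - 1;
    std::string text;
    do {
        text.insert(text.begin(), digits[seq_id % 36]);  // prepend a digit
        seq_id /= 36;
    } while (seq_id != 0);
    return "S" + text + "_";
}
```

The twelfth entity is at position 11 and the thirty-eighth at position 37, so substitution(11) gives SA_ and substitution(37) gives S10_, in agreement with the text.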
For example:
"_ZN1N1TIiiE2mfES0_IddE": Ret? N::T<int, int>::mf(N::T<double, double>)
since the substitutions generated for this name are:
"S_" == N (qualifier is less recent than qualified entity)
"S0_" == N::T (template comes before template-id)
(int is builtin, and isn't considered)
"S1_" == N::T<int, int>
"S2_" == N::T<double, double>
In addition, the following catalog of abbreviations of the form "Sx" are used:
<substitution> ::= St # ::std::
<substitution> ::= Sa # ::std::allocator
<substitution> ::= Sb # ::std::basic_string
<substitution> ::= Ss # ::std::basic_string < char,
::std::char_traits<char>,
::std::allocator<char> >
<substitution> ::= Si # ::std::basic_istream<char, std::char_traits<char> >
<substitution> ::= So # ::std::basic_ostream<char, std::char_traits<char> >
<substitution> ::= Sd # ::std::basic_iostream<char, std::char_traits<char> >
<name> ::= St <unqualified-name> # ::std::
For example:
"_ZSt5state": ::std::state
"_ZNSt3_In4wardE": ::std::_In::ward
In many cases, we will deal with duplicates by putting possibly duplicated objects in distinct ELF sections or groups of sections, and using the COMDAT feature of SHT_GROUP sections in the gABI to remove duplicates. We will refer to this simply as using a COMDAT group, and specify the symbol to be used to identify duplicates in the SHT_GROUP section. COMDAT groups are a new gABI feature specified during the Itanium ABI definition, and may not be implemented everywhere immediately. See the separate ABI examples document for a discussion of alternatives pending COMDAT implementation.
Note that nothing in this section should be construed to require COMDAT usage for objects with internal linkage unless they may in fact be referenced outside the translation unit where they appear, for instance due to inlining.
It may sometimes be necessary or desirable to reference an out-of-line copy of a function declared inline, i.e. to reference a global symbol naming the function. This may occur because the implementation cannot, or chooses not to, inline the function, or because it needs an address rather than a call. In such a case, the function is to be emitted in each object where its name is referenced. A COMDAT group is used to eliminate duplicates, with the mangled name of the function as the identifying symbol.
Inline functions, whether or not declared as such, and whether they are inline or out-of-line copies, may reference static data or character string literals, that must be kept in common among all copies by using the local symbol mangling defined above. These objects are named according to the rules for local names in the Scope Encoding section above, and the definition of each is emitted in a COMDAT group, identified by the symbol name described in the Scope Encoding section above. Each COMDAT group must be emitted in any object with references to the symbol for the object it contains, whether inline or out-of-line.
Local static data objects generally have associated guard variables used to ensure that they are initialized only once (see 3.3.2). If the object is emitted using a COMDAT group, the guard variable must be too. It is suggested that it be emitted in the same COMDAT group as the associated data object, but it may be emitted in its own COMDAT group, identified by its name. In either case, it must be weak.
The virtual table for a class is emitted in the same object containing the definition of its key function, i.e. the first non-pure virtual function that is not inline at the point of class definition. If there is no key function, it is emitted everywhere used. The emitted virtual table includes the full virtual table group for the class, any new construction virtual tables required for subobjects, and the VTT for the class. They are emitted in a COMDAT group, with the virtual table mangled name as the identifying symbol. Note that if the key function is not declared inline in the class definition, but its definition later is always declared inline, it will be emitted in every object containing the definition.
In the abstract, a pure virtual destructor could be used as the key function, as it must be defined even though it is pure. However, the ABI committee did not realize this fact until after the specification of key function was complete; therefore a pure virtual destructor cannot be the key function.
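The key function rule can be illustrated with a small sketch. The classes `Widget` and `Button` are hypothetical; the comments mark which virtual function would be selected under the definition above.

```cpp
#include <cstring>

struct Widget {
    virtual ~Widget() = 0;             // pure: never the key function, although defined below
    virtual void reset() {}            // inline at the class definition: not eligible
    virtual const char* name() const;  // first non-pure, non-inline virtual: the key function
    virtual int weight() const;        // also eligible, but not the first
};

// In a real program these definitions would live in exactly one source
// file; the object built from the file defining Widget::name is where
// the virtual table for Widget is emitted.
Widget::~Widget() {}
const char* Widget::name() const { return "widget"; }
int Widget::weight() const { return 1; }

// Button has no key function: every virtual function is inline, so its
// virtual table is emitted wherever it is used, in a COMDAT group.
struct Button : Widget {
    const char* name() const override { return "button"; }
    int weight() const override { return 2; }
};
```

Choosing a single home for the virtual table when a key function exists avoids emitting it in every object that includes the class definition.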
The RTTI std::type_info structure for a complete class type is emitted in the same object as its virtual table if dynamic, or everywhere referenced if not. The RTTI std::type_info structure for an incomplete class type is emitted wherever referenced. The RTTI std::type_info structures for various basic types as specified by the Run-Time Type Information section are provided by the runtime library. The RTTI name NTBS objects are emitted with each referencing std::type_info object.
The RTTI std::type_info structures for complete class types and basic types are emitted in COMDAT groups identified by their mangled names. The RTTI std::type_info structures for incomplete class types are emitted with other than the ABI-defined complete type mangled names; an implementation may choose to emit them as local static objects, or in COMDAT groups with implementation-defined names and COMDAT identifiers. The RTTI name NTBS objects are emitted in separate COMDAT groups identified by the NTBS mangled names as weak symbols.
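From the user's side, the emission rules above are what make `std::type_info` comparisons behave consistently across objects. A minimal sketch, with hypothetical classes `Animal` and `Dog` and a hypothetical helper `dynamic_type`:

```cpp
#include <typeinfo>

// Dynamic class: its type_info is emitted alongside its virtual table.
struct Animal {
    virtual ~Animal() {}
};

struct Dog : Animal {};

// Returns the type_info of the most derived object, found through the
// RTTI entry of the virtual table.
inline const std::type_info& dynamic_type(const Animal& a) {
    return typeid(a);
}
```

The sketch relies only on standard `typeid` semantics; whether an implementation compares type_info objects by address or by name is not shown here.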
Constructors and destructors for a class, whether implicitly-defined or user-defined, are emitted under the same rules as other functions. That is, user-defined constructors or destructors, unless the function is declared inline or has internal linkage, are emitted where defined, with their complete and base object variants. For destructors, in classes with a virtual destructor, the deleting variant is emitted as well. A non-inline user-defined constructor or destructor with internal linkage is emitted where defined, with only the variants actually referenced. Implicitly-defined or inline user-defined constructors and destructors are emitted where referenced, each in its own COMDAT group identified by the constructor or destructor name.
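The distinction between non-inline and inline destructors can be sketched as below. The classes `Resource` and `Derived` and the helper `destroy` are hypothetical; the comments describe where each function would be emitted under the rules above.

```cpp
struct Resource {
    // Hypothetical live-object counter, used only to observe destruction.
    static int& live() {
        static int n = 0;
        return n;
    }

    Resource() { ++live(); }   // inline: emitted where referenced, in a COMDAT group
    virtual ~Resource();       // non-inline, external linkage
};

// Emitted where defined, with the complete and base object variants,
// and the deleting variant as well because the destructor is virtual.
Resource::~Resource() { --live(); }

struct Derived : Resource {
    ~Derived() override {}     // inline: variants emitted where referenced
};

// A delete-expression on a class with a virtual destructor is typically
// dispatched to the deleting destructor through the virtual table.
inline void destroy(Resource* r) { delete r; }
```

A caller can then destroy a `Derived` through a `Resource*` without the defining object of either destructor being known at the call site.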
This ABI does not require the generation or use of allocating constructors or deleting destructors for classes without a virtual destructor. However, if an implementation emits such functions, it must use the external names specified in this ABI. If such a function has external linkage, it must be emitted wherever referenced, in a COMDAT group whose name is the external name of the function.
An instantiation of a class template requires:
An instantiation of a function template or member function template is emitted in any object where its symbol is referenced (non-inline), in a COMDAT group identified by the function name.
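A short sketch of the function template case follows. The templates `instance_count` and `twice` are hypothetical. Each distinct instantiation gets its own symbol and COMDAT group, while repeated uses of the same instantiation in different objects collapse to one copy.

```cpp
// Hypothetical per-instantiation counter: instance_count<int> and
// instance_count<double> are different functions with different
// local statics.
template <typename T>
int& instance_count() {
    static int n = 0;
    return n;
}

// Hypothetical template whose instantiations are emitted wherever
// their symbols are referenced.
template <typename T>
T twice(T v) {
    ++instance_count<T>();
    return v + v;
}
```

Calling `twice` with `int` arguments from several objects should update a single shared counter, distinct from the one used by `twice<double>`.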
[031128] Fix alphabetization of company names.
[031102] Specify the behavior of __cxa_vec_delete when the array_address is NULL.
[030609] Use void* instead of dso_handle.
[030518] Define "POD for the purpose of layout."
[030316] Add acknowledgements section.
[030313] Correct broken links and incorrect formatting.
[030103] Clarify definition of substantively different types.
[021222] Document mangling for anonymous unions.
[021204] Remove note about 32-bit RTTI variation.
[021125] Clarify guard functions.
[021110] Clarify definition of nearly empty class.
[021110] Clarify ordering of string literals in mem-initializer-list.
[021110] Remove unnecessary V-adjusting thunks.
[021110] Clarify VTT contents.
[021021] Specify place and manner of emission for deleting destructors.
[021021] Clarify mangling of pointer-to-member functions.
[021016] Clarify mangling of floating-point literals.
[021014] Clarify use of sr in mangling.
[021011] Add mangling for unary plus.
[021008] Make the names used for constructors and destructor entry points consistent throughout.
[021008] Define manglings for typename types.
[020827] Clarify definition of nearly empty class, dsize, nvsize, nvalign.
[020827] Clarify handling of tail-padding.
[020326] Clarify wording in __cxa_demangle memory management specification.
[020220] Clarify pointer to member function mangling (5.1.5).
[000817] Updates from 17 August meeting, email.
[000511] Specify 32-bit form of vmi_offset_flags. Add export template note.
[000502] Fixed mangling of template parameters again.
[000427] Reorganization and section numbering. Added non-virtual function calling conventions.
[000417] Updates from 17 April meeting. Clarify order of vcall offsets. More elaboration of construction virtual table. Specification of COMDAT RTTI name. Reorganization of pointer RTTI. Modify mangling grammar to clarify substitution in compound names. Clarify Vague Linkage section.
[000407] Updates from 6 April meeting, email. More elaboration of construction vtable. Updates/issues in RTTI. Minor mangling changes. Added Vague Linkage section.
[000327] Updates from 30 March meeting. Define base classes to include self, proper base classes. Modify local function mangling per JFW proposal.
[000327] Updates from 23 March meeting. Adopt construction vtable Proposal B, and rewrite. Further work on mangling, especially substitution.
[000320] Clarify class size limit. Editorial changes in vtable components description. Add alternate to construction vtable proposal. Clarification in array cookie specification. Removed COMMON proxy from class RTTI. Extensive changes to mangling writeup.
[000314] Construction vtable modifications. RTTI modifications for incomplete class types. Mangling rework: grammar, new constructs, function return types.
[000309] Add limits section. Specify NULL member pointer values. Combine vtable content and order sections; clarify ordering. Specify when distinct virtual function entries are needed for overriders. Define (and modify) vector constructor/destructor runtime APIs. Virtual base offsets are promoted from non-virtual bases.
[000228] Add thunk definition. Revise inheritance graph order definition. Fix member function pointer description (no division by two). Move bitfield allocation description (much modified) to the non-virtual-base allocation description. Replace virtual function calling convention description.
[000217] Add excess-size bitfield specification. Add namespace/header section. Touch up array new cookies. Remove construction vtable example to new file. Add mangling proposal.
[000214] Complete array new cookie specification. Remove unnecessary RTTI flags. Correct repeated inheritance flag description. Move all type_info subclasses in namespace abi, not namespace std. Note requirements for an implementation to prevent users from emitting invalid vtables for RTTI classes. Include construction vtable proposal.
[000203] Incorporate discussion of 3 February. Remove __reference_type_info (issue A-22). Restructure struct RTTI and flags (issue A-23). Clarify __base_class_info layout.
[000125] Incorporate discussion of 20 January, generally clarifications. Resolved A-19 (choice of a primary virtual base). Answered Nathan's questions about RTTI. Included RTTI "Deliberations" as rationale notes in the specification, or removed redundant ones. Added array operator new section.
[000119] Clarify when virtual base offsets are required. Note that a vtable has offset-to-top and RTTI entries for classes with virtual bases even if there are no virtual functions. Resolve allocation of a virtual base class that is a primary base for another base (A-17). Resolve choice of a primary virtual base class that is a primary base for another base (A-19). Describe the (non-)effect of virtual bases on the alignment of the non-virtual part of a class as the base of another class (A-18).
[991230] Integrate proposed resolution of A-16, A-17 in base class layout. Add outstanding questions list, and clean up questions in text.
[991229] Clarify definition of nearly empty class, layout of virtual bases.
[991203] Added description of vfunc calling convention from Jason.
[991104] Noted pair of vtable entries for virtual destructors.
[991019] Modified RTTI proposal for 14 October decisions.
[991006] Added RTTI proposal.
[990930] Updated to new vtable layout proposal.
[990811] Described member pointer representations, virtual table layout.
[990730] Selected first variant for empty base allocation; removed others.
Please send corrections to the C++ ABI mailing list.