4. Machine Dependent Features

ld has additional features on some platforms; the following sections describe them. Machines where ld has no additional functionality are not listed.

4.1 ld and the H8/300
4.2 ld and the Intel 960 Family
4.3 ld and the Motorola 68HC11 and 68HC12 families
4.4 ld and the ARM family
4.5 ld and HPPA 32-bit ELF Support
4.6 ld and the Motorola 68K family
4.7 ld and MMIX
4.8 ld and MSP430
4.9 ld and PowerPC 32-bit ELF Support
4.10 ld and PowerPC64 64-bit ELF Support
4.11 ld and SPU ELF Support
4.12 ld's Support for Various TI COFF Versions
4.13 ld and WIN32 (cygwin/mingw)
4.14 ld and Xtensa Processors


4.1 ld and the H8/300

For the H8/300, ld can perform these global optimizations when you specify the `--relax' command-line option.

relaxing address modes
ld finds all jsr and jmp instructions whose targets are within eight bits, and turns them into eight-bit program-counter relative bsr and bra instructions, respectively.

synthesizing instructions
ld finds all mov.b instructions which use the sixteen-bit absolute address form, but refer to the top page of memory, and changes them to use the eight-bit address form. (That is: the linker turns `mov.b @aa:16' into `mov.b @aa:8' whenever the address aa is in the top page of memory).

ld finds all mov instructions which use the register indirect with 32-bit displacement addressing mode, but use a small displacement inside 16-bit displacement range, and changes them to use the 16-bit displacement form. (That is: the linker turns `mov.b @d:32,ERx' into `mov.b @d:16,ERx' whenever the displacement d is in the 16 bit signed integer range. Only implemented in ELF-format ld).

bit manipulation instructions
ld finds all bit manipulation instructions like band, bclr, biand, bild, bior, bist, bixor, bld, bnot, bor, bset, bst, btst, bxor which use 32 bit and 16 bit absolute address form, but refer to the top page of memory, and changes them to use the 8 bit address form. (That is: the linker turns `bset #xx:3,@aa:32' into `bset #xx:3,@aa:8' whenever the address aa is in the top page of memory).

system control instructions
ld finds all ldc.w, stc.w instructions which use the 32 bit absolute address form, but refer to the top page of memory, and changes them to use 16 bit address form. (That is: the linker turns `ldc.w @aa:32,ccr' into `ldc.w @aa:16,ccr' whenever the address aa is in the top page of memory).


4.2 ld and the Intel 960 Family

You can use the `-Aarchitecture' command line option to specify one of the two-letter names identifying members of the 960 family; the option specifies the desired output target, and warns of any incompatible instructions in the input files. It also modifies the linker's search strategy for archive libraries, to support the use of libraries specific to each particular architecture, by including in the search loop names suffixed with the string identifying the architecture.

For example, if your ld command line included `-ACA' as well as `-ltry', the linker would look (in its built-in search paths, and in any paths you specify with `-L') for a library with the names

 
try
libtry.a
tryca
libtryca.a

The first two possibilities would be considered in any event; the last two are due to the use of `-ACA'.

You can meaningfully use `-A' more than once on a command line, since the 960 architecture family allows combination of target architectures; each use will add another pair of name variants to search for when `-l' specifies a library.
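The name search described above can be sketched as follows. This is only an illustration: the function name `candidate_library_names` is hypothetical, and both the lowercasing of the architecture suffix and the ordering of the variants are assumptions inferred from the `-ACA' example, not taken from the ld sources.

```python
def candidate_library_names(lib, architectures=()):
    """Return the file names tried for `-l<lib>' given zero or more `-A<arch>' options."""
    # These two names are considered whether or not `-A' is used.
    names = [lib, f"lib{lib}.a"]
    for arch in architectures:
        # Assumption: the suffix is the lowercased architecture name,
        # as in `-ACA' producing `tryca' and `libtryca.a'.
        suffix = arch.lower()
        names += [f"{lib}{suffix}", f"lib{lib}{suffix}.a"]
    return names


print(candidate_library_names("try", ["CA"]))
# ['try', 'libtry.a', 'tryca', 'libtryca.a']
```

Each additional `-A' option adds one more pair of names to the list.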

ld supports the `--relax' option for the i960 family. If you specify `--relax', ld finds all balx and calx instructions whose targets are within 24 bits, and turns them into 24-bit program-counter relative bal and cal instructions, respectively. ld also turns cal instructions into bal instructions when it determines that the target subroutine is a leaf routine (that is, the target subroutine does not itself call any subroutines).

On ARM targets (see the section on ld and the ARM family), the `--fix-cortex-a8' switch enables a link-time workaround for an erratum in certain Cortex-A8 processors. The workaround is enabled by default if you are targeting the ARM v7-A architecture profile. It can be enabled otherwise by specifying `--fix-cortex-a8', or disabled unconditionally by specifying `--no-fix-cortex-a8'.

The erratum only affects Thumb-2 code. Please contact ARM for further details.

Also on ARM targets, the `--no-merge-exidx-entries' switch disables the merging of adjacent exidx entries in debuginfo.


4.3 ld and the Motorola 68HC11 and 68HC12 families


4.3.1 Linker Relaxation

For the Motorola 68HC11, ld can perform these global optimizations when you specify the `--relax' command-line option.

relaxing address modes
ld finds all jsr and jmp instructions whose targets are within eight bits, and turns them into eight-bit program-counter relative bsr and bra instructions, respectively.

ld also looks at all 16-bit extended addressing modes and transforms them into a direct addressing mode when the address is in page 0 (between 0 and 0x0ff).

relaxing gcc instruction group
When gcc is called with `-mrelax', it can emit groups of instructions that the linker can optimize to use a 68HC11 direct addressing mode. These instructions consist of bclr or bset instructions.


4.3.2 Trampoline Generation

For 68HC11 and 68HC12, ld can generate trampoline code to call a far function using a normal jsr instruction. The linker will also change the relocation to some far function to use the trampoline address instead of the function address. This is typically the case when a pointer to a function is taken. The pointer will in fact point to the function trampoline.


4.4 ld and the ARM family

For the ARM, ld will generate code stubs to allow function calls between ARM and Thumb code. These stubs only work with code that has been compiled and assembled with the `-mthumb-interwork' command line option. If it is necessary to link with old ARM object files or libraries that have not been compiled with the `-mthumb-interwork' option, then the `--support-old-code' command line switch should be given to the linker. This will make it generate larger stub functions which will work with non-interworking aware ARM code. Note, however, that the linker does not support generating stubs for function calls to non-interworking aware Thumb code.

The `--thumb-entry' switch is a duplicate of the generic `--entry' switch, in that it sets the program's starting address. But it also sets the bottom bit of the address, so that it can be branched to using a BX instruction, and the program will start executing in Thumb mode straight away.
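The effect on the entry address can be illustrated with a one-line computation. This is a minimal sketch; the function name `thumb_entry_address` is hypothetical and does not correspond to anything in ld itself.

```python
def thumb_entry_address(symbol_address):
    """Return the start address with the bottom bit set, as `--thumb-entry' does."""
    # Bit 0 set tells a BX instruction to switch to Thumb state at the target.
    return symbol_address | 1


print(hex(thumb_entry_address(0x8000)))  # 0x8001
```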

The `--use-nul-prefixed-import-tables' switch specifies that the import tables idata4 and idata5 must be generated with a zero element prefix for import libraries. This is the old style of generating import tables. By default this option is turned off.

The `--be8' switch instructs ld to generate BE8 format executables. This option is only valid when linking big-endian objects. The resulting image will contain big-endian data and little-endian code.

The `R_ARM_TARGET1' relocation is typically used for entries in the `.init_array' section. It is interpreted as either `R_ARM_REL32' or `R_ARM_ABS32', depending on the target. The `--target1-rel' and `--target1-abs' switches override the default.

The `--target2=type' switch overrides the default definition of the `R_ARM_TARGET2' relocation. Valid values for `type', their meanings, and target defaults are as follows:

`rel'
`R_ARM_REL32' (arm*-*-elf, arm*-*-eabi)
`abs'
`R_ARM_ABS32' (arm*-*-symbianelf)
`got-rel'
`R_ARM_GOT_PREL' (arm*-*-linux, arm*-*-*bsd)

The `R_ARM_V4BX' relocation (defined by the ARM AAELF specification) enables objects compiled for the ARMv4 architecture to be interworking-safe when linked with other objects compiled for ARMv4t, but also allows pure ARMv4 binaries to be built from the same ARMv4 objects.

In the latter case, the switch `--fix-v4bx' must be passed to the linker, which causes v4t BX rM instructions to be rewritten as MOV PC,rM, since v4 processors do not have a BX instruction.

In the former case, the switch should not be used, and `R_ARM_V4BX' relocations are ignored.

The `--fix-v4bx-interworking' switch replaces BX rM instructions identified by `R_ARM_V4BX' relocations with a branch to the following veneer:

 
TST rM, #1
MOVEQ PC, rM
BX rM

This allows generation of libraries/applications that work on ARMv4 cores and are still interworking safe. Note that the above veneer clobbers the condition flags, so may cause incorrect program behavior in rare cases.

The `--use-blx' switch enables the linker to use ARM/Thumb BLX instructions (available on ARMv5t and above) in various situations. Currently it is used to perform calls via the PLT from Thumb code using BLX rather than using BX and a mode-switching stub before each PLT entry. This should lead to such calls executing slightly faster.

This option is enabled implicitly for SymbianOS, so there is no need to specify it if you are using that target.

The `--vfp11-denorm-fix' switch enables a link-time workaround for a bug in certain VFP11 coprocessor hardware, which sometimes allows instructions with denorm operands (which must be handled by support code) to have those operands overwritten by subsequent instructions before the support code can read the intended values.

The bug may be avoided in scalar mode if you allow at least one intervening instruction between a VFP11 instruction which uses a register and another instruction which writes to the same register, or at least two intervening instructions if vector mode is in use. The bug only affects full-compliance floating-point mode: you do not need this workaround if you are using "runfast" mode. Please contact ARM for further details.

If you know you are using buggy VFP11 hardware, you can enable this workaround by specifying the linker option `--vfp11-denorm-fix=scalar' if you are using the VFP11 scalar mode only, or `--vfp11-denorm-fix=vector' if you are using vector mode (the latter also works for scalar code). The default is `--vfp11-denorm-fix=none'.

If the workaround is enabled, instructions are scanned for potentially-troublesome sequences, and a veneer is created for each such sequence which may trigger the erratum. The veneer consists of the first instruction of the sequence and a branch back to the subsequent instruction. The original instruction is then replaced with a branch to the veneer. The extra cycles required to call and return from the veneer are sufficient to avoid the erratum in both the scalar and vector cases.

The `--fix-arm1176' switch enables a link-time workaround for an erratum in certain ARM1176 processors. The workaround is enabled by default if you are targeting ARM v6 (excluding ARM v6T2) or earlier. It can be disabled unconditionally by specifying `--no-fix-arm1176'.

Further information is available in the "ARM1176JZ-S and ARM1176JZF-S Programmer Advice Notice" available on the ARM documentation website at: http://infocenter.arm.com/.

The `--no-enum-size-warning' switch prevents the linker from warning when linking object files that specify incompatible EABI enumeration size attributes. For example, with this switch enabled, linking of an object file using 32-bit enumeration values with another using enumeration values fitted into the smallest possible space will not be diagnosed.

The `--no-wchar-size-warning' switch prevents the linker from warning when linking object files that specify incompatible EABI wchar_t size attributes. For example, with this switch enabled, linking of an object file using 32-bit wchar_t values with another using 16-bit wchar_t values will not be diagnosed.

The `--pic-veneer' switch makes the linker use PIC sequences for ARM/Thumb interworking veneers, even if the rest of the binary is not PIC. This avoids problems on uClinux targets where `--emit-relocs' is used to generate relocatable binaries.

The linker will automatically generate and insert small sequences of code into a linked ARM ELF executable whenever an attempt is made to perform a function call to a symbol that is too far away. The placement of these sequences of instructions - called stubs - is controlled by the command line option `--stub-group-size=N'. The placement is important because a poor choice can create a need for duplicate stubs, increasing the code size. The linker will try to group stubs together in order to reduce interruptions to the flow of code, but it needs guidance as to how big these groups should be and where they should be placed.

The value of `N', the parameter to the `--stub-group-size=' option controls where the stub groups are placed. If it is negative then all stubs are placed after the first branch that needs them. If it is positive then the stubs can be placed either before or after the branches that need them. If the value of `N' is 1 (either +1 or -1) then the linker will choose exactly where to place groups of stubs, using its built in heuristics. A value of `N' greater than 1 (or smaller than -1) tells the linker that a single group of stubs can service at most `N' bytes from the input sections.

The default, if `--stub-group-size=' is not specified, is `N = +1'.
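The rules for interpreting `N' can be summarized in a short sketch. The function name `describe_stub_group_size` and the returned labels are hypothetical. The text does not say what a value of 0 means, so rejecting it here is an assumption of this sketch, not documented ld behaviour.

```python
def describe_stub_group_size(n=1):
    """Interpret the `N' of `--stub-group-size=N' as (placement, max_group_bytes)."""
    if n == 0:
        # Assumption: the manual text does not cover 0, so this sketch rejects it.
        raise ValueError("N = 0 is not described")
    # Negative: stubs go after the branches; positive: before or after them.
    placement = "after" if n < 0 else "before-or-after"
    # +1 and -1 let the linker size the groups with its built-in heuristics.
    max_group_bytes = None if abs(n) == 1 else abs(n)
    return placement, max_group_bytes


print(describe_stub_group_size())        # ('before-or-after', None), the default N = +1
print(describe_stub_group_size(-4096))   # ('after', 4096)
```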

Farcall stub insertion is fully supported for the ARM-EABI target only, because it relies on object file properties not present otherwise.


4.5 ld and HPPA 32-bit ELF Support

When generating a shared library, ld will by default generate import stubs suitable for use with a single sub-space application. The `--multi-subspace' switch causes ld to generate export stubs, and different (larger) import stubs suitable for use with multiple sub-spaces.

Long branch stubs and import/export stubs are placed by ld in stub sections located between groups of input sections. `--stub-group-size=N' specifies the maximum size of a group of input sections handled by one stub section. Since branch offsets are signed, a stub section may serve two groups of input sections, one group before the stub section, and one group after it. However, when using conditional branches that require stubs, it may be better (for branch prediction) that stub sections only serve one group of input sections. A negative value for `N' chooses this scheme, ensuring that branches to stubs always use a negative offset. Two special values of `N' are recognized, `1' and `-1'. These both instruct ld to automatically size input section groups for the branch types detected, with the same behaviour regarding stub placement as other positive or negative values of `N' respectively.

Note that `--stub-group-size' does not split input sections. A single input section larger than the group size specified will of course create a larger group (of one section). If input sections are too large, it may not be possible for a branch to reach its stub.


4.6 ld and the Motorola 68K family

The `--got=type' option lets you choose the GOT generation scheme. The choices are `single', `negative', `multigot' and `target'. When `target' is selected the linker chooses the default GOT generation scheme for the current target. `single' tells the linker to generate a single GOT with entries only at non-negative offsets. `negative' instructs the linker to generate a single GOT with entries at both negative and positive offsets. Not all environments support such GOTs. `multigot' allows the linker to generate several GOTs in the output file. All GOT references from a single input object file access the same GOT, but references from different input object files might access different GOTs. Not all environments support such GOTs.


4.7 ld and MMIX

For MMIX, there is a choice of generating ELF object files or mmo object files when linking. The simulator mmix understands the mmo format. The binutils objcopy utility can translate between the two formats.

There is one special section, the `.MMIX.reg_contents' section. The contents of this section are assumed to correspond to those of global registers, and symbols referring to it are translated to special symbols, equal to registers. In a final link, the start address of the `.MMIX.reg_contents' section corresponds to the first allocated global register multiplied by 8. Register $255 is not included in this section; it is always set to the program entry, which is at the symbol Main for mmo files.

Global symbols with the prefix __.MMIX.start., for example __.MMIX.start..text and __.MMIX.start..data are special. The default linker script uses these to set the default start address of a section.

Initial and trailing multiples of zero-valued 32-bit words in a section are left out of an mmo file.


4.8 ld and MSP430

For the MSP430 it is possible to select the MPU architecture. The flag `-m [mpu type]' will select an appropriate linker script for the selected MPU type. (To get a list of known MPUs, pass the `-m help' option to the linker.)

The linker will recognize some extra sections which are MSP430 specific:

`.vectors'
Defines a portion of ROM where interrupt vectors are located.

`.bootloader'
Defines the bootloader portion of the ROM (if applicable). Any code in this section will be uploaded to the MPU.

`.infomem'
Defines an information memory section (if applicable). Any code in this section will be uploaded to the MPU.

`.infomemnobits'
This is the same as the `.infomem' section except that any code in this section will not be uploaded to the MPU.

`.noinit'
Denotes a portion of RAM located above the `.bss' section.

The last two sections are used by gcc.


4.9 ld and PowerPC 32-bit ELF Support

Branches on PowerPC processors are limited to a signed 26-bit displacement, which may result in ld giving `relocation truncated to fit' errors with very large programs. `--relax' enables the generation of trampolines that can access the entire 32-bit address space. These trampolines are inserted at section boundaries, so may not themselves be reachable if an input section exceeds 33M in size. You may combine `-r' and `--relax' to add trampolines in a partial link. In that case both branches to undefined symbols and inter-section branches are also considered potentially out of range, and trampolines inserted.
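To make the range limit concrete, the following sketch checks whether a branch displacement fits in a signed 26-bit field, that is, roughly plus or minus 32 MiB. The helper names `fits_signed` and `needs_trampoline` are hypothetical, and the check ignores details such as instruction alignment.

```python
def fits_signed(value, bits):
    """Return True if value is representable as a signed integer of the given width."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high


def needs_trampoline(branch_address, target_address):
    """Return True if a direct branch cannot reach the target with a 26-bit displacement."""
    return not fits_signed(target_address - branch_address, 26)


print(needs_trampoline(0x10000000, 0x10001000))  # False, the target is nearby
print(needs_trampoline(0x00000000, 0x02000000))  # True, 32 MiB forward is just out of range
```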

`--bss-plt'
Current PowerPC GCC accepts a `-msecure-plt' option that generates code capable of using a newer PLT and GOT layout that has the security advantage of no executable section ever needing to be writable and no writable section ever being executable. PowerPC ld will generate this layout, including stubs to access the PLT, if all input files (including startup and static libraries) were compiled with `-msecure-plt'. `--bss-plt' forces the old BSS PLT (and GOT layout) which can give slightly better performance.

`--secure-plt'
ld will use the new PLT and GOT layout if it is linking new `-fpic' or `-fPIC' code, but does not do so automatically when linking non-PIC code. This option requests the new PLT and GOT layout. A warning will be given if some object file requires the old style BSS PLT.

`--sdata-got'
The new secure PLT and GOT are placed differently relative to other sections compared to older BSS PLT and GOT placement. The location of .plt must change because the new secure PLT is an initialized section while the old PLT is uninitialized. The reason for the .got change is more subtle: The new placement allows .got to be read-only in applications linked with `-z relro -z now'. However, this placement means that .sdata cannot always be used in shared libraries, because the PowerPC ABI accesses .sdata in shared libraries from the GOT pointer. `--sdata-got' forces the old GOT placement. PowerPC GCC doesn't use .sdata in shared libraries, so this option is really only useful for other compilers that may do so.

`--emit-stub-syms'
This option causes ld to label linker stubs with a local symbol that encodes the stub type and destination.

`--no-tls-optimize'
PowerPC ld normally performs some optimization of code sequences used to access Thread-Local Storage. Use this option to disable the optimization.


4.10 ld and PowerPC64 64-bit ELF Support

`--stub-group-size=N'
Long branch stubs, PLT call stubs and TOC adjusting stubs are placed by ld in stub sections located between groups of input sections. `--stub-group-size' specifies the maximum size of a group of input sections handled by one stub section. Since branch offsets are signed, a stub section may serve two groups of input sections, one group before the stub section, and one group after it. However, when using conditional branches that require stubs, it may be better (for branch prediction) that stub sections only serve one group of input sections. A negative value for `N' chooses this scheme, ensuring that branches to stubs always use a negative offset. Two special values of `N' are recognized, `1' and `-1'. These both instruct ld to automatically size input section groups for the branch types detected, with the same behaviour regarding stub placement as other positive or negative values of `N' respectively.

Note that `--stub-group-size' does not split input sections. A single input section larger than the group size specified will of course create a larger group (of one section). If input sections are too large, it may not be possible for a branch to reach its stub.

`--emit-stub-syms'
This option causes ld to label linker stubs with a local symbol that encodes the stub type and destination.

`--dotsyms, --no-dotsyms'
These two options control how ld interprets version patterns in a version script. Older PowerPC64 compilers emitted both a function descriptor symbol with the same name as the function, and a code entry symbol with the name prefixed by a dot (`.'). To properly version a function `foo', the version script thus needs to control both `foo' and `.foo'. The option `--dotsyms', on by default, automatically adds the required dot-prefixed patterns. Use `--no-dotsyms' to disable this feature.

`--no-tls-optimize'
PowerPC64 ld normally performs some optimization of code sequences used to access Thread-Local Storage. Use this option to disable the optimization.

`--no-opd-optimize'
PowerPC64 ld normally removes .opd section entries corresponding to deleted link-once functions, or functions removed by the action of `--gc-sections' or linker script /DISCARD/. Use this option to disable .opd optimization.

`--non-overlapping-opd'
Some PowerPC64 compilers have an option to generate compressed .opd entries spaced 16 bytes apart, overlapping the third word, the static chain pointer (unused in C) with the first word of the next entry. This option expands such entries to the full 24 bytes.

`--no-toc-optimize'
PowerPC64 ld normally removes unused .toc section entries. Such entries are detected by examining relocations that reference the TOC in code sections. A reloc in a deleted code section marks a TOC word as unneeded, while a reloc in a kept code section marks a TOC word as needed. Since the TOC may reference itself, TOC relocs are also examined. TOC words marked as both needed and unneeded will of course be kept. TOC words without any referencing reloc are assumed to be part of a multi-word entry, and are kept or discarded as per the nearest marked preceding word. This works reliably for compiler generated code, but may be incorrect if assembly code is used to insert TOC entries. Use this option to disable the optimization.

`--no-multi-toc'
If given any toc option besides -mcmodel=medium or -mcmodel=large, PowerPC64 GCC generates code for a TOC model where TOC entries are accessed with a 16-bit offset from r2. This limits the total TOC size to 64K. PowerPC64 ld extends this limit by grouping code sections such that each group uses less than 64K for its TOC entries, then inserts r2 adjusting stubs between inter-group calls. ld does not split apart input sections, so cannot help if a single input file has a .toc section that exceeds 64K, most likely from linking multiple files with ld -r. Use this option to turn off this feature.

`--no-toc-sort'
By default, ld sorts TOC sections so that those whose file happens to have a section called .init or .fini are placed first, followed by TOC sections referenced by code generated with PowerPC64 gcc's -mcmodel=small, and lastly TOC sections referenced only by code generated with PowerPC64 gcc's -mcmodel=medium or -mcmodel=large options. Doing this results in better TOC grouping for multi-TOC. Use this option to turn off this feature.

`--plt-align'
`--no-plt-align'
Use these options to control whether individual PLT call stubs are aligned to a 32-byte boundary, or to the specified power of two boundary when using --plt-align=. By default PLT call stubs are packed tightly.

`--plt-static-chain'
`--no-plt-static-chain'
Use these options to control whether PLT call stubs load the static chain pointer (r11). ld defaults to not loading the static chain since there is never any need to do so on a PLT call.

`--plt-thread-safe'
`--no-plt-thread-safe'
With power7's weakly ordered memory model, it is possible when using lazy binding for ld.so to update a plt entry in one thread and have another thread see the individual plt entry words update in the wrong order, despite ld.so carefully writing in the correct order and using memory write barriers. To avoid this we need some sort of read barrier in the call stub, or use LD_BIND_NOW=1. By default, ld looks for calls to commonly used functions that create threads, and if seen, adds the necessary barriers. Use these options to change the default behaviour.
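The marking rule described under `--no-toc-optimize' above can be sketched as follows. This is a simplified model, not ld's implementation: the function name `toc_words_to_keep` is hypothetical, TOC words are identified by index, relocations from the TOC to itself are ignored, and keeping unmarked words that precede any marked word is an assumption.

```python
def toc_words_to_keep(num_words, kept_refs, deleted_refs):
    """Decide which TOC words survive.

    kept_refs:    indices of TOC words referenced from kept code sections
    deleted_refs: indices of TOC words referenced from deleted code sections
    """
    needed = set(kept_refs)
    # A word marked both needed and unneeded is kept, so "needed" wins.
    unneeded = set(deleted_refs) - needed
    keep = []
    previous = True  # assumption: words before the first marked word are kept
    for index in range(num_words):
        if index in needed:
            previous = True
        elif index in unneeded:
            previous = False
        # Unmarked words follow the nearest marked preceding word.
        keep.append(previous)
    return keep


print(toc_words_to_keep(6, kept_refs={0, 3}, deleted_refs={2, 3, 4}))
# [True, True, False, True, False, False]
```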


4.11 ld and SPU ELF Support

`--plugin'
This option marks an executable as a PIC plugin module.

`--no-overlays'
Normally, ld recognizes calls to functions within overlay regions, and redirects such calls to an overlay manager via a stub. ld also provides a built-in overlay manager. This option turns off all this special overlay handling.

`--emit-stub-syms'
This option causes ld to label overlay stubs with a local symbol that encodes the stub type and destination.

`--extra-overlay-stubs'
This option causes ld to add overlay call stubs on all function calls out of overlay regions. Normally stubs are not added on calls to non-overlay regions.

`--local-store=lo:hi'
ld usually checks that a final executable for SPU fits in the address range 0 to 256k. This option may be used to change the range. Disable the check entirely with `--local-store=0:0'.

`--stack-analysis'
SPU local store space is limited. Over-allocation of stack space unnecessarily limits space available for code and data, while under-allocation results in runtime failures. If given this option, ld will provide an estimate of maximum stack usage. ld does this by examining symbols in code sections to determine the extents of functions, and looking at function prologues for stack adjusting instructions. A call-graph is created by looking for relocations on branch instructions. The graph is then searched for the maximum stack usage path. Note that this analysis does not find calls made via function pointers, and does not handle recursion and other cycles in the call graph. Stack usage may be under-estimated if your code makes such calls. Also, stack usage for dynamic allocation, e.g. alloca, will not be detected. If a link map is requested, detailed information about each function's stack usage and calls will be given.

`--emit-stack-syms'
This option, if given along with `--stack-analysis' will result in ld emitting stack sizing symbols for each function. These take the form __stack_<function_name> for global functions, and __stack_<number>_<function_name> for static functions. <number> is the section id in hex. The value of such symbols is the stack requirement for the corresponding function. The symbol size will be zero, type STT_NOTYPE, binding STB_LOCAL, and section SHN_ABS.
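The stack estimate and the symbol naming described above can be sketched together. This is an illustration under stated assumptions, not ld's code: the names `max_stack_usage` and `stack_symbol` are hypothetical, frame sizes and the call graph are supplied directly rather than derived from prologues and relocations, a cycle raises an error because the analysis described above does not handle recursion, and lowercase hex for the section id is an assumption.

```python
def max_stack_usage(frame_sizes, calls, root):
    """Return the largest total stack use along any call path starting at root."""
    memo = {}
    visiting = set()

    def usage(function):
        if function in memo:
            return memo[function]
        if function in visiting:
            # The analysis described in the text does not handle cycles.
            raise ValueError(f"call graph cycle through {function}")
        visiting.add(function)
        deepest_callee = max((usage(c) for c in calls.get(function, ())), default=0)
        visiting.discard(function)
        memo[function] = frame_sizes[function] + deepest_callee
        return memo[function]

    return usage(root)


def stack_symbol(function_name, section_id=None):
    """Build the stack sizing symbol name; pass section_id for static functions."""
    if section_id is None:
        return f"__stack_{function_name}"
    # Assumption: the section id is written in lowercase hex without a prefix.
    return f"__stack_{section_id:x}_{function_name}"


frames = {"main": 64, "a": 32, "b": 128, "c": 16}
graph = {"main": ["a", "b"], "a": ["c"], "b": ["c"]}
print(max_stack_usage(frames, graph, "main"))  # 208, via main -> b -> c
print(stack_symbol("main"))                    # __stack_main
print(stack_symbol("helper", 0x1A))            # __stack_1a_helper
```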


4.12 ld's Support for Various TI COFF Versions

The `--format' switch allows selection of one of the various TI COFF versions. The latest as of this writing is 2; versions 0 and 1 are also supported. The TI COFF versions also vary in header byte-order format; ld will read any version or byte order, but the output header format depends on the default specified by the specific target.


4.13 ld and WIN32 (cygwin/mingw)

This section describes some of the win32 specific ld issues. See Command Line Options for detailed description of the command line options mentioned here.

import libraries
The standard Windows linker creates and uses so-called import libraries, which contain information for linking to DLLs. They are regular static archives and are handled as any other static archive. The cygwin and mingw ports of ld have specific support for creating such libraries, provided by the `--out-implib' command line option.

exporting DLL symbols
The cygwin/mingw ld has several ways to export symbols for DLLs.

using auto-export functionality
By default ld exports symbols with the auto-export functionality, which is controlled by the following command line options:

  • --export-all-symbols [This is the default]
  • --exclude-symbols
  • --exclude-libs
  • --exclude-modules-for-implib
  • --version-script

When auto-export is in operation, ld will export all the non-local (global and common) symbols it finds in a DLL, with the exception of a few symbols known to belong to the system's runtime and libraries. As it will often not be desirable to export all of a DLL's symbols, which may include private functions that are not part of any public interface, the command-line options listed above may be used to filter symbols out from the list for exporting. The `--output-def' option can be used in order to see the final list of exported symbols with all exclusions taken into effect.

If `--export-all-symbols' is not given explicitly on the command line, then the default auto-export behavior will be disabled if either of the following is true:

  • A DEF file is used.
  • Any symbol in any object file was marked with the __declspec(dllexport) attribute.
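The conditions above amount to a small decision rule, sketched here. The function and parameter names are hypothetical and exist only to restate the rule; they are not part of ld.

```python
def auto_export_enabled(export_all_symbols_given, def_file_used, has_dllexport_symbols):
    """Return True if ld would auto-export symbols under the rules described above."""
    if export_all_symbols_given:
        # An explicit `--export-all-symbols' always keeps auto-export on.
        return True
    # Otherwise a DEF file or any __declspec(dllexport) symbol disables it.
    return not (def_file_used or has_dllexport_symbols)


print(auto_export_enabled(False, False, False))  # True, the default
print(auto_export_enabled(False, True, False))   # False, a DEF file is used
print(auto_export_enabled(True, True, True))     # True, requested explicitly
```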

using a DEF file
Another way of exporting symbols is using a DEF file. A DEF file is an ASCII file containing definitions of symbols which should be exported when a dll is created. Usually it is named `<dll name>.def' and is added as any other object file to the linker's command line. The file's name must end in `.def' or `.DEF'.

 
gcc -o <output> <objectfiles> <dll name>.def

Using a DEF file turns off the normal auto-export behavior, unless the `--export-all-symbols' option is also used.

Here is an example of a DEF file for a shared library called `xyz.dll':

 
LIBRARY "xyz.dll" BASE=0x20000000

EXPORTS
foo
bar
_bar = bar
another_foo = abc.dll.afoo
var1 DATA
doo = foo == foo2
eoo DATA == var1

This example defines a DLL with a non-default base address and seven symbols in the export table. The third exported symbol, _bar, is an alias for the second. The fourth symbol, another_foo, is resolved by "forwarding" to another module and treating it as an alias for afoo exported from the DLL `abc.dll'. The fifth symbol, var1, is declared to be a data object. The `doo' symbol in the export library is an alias of `foo', which gets the string name `foo2' in the export table. The `eoo' symbol is a data export symbol, which gets the name `var1' in the export table.
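To show how the export lines in this example break down, here is a minimal parsing sketch. It is not ld's DEF parser: the function name `parse_export` and the result keys are hypothetical, tokens are assumed to be separated by whitespace, and treating any target that contains a dot as a forward to another module is an assumption.

```python
KEYWORDS = {"NONAME", "DATA", "CONSTANT", "PRIVATE"}


def parse_export(line):
    """Parse one EXPORTS line into its parts (simplified, whitespace-separated tokens)."""
    tokens = line.split()
    entry = {"name": tokens[0], "internal": None, "forward": None,
             "ordinal": None, "flags": set(), "export_as": None}
    i = 1
    if i < len(tokens) and tokens[i] == "=":
        target = tokens[i + 1]
        if "." in target:
            # Assumption: a dotted target is <module-name>.<external-name>,
            # split at the last dot so that `abc.dll.afoo' gives `abc.dll' and `afoo'.
            module, _, external = target.rpartition(".")
            entry["forward"] = (module, external)
        else:
            entry["internal"] = target
        i += 2
    while i < len(tokens):
        token = tokens[i]
        if token == "@":
            entry["ordinal"] = int(tokens[i + 1])
            i += 2
        elif token.startswith("@"):
            entry["ordinal"] = int(token[1:])
            i += 1
        elif token == "==":
            entry["export_as"] = tokens[i + 1]
            i += 2
        elif token in KEYWORDS:
            entry["flags"].add(token)
            i += 1
        else:
            raise ValueError(f"unexpected token: {token}")
    return entry


for text in ["_bar = bar", "another_foo = abc.dll.afoo", "eoo DATA == var1"]:
    print(parse_export(text))
```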

The optional LIBRARY <name> command indicates the internal name of the output DLL. If `<name>' does not include a suffix, the default library suffix, `.DLL' is appended.

When the .DEF file is used to build an application, rather than a library, the NAME <name> command should be used instead of LIBRARY. If `<name>' does not include a suffix, the default executable suffix, `.EXE' is appended.

With either LIBRARY <name> or NAME <name> the optional specification BASE = <number> may be used to specify a non-default base address for the image.

If neither LIBRARY <name> nor NAME <name> is specified, or they specify an empty string, the internal name is the same as the filename specified on the command line.

The complete specification of an export symbol is:

 
EXPORTS
  ( (  ( <name1> [ = <name2> ] )
     | ( <name1> = <module-name> . <external-name>))
  [ @ <integer> ] [NONAME] [DATA] [CONSTANT] [PRIVATE] [== <name3>] ) *

Declares `<name1>' as an exported symbol from the DLL, or declares `<name1>' as an exported alias for `<name2>', or declares `<name1>' as a "forward" alias for the symbol `<external-name>' in the DLL `<module-name>'. Optionally, the symbol may be exported by the specified ordinal `<integer>' alias. The optional `<name3>' is the string to be used in the import/export table for the symbol.

The optional keywords that follow the declaration indicate:

NONAME: Do not put the symbol name in the DLL's export table. It will still be exported by its ordinal alias (either the value specified by the .def specification or, otherwise, the value assigned by the linker). The symbol name, however, does remain visible in the import library (if any), unless PRIVATE is also specified.

DATA: The symbol is a variable or object, rather than a function. The import lib will export only an indirect reference to foo as the symbol _imp__foo (i.e., foo must be resolved as *_imp__foo).

CONSTANT: Like DATA, but put the undecorated foo as well as _imp__foo into the import library. Both refer to the read-only import address table's pointer to the variable, not to the variable itself. This can be dangerous. If the user code fails to add the dllimport attribute and also fails to explicitly add the extra indirection that the use of the attribute enforces, the application will behave unexpectedly.

PRIVATE: Put the symbol in the DLL's export table, but do not put it into the static import library used to resolve imports at link time. The symbol can still be imported using the LoadLibrary/GetProcAddress API at runtime or by using the GNU ld extension of linking directly to the DLL without an import library.
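
The indirection behind DATA and CONSTANT can be mimicked in plain C. The sketch below is a simulation only: `imp__foo' is a hypothetical stand-in for the `_imp__foo' slot that the import library and loader would normally provide, and `foo_in_dll' stands in for the variable defined inside the DLL.

```c
/* Stand-in for the variable that lives inside the DLL. */
static int foo_in_dll = 42;

/* Stand-in for the import address table slot `_imp__foo'.  In a real
   program the loader fills this in; here it is initialized by hand. */
static int *imp__foo = &foo_in_dll;

/* Correct access, as needed for a DATA export: the slot is
   dereferenced to reach the variable. */
int read_foo(void)
{
    return *imp__foo;
}

/* The CONSTANT pitfall: an undecorated `foo' would name the slot itself,
   so code treating it as the variable sees a pointer, not the value. */
int slot_is_not_the_variable(void)
{
    return (void *) &imp__foo != (void *) &foo_in_dll;
}
```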

See ld/deffilep.y in the binutils sources for the full specification of other DEF file statements.

While linking a shared dll, ld is able to create a DEF file with the `--output-def <file>' command line option.

Using decorations
Another way of marking symbols for export is to modify the source code itself, so that when building the DLL each symbol to be exported is declared as:

 
__declspec(dllexport) int a_variable;
__declspec(dllexport) void a_function(int with_args);

All such symbols will be exported from the DLL. If, however, any of the object files in the DLL contain symbols decorated in this way, then the normal auto-export behavior is disabled, unless the `--export-all-symbols' option is also used.

Note that object files that wish to access these symbols must not decorate them with dllexport. Instead, they should use dllimport:

 
__declspec(dllimport) int a_variable;
__declspec(dllimport) void a_function(int with_args);

This complicates the structure of library header files, because when included by the library itself the header must declare the variables and functions as dllexport, but when included by client code the header must declare them as dllimport. There are a number of idioms that are typically used to do this; often client code can omit the __declspec() declaration completely. See `--enable-auto-import' and `automatic data imports' for more information.
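
One common form of such an idiom is a macro that expands differently for the library and for its clients. The following is a minimal sketch, not a prescribed layout: the macro names `XYZ_API' and `BUILDING_XYZ' are hypothetical, and the header and source parts are shown in one block only to keep the example self-contained.

```c
/* Normally passed as -DBUILDING_XYZ when compiling the DLL itself;
   defined here only so the sketch stands alone. */
#define BUILDING_XYZ 1

/* ---- xyz.h: hypothetical public header ---- */
#if defined(_WIN32) || defined(__CYGWIN__)
#  ifdef BUILDING_XYZ
#    define XYZ_API __declspec(dllexport)   /* library build */
#  else
#    define XYZ_API __declspec(dllimport)   /* client build */
#  endif
#else
#  define XYZ_API                           /* no decoration elsewhere */
#endif

XYZ_API extern int a_variable;
XYZ_API void a_function(int with_args);

/* ---- xyz.c: library implementation ---- */
int a_variable = 0;

void a_function(int with_args)
{
    a_variable = with_args;
}
```

Client code includes the same header without defining `BUILDING_XYZ', so the declarations become dllimport on Windows targets.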

automatic data imports
The standard Windows dll format supports data imports from dlls only by adding special decorations (dllimport/dllexport), which let the compiler produce specific assembler instructions to deal with this issue. This increases the effort necessary to port existing Un*x code to these platforms, especially for large C++ libraries and applications. The auto-import feature, which was initially provided by Paul Sokolovsky, allows one to omit the decorations to achieve a behavior that conforms to that on POSIX/Un*x platforms. This feature is enabled with the `--enable-auto-import' command-line option, although it is enabled by default on cygwin/mingw. The `--enable-auto-import' option itself now serves mainly to suppress any warnings that are ordinarily emitted when linked objects trigger the feature's use.

Auto-import of variables does not always work flawlessly without additional assistance. Sometimes, you will see this message:

"variable '<var>' can't be auto-imported. Please read the documentation for ld's --enable-auto-import for details."

The `--enable-auto-import' documentation explains why this error occurs, and several methods that can be used to overcome this difficulty. One of these methods is the runtime pseudo-relocs feature, described below.

For complex variables imported from DLLs (such as structs or classes), object files typically contain a base address for the variable and an offset (addend) within the variable--to specify a particular field or public member, for instance. Unfortunately, the runtime loader used in win32 environments is incapable of fixing these references at runtime without the additional information supplied by dllimport/dllexport decorations. The standard auto-import feature described above is unable to resolve these references.

The `--enable-runtime-pseudo-relocs' switch allows these references to be resolved without error, while leaving the task of adjusting the references themselves (with their non-zero addends) to specialized code provided by the runtime environment. Recent versions of the cygwin and mingw environments and compilers provide this runtime support; older versions do not. However, the support is only necessary on the developer's platform; the compiled result will run without error on an older system.

`--enable-runtime-pseudo-relocs' is not the default; it must be explicitly enabled as needed.
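
To make the "base address plus addend" case concrete, the C sketch below shows a field access whose reference carries a non-zero offset. The type and variable names are hypothetical, and the object is defined locally so the example is self-contained; in the scenario described above it would live in a DLL and be auto-imported.

```c
#include <stddef.h>

struct settings {        /* hypothetical type exported by a DLL */
    int verbose;
    int level;
};

/* Stand-in for an object that would be defined inside the DLL. */
struct settings dll_settings = { 0, 3 };

/* Reading a field references "address of dll_settings + addend". */
int read_level(void)
{
    return dll_settings.level;
}

/* The offset that the reference in read_level() adds to the base. */
size_t level_addend(void)
{
    return offsetof(struct settings, level);
}
```

An access to the first member, `verbose', has a zero addend and is the kind of reference that the standard auto-import feature can handle on its own.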

direct linking to a dll
The cygwin/mingw ports of ld support linking directly to a dll, including its data symbols, without using any import libraries. This is much faster and uses much less memory than the traditional import library method, especially when linking large libraries or applications. When ld creates an import lib, each function or variable exported from the dll is stored in its own bfd, even though a single bfd could contain many exports. The overhead involved in storing, loading, and processing so many bfd's is quite large, and explains the tremendous time, memory, and storage needed to link against particularly large or complex libraries when using import libs.

Linking directly to a dll uses no extra command-line switches other than `-L' and `-l', because ld already searches for a number of names to match each library. All that is needed from the developer's perspective is an understanding of this search, in order to force ld to select the dll instead of an import library.

For instance, when ld is called with the argument `-lxxx' it will attempt to find, in the first directory of its search path,

 
libxxx.dll.a
xxx.dll.a
libxxx.a
xxx.lib
cygxxx.dll (*)
libxxx.dll
xxx.dll

before moving on to the next directory in the search path.

(*) Actually, this is not `cygxxx.dll' but in fact is `<prefix>xxx.dll', where `<prefix>' is set by the ld option `--dll-search-prefix=<prefix>'. In the case of cygwin, the standard gcc spec file includes `--dll-search-prefix=cyg', so in effect we actually search for `cygxxx.dll'.
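
The per-directory search order can be modelled in a few lines of C. This is only a sketch of the list above, not ld's implementation; the function names are hypothetical, and `prefix' is assumed to be a non-empty `--dll-search-prefix' value such as `cyg'.

```c
#include <stdio.h>
#include <string.h>

#define N_CANDIDATES 7
#define BUF_LEN 128

/* Build the file names tried for `-l<name>' in one directory, in the
   order listed above.  `prefix' models the --dll-search-prefix value. */
static void build_candidates(const char *name, const char *prefix,
                             char out[N_CANDIDATES][BUF_LEN])
{
    snprintf(out[0], BUF_LEN, "lib%s.dll.a", name);
    snprintf(out[1], BUF_LEN, "%s.dll.a", name);
    snprintf(out[2], BUF_LEN, "lib%s.a", name);
    snprintf(out[3], BUF_LEN, "%s.lib", name);
    snprintf(out[4], BUF_LEN, "%s%s.dll", prefix, name);
    snprintf(out[5], BUF_LEN, "lib%s.dll", name);
    snprintf(out[6], BUF_LEN, "%s.dll", name);
}

/* Return the position of `file' in the search order (0 is tried first),
   or -1 if `-l<name>' would never select it. */
int candidate_rank(const char *name, const char *prefix, const char *file)
{
    char names[N_CANDIDATES][BUF_LEN];
    int i;

    build_candidates(name, prefix, names);
    for (i = 0; i < N_CANDIDATES; i++)
        if (strcmp(names[i], file) == 0)
            return i;
    return -1;
}
```

For example, `candidate_rank ("xxx", "cyg", "cygxxx.dll")' yields 4, so an import library `libxxx.dll.a' in the same directory (rank 0) is preferred over the dll.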

Other win32-based unix environments, such as mingw or pw32, may use other `<prefix>'es, although at present only cygwin makes use of this feature. It was originally intended to help avoid name conflicts among dll's built for the various win32/un*x environments, so that (for example) two versions of a zlib dll could coexist on the same machine.

The generic cygwin/mingw path layout uses a `bin' directory for applications and dll's and a `lib' directory for the import libraries (using cygwin nomenclature):

 
bin/
	cygxxx.dll
lib/
	libxxx.dll.a   (in case of dll's)
	libxxx.a       (in case of static archive)

Linking directly to a dll without using the import library can be done two ways:

1. Use the dll directly by adding the `bin' path to the link line
 
gcc -Wl,-verbose  -o a.exe -L../bin/ -lxxx

However, as the dll's often have version numbers appended to their names (`cygncurses-5.dll') this will often fail, unless one specifies `-L../bin -lncurses-5' to include the version. Import libs are generally not versioned, and do not have this difficulty.

2. Create a symbolic link from the dll to a file in the `lib' directory according to the above mentioned search pattern. This should be used to avoid unwanted changes in the tools needed for making the app/dll.

 
ln -s bin/cygxxx.dll lib/[cyg|lib|]xxx.dll[.a]

Then you can link without any make environment changes.

 
gcc -Wl,-verbose  -o a.exe -L../lib/ -lxxx

This technique also avoids the version number problems, because the following is perfectly legal

 
bin/
	cygxxx-5.dll
lib/
	libxxx.dll.a -> ../bin/cygxxx-5.dll

Linking directly to a dll without using an import lib will work even when auto-import features are exercised, and even when `--enable-runtime-pseudo-relocs' is used.

Given the improvements in speed and memory usage, one might justifiably wonder why import libraries are used at all. There are three reasons:

1. Until recently, the link-directly-to-dll functionality did not work with auto-imported data.

2. Sometimes it is necessary to include pure static objects within the import library (which otherwise contains only bfd's for indirection symbols that point to the exports of a dll). For example, the import lib for the cygwin kernel makes use of this ability, and it is not possible to do this without an import lib.

3. Symbol aliases can only be resolved using an import lib. This is critical when linking against OS-supplied dll's (eg, the win32 API) in which symbols are usually exported as undecorated aliases of their stdcall-decorated assembly names.

So, import libs are not going away. But the ability to replace true import libs with a simple symbolic link to (or a copy of) a dll, in many cases, is a useful addition to the suite of tools binutils makes available to the win32 developer. Given the massive improvements in memory requirements during linking, storage requirements, and linking speed, we expect that many developers will soon begin to use this feature whenever possible.

symbol aliasing
adding additional names
Sometimes, it is useful to export symbols with additional names. A symbol `foo' will be exported as `foo', but it can also be exported as `_foo' by using special directives in the DEF file when creating the dll. This also affects the import library, if one is created. Consider the following DEF file:

 
LIBRARY "xyz.dll" BASE=0x61000000

EXPORTS
foo
_foo = foo

The line `_foo = foo' maps the symbol `foo' to `_foo'.

Another method for creating a symbol alias is to create it in the source code using the "weak" attribute:

 
void foo () { /* Do something.  */; }
void _foo () __attribute__ ((weak, alias ("foo")));

See the gcc manual for more information about attributes and weak symbols.

renaming symbols
Sometimes it is useful to rename exports. For instance, the cygwin kernel does this regularly. A symbol `foo' can be exported as `_foo' but not as `foo' by using special directives in the DEF file. (This will also affect the import library, if it is created.) In the following example:

 
LIBRARY "xyz.dll" BASE=0x61000000

EXPORTS
_foo = foo

The line `_foo = foo' maps the exported symbol `foo' to `_foo'.

Note: using a DEF file disables the default auto-export behavior, unless the `--export-all-symbols' command line option is used. If, however, you are trying to rename symbols, then you should list all desired exports in the DEF file, including the symbols that are not being renamed, and do not use the `--export-all-symbols' option. If you list only the renamed symbols in the DEF file, and use `--export-all-symbols' to handle the other symbols, then both the new names and the original names for the renamed symbols will be exported. In effect, you'd be aliasing those symbols, not renaming them, which is probably not what you wanted.

weak externals
The Windows object format, PE, specifies a form of weak symbols called weak externals. When a weak symbol is linked and the symbol is not defined, the weak symbol becomes an alias for some other symbol. There are three variants of weak externals.

As a GNU extension, weak symbols that do not specify an alternate symbol are supported. If the symbol is undefined when linking, the symbol uses a default value.

aligned common symbols
As a GNU extension to the PE file format, it is possible to specify the desired alignment for a common symbol. This information is conveyed from the assembler or compiler to the linker by means of GNU-specific commands carried in the object file's `.drectve' section, which are recognized by ld and respected when laying out the common symbols. Native tools will be able to process object files employing this GNU extension, but will fail to respect the alignment instructions, and may issue noisy warnings about unknown linker directives.
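
From the source side, an alignment request of this kind typically originates from a declaration such as the one in the sketch below. Whether the object is actually emitted as a common symbol depends on the compiler and its options (for GCC, an uninitialized file-scope object is placed in common storage when `-fcommon' is in effect); the names used are hypothetical.

```c
#include <stdint.h>

/* An uninitialized file-scope object with a requested alignment.
   With -fcommon, GCC emits such a tentative definition as a common
   symbol, and the alignment must then be conveyed to the linker. */
int shared_buffer[16] __attribute__ ((aligned (64)));

/* Returns non-zero when `p' is a multiple of `alignment'. */
int is_aligned_to(const void *p, uintptr_t alignment)
{
    return ((uintptr_t) p % alignment) == 0;
}
```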



4.14 ld and Xtensa Processors

The default ld behavior for Xtensa processors is to interpret SECTIONS commands so that lists of explicitly named sections in a specification with a wildcard file will be interleaved when necessary to keep literal pools within the range of PC-relative load offsets. For example, with the command:

 
SECTIONS
{
  .text : {
    *(.literal .text)
  }
}

ld may interleave some of the .literal and .text sections from different object files to ensure that the literal pools are within the range of PC-relative load offsets. A valid interleaving might place the .literal sections from an initial group of files followed by the .text sections of that group of files. Then, the .literal sections from the rest of the files and the .text sections from the rest of the files would follow.

Relaxation is enabled by default for the Xtensa version of ld and provides two important link-time optimizations. The first optimization is to combine identical literal values to reduce code size. A redundant literal will be removed and all the L32R instructions that use it will be changed to reference an identical literal, as long as the location of the replacement literal is within the offset range of all the L32R instructions. The second optimization is to remove unnecessary overhead from assembler-generated "longcall" sequences of L32R/CALLXn when the target functions are within range of direct CALLn instructions.

For each of these cases where an indirect call sequence can be optimized to a direct call, the linker will change the CALLXn instruction to a CALLn instruction, remove the L32R instruction, and remove the literal referenced by the L32R instruction if it is not used for anything else. Removing the L32R instruction always reduces code size but can potentially hurt performance by changing the alignment of subsequent branch targets. By default, the linker will always preserve alignments, either by switching some instructions between 24-bit encodings and the equivalent density instructions or by inserting a no-op in place of the L32R instruction that was removed. If code size is more important than performance, the `--size-opt' option can be used to prevent the linker from widening density instructions or inserting no-ops, except in a few cases where no-ops are required for correctness.
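
The range condition for combining literals, described above, can be sketched in C as follows. This is an illustration rather than the linker's algorithm. It assumes that a literal must lie at a lower address than the L32R that loads it, and it takes the maximum backward distance as a parameter (`max_back') instead of quoting a hardware limit. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when an L32R at address `insn' can load a literal at `lit'.
   Assumption: the literal sits below the instruction, at most
   `max_back' bytes away. */
static bool literal_reachable(uint32_t lit, uint32_t insn, uint32_t max_back)
{
    return lit < insn && insn - lit <= max_back;
}

/* A redundant literal may be dropped only if every L32R that used it
   can reach the surviving copy at `replacement'. */
bool can_share_literal(uint32_t replacement, const uint32_t *l32r_addrs,
                       size_t n, uint32_t max_back)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (!literal_reachable(replacement, l32r_addrs[i], max_back))
            return false;
    return true;
}
```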

The following Xtensa-specific command-line options can be used to control the linker:

`--size-opt'
When optimizing indirect calls to direct calls, optimize for code size more than performance. With this option, the linker will not insert no-ops or widen density instructions to preserve branch target alignment. There may still be some cases where no-ops are required to preserve the correctness of the code.



This document was generated by GNAT Mailserver on April, 10 2014 using texi2html