ld/ldint.texi - binutils-gdb - Git at Google

 \input texinfo
 @setfilename ldint.info
 @c Copyright (C) 1992-2021 Free Software Foundation, Inc.

 @ifnottex
 @dircategory Software development
 @direntry
 * Ld-Internals: (ldint).	The GNU linker internals.
 @end direntry
 @end ifnottex

 @copying
 This file documents the internals of the GNU linker ld.

 Copyright @copyright{} 1992-2021 Free Software Foundation, Inc.
 Contributed by Cygnus Support.

 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
 any later version published by the Free Software Foundation; with the
 Invariant Sections being ``GNU General Public License'' and ``Funding
 Free Software'', the Front-Cover texts being (a) (see below), and with
 the Back-Cover Texts being (b) (see below).  A copy of the license is
 included in the section entitled ``GNU Free Documentation License''.

 (a) The FSF's Front-Cover Text is:

      A GNU Manual

 (b) The FSF's Back-Cover Text is:

      You have freedom to copy and modify this GNU Manual, like GNU
      software.  Copies published by the Free Software Foundation raise
      funds for GNU development.
 @end copying

 @iftex
 @finalout
 @setchapternewpage off
 @settitle GNU Linker Internals
 @titlepage
 @title{A guide to the internals of the GNU linker}
 @author Per Bothner, Steve Chamberlain, Ian Lance Taylor, DJ Delorie
 @author Cygnus Support
 @page

 @tex
 \def\$#1${{#1}}  % Kluge: collect RCS revision info without $...$
 \xdef\manvers{2.10.91}  % For use in headers, footers too
 {\parskip=0pt
 \hfill Cygnus Support\par
 \hfill \manvers\par
 \hfill \TeX{}info \texinfoversion\par
 }
 @end tex

 @vskip 0pt plus 1filll
 Copyright @copyright{} 1992-2021 Free Software Foundation, Inc.

       Permission is granted to copy, distribute and/or modify this document
       under the terms of the GNU Free Documentation License, Version 1.3
       or any later version published by the Free Software Foundation;
       with no Invariant Sections, with no Front-Cover Texts, and with no
       Back-Cover Texts.  A copy of the license is included in the
       section entitled "GNU Free Documentation License".

 @end titlepage
 @end iftex

 @node Top
 @top

 This file documents the internals of the GNU linker @code{ld}.  It is a
 collection of miscellaneous information with little form at this point.
 Mostly, it is a repository into which you can put information about
 GNU @code{ld} as you discover it (or as you design changes to @code{ld}).

 This document is distributed under the terms of the GNU Free
 Documentation License.  A copy of the license is included in the
 section entitled "GNU Free Documentation License".

 @menu
 * README::			The README File
 * Emulations::			How linker emulations are generated
 * Emulation Walkthrough::	A Walkthrough of a Typical Emulation
 * Architecture Specific::	Some Architecture Specific Notes
 * GNU Free Documentation License::  GNU Free Documentation License
 @end menu

 @node README
 @chapter The @file{README} File

 Check the @file{README} file; it often has useful information that does not
 appear anywhere else in the directory.

 @node Emulations
 @chapter How linker emulations are generated

 Each linker target has an @dfn{emulation}.  The emulation includes the
 default linker script, and certain emulations also modify certain types
 of linker behaviour.

 Emulations are created during the build process by the shell script
 @file{genscripts.sh}.

 The @file{genscripts.sh} script starts by reading a file in the
 @file{emulparams} directory.  This is a shell script which sets various
 shell variables used by @file{genscripts.sh} and the other shell scripts
 it invokes.

 The @file{genscripts.sh} script will invoke a shell script in the
 @file{scripttempl} directory in order to create default linker scripts
 written in the linker command language.  The @file{scripttempl} script
 will be invoked 5 (or, in some cases, 6) times, with different
 assignments to shell variables, to create different default scripts.
 The choice of script is made based on the command-line options.

 After creating the scripts, @file{genscripts.sh} will invoke yet another
 shell script, this time in the @file{emultempl} directory.  That shell
 script will create the emulation source file, which contains C code.
 This C code permits the linker emulation to override various linker
 behaviours.  Most targets use the generic emulation code, which is in
 @file{emultempl/generic.em}.

 To summarize, @file{genscripts.sh} reads three shell scripts: an
 emulation parameters script in the @file{emulparams} directory, a linker
 script generation script in the @file{scripttempl} directory, and an
 emulation source file generation script in the @file{emultempl}
 directory.

 For example, the Sun 4 linker sets up variables in
 @file{emulparams/sun4.sh}, creates linker scripts using
 @file{scripttempl/aout.sc}, and creates the emulation code using
 @file{emultempl/sunos.em}.

 Note that the linker can support several emulations simultaneously,
 depending upon how it is configured.  An emulation can be selected with
 the @code{-m} option.  The @code{-V} option will list all supported
 emulations.

 @menu
 * emulation parameters::        @file{emulparams} scripts
 * linker scripts::              @file{scripttempl} scripts
 * linker emulations::           @file{emultempl} scripts
 @end menu

 @node emulation parameters
 @section @file{emulparams} scripts

 Each target selects a particular file in the @file{emulparams} directory
 by setting the shell variable @code{targ_emul} in @file{configure.tgt}.
 This shell variable is used by the @file{configure} script to control
 building an emulation source file.

 Certain conventions are enforced.  Suppose the @code{targ_emul} variable
 is set to @var{emul} in @file{configure.tgt}.  The name of the emulation
 shell script will be @file{emulparams/@var{emul}.sh}.  The
 @file{Makefile} must have a target named @file{e@var{emul}.c}; this
 target must depend upon @file{emulparams/@var{emul}.sh}, as well as the
 appropriate scripts in the @file{scripttempl} and @file{emultempl}
 directories.  The @file{Makefile} target must invoke @code{GENSCRIPTS}
 with two arguments: @var{emul}, and the value of the make variable
 @code{tdir_@var{emul}}.  The value of the latter variable will be set by
 the @file{configure} script, and is used to set the default target
 directory to search.

 By convention, the @file{emulparams/@var{emul}.sh} shell script should
 only set shell variables.  It may set shell variables which are to be
 interpreted by the @file{scripttempl} and the @file{emultempl} scripts.
 Certain shell variables are interpreted directly by the
 @file{genscripts.sh} script.

 Here is a list of shell variables interpreted by @file{genscripts.sh},
 as well as some conventional shell variables interpreted by the
 @file{scripttempl} and @file{emultempl} scripts.

 @table @code
 @item SCRIPT_NAME
 This is the name of the @file{scripttempl} script to use.  If
 @code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use
 the script @file{scripttempl/@var{script}.sc}.

 @item TEMPLATE_NAME
 This is the name of the @file{emultempl} script to use.  If
 @code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will
 use the script @file{emultempl/@var{template}.em}.  If this variable is
 not set, the default value is @samp{generic}.

 @item GENERATE_SHLIB_SCRIPT
 If this is set to a nonempty string, @file{genscripts.sh} will invoke
 the @file{scripttempl} script an extra time to create a shared library
 script.  @ref{linker scripts}.

 @item OUTPUT_FORMAT
 This is normally set to indicate the BFD output format use (e.g.,
 @samp{"a.out-sunos-big"}.  The @file{scripttempl} script will normally
 use it in an @code{OUTPUT_FORMAT} expression in the linker script.

 @item ARCH
 This is normally set to indicate the architecture to use (e.g.,
 @samp{sparc}).  The @file{scripttempl} script will normally use it in an
 @code{OUTPUT_ARCH} expression in the linker script.

 @item ENTRY
 Some @file{scripttempl} scripts use this to set the entry address, in an
 @code{ENTRY} expression in the linker script.

 @item TEXT_START_ADDR
 Some @file{scripttempl} scripts use this to set the start address of the
 @samp{.text} section.

 @item SEGMENT_SIZE
 The @file{genscripts.sh} script uses this to set the default value of
 @code{DATA_ALIGNMENT} when running the @file{scripttempl} script.

 @item TARGET_PAGE_SIZE
 If @code{SEGMENT_SIZE} is not defined, the @file{genscripts.sh} script
 uses this to define it.

 @item ALIGNMENT
 Some @file{scripttempl} scripts set this to a number to pass to
 @code{ALIGN} to set the required alignment for the @code{end} symbol.
 @end table

 @node linker scripts
 @section @file{scripttempl} scripts

 Each linker target uses a @file{scripttempl} script to generate the
 default linker scripts.  The name of the @file{scripttempl} script is
 set by the @code{SCRIPT_NAME} variable in the @file{emulparams} script.
 If @code{SCRIPT_NAME} is set to @var{script}, @code{genscripts.sh} will
 invoke @file{scripttempl/@var{script}.sc}.

 The @file{genscripts.sh} script will invoke the @file{scripttempl}
 script 5 to 9 times.  Each time it will set the shell variable
 @code{LD_FLAG} to a different value.  When the linker is run, the
 options used will direct it to select a particular script.  (Script
 selection is controlled by the @code{get_script} emulation entry point;
 this describes the conventional behaviour).

 The @file{scripttempl} script should just write a linker script, written
 in the linker command language, to standard output.  If the emulation
 name--the name of the @file{emulparams} file without the @file{.sc}
 extension--is @var{emul}, then the output will be directed to
 @file{ldscripts/@var{emul}.@var{extension}} in the build directory,
 where @var{extension} changes each time the @file{scripttempl} script is
 invoked.

 Here is the list of values assigned to @code{LD_FLAG}.

 @table @code
 @item (empty)
 The script generated is used by default (when none of the following
 cases apply).  The output has an extension of @file{.x}.
 @item n
 The script generated is used when the linker is invoked with the
 @code{-n} option.  The output has an extension of @file{.xn}.
 @item N
 The script generated is used when the linker is invoked with the
 @code{-N} option.  The output has an extension of @file{.xbn}.
 @item r
 The script generated is used when the linker is invoked with the
 @code{-r} option.  The output has an extension of @file{.xr}.
 @item u
 The script generated is used when the linker is invoked with the
 @code{-Ur} option.  The output has an extension of @file{.xu}.
 @item shared
 The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
 this value if @code{GENERATE_SHLIB_SCRIPT} is defined in the
 @file{emulparams} file.  The @file{emultempl} script must arrange to use
 this script at the appropriate time, normally when the linker is invoked
 with the @code{-shared} option.  The output has an extension of
 @file{.xs}.
 @item c
 The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
 this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
 @file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf}. The
 @file{emultempl} script must arrange to use this script at the appropriate
 time, normally when the linker is invoked with the @code{-z combreloc}
 option.  The output has an extension of
 @file{.xc}.
 @item cshared
 The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
 this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
 @file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf} and
 @code{GENERATE_SHLIB_SCRIPT} is defined in the @file{emulparams} file.
 The @file{emultempl} script must arrange to use this script at the
 appropriate time, normally when the linker is invoked with the @code{-shared
 -z combreloc} option.  The output has an extension of @file{.xsc}.
 @item auto_import
 The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
 this value if @code{GENERATE_AUTO_IMPORT_SCRIPT} is defined in the
 @file{emulparams} file.  The @file{emultempl} script must arrange to
 use this script at the appropriate time, normally when the linker is
 invoked with the @code{--enable-auto-import} option.  The output has
 an extension of @file{.xa}.
 @end table

 Besides the shell variables set by the @file{emulparams} script, and the
 @code{LD_FLAG} variable, the @file{genscripts.sh} script will set
 certain variables for each run of the @file{scripttempl} script.

 @table @code
 @item RELOCATING
 This will be set to a non-empty string when the linker is doing a final
 relocation (e.g., all scripts other than @code{-r} and @code{-Ur}).

 @item CONSTRUCTING
 This will be set to a non-empty string when the linker is building
 global constructor and destructor tables (e.g., all scripts other than
 @code{-r}).

 @item DATA_ALIGNMENT
 This will be set to an @code{ALIGN} expression when the output should be
 page aligned, or to @samp{.} when generating the @code{-N} script.

 @item CREATE_SHLIB
 This will be set to a non-empty string when generating a @code{-shared}
 script.

 @item COMBRELOC
 This will be set to a non-empty string when generating @code{-z combreloc}
 scripts to a temporary file name which can be used during script generation.
 @end table

 The conventional way to write a @file{scripttempl} script is to first
 set a few shell variables, and then write out a linker script using
 @code{cat} with a here document.  The linker script will use variable
 substitutions, based on the above variables and those set in the
 @file{emulparams} script, to control its behaviour.

 When there are parts of the @file{scripttempl} script which should only
 be run when doing a final relocation, they should be enclosed within a
 variable substitution based on @code{RELOCATING}.  For example, on many
 targets special symbols such as @code{_end} should be defined when doing
 a final link.  Naturally, those symbols should not be defined when doing
 a relocatable link using @code{-r}.  The @file{scripttempl} script
 could use a construct like this to define those symbols:
 @smallexample
   $@{RELOCATING+ _end = .;@}
 @end smallexample
 This will do the symbol assignment only if the @code{RELOCATING}
 variable is defined.

 The basic job of the linker script is to put the sections in the correct
 order, and at the correct memory addresses.  For some targets, the
 linker script may have to do some other operations.

 For example, on most MIPS platforms, the linker is responsible for
 defining the special symbol @code{_gp}, used to initialize the
 @code{$gp} register.  It must be set to the start of the small data
 section plus @code{0x8000}.  Naturally, it should only be defined when
 doing a final relocation.  This will typically be done like this:
 @smallexample
   $@{RELOCATING+ _gp = ALIGN(16) + 0x8000;@}
 @end smallexample
 This line would appear just before the sections which compose the small
 data section (@samp{.sdata}, @samp{.sbss}).  All those sections would be
 contiguous in memory.

 Many COFF systems build constructor tables in the linker script.  The
 compiler will arrange to output the address of each global constructor
 in a @samp{.ctor} section, and the address of each global destructor in
 a @samp{.dtor} section (this is done by defining
 @code{ASM_OUTPUT_CONSTRUCTOR} and @code{ASM_OUTPUT_DESTRUCTOR} in the
 @code{gcc} configuration files).  The @code{gcc} runtime support
 routines expect the constructor table to be named @code{__CTOR_LIST__}.
 They expect it to be a list of words, with the first word being the
 count of the number of entries.  There should be a trailing zero word.
 (Actually, the count may be -1 if the trailing word is present, and the
 trailing word may be omitted if the count is correct, but, as the
 @code{gcc} behaviour has changed slightly over the years, it is safest
 to provide both).  Here is a typical way that might be handled in a
 @file{scripttempl} file.
 @smallexample
     $@{CONSTRUCTING+ __CTOR_LIST__ = .;@}
     $@{CONSTRUCTING+ LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)@}
     $@{CONSTRUCTING+ *(.ctors)@}
     $@{CONSTRUCTING+ LONG(0)@}
     $@{CONSTRUCTING+ __CTOR_END__ = .;@}
     $@{CONSTRUCTING+ __DTOR_LIST__ = .;@}
     $@{CONSTRUCTING+ LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)@}
     $@{CONSTRUCTING+ *(.dtors)@}
     $@{CONSTRUCTING+ LONG(0)@}
     $@{CONSTRUCTING+ __DTOR_END__ = .;@}
 @end smallexample
 The use of @code{CONSTRUCTING} ensures that these linker script commands
 will only appear when the linker is supposed to be building the
 constructor and destructor tables.  This example is written for a target
 which uses 4 byte pointers.

 Embedded systems often need to set a stack address.  This is normally
 best done by using the @code{PROVIDE} construct with a default stack
 address.  This permits the user to easily override the stack address
 using the @code{--defsym} option.  Here is an example:
 @smallexample
   $@{RELOCATING+ PROVIDE (__stack = 0x80000000);@}
 @end smallexample
 The value of the symbol @code{__stack} would then be used in the startup
 code to initialize the stack pointer.

 @node linker emulations
 @section @file{emultempl} scripts

 Each linker target uses an @file{emultempl} script to generate the
 emulation code.  The name of the @file{emultempl} script is set by the
 @code{TEMPLATE_NAME} variable in the @file{emulparams} script.  If the
 @code{TEMPLATE_NAME} variable is not set, the default is
 @samp{generic}.  If the value of @code{TEMPLATE_NAME} is @var{template},
 @file{genscripts.sh} will use @file{emultempl/@var{template}.em}.

 Most targets use the generic @file{emultempl} script,
 @file{emultempl/generic.em}.  A different @file{emultempl} script is
 only needed if the linker must support unusual actions, such as linking
 against shared libraries.

 The @file{emultempl} script is normally written as a simple invocation
 of @code{cat} with a here document.  The document will use a few
 variable substitutions.  Typically each function names uses a
 substitution involving @code{EMULATION_NAME}, for ease of debugging when
 the linker supports multiple emulations.

 Every function and variable in the emitted file should be static.  The
 only globally visible object must be named
 @code{ld_@var{EMULATION_NAME}_emulation}, where @var{EMULATION_NAME} is
 the name of the emulation set in @file{configure.tgt} (this is also the
 name of the @file{emulparams} file without the @file{.sh} extension).
 The @file{genscripts.sh} script will set the shell variable
 @code{EMULATION_NAME} before invoking the @file{emultempl} script.

 The @code{ld_@var{EMULATION_NAME}_emulation} variable must be a
 @code{struct ld_emulation_xfer_struct}, as defined in @file{ldemul.h}.
 It defines a set of function pointers which are invoked by the linker,
 as well as strings for the emulation name (normally set from the shell
 variable @code{EMULATION_NAME} and the default BFD target name (normally
 set from the shell variable @code{OUTPUT_FORMAT} which is normally set
 by the @file{emulparams} file).

 The @file{genscripts.sh} script will set the shell variable
 @code{COMPILE_IN} when it invokes the @file{emultempl} script for the
 default emulation.  In this case, the @file{emultempl} script should
 include the linker scripts directly, and return them from the
 @code{get_scripts} entry point.  When the emulation is not the default,
 the @code{get_scripts} entry point should just return a file name.  See
 @file{emultempl/generic.em} for an example of how this is done.

 At some point, the linker emulation entry points should be documented.

 @node Emulation Walkthrough
 @chapter A Walkthrough of a Typical Emulation

 This chapter is to help people who are new to the way emulations
 interact with the linker, or who are suddenly thrust into the position
 of having to work with existing emulations.  It will discuss the files
 you need to be aware of.  It will tell you when the given "hooks" in
 the emulation will be called.  It will, hopefully, give you enough
 information about when and how things happen that you'll be able to
 get by.  As always, the source is the definitive reference to this.

 The starting point for the linker is in @file{ldmain.c} where
 @code{main} is defined.  The bulk of the code that's emulation
 specific will initially be in @code{emultempl/@var{emulation}.em} but
 will end up in @code{e@var{emulation}.c} when the build is done.
 Most of the work to select and interface with emulations is in
 @code{ldemul.h} and @code{ldemul.c}.  Specifically, @code{ldemul.h}
 defines the @code{ld_emulation_xfer_struct} structure your emulation
 exports.

 Your emulation file exports a symbol
 @code{ld_@var{EMULATION_NAME}_emulation}.  If your emulation is
 selected (it usually is, since usually there's only one),
 @code{ldemul.c} sets the variable @var{ld_emulation} to point to it.
 @code{ldemul.c} also defines a number of API functions that interface
 to your emulation, like @code{ldemul_after_parse} which simply calls
 your @code{ld_@var{EMULATION}_emulation.after_parse} function.  For
 the rest of this section, the functions will be mentioned, but you
 should assume the indirect reference to your emulation also.

 We will also skip or gloss over parts of the link process that don't
 relate to emulations, like setting up internationalization.

 After initialization, @code{main} selects an emulation by pre-scanning
 the command-line arguments.  It calls @code{ldemul_choose_target} to
 choose a target.  If you set @code{choose_target} to
 @code{ldemul_default_target}, it picks your @code{target_name} by
 default.

 @code{main} calls @code{ldemul_before_parse}, then @code{parse_args}.
 @code{parse_args} calls @code{ldemul_parse_args} for each arg, which
 must update the @code{getopt} globals if it recognizes the argument.
 If the emulation doesn't recognize it, then parse_args checks to see
 if it recognizes it.

 Now that the emulation has had access to all its command-line options,
 @code{main} calls @code{ldemul_set_symbols}.  This can be used for any
 initialization that may be affected by options.  It is also supposed
 to set up any variables needed by the emulation script.

 @code{main} now calls @code{ldemul_get_script} to get the emulation
 script to use (based on arguments, no doubt, @pxref{Emulations}) and
 runs it.  While parsing, @code{ldgram.y} may call @code{ldemul_hll} or
 @code{ldemul_syslib} to handle the @code{HLL} or @code{SYSLIB}
 commands.  It may call @code{ldemul_unrecognized_file} if you asked
 the linker to link a file it doesn't recognize.  It will call
 @code{ldemul_recognized_file} for each file it does recognize, in case
 the emulation wants to handle some files specially.  All the while,
 it's loading the files (possibly calling
 @code{ldemul_open_dynamic_archive}) and symbols and stuff.  After it's
 done reading the script, @code{main} calls @code{ldemul_after_parse}.
 Use the after-parse hook to set up anything that depends on stuff the
 script might have set up, like the entry point.

 @code{main} next calls @code{lang_process} in @code{ldlang.c}.  This
 appears to be the main core of the linking itself, as far as emulation
 hooks are concerned(*).  It first opens the output file's BFD, calling
 @code{ldemul_set_output_arch}, and calls
 @code{ldemul_create_output_section_statements} in case you need to use
 other means to find or create object files (i.e. shared libraries
 found on a path, or fake stub objects).  Despite the name, nobody
 creates output sections here.

 (*) In most cases, the BFD library does the bulk of the actual
 linking, handling symbol tables, symbol resolution, relocations, and
 building the final output file.  See the BFD reference for all the
 details.  Your emulation is usually concerned more with managing
 things at the file and section level, like "put this here, add this
 section", etc.

 Next, the objects to be linked are opened and BFDs created for them,
 and @code{ldemul_after_open} is called.  At this point, you have all
 the objects and symbols loaded, but none of the data has been placed
 yet.

 Next comes the Big Linking Thingy (except for the parts BFD does).
 All input sections are mapped to output sections according to the
 script.  If a section doesn't get mapped by default,
 @code{ldemul_place_orphan} will get called to figure out where it goes.
 Next it figures out the offsets for each section, calling
 @code{ldemul_before_allocation} before and
 @code{ldemul_after_allocation} after deciding where each input section
 ends up in the output sections.

 The last part of @code{lang_process} is to figure out all the symbols'
 values.  After assigning final values to the symbols,
 @code{ldemul_finish} is called, and after that, any undefined symbols
 are turned into fatal errors.

 OK, back to @code{main}, which calls @code{ldwrite} in
 @file{ldwrite.c}.  @code{ldwrite} calls BFD's final_link, which does
 all the relocation fixups and writes the output bfd to disk, and we're
 done.

 In summary,

 @itemize @bullet

 @item @code{main()} in @file{ldmain.c}
 @item @file{emultempl/@var{EMULATION}.em} has your code
 @item @code{ldemul_choose_target} (defaults to your @code{target_name})
 @item @code{ldemul_before_parse}
 @item Parse argv, calls @code{ldemul_parse_args} for each
 @item @code{ldemul_set_symbols}
 @item @code{ldemul_get_script}
 @item parse script

 @itemize @bullet
 @item may call @code{ldemul_hll} or @code{ldemul_syslib}
 @item may call @code{ldemul_open_dynamic_archive}
 @end itemize

 @item @code{ldemul_after_parse}
 @item @code{lang_process()} in @file{ldlang.c}

 @itemize @bullet
 @item create @code{output_bfd}
 @item @code{ldemul_set_output_arch}
 @item @code{ldemul_create_output_section_statements}
 @item read objects, create input bfds - all symbols exist, but have no values
 @item may call @code{ldemul_unrecognized_file}
 @item will call @code{ldemul_recognized_file}
 @item @code{ldemul_after_open}
 @item map input sections to output sections
 @item may call @code{ldemul_place_orphan} for remaining sections
 @item @code{ldemul_before_allocation}
 @item gives input sections offsets into output sections, places output sections
 @item @code{ldemul_after_allocation} - section addresses valid
 @item assigns values to symbols
 @item @code{ldemul_finish} - symbol values valid
 @end itemize

 @item output bfd is written to disk

 @end itemize

 @node Architecture Specific
 @chapter Some Architecture Specific Notes

 This is the place for notes on the behavior of @code{ld} on
 specific platforms.  Currently, only Intel x86 is documented (and
 of that, only the auto-import behavior for DLLs).

 @menu
 * ix86::                        Intel x86
 @end menu

 @node ix86
 @section Intel x86

 @table @emph
 @code{ld} can create DLLs that operate with various runtimes available
 on a common x86 operating system.  These runtimes include native (using
 the mingw "platform"), cygwin, and pw.

 @item auto-import from DLLs
 @enumerate
 @item
 With this feature on, DLL clients can import variables from DLL
 without any concern from their side (for example, without any source
 code modifications).  Auto-import can be enabled using the
 @code{--enable-auto-import} flag, or disabled via the
 @code{--disable-auto-import} flag.  Auto-import is disabled by default.

 @item
 This is done completely in bounds of the PE specification (to be fair,
 there's a minor violation of the spec at one point, but in practice
 auto-import works on all known variants of that common x86 operating
 system)  So, the resulting DLL can be used with any other PE
 compiler/linker.

 @item
 Auto-import is fully compatible with standard import method, in which
 variables are decorated using attribute modifiers. Libraries of either
 type may be mixed together.

 @item
 Overhead (space): 8 bytes per imported symbol, plus 20 for each
 reference to it; Overhead (load time): negligible; Overhead
 (virtual/physical memory): should be less than effect of DLL
 relocation.
 @end enumerate

 Motivation

 The obvious and only way to get rid of dllimport insanity is
 to make client access variable directly in the DLL, bypassing
 the extra dereference imposed by ordinary DLL runtime linking.
 I.e., whenever client contains something like

 @code{mov dll_var,%eax,}

 address of dll_var in the command should be relocated to point
 into loaded DLL. The aim is to make OS loader do so, and than
 make ld help with that.  Import section of PE made following
 way: there's a vector of structures each describing imports
 from particular DLL. Each such structure points to two other
 parallel vectors: one holding imported names, and one which
 will hold address of corresponding imported name. So, the
 solution is de-vectorize these structures, making import
 locations be sparse and pointing directly into code.

 Implementation

 For each reference of data symbol to be imported from DLL (to
 set of which belong symbols with name <sym>, if __imp_<sym> is
 found in implib), the import fixup entry is generated. That
 entry is of type IMAGE_IMPORT_DESCRIPTOR and stored in .idata$3
 subsection. Each fixup entry contains pointer to symbol's address
 within .text section (marked with __fuN_<sym> symbol, where N is
 integer), pointer to DLL name (so, DLL name is referenced by
 multiple entries), and pointer to symbol name thunk. Symbol name
 thunk is singleton vector (__nm_th_<symbol>) pointing to
 IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) directly containing
 imported name. Here comes that "om the edge" problem mentioned above:
 PE specification rambles that name vector (OriginalFirstThunk) should
 run in parallel with addresses vector (FirstThunk), i.e. that they
 should have same number of elements and terminated with zero. We violate
 this, since FirstThunk points directly into machine code. But in
 practice, OS loader implemented the sane way: it goes thru
 OriginalFirstThunk and puts addresses to FirstThunk, not something
 else. It once again should be noted that dll and symbol name
 structures are reused across fixup entries and should be there
 anyway to support standard import stuff, so sustained overhead is
 20 bytes per reference. Other question is whether having several
 IMAGE_IMPORT_DESCRIPTORS for the same DLL is possible. Answer is yes,
 it is done even by native compiler/linker (libth32's functions are in
 fact resident in windows9x kernel32.dll, so if you use it, you have
 two IMAGE_IMPORT_DESCRIPTORS for kernel32.dll). Yet other question is
 whether referencing the same PE structures several times is valid.
 The answer is why not, prohibiting that (detecting violation) would
 require more work on behalf of loader than not doing it.

 @end table

 @node GNU Free Documentation License
 @chapter GNU Free Documentation License

 @include fdl.texi

 @contents
 @bye