| @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, |
| @c 2000, 2001, 2002, 2003 Free Software Foundation, Inc. |
| @c This is part of the GCC manual. |
| @c For copying conditions, see the file gcc.texi. |
| |
| @node Passes |
| @chapter Passes and Files of the Compiler |
| @cindex passes and files of the compiler |
| @cindex files and passes of the compiler |
| @cindex compiler passes and files |
| |
| @cindex top level of compiler |
| The overall control structure of the compiler is in @file{toplev.c}. This |
| file is responsible for initialization, decoding arguments, opening and |
| closing files, and sequencing the passes. Routines for emitting |
| diagnostic messages are defined in @file{diagnostic.c}. The files |
| @file{pretty-print.h} and @file{pretty-print.c} provide basic support |
| for language-independent pretty-printing. |
| |
| @cindex parsing pass |
| The parsing pass is invoked only once, to parse the entire input. A |
| high level tree representation is then generated from the input, |
| one function at a time. This tree code is then transformed into RTL |
| intermediate code, and processed. The files involved in transforming |
| the trees into RTL are @file{expr.c}, @file{expmed.c}, and |
| @file{stmt.c}. |
| @c Note, the above files aren't strictly the only files involved. It's |
| @c all over the place (function.c, final.c,etc). However, those are |
| @c the files that are supposed to be directly involved, and have |
| @c their purpose listed as such, so i've only listed them. |
| The order of trees that are processed, is not |
| necessarily the same order they are generated from |
| the input, due to deferred inlining, and other considerations. |
| |
| @findex rest_of_compilation |
| @findex rest_of_decl_compilation |
| Each time the parsing pass reads a complete function definition or |
| top-level declaration, it calls either the function |
| @code{rest_of_compilation}, or the function |
| @code{rest_of_decl_compilation} in @file{toplev.c}, which are |
| responsible for all further processing necessary, ending with output of |
| the assembler language. All other compiler passes run, in sequence, |
| within @code{rest_of_compilation}. When that function returns from |
| compiling a function definition, the storage used for that function |
| definition's compilation is entirely freed, unless it is an inline |
| function, or was deferred for some reason (this can occur in |
| templates, for example). |
| (@pxref{Inline,,An Inline Function is As Fast As a Macro,gcc,Using the |
| GNU Compiler Collection (GCC)}). |
| |
| Here is a list of all the passes of the compiler and their source files. |
| Also included is a description of where debugging dumps can be requested |
| with @option{-d} options. |
| |
| @itemize @bullet |
| @item |
| Parsing. This pass reads the entire text of a function definition, |
| constructing a high level tree representation. (Because of the semantic |
| analysis that takes place during this pass, it does more than is |
| formally considered to be parsing.) |
| |
| The tree representation does not entirely follow C syntax, because it is |
| intended to support other languages as well. |
| |
| Language-specific data type analysis is also done in this pass, and every |
| tree node that represents an expression has a data type attached. |
| Variables are represented as declaration nodes. |
| |
| The language-independent source files for parsing are |
| @file{tree.c}, @file{fold-const.c}, and @file{stor-layout.c}. |
| There are also header files @file{tree.h} and @file{tree.def} |
| which define the format of the tree representation. |
| |
| C preprocessing, for language front ends, that want or require it, is |
| performed by cpplib, which is covered in separate documentation. In |
| particular, the internals are covered in @xref{Top, ,Cpplib internals, |
| cppinternals, Cpplib Internals}. |
| |
| The source files to parse C are found in the toplevel directory, and |
| by convention are named @file{c-*}. Some of these are also used by |
| the other C-like languages: @file{c-common.c}, |
| @file{c-common.def}, |
| @file{c-format.c}, |
| @file{c-opts.c}, |
| @file{c-pragma.c}, |
| @file{c-semantics.c}, |
| @file{c-lex.c}, |
| @file{c-incpath.c}, |
| @file{c-ppoutput.c}, |
| @file{c-cppbuiltin.c}, |
| @file{c-common.h}, |
| @file{c-dump.h}, |
| @file{c.opt}, |
| @file{c-incpath.h} |
| and |
| @file{c-pragma.h}, |
| |
| Files specific to each language are in subdirectories named after the |
| language in question, like @file{ada}, @file{objc}, @file{cp} (for C++). |
| |
| @cindex Tree optimization |
| @item |
| Tree optimization. This is the optimization of the tree |
| representation, before converting into RTL code. |
| |
| @cindex inline on trees, automatic |
| Currently, the main optimization performed here is tree-based |
| inlining. |
| This is implemented in @file{tree-inline.c} and used by both C and C++. |
| Note that tree based inlining turns off rtx based inlining (since it's more |
| powerful, it would be a waste of time to do rtx based inlining in |
| addition). |
| |
| @cindex constant folding |
| @cindex arithmetic simplifications |
| @cindex simplifications, arithmetic |
| Constant folding and some arithmetic simplifications are also done |
| during this pass, on the tree representation. |
| The routines that perform these tasks are located in @file{fold-const.c}. |
| |
| @cindex RTL generation |
| @item |
| RTL generation. This is the conversion of syntax tree into RTL code. |
| |
| @cindex target-parameter-dependent code |
| This is where the bulk of target-parameter-dependent code is found, |
| since often it is necessary for strategies to apply only when certain |
| standard kinds of instructions are available. The purpose of named |
| instruction patterns is to provide this information to the RTL |
| generation pass. |
| |
| @cindex tail recursion optimization |
| Optimization is done in this pass for @code{if}-conditions that are |
| comparisons, boolean operations or conditional expressions. Tail |
| recursion is detected at this time also. Decisions are made about how |
| best to arrange loops and how to output @code{switch} statements. |
| |
| @c Avoiding overfull is tricky here. |
| The source files for RTL generation include |
| @file{stmt.c}, |
| @file{calls.c}, |
| @file{expr.c}, |
| @file{explow.c}, |
| @file{expmed.c}, |
| @file{function.c}, |
| @file{optabs.c} |
| and @file{emit-rtl.c}. |
| Also, the file |
| @file{insn-emit.c}, generated from the machine description by the |
| program @code{genemit}, is used in this pass. The header file |
| @file{expr.h} is used for communication within this pass. |
| |
| @findex genflags |
| @findex gencodes |
| The header files @file{insn-flags.h} and @file{insn-codes.h}, |
| generated from the machine description by the programs @code{genflags} |
| and @code{gencodes}, tell this pass which standard names are available |
| for use and which patterns correspond to them. |
| |
| Aside from debugging information output, none of the following passes |
| refers to the tree structure representation of the function (only |
| part of which is saved). |
| |
| @cindex inline on rtx, automatic |
| The decision of whether the function can and should be expanded inline |
| in its subsequent callers is made at the end of rtl generation. The |
| function must meet certain criteria, currently related to the size of |
| the function and the types and number of parameters it has. Note that |
| this function may contain loops, recursive calls to itself |
| (tail-recursive functions can be inlined!), gotos, in short, all |
| constructs supported by GCC@. The file @file{integrate.c} contains |
| the code to save a function's rtl for later inlining and to inline that |
| rtl when the function is called. The header file @file{integrate.h} |
| is also used for this purpose. |
| |
| @opindex dr |
| The option @option{-dr} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.rtl} to |
| the input file name. |
| |
| @c Should the exception handling pass be talked about here? |
| |
| @cindex sibling call optimization |
| @item |
| Sibling call optimization. This pass performs tail recursion |
| elimination, and tail and sibling call optimizations. The purpose of |
| these optimizations is to reduce the overhead of function calls, |
| whenever possible. |
| |
| The source file of this pass is @file{sibcall.c} |
| |
| @opindex di |
| The option @option{-di} causes a debugging dump of the RTL code after |
| this pass is run. This dump file's name is made by appending |
| @samp{.sibling} to the input file name. |
| |
| @cindex jump optimization |
| @cindex unreachable code |
| @cindex dead code |
| @item |
| Jump optimization. This pass simplifies jumps to the following |
| instruction, jumps across jumps, and jumps to jumps. It deletes |
| unreferenced labels and unreachable code, except that unreachable code |
| that contains a loop is not recognized as unreachable in this pass. |
| (Such loops are deleted later in the basic block analysis.) It also |
| converts some code originally written with jumps into sequences of |
| instructions that directly set values from the results of comparisons, |
| if the machine has such instructions. |
| |
| Jump optimization is performed two or three times. The first time is |
| immediately following RTL generation. The second time is after CSE, |
| but only if CSE says repeated jump optimization is needed. The |
| last time is right before the final pass. That time, cross-jumping |
| and deletion of no-op move instructions are done together with the |
| optimizations described above. |
| |
| The source file of this pass is @file{jump.c}. |
| |
| @opindex dj |
| The option @option{-dj} causes a debugging dump of the RTL code after |
| this pass is run for the first time. This dump file's name is made by |
| appending @samp{.jump} to the input file name. |
| |
| |
| @cindex register use analysis |
| @item |
| Register scan. This pass finds the first and last use of each |
| register, as a guide for common subexpression elimination. Its source |
| is in @file{regclass.c}. |
| |
| @cindex jump threading |
| @item |
| @opindex fthread-jumps |
| Jump threading. This pass detects a condition jump that branches to an |
| identical or inverse test. Such jumps can be @samp{threaded} through |
| the second conditional test. The source code for this pass is in |
| @file{jump.c}. This optimization is only performed if |
| @option{-fthread-jumps} is enabled. |
| |
| @cindex common subexpression elimination |
| @cindex constant propagation |
| @item |
| Common subexpression elimination. This pass also does constant |
| propagation. Its source files are @file{cse.c}, and @file{cselib.c}. |
| If constant propagation causes conditional jumps to become |
| unconditional or to become no-ops, jump optimization is run again when |
| CSE is finished. |
| |
| @opindex ds |
| The option @option{-ds} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.cse} to |
| the input file name. |
| |
| @cindex global common subexpression elimination |
| @cindex constant propagation |
| @cindex copy propagation |
| @item |
| Global common subexpression elimination. This pass performs two |
| different types of GCSE depending on whether you are optimizing for |
| size or not (LCM based GCSE tends to increase code size for a gain in |
| speed, while Morel-Renvoise based GCSE does not). |
| When optimizing for size, GCSE is done using Morel-Renvoise Partial |
| Redundancy Elimination, with the exception that it does not try to move |
| invariants out of loops---that is left to the loop optimization pass. |
| If MR PRE GCSE is done, code hoisting (aka unification) is also done, as |
| well as load motion. |
| If you are optimizing for speed, LCM (lazy code motion) based GCSE is |
| done. LCM is based on the work of Knoop, Ruthing, and Steffen. LCM |
| based GCSE also does loop invariant code motion. We also perform load |
| and store motion when optimizing for speed. |
| Regardless of which type of GCSE is used, the GCSE pass also performs |
| global constant and copy propagation. |
| |
| The source file for this pass is @file{gcse.c}, and the LCM routines |
| are in @file{lcm.c}. |
| |
| @opindex dG |
| The option @option{-dG} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.gcse} to |
| the input file name. |
| |
| @cindex loop optimization |
| @cindex code motion |
| @cindex strength-reduction |
| @item |
| Loop optimization. This pass moves constant expressions out of loops, |
| and optionally does strength-reduction and loop unrolling as well. |
| Its source files are @file{loop.c} and @file{unroll.c}, plus the header |
| @file{loop.h} used for communication between them. Loop unrolling uses |
| some functions in @file{integrate.c} and the header @file{integrate.h}. |
| Loop dependency analysis routines are contained in @file{dependence.c}. |
| |
| Second loop optimization pass takes care of basic block level optimizations -- |
| unrolling, peeling and unswitching loops. The source files are |
| @file{cfgloopanal.c} and @file{cfgloopmanip.c} containing generic loop |
| analysis and manipulation code, @file{loop-init.c} with initialization and |
| finalization code, @file{loop-unswitch.c} for loop unswitching and |
| @file{loop-unroll.c} for loop unrolling and peeling. |
| |
| @opindex dL |
| The option @option{-dL} causes a debugging dump of the RTL code after |
| these passes. The dump file names are made by appending @samp{.loop} and |
| @samp{.loop2} to the input file name. |
| |
| @cindex jump bypassing |
| @item |
| Jump bypassing. This pass is an aggressive form of GCSE that transforms |
| the control flow graph of a function by propagating constants into |
| conditional branch instructions. |
| |
| The source file for this pass is @file{gcse.c}. |
| |
| @opindex dG |
| The option @option{-dG} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.bypass} |
| to the input file name. |
| |
| @cindex web construction |
| @item |
| Simple optimization pass that splits independent uses of each pseudo |
| increasing effect of other optimizations. This can improve effect of the |
| other transformation, such as CSE or register allocation. |
| Its source files are @file{web.c}. |
| |
| @opindex dZ |
| The option @option{-dZ} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.web} to |
| the input file name. |
| |
| @item |
| @opindex frerun-cse-after-loop |
| If @option{-frerun-cse-after-loop} was enabled, a second common |
| subexpression elimination pass is performed after the loop optimization |
| pass. Jump threading is also done again at this time if it was specified. |
| |
| @opindex dt |
| The option @option{-dt} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.cse2} to |
| the input file name. |
| |
| @cindex data flow analysis |
| @cindex analysis, data flow |
| @cindex basic blocks |
| @item |
| Data flow analysis (@file{flow.c}). This pass divides the program |
| into basic blocks (and in the process deletes unreachable loops); then |
| it computes which pseudo-registers are live at each point in the |
| program, and makes the first instruction that uses a value point at |
| the instruction that computed the value. |
| |
| @cindex autoincrement/decrement analysis |
| This pass also deletes computations whose results are never used, and |
| combines memory references with add or subtract instructions to make |
| autoincrement or autodecrement addressing. |
| |
| @opindex df |
| The option @option{-df} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.flow} to |
| the input file name. If stupid register allocation is in use, this |
| dump file reflects the full results of such allocation. |
| |
| @cindex instruction combination |
| @item |
| Instruction combination (@file{combine.c}). This pass attempts to |
| combine groups of two or three instructions that are related by data |
| flow into single instructions. It combines the RTL expressions for |
| the instructions by substitution, simplifies the result using algebra, |
| and then attempts to match the result against the machine description. |
| |
| @opindex dc |
| The option @option{-dc} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.combine} |
| to the input file name. |
| |
| @cindex if conversion |
| @item |
| If-conversion is a transformation that transforms control dependencies |
| into data dependencies (IE it transforms conditional code into a |
| single control stream). |
| It is implemented in the file @file{ifcvt.c}. |
| |
| @opindex dE |
| The option @option{-dE} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.ce} to |
| the input file name. |
| |
| @cindex register movement |
| @item |
| Register movement (@file{regmove.c}). This pass looks for cases where |
| matching constraints would force an instruction to need a reload, and |
| this reload would be a register-to-register move. It then attempts |
| to change the registers used by the instruction to avoid the move |
| instruction. |
| |
| @opindex dN |
| The option @option{-dN} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.regmove} |
| to the input file name. |
| |
| @cindex instruction scheduling |
| @cindex scheduling, instruction |
| @item |
| Instruction scheduling (@file{sched.c}). This pass looks for |
| instructions whose output will not be available by the time that it is |
| used in subsequent instructions. (Memory loads and floating point |
| instructions often have this behavior on RISC machines). It re-orders |
| instructions within a basic block to try to separate the definition and |
| use of items that otherwise would cause pipeline stalls. |
| |
| Instruction scheduling is performed twice. The first time is immediately |
| after instruction combination and the second is immediately after reload. |
| |
| @opindex dS |
| The option @option{-dS} causes a debugging dump of the RTL code after this |
| pass is run for the first time. The dump file's name is made by |
| appending @samp{.sched} to the input file name. |
| |
| @cindex register allocation |
| @item |
| Register allocation. These passes make sure that all occurrences of pseudo |
| registers are eliminated, either by allocating them to a hard register, |
| replacing them by an equivalent expression (e.g.@: a constant) or by placing |
| them on the stack. This is done in several subpasses: |
| |
| @itemize @bullet |
| @cindex register class preference pass |
| @item |
| Register class preferencing. The RTL code is scanned to find out |
| which register class is best for each pseudo register. The source |
| file is @file{regclass.c}. |
| |
| @cindex local register allocation |
| @item |
| Local register allocation (@file{local-alloc.c}). This pass allocates |
| hard registers to pseudo registers that are used only within one basic |
| block. Because the basic block is linear, it can use fast and |
| powerful techniques to do a very good job. |
| |
| @opindex dl |
| The option @option{-dl} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.lreg} to |
| the input file name. |
| |
| @cindex global register allocation |
| @item |
| Global register allocation (@file{global.c}). This pass |
| allocates hard registers for the remaining pseudo registers (those |
| whose life spans are not contained in one basic block). |
| |
| @cindex graph coloring register allocation |
| @opindex fnew-ra |
| @opindex dl |
| @item |
| Graph coloring register allocator. The files @file{ra.c}, @file{ra-build.c}, |
| @file{ra-colorize.c}, @file{ra-debug.c}, @file{ra-rewrite.c} together with |
| the header @file{ra.h} contain another register allocator, which is used |
| when the option @option{-fnew-ra} is given. In that case it is run instead |
| of the above mentioned local and global register allocation passes, and the |
| option @option{-dl} causes a debugging dump of its work. |
| |
| @cindex reloading |
| @item |
| Reloading. This pass renumbers pseudo registers with the hardware |
| registers numbers they were allocated. Pseudo registers that did not |
| get hard registers are replaced with stack slots. Then it finds |
| instructions that are invalid because a value has failed to end up in |
| a register, or has ended up in a register of the wrong kind. It fixes |
| up these instructions by reloading the problematical values |
| temporarily into registers. Additional instructions are generated to |
| do the copying. |
| |
| The reload pass also optionally eliminates the frame pointer and inserts |
| instructions to save and restore call-clobbered registers around calls. |
| |
| Source files are @file{reload.c} and @file{reload1.c}, plus the header |
| @file{reload.h} used for communication between them. |
| |
| @opindex dg |
| The option @option{-dg} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.greg} to |
| the input file name. |
| @end itemize |
| |
| @cindex instruction scheduling |
| @cindex scheduling, instruction |
| @item |
| Instruction scheduling is repeated here to try to avoid pipeline stalls |
| due to memory loads generated for spilled pseudo registers. |
| |
| @opindex dR |
| The option @option{-dR} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.sched2} |
| to the input file name. |
| |
| @cindex basic block reordering |
| @cindex reordering, block |
| @item |
| Basic block reordering. This pass implements profile guided code |
| positioning. If profile information is not available, various types of |
| static analysis are performed to make the predictions normally coming |
| from the profile feedback (IE execution frequency, branch probability, |
| etc). It is implemented in the file @file{bb-reorder.c}, and the |
| various prediction routines are in @file{predict.c}. |
| |
| @opindex dB |
| The option @option{-dB} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.bbro} to |
| the input file name. |
| |
| @cindex delayed branch scheduling |
| @cindex scheduling, delayed branch |
| @item |
| Delayed branch scheduling. This optional pass attempts to find |
| instructions that can go into the delay slots of other instructions, |
| usually jumps and calls. The source file name is @file{reorg.c}. |
| |
| @opindex dd |
| The option @option{-dd} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.dbr} |
| to the input file name. |
| |
| @cindex branch shortening |
| @item |
| Branch shortening. On many RISC machines, branch instructions have a |
| limited range. Thus, longer sequences of instructions must be used for |
| long branches. In this pass, the compiler figures out what how far each |
| instruction will be from each other instruction, and therefore whether |
| the usual instructions, or the longer sequences, must be used for each |
| branch. |
| |
| @cindex register-to-stack conversion |
| @item |
| Conversion from usage of some hard registers to usage of a register |
| stack may be done at this point. Currently, this is supported only |
| for the floating-point registers of the Intel 80387 coprocessor. The |
| source file name is @file{reg-stack.c}. |
| |
| @opindex dk |
| The options @option{-dk} causes a debugging dump of the RTL code after |
| this pass. This dump file's name is made by appending @samp{.stack} |
| to the input file name. |
| |
| @cindex final pass |
| @cindex peephole optimization |
| @item |
| Final. This pass outputs the assembler code for the function. It is |
| also responsible for identifying spurious test and compare |
| instructions. Machine-specific peephole optimizations are performed |
| at the same time. The function entry and exit sequences are generated |
| directly as assembler code in this pass; they never exist as RTL@. |
| |
| The source files are @file{final.c} plus @file{insn-output.c}; the |
| latter is generated automatically from the machine description by the |
| tool @file{genoutput}. The header file @file{conditions.h} is used |
| for communication between these files. |
| |
| @cindex debugging information generation |
| @item |
| Debugging information output. This is run after final because it must |
| output the stack slot offsets for pseudo registers that did not get |
| hard registers. Source files are @file{dbxout.c} for DBX symbol table |
| format, @file{sdbout.c} for SDB symbol table format, @file{dwarfout.c} |
| for DWARF symbol table format, files @file{dwarf2out.c} and |
| @file{dwarf2asm.c} for DWARF2 symbol table format, and @file{vmsdbgout.c} |
| for VMS debug symbol table format. |
| @end itemize |
| |
| Some additional files are used by all or many passes: |
| |
| @itemize @bullet |
| @item |
| Every pass uses @file{machmode.def} and @file{machmode.h} which define |
| the machine modes. |
| |
| @item |
| Several passes use @file{real.h}, which defines the default |
| representation of floating point constants and how to operate on them. |
| |
| @item |
| All the passes that work with RTL use the header files @file{rtl.h} |
| and @file{rtl.def}, and subroutines in file @file{rtl.c}. The tools |
| @code{gen*} also use these files to read and work with the machine |
| description RTL@. |
| |
| @item |
| All the tools that read the machine description use support routines |
| found in @file{gensupport.c}, @file{errors.c}, and @file{read-rtl.c}. |
| |
| @findex genconfig |
| @item |
| Several passes refer to the header file @file{insn-config.h} which |
| contains a few parameters (C macro definitions) generated |
| automatically from the machine description RTL by the tool |
| @code{genconfig}. |
| |
| @cindex instruction recognizer |
| @item |
| Several passes use the instruction recognizer, which consists of |
| @file{recog.c} and @file{recog.h}, plus the files @file{insn-recog.c} |
| and @file{insn-extract.c} that are generated automatically from the |
| machine description by the tools @file{genrecog} and |
| @file{genextract}. |
| |
| @item |
| Several passes use the header files @file{regs.h} which defines the |
| information recorded about pseudo register usage, and @file{basic-block.h} |
| which defines the information recorded about basic blocks. |
| |
| @item |
| @file{hard-reg-set.h} defines the type @code{HARD_REG_SET}, a bit-vector |
| with a bit for each hard register, and some macros to manipulate it. |
| This type is just @code{int} if the machine has few enough hard registers; |
| otherwise it is an array of @code{int} and some of the macros expand |
| into loops. |
| |
| @item |
| Several passes use instruction attributes. A definition of the |
| attributes defined for a particular machine is in file |
| @file{insn-attr.h}, which is generated from the machine description by |
| the program @file{genattr}. The file @file{insn-attrtab.c} contains |
| subroutines to obtain the attribute values for insns and information |
| about processor pipeline characteristics for the instruction |
| scheduler. It is generated from the machine description by the |
| program @file{genattrtab}. |
| @end itemize |