| @c Copyright (C) 1988,89,92,93,94,96 Free Software Foundation, Inc. |
| @c This is part of the GCC manual. |
| @c For copying conditions, see the file gcc.texi. |
| |
| @ifset INTERNALS |
| @node Machine Desc |
| @chapter Machine Descriptions |
| @cindex machine descriptions |
| |
| A machine description has two parts: a file of instruction patterns |
| (@file{.md} file) and a C header file of macro definitions. |
| |
| The @file{.md} file for a target machine contains a pattern for each |
| instruction that the target machine supports (or at least each instruction |
| that is worth telling the compiler about). It may also contain comments. |
| A semicolon causes the rest of the line to be a comment, unless the semicolon |
| is inside a quoted string. |
| |
| See the next chapter for information on the C header file. |
| |
| @menu |
| * Patterns:: How to write instruction patterns. |
| * Example:: An explained example of a @code{define_insn} pattern. |
| * RTL Template:: The RTL template defines what insns match a pattern. |
| * Output Template:: The output template says how to make assembler code |
| from such an insn. |
| * Output Statement:: For more generality, write C code to output |
| the assembler code. |
| * Constraints:: When not all operands are general operands. |
| * Standard Names:: Names mark patterns to use for code generation. |
| * Pattern Ordering:: When the order of patterns makes a difference. |
| * Dependent Patterns:: Having one pattern may make you need another. |
| * Jump Patterns:: Special considerations for patterns for jump insns. |
| * Insn Canonicalizations:: Canonicalization of Instructions |
| * Peephole Definitions:: Defining machine-specific peephole optimizations. |
| * Expander Definitions:: Generating a sequence of several RTL insns |
| for a standard operation. |
| * Insn Splitting:: Splitting Instructions into Multiple Instructions |
| * Insn Attributes:: Specifying the value of attributes for generated insns. |
| @end menu |
| |
| @node Patterns |
| @section Everything about Instruction Patterns |
| @cindex patterns |
| @cindex instruction patterns |
| |
| @findex define_insn |
| Each instruction pattern contains an incomplete RTL expression, with pieces |
| to be filled in later, operand constraints that restrict how the pieces can |
| be filled in, and an output pattern or C code to generate the assembler |
| output, all wrapped up in a @code{define_insn} expression. |
| |
| A @code{define_insn} is an RTL expression containing four or five operands: |
| |
| @enumerate |
| @item |
| An optional name. The presence of a name indicates that this instruction |
| pattern can perform a certain standard job for the RTL-generation |
| pass of the compiler. This pass knows certain names and will use |
| the instruction patterns with those names, if the names are defined |
| in the machine description. |
| |
| The absence of a name is indicated by writing an empty string |
| where the name should go. Nameless instruction patterns are never |
| used for generating RTL code, but they may permit several simpler insns |
| to be combined later on. |
| |
| Names that are not thus known and used in RTL-generation have no |
| effect; they are equivalent to no name at all. |
| |
| @item |
| The @dfn{RTL template} (@pxref{RTL Template}) is a vector of incomplete |
| RTL expressions which show what the instruction should look like. It is |
| incomplete because it may contain @code{match_operand}, |
| @code{match_operator}, and @code{match_dup} expressions that stand for |
| operands of the instruction. |
| |
| If the vector has only one element, that element is the template for the |
| instruction pattern. If the vector has multiple elements, then the |
| instruction pattern is a @code{parallel} expression containing the |
| elements described. |
| |
| @item |
| @cindex pattern conditions |
| @cindex conditions, in patterns |
| A condition. This is a string which contains a C expression that is |
| the final test to decide whether an insn body matches this pattern. |
| |
| @cindex named patterns and conditions |
| For a named pattern, the condition (if present) may not depend on |
| the data in the insn being matched, but only the target-machine-type |
| flags. The compiler needs to test these conditions during |
| initialization in order to learn exactly which named instructions are |
| available in a particular run. |
| |
| @findex operands |
| For nameless patterns, the condition is applied only when matching an |
| individual insn, and only after the insn has matched the pattern's |
| recognition template. The insn's operands may be found in the vector |
| @code{operands}. |
| |
| @item |
| The @dfn{output template}: a string that says how to output matching |
| insns as assembler code. @samp{%} in this string specifies where |
| to substitute the value of an operand. @xref{Output Template}. |
| |
| When simple substitution isn't general enough, you can specify a piece |
| of C code to compute the output. @xref{Output Statement}. |
| |
| @item |
| Optionally, a vector containing the values of attributes for insns matching |
| this pattern. @xref{Insn Attributes}. |
| @end enumerate |
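| |
| As a minimal sketch of how these pieces fit together, the following |
| nameless pattern uses a condition that examines @code{operands}. The |
| target and its three-operand @samp{add} opcode are hypothetical; the |
| pattern says that a left shift by exactly one can be output as an add |
| of a register to itself. |
| |
| @smallexample |
| (define_insn "" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (ashift:SI (match_operand:SI 1 "register_operand" "r") |
|                    (match_operand:SI 2 "immediate_operand" "i")))] |
|   "GET_CODE (operands[2]) == CONST_INT && INTVAL (operands[2]) == 1" |
|   "add %1,%1,%0") |
| @end smallexample |
| |
| Because the pattern has no name, its condition is free to inspect the |
| operands of the insn being matched. |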
| |
| @node Example |
| @section Example of @code{define_insn} |
| @cindex @code{define_insn} example |
| |
| Here is an actual example of an instruction pattern, for the 68000/68020. |
| |
| @example |
| (define_insn "tstsi" |
| [(set (cc0) |
| (match_operand:SI 0 "general_operand" "rm"))] |
| "" |
| "* |
| @{ if (TARGET_68020 || ! ADDRESS_REG_P (operands[0])) |
| return \"tstl %0\"; |
| return \"cmpl #0,%0\"; @}") |
| @end example |
| |
| This is an instruction that sets the condition codes based on the value of |
| a general operand. It has no condition, so any insn whose RTL description |
| has the form shown may be handled according to this pattern. The name |
| @samp{tstsi} means ``test a @code{SImode} value'' and tells the RTL generation |
| pass that, when it is necessary to test such a value, an insn to do so |
| can be constructed using this pattern. |
| |
| The output control string is a piece of C code which chooses which |
| output template to return based on the kind of operand and the specific |
| type of CPU for which code is being generated. |
| |
| @samp{"rm"} is an operand constraint. Its meaning is explained below. |
| |
| @node RTL Template |
| @section RTL Template |
| @cindex RTL insn template |
| @cindex generating insns |
| @cindex insns, generating |
| @cindex recognizing insns |
| @cindex insns, recognizing |
| |
| The RTL template is used to define which insns match the particular pattern |
| and how to find their operands. For named patterns, the RTL template also |
| says how to construct an insn from specified operands. |
| |
| Construction involves substituting specified operands into a copy of the |
| template. Matching involves determining the values that serve as the |
| operands in the insn being matched. Both of these activities are |
| controlled by special expression types that direct matching and |
| substitution of the operands. |
| |
| @table @code |
| @findex match_operand |
| @item (match_operand:@var{m} @var{n} @var{predicate} @var{constraint}) |
| This expression is a placeholder for operand number @var{n} of |
| the insn. When constructing an insn, operand number @var{n} |
| will be substituted at this point. When matching an insn, whatever |
| appears at this position in the insn will be taken as operand |
| number @var{n}; but it must satisfy @var{predicate} or this instruction |
| pattern will not match at all. |
| |
| Operand numbers must be chosen consecutively counting from zero in |
| each instruction pattern. There may be only one @code{match_operand} |
| expression in the pattern for each operand number. Usually operands |
| are numbered in the order of appearance in @code{match_operand} |
| expressions. |
| |
| @var{predicate} is a string that is the name of a C function that accepts two |
| arguments, an expression and a machine mode. During matching, the |
| function will be called with the putative operand as the expression and |
| @var{m} as the mode argument (if @var{m} is not specified, |
| @code{VOIDmode} will be used, which normally causes @var{predicate} to accept |
| any mode). If it returns zero, this instruction pattern fails to match. |
| @var{predicate} may be an empty string; then it means no test is to be done |
| on the operand, so anything which occurs in this position is valid. |
| |
| Most of the time, @var{predicate} will reject modes other than @var{m}---but |
| not always. For example, the predicate @code{address_operand} uses |
| @var{m} as the mode of memory ref that the address should be valid for. |
| Many predicates accept @code{const_int} nodes even though their mode is |
| @code{VOIDmode}. |
| |
| @var{constraint} controls reloading and the choice of the best register |
| class to use for a value, as explained later (@pxref{Constraints}). |
| |
| People are often unclear on the difference between the constraint and the |
| predicate. The predicate helps decide whether a given insn matches the |
| pattern. The constraint plays no role in this decision; instead, it |
| controls various decisions in the case of an insn which does match. |
| |
| @findex general_operand |
| On CISC machines, the most common @var{predicate} is |
| @code{"general_operand"}. This function checks that the putative |
| operand is either a constant, a register or a memory reference, and that |
| it is valid for mode @var{m}. |
| |
| @findex register_operand |
| For an operand that must be a register, @var{predicate} should be |
| @code{"register_operand"}. Using @code{"general_operand"} would be |
| valid, since the reload pass would copy any non-register operands |
| through registers, but this would make GNU CC do extra work, it would |
| prevent invariant operands (such as constants) from being removed from |
| loops, and it would prevent the register allocator from doing the best |
| possible job. On RISC machines, it is usually most efficient to allow |
| @var{predicate} to accept only objects that the constraints allow. |
| |
| @findex immediate_operand |
| For an operand that must be a constant, you must be sure to either use |
| @code{"immediate_operand"} for @var{predicate}, or make the instruction |
| pattern's extra condition require a constant, or both. You cannot |
| expect the constraints to do this work! If the constraints allow only |
| constants, but the predicate allows something else, the compiler will |
| crash when that case arises. |
| |
| @findex match_scratch |
| @item (match_scratch:@var{m} @var{n} @var{constraint}) |
| This expression is also a placeholder for operand number @var{n} |
| and indicates that operand must be a @code{scratch} or @code{reg} |
| expression. |
| |
| When matching patterns, this is equivalent to |
| |
| @smallexample |
| (match_operand:@var{m} @var{n} "scratch_operand" @var{constraint}) |
| @end smallexample |
| |
| but, when generating RTL, it produces a @code{(scratch:@var{m})} |
| expression. |
| |
| If the last few expressions in a @code{parallel} are @code{clobber} |
| expressions whose operands are either a hard register or |
| @code{match_scratch}, the combiner can add or delete them when |
| necessary. @xref{Side Effects}. |
| |
| @findex match_dup |
| @item (match_dup @var{n}) |
| This expression is also a placeholder for operand number @var{n}. |
| It is used when the operand needs to appear more than once in the |
| insn. |
| |
| In construction, @code{match_dup} acts just like @code{match_operand}: |
| the operand is substituted into the insn being constructed. But in |
| matching, @code{match_dup} behaves differently. It assumes that operand |
| number @var{n} has already been determined by a @code{match_operand} |
| appearing earlier in the recognition template, and it matches only an |
| identical-looking expression. |
| |
| @findex match_operator |
| @item (match_operator:@var{m} @var{n} @var{predicate} [@var{operands}@dots{}]) |
| This pattern is a kind of placeholder for a variable RTL expression |
| code. |
| |
| When constructing an insn, it stands for an RTL expression whose |
| expression code is taken from that of operand @var{n}, and whose |
| operands are constructed from the patterns @var{operands}. |
| |
| When matching an expression, it matches an expression if the function |
| @var{predicate} returns nonzero on that expression @emph{and} the |
| patterns @var{operands} match the operands of the expression. |
| |
| Suppose that the function @code{commutative_operator} is defined as |
| follows, to match any expression whose operator is one of the |
| commutative arithmetic operators of RTL and whose mode is @var{mode}: |
| |
| @smallexample |
| int |
| commutative_operator (x, mode) |
| rtx x; |
| enum machine_mode mode; |
| @{ |
| enum rtx_code code = GET_CODE (x); |
| if (GET_MODE (x) != mode) |
| return 0; |
| return (GET_RTX_CLASS (code) == 'c' |
| || code == EQ || code == NE); |
| @} |
| @end smallexample |
| |
| Then the following pattern will match any RTL expression consisting |
| of a commutative operator applied to two general operands: |
| |
| @smallexample |
| (match_operator:SI 3 "commutative_operator" |
| [(match_operand:SI 1 "general_operand" "g") |
| (match_operand:SI 2 "general_operand" "g")]) |
| @end smallexample |
| |
| Here the vector @code{[@var{operands}@dots{}]} contains two patterns |
| because the expressions to be matched all contain two operands. |
| |
| When this pattern does match, the two operands of the commutative |
| operator are recorded as operands 1 and 2 of the insn. (This is done |
| by the two instances of @code{match_operand}.) Operand 3 of the insn |
| will be the entire commutative expression: use @code{GET_CODE |
| (operands[3])} to see which commutative operator was used. |
| |
| The machine mode @var{m} of @code{match_operator} works like that of |
| @code{match_operand}: it is passed as the second argument to the |
| predicate function, and that function is solely responsible for |
| deciding whether the expression to be matched ``has'' that mode. |
| |
| When constructing an insn, argument 3 of the gen-function will specify |
| the operation (i.e. the expression code) for the expression to be |
| made. It should be an RTL expression, whose expression code is copied |
| into a new expression whose operands are arguments 1 and 2 of the |
| gen-function. The subexpressions of argument 3 are not used; |
| only its expression code matters. |
| |
| When @code{match_operator} is used in a pattern for matching an insn, |
| it is usually best if the operand number of the @code{match_operator} |
| is higher than that of the actual operands of the insn. This improves |
| register allocation because the register allocator often looks at |
| operands 1 and 2 of insns to see if it can do register tying. |
| |
| There is no way to specify constraints in @code{match_operator}. The |
| operand of the insn which corresponds to the @code{match_operator} |
| never has any constraints because it is never reloaded as a whole. |
| However, if parts of its @var{operands} are matched by |
| @code{match_operand} patterns, those parts may have constraints of |
| their own. |
| |
| @findex match_op_dup |
| @item (match_op_dup:@var{m} @var{n}[@var{operands}@dots{}]) |
| Like @code{match_dup}, except that it applies to operators instead of |
| operands. When constructing an insn, operand number @var{n} will be |
| substituted at this point. But in matching, @code{match_op_dup} behaves |
| differently. It assumes that operand number @var{n} has already been |
| determined by a @code{match_operator} appearing earlier in the |
| recognition template, and it matches only an identical-looking |
| expression. |
| |
| @findex match_parallel |
| @item (match_parallel @var{n} @var{predicate} [@var{subpat}@dots{}]) |
| This pattern is a placeholder for an insn that consists of a |
| @code{parallel} expression with a variable number of elements. This |
| expression should only appear at the top level of an insn pattern. |
| |
| When constructing an insn, operand number @var{n} will be substituted at |
| this point. When matching an insn, it matches if the body of the insn |
| is a @code{parallel} expression with at least as many elements as the |
| vector of @var{subpat} expressions in the @code{match_parallel}, if each |
| @var{subpat} matches the corresponding element of the @code{parallel}, |
| @emph{and} the function @var{predicate} returns nonzero on the |
| @code{parallel} that is the body of the insn. It is the responsibility |
| of the predicate to validate elements of the @code{parallel} beyond |
| those listed in the @code{match_parallel}.@refill |
| |
| A typical use of @code{match_parallel} is to match load and store |
| multiple expressions, which can contain a variable number of elements |
| in a @code{parallel}. For example, |
| @c the following is *still* going over. need to change the code. |
| @c also need to work on grouping of this example. --mew 1feb93 |
| |
| @smallexample |
| (define_insn "" |
| [(match_parallel 0 "load_multiple_operation" |
| [(set (match_operand:SI 1 "gpc_reg_operand" "=r") |
| (match_operand:SI 2 "memory_operand" "m")) |
| (use (reg:SI 179)) |
| (clobber (reg:SI 179))])] |
| "" |
| "loadm 0,0,%1,%2") |
| @end smallexample |
| |
| This example comes from @file{a29k.md}. The function |
| @code{load_multiple_operation} is defined in @file{a29k.c} and checks |
| that subsequent elements in the @code{parallel} are the same as the |
| @code{set} in the pattern, except that they are referencing subsequent |
| registers and memory locations. |
| |
| An insn that matches this pattern might look like: |
| |
| @smallexample |
| (parallel |
| [(set (reg:SI 20) (mem:SI (reg:SI 100))) |
| (use (reg:SI 179)) |
| (clobber (reg:SI 179)) |
| (set (reg:SI 21) |
| (mem:SI (plus:SI (reg:SI 100) |
| (const_int 4)))) |
| (set (reg:SI 22) |
| (mem:SI (plus:SI (reg:SI 100) |
| (const_int 8))))]) |
| @end smallexample |
| |
| @findex match_par_dup |
| @item (match_par_dup @var{n} [@var{subpat}@dots{}]) |
| Like @code{match_op_dup}, but for @code{match_parallel} instead of |
| @code{match_operator}. |
| |
| @findex address |
| @item (address (match_operand:@var{m} @var{n} "address_operand" "")) |
| This complex of expressions is a placeholder for an operand number |
| @var{n} in a ``load address'' instruction: an operand which specifies |
| a memory location in the usual way, but for which the actual operand |
| value used is the address of the location, not the contents of the |
| location. |
| |
| @code{address} expressions never appear in RTL code, only in machine |
| descriptions. And they are used only in machine descriptions that do |
| not use the operand constraint feature. When operand constraints are |
| in use, the letter @samp{p} in the constraint serves this purpose. |
| |
| @var{m} is the machine mode of the @emph{memory location being |
| addressed}, not the machine mode of the address itself. That mode is |
| always the same on a given target machine (it is @code{Pmode}, which |
| normally is @code{SImode}), so there is no point in mentioning it; |
| thus, no machine mode is written in the @code{address} expression. If |
| some day support is added for machines in which addresses of different |
| kinds of objects appear differently or are used differently (such as |
| the PDP-10), different formats would perhaps need different machine |
| modes and these modes might be written in the @code{address} |
| expression. |
| @end table |
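| |
| As a sketch of how two of these expressions combine, the following |
| nameless pattern describes a hypothetical two-address multiply |
| instruction that needs a temporary register. The opcode @samp{mul} and |
| its operand order are assumptions made for illustration only. |
| |
| @smallexample |
| (define_insn "" |
|   [(set (match_operand:SI 0 "register_operand" "+r") |
|         (mult:SI (match_dup 0) |
|                  (match_operand:SI 1 "register_operand" "r"))) |
|    (clobber (match_scratch:SI 2 "=&r"))] |
|   "" |
|   "mul %1,%0,%2") |
| @end smallexample |
| |
| The @code{match_dup} requires the first source of the multiply to be |
| identical to the destination, and the @code{match_scratch} asks for a |
| register that the instruction may overwrite. Since the vector has two |
| elements, the pattern stands for a @code{parallel} containing both. |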
| |
| @node Output Template |
| @section Output Templates and Operand Substitution |
| @cindex output templates |
| @cindex operand substitution |
| |
| @cindex @samp{%} in template |
| @cindex percent sign |
| The @dfn{output template} is a string which specifies how to output the |
| assembler code for an instruction pattern. Most of the template is a |
| fixed string which is output literally. The character @samp{%} is used |
| to specify where to substitute an operand; it can also be used to |
| identify places where different variants of the assembler require |
| different syntax. |
| |
| In the simplest case, a @samp{%} followed by a digit @var{n} says to output |
| operand @var{n} at that point in the string. |
| |
| @samp{%} followed by a letter and a digit says to output an operand in an |
| alternate fashion. Four letters have standard, built-in meanings described |
| below. The machine description macro @code{PRINT_OPERAND} can define |
| additional letters with nonstandard meanings. |
| |
| @samp{%c@var{digit}} can be used to substitute an operand that is a |
| constant value without the syntax that normally indicates an immediate |
| operand. |
| |
| @samp{%n@var{digit}} is like @samp{%c@var{digit}} except that the value of |
| the constant is negated before printing. |
| |
| @samp{%a@var{digit}} can be used to substitute an operand as if it were a |
| memory reference, with the actual operand treated as the address. This may |
| be useful when outputting a ``load address'' instruction, because often the |
| assembler syntax for such an instruction requires you to write the operand |
| as if it were a memory reference. |
| |
| @samp{%l@var{digit}} is used to substitute a @code{label_ref} into a jump |
| instruction. |
| |
| @samp{%=} outputs a number which is unique to each instruction in the |
| entire compilation. This is useful for making local labels to be |
| referred to more than once in a single template that generates multiple |
| assembler instructions. |
| |
| @samp{%} followed by a punctuation character specifies a substitution that |
| does not use an operand. Only one case is standard: @samp{%%} outputs a |
| @samp{%} into the assembler code. Other nonstandard cases can be |
| defined in the @code{PRINT_OPERAND} macro. You must also define |
| which punctuation characters are valid with the |
| @code{PRINT_OPERAND_PUNCT_VALID_P} macro. |
| |
| @cindex \ |
| @cindex backslash |
| The template may generate multiple assembler instructions. Write the text |
| for the instructions, with @samp{\;} between them. |
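| |
| For instance, a hypothetical absolute-value pattern might emit a |
| conditional branch around a negate instruction, using @samp{%=} to build |
| a local label that is unique to each insn. The opcodes and the |
| @samp{L} label prefix are assumptions, not taken from any real port. |
| |
| @smallexample |
| (define_insn "" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (abs:SI (match_operand:SI 1 "register_operand" "0")))] |
|   "" |
|   "tst %0\;bge L%=\;neg %0\nL%=:") |
| @end smallexample |
| |
| Both uses of @samp{%=} lie in the same template, so they print the same |
| number and the branch reaches the label emitted after the negate. |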
| |
| @cindex matching operands |
| When the RTL contains two operands which are required by constraint to match |
| each other, the output template must refer only to the lower-numbered operand. |
| Matching operands are not always identical, and the rest of the compiler |
| arranges to put the proper RTL expression for printing into the lower-numbered |
| operand. |
| |
| One use of nonstandard letters or punctuation following @samp{%} is to |
| distinguish between different assembler languages for the same machine; for |
| example, Motorola syntax versus MIT syntax for the 68000. Motorola syntax |
| requires periods in most opcode names, while MIT syntax does not. For |
| example, the opcode @samp{movel} in MIT syntax is @samp{move.l} in Motorola |
| syntax. The same file of patterns is used for both kinds of output syntax, |
| but the character sequence @samp{%.} is used in each place where Motorola |
| syntax wants a period. The @code{PRINT_OPERAND} macro for Motorola syntax |
| defines the sequence to output a period; the macro for MIT syntax defines |
| it to do nothing. |
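| |
| As a short illustration, a move pattern shared by both syntaxes could |
| use an output template like the one below. Only the template string is |
| shown; the operands are assumed to be those of an ordinary two-operand |
| move. |
| |
| @smallexample |
| "move%.l %1,%0" |
| @end smallexample |
| |
| With the Motorola definition of @code{PRINT_OPERAND} this prints the |
| opcode @samp{move.l}; with the MIT definition it prints @samp{movel}. |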
| |
| @cindex @code{#} in template |
| As a special case, a template consisting of the single character @code{#} |
| instructs the compiler to first split the insn, and then output the |
| resulting instructions separately. This helps eliminate redundancy in the |
| output templates. If you have a @code{define_insn} that needs to emit |
| multiple assembler instructions, and there is a matching @code{define_split} |
| already defined, then you can simply use @code{#} as the output template |
| instead of writing an output template that emits the multiple assembler |
| instructions. |
| |
| If the macro @code{ASSEMBLER_DIALECT} is defined, you can use constructs |
| of the form @samp{@{option0|option1|option2@}} in the templates. These |
| describe multiple variants of assembler language syntax. |
| @xref{Instruction Output}. |
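| |
| A minimal sketch, assuming a hypothetical machine whose dialect 0 |
| writes the source operand first and adds an @samp{l} suffix to the |
| opcode, and whose dialect 1 writes the destination first with no |
| suffix: |
| |
| @smallexample |
| "add@{l %1,%0| %0,%1@}" |
| @end smallexample |
| |
| The value of @code{ASSEMBLER_DIALECT} selects which option is output, so |
| this template would produce @samp{addl %1,%0} for dialect 0 and |
| @samp{add %0,%1} for dialect 1. |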
| |
| @node Output Statement |
| @section C Statements for Assembler Output |
| @cindex output statements |
| @cindex C statements for assembler output |
| @cindex generating assembler output |
| |
| Often a single fixed template string cannot produce correct and efficient |
| assembler code for all the cases that are recognized by a single |
| instruction pattern. For example, the opcodes may depend on the kinds of |
| operands; or some unfortunate combinations of operands may require extra |
| machine instructions. |
| |
| If the output control string starts with a @samp{@@}, then it is actually |
| a series of templates, each on a separate line. (Blank lines and |
| leading spaces and tabs are ignored.) The templates correspond to the |
| pattern's constraint alternatives (@pxref{Multi-Alternative}). For example, |
| if a target machine has a two-address add instruction @samp{addr} to add |
| into a register and another @samp{addm} to add a register to memory, you |
| might write this pattern: |
| |
| @smallexample |
| (define_insn "addsi3" |
| [(set (match_operand:SI 0 "general_operand" "=r,m") |
| (plus:SI (match_operand:SI 1 "general_operand" "0,0") |
| (match_operand:SI 2 "general_operand" "g,r")))] |
| "" |
| "@@ |
| addr %2,%0 |
| addm %2,%0") |
| @end smallexample |
| |
| @cindex @code{*} in template |
| @cindex asterisk in template |
| If the output control string starts with a @samp{*}, then it is not an |
| output template but rather a piece of C program that should compute a |
| template. It should execute a @code{return} statement to return the |
| template-string you want. Most such templates use C string literals, which |
| require doublequote characters to delimit them. To include these |
| doublequote characters in the string, prefix each one with @samp{\}. |
| |
| The operands may be found in the array @code{operands}, whose C data type |
| is @code{rtx []}. |
| |
| It is very common to select different ways of generating assembler code |
| based on whether an immediate operand is within a certain range. Be |
| careful when doing this, because the result of @code{INTVAL} is an |
| integer on the host machine. If the host machine has more bits in an |
| @code{int} than the target machine has in the mode in which the constant |
| will be used, then some of the bits you get from @code{INTVAL} will be |
| superfluous. For proper results, you must carefully disregard the |
| values of those bits. |
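| |
| The following sketch shows one way to do this for a 16-bit constant. |
| The opcodes @samp{movq} and @samp{movw} and the range @minus{}128 to 127 |
| belong to a hypothetical target; only the masking and sign extension |
| are the point of the example. |
| |
| @smallexample |
| (define_insn "" |
|   [(set (match_operand:HI 0 "register_operand" "=r") |
|         (match_operand:HI 1 "immediate_operand" "n"))] |
|   "GET_CODE (operands[1]) == CONST_INT" |
|   "* |
| @{ |
|   /* Keep the low 16 bits of the host integer, then sign-extend them.  */ |
|   int val = INTVAL (operands[1]) & 0xffff; |
|   if (val & 0x8000) |
|     val -= 0x10000; |
|   if (val >= -128 && val <= 127) |
|     return \"movq %1,%0\"; |
|   return \"movw %1,%0\"; |
| @}") |
| @end smallexample |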
| |
| @findex output_asm_insn |
| It is possible to output an assembler instruction and then go on to output |
| or compute more of them, using the subroutine @code{output_asm_insn}. This |
| receives two arguments: a template-string and a vector of operands. The |
| vector may be @code{operands}, or it may be another array of @code{rtx} |
| that you declare locally and initialize yourself. |
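| |
| As a sketch, a hypothetical machine without a negate instruction might |
| clear the destination and then subtract the source from it. The opcodes |
| @samp{clr} and @samp{sub} are assumptions; the earlyclobber @samp{&} |
| keeps the destination from sharing a register with the source. |
| |
| @smallexample |
| (define_insn "" |
|   [(set (match_operand:SI 0 "register_operand" "=&r") |
|         (neg:SI (match_operand:SI 1 "register_operand" "r")))] |
|   "" |
|   "* |
| @{ |
|   /* Output the first instruction immediately ...  */ |
|   output_asm_insn (\"clr %0\", operands); |
|   /* ... and return the template for the last one.  */ |
|   return \"sub %1,%0\"; |
| @}") |
| @end smallexample |
| |
| A sequence this simple could equally be written as a single template |
| with @samp{\;}; @code{output_asm_insn} becomes useful when the earlier |
| instructions depend on tests made in the C code. |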
| |
| @findex which_alternative |
| When an insn pattern has multiple alternatives in its constraints, often |
| the appearance of the assembler code is determined mostly by which alternative |
| was matched. When this is so, the C code can test the variable |
| @code{which_alternative}, which is the ordinal number of the alternative |
| that was actually satisfied (0 for the first, 1 for the second alternative, |
| etc.). |
| |
| For example, suppose there are two opcodes for storing zero, @samp{clrreg} |
| for registers and @samp{clrmem} for memory locations. Here is how |
| a pattern could use @code{which_alternative} to choose between them: |
| |
| @smallexample |
| (define_insn "" |
| [(set (match_operand:SI 0 "general_operand" "=r,m") |
| (const_int 0))] |
| "" |
| "* |
| return (which_alternative == 0 |
| ? \"clrreg %0\" : \"clrmem %0\"); |
| ") |
| @end smallexample |
| |
| The example above, where the assembler code to generate was |
| @emph{solely} determined by the alternative, could also have been specified |
| as follows, having the output control string start with a @samp{@@}: |
| |
| @smallexample |
| @group |
| (define_insn "" |
| [(set (match_operand:SI 0 "general_operand" "=r,m") |
| (const_int 0))] |
| "" |
| "@@ |
| clrreg %0 |
| clrmem %0") |
| @end group |
| @end smallexample |
| @end ifset |
| |
| @c Most of this node appears by itself (in a different place) even |
| @c when the INTERNALS flag is clear. Passages that require the full |
| @c manual's context are conditionalized to appear only in the full manual. |
| @ifset INTERNALS |
| @node Constraints |
| @section Operand Constraints |
| @cindex operand constraints |
| @cindex constraints |
| |
| Each @code{match_operand} in an instruction pattern can specify a |
| constraint for the type of operands allowed. |
| @end ifset |
| @ifclear INTERNALS |
| @node Constraints |
| @section Constraints for @code{asm} Operands |
| @cindex operand constraints, @code{asm} |
| @cindex constraints, @code{asm} |
| @cindex @code{asm} constraints |
| |
| Here are specific details on what constraint letters you can use with |
| @code{asm} operands. |
| @end ifclear |
| Constraints can say whether |
| an operand may be in a register, and which kinds of register; whether the |
| operand can be a memory reference, and which kinds of address; whether the |
| operand may be an immediate constant, and which possible values it may |
| have. Constraints can also require two operands to match. |
| |
| @ifset INTERNALS |
| @menu |
| * Simple Constraints:: Basic use of constraints. |
| * Multi-Alternative:: When an insn has two alternative constraint-patterns. |
| * Class Preferences:: Constraints guide which hard register to put things in. |
| * Modifiers:: More precise control over effects of constraints. |
| * Machine Constraints:: Existing constraints for some particular machines. |
| * No Constraints:: Describing a clean machine without constraints. |
| @end menu |
| @end ifset |
| |
| @ifclear INTERNALS |
| @menu |
| * Simple Constraints:: Basic use of constraints. |
| * Multi-Alternative:: When an insn has two alternative constraint-patterns. |
| * Modifiers:: More precise control over effects of constraints. |
| * Machine Constraints:: Special constraints for some particular machines. |
| @end menu |
| @end ifclear |
| |
| @node Simple Constraints |
| @subsection Simple Constraints |
| @cindex simple constraints |
| |
| The simplest kind of constraint is a string full of letters, each of |
| which describes one kind of operand that is permitted. Here are |
| the letters that are allowed: |
| |
| @table @asis |
| @cindex @samp{m} in constraint |
| @cindex memory references in constraints |
| @item @samp{m} |
| A memory operand is allowed, with any kind of address that the machine |
| supports in general. |
| |
| @cindex offsettable address |
| @cindex @samp{o} in constraint |
| @item @samp{o} |
| A memory operand is allowed, but only if the address is |
| @dfn{offsettable}. This means that a small integer (actually, |
| the width in bytes of the operand, as determined by its machine mode) |
| may be added to the address and the result is also a valid memory |
| address. |
| |
| @cindex autoincrement/decrement addressing |
| For example, an address which is constant is offsettable; so is an |
| address that is the sum of a register and a constant (as long as a |
| slightly larger constant is also within the range of address-offsets |
| supported by the machine); but an autoincrement or autodecrement |
| address is not offsettable. More complicated indirect/indexed |
| addresses may or may not be offsettable depending on the other |
| addressing modes that the machine supports. |
| |
| Note that in an output operand which can be matched by another |
| operand, the constraint letter @samp{o} is valid only when accompanied |
| by both @samp{<} (if the target machine has predecrement addressing) |
| and @samp{>} (if the target machine has preincrement addressing). |
| |
| @cindex @samp{V} in constraint |
| @item @samp{V} |
| A memory operand that is not offsettable. In other words, anything that |
| would fit the @samp{m} constraint but not the @samp{o} constraint. |
| |
| @cindex @samp{<} in constraint |
| @item @samp{<} |
| A memory operand with autodecrement addressing (either predecrement or |
| postdecrement) is allowed. |
| |
| @cindex @samp{>} in constraint |
| @item @samp{>} |
| A memory operand with autoincrement addressing (either preincrement or |
| postincrement) is allowed. |
| |
| @cindex @samp{r} in constraint |
| @cindex registers in constraints |
| @item @samp{r} |
| A register operand is allowed provided that it is in a general |
| register. |
| |
| @cindex @samp{d} in constraint |
| @item @samp{d}, @samp{a}, @samp{f}, @dots{} |
| Other letters can be defined in machine-dependent fashion to stand for |
| particular classes of registers. @samp{d}, @samp{a} and @samp{f} are |
| defined on the 68000/68020 to stand for data, address and floating |
| point registers. |
| |
| @cindex constants in constraints |
| @cindex @samp{i} in constraint |
| @item @samp{i} |
| An immediate integer operand (one with constant value) is allowed. |
| This includes symbolic constants whose values will be known only at |
| assembly time. |
| |
| @cindex @samp{n} in constraint |
| @item @samp{n} |
| An immediate integer operand with a known numeric value is allowed. |
| Many systems cannot support assembly-time constants for operands less |
| than a word wide. Constraints for these operands should use @samp{n} |
| rather than @samp{i}. |
| |
| @cindex @samp{I} in constraint |
| @item @samp{I}, @samp{J}, @samp{K}, @dots{} @samp{P} |
| Other letters in the range @samp{I} through @samp{P} may be defined in |
| a machine-dependent fashion to permit immediate integer operands with |
| explicit integer values in specified ranges. For example, on the |
| 68000, @samp{I} is defined to stand for the range of values 1 to 8. |
| This is the range permitted as a shift count in the shift |
| instructions. |
| |
| @cindex @samp{E} in constraint |
| @item @samp{E} |
| An immediate floating operand (expression code @code{const_double}) is |
| allowed, but only if the target floating point format is the same as |
| that of the host machine (on which the compiler is running). |
| |
| @cindex @samp{F} in constraint |
| @item @samp{F} |
| An immediate floating operand (expression code @code{const_double}) is |
| allowed. |
| |
| @cindex @samp{G} in constraint |
| @cindex @samp{H} in constraint |
| @item @samp{G}, @samp{H} |
| @samp{G} and @samp{H} may be defined in a machine-dependent fashion to |
| permit immediate floating operands in particular ranges of values. |
| |
| @cindex @samp{s} in constraint |
| @item @samp{s} |
| An immediate integer operand whose value is not an explicit integer is |
| allowed. |
| |
| This might appear strange; if an insn allows a constant operand with a |
| value not known at compile time, it certainly must allow any known |
| value. So why use @samp{s} instead of @samp{i}? Sometimes it allows |
| better code to be generated. |
| |
| For example, on the 68000 in a fullword instruction it is possible to |
| use an immediate operand; but if the immediate value is between -128 |
| and 127, better code results from loading the value into a register and |
| using the register. This is because the load into the register can be |
| done with a @samp{moveq} instruction. We arrange for this to happen |
| by defining the letter @samp{K} to mean ``any integer outside the |
| range -128 to 127'', and then specifying @samp{Ks} in the operand |
| constraints. |
| |
| @cindex @samp{g} in constraint |
| @item @samp{g} |
| Any register, memory or immediate integer operand is allowed, except for |
| registers that are not general registers. |
| |
| @cindex @samp{X} in constraint |
| @item @samp{X} |
| @ifset INTERNALS |
| Any operand whatsoever is allowed, even if it does not satisfy |
| @code{general_operand}. This is normally used in the constraint of |
| a @code{match_scratch} when certain alternatives will not actually |
| require a scratch register. |
| @end ifset |
| @ifclear INTERNALS |
| Any operand whatsoever is allowed. |
| @end ifclear |
| |
| @cindex @samp{0} in constraint |
| @cindex digits in constraint |
| @item @samp{0}, @samp{1}, @samp{2}, @dots{} @samp{9} |
| An operand that matches the specified operand number is allowed. If a |
| digit is used together with letters within the same alternative, the |
| digit should come last. |
| |
| @cindex matching constraint |
| @cindex constraint, matching |
| This is called a @dfn{matching constraint} and what it really means is |
| that the assembler has only a single operand that fills two roles |
| @ifset INTERNALS |
| considered separate in the RTL insn. For example, an add insn has two |
| input operands and one output operand in the RTL, but on most CISC |
| @end ifset |
| @ifclear INTERNALS |
| which @code{asm} distinguishes. For example, an add instruction uses |
| two input operands and an output operand, but on most CISC |
| @end ifclear |
| machines an add instruction really has only two operands, one of them an |
| input-output operand: |
| |
| @smallexample |
| addl #35,r12 |
| @end smallexample |
| |
| Matching constraints are used in these circumstances. |
| More precisely, the two operands that match must include one input-only |
| operand and one output-only operand. Moreover, the digit must be a |
| smaller number than the number of the operand that uses it in the |
| constraint. |
| |
| @ifset INTERNALS |
| For operands to match in a particular case usually means that they |
| are identical-looking RTL expressions. But in a few special cases |
| specific kinds of dissimilarity are allowed. For example, @code{*x} |
| as an input operand will match @code{*x++} as an output operand. |
| For proper results in such cases, the output template should always |
| use the output-operand's number when printing the operand. |
| @end ifset |
| |
| @cindex load address instruction |
| @cindex push address instruction |
| @cindex address constraints |
| @cindex @samp{p} in constraint |
| @item @samp{p} |
| An operand that is a valid memory address is allowed. This is |
| for ``load address'' and ``push address'' instructions. |
| |
| @findex address_operand |
| @samp{p} in the constraint must be accompanied by @code{address_operand} |
| as the predicate in the @code{match_operand}. This predicate interprets |
| the mode specified in the @code{match_operand} as the mode of the memory |
| reference for which the address would be valid. |
| |
| @cindex extensible constraints |
| @cindex @samp{Q} in constraint |
| @item @samp{Q}, @samp{R}, @samp{S}, @dots{} @samp{U} |
| Letters in the range @samp{Q} through @samp{U} may be defined in a |
| machine-dependent fashion to stand for arbitrary operand types. |
| @ifset INTERNALS |
| The machine description macro @code{EXTRA_CONSTRAINT} is passed the |
| operand as its first argument and the constraint letter as its |
| second argument. |
| |
| A typical use for this would be to distinguish certain types of |
| memory references that affect other insn operands. |
| |
| Do not define these constraint letters to accept register references |
| (@code{reg}); the reload pass does not expect this and would not handle |
| it properly. |
| @end ifset |
| @end table |
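| |
| As a rough illustration of the machine-dependent integer letters in the |
| table above, here is a minimal C sketch that checks a constant against |
| the two 68000 letters used as examples: @samp{I} (1 to 8) and @samp{K} |
| (any integer outside the range -128 to 127). The function name |
| @code{const_ok_for_letter} is hypothetical; this is not the definition |
| used by the 68000 machine description. |
| |
```c
#include <stdbool.h>

/* Hypothetical helper: does VALUE satisfy the constraint LETTER?
   Only the two 68000 letters discussed in the text are modeled.  */
static bool
const_ok_for_letter (long value, char letter)
{
  switch (letter)
    {
    case 'I':                   /* shift counts: 1 to 8 */
      return value >= 1 && value <= 8;
    case 'K':                   /* outside the moveq range */
      return value < -128 || value > 127;
    default:                    /* letter not modeled in this sketch */
      return false;
    }
}
```
| |
| Under this model, a constraint such as @samp{Ks} rejects the explicit |
| integer 5 (it fails @samp{K} and is not symbolic) but accepts 1000. |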
| |
| @ifset INTERNALS |
| In order to have valid assembler code, each operand must satisfy |
| its constraint. But a failure to do so does not prevent the pattern |
| from applying to an insn. Instead, it directs the compiler to modify |
| the code so that the constraint will be satisfied. Usually this is |
| done by copying an operand into a register. |
| |
| Contrast, therefore, the two instruction patterns that follow: |
| |
| @smallexample |
| (define_insn "" |
| [(set (match_operand:SI 0 "general_operand" "=r") |
| (plus:SI (match_dup 0) |
| (match_operand:SI 1 "general_operand" "r")))] |
| "" |
| "@dots{}") |
| @end smallexample |
| |
| @noindent |
| which has two operands, one of which must appear in two places, and |
| |
| @smallexample |
| (define_insn "" |
| [(set (match_operand:SI 0 "general_operand" "=r") |
| (plus:SI (match_operand:SI 1 "general_operand" "0") |
| (match_operand:SI 2 "general_operand" "r")))] |
| "" |
| "@dots{}") |
| @end smallexample |
| |
| @noindent |
| which has three operands, two of which are required by a constraint to be |
| identical. If we are considering an insn of the form |
| |
| @smallexample |
| (insn @var{n} @var{prev} @var{next} |
| (set (reg:SI 3) |
| (plus:SI (reg:SI 6) (reg:SI 109))) |
| @dots{}) |
| @end smallexample |
| |
| @noindent |
| the first pattern would not apply at all, because this insn does not |
| contain two identical subexpressions in the right place. The pattern would |
| say, ``That does not look like an add instruction; try other patterns.'' |
| The second pattern would say, ``Yes, that's an add instruction, but there |
| is something wrong with it.'' It would direct the reload pass of the |
| compiler to generate additional insns to make the constraint true. The |
| results might look like this: |
| |
| @smallexample |
| (insn @var{n2} @var{prev} @var{n} |
| (set (reg:SI 3) (reg:SI 6)) |
| @dots{}) |
| |
| (insn @var{n} @var{n2} @var{next} |
| (set (reg:SI 3) |
| (plus:SI (reg:SI 3) (reg:SI 109))) |
| @dots{}) |
| @end smallexample |
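| |
| The transformation shown above can be sketched in C as a toy model. |
| The structures and the function @code{reload_matching_operand} are |
| hypothetical and operate on bare register numbers; the real reload |
| pass works on RTL and handles many more cases. |
| |
```c
#include <stdbool.h>

/* Toy add insn: dst = src1 + src2, operands are register numbers.  */
struct add_insn { int dst, src1, src2; };

/* Toy copy insn: dst = src.  */
struct copy_insn { int dst, src; };

/* Make operand 1 (src1) match operand 0 (dst), as a "0" constraint
   requires.  Returns true and fills *COPY when a copy insn must be
   emitted before *INSN; returns false when the operands already match.
   Assumption: dst is not also src2, since the copy would then
   overwrite an input.  This sketch does not handle that case.  */
static bool
reload_matching_operand (struct add_insn *insn, struct copy_insn *copy)
{
  if (insn->src1 == insn->dst)
    return false;               /* constraint already satisfied */

  copy->dst = insn->dst;        /* new insn: dst = old src1 */
  copy->src = insn->src1;
  insn->src1 = insn->dst;       /* add now reads its own output */
  return true;
}
```
| |
| Applied to the insn above (registers 3, 6 and 109), the sketch yields a |
| copy of register 6 into register 3 followed by an add whose first input |
| is register 3. |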
| |
| It is up to you to make sure that each operand, in each pattern, has |
| constraints that can handle any RTL expression that could be present for |
| that operand. (When multiple alternatives are in use, each pattern must, |
| for each possible combination of operand expressions, have at least one |
| alternative which can handle that combination of operands.) The |
| constraints don't need to @emph{allow} any possible operand---when this is |
| the case, they do not constrain---but they must at least point the way to |
| reloading any possible operand so that it will fit. |
| |
| @itemize @bullet |
| @item |
| If the constraint accepts whatever operands the predicate permits, |
| there is no problem: reloading is never necessary for this operand. |
| |
| For example, an operand whose constraints permit everything except |
| registers is safe provided its predicate rejects registers. |
| |
| An operand whose predicate accepts only constant values is safe |
| provided its constraints include the letter @samp{i}. If any possible |
| constant value is accepted, then nothing less than @samp{i} will do; |
| if the predicate is more selective, then the constraints may also be |
| more selective. |
| |
| @item |
| Any operand expression can be reloaded by copying it into a register. |
| So if an operand's constraints allow some kind of register, it is |
| certain to be safe. It need not permit all classes of registers; the |
| compiler knows how to copy a register into another register of the |
| proper class in order to make an instruction valid. |
| |
| @cindex nonoffsettable memory reference |
| @cindex memory reference, nonoffsettable |
| @item |
| A nonoffsettable memory reference can be reloaded by copying the |
| address into a register. So if the constraint uses the letter |
| @samp{o}, all memory references are taken care of. |
| |
| @item |
| A constant operand can be reloaded by allocating space in memory to |
| hold it as preinitialized data. Then the memory reference can be used |
| in place of the constant. So if the constraint uses the letters |
| @samp{o} or @samp{m}, constant operands are not a problem. |
| |
| @item |
| If the constraint permits a constant and a pseudo register used in an insn |
| was not allocated to a hard register and is equivalent to a constant, |
| the register will be replaced with the constant. If the predicate does |
| not permit a constant and the insn is re-recognized for some reason, the |
| compiler will crash. Thus the predicate must always recognize any |
| objects allowed by the constraint. |
| @end itemize |
| |
| If the operand's predicate can recognize registers, but the constraint does |
| not permit them, it can make the compiler crash. When this operand happens |
| to be a register, the reload pass will be stymied, because it does not know |
| how to copy a register temporarily into memory. |
| |
| If the predicate accepts a unary operator, the constraint applies to the |
| operand. For example, the MIPS processor at ISA level 3 supports an |
| instruction which adds two registers in @code{SImode} to produce a |
| @code{DImode} result, but only if the registers are correctly sign |
| extended. The predicate for the input operands accepts a |
| @code{sign_extend} of an @code{SImode} register. Write the constraint |
| to indicate the type of register that is required for the operand of the |
| @code{sign_extend}. |
| @end ifset |
| |
| @node Multi-Alternative |
| @subsection Multiple Alternative Constraints |
| @cindex multiple alternative constraints |
| |
| Sometimes a single instruction has multiple alternative sets of possible |
| operands. For example, on the 68000, a logical-or instruction can combine a |
| register or an immediate value into memory, or it can combine any kind of |
| operand into a register; but it cannot combine one memory location into |
| another. |
| |
| These constraints are represented as multiple alternatives. An alternative |
| can be described by a series of letters for each operand. The overall |
| constraint for an operand is made from the letters for this operand |
| from the first alternative, a comma, the letters for this operand from |
| the second alternative, a comma, and so on until the last alternative. |
| @ifset INTERNALS |
| Here is how it is done for fullword logical-or on the 68000: |
| |
| @smallexample |
| (define_insn "iorsi3" |
| [(set (match_operand:SI 0 "general_operand" "=m,d") |
| (ior:SI (match_operand:SI 1 "general_operand" "%0,0") |
| (match_operand:SI 2 "general_operand" "dKs,dmKs")))] |
| @dots{}) |
| @end smallexample |
| |
| The first alternative has @samp{m} (memory) for operand 0, @samp{0} for |
| operand 1 (meaning it must match operand 0), and @samp{dKs} for operand |
| 2. The second alternative has @samp{d} (data register) for operand 0, |
| @samp{0} for operand 1, and @samp{dmKs} for operand 2. The @samp{=} and |
| @samp{%} in the constraints apply to all the alternatives; their |
| meaning is explained in a later section (@pxref{Modifiers}). |
| @end ifset |
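| |
| To make the comma-separated layout concrete, here is a small C sketch |
| that extracts one alternative from an operand's constraint string. The |
| function @code{nth_alternative} is a hypothetical helper, not part of |
| the compiler. Modifier characters are left wherever they were written. |
| |
```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy alternative number N (counting from 0) of CONSTRAINT into OUT.
   Returns false if there is no such alternative or OUT is too small.  */
static bool
nth_alternative (const char *constraint, int n, char *out, size_t outsz)
{
  const char *start = constraint;

  if (n < 0)
    return false;

  /* Skip N commas to reach the start of the wanted alternative.  */
  for (; n > 0; n--)
    {
      const char *comma = strchr (start, ',');
      if (comma == NULL)
        return false;
      start = comma + 1;
    }

  /* The alternative runs up to the next comma or the end of string.  */
  size_t len = strcspn (start, ",");
  if (len >= outsz)
    return false;
  memcpy (out, start, len);
  out[len] = '\0';
  return true;
}
```
| |
| For the string @samp{dKs,dmKs}, alternative 0 is @samp{dKs} and |
| alternative 1 is @samp{dmKs}. |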
| |
| @c FIXME Is this ? and ! stuff of use in asm()? If not, hide unless INTERNAL |
| If all the operands fit any one alternative, the instruction is valid. |
| Otherwise, for each alternative, the compiler counts how many instructions |
| must be added to copy the operands so that that alternative applies. |
| The alternative requiring the least copying is chosen. If two alternatives |
| need the same amount of copying, the one that comes first is chosen. |
| These choices can be altered with the @samp{?} and @samp{!} characters: |
| |
| @table @code |
| @cindex @samp{?} in constraint |
| @cindex question mark |
| @item ? |
| Disparage slightly the alternative that the @samp{?} appears in, |
| as a choice when no alternative applies exactly. The compiler regards |
| this alternative as one unit more costly for each @samp{?} that appears |
| in it. |
| |
| @cindex @samp{!} in constraint |
| @cindex exclamation point |
| @item ! |
| Disparage severely the alternative that the @samp{!} appears in. |
| This alternative can still be used if it fits without reloading, |
| but if reloading is needed, some other alternative will be used. |
| @end table |
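| |
| The selection rule described above can be sketched in C as follows. |
| This is a simplified model, not the compiler's code: the structure and |
| the function @code{choose_alternative} are hypothetical, and the sketch |
| assumes that one @samp{?} costs the same as one added copy instruction. |
| |
```c
#include <stdbool.h>
#include <stddef.h>

struct alternative
{
  int copies_needed;    /* insns that must be added to make it fit */
  int question_marks;   /* number of '?' characters it contains */
  bool has_bang;        /* true if it contains '!' */
};

/* Return the index of the chosen alternative, or -1 if none is usable.  */
static int
choose_alternative (const struct alternative *alts, size_t n)
{
  int best = -1;
  int best_cost = 0;

  /* An alternative that fits exactly needs no reloading, so use it.  */
  for (size_t i = 0; i < n; i++)
    if (alts[i].copies_needed == 0)
      return (int) i;

  for (size_t i = 0; i < n; i++)
    {
      if (alts[i].has_bang)     /* '!' rules it out when reloading */
        continue;
      int cost = alts[i].copies_needed + alts[i].question_marks;
      if (best < 0 || cost < best_cost)   /* strict '<': first wins ties */
        {
          best = (int) i;
          best_cost = cost;
        }
    }
  return best;
}
```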
| |
| @ifset INTERNALS |
| When an insn pattern has multiple alternatives in its constraints, often |
| the appearance of the assembler code is determined mostly by which |
| alternative was matched. When this is so, the C code for writing the |
| assembler code can use the variable @code{which_alternative}, which is |
| the ordinal number of the alternative that was actually satisfied (0 for |
| the first, 1 for the second alternative, etc.). @xref{Output Statement}. |
| @end ifset |
| |
| @ifset INTERNALS |
| @node Class Preferences |
| @subsection Register Class Preferences |
| @cindex class preference constraints |
| @cindex register class preference constraints |
| |
| @cindex voting between constraint alternatives |
| The operand constraints have another function: they enable the compiler |
| to decide which kind of hardware register a pseudo register is best |
| allocated to. The compiler examines the constraints that apply to the |
| insns that use the pseudo register, looking for the machine-dependent |
| letters such as @samp{d} and @samp{a} that specify classes of registers. |
| The pseudo register is put in whichever class gets the most ``votes''. |
| The constraint letters @samp{g} and @samp{r} also vote: they vote in |
| favor of a general register. The machine description says which registers |
| are considered general. |
| |
| Of course, on some machines all registers are equivalent, and no register |
| classes are defined. Then none of this complexity is relevant. |
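| |
| The voting idea can be sketched in C as a toy tally. The function |
| @code{preferred_class} is hypothetical and models only the 68000 |
| letters @samp{d}, @samp{a} and @samp{f} plus the general letters |
| @samp{r} and @samp{g}. Breaking ties in favor of the class listed |
| first is an arbitrary assumption of this sketch. |
| |
```c
#include <stddef.h>
#include <string.h>

/* Tally the register-class letters in the N constraint strings that
   apply to one pseudo register.  Returns 'd', 'a', 'f', or 'r', where
   'r' stands for "general register".  */
static char
preferred_class (const char *const *constraints, size_t n)
{
  static const char classes[] = "rdaf";
  int votes[4] = { 0, 0, 0, 0 };

  for (size_t i = 0; i < n; i++)
    for (const char *p = constraints[i]; *p != '\0'; p++)
      {
        char c = (*p == 'g') ? 'r' : *p;   /* 'g' votes like 'r' */
        const char *hit = strchr (classes, c);
        if (hit != NULL)                   /* other letters do not vote */
          votes[hit - classes]++;
      }

  size_t best = 0;
  for (size_t k = 1; k < 4; k++)
    if (votes[k] > votes[best])
      best = k;
  return classes[best];
}
```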
| @end ifset |
| |
| @node Modifiers |
| @subsection Constraint Modifier Characters |
| @cindex modifiers in constraints |
| @cindex constraint modifier characters |
| |
| @c prevent bad page break with this line |
| Here are constraint modifier characters. |
| |
| @table @samp |
| @cindex @samp{=} in constraint |
| @item = |
| Means that this operand is write-only for this instruction: the previous |
| value is discarded and replaced by output data. |
| |
| @cindex @samp{+} in constraint |
| @item + |
| Means that this operand is both read and written by the instruction. |
| |
| When the compiler fixes up the operands to satisfy the constraints, |
| it needs to know which operands are inputs to the instruction and |
| which are outputs from it. @samp{=} identifies an output; @samp{+} |
| identifies an operand that is both input and output; all other operands |
| are assumed to be input only. |
| |
| @cindex @samp{&} in constraint |
| @cindex earlyclobber operand |
| @item & |
| Means (in a particular alternative) that this operand is an |
| @dfn{earlyclobber} operand, which is modified before the instruction is |
| finished using the input operands. Therefore, this operand may not lie |
| in a register that is used as an input operand or as part of any memory |
| address. |
| |
| @samp{&} applies only to the alternative in which it is written. In |
| constraints with multiple alternatives, sometimes one alternative |
| requires @samp{&} while others do not. See, for example, the |
| @samp{movdf} insn of the 68000. |
| |
| An input operand can be tied to an earlyclobber operand if its only |
| use as an input occurs before the early result is written. Adding |
| alternatives of this form often allows GCC to produce better code |
| when only some of the inputs can be affected by the earlyclobber. |
| See, for example, the @samp{mulsi3} insn of the ARM. |
| |
| @samp{&} does not obviate the need to write @samp{=}. |
| |
| @cindex @samp{%} in constraint |
| @item % |
| Declares the instruction to be commutative for this operand and the |
| following operand. This means that the compiler may interchange the |
| two operands if that is the cheapest way to make all operands fit the |
| constraints. |
| @ifset INTERNALS |
| This is often used in patterns for addition instructions |
| that really have only two operands: the result must go in one of the |
| arguments. Here, for example, is how the 68000 halfword-add |
| instruction is defined: |
| |
| @smallexample |
| (define_insn "addhi3" |
| [(set (match_operand:HI 0 "general_operand" "=m,r") |
| (plus:HI (match_operand:HI 1 "general_operand" "%0,0") |
| (match_operand:HI 2 "general_operand" "di,g")))] |
| @dots{}) |
| @end smallexample |
| @end ifset |
| |
| @cindex @samp{#} in constraint |
| @item # |
| Says that all following characters, up to the next comma, are to be |
| ignored as a constraint. They are significant only for choosing |
| register preferences. |
| |
| @ifset INTERNALS |
| @cindex @samp{*} in constraint |
| @item * |
| Says that the following character should be ignored when choosing |
| register preferences. @samp{*} has no effect on the meaning of the |
| constraint as a constraint, and no effect on reloading. |
| |
| Here is an example: the 68000 has an instruction to sign-extend a |
| halfword in a data register, and can also sign-extend a value by |
| copying it into an address register. While either kind of register is |
| acceptable, the constraints on an address-register destination are |
| less strict, so it is best if register allocation makes an address |
| register its goal. Therefore, @samp{*} is used so that the @samp{d} |
| constraint letter (for data register) is ignored when computing |
| register preferences. |
| |
| @smallexample |
| (define_insn "extendhisi2" |
| [(set (match_operand:SI 0 "general_operand" "=*d,a") |
| (sign_extend:SI |
| (match_operand:HI 1 "general_operand" "0,g")))] |
| @dots{}) |
| @end smallexample |
| @end ifset |
| @end table |
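| |
| As a small illustration of how @samp{=} and @samp{+} classify an |
| operand, here is a C sketch. The enumeration and the function |
| @code{classify_operand} are hypothetical names. For brevity the sketch |
| scans the whole string and does not skip text that follows @samp{#}. |
| |
```c
#include <string.h>

enum operand_direction
{
  OPERAND_INPUT,    /* neither '=' nor '+': input only */
  OPERAND_OUTPUT,   /* '=': write-only */
  OPERAND_INOUT     /* '+': both read and written */
};

/* Classify an operand from the modifier characters in CONSTRAINT.  */
static enum operand_direction
classify_operand (const char *constraint)
{
  if (strchr (constraint, '+') != NULL)
    return OPERAND_INOUT;
  if (strchr (constraint, '=') != NULL)
    return OPERAND_OUTPUT;
  return OPERAND_INPUT;
}
```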
| |
| @node Machine Constraints |
| @subsection Constraints for Particular Machines |
| @cindex machine specific constraints |
| @cindex constraints, machine specific |
| |
| Whenever possible, you should use the general-purpose constraint letters |
| in @code{asm} arguments, since they will convey meaning more readily to |
| people reading your code. Failing that, use the constraint letters |
| that usually have very similar meanings across architectures. The most |
| commonly used constraints are @samp{m} and @samp{r} (for memory and |
| general-purpose registers respectively; @pxref{Simple Constraints}), and |
| @samp{I}, usually the letter indicating the most common |
| immediate-constant format. |
| |
| For each machine architecture, the @file{config/@var{machine}.h} file |
| defines additional constraints. These constraints are used by the |
| compiler itself for instruction generation, as well as for @code{asm} |
| statements; therefore, some of the constraints are not particularly |
| interesting for @code{asm}. The constraints are defined through these |
| macros: |
| |
| @table @code |
| @item REG_CLASS_FROM_LETTER |
| Register class constraints (usually lower case). |
| |
| @item CONST_OK_FOR_LETTER_P |
| Immediate constant constraints, for non-floating point constants of |
| word size or smaller precision (usually upper case). |
| |
| @item CONST_DOUBLE_OK_FOR_LETTER_P |
| Immediate constant constraints, for all floating point constants and for |
| constants of greater than word size precision (usually upper case). |
| |
| @item EXTRA_CONSTRAINT |
| Special cases of registers or memory. This macro is not required, and |
| is only defined for some machines. |
| @end table |
| |
| Inspecting these macro definitions in the compiler source for your |
| machine is the best way to be certain you have the right constraints. |
| However, here is a summary of the machine-dependent constraints |
| available on some particular machines. |
| |
| @table @emph |
| @item ARM family---@file{arm.h} |
| @table @code |
| @item f |
| Floating-point register |
| |
| @item F |
| One of the floating-point constants 0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0 |
| or 10.0 |
| |
| @item G |
| Floating-point constant that would satisfy the constraint @samp{F} if it |
| were negated |
| |
| @item I |
| Integer that is valid as an immediate operand in a data processing |
| instruction. That is, an integer in the range 0 to 255 rotated by a |
| multiple of 2 |
| |
| @item J |
| Integer in the range -4095 to 4095 |
| |
| @item K |
| Integer that satisfies constraint @samp{I} when inverted (ones complement) |
| |
| @item L |
| Integer that satisfies constraint @samp{I} when negated (twos complement) |
| |
| @item M |
| Integer in the range 0 to 32 |
| |
| @item Q |
| A memory reference where the exact address is in a single register |
| (@samp{m} is preferable for @code{asm} statements) |
| |
| @item R |
| An item in the constant pool |
| |
| @item S |
| A symbol in the text segment of the current file |
| @end table |
| |
| @item AMD 29000 family---@file{a29k.h} |
| @table @code |
| @item l |
| Local register 0 |
| |
| @item b |
| Byte Pointer (@samp{BP}) register |
| |
| @item q |
| @samp{Q} register |
| |
| @item h |
| Special purpose register |
| |
| @item A |
| First accumulator register |
| |
| @item a |
| Other accumulator register |
| |
| @item f |
| Floating point register |
| |
| @item I |
| Constant greater than 0, less than 0x100 |
| |
| @item J |
| Constant greater than 0, less than 0x10000 |
| |
| @item K |
| Constant whose high 24 bits are on (1) |
| |
| @item L |
| 16 bit constant whose high 8 bits are on (1) |
| |
| @item M |
| 32 bit constant whose high 16 bits are on (1) |
| |
| @item N |
| 32 bit negative constant that fits in 8 bits |
| |
| @item O |
| The constant 0x80000000 or, on the 29050, any 32 bit constant |
| whose low 16 bits are 0. |
| |
| @item P |
| 16 bit negative constant that fits in 8 bits |
| |
| @item G |
| @itemx H |
| A floating point constant (in @code{asm} statements, use the machine |
| independent @samp{E} or @samp{F} instead) |
| @end table |
| |
| @item IBM RS6000---@file{rs6000.h} |
| @table @code |
| @item b |
| Address base register |
| |
| @item f |
| Floating point register |
| |
| @item h |
| @samp{MQ}, @samp{CTR}, or @samp{LINK} register |
| |
| @item q |
| @samp{MQ} register |
| |
| @item c |
| @samp{CTR} register |
| |
| @item l |
| @samp{LINK} register |
| |
| @item x |
| @samp{CR} register (condition register) number 0 |
| |
| @item y |
| @samp{CR} register (condition register) |
| |
| @item I |
| Signed 16 bit constant |
| |
| @item J |
| Constant whose low 16 bits are 0 |
| |
| @item K |
| Constant whose high 16 bits are 0 |
| |
| @item L |
| Constant suitable as a mask operand |
| |
| @item M |
| Constant larger than 31 |
| |
| @item N |
| Exact power of 2 |
| |
| @item O |
| Zero |
| |
| @item P |
| Constant whose negation is a signed 16 bit constant |
| |
| @item G |
| Floating point constant that can be loaded into a register with one |
| instruction per word |
| |
| @item Q |
| Memory operand that is an offset from a register (@samp{m} is preferable |
| for @code{asm} statements) |
| |
| @item R |
| AIX TOC entry |
| |
| @item S |
| Windows NT SYMBOL_REF |
| |
| @item T |
| Windows NT LABEL_REF |
| |
| @item U |
| System V Release 4 small data area reference |
| @end table |
| |
| @item Intel 386---@file{i386.h} |
| @table @code |
| @item q |
| @samp{a}, @samp{b}, @samp{c}, or @samp{d} register |
| |
| @item A |
| @samp{a} or @samp{d} register (for 64-bit ints) |
| |
| @item f |
| Floating point register |
| |
| @item t |
| First (top of stack) floating point register |
| |
| @item u |
| Second floating point register |
| |
| @item a |
| @samp{a} register |
| |
| @item b |
| @samp{b} register |
| |
| @item c |
| @samp{c} register |
| |
| @item d |
| @samp{d} register |
| |
| @item D |
| @samp{di} register |
| |
| @item S |
| @samp{si} register |
| |
| @item I |
| Constant in range 0 to 31 (for 32 bit shifts) |
| |
| @item J |
| Constant in range 0 to 63 (for 64 bit shifts) |
| |
| @item K |
| @samp{0xff} |
| |
| @item L |
| @samp{0xffff} |
| |
| @item M |
| 0, 1, 2, or 3 (shifts for @code{lea} instruction) |
| |
| @item N |
| Constant in range 0 to 255 (for @code{out} instruction) |
| |
| @item G |
| Standard 80387 floating point constant |
| @end table |
| |
| @item Intel 960---@file{i960.h} |
| @table @code |
| @item f |
| Floating point register (@code{fp0} to @code{fp3}) |
| |
| @item l |
| Local register (@code{r0} to @code{r15}) |
| |
| @item b |
| Global register (@code{g0} to @code{g15}) |
| |
| @item d |
| Any local or global register |
| |
| @item I |
| Integers from 0 to 31 |
| |
| @item J |
| 0 |
| |
| @item K |
| Integers from -31 to 0 |
| |
| @item G |
| Floating point 0 |
| |
| @item H |
| Floating point 1 |
| @end table |
| |
| @item MIPS---@file{mips.h} |
| @table @code |
| @item d |
| General-purpose integer register |
| |
| @item f |
| Floating-point register (if available) |
| |
| @item h |
| @samp{Hi} register |
| |
| @item l |
| @samp{Lo} register |
| |
| @item x |
| @samp{Hi} or @samp{Lo} register |
| |
| @item y |
| General-purpose integer register |
| |
| @item z |
| Floating-point status register |
| |
| @item I |
| Signed 16 bit constant (for arithmetic instructions) |
| |
| @item J |
| Zero |
| |
| @item K |
| Zero-extended 16-bit constant (for logic instructions) |
| |
| @item L |
| Constant with low 16 bits zero (can be loaded with @code{lui}) |
| |
| @item M |
| 32 bit constant which requires two instructions to load (a constant |
| which is not @samp{I}, @samp{K}, or @samp{L}) |
| |
| @item N |
| Negative 16 bit constant |
| |
| @item O |
| Exact power of two |
| |
| @item P |
| Positive 16 bit constant |
| |
| @item G |
| Floating point zero |
| |
| @item Q |
| Memory reference that can be loaded with more than one instruction |
| (@samp{m} is preferable for @code{asm} statements) |
| |
| @item R |
| Memory reference that can be loaded with one instruction |
| (@samp{m} is preferable for @code{asm} statements) |
| |
| @item S |
| Memory reference in external OSF/rose PIC format |
| (@samp{m} is preferable for @code{asm} statements) |
| @end table |
| |
| @item Motorola 680x0---@file{m68k.h} |
| @table @code |
| @item a |
| Address register |
| |
| @item d |
| Data register |
| |
| @item f |
| 68881 floating-point register, if available |
| |
| @item x |
| Sun FPA (floating-point) register, if available |
| |
| @item y |
| First 16 Sun FPA registers, if available |
| |
| @item I |
| Integer in the range 1 to 8 |
| |
| @item J |
| 16 bit signed number |
| |
| @item K |
| Signed number whose magnitude is greater than 0x80 |
| |
| @item L |
| Integer in the range -8 to -1 |
| |
| @item M |
| Signed number whose magnitude is greater than 0x100 |
| |
| @item G |
| Floating point constant that is not a 68881 constant |
| |
| @item H |
| Floating point constant that can be used by Sun FPA |
| @end table |
| |
| @need 1000 |
| @item SPARC---@file{sparc.h} |
| @table @code |
| @item f |
| Floating-point register that can hold 32 or 64 bit values. |
| |
| @item e |
| Floating-point register that can hold 64 or 128 bit values. |
| |
| @item I |
| Signed 13 bit constant |
| |
| @item J |
| Zero |
| |
| @item K |
| 32 bit constant with the low 12 bits clear (a constant that can be |
| loaded with the @code{sethi} instruction) |
| |
| @item G |
| Floating-point zero |
| |
| @item H |
| Signed 13 bit constant, sign-extended to 32 or 64 bits |
| |
| @item Q |
| Memory reference that can be loaded with one instruction (@samp{m} is |
| more appropriate for @code{asm} statements) |
| |
| @item S |
| Constant, or memory address |
| |
| @item T |
| Memory address aligned to an 8-byte boundary |
| |
| @item U |
| Even register |
| @end table |
| @end table |
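| |
| To show how one entry in the summary above translates into a test, |
| here is a C sketch of the ARM @samp{I} constraint as described there: |
| an integer in the range 0 to 255 rotated by a multiple of 2. The sketch |
| assumes a 32-bit word, and the function names are hypothetical; this is |
| not the definition found in @file{arm.h}. |
| |
```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate X left by N bits within a 32-bit word.  */
static uint32_t
rotate_left (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* True if VALUE is some 8-bit quantity rotated by an even amount.
   Undoing each even rotation in turn must leave a value of at most
   0xFF for one of them.  */
static bool
arm_const_ok_for_I (uint32_t value)
{
  for (unsigned rot = 0; rot < 32; rot += 2)
    if (rotate_left (value, rot) <= 0xFF)
      return true;
  return false;
}
```
| |
| Under this model 0xFF000000 is accepted, while 0x101 is rejected |
| because its set bits span more than eight positions. |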
| |
| @ifset INTERNALS |
| @node No Constraints |
| @subsection Not Using Constraints |
| @cindex no constraints |
| @cindex not using constraints |
| |
| Some machines are so clean that operand constraints are not required. For |
| example, on the Vax, an operand valid in one context is valid in any other |
| context. On such a machine, every operand constraint would be @samp{g}, |
| excepting only operands of ``load address'' instructions which are |
| written as if they referred to a memory location's contents but actually |
| refer to its address. They would have constraint @samp{p}. |
| |
| @cindex empty constraints |
| For such machines, instead of writing @samp{g} and @samp{p} for all |
| the constraints, you can choose to write a description with empty constraints. |
| Then you write @samp{""} for the constraint in every @code{match_operand}. |
| Address operands are identified by writing an @code{address} expression |
| around the @code{match_operand}, not by their constraints. |
| |
| When the machine description has just empty constraints, certain parts |
| of compilation are skipped, making the compiler faster. However, |
| few machines actually do not need constraints; all machine descriptions |
| now in existence use constraints. |
| @end ifset |
| |
| @ifset INTERNALS |
| @node Standard Names |
| @section Standard Pattern Names For Generation |
| @cindex standard pattern names |
| @cindex pattern names |
| @cindex names, pattern |
| |
| Here is a table of the instruction names that are meaningful in the RTL |
| generation pass of the compiler. Giving one of these names to an |
| instruction pattern tells the RTL generation pass that it can use the |
| pattern to accomplish a certain task. |
| |
| @table @asis |
| @cindex @code{mov@var{m}} instruction pattern |
| @item @samp{mov@var{m}} |
| Here @var{m} stands for a two-letter machine mode name, in lower case. |
| This instruction pattern moves data with that machine mode from operand |
| 1 to operand 0. For example, @samp{movsi} moves full-word data. |
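| |
| As a minimal sketch, a @samp{movsi} pattern for a hypothetical machine |
| might look like the following. The predicates, the constraints and the |
| @samp{move} mnemonic are assumptions made for illustration; they are not |
| taken from any existing port. |
| |
| @smallexample |
| ;; Hypothetical full-word move: operand 1 is copied to operand 0. |
| (define_insn "movsi" |
|   [(set (match_operand:SI 0 "general_operand" "=g") |
|         (match_operand:SI 1 "general_operand" "g"))] |
|   "" |
|   "move %1,%0") |
| @end smallexample |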
| |
| If operand 0 is a @code{subreg} with mode @var{m} of a register whose |
| own mode is wider than @var{m}, the effect of this instruction is |
| to store the specified value in the part of the register that corresponds |
| to mode @var{m}. The effect on the rest of the register is undefined. |
| |
| This class of patterns is special in several ways. First of all, each |
| of these names @emph{must} be defined, because there is no other way |
| to copy a datum from one place to another. |
| |
| Second, these patterns are not used solely in the RTL generation pass. |
| Even the reload pass can generate move insns to copy values from stack |
| slots into temporary registers. When it does so, one of the operands is |
| a hard register and the other is an operand that may need to be reloaded |
| into a register. |
| |
| @findex force_reg |
| Therefore, when given such a pair of operands, the pattern must generate |
| RTL which needs no reloading and needs no temporary registers---no |
| registers other than the operands. For example, if you support the |
| pattern with a @code{define_expand}, then in such a case the |
| @code{define_expand} mustn't call @code{force_reg} or any other such |
| function which might generate new pseudo registers. |
| |
| This requirement exists even for subword modes on a RISC machine where |
| fetching those modes from memory normally requires several insns and |
| some temporary registers. Look in @file{spur.md} to see how the |
| requirement can be satisfied. |
| |
| @findex change_address |
| During reload a memory reference with an invalid address may be passed |
| as an operand. Such an address will be replaced with a valid address |
| later in the reload pass. In this case, nothing may be done with the |
| address except to use it as it stands. If it is copied, it will not be |
| replaced with a valid address. No attempt should be made to make such |
| an address into a valid address and no routine (such as |
| @code{change_address}) that will do so may be called. Note that |
| @code{general_operand} will fail when applied to such an address. |
| |
| @findex reload_in_progress |
| The global variable @code{reload_in_progress} (which must be explicitly |
| declared if required) can be used to determine whether such special |
| handling is required. |
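| |
| The restrictions above can be sketched as a @code{define_expand} that |
| creates a new pseudo register only when reload is neither in progress |
| nor finished. The memory-to-memory limitation handled here is an |
| assumption about a hypothetical machine, not a description of any |
| existing port. |
| |
| @smallexample |
| ;; Hypothetical expander: the machine is assumed to lack |
| ;; memory-to-memory moves. |
| (define_expand "movsi" |
|   [(set (match_operand:SI 0 "general_operand" "") |
|         (match_operand:SI 1 "general_operand" ""))] |
|   "" |
|   " |
| @{ |
|   /* New pseudo registers may be created only before reload.  During |
|      and after reload the operands are used exactly as they stand.  */ |
|   if (! reload_in_progress && ! reload_completed |
|       && GET_CODE (operands[0]) == MEM |
|       && GET_CODE (operands[1]) == MEM) |
|     operands[1] = force_reg (SImode, operands[1]); |
| @}") |
| @end smallexample |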
| |
| The variety of operands that have reloads depends on the rest of the |
| machine description, but typically on a RISC machine these can only be |
| pseudo registers that did not get hard registers, while on other |
| machines explicit memory references will get optional reloads. |
| |
| If a scratch register is required to move an object to or from memory, |
| it can be allocated using @code{gen_reg_rtx} prior to reload. But this |
| is impossible during and after reload. If there are cases needing |
| scratch registers after reload, you must define |
| @code{SECONDARY_INPUT_RELOAD_CLASS} and perhaps also |
| @code{SECONDARY_OUTPUT_RELOAD_CLASS} to detect them, and provide |
| patterns @samp{reload_in@var{m}} or @samp{reload_out@var{m}} to handle |
| them. @xref{Register Classes}. |
| |
| The constraints on a @samp{mov@var{m}} must permit moving any hard |
| register to any other hard register provided that |
| @code{HARD_REGNO_MODE_OK} permits mode @var{m} in both registers and |
| @code{REGISTER_MOVE_COST} applied to their classes returns a value of 2. |
| |
| It is obligatory to support floating point @samp{mov@var{m}} |
| instructions into and out of any registers that can hold fixed point |
| values, because unions and structures (which have modes @code{SImode} or |
| @code{DImode}) can be in those registers and they may have floating |
| point members. |
| |
| There may also be a need to support fixed point @samp{mov@var{m}} |
| instructions in and out of floating point registers. Unfortunately, I |
| have forgotten why this was so, and I don't know whether it is still |
| true. If @code{HARD_REGNO_MODE_OK} rejects fixed point values in |
| floating point registers, then the constraints of the fixed point |
| @samp{mov@var{m}} instructions must be designed to avoid ever trying to |
| reload into a floating point register. |
| |
| @cindex @code{reload_in} instruction pattern |
| @cindex @code{reload_out} instruction pattern |
| @item @samp{reload_in@var{m}} |
| @itemx @samp{reload_out@var{m}} |
| Like @samp{mov@var{m}}, but used when a scratch register is required to |
| move between operand 0 and operand 1. Operand 2 describes the scratch |
| register. See the discussion of the @code{SECONDARY_RELOAD_CLASS} |
| macro in @ref{Register Classes}. |
| |
| @cindex @code{movstrict@var{m}} instruction pattern |
| @item @samp{movstrict@var{m}} |
| Like @samp{mov@var{m}} except that if operand 0 is a @code{subreg} |
| with mode @var{m} of a register whose natural mode is wider, |
| the @samp{movstrict@var{m}} instruction is guaranteed not to alter |
| any of the register except the part which belongs to mode @var{m}. |
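| |
| A minimal sketch of such a pattern follows. It assumes the usual RTL |
| idiom of wrapping the destination in @code{strict_low_part}; the |
| @samp{moveb} mnemonic and the constraints are hypothetical. |
| |
| @smallexample |
| ;; Hypothetical byte move into the low part of a wider register. |
| ;; The "+" modifier marks operand 0 as both read and written, since |
| ;; the untouched part of the register remains live. |
| (define_insn "movstrictqi" |
|   [(set (strict_low_part (match_operand:QI 0 "register_operand" "+r")) |
|         (match_operand:QI 1 "general_operand" "g"))] |
|   "" |
|   "moveb %1,%0") |
| @end smallexample |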
| |
| @cindex @code{load_multiple} instruction pattern |
| @item @samp{load_multiple} |
| Load several consecutive memory locations into consecutive registers. |
| Operand 0 is the first of the consecutive registers, operand 1 |
| is the first memory location, and operand 2 is a constant: the |
| number of consecutive registers. |
| |
| Define this only if the target machine really has such an instruction; |
| do not define this if the most efficient way of loading consecutive |
| registers from memory is to do them one at a time. |
| |
| On some machines, there are restrictions as to which consecutive |
| registers can be loaded from memory, such as particular starting or |
| ending register numbers or only a range of valid counts. For those |
| machines, use a @code{define_expand} (@pxref{Expander Definitions}) |
| and make the pattern fail if the restrictions are not met. |
| |
| Write the generated insn as a @code{parallel} with elements being a |
| @code{set} of one register from the appropriate memory location (you may |
| also need @code{use} or @code{clobber} elements). Use a |
| @code{match_parallel} (@pxref{RTL Template}) to recognize the insn. See |
| @file{a29k.md} and @file{rs6000.md} for examples of the use of this insn |
| pattern. |
| |
| @cindex @code{store_multiple} instruction pattern |
| @item @samp{store_multiple} |
| Similar to @samp{load_multiple}, but store several consecutive registers |
| into consecutive memory locations. Operand 0 is the first of the |
| consecutive memory locations, operand 1 is the first register, and |
| operand 2 is a constant: the number of consecutive registers. |
| |
| @cindex @code{add@var{m}3} instruction pattern |
| @item @samp{add@var{m}3} |
| Add operand 2 and operand 1, storing the result in operand 0. All operands |
| must have mode @var{m}. This can be used even on two-address machines, by |
| means of constraints requiring operands 1 and 0 to be the same location. |
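| |
| For instance, a two-address add on a hypothetical machine could be |
| sketched as follows. The matching constraint @samp{0} on operand 1 |
| requires it to occupy the same location as operand 0; the @samp{add} |
| mnemonic and the register class are assumptions. |
| |
| @smallexample |
| ;; Hypothetical two-address add: operand 0 = operand 1 + operand 2, |
| ;; with operand 1 forced into the same location as operand 0. |
| (define_insn "addsi3" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (plus:SI (match_operand:SI 1 "register_operand" "0") |
|                  (match_operand:SI 2 "general_operand" "g")))] |
|   "" |
|   "add %2,%0") |
| @end smallexample |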
| |
| @cindex @code{sub@var{m}3} instruction pattern |
| @cindex @code{mul@var{m}3} instruction pattern |
| @cindex @code{div@var{m}3} instruction pattern |
| @cindex @code{udiv@var{m}3} instruction pattern |
| @cindex @code{mod@var{m}3} instruction pattern |
| @cindex @code{umod@var{m}3} instruction pattern |
| @cindex @code{smin@var{m}3} instruction pattern |
| @cindex @code{smax@var{m}3} instruction pattern |
| @cindex @code{umin@var{m}3} instruction pattern |
| @cindex @code{umax@var{m}3} instruction pattern |
| @cindex @code{and@var{m}3} instruction pattern |
| @cindex @code{ior@var{m}3} instruction pattern |
| @cindex @code{xor@var{m}3} instruction pattern |
| @item @samp{sub@var{m}3}, @samp{mul@var{m}3} |
| @itemx @samp{div@var{m}3}, @samp{udiv@var{m}3}, @samp{mod@var{m}3}, @samp{umod@var{m}3} |
| @itemx @samp{smin@var{m}3}, @samp{smax@var{m}3}, @samp{umin@var{m}3}, @samp{umax@var{m}3} |
| @itemx @samp{and@var{m}3}, @samp{ior@var{m}3}, @samp{xor@var{m}3} |
| Similar, for other arithmetic operations. |
| |
| @cindex @code{mulhisi3} instruction pattern |
| @item @samp{mulhisi3} |
| Multiply operands 1 and 2, which have mode @code{HImode}, and store |
| a @code{SImode} product in operand 0. |
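| |
| A sketch of the RTL for this pattern, assuming a hypothetical |
| @samp{muls} instruction that takes two registers, is shown here. Both |
| @code{HImode} inputs are sign-extended inside the template so that the |
| multiplication itself is carried out in @code{SImode}. |
| |
| @smallexample |
| ;; Hypothetical signed widening multiply, HImode x HImode -> SImode. |
| (define_insn "mulhisi3" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (mult:SI (sign_extend:SI |
|                    (match_operand:HI 1 "register_operand" "r")) |
|                  (sign_extend:SI |
|                    (match_operand:HI 2 "register_operand" "r"))))] |
|   "" |
|   "muls %1,%2,%0") |
| @end smallexample |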
| |
| @cindex @code{mulqihi3} instruction pattern |
| @cindex @code{mulsidi3} instruction pattern |
| @item @samp{mulqihi3}, @samp{mulsidi3} |
| Similar widening-multiplication instructions of other widths. |
| |
| @cindex @code{umulqihi3} instruction pattern |
| @cindex @code{umulhisi3} instruction pattern |
| @cindex @code{umulsidi3} instruction pattern |
| @item @samp{umulqihi3}, @samp{umulhisi3}, @samp{umulsidi3} |
| Similar widening-multiplication instructions that do unsigned |
| multiplication. |
| |
| @cindex @code{smul@var{m}3_highpart} instruction pattern |
| @item @samp{smul@var{m}3_highpart} |
| Perform a signed multiplication of operands 1 and 2, which have mode |
| @var{m}, and store the most significant half of the product in operand 0. |
| The least significant half of the product is discarded. |
| |
| @cindex @code{umul@var{m}3_highpart} instruction pattern |
| @item @samp{umul@var{m}3_highpart} |
| Similar, but the multiplication is unsigned. |
| |
| @cindex @code{divmod@var{m}4} instruction pattern |
| @item @samp{divmod@var{m}4} |
| Signed division that produces both a quotient and a remainder. |
| Operand 1 is divided by operand 2 to produce a quotient stored |
| in operand 0 and a remainder stored in operand 3. |
| |
| For machines with an instruction that produces both a quotient and a |
| remainder, provide a pattern for @samp{divmod@var{m}4} but do not |
| provide patterns for @samp{div@var{m}3} and @samp{mod@var{m}3}. This |
| allows optimization in the relatively common case when both the quotient |
| and remainder are computed. |
| |
| If an instruction that just produces a quotient or just a remainder |
| exists and is more efficient than the instruction that produces both, |
| write the output routine of @samp{divmod@var{m}4} to call |
| @code{find_reg_note} and look for a @code{REG_UNUSED} note on the |
| quotient or remainder and generate the appropriate instruction. |
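| |
| The following sketch combines both points for a hypothetical machine. |
| The mnemonics @samp{div} and @samp{divrem}, the register classes and the |
| two-address form are assumptions; only the use of @code{find_reg_note} |
| with @code{REG_UNUSED} comes from the description above. |
| |
| @smallexample |
| ;; Hypothetical signed divide producing quotient (operand 0) |
| ;; and remainder (operand 3) at once. |
| (define_insn "divmodsi4" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (div:SI (match_operand:SI 1 "register_operand" "0") |
|                 (match_operand:SI 2 "register_operand" "r"))) |
|    (set (match_operand:SI 3 "register_operand" "=r") |
|         (mod:SI (match_dup 1) (match_dup 2)))] |
|   "" |
|   "* |
| @{ |
|   /* If the remainder is never used, the cheaper divide suffices.  */ |
|   if (find_reg_note (insn, REG_UNUSED, operands[3])) |
|     return \"div %2,%0\"; |
|   return \"divrem %2,%0,%3\"; |
| @}") |
| @end smallexample |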
| |
| @cindex @code{udivmod@var{m}4} instruction pattern |
| @item @samp{udivmod@var{m}4} |
| Similar, but does unsigned division. |
| |
| @cindex @code{ashl@var{m}3} instruction pattern |
| @item @samp{ashl@var{m}3} |
| Arithmetic-shift operand 1 left by a number of bits specified by operand |
| 2, and store the result in operand 0. Here @var{m} is the mode of |
| operand 0 and operand 1; operand 2's mode is specified by the |
| instruction pattern, and the compiler will convert the operand to that |
| mode before generating the instruction. |
| |
| @cindex @code{ashr@var{m}3} instruction pattern |
| @cindex @code{lshr@var{m}3} instruction pattern |
| @cindex @code{rotl@var{m}3} instruction pattern |
| @cindex @code{rotr@var{m}3} instruction pattern |
| @item @samp{ashr@var{m}3}, @samp{lshr@var{m}3}, @samp{rotl@var{m}3}, @samp{rotr@var{m}3} |
| Other shift and rotate instructions, analogous to the |
| @code{ashl@var{m}3} instructions. |
| |
| @cindex @code{neg@var{m}2} instruction pattern |
| @item @samp{neg@var{m}2} |
| Negate operand 1 and store the result in operand 0. |
| |
| @cindex @code{abs@var{m}2} instruction pattern |
| @item @samp{abs@var{m}2} |
| Store the absolute value of operand 1 into operand 0. |
| |
| @cindex @code{sqrt@var{m}2} instruction pattern |
| @item @samp{sqrt@var{m}2} |
| Store the square root of operand 1 into operand 0. |
| |
| The @code{sqrt} built-in function of C always uses the mode which |
| corresponds to the C data type @code{double}. |
| |
| @cindex @code{ffs@var{m}2} instruction pattern |
| @item @samp{ffs@var{m}2} |
| Store into operand 0 one plus the index of the least significant 1-bit |
| of operand 1. If operand 1 is zero, store zero. @var{m} is the mode |
| of operand 0; operand 1's mode is specified by the instruction |
| pattern, and the compiler will convert the operand to that mode before |
| generating the instruction. |
| |
| The @code{ffs} built-in function of C always uses the mode which |
| corresponds to the C data type @code{int}. |
| |
| @cindex @code{one_cmpl@var{m}2} instruction pattern |
| @item @samp{one_cmpl@var{m}2} |
| Store the bitwise-complement of operand 1 into operand 0. |
| |
| @cindex @code{cmp@var{m}} instruction pattern |
| @item @samp{cmp@var{m}} |
| Compare operand 0 and operand 1, and set the condition codes. |
| The RTL pattern should look like this: |
| |
| @smallexample |
| (set (cc0) (compare (match_operand:@var{m} 0 @dots{}) |
|                     (match_operand:@var{m} 1 @dots{}))) |
| @end smallexample |
| |
| @cindex @code{tst@var{m}} instruction pattern |
| @item @samp{tst@var{m}} |
| Compare operand 0 against zero, and set the condition codes. |
| The RTL pattern should look like this: |
| |
| @smallexample |
| (set (cc0) (match_operand:@var{m} 0 @dots{})) |
| @end smallexample |
| |
| @samp{tst@var{m}} patterns should not be defined for machines that do |
| not use @code{(cc0)}. Doing so would confuse the optimizer since it |
| would no longer be clear which @code{set} operations were comparisons. |
| The @samp{cmp@var{m}} patterns should be used instead. |
| |
| @cindex @code{movstr@var{m}} instruction pattern |
| @item @samp{movstr@var{m}} |
| Block move instruction. The addresses of the destination and source |
| strings are the first two operands, and both are in mode @code{Pmode}. |
| The number of bytes to move is the third operand, in mode @var{m}. |
| |
| The fourth operand is the known shared alignment of the source and |
| destination, in the form of a @code{const_int} rtx. Thus, if the |
| compiler knows that both source and destination are word-aligned, |
| it may provide the value 4 for this operand. |
| |
| These patterns need not give special consideration to the possibility |
| that the source and destination strings might overlap. |
| |
| @cindex @code{clrstr@var{m}} instruction pattern |
| @item @samp{clrstr@var{m}} |
| Block clear instruction. The address of the destination string is the |
| first operand, in mode @code{Pmode}. The number of bytes to clear is |
| the second operand, in mode @var{m}. |
| |
| The third operand is the known alignment of the destination, in the form |
| of a @code{const_int} rtx. Thus, if the compiler knows that the |
| destination is word-aligned, it may provide the value 4 for this |
| operand. |
| |
| @cindex @code{cmpstr@var{m}} instruction pattern |
| @item @samp{cmpstr@var{m}} |
| Block compare instruction, with five operands. Operand 0 is the output; |
| it has mode @var{m}. The remaining four operands are like the operands |
| of @samp{movstr@var{m}}. The two memory blocks specified are compared |
| byte by byte in lexicographic order. The effect of the instruction is |
| to store a value in operand 0 whose sign indicates the result of the |
| comparison. |
| |
| @cindex @code{strlen@var{m}} instruction pattern |
| @item @samp{strlen@var{m}} |
| Compute the length of a string, with four operands. |
| Operand 0 is the result (of mode @var{m}), operand 1 is |
| a @code{mem} referring to the first character of the string, |
| operand 2 is the character to search for (normally zero), |
| and operand 3 is a constant describing the known alignment |
| of the beginning of the string. |
| |
| @cindex @code{float@var{mn}2} instruction pattern |
| @item @samp{float@var{m}@var{n}2} |
| Convert signed integer operand 1 (valid for fixed point mode @var{m}) to |
| floating point mode @var{n} and store in operand 0 (which has mode |
| @var{n}). |
| |
| @cindex @code{floatuns@var{mn}2} instruction pattern |
| @item @samp{floatuns@var{m}@var{n}2} |
| Convert unsigned integer operand 1 (valid for fixed point mode @var{m}) |
| to floating point mode @var{n} and store in operand 0 (which has mode |
| @var{n}). |
| |
| @cindex @code{fix@var{mn}2} instruction pattern |
| @item @samp{fix@var{m}@var{n}2} |
| Convert operand 1 (valid for floating point mode @var{m}) to fixed |
| point mode @var{n} as a signed number and store in operand 0 (which |
| has mode @var{n}). This instruction's result is defined only when |
| the value of operand 1 is an integer. |
| |
| @cindex @code{fixuns@var{mn}2} instruction pattern |
| @item @samp{fixuns@var{m}@var{n}2} |
| Convert operand 1 (valid for floating point mode @var{m}) to fixed |
| point mode @var{n} as an unsigned number and store in operand 0 (which |
| has mode @var{n}). This instruction's result is defined only when the |
| value of operand 1 is an integer. |
| |
| @cindex @code{ftrunc@var{m}2} instruction pattern |
| @item @samp{ftrunc@var{m}2} |
| Convert operand 1 (valid for floating point mode @var{m}) to an |
| integer value, still represented in floating point mode @var{m}, and |
| store it in operand 0 (valid for floating point mode @var{m}). |
| |
| @cindex @code{fix_trunc@var{mn}2} instruction pattern |
| @item @samp{fix_trunc@var{m}@var{n}2} |
| Like @samp{fix@var{m}@var{n}2} but works for any floating point value |
| of mode @var{m} by converting the value to an integer. |
| |
| @cindex @code{fixuns_trunc@var{mn}2} instruction pattern |
| @item @samp{fixuns_trunc@var{m}@var{n}2} |
| Like @samp{fixuns@var{m}@var{n}2} but works for any floating point |
| value of mode @var{m} by converting the value to an integer. |
| |
| @cindex @code{trunc@var{mn}2} instruction pattern |
| @item @samp{trunc@var{m}@var{n}2} |
| Truncate operand 1 (valid for mode @var{m}) to mode @var{n} and |
| store in operand 0 (which has mode @var{n}). Both modes must be fixed |
| point or both floating point. |
| |
| @cindex @code{extend@var{mn}2} instruction pattern |
| @item @samp{extend@var{m}@var{n}2} |
| Sign-extend operand 1 (valid for mode @var{m}) to mode @var{n} and |
| store in operand 0 (which has mode @var{n}). Both modes must be fixed |
| point or both floating point. |
| |
| @cindex @code{zero_extend@var{mn}2} instruction pattern |
| @item @samp{zero_extend@var{m}@var{n}2} |
| Zero-extend operand 1 (valid for mode @var{m}) to mode @var{n} and |
| store in operand 0 (which has mode @var{n}). Both modes must be fixed |
| point. |
| |
| @cindex @code{extv} instruction pattern |
| @item @samp{extv} |
| Extract a bit field from operand 1 (a register or memory operand), where |
| operand 2 specifies the width in bits and operand 3 the starting bit, |
| and store it in operand 0. Operand 0 must have mode @code{word_mode}. |
| Operand 1 may have mode @code{byte_mode} or @code{word_mode}; often |
| @code{word_mode} is allowed only for registers. Operands 2 and 3 must |
| be valid for @code{word_mode}. |
| |
| The RTL generation pass generates this instruction only with constants |
| for operands 2 and 3. |
| |
| The bit-field value is sign-extended to a full word integer |
| before it is stored in operand 0. |
| |
| @cindex @code{extzv} instruction pattern |
| @item @samp{extzv} |
| Like @samp{extv} except that the bit-field value is zero-extended. |
| |
| @cindex @code{insv} instruction pattern |
| @item @samp{insv} |
| Store operand 3 (which must be valid for @code{word_mode}) into a bit |
| field in operand 0, where operand 1 specifies the width in bits and |
| operand 2 the starting bit. Operand 0 may have mode @code{byte_mode} or |
| @code{word_mode}; often @code{word_mode} is allowed only for registers. |
| Operands 1 and 2 must be valid for @code{word_mode}. |
| |
| The RTL generation pass generates this instruction only with constants |
| for operands 1 and 2. |
| |
| @cindex @code{mov@var{mode}cc} instruction pattern |
| @item @samp{mov@var{mode}cc} |
| Conditionally move operand 2 or operand 3 into operand 0 according to the |
| comparison in operand 1. If the comparison is true, operand 2 is moved |
| into operand 0, otherwise operand 3 is moved. |
| |
| The mode of the operands being compared need not be the same as the operands |
| being moved. Some machines, sparc64 for example, have instructions that |
| conditionally move an integer value based on the floating point condition |
| codes and vice versa. |
| |
| If the machine does not have conditional move instructions, do not |
| define these patterns. |
| |
| @cindex @code{s@var{cond}} instruction pattern |
| @item @samp{s@var{cond}} |
| Store zero or nonzero in the operand according to the condition codes. |
| Value stored is nonzero iff the condition @var{cond} is true. |
| @var{cond} is the name of a comparison operation expression code, such |
| as @code{eq}, @code{lt} or @code{leu}. |
| |
| You specify the mode that the operand must have when you write the |
| @code{match_operand} expression. The compiler automatically sees |
| which mode you have used and supplies an operand of that mode. |
| |
| The value stored for a true condition must have 1 as its low bit, or |
| else must be negative. Otherwise the instruction is not suitable and |
| you should omit it from the machine description. You describe to the |
| compiler exactly which value is stored by defining the macro |
| @code{STORE_FLAG_VALUE} (@pxref{Misc}). If a description cannot be |
| found that can be used for all the @samp{s@var{cond}} patterns, you |
| should omit those operations from the machine description. |
| |
| These operations may fail, but should do so only in relatively |
| uncommon cases; if they would fail for common cases involving |
| integer comparisons, it is best to omit these patterns. |
| |
| If these operations are omitted, the compiler will usually generate code |
| that copies the constant one to the target and branches around an |
| assignment of zero to the target. If this code is more efficient than |
| the potential instructions used for the @samp{s@var{cond}} pattern |
| followed by those required to convert the result into a 1 or a zero in |
| @code{SImode}, you should omit the @samp{s@var{cond}} operations from |
| the machine description. |
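| |
| As an illustration, a @samp{seq} pattern for a hypothetical machine that |
| uses @code{(cc0)} might be sketched as follows. The mode, the register |
| class and the @samp{seq} mnemonic are assumptions, and the value the |
| instruction stores would still have to agree with |
| @code{STORE_FLAG_VALUE}. |
| |
| @smallexample |
| ;; Hypothetical store-flag insn: operand 0 becomes nonzero |
| ;; if the condition codes indicate "equal". |
| (define_insn "seq" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (eq:SI (cc0) (const_int 0)))] |
|   "" |
|   "seq %0") |
| @end smallexample |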
| |
| @cindex @code{b@var{cond}} instruction pattern |
| @item @samp{b@var{cond}} |
| Conditional branch instruction. Operand 0 is a @code{label_ref} that |
| refers to the label to jump to. Jump if the condition codes meet |
| condition @var{cond}. |
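| |
| On a machine that uses @code{(cc0)}, such a pattern could be sketched as |
| shown here. The @samp{beq} mnemonic is hypothetical; @samp{%l0} prints |
| operand 0 as a label. |
| |
| @smallexample |
| ;; Hypothetical branch taken when the condition codes say "equal". |
| (define_insn "beq" |
|   [(set (pc) |
|         (if_then_else (eq (cc0) (const_int 0)) |
|                       (label_ref (match_operand 0 "" "")) |
|                       (pc)))] |
|   "" |
|   "beq %l0") |
| @end smallexample |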
| |
| Some machines do not follow the model assumed here where a comparison |
| instruction is followed by a conditional branch instruction. In that |
| case, the @samp{cmp@var{m}} (and @samp{tst@var{m}}) patterns should |
| simply store the operands away and generate all the required insns in a |
| @code{define_expand} (@pxref{Expander Definitions}) for the conditional |
| branch operations. All calls to expand @samp{b@var{cond}} patterns are |
| immediately preceded by calls to expand either a @samp{cmp@var{m}} |
| pattern or a @samp{tst@var{m}} pattern. |
| |
| Machines that use a pseudo register for the condition code value, or |
| where the mode used for the comparison depends on the condition being |
| tested, should also use the above mechanism. @xref{Jump Patterns}. |
| |
| The above discussion also applies to the @samp{mov@var{mode}cc} and |
| @samp{s@var{cond}} patterns. |
| |
| @cindex @code{call} instruction pattern |
| @item @samp{call} |
| Subroutine call instruction returning no value. Operand 0 is the |
| function to call; operand 1 is the number of bytes of arguments pushed |
| (in mode @code{SImode}, except it is normally a @code{const_int}); |
| operand 2 is the number of registers used as operands. |
| |
| On most machines, operand 2 is not actually stored into the RTL |
| pattern. It is supplied for the sake of some RISC machines which need |
| to put this information into the assembler code; they can put it in |
| the RTL instead of operand 1. |
| |
| Operand 0 should be a @code{mem} RTX whose address is the address of the |
| function. Note, however, that this address can be a @code{symbol_ref} |
| expression even if it would not be a legitimate memory address on the |
| target machine. If it is also not a valid argument for a call |
| instruction, the pattern for this operation should be a |
| @code{define_expand} (@pxref{Expander Definitions}) that places the |
| address into a register and uses that register in the call instruction. |
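| |
| In the simple case, where the function address is directly usable by |
| the call instruction, the pattern might be sketched as follows. The |
| @code{QImode} memory reference, the constraints and the @samp{jsr} |
| mnemonic are assumptions for a hypothetical machine. Operand 2 is |
| omitted from the RTL, as it is on most machines. |
| |
| @smallexample |
| ;; Hypothetical call: operand 0 is a mem whose address is the |
| ;; function, operand 1 is the number of argument bytes pushed. |
| (define_insn "call" |
|   [(call (match_operand:QI 0 "memory_operand" "m") |
|          (match_operand:SI 1 "general_operand" "g"))] |
|   "" |
|   "jsr %0") |
| @end smallexample |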
| |
| @cindex @code{call_value} instruction pattern |
| @item @samp{call_value} |
| Subroutine call instruction returning a value. Operand 0 is the hard |
| register in which the value is returned. There are three more |
| operands, the same as the three operands of the @samp{call} |
| instruction (but with numbers increased by one). |
| |
| Subroutines that return @code{BLKmode} objects use the @samp{call} |
| insn. |
| |
| @cindex @code{call_pop} instruction pattern |
| @cindex @code{call_value_pop} instruction pattern |
| @item @samp{call_pop}, @samp{call_value_pop} |
| Similar to @samp{call} and @samp{call_value}, except that they are used (when |
| defined) if @code{RETURN_POPS_ARGS} is non-zero. They should emit a @code{parallel} |
| that contains both the function call and a @code{set} to indicate the |
| adjustment made to the frame pointer. |
| |
| For machines where @code{RETURN_POPS_ARGS} can be non-zero, the use of these |
| patterns increases the number of functions for which the frame pointer |
| can be eliminated, if desired. |
| |
| @cindex @code{untyped_call} instruction pattern |
| @item @samp{untyped_call} |
| Subroutine call instruction returning a value of any type. Operand 0 is |
| the function to call; operand 1 is a memory location where the result of |
| calling the function is to be stored; operand 2 is a @code{parallel} |
| expression where each element is a @code{set} expression that indicates |
| the saving of a function return value into the result block. |
| |
| This instruction pattern should be defined to support |
| @code{__builtin_apply} on machines where special instructions are needed |
| to call a subroutine with arbitrary arguments or to save the value |
| returned. This instruction pattern is required on machines that have |
| multiple registers that can hold a return value (i.e. |
| @code{FUNCTION_VALUE_REGNO_P} is true for more than one register). |
| |
| @cindex @code{return} instruction pattern |
| @item @samp{return} |
| Subroutine return instruction. This instruction pattern name should be |
| defined only if a single instruction can do all the work of returning |
| from a function. |
| |
| Like the @samp{mov@var{m}} patterns, this pattern is also used after the |
| RTL generation phase. In this case it is to support machines where |
| multiple instructions are usually needed to return from a function, but |
| some class of functions only requires one instruction to implement a |
| return. Normally, the applicable functions are those which do not need |
| to save any registers or allocate stack space. |
| |
| @findex reload_completed |
| @findex leaf_function_p |
| For such machines, the condition specified in this pattern should only |
| be true when @code{reload_completed} is non-zero and the function's |
| epilogue would only be a single instruction. For machines with register |
| windows, the routine @code{leaf_function_p} may be used to determine if |
| a register window push is required. |
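| |
| A sketch of such a named pattern is given below. The function |
| @code{simple_epilogue_p} is a hypothetical helper that a port would have |
| to supply to decide whether the epilogue is a single instruction; the |
| @samp{ret} mnemonic is likewise an assumption. |
| |
| @smallexample |
| ;; Hypothetical single-instruction return, valid only after reload. |
| ;; simple_epilogue_p is an assumed, port-specific helper. |
| (define_insn "return" |
|   [(return)] |
|   "reload_completed && simple_epilogue_p ()" |
|   "ret") |
| @end smallexample |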
| |
| Machines that have conditional return instructions should define patterns |
| such as |
| |
| @smallexample |
| (define_insn "" |
|   [(set (pc) |
|         (if_then_else (match_operator |
|                          0 "comparison_operator" |
|                          [(cc0) (const_int 0)]) |
|                       (return) |
|                       (pc)))] |
|   "@var{condition}" |
|   "@dots{}") |
| @end smallexample |
| |
| where @var{condition} would normally be the same condition specified on the |
| named @samp{return} pattern. |
| |
| @cindex @code{untyped_return} instruction pattern |
| @item @samp{untyped_return} |
| Untyped subroutine return instruction. This instruction pattern should |
| be defined to support @code{__builtin_return} on machines where special |
| instructions are needed to return a value of any type. |
| |
| Operand 0 is a memory location where the result of calling a function |
| with @code{__builtin_apply} is stored; operand 1 is a @code{parallel} |
| expression where each element is a @code{set} expression that indicates |
| the restoring of a function return value from the result block. |
| |
| @cindex @code{nop} instruction pattern |
| @item @samp{nop} |
| No-op instruction. This instruction pattern name should always be defined |
| to output a no-op in assembler code. @code{(const_int 0)} will do as an |
| RTL pattern. |
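| |
| For example, assuming the assembler spells the instruction @samp{nop}: |
| |
| @smallexample |
| ;; The mnemonic is an assumption; the RTL is the trivial (const_int 0). |
| (define_insn "nop" |
|   [(const_int 0)] |
|   "" |
|   "nop") |
| @end smallexample |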
| |
| @cindex @code{indirect_jump} instruction pattern |
| @item @samp{indirect_jump} |
| An instruction to jump to an address which is operand zero. |
| This pattern name is mandatory on all machines. |
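| |
| A minimal sketch, assuming that @code{Pmode} is @code{SImode} and that |
| the hypothetical machine can jump through a register: |
| |
| @smallexample |
| ;; Hypothetical jump to the address held in operand 0. |
| (define_insn "indirect_jump" |
|   [(set (pc) (match_operand:SI 0 "register_operand" "r"))] |
|   "" |
|   "jmp (%0)") |
| @end smallexample |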
| |
| @cindex @code{casesi} instruction pattern |
| @item @samp{casesi} |
| Instruction to jump through a dispatch table, including bounds checking. |
| This instruction takes five operands: |
| |
| @enumerate |
| @item |
| The index to dispatch on, which has mode @code{SImode}. |
| |
| @item |
| The lower bound for indices in the table, an integer constant. |
| |
| @item |
| The total range of indices in the table---the largest index |
| minus the smallest one (both inclusive). |
| |
| @item |
| A label that precedes the table itself. |
| |
| @item |
| A label to jump to if the index has a value outside the bounds. |
| (If the machine-description macro @code{CASE_DROPS_THROUGH} is defined, |
| then an out-of-bounds index drops through to the code following |
| the jump table instead of jumping to this label. In that case, |
| this label is not actually used by the @samp{casesi} instruction, |
| but it is always provided as an operand.) |
| @end enumerate |
| |
| The table is an @code{addr_vec} or @code{addr_diff_vec} inside of a |
| @code{jump_insn}. The number of elements in the table is one plus the |
| difference between the upper bound and the lower bound. |
| |
| @cindex @code{tablejump} instruction pattern |
| @item @samp{tablejump} |
| Instruction to jump to a variable address. This is a low-level |
| capability which can be used to implement a dispatch table when there |
| is no @samp{casesi} pattern. |
| |
| This pattern requires two operands: the address or offset, and a label |
| which should immediately precede the jump table. If the macro |
| @code{CASE_VECTOR_PC_RELATIVE} is defined then the first operand is an |
| offset which counts from the address of the table; otherwise, it is an |
| absolute address to jump to. In either case, the first operand has |
| mode @code{Pmode}. |
| |
| The @samp{tablejump} insn is always the last insn before the jump |
| table it uses. Its assembler code normally has no need to use the |
| second operand, but you should incorporate it in the RTL pattern so |
| that the jump optimizer will not delete the table as unreachable code. |
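| |
| The shape described above can be sketched as follows, assuming that |
| @code{Pmode} is @code{SImode} and that the first operand is an absolute |
| address. The @samp{jmp} mnemonic is hypothetical. The @code{use} of the |
| label keeps the second operand in the RTL even though the assembler |
| output ignores it. |
| |
| @smallexample |
| ;; Hypothetical table jump: operand 0 is the target address, |
| ;; operand 1 is the label that precedes the jump table. |
| (define_insn "tablejump" |
|   [(set (pc) (match_operand:SI 0 "register_operand" "r")) |
|    (use (label_ref (match_operand 1 "" "")))] |
|   "" |
|   "jmp (%0)") |
| @end smallexample |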
| |
| @cindex @code{canonicalize_funcptr_for_compare} instruction pattern |
| @item @samp{canonicalize_funcptr_for_compare} |
| Canonicalize the function pointer in operand 1 and store the result |
| into operand 0. |
| |
| Operand 0 is always a @code{reg} and has mode @code{Pmode}; operand 1 |
| may be a @code{reg}, @code{mem}, @code{symbol_ref}, @code{const_int}, etc., |
| and also has mode @code{Pmode}. |
| |
| Canonicalization of a function pointer usually involves computing |
| the address of the function which would be called if the function |
| pointer were used in an indirect call. |
| |
| Only define this pattern if function pointers on the target machine |
| can have different values but still call the same function when |
| used in an indirect call. |
| |
| @cindex @code{save_stack_block} instruction pattern |
| @cindex @code{save_stack_function} instruction pattern |
| @cindex @code{save_stack_nonlocal} instruction pattern |
| @cindex @code{restore_stack_block} instruction pattern |
| @cindex @code{restore_stack_function} instruction pattern |
| @cindex @code{restore_stack_nonlocal} instruction pattern |
| @item @samp{save_stack_block} |
| @itemx @samp{save_stack_function} |
| @itemx @samp{save_stack_nonlocal} |
| @itemx @samp{restore_stack_block} |
| @itemx @samp{restore_stack_function} |
| @itemx @samp{restore_stack_nonlocal} |
| Most machines save and restore the stack pointer by copying it to or |
| from an object of mode @code{Pmode}. Do not define these patterns on |
| such machines. |
| |
| Some machines require special handling for stack pointer saves and |
| restores. On those machines, define the patterns corresponding to the |
| non-standard cases by using a @code{define_expand} (@pxref{Expander |
| Definitions}) that produces the required insns. The three types of |
| saves and restores are: |
| |
| @enumerate |
| @item |
| @samp{save_stack_block} saves the stack pointer at the start of a block |
| that allocates a variable-sized object, and @samp{restore_stack_block} |
| restores the stack pointer when the block is exited. |
| |
| @item |
| @samp{save_stack_function} and @samp{restore_stack_function} do a |
| similar job for the outermost block of a function and are used when the |
| function allocates variable-sized objects or calls @code{alloca}. Only |
| the epilogue uses the restored stack pointer, allowing a simpler save or |
| restore sequence on some machines. |
| |
| @item |
| @samp{save_stack_nonlocal} is used in functions that contain labels |
| branched to by nested functions. It saves the stack pointer in such a |
| way that the inner function can use @samp{restore_stack_nonlocal} to |
| restore the stack pointer. The compiler generates code to restore the |
| frame and argument pointer registers, but some machines require saving |
| and restoring additional data such as register window information or |
| stack backchains. Place insns in these patterns to save and restore any |
| such required data. |
| @end enumerate |
| |
| When saving the stack pointer, operand 0 is the save area and operand 1 |
| is the stack pointer. The mode used to allocate the save area is the |
| mode of operand 0. You must specify an integral mode, or |
| @code{VOIDmode} if no save area is needed for a particular type of save |
| (either because no save is needed or because a machine-specific save |
| area can be used). Operand 0 is the stack pointer and operand 1 is the |
| save area for restore operations. If @samp{save_stack_block} is |
| defined, operand 0 must not be @code{VOIDmode} since these saves can be |
| arbitrarily nested. |
| |
| A save area is a @code{mem} that is at a constant offset from |
| @code{virtual_stack_vars_rtx} when the stack pointer is saved for use by |
| nonlocal gotos and a @code{reg} in the other two cases. |
| |
| @cindex @code{allocate_stack} instruction pattern |
| @item @samp{allocate_stack} |
| Subtract (or add if @code{STACK_GROWS_DOWNWARD} is undefined) operand 0 from |
| the stack pointer to create space for dynamically allocated data. |
| |
| Do not define this pattern if all that must be done is the subtraction. |
| Some machines require other operations such as stack probes or |
| maintaining the back chain. Define this pattern to emit those |
| operations in addition to updating the stack pointer. |
| |
| @cindex @code{probe} instruction pattern |
| @item @samp{probe} |
| Some machines require instructions to be executed after space is |
| allocated from the stack, for example to generate a reference at |
| the bottom of the stack. |
| |
| If you need to emit instructions before the stack has been adjusted, |
| put them into the @samp{allocate_stack} pattern. Otherwise, define |
| this pattern to emit the required instructions. |
| |
| No operands are provided. |
| |
| @cindex @code{nonlocal_goto} instruction pattern |
| @item @samp{nonlocal_goto} |
| Emit code to generate a non-local goto, e.g., a jump from one function |
| to a label in an outer function. This pattern has four arguments, |
| each representing a value to be used in the jump. The first |
| argument is to be loaded into the frame pointer, the second is |
| the address to branch to (code to dispatch to the actual label), |
| the third is the address of a location where the stack is saved, |
| and the last is the address of the label, to be placed in the |
| location for the incoming static chain. |
| |
| On most machines you need not define this pattern, since GNU CC will |
| already generate the correct code, which is to load the frame pointer |
| and static chain, restore the stack (using the |
| @samp{restore_stack_nonlocal} pattern, if defined), and jump indirectly |
| to the dispatcher. You need only define this pattern if this code will |
| not work on your machine. |
| |
| @cindex @code{nonlocal_goto_receiver} instruction pattern |
| @item @samp{nonlocal_goto_receiver} |
| This pattern, if defined, contains code needed at the target of a |
| nonlocal goto after the code already generated by GNU CC. You will not |
| normally need to define this pattern. A typical reason why you might |
| need this pattern is if some value, such as a pointer to a global table, |
| must be restored when the frame pointer is restored. There are no |
| arguments. |
| @end table |
| |
| @node Pattern Ordering |
| @section When the Order of Patterns Matters |
| @cindex Pattern Ordering |
| @cindex Ordering of Patterns |
| |
| Sometimes an insn can match more than one instruction pattern. Then the |
| pattern that appears first in the machine description is the one used. |
| Therefore, more specific patterns (patterns that will match fewer things) |
| and faster instructions (those that will produce better code when they |
| do match) should usually go first in the description. |
| |
| In some cases the effect of ordering the patterns can be used to hide |
| a pattern when it is not valid. For example, the 68000 has an |
| instruction for converting a fullword to floating point and another |
| for converting a byte to floating point. An instruction converting |
| an integer to floating point could match either one. We put the |
| pattern to convert the fullword first to make sure that one will |
| be used rather than the other. (Otherwise a large integer might |
| be generated as a single-byte immediate quantity, which would not work.) |
| Instead of using this pattern ordering it would be possible to make the |
| pattern for convert-a-byte smart enough to deal properly with any |
| constant value. |
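| |
| The ordering described above might be laid out roughly as follows. This |
| is only a sketch: the pattern names, predicates and constraints are |
| illustrative assumptions, not copied from the actual 68000 description, |
| and the conditions and output templates are left elided. |
| |
| @smallexample |
| ;; Fullword conversion comes first, so a constant operand matches here. |
| (define_insn "floatsidf2" |
|   [(set (match_operand:DF 0 "general_operand" "=f") |
|         (float:DF (match_operand:SI 1 "general_operand" "dmi")))] |
|   "@dots{}" |
|   "@dots{}") |
| |
| ;; Byte conversion comes second; it is tried only if the one above fails. |
| (define_insn "" |
|   [(set (match_operand:DF 0 "general_operand" "=f") |
|         (float:DF (match_operand:QI 1 "general_operand" "dm")))] |
|   "@dots{}" |
|   "@dots{}") |
| @end smallexample |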
| |
| @node Dependent Patterns |
| @section Interdependence of Patterns |
| @cindex Dependent Patterns |
| @cindex Interdependence of Patterns |
| |
| Every machine description must have a named pattern for each of the |
| conditional branch names @samp{b@var{cond}}. The recognition template |
| must always have the form |
| |
| @example |
| (set (pc) |
| (if_then_else (@var{cond} (cc0) (const_int 0)) |
| (label_ref (match_operand 0 "" "")) |
| (pc))) |
| @end example |
| |
| @noindent |
| In addition, every machine description must have an anonymous pattern |
| for each of the possible reverse-conditional branches. Their templates |
| look like |
| |
| @example |
| (set (pc) |
| (if_then_else (@var{cond} (cc0) (const_int 0)) |
| (pc) |
| (label_ref (match_operand 0 "" "")))) |
| @end example |
| |
| @noindent |
| They are necessary because jump optimization can turn direct-conditional |
| branches into reverse-conditional branches. |
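| |
| For instance, a named @samp{beq} pattern and its anonymous reverse might |
| be sketched as follows. The @samp{jeq} and @samp{jne} mnemonics are |
| hypothetical; @samp{%l0} outputs operand 0 as a label. |
| |
| @smallexample |
| (define_insn "beq" |
|   [(set (pc) |
|         (if_then_else (eq (cc0) (const_int 0)) |
|                       (label_ref (match_operand 0 "" "")) |
|                       (pc)))] |
|   "" |
|   "jeq %l0") |
| |
| ;; Reverse-conditional form: falls through when the condition holds. |
| (define_insn "" |
|   [(set (pc) |
|         (if_then_else (eq (cc0) (const_int 0)) |
|                       (pc) |
|                       (label_ref (match_operand 0 "" ""))))] |
|   "" |
|   "jne %l0") |
| @end smallexample |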
| |
| It is often convenient to use the @code{match_operator} construct to |
| reduce the number of patterns that must be specified for branches. For |
| example, |
| |
| @example |
| (define_insn "" |
| [(set (pc) |
| (if_then_else (match_operator 0 "comparison_operator" |
| [(cc0) (const_int 0)]) |
| (pc) |
| (label_ref (match_operand 1 "" ""))))] |
| "@var{condition}" |
| "@dots{}") |
| @end example |
| |
| In some cases machines support instructions identical except for the |
| machine mode of one or more operands. For example, there may be |
| ``sign-extend halfword'' and ``sign-extend byte'' instructions whose |
| patterns are |
| |
| @example |
| (set (match_operand:SI 0 @dots{}) |
| (extend:SI (match_operand:HI 1 @dots{}))) |
| |
| (set (match_operand:SI 0 @dots{}) |
| (extend:SI (match_operand:QI 1 @dots{}))) |
| @end example |
| |
| @noindent |
| Constant integers do not specify a machine mode, so an instruction to |
| extend a constant value could match either pattern. The pattern it |
| actually will match is the one that appears first in the file. For correct |
| results, this must be the one for the widest possible mode (@code{HImode}, |
| here). If the pattern matches the @code{QImode} instruction, the results |
| will be incorrect if the constant value does not actually fit that mode. |
| |
| Such instructions to extend constants are rarely generated because they are |
| optimized away, but they do occasionally happen in nonoptimized |
| compilations. |
| |
| If a constraint in a pattern allows a constant, the reload pass may |
| replace a register with a constant permitted by the constraint in some |
| cases. Similarly for memory references. Because of this substitution, |
| you should not provide separate patterns for increment and decrement |
| instructions. Instead, they should be generated from the same pattern |
| that supports register-register add insns by examining the operands and |
| generating the appropriate machine instruction. |
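| |
| One way to do this, sketched here under the assumption of a two-address |
| machine with hypothetical @samp{inc}, @samp{dec} and @samp{add} |
| mnemonics, is to test the operand in a C output statement: |
| |
| @smallexample |
| (define_insn "addsi3" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (plus:SI (match_operand:SI 1 "register_operand" "%0") |
|                  (match_operand:SI 2 "general_operand" "g")))] |
|   "" |
|   "* |
| @{ |
|   /* Constants 1 and -1 select the shorter instructions.  */ |
|   if (operands[2] == const1_rtx) |
|     return \"inc %0\"; |
|   if (operands[2] == constm1_rtx) |
|     return \"dec %0\"; |
|   return \"add %2,%0\"; |
| @}") |
| @end smallexample |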
| |
| @node Jump Patterns |
| @section Defining Jump Instruction Patterns |
| @cindex jump instruction patterns |
| @cindex defining jump instruction patterns |
| |
| For most machines, GNU CC assumes that the machine has a condition code. |
| A comparison insn sets the condition code, recording the results of both |
| signed and unsigned comparison of the given operands. A separate branch |
| insn tests the condition code and branches or not according to its value. |
| The branch insns come in distinct signed and unsigned flavors. Many |
| common machines, such as the Vax, the 68000 and the 32000, work this |
| way. |
| |
| Some machines have distinct signed and unsigned compare instructions, and |
| only one set of conditional branch instructions. The easiest way to handle |
| these machines is to treat them just like the others until the final stage |
| where assembly code is written. At this time, when outputting code for the |
| compare instruction, peek ahead at the following branch using |
| @code{next_cc0_user (insn)}. (The variable @code{insn} refers to the insn |
| being output, in the output-writing code in an instruction pattern.) If |
| the RTL says that it is an unsigned branch, output an unsigned compare; |
| otherwise output a signed compare. When the branch itself is output, you |
| can treat signed and unsigned branches identically. |
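| |
| A compare pattern written this way might look roughly like the sketch |
| below. The helper @code{unsigned_branch_p} is hypothetical: it stands |
| for whatever code in the machine's C file examines the branch insn and |
| reports whether its comparison operator is an unsigned one. The |
| @samp{cmp} and @samp{cmpu} mnemonics are likewise assumptions. |
| |
| @smallexample |
| (define_insn "cmpsi" |
|   [(set (cc0) |
|         (compare (match_operand:SI 0 "register_operand" "r") |
|                  (match_operand:SI 1 "register_operand" "r")))] |
|   "" |
|   "* |
| @{ |
|   /* Look at the branch that will consume this condition code.  */ |
|   rtx branch = next_cc0_user (insn); |
|   if (unsigned_branch_p (branch))   /* hypothetical helper */ |
|     return \"cmpu %0,%1\"; |
|   return \"cmp %0,%1\"; |
| @}") |
| @end smallexample |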
| |
| The reason you can do this is that GNU CC always generates a pair of |
| consecutive RTL insns, possibly separated by @code{note} insns, one to |
| set the condition code and one to test it, and keeps the pair inviolate |
| until the end. |
| |
| To go with this technique, you must define the machine-description macro |
| @code{NOTICE_UPDATE_CC} to do @code{CC_STATUS_INIT}; in other words, no |
| compare instruction is superfluous. |
| |
| Some machines have compare-and-branch instructions and no condition code. |
| A similar technique works for them. When it is time to ``output'' a |
| compare instruction, record its operands in two static variables. When |
| outputting the branch-on-condition-code instruction that follows, actually |
| output a compare-and-branch instruction that uses the remembered operands. |
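| |
| In outline, and assuming two hypothetical static variables |
| @code{saved_cmp_op0} and @code{saved_cmp_op1} declared in the machine's |
| C file and a hypothetical @samp{cbeq} mnemonic, the pair of patterns |
| might be sketched like this: |
| |
| @smallexample |
| (define_insn "cmpsi" |
|   [(set (cc0) |
|         (compare (match_operand:SI 0 "register_operand" "r") |
|                  (match_operand:SI 1 "register_operand" "r")))] |
|   "" |
|   "* |
| @{ |
|   /* Emit nothing now; just remember the operands.  */ |
|   saved_cmp_op0 = operands[0]; |
|   saved_cmp_op1 = operands[1]; |
|   return \"\"; |
| @}") |
| |
| (define_insn "beq" |
|   [(set (pc) |
|         (if_then_else (eq (cc0) (const_int 0)) |
|                       (label_ref (match_operand 0 "" "")) |
|                       (pc)))] |
|   "" |
|   "* |
| @{ |
|   rtx xoperands[3]; |
|   xoperands[0] = saved_cmp_op0; |
|   xoperands[1] = saved_cmp_op1; |
|   xoperands[2] = operands[0];       /* the branch target */ |
|   output_asm_insn (\"cbeq %0,%1,%l2\", xoperands); |
|   return \"\"; |
| @}") |
| @end smallexample |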
| |
| It also works to define patterns for compare-and-branch instructions. |
| In optimizing compilation, the pair of compare and branch instructions |
| will be combined according to these patterns. But this does not happen |
| if optimization is not requested. So you must use one of the solutions |
| above in addition to any special patterns you define. |
| |
| In many RISC machines, most instructions do not affect the condition |
| code and there may not even be a separate condition code register. On |
| these machines, the restriction that the definition and use of the |
| condition code be adjacent insns is not necessary and can prevent |
| important optimizations. For example, on the IBM RS/6000, there is a |
| delay for taken branches unless the condition code register is set three |
| instructions earlier than the conditional branch. The instruction |
| scheduler cannot perform this optimization if it is not permitted to |
| separate the definition and use of the condition code register. |
| |
| On these machines, do not use @code{(cc0)}, but instead use a register |
| to represent the condition code. If there is a specific condition code |
| register in the machine, use a hard register. If the condition code or |
| comparison result can be placed in any general register, or if there are |
| multiple condition registers, use a pseudo register. |
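| |
| A conditional branch on such a machine might then be sketched as |
| follows, assuming (purely for illustration) that the condition code |
| lives in hard register 0 and has mode @code{CCmode}: |
| |
| @smallexample |
| (define_insn "" |
|   [(set (pc) |
|         (if_then_else (match_operator 1 "comparison_operator" |
|                                       [(reg:CC 0) (const_int 0)]) |
|                       (label_ref (match_operand 0 "" "")) |
|                       (pc)))] |
|   "" |
|   "@dots{}") |
| @end smallexample |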
| |
| @findex prev_cc0_setter |
| @findex next_cc0_user |
| On some machines, the type of branch instruction generated may depend on |
| the way the condition code was produced; for example, on the 68k and |
| Sparc, setting the condition code directly from an add or subtract |
| instruction does not clear the overflow bit the way that a test |
| instruction does, so a different branch instruction must be used for |
| some conditional branches. For machines that use @code{(cc0)}, the set |
| and use of the condition code must be adjacent (separated only by |
| @code{note} insns) allowing flags in @code{cc_status} to be used. |
| (@xref{Condition Code}.) Also, each of the comparison and branch insns |
| can be found from the other by using the functions @code{prev_cc0_setter} |
| and @code{next_cc0_user}. |
| |
| However, this is not true on machines that do not use @code{(cc0)}. On |
| those machines, no assumptions can be made about the adjacency of the |
| compare and branch insns and the above methods cannot be used. Instead, |
| we use the machine mode of the condition code register to record |
| different formats of the condition code register. |
| |
| Registers used to store the condition code value should have a mode that |
| is in class @code{MODE_CC}. Normally, it will be @code{CCmode}. If |
| additional modes are required (as for the add example mentioned above in |
| the Sparc), define the macro @code{EXTRA_CC_MODES} to list the |
| additional modes required (@pxref{Condition Code}). Also define |
| @code{EXTRA_CC_NAMES} to list the names of those modes and |
| @code{SELECT_CC_MODE} to choose a mode given an operand of a compare. |
| |
| If it is known during RTL generation that a different mode will be |
| required (for example, if the machine has separate compare instructions |
| for signed and unsigned quantities, like most IBM processors), they can |
| be specified at that time. |
| |
| If the cases that require different modes would be made by instruction |
| combination, the macro @code{SELECT_CC_MODE} determines which machine |
| mode should be used for the comparison result. The patterns should be |
| written using that mode. To support the case of the add on the Sparc |
| discussed above, we have the pattern |
| |
| @smallexample |
| (define_insn "" |
| [(set (reg:CC_NOOV 0) |
| (compare:CC_NOOV |
| (plus:SI (match_operand:SI 0 "register_operand" "%r") |
| (match_operand:SI 1 "arith_operand" "rI")) |
| (const_int 0)))] |
| "" |
| "@dots{}") |
| @end smallexample |
| |
| The @code{SELECT_CC_MODE} macro on the Sparc returns @code{CC_NOOVmode} |
| for comparisons whose argument is a @code{plus}. |
| |
| @node Insn Canonicalizations |
| @section Canonicalization of Instructions |
| @cindex canonicalization of instructions |
| @cindex insn canonicalization |
| |
| There are often cases where multiple RTL expressions could represent an |
| operation performed by a single machine instruction. This situation is |
| most commonly encountered with logical, branch, and multiply-accumulate |
| instructions. In such cases, the compiler attempts to convert these |
| multiple RTL expressions into a single canonical form to reduce the |
| number of insn patterns required. |
| |
| In addition to algebraic simplifications, the following canonicalizations |
| are performed: |
| |
| @itemize @bullet |
| @item |
| For commutative and comparison operators, a constant is always made the |
| second operand. If a machine only supports a constant as the second |
| operand, only patterns that match a constant in the second operand need |
| be supplied. |
| |
| @cindex @code{neg}, canonicalization of |
| @cindex @code{not}, canonicalization of |
| @cindex @code{mult}, canonicalization of |
| @cindex @code{plus}, canonicalization of |
| @cindex @code{minus}, canonicalization of |
| For these operators, if only one operand is a @code{neg}, @code{not}, |
| @code{mult}, @code{plus}, or @code{minus} expression, it will be the |
| first operand. |
| |
| @cindex @code{compare}, canonicalization of |
| @item |
| For the @code{compare} operator, a constant is always the second operand |
| on machines where @code{cc0} is used (@pxref{Jump Patterns}). On other |
| machines, there are rare cases where the compiler might want to construct |
| a @code{compare} with a constant as the first operand. However, these |
| cases are not common enough for it to be worthwhile to provide a pattern |
| matching a constant as the first operand unless the machine actually has |
| such an instruction. |
| |
| An operand of @code{neg}, @code{not}, @code{mult}, @code{plus}, or |
| @code{minus} is made the first operand under the same conditions as |
| above. |
| |
| @item |
| @code{(minus @var{x} (const_int @var{n}))} is converted to |
| @code{(plus @var{x} (const_int @var{-n}))}. |
| |
| @item |
| Within address computations (i.e., inside @code{mem}), a left shift is |
| converted into the appropriate multiplication by a power of two. |
| |
| @cindex @code{ior}, canonicalization of |
| @cindex @code{and}, canonicalization of |
| @cindex De Morgan's law |
| @item |
| De Morgan's Law is used to move bitwise negation inside a bitwise |
| logical-and or logical-or operation. If this results in only one |
| operand being a @code{not} expression, it will be the first one. |
| |
| A machine that has an instruction that performs a bitwise logical-and of one |
| operand with the bitwise negation of the other should specify the pattern |
| for that instruction as |
| |
| @example |
| (define_insn "" |
| [(set (match_operand:@var{m} 0 @dots{}) |
| (and:@var{m} (not:@var{m} (match_operand:@var{m} 1 @dots{})) |
| (match_operand:@var{m} 2 @dots{})))] |
| "@dots{}" |
| "@dots{}") |
| @end example |
| |
| @noindent |
| Similarly, a pattern for a ``NAND'' instruction should be written |
| |
| @example |
| (define_insn "" |
| [(set (match_operand:@var{m} 0 @dots{}) |
| (ior:@var{m} (not:@var{m} (match_operand:@var{m} 1 @dots{})) |
| (not:@var{m} (match_operand:@var{m} 2 @dots{}))))] |
| "@dots{}" |
| "@dots{}") |
| @end example |
| |
| In both cases, it is not necessary to include patterns for the many |
| logically equivalent RTL expressions. |
| |
| @cindex @code{xor}, canonicalization of |
| @item |
| The only possible RTL expressions involving both bitwise exclusive-or |
| and bitwise negation are @code{(xor:@var{m} @var{x} @var{y})} |
| and @code{(not:@var{m} (xor:@var{m} @var{x} @var{y}))}.@refill |
| |
| @item |
| The sum of three items, one of which is a constant, will only appear in |
| the form |
| |
| @example |
| (plus:@var{m} (plus:@var{m} @var{x} @var{y}) @var{constant}) |
| @end example |
| |
| @item |
| On machines that do not use @code{cc0}, |
| @code{(compare @var{x} (const_int 0))} will be converted to |
| @var{x}.@refill |
| |
| @cindex @code{zero_extract}, canonicalization of |
| @cindex @code{sign_extract}, canonicalization of |
| @item |
| Equality comparisons of a group of bits (usually a single bit) with zero |
| will be written using @code{zero_extract} rather than the equivalent |
| @code{and} or @code{sign_extract} operations. |
| |
| @end itemize |
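| |
| To make some of these rules concrete, here are a few RTL expressions in |
| the form the rules above call for. The register numbers and modes are |
| arbitrary and chosen only for illustration. |
| |
| @smallexample |
| ;; A constant is the second operand of a commutative operator. |
| (plus:SI (reg:SI 1) (const_int 4)) |
| |
| ;; Subtracting the constant 4 is written as adding -4. |
| (plus:SI (reg:SI 1) (const_int -4)) |
| |
| ;; Inside a mem, scaling is a mult, and the mult is the first operand. |
| (mem:SI (plus:SI (mult:SI (reg:SI 2) (const_int 4)) |
|                  (reg:SI 1))) |
| |
| ;; A sum of three items keeps the constant outermost. |
| (plus:SI (plus:SI (reg:SI 1) (reg:SI 2)) (const_int 8)) |
| @end smallexample |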
| |
| @node Peephole Definitions |
| @section Machine-Specific Peephole Optimizers |
| @cindex peephole optimizer definitions |
| @cindex defining peephole optimizers |
| |
| In addition to instruction patterns, the @file{.md} file may contain |
| definitions of machine-specific peephole optimizations. |
| |
| The combiner does not notice certain peephole optimizations when the data |
| flow in the program does not suggest that it should try them. For example, |
| sometimes two consecutive insns related in purpose can be combined even |
| though the second one does not appear to use a register computed in the |
| first one. A machine-specific peephole optimizer can detect such |
| opportunities. |
| |
| @need 1000 |
| A definition looks like this: |
| |
| @smallexample |
| (define_peephole |
| [@var{insn-pattern-1} |
| @var{insn-pattern-2} |
| @dots{}] |
| "@var{condition}" |
| "@var{template}" |
| "@var{optional insn-attributes}") |
| @end smallexample |
| |
| @noindent |
| The last string operand may be omitted if you are not using any |
| machine-specific information in this machine description. If present, |
| it must obey the same rules as in a @code{define_insn}. |
| |
| In this skeleton, @var{insn-pattern-1} and so on are patterns to match |
| consecutive insns. The optimization applies to a sequence of insns when |
| @var{insn-pattern-1} matches the first one, @var{insn-pattern-2} matches |
| the next, and so on.@refill |
| |
| Each of the insns matched by a peephole must also match a |
| @code{define_insn}. Peepholes are checked only at the last stage just |
| before code generation, and only optionally. Therefore, any insn which |
| would match a peephole but no @code{define_insn} will cause a crash in code |
| generation in an unoptimized compilation, or at various optimization |
| stages. |
| |
| The operands of the insns are matched with @code{match_operand}, |
| @code{match_operator}, and @code{match_dup}, as usual. What is not |
| usual is that the operand numbers apply to all the insn patterns in the |
| definition. So, you can check for identical operands in two insns by |
| using @code{match_operand} in one insn and @code{match_dup} in the |
| other. |
| |
| The operand constraints used in @code{match_operand} patterns do not have |
| any direct effect on the applicability of the peephole, but they will |
| be validated afterward, so make sure your constraints are general enough |
| to apply whenever the peephole matches. If the peephole matches |
| but the constraints are not satisfied, the compiler will crash. |
| |
| It is safe to omit constraints in all the operands of the peephole; or |
| you can write constraints which serve as a double-check on the criteria |
| previously tested. |
| |
| Once a sequence of insns matches the patterns, the @var{condition} is |
| checked. This is a C expression which makes the final decision whether to |
| perform the optimization (we do so if the expression is nonzero). If |
| @var{condition} is omitted (in other words, the string is empty) then the |
| optimization is applied to every sequence of insns that matches the |
| patterns. |
| |
| The defined peephole optimizations are applied after register allocation |
| is complete. Therefore, the peephole definition can check which |
| operands have ended up in which kinds of registers, just by looking at |
| the operands. |
| |
| @findex prev_active_insn |
| The way to refer to the operands in @var{condition} is to write |
| @code{operands[@var{i}]} for operand number @var{i} (as matched by |
| @code{(match_operand @var{i} @dots{})}). Use the variable @code{insn} |
| to refer to the last of the insns being matched; use |
| @code{prev_active_insn} to find the preceding insns. |
| |
| @findex dead_or_set_p |
| When optimizing computations with intermediate results, you can use |
| @var{condition} to match only when the intermediate results are not used |
| elsewhere. Use the C expression @code{dead_or_set_p (@var{insn}, |
| @var{op})}, where @var{insn} is the insn in which you expect the value |
| to be used for the last time (from the value of @code{insn}, together |
| with use of @code{prev_nonnote_insn}), and @var{op} is the intermediate |
| value (from @code{operands[@var{i}]}).@refill |
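| |
| As a minimal sketch of these points, the following definition matches a |
| register copy followed by an add that uses the copy, and applies only |
| when the copy is dead after (or overwritten by) the add. The |
| three-operand @samp{add3} mnemonic is hypothetical. |
| |
| @smallexample |
| (define_peephole |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (match_operand:SI 1 "register_operand" "r")) |
|    (set (match_operand:SI 2 "register_operand" "=r") |
|         (plus:SI (match_dup 0) |
|                  (match_operand:SI 3 "register_operand" "r")))] |
|   "dead_or_set_p (insn, operands[0])" |
|   "add3 %1,%3,%2") |
| @end smallexample |
| |
| @noindent |
| Here @code{(match_dup 0)} in the second insn requires the same register |
| that operand 0 matched in the first, and @code{insn} in the condition is |
| the add, the last insn matched. |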
| |
| Applying the optimization means replacing the sequence of insns with one |
| new insn. The @var{template} controls ultimate output of assembler code |
| for this combined insn. It works exactly like the template of a |
| @code{define_insn}. Operand numbers in this template are the same ones |
| used in matching the original sequence of insns. |
| |
| The result of a defined peephole optimizer does not need to match any of |
| the insn patterns in the machine description; it does not even have an |
| opportunity to match them. The peephole optimizer definition itself serves |
| as the insn pattern to control how the insn is output. |
| |
| Defined peephole optimizers are run as assembler code is being output, |
| so the insns they produce are never combined or rearranged in any way. |
| |
| Here is an example, taken from the 68000 machine description: |
| |
| @smallexample |
| (define_peephole |
| [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4))) |
| (set (match_operand:DF 0 "register_operand" "=f") |
| (match_operand:DF 1 "register_operand" "ad"))] |
| "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])" |
| "* |
| @{ |
| rtx xoperands[2]; |
| xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1); |
| #ifdef MOTOROLA |
| output_asm_insn (\"move.l %1,(sp)\", xoperands); |
| output_asm_insn (\"move.l %1,-(sp)\", operands); |
| return \"fmove.d (sp)+,%0\"; |
| #else |
| output_asm_insn (\"movel %1,sp@@\", xoperands); |
| output_asm_insn (\"movel %1,sp@@-\", operands); |
| return \"fmoved sp@@+,%0\"; |
| #endif |
| @} |
| ") |
| @end smallexample |
| |
| @need 1000 |
| The effect of this optimization is to change |
| |
| @smallexample |
| @group |
| jbsr _foobar |
| addql #4,sp |
| movel d1,sp@@- |
| movel d0,sp@@- |
| fmoved sp@@+,fp0 |
| @end group |
| @end smallexample |
| |
| @noindent |
| into |
| |
| @smallexample |
| @group |
| jbsr _foobar |
| movel d1,sp@@ |
| movel d0,sp@@- |
| fmoved sp@@+,fp0 |
| @end group |
| @end smallexample |
| |
| @ignore |
| @findex CC_REVERSED |
| If a peephole matches a sequence including one or more jump insns, you must |
| take account of the flags such as @code{CC_REVERSED} which specify that the |
| condition codes are represented in an unusual manner. The compiler |
| automatically alters any ordinary conditional jumps which occur in such |
| situations, but the compiler cannot alter jumps which have been replaced by |
| peephole optimizations. So it is up to you to alter the assembler code |
| that the peephole produces. Supply C code to write the assembler output, |
| and in this C code check the condition code status flags and change the |
| assembler code as appropriate. |
| @end ignore |
| |
| @var{insn-pattern-1} and so on look @emph{almost} like the second |
| operand of @code{define_insn}. There is one important difference: the |
| second operand of @code{define_insn} consists of one or more RTX's |
| enclosed in square brackets. Usually, there is only one: then the same |
| action can be written as an element of a @code{define_peephole}. But |
| when there are multiple actions in a @code{define_insn}, they are |
| implicitly enclosed in a @code{parallel}. Then you must explicitly |
| write the @code{parallel}, and the square brackets within it, in the |
| @code{define_peephole}. Thus, if an insn pattern looks like this, |
| |
| @smallexample |
| (define_insn "divmodsi4" |
| [(set (match_operand:SI 0 "general_operand" "=d") |
| (div:SI (match_operand:SI 1 "general_operand" "0") |
| (match_operand:SI 2 "general_operand" "dmsK"))) |
| (set (match_operand:SI 3 "general_operand" "=d") |
| (mod:SI (match_dup 1) (match_dup 2)))] |
| "TARGET_68020" |
| "divsl%.l %2,%3:%0") |
| @end smallexample |
| |
| @noindent |
| then the way to mention this insn in a peephole is as follows: |
| |
| @smallexample |
| (define_peephole |
| [@dots{} |
| (parallel |
| [(set (match_operand:SI 0 "general_operand" "=d") |
| (div:SI (match_operand:SI 1 "general_operand" "0") |
| (match_operand:SI 2 "general_operand" "dmsK"))) |
| (set (match_operand:SI 3 "general_operand" "=d") |
| (mod:SI (match_dup 1) (match_dup 2)))]) |
| @dots{}] |
| @dots{}) |
| @end smallexample |
| |
| @node Expander Definitions |
| @section Defining RTL Sequences for Code Generation |
| @cindex expander definitions |
| @cindex code generation RTL sequences |
| @cindex defining RTL sequences for code generation |
| |
| On some target machines, some standard pattern names for RTL generation |
| cannot be handled with a single insn, but a sequence of RTL insns can |
| represent them. For these target machines, you can write a |
| @code{define_expand} to specify how to generate the sequence of RTL. |
| |
| @findex define_expand |
| A @code{define_expand} is an RTL expression that looks almost like a |
| @code{define_insn}; but, unlike the latter, a @code{define_expand} is used |
| only for RTL generation and it can produce more than one RTL insn. |
| |
| A @code{define_expand} RTX has four operands: |
| |
| @itemize @bullet |
| @item |
| The name. Each @code{define_expand} must have a name, since the only |
| use for it is to refer to it by name. |
| |
| @findex define_peephole |
| @item |
| The RTL template. This is just like the RTL template for a |
| @code{define_peephole} in that it is a vector of RTL expressions |
| each being one insn. |
| |
| @item |
| The condition, a string containing a C expression. This expression is |
| used to express how the availability of this pattern depends on |
| subclasses of target machine, selected by command-line options when GNU |
| CC is run. This is just like the condition of a @code{define_insn} that |
| has a standard name. Therefore, the condition (if present) may not |
| depend on the data in the insn being matched, but only the |
| target-machine-type flags. The compiler needs to test these conditions |
| during initialization in order to learn exactly which named instructions |
| are available in a particular run. |
| |
| @item |
| The preparation statements, a string containing zero or more C |
| statements which are to be executed before RTL code is generated from |
| the RTL template. |
| |
| Usually these statements prepare temporary registers for use as |
| internal operands in the RTL template, but they can also generate RTL |
| insns directly by calling routines such as @code{emit_insn}, etc. |
| Any such insns precede the ones that come from the RTL template. |
| @end itemize |
| |
| Every RTL insn emitted by a @code{define_expand} must match some |
| @code{define_insn} in the machine description. Otherwise, the compiler |
| will crash when trying to generate code for the insn or trying to optimize |
| it. |
| |
| The RTL template, in addition to controlling generation of RTL insns, |
| also describes the operands that need to be specified when this pattern |
| is used. In particular, it gives a predicate for each operand. |
| |
| A true operand, which needs to be specified in order to generate RTL from |
| the pattern, should be described with a @code{match_operand} in its first |
| occurrence in the RTL template. This enters information on the operand's |
| predicate into the tables that record such things. GNU CC uses the |
| information to preload the operand into a register if that is required for |
| valid RTL code. If the operand is referred to more than once, subsequent |
| references should use @code{match_dup}. |
| |
| The RTL template may also refer to internal ``operands'' which are |
| temporary registers or labels used only within the sequence made by the |
| @code{define_expand}. Internal operands are substituted into the RTL |
| template with @code{match_dup}, never with @code{match_operand}. The |
| values of the internal operands are not passed in as arguments by the |
| compiler when it requests use of this pattern. Instead, they are computed |
| within the pattern, in the preparation statements. These statements |
| compute the values and store them into the appropriate elements of |
| @code{operands} so that @code{match_dup} can find them. |
| |
| There are two special macros defined for use in the preparation statements: |
| @code{DONE} and @code{FAIL}. Use them with a following semicolon, |
| as a statement. |
| |
| @table @code |
| |
| @findex DONE |
| @item DONE |
| Use the @code{DONE} macro to end RTL generation for the pattern. The |
| only RTL insns resulting from the pattern on this occasion will be |
| those already emitted by explicit calls to @code{emit_insn} within the |
| preparation statements; the RTL template will not be generated. |
| |
| @findex FAIL |
| @item FAIL |
| Make the pattern fail on this occasion. When a pattern fails, it means |
| that the pattern was not truly available. The calling routines in the |
| compiler will try other strategies for code generation using other patterns. |
| |
| Failure is currently supported only for binary (addition, multiplication, |
| shifting, etc.) and bitfield (@code{extv}, @code{extzv}, and @code{insv}) |
| operations. |
| @end table |
| |
| Here is an example, the definition of left-shift for the SPUR chip: |
| |
| @smallexample |
| @group |
| (define_expand "ashlsi3" |
| [(set (match_operand:SI 0 "register_operand" "") |
| (ashift:SI |
| @end group |
| @group |
| (match_operand:SI 1 "register_operand" "") |
| (match_operand:SI 2 "nonmemory_operand" "")))] |
| "" |
| " |
| @end group |
| @end smallexample |
| |
| @smallexample |
| @group |
| @{ |
| if (GET_CODE (operands[2]) != CONST_INT |
| || (unsigned) INTVAL (operands[2]) > 3) |
| FAIL; |
| @}") |
| @end group |
| @end smallexample |
| |
| @noindent |
| This example uses @code{define_expand} so that it can generate an RTL insn |
| for shifting when the shift-count is in the supported range of 0 to 3 but |
| fail in other cases where machine insns aren't available. When it fails, |
| the compiler tries another strategy using different patterns (such as a |
| library call). |
| |
| If the compiler were able to handle nontrivial condition-strings in |
| patterns with names, then it would be possible to use a |
| @code{define_insn} in that case. Here is another case (zero-extension |
| on the 68000) which makes more use of the power of @code{define_expand}: |
| |
| @smallexample |
| (define_expand "zero_extendhisi2" |
| [(set (match_operand:SI 0 "general_operand" "") |
| (const_int 0)) |
| (set (strict_low_part |
| (subreg:HI |
| (match_dup 0) |
| 0)) |
| (match_operand:HI 1 "general_operand" ""))] |
| "" |
| "operands[1] = make_safe_from (operands[1], operands[0]);") |
| @end smallexample |
| |
| @noindent |
| @findex make_safe_from |
| Here two RTL insns are generated, one to clear the entire output operand |
| and the other to copy the input operand into its low half. This sequence |
| is incorrect if the input operand refers to [the old value of] the output |
| operand, so the preparation statement makes sure this isn't so. The |
| function @code{make_safe_from} copies the @code{operands[1]} into a |
| temporary register if it refers to @code{operands[0]}. It does this |
| by emitting another RTL insn. |
| |
| Finally, a third example shows the use of an internal operand. |
| Zero-extension on the SPUR chip is done by @code{and}-ing the result |
| against a halfword mask. But this mask cannot be represented by a |
| @code{const_int} because the constant value is too large to be legitimate |
| on this machine. So it must be copied into a register with |
| @code{force_reg} and then the register used in the @code{and}. |
| |
| @smallexample |
| (define_expand "zero_extendhisi2" |
| [(set (match_operand:SI 0 "register_operand" "") |
| (and:SI (subreg:SI |
| (match_operand:HI 1 "register_operand" "") |
| 0) |
| (match_dup 2)))] |
| "" |
| "operands[2] |
| = force_reg (SImode, gen_rtx (CONST_INT, |
| VOIDmode, 65535)); ") |
| @end smallexample |
| |
| @strong{Note:} If the @code{define_expand} is used to serve a |
| standard binary or unary arithmetic operation or a bitfield operation, |
| then the last insn it generates must not be a @code{code_label}, |
| @code{barrier} or @code{note}. It must be an @code{insn}, |
| @code{jump_insn} or @code{call_insn}. If you don't need a real insn |
| at the end, emit an insn to copy the result of the operation into |
| itself. Such an insn will generate no code, but it can avoid problems |
| in the compiler.@refill |
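| |
| As a minimal sketch of the @code{DONE} macro described above, consider a |
| hypothetical machine that has a subtract insn but no negate insn. The |
| pattern names, the predicates and the existence of a @code{subsi3} |
| pattern (and therefore of the generated function @code{gen_subsi3}) are |
| assumptions made for illustration; they are not taken from any real |
| machine description: |
| |
| @smallexample |
| (define_expand "negsi2" |
|   [(set (match_operand:SI 0 "register_operand" "") |
|         (neg:SI (match_operand:SI 1 "register_operand" "")))] |
|   "" |
|   " |
| @{ |
|   /* Hypothetical: compute 0 - operands[1] with the subtract pattern. */ |
|   rtx zero = force_reg (SImode, const0_rtx); |
|   emit_insn (gen_subsi3 (operands[0], zero, operands[1])); |
|   /* Everything was emitted by hand, so skip the RTL template. */ |
|   DONE; |
| @}") |
| @end smallexample |
| |
| @noindent |
| Because the preparation statements end with @code{DONE}, the @code{neg} |
| expression in the RTL template is never generated; the template serves |
| only to describe the operands and their predicates. The last insn |
| emitted is an ordinary @code{insn}, as the note above requires. |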
| |
| @node Insn Splitting |
| @section Defining How to Split Instructions |
| @cindex insn splitting |
| @cindex instruction splitting |
| @cindex splitting instructions |
| |
| There are two cases where you should specify how to split a pattern into |
| multiple insns. On machines that have instructions requiring delay |
| slots (@pxref{Delay Slots}) or that have instructions whose output is |
| not available for multiple cycles (@pxref{Function Units}), the compiler |
| phases that optimize these cases need to be able to move insns into |
| one-instruction delay slots. However, some insns may generate more than one |
| machine instruction. These insns cannot be placed into a delay slot. |
| |
| Often you can rewrite the single insn as a list of individual insns, |
| each corresponding to one machine instruction. The disadvantage of |
| doing so is that it will cause the compilation to be slower and require |
| more space. If the resulting insns are too complex, it may also |
| suppress some optimizations. The compiler splits the insn if there is a |
| reason to believe that it might improve instruction or delay slot |
| scheduling. |
| |
| The insn combiner phase also splits putative insns. If three insns are |
| merged into one insn with a complex expression that cannot be matched by |
| some @code{define_insn} pattern, the combiner phase attempts to split |
| the complex pattern into two insns that are recognized. Usually it can |
| break the complex pattern into two patterns by splitting out some |
| subexpression. However, in some other cases, such as performing an |
| addition of a large constant in two insns on a RISC machine, the way to |
| split the addition into two insns is machine-dependent. |
| |
| @cindex define_split |
| The @code{define_split} definition tells the compiler how to split a |
| complex insn into several simpler insns. It looks like this: |
| |
| @smallexample |
| (define_split |
| [@var{insn-pattern}] |
| "@var{condition}" |
| [@var{new-insn-pattern-1} |
| @var{new-insn-pattern-2} |
| @dots{}] |
| "@var{preparation statements}") |
| @end smallexample |
| |
| @var{insn-pattern} is a pattern that needs to be split and |
| @var{condition} is the final condition to be tested, as in a |
| @code{define_insn}. When an insn matching @var{insn-pattern} and |
| satisfying @var{condition} is found, it is replaced in the insn list |
| with the insns given by @var{new-insn-pattern-1}, |
| @var{new-insn-pattern-2}, etc. |
| |
| The @var{preparation statements} are similar to those statements that |
| are specified for @code{define_expand} (@pxref{Expander Definitions}) |
| and are executed before the new RTL is generated to prepare for the |
| generated code or emit some insns whose pattern is not fixed. Unlike |
| those in @code{define_expand}, however, these statements must not |
| generate any new pseudo-registers. Once reload has completed, they also |
| must not allocate any space in the stack frame. |
| |
| Patterns are matched against @var{insn-pattern} in two different |
| circumstances. If an insn needs to be split for delay slot scheduling |
| or insn scheduling, the insn is already known to be valid, which means |
| that it must have been matched by some @code{define_insn} and, if |
| @code{reload_completed} is non-zero, is known to satisfy the constraints |
| of that @code{define_insn}. In that case, the new insn patterns must |
| also be insns that are matched by some @code{define_insn} and, if |
| @code{reload_completed} is non-zero, must also satisfy the constraints |
| of those definitions. |
| |
| As an example of this usage of @code{define_split}, consider the following |
| example from @file{a29k.md}, which splits a @code{sign_extend} from |
| @code{HImode} to @code{SImode} into a pair of shift insns: |
| |
| @smallexample |
| (define_split |
| [(set (match_operand:SI 0 "gen_reg_operand" "") |
| (sign_extend:SI (match_operand:HI 1 "gen_reg_operand" "")))] |
| "" |
| [(set (match_dup 0) |
| (ashift:SI (match_dup 1) |
| (const_int 16))) |
| (set (match_dup 0) |
| (ashiftrt:SI (match_dup 0) |
| (const_int 16)))] |
| " |
| @{ operands[1] = gen_lowpart (SImode, operands[1]); @}") |
| @end smallexample |
| |
| When the combiner phase tries to split an insn pattern, it is always the |
| case that the pattern is @emph{not} matched by any @code{define_insn}. |
| The combiner pass first tries to split a single @code{set} expression |
| and then the same @code{set} expression inside a @code{parallel}, but |
| followed by a @code{clobber} of a pseudo-reg to use as a scratch |
| register. In these cases, the combiner expects exactly two new insn |
| patterns to be generated. It will verify that these patterns match some |
| @code{define_insn} definitions, so you need not do this test in the |
| @code{define_split} (of course, there is no point in writing a |
| @code{define_split} that will never produce insns that match). |
| |
| Here is an example of this use of @code{define_split}, taken from |
| @file{rs6000.md}: |
| |
| @smallexample |
| (define_split |
| [(set (match_operand:SI 0 "gen_reg_operand" "") |
| (plus:SI (match_operand:SI 1 "gen_reg_operand" "") |
| (match_operand:SI 2 "non_add_cint_operand" "")))] |
| "" |
| [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3))) |
| (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))] |
| " |
| @{ |
| int low = INTVAL (operands[2]) & 0xffff; |
| int high = (unsigned) INTVAL (operands[2]) >> 16; |
| |
| if (low & 0x8000) |
| high++, low |= 0xffff0000; |
| |
| operands[3] = gen_rtx (CONST_INT, VOIDmode, high << 16); |
| operands[4] = gen_rtx (CONST_INT, VOIDmode, low); |
| @}") |
| @end smallexample |
| |
| Here the predicate @code{non_add_cint_operand} matches any |
| @code{const_int} that is @emph{not} a valid operand of a single add |
| insn. The add with the smaller displacement is written so that it |
| can be substituted into the address of a subsequent operation. |
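| |
| As a worked instance of the preparation statements above, take the |
| constant @code{0x12348765} and assume a 32-bit @code{int}. The register |
| numbers below are chosen only for illustration. Here @code{low} is |
| @code{0x8765}, which has bit 15 set, so @code{high} is incremented from |
| @code{0x1234} to @code{0x1235} and @code{low} is sign-extended to |
| @minus{}30875: |
| |
| @smallexample |
| ;; Before the split: 0x12348765 = 305432421 |
| (set (reg:SI 103) (plus:SI (reg:SI 104) (const_int 305432421))) |
| |
| ;; After the split: operands[3] = 0x1235 << 16, operands[4] = -30875 |
| (set (reg:SI 103) (plus:SI (reg:SI 104) (const_int 305463296))) |
| (set (reg:SI 103) (plus:SI (reg:SI 103) (const_int -30875))) |
| @end smallexample |
| |
| @noindent |
| The two constants sum to the original value (305463296 @minus{} 30875 = |
| 305432421). The sketch assumes that a constant with only its upper 16 |
| bits set, and a signed 16-bit constant, are each acceptable to a single |
| add insn on this machine. |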
| |
| An example that uses a scratch register, from the same file, generates |
| an equality comparison of a register and a large constant: |
| |
| @smallexample |
| (define_split |
| [(set (match_operand:CC 0 "cc_reg_operand" "") |
| (compare:CC (match_operand:SI 1 "gen_reg_operand" "") |
| (match_operand:SI 2 "non_short_cint_operand" ""))) |
| (clobber (match_operand:SI 3 "gen_reg_operand" ""))] |
| "find_single_use (operands[0], insn, 0) |
| && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ |
| || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)" |
| [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4))) |
| (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))] |
| " |
| @{ |
| /* Get the constant we are comparing against, C, and see what it |
| looks like sign-extended to 16 bits. Then see what constant |
| could be XOR'ed with C to get the sign-extended value. */ |
| |
| int c = INTVAL (operands[2]); |
| int sextc = (c << 16) >> 16; |
| int xorv = c ^ sextc; |
| |
| operands[4] = gen_rtx (CONST_INT, VOIDmode, xorv); |
| operands[5] = gen_rtx (CONST_INT, VOIDmode, sextc); |
| @}") |
| @end smallexample |
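| |
| As a worked instance of this split, assume a 32-bit @code{int} and the |
| constant @code{0x12345678}. The register numbers are hypothetical. |
| The preparation statements compute @code{sextc} as @code{0x5678} (22136) |
| and @code{xorv} as @code{0x12340000} (305397760): |
| |
| @smallexample |
| ;; Before the split: compare with 0x12345678 = 305419896 |
| (parallel [(set (reg:CC 110) |
|                 (compare:CC (reg:SI 101) (const_int 305419896))) |
|            (clobber (reg:SI 109))]) |
| |
| ;; After the split: operands[4] = xorv, operands[5] = sextc |
| (set (reg:SI 109) (xor:SI (reg:SI 101) (const_int 305397760))) |
| (set (reg:CC 110) (compare:CC (reg:SI 109) (const_int 22136))) |
| @end smallexample |
| |
| @noindent |
| Since @code{c ^ xorv} equals @code{sextc}, the scratch register equals |
| @code{sextc} exactly when the original register equals @code{c}. The |
| rewrite preserves equality but not ordering, which is why the condition |
| accepts the split only when the single use of the result is an @code{EQ} |
| or @code{NE} test. |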
| |
| To avoid confusion, don't write a single @code{define_split} that |
| accepts some insns that match some @code{define_insn} as well as some |
| insns that don't. Instead, write two separate @code{define_split} |
| definitions, one for the insns that are valid and one for the insns that |
| are not valid. |
| |
| @node Insn Attributes |
| @section Instruction Attributes |
| @cindex insn attributes |
| @cindex instruction attributes |
| |
| In addition to describing the instructions supported by the target machine, |
| the @file{md} file also defines a group of @dfn{attributes} and a set of |
| values for each. Every generated insn is assigned a value for each attribute. |
| One possible attribute would be the effect that the insn has on the machine's |
| condition code. This attribute can then be used by @code{NOTICE_UPDATE_CC} |
| to track the condition codes. |
| |
| @menu |
| * Defining Attributes:: Specifying attributes and their values. |
| * Expressions:: Valid expressions for attribute values. |
| * Tagging Insns:: Assigning attribute values to insns. |
| * Attr Example:: An example of assigning attributes. |
| * Insn Lengths:: Computing the length of insns. |
| * Constant Attributes:: Defining attributes that are constant. |
| * Delay Slots:: Defining delay slots required for a machine. |
| * Function Units:: Specifying information for insn scheduling. |
| @end menu |
| |
| @node Defining Attributes |
| @subsection Defining Attributes and their Values |
| @cindex defining attributes and their values |
| @cindex attributes, defining |
| |
| @findex define_attr |
| The @code{define_attr} expression is used to define each attribute required |
| by the target machine. It looks like: |
| |
| @smallexample |
| (define_attr @var{name} @var{list-of-values} @var{default}) |
| @end smallexample |
| |
| @var{name} is a string specifying the name of the attribute being defined. |
| |
| @var{list-of-values} is either a string that specifies a comma-separated |
| list of values that can be assigned to the attribute, or a null string to |
| indicate that the attribute takes numeric values. |
| |
| @var{default} is an attribute expression that gives the value of this |
| attribute for insns that match patterns whose definition does not include |
| an explicit value for this attribute. @xref{Attr Example}, for more |
| information on the handling of defaults. @xref{Constant Attributes}, |
| for information on attributes that do not depend on any particular insn. |
| |
| @findex insn-attr.h |
| For each defined attribute, a number of definitions are written to the |
| @file{insn-attr.h} file. For cases where an explicit set of values is |
| specified for an attribute, the following are defined: |
| |
| @itemize @bullet |
| @item |
| A @samp{#define} is written for the symbol @samp{HAVE_ATTR_@var{name}}. |
| |
| @item |
| An enumerated type is defined for @samp{attr_@var{name}} with |
| elements of the form @samp{@var{upper-name}_@var{upper-value}} where |
| the attribute name and value are first converted to upper case. |
| |
| @item |
| A function @samp{get_attr_@var{name}} is defined that is passed an insn and |
| returns the attribute value for that insn. |
| @end itemize |
| |
| For example, if the following is present in the @file{md} file: |
| |
| @smallexample |
| (define_attr "type" "branch,fp,load,store,arith" @dots{}) |
| @end smallexample |
| |
| @noindent |
| the following lines will be written to the file @file{insn-attr.h}. |
| |
| @smallexample |
| #define HAVE_ATTR_type |
| enum attr_type @{TYPE_BRANCH, TYPE_FP, TYPE_LOAD, |
| TYPE_STORE, TYPE_ARITH@}; |
| extern enum attr_type get_attr_type (); |
| @end smallexample |
| |
| If the attribute takes numeric values, no @code{enum} type will be |
| defined and the function to obtain the attribute's value will return |
| @code{int}. |
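| |
| As a small sketch of the numeric case, suppose the @file{md} file |
| contained the following definition. The attribute name @samp{cycles} is |
| hypothetical and is used only for illustration: |
| |
| @smallexample |
| (define_attr "cycles" "" (const_int 1)) |
| @end smallexample |
| |
| @noindent |
| Following the rule above, one would expect @file{insn-attr.h} to declare |
| the accessor with an @code{int} return type and no accompanying |
| @code{enum}: |
| |
| @smallexample |
| extern int get_attr_cycles (); |
| @end smallexample |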
| |
| @node Expressions |
| @subsection Attribute Expressions |
| @cindex attribute expressions |
| |
| RTL expressions used to define attributes use the codes described above |
| plus a few specific to attribute definitions, to be discussed below. |
| Attribute value expressions must have one of the following forms: |
| |
| @table @code |
| @cindex @code{const_int} and attributes |
| @item (const_int @var{i}) |
| The integer @var{i} specifies the value of a numeric attribute. @var{i} |
| must be non-negative. |
| |
| The value of a numeric attribute can be specified either with a |
| @code{const_int} or as an integer represented as a string in |
| @code{const_string}, @code{eq_attr} (see below), and @code{set_attr} |
| (@pxref{Tagging Insns}) expressions. |
| |
| @cindex @code{const_string} and attributes |
| @item (const_string @var{value}) |
| The string @var{value} specifies a constant attribute value. |
| If @var{value} is specified as @samp{"*"}, it means that the default value of |
| the attribute is to be used for the insn containing this expression. |
| @samp{"*"} obviously cannot be used in the @var{default} expression |
| of a @code{define_attr}.@refill |
| |
| If the attribute whose value is being specified is numeric, @var{value} |
| must be a string containing a non-negative integer (normally |
| @code{const_int} would be used in this case). Otherwise, it must |
| contain one of the valid values for the attribute. |
| |
| @cindex @code{if_then_else} and attributes |
| @item (if_then_else @var{test} @var{true-value} @var{false-value}) |
| @var{test} specifies an attribute test, whose format is defined below. |
| The value of this expression is @var{true-value} if @var{test} is true, |
| otherwise it is @var{false-value}. |
| |
| @cindex @code{cond} and attributes |
| @item (cond [@var{test1} @var{value1} @dots{}] @var{default}) |
| The first operand of this expression is a vector containing an even |
| number of expressions and consisting of pairs of @var{test} and @var{value} |
| expressions. The value of the @code{cond} expression is that of the |
| @var{value} corresponding to the first true @var{test} expression. If |
| none of the @var{test} expressions are true, the value of the @code{cond} |
| expression is that of the @var{default} expression. |
| @end table |
| |
| @var{test} expressions can have one of the following forms: |
| |
| @table @code |
| @cindex @code{const_int} and attribute tests |
| @item (const_int @var{i}) |
| This test is true if @var{i} is non-zero and false otherwise. |
| |
| @cindex @code{not} and attributes |
| @cindex @code{ior} and attributes |
| @cindex @code{and} and attributes |
| @item (not @var{test}) |
| @itemx (ior @var{test1} @var{test2}) |
| @itemx (and @var{test1} @var{test2}) |
| These tests are true if the indicated logical function is true. |
| |
| @cindex @code{match_operand} and attributes |
| @item (match_operand:@var{m} @var{n} @var{pred} @var{constraints}) |
| This test is true if operand @var{n} of the insn whose attribute value |
| is being determined has mode @var{m} (this part of the test is ignored |
| if @var{m} is @code{VOIDmode}) and the function specified by the string |
| @var{pred} returns a non-zero value when passed operand @var{n} and mode |
| @var{m} (this part of the test is ignored if @var{pred} is the null |
| string). |
| |
| The @var{constraints} operand is ignored and should be the null string. |
| |
| @cindex @code{le} and attributes |
| @cindex @code{leu} and attributes |
| @cindex @code{lt} and attributes |
| @cindex @code{ltu} and attributes |
| @cindex @code{gt} and attributes |
| @cindex @code{gtu} and attributes |
| @cindex @code{ge} and attributes |
| @cindex @code{geu} and attributes |
| @cindex @code{ne} and attributes |
| @cindex @code{eq} and attributes |
| @cindex @code{plus} and attributes |
| @cindex @code{minus} and attributes |
| @cindex @code{mult} and attributes |
| @cindex @code{div} and attributes |
| @cindex @code{mod} and attributes |
| @cindex @code{abs} and attributes |
| @cindex @code{neg} and attributes |
| @cindex @code{ashift} and attributes |
| @cindex @code{lshiftrt} and attributes |
| @cindex @code{ashiftrt} and attributes |
| @item (le @var{arith1} @var{arith2}) |
| @itemx (leu @var{arith1} @var{arith2}) |
| @itemx (lt @var{arith1} @var{arith2}) |
| @itemx (ltu @var{arith1} @var{arith2}) |
| @itemx (gt @var{arith1} @var{arith2}) |
| @itemx (gtu @var{arith1} @var{arith2}) |
| @itemx (ge @var{arith1} @var{arith2}) |
| @itemx (geu @var{arith1} @var{arith2}) |
| @itemx (ne @var{arith1} @var{arith2}) |
| @itemx (eq @var{arith1} @var{arith2}) |
| These tests are true if the indicated comparison of the two arithmetic |
| expressions is true. Arithmetic expressions are formed with |
| @code{plus}, @code{minus}, @code{mult}, @code{div}, @code{mod}, |
| @code{abs}, @code{neg}, @code{and}, @code{ior}, @code{xor}, @code{not}, |
| @code{ashift}, @code{lshiftrt}, and @code{ashiftrt} expressions.@refill |
| |
| @findex get_attr |
| @code{const_int} and @code{symbol_ref} are always valid terms (@pxref{Insn |
| Lengths}, for additional forms). @code{symbol_ref} is a string |
| denoting a C expression that yields an @code{int} when evaluated by the |
| @samp{get_attr_@dots{}} routine. It should normally be a global |
| variable.@refill |
| |
| @findex eq_attr |
| @item (eq_attr @var{name} @var{value}) |
| @var{name} is a string specifying the name of an attribute. |
| |
| @var{value} is a string that is either a valid value for attribute |
| @var{name}, a comma-separated list of values, or @samp{!} followed by a |
| value or list. If @var{value} does not begin with a @samp{!}, this |
| test is true if the value of the @var{name} attribute of the current |
| insn is in the list specified by @var{value}. If @var{value} begins |
| with a @samp{!}, this test is true if the attribute's value is |
| @emph{not} in the specified list. |
| |
| For example, |
| |
| @smallexample |
| (eq_attr "type" "load,store") |
| @end smallexample |
| |
| @noindent |
| is equivalent to |
| |
| @smallexample |
| (ior (eq_attr "type" "load") (eq_attr "type" "store")) |
| @end smallexample |
| |
| If @var{name} specifies an attribute of @samp{alternative}, it refers to the |
| value of the compiler variable @code{which_alternative} |
| (@pxref{Output Statement}) and the values must be small integers. For |
| example,@refill |
| |
| @smallexample |
| (eq_attr "alternative" "2,3") |
| @end smallexample |
| |
| @noindent |
| is equivalent to |
| |
| @smallexample |
| (ior (eq (symbol_ref "which_alternative") (const_int 2)) |
| (eq (symbol_ref "which_alternative") (const_int 3))) |
| @end smallexample |
| |
| Note that, for most attributes, an @code{eq_attr} test is simplified in cases |
| where the value of the attribute being tested is known for all insns matching |
| a particular pattern. This is by far the most common case.@refill |
| |
| @findex attr_flag |
| @item (attr_flag @var{name}) |
| The value of an @code{attr_flag} expression is true if the flag |
| specified by @var{name} is true for the @code{insn} currently being |
| scheduled. |
| |
| @var{name} is a string specifying one of a fixed set of flags to test. |
| Test the flags @code{forward} and @code{backward} to determine the |
| direction of a conditional branch. Test the flags @code{very_likely}, |
| @code{likely}, @code{very_unlikely}, and @code{unlikely} to determine |
| if a conditional branch is expected to be taken. |
| |
| If the @code{very_likely} flag is true, then the @code{likely} flag is also |
| true. Likewise for the @code{very_unlikely} and @code{unlikely} flags. |
| |
| This example describes a conditional branch delay slot which |
| can be nullified for forward branches that are taken (annul-true) or |
| for backward branches which are not taken (annul-false). |
| |
| @smallexample |
| (define_delay (eq_attr "type" "cbranch") |
| [(eq_attr "in_branch_delay" "true") |
| (and (eq_attr "in_branch_delay" "true") |
| (attr_flag "forward")) |
| (and (eq_attr "in_branch_delay" "true") |
| (attr_flag "backward"))]) |
| @end smallexample |
| |
| The @code{forward} and @code{backward} flags are false if the current |
| @code{insn} being scheduled is not a conditional branch. |
| |
| The @code{very_likely} and @code{likely} flags are true if the |
| @code{insn} being scheduled is not a conditional branch. |
| The @code{very_unlikely} and @code{unlikely} flags are false if the |
| @code{insn} being scheduled is not a conditional branch. |
| |
| @code{attr_flag} is only used during delay slot scheduling and has no |
| meaning to other passes of the compiler. |
| @end table |
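| |
| As an illustration of the negated form of @code{eq_attr} described |
| above, and assuming an attribute @samp{type} with the values used in the |
| earlier examples, the test |
| |
| @smallexample |
| (eq_attr "type" "!load,store") |
| @end smallexample |
| |
| @noindent |
| should be equivalent to |
| |
| @smallexample |
| (not (ior (eq_attr "type" "load") (eq_attr "type" "store"))) |
| @end smallexample |
| |
| Tests can be combined freely. The following sketch defines a |
| hypothetical attribute whose default is @samp{true} only for insns that |
| are neither branches nor floating point operations and whose operand 0 |
| is an @code{SImode} register. The attribute name and its values are |
| assumptions made for the example: |
| |
| @smallexample |
| (define_attr "in_branch_delay" "true,false" |
|   (if_then_else (and (eq_attr "type" "!branch,fp") |
|                      (match_operand:SI 0 "register_operand" "")) |
|                 (const_string "true") |
|                 (const_string "false"))) |
| @end smallexample |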
| |
| @node Tagging Insns |
| @subsection Assigning Attribute Values to Insns |
| @cindex tagging insns |
| @cindex assigning attribute values to insns |
| |
| The value assigned to an attribute of an insn is primarily determined by |
| which pattern is matched by that insn (or which @code{define_peephole} |
| generated it). Every @code{define_insn} and @code{define_peephole} can |
| have an optional last argument to specify the values of attributes for |
| matching insns. The value of any attribute not specified in a particular |
| insn is set to the default value for that attribute, as specified in its |
| @code{define_attr}. Extensive use of default values for attributes |
| permits the specification of the values for only one or two attributes |
| in the definition of most insn patterns, as seen in the example in the |
| next section.@refill |
| |
| The optional last argument of @code{define_insn} and |
| @code{define_peephole} is a vector of expressions, each of which defines |
| the value for a single attribute. The most general way of assigning an |
| attribute's value is to use a @code{set} expression whose first operand is an |
| @code{attr} expression giving the name of the attribute being set. The |
| second operand of the @code{set} is an attribute expression |
| (@pxref{Expressions}) giving the value of the attribute.@refill |
| |
| When the attribute value depends on the @samp{alternative} attribute |
| (i.e., which is the applicable alternative in the constraint of the |
| insn), the @code{set_attr_alternative} expression can be used. It |
| allows the specification of a vector of attribute expressions, one for |
| each alternative. |
| |
| @findex set_attr |
| When the generality of arbitrary attribute expressions is not required, |
| the simpler @code{set_attr} expression can be used, which allows |
| specifying a string giving either a single attribute value or a list |
| of attribute values, one for each alternative. |
| |
| The form of each of the above specifications is shown below. In each case, |
| @var{name} is a string specifying the attribute to be set. |
| |
| @table @code |
| @item (set_attr @var{name} @var{value-string}) |
| @var{value-string} is either a string giving the desired attribute value, |
| or a string containing a comma-separated list giving the values for |
| succeeding alternatives. The number of elements must match the number |
| of alternatives in the constraint of the insn pattern. |
| |
| Note that it may be useful to specify @samp{*} for some alternative, in |
| which case the attribute will assume its default value for insns matching |
| that alternative. |
| |
| @findex set_attr_alternative |
| @item (set_attr_alternative @var{name} [@var{value1} @var{value2} @dots{}]) |
| Depending on the alternative of the insn, the value will be one of the |
| specified values. This is a shorthand for using a @code{cond} with |
| tests on the @samp{alternative} attribute. |
| |
| @findex attr |
| @item (set (attr @var{name}) @var{value}) |
| The first operand of this @code{set} must be the special RTL expression |
| @code{attr}, whose sole operand is a string giving the name of the |
| attribute being set. @var{value} is the value of the attribute. |
| @end table |
| |
| The following shows three different ways of representing the same |
| attribute value specification: |
| |
| @smallexample |
| (set_attr "type" "load,store,arith") |
| |
| (set_attr_alternative "type" |
| [(const_string "load") (const_string "store") |
| (const_string "arith")]) |
| |
| (set (attr "type") |
| (cond [(eq_attr "alternative" "0") (const_string "load") |
| (eq_attr "alternative" "1") (const_string "store")] |
| (const_string "arith"))) |
| @end smallexample |
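| |
| As a sketch of the @samp{*} notation mentioned above, assume an insn |
| pattern with three alternatives in which only the first and third need |
| an explicit @samp{type}. The following two specifications should then |
| be equivalent, with the second alternative taking whatever default the |
| @code{define_attr} for @samp{type} provides: |
| |
| @smallexample |
| (set_attr "type" "load,*,store") |
| |
| (set_attr_alternative "type" |
|   [(const_string "load") (const_string "*") |
|    (const_string "store")]) |
| @end smallexample |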
| |
| @need 1000 |
| @findex define_asm_attributes |
| The @code{define_asm_attributes} expression provides a mechanism to |
| specify the attributes assigned to insns produced from an @code{asm} |
| statement. It has the form: |
| |
| @smallexample |
| (define_asm_attributes [@var{attr-sets}]) |
| @end smallexample |
| |
| @noindent |
| where @var{attr-sets} is specified the same as for both the |
| @code{define_insn} and the @code{define_peephole} expressions. |
| |
| These values will typically be the ``worst case'' attribute values. For |
| example, they might indicate that the condition code will be clobbered. |
| |
| A specification for a @code{length} attribute is handled specially. The |
| way to compute the length of an @code{asm} insn is to multiply the |
| length specified in the expression @code{define_asm_attributes} by the |
| number of machine instructions specified in the @code{asm} statement, |
| determined by counting the number of semicolons and newlines in the |
| string. Therefore, the value of the @code{length} attribute specified |
| in a @code{define_asm_attributes} should be the maximum possible length |
| of a single machine instruction. |
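| |
| As a minimal sketch, a machine whose longest instruction occupies four |
| bytes and which has a @samp{cc} attribute with a @samp{clobber} value |
| might contain the following. Both attributes and their values are |
| assumptions made for illustration: |
| |
| @smallexample |
| (define_asm_attributes |
|   [(set_attr "length" "4") |
|    (set_attr "cc" "clobber")]) |
| @end smallexample |
| |
| @noindent |
| With this definition, an @code{asm} statement whose string contains one |
| semicolon and one newline would be treated as three machine |
| instructions, on the assumption that the count is one more than the |
| number of separators, and so would be given a length of 12. |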
| |
| @node Attr Example |
| @subsection Example of Attribute Specifications |
| @cindex attribute specifications example |
| @cindex attribute specifications |
| |
| The judicious use of defaulting is important in the efficient use of |
| insn attributes. Typically, insns are divided into @dfn{types} and an |
| attribute, customarily called @code{type}, is used to represent this |
| value. This attribute is normally used only to define the default value |
| for other attributes. An example will clarify this usage. |
| |
| Assume we have a RISC machine with a condition code and in which only |
| full-word operations are performed in registers. Let us assume that we |
| can divide all insns into loads, stores, (integer) arithmetic |
| operations, floating point operations, and branches. |
| |
| Here we will concern ourselves with determining the effect of an insn on |
| the condition code and will limit ourselves to the following possible |
| effects: The condition code can be set unpredictably (clobbered), not |
| be changed, be set to agree with the results of the operation, or only |
| changed if the item previously set into the condition code has been |
| modified. |
| |
| Here is part of a sample @file{md} file for such a machine: |
| |
| @smallexample |
| (define_attr "type" "load,store,arith,fp,branch" (const_string "arith")) |
| |
| (define_attr "cc" "clobber,unchanged,set,change0" |
|   (cond [(eq_attr "type" "load") |
|            (const_string "change0") |
|          (eq_attr "type" "store,branch") |
|            (const_string "unchanged") |
|          (eq_attr "type" "arith") |
|            (if_then_else (match_operand:SI 0 "" "") |
|                          (const_string "set") |
|                          (const_string "clobber"))] |
|         (const_string "clobber"))) |
| |
| (define_insn "" |
|   [(set (match_operand:SI 0 "general_operand" "=r,r,m") |
|         (match_operand:SI 1 "general_operand" "r,m,r"))] |
|   "" |
|   "@@ |
|    move %0,%1 |
|    load %0,%1 |
|    store %0,%1" |
|   [(set_attr "type" "arith,load,store")]) |
| @end smallexample |
| |
| Note that we assume in the above example that arithmetic operations |
| performed on quantities smaller than a machine word clobber the condition |
| code since they will set the condition code to a value corresponding to the |
| full-word result. |
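| |
| As a further sketch built on the sample above, the following two |
| patterns show defaulting and overriding at work. The pattern name, |
| predicates and assembler mnemonics are assumptions made for |
| illustration and do not describe any real machine. The first pattern |
| gives no attribute values, so @code{type} defaults to @code{arith} and |
| @code{cc} is then computed as @code{set}, because operand 0 has mode |
| @code{SImode}. The second is also of type @code{arith} by default, but |
| overrides the computed @code{cc} value explicitly: |
| |
| @smallexample |
| ;; No set_attr: "type" defaults to "arith", so "cc" becomes "set". |
| (define_insn "addsi3" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (plus:SI (match_operand:SI 1 "register_operand" "r") |
|                  (match_operand:SI 2 "register_operand" "r")))] |
|   "" |
|   "add %0,%1,%2") |
| |
| ;; Hypothetical address computation that leaves the condition |
| ;; code alone; the explicit set_attr overrides the default for "cc". |
| (define_insn "" |
|   [(set (match_operand:SI 0 "register_operand" "=r") |
|         (plus:SI (match_operand:SI 1 "register_operand" "r") |
|                  (match_operand:SI 2 "immediate_operand" "i")))] |
|   "" |
|   "lea %0,%2(%1)" |
|   [(set_attr "cc" "unchanged")]) |
| @end smallexample |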
| |
| @node Insn Lengths |
| @subsection Computing the Length of an Insn |
| @cindex insn lengths, computing |
| @cindex computing the length of an insn |
| |
| For many machines, multiple types of branch instructions are provided, each |
| for different length branch displacements. In most cases, the assembler |
| will choose the correct instruction to use. However, when the assembler |
| cannot do so, GCC can, provided that a special attribute, the @samp{length} |
| attribute, is defined. This attribute must be defined to have numeric |
| values by specifying a null string in its @code{define_attr}. |
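| |
| A minimal sketch of such a definition follows. The default of four |
| bytes is an assumption chosen for illustration; a real description |
| would use whatever length suits most insns of the target: |
| |
| @smallexample |
| ;; The null string in place of a value list makes "length" numeric. |
| (define_attr "length" "" (const_int 4)) |
| @end smallexample |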
| |
| In the case of the @samp{length} attribute, two additional forms of |
| arithmetic terms are allowed in test expressions: |
| |
| @table @code |
| @cindex @code{match_dup} and attributes |
| @item (match_dup @var{n}) |
| This refers to the address of operand @var{n} of the current insn, which |
| must be a @code{label_ref}. |
| |
| @cindex @code{pc} and attributes |
| @item (pc) |
| This refers to the address of the @emph{current} insn. It might have |
| been more consistent with other usage to make this the address of the |
| @emph{next} insn but this would be confusing because the length of the |
| current insn is to be computed. |
| @end table |
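| |
| As a sketch of how the two terms combine, the fragment below could |
| appear as the attribute vector of a pc-relative branch pattern whose |
| operand 0 is the target label. It selects a short or a long form |
| according to the distance from the current insn to the label. The |
| lengths and the displacement range of 128 bytes are assumptions made |
| for illustration, not a description of a real machine: |
| |
| @smallexample |
| [(set (attr "length") |
|       ;; Displacement: address of operand 0 minus address of this insn. |
|       (if_then_else |
|         (and (ge (minus (match_dup 0) (pc)) (const_int -128)) |
|              (lt (minus (match_dup 0) (pc)) (const_int 128))) |
|         (const_int 2) |
|         (const_int 4)))] |
| @end smallexample |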
| |
| @cindex @code{addr_vec}, length of |
| @cindex @code{addr_diff_vec}, length of |
| For normal insns, the length will be determined by the value of the |
| @samp{length} attribute. In the case of @code{addr_vec} and |
| @code{addr_diff_vec} insn patterns, the length is computed as |
| the number of vector elements multiplied by the size of each element. |
| |
| Lengths are measured in addressable storage units (bytes). |
| |
| The following macros can be used to refine the length computation: |
| |
| @table @code |
| @findex FIRST_INSN_ADDRESS |
| @item FIRST_INSN_ADDRESS |
| When the @code{length} insn attribute is used, this macro specifies the |
| value to be assigned to the address of the first insn in a function. If |
| not specified, 0 is used. |
| |
| @findex ADJUST_INSN_LENGTH |
| @item ADJUST_INSN_LENGTH (@var{insn}, @var{length}) |
| If defined, modifies the length assigned to instruction @var{insn} as a |
| function of the context in which it is used. @var{length} is an lvalue |
| that contains the initially computed length of the insn and should be |
| updated with the correct length of the insn. If updating is required, |
| @var{insn} must not be a varying-length insn. |
| |
| This macro will normally not be required. A case in which it is |
| required is the ROMP. On this machine, the size of an @code{addr_vec} |
| insn must be increased by two to compensate for the fact that alignment |
| may be required. |
| @end table |
| |
| @findex get_attr_length |
| The routine @code{get_attr_length}, which returns the value of the |
| @code{length} attribute, can be used by the output routine to |
| determine the form of the branch instruction to be written, as the |
| example below illustrates. |
| |
| As an example of the specification of variable-length branches, consider |
| the IBM 360. If we adopt the convention that a register will be set to |
| the starting address of a function, we can jump to labels within 4k of |
| the start using a four-byte instruction. Otherwise, we need a six-byte |
| sequence to load the address from memory and then branch to it. |
| |
| On such a machine, a pattern for a branch instruction might be specified |
| as follows: |
| |
| @smallexample |
| (define_insn "jump" |
|   [(set (pc) |
|         (label_ref (match_operand 0 "" "")))] |
|   "" |
|   "* |
| @{ |
|    return (get_attr_length (insn) == 4 |
|            ? \"b %l0\" : \"l r15,=a(%l0); br r15\"); |
| @}" |
|   [(set (attr "length") |
|         (if_then_else (lt (match_dup 0) (const_int 4096)) |
|                       (const_int 4) |
|                       (const_int 6)))]) |
| @end smallexample |
| |
| @node Constant Attributes |
| @subsection Constant Attributes |
| @cindex constant attributes |
| |
| A special form of @code{define_attr}, where the expression for the |
| default value is a @code{const} expression, indicates an attribute that |
| is constant for a given run of the compiler. Constant attributes may be |
| used to specify which variety of processor is used. For example, |
| |
| @smallexample |
| (define_attr "cpu" "m88100,m88110,m88000" |
|   (const |
|    (cond [(symbol_ref "TARGET_88100") (const_string "m88100") |
|           (symbol_ref "TARGET_88110") (const_string "m88110")] |
|          (const_string "m88000")))) |
| |
| (define_attr "memory" "fast,slow" |
|   (const |
|    (if_then_else (symbol_ref "TARGET_FAST_MEM") |
|                  (const_string "fast") |
|                  (const_string "slow")))) |
| @end smallexample |
| |
| The routine generated for constant attributes has no parameters as it |
| does not depend on any particular insn. RTL expressions used to define |
| the value of a constant attribute may use the @code{symbol_ref} form, |
| but may not use either the @code{match_operand} form or @code{eq_attr} |
| forms involving insn attributes. |
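| |
| As a sketch of how a constant attribute might be consumed, the |
| definition below makes the value of an ordinary attribute depend on |
| the @code{memory} attribute defined above. The attribute name |
| @code{load_cost} and its values are hypothetical, and a @code{type} |
| attribute with a @code{load} value, like the one in the earlier |
| attribute example, is assumed: |
| |
| @smallexample |
| ;; "memory" is constant for the run, but "load_cost" still depends |
| ;; on the insn through "type", so it is not wrapped in (const ...). |
| (define_attr "load_cost" "none,low,high" |
|   (cond [(eq_attr "type" "!load") (const_string "none") |
|          (eq_attr "memory" "fast") (const_string "low")] |
|         (const_string "high"))) |
| @end smallexample |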
| |
| @node Delay Slots |
| @subsection Delay Slot Scheduling |
| @cindex delay slots, defining |
| |
| The insn attribute mechanism can be used to specify the requirements for |
| delay slots, if any, on a target machine. An instruction is said to |
| require a @dfn{delay slot} if some instructions that are physically |
| after the instruction are executed as if they were located before it. |
| Classic examples are branch and call instructions, which often execute |
| the following instruction before the branch or call is performed. |
| |
| On some machines, conditional branch instructions can optionally |
| @dfn{annul} instructions in the delay slot. This means that the |
| instruction will not be executed for certain branch outcomes. Both |
| instructions that annul if the branch is true and instructions that |
| annul if the branch is false are supported. |
| |
| Delay slot scheduling differs from instruction scheduling in that |
| determining whether an instruction needs a delay slot is dependent only |
| on the type of instruction being generated, not on data flow between the |
| instructions. See the next section for a discussion of data-dependent |
| instruction scheduling. |
| |
| @findex define_delay |
| An insn's need for one or more delay slots is indicated |
| via the @code{define_delay} expression. It has the following form: |
| |
| @smallexample |
| (define_delay @var{test} |
| [@var{delay-1} @var{annul-true-1} @var{annul-false-1} |
| @var{delay-2} @var{annul-true-2} @var{annul-false-2} |
| @dots{}]) |
| @end smallexample |
| |
| @var{test} is an attribute test that indicates whether this |
| @code{define_delay} applies to a particular insn. If so, the number of |
| required delay slots is determined by the length of the vector specified |
| as the second argument. An insn placed in delay slot @var{n} must |
| satisfy attribute test @var{delay-n}. @var{annul-true-n} is an |
| attribute test that specifies which insns may be annulled if the branch |
| is true. Similarly, @var{annul-false-n} specifies which insns in the |
| delay slot may be annulled if the branch is false. If annulling is not |
| supported for that delay slot, @code{(nil)} should be coded.@refill |
| |
| For example, in the common case where branch and call insns require |
| a single delay slot, which may contain any insn other than a branch or |
| call, the following would be placed in the @file{md} file: |
| |
| @smallexample |
| (define_delay (eq_attr "type" "branch,call") |
| [(eq_attr "type" "!branch,call") (nil) (nil)]) |
| @end smallexample |
| |
| Multiple @code{define_delay} expressions may be specified. In this |
| case, each such expression specifies different delay slot requirements |
| and there must be no insn for which tests in two @code{define_delay} |
| expressions are both true. |
| |
| For example, if we have a machine that requires one delay slot for branches |
| but two for calls, no delay slot can contain a branch or call insn, |
| and any valid insn in the delay slot for the branch can be annulled if the |
| branch is true, we might represent this as follows: |
| |
| @smallexample |
| (define_delay (eq_attr "type" "branch") |
|   [(eq_attr "type" "!branch,call") |
|    (eq_attr "type" "!branch,call") |
|    (nil)]) |
| |
| (define_delay (eq_attr "type" "call") |
|   [(eq_attr "type" "!branch,call") (nil) (nil) |
|    (eq_attr "type" "!branch,call") (nil) (nil)]) |
| @end smallexample |
| @c the above is *still* too long. --mew 4feb93 |
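| |
| The examples above do not use the @var{annul-false} test. As a sketch, |
| assume a machine whose branches have one delay slot and can annul the |
| insn in that slot if the branch is false. The helper attribute |
| @code{in_delay_slot} is hypothetical and is introduced only so that |
| the eligibility test is written once: |
| |
| @smallexample |
| (define_attr "in_delay_slot" "no,yes" |
|   (if_then_else (eq_attr "type" "branch,call") |
|                 (const_string "no") |
|                 (const_string "yes"))) |
| |
| ;; One slot: no annulling if true, annulling allowed if false. |
| (define_delay (eq_attr "type" "branch") |
|   [(eq_attr "in_delay_slot" "yes") |
|    (nil) |
|    (eq_attr "in_delay_slot" "yes")]) |
| @end smallexample |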
| |
| @node Function Units |
| @subsection Specifying Function Units |
| @cindex function units, for scheduling |
| |
| On most RISC machines, there are instructions whose results are not |
| available for a specific number of cycles. Common cases are instructions |
| that load data from memory. On many machines, a pipeline stall will result |
| if the data is referenced too soon after the load instruction. |
| |
| In addition, many newer microprocessors have multiple function units, usually |
| one for integer and one for floating point, and often will incur pipeline |
| stalls when a result that is needed is not yet ready. |
| |
| The descriptions in this section allow the specification of how much |
| time must elapse between the execution of an instruction and the time |
| when its result is used. They also allow specification of when the |
| execution of an instruction will delay execution of similar instructions |
| due to function unit conflicts. |
| |
| For the purposes of the specifications in this section, a machine is |
| divided into @dfn{function units}, each of which executes a specific |
| class of instructions in first-in-first-out order. Function units that |
| accept one instruction each cycle and allow a result to be used in the |
| succeeding instruction (usually via forwarding) need not be specified. |
| Classic RISC microprocessors will normally have a single function unit, |
| which we can call @samp{memory}. The newer ``superscalar'' processors |
| will often have function units for floating point operations, usually at |
| least a floating point adder and multiplier. |
| |
| @findex define_function_unit |
| Each usage of a function unit by a class of insns is specified with a |
| @code{define_function_unit} expression, which looks like this: |
| |
| @smallexample |
| (define_function_unit @var{name} @var{multiplicity} @var{simultaneity} |
| @var{test} @var{ready-delay} @var{issue-delay} |
| [@var{conflict-list}]) |
| @end smallexample |
| |
| @var{name} is a string giving the name of the function unit. |
| |
| @var{multiplicity} is an integer specifying the number of identical |
| units in the processor. If more than one unit is specified, they will |
| be scheduled independently. Only truly independent units should be |
| counted; a pipelined unit should be specified as a single unit. (The |
| only common example of a machine that has multiple function units for a |
| single instruction class that are truly independent and not pipelined |
| is the CDC 6600, with its two multiply and two increment units.) |
| |
| @var{simultaneity} specifies the maximum number of insns that can be |
| executing in each instance of the function unit simultaneously or zero |
| if the unit is pipelined and has no limit. |
| |
| All @code{define_function_unit} definitions referring to function unit |
| @var{name} must specify the same values for @var{multiplicity} and |
| @var{simultaneity}. |
| |
| @var{test} is an attribute test that selects the insns we are describing |
| in this definition. Note that an insn may use more than one function |
| unit and a function unit may be specified in more than one |
| @code{define_function_unit}. |
| |
| @var{ready-delay} is an integer that specifies the number of cycles |
| after which the result of the instruction can be used without |
| introducing any stalls. |
| |
| @var{issue-delay} is an integer that specifies the number of cycles |
| after the instruction matching the @var{test} expression begins using |
| this unit until a subsequent instruction can begin. A cost of @var{N} |
| indicates an @var{N-1} cycle delay. A subsequent instruction may also |
| be delayed if an earlier instruction has a longer @var{ready-delay} |
| value. This blocking effect is computed using the @var{simultaneity}, |
| @var{ready-delay}, @var{issue-delay}, and @var{conflict-list} terms. |
| For a normal non-pipelined function unit, @var{simultaneity} is one, the |
| unit is taken to block for the @var{ready-delay} cycles of the executing |
| insn, and smaller values of @var{issue-delay} are ignored. |
| |
| @var{conflict-list} is an optional list giving detailed conflict costs |
| for this unit. If specified, it is a list of condition test expressions |
| to be applied to insns chosen to execute in @var{name} following the |
| particular insn matching @var{test} that is already executing in |
| @var{name}. For each insn in the list, @var{issue-delay} specifies the |
| conflict cost; for insns not in the list, the cost is zero. If not |
| specified, @var{conflict-list} defaults to all instructions that use the |
| function unit. |
| |
| Typical uses of this vector are where a floating point function unit can |
| pipeline either single- or double-precision operations, but not both, or |
| where a memory unit can pipeline loads, but not stores, etc. |
| |
| As an example, consider a classic RISC machine where the result of a |
| load instruction is not available for two cycles (a single ``delay'' |
| instruction is required) and where only one load instruction can be |
| executing at a time. This would be specified as: |
| |
| @smallexample |
| (define_function_unit "memory" 1 1 (eq_attr "type" "load") 2 0) |
| @end smallexample |
| |
| For the case of a floating point function unit that can pipeline either |
| single or double precision, but not both, the following could be specified: |
| |
| @smallexample |
| (define_function_unit |
| "fp" 1 0 (eq_attr "type" "sp_fp") 4 4 [(eq_attr "type" "dp_fp")]) |
| (define_function_unit |
| "fp" 1 0 (eq_attr "type" "dp_fp") 4 4 [(eq_attr "type" "sp_fp")]) |
| @end smallexample |
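| |
| The other typical use mentioned above, a memory unit that can pipeline |
| loads but not stores, might be sketched as follows, as an alternative |
| to the @samp{memory} definition given earlier. The delay values are |
| assumptions chosen for illustration, and the @code{type} values |
| @code{load} and @code{store} are those of the earlier attribute |
| example: |
| |
| @smallexample |
| ;; Loads are pipelined: ready-delay 2, issue-delay 0. |
| (define_function_unit "memory" 1 0 (eq_attr "type" "load") 2 0) |
| |
| ;; Stores are not: with no conflict-list, the issue-delay of 3 is |
| ;; the conflict cost for every later insn that uses this unit. |
| (define_function_unit "memory" 1 0 (eq_attr "type" "store") 1 3) |
| @end smallexample |
| |
| Both definitions give the same @var{multiplicity} and |
| @var{simultaneity}, as required for definitions of the same unit. |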
| |
| @strong{Note:} The scheduler attempts to avoid function unit conflicts |
| and uses all the specifications in the @code{define_function_unit} |
| expression. It has recently come to our attention that these |
| specifications may not allow modeling of some of the newer |
| ``superscalar'' processors that have insns using multiple pipelined |
| units. These insns will cause a potential conflict for the second unit |
| used during their execution and there is no way of representing that |
| conflict. We welcome any examples of how function unit conflicts work |
| in such processors and suggestions for their representation. |
| @end ifset |