File: internals,  Node: Multi-Alternative,  Next: Class Preferences,  Prev: Simple Constraints,  Up: Constraints

Multiple Alternative Constraints
--------------------------------

Sometimes a single instruction has multiple alternative sets of possible
operands.  For example, on the 68000, a logical-or instruction can combine a
register or an immediate value into memory, or it can combine any kind of
operand into a register; but it cannot combine one memory location into
another.

These constraints are represented as multiple alternatives.  An alternative
can be described by a series of letters for each operand.  The overall
constraint for an operand is made from the letters for this operand from
the first alternative, a comma, the letters for this operand from the
second alternative, a comma, and so on until the last alternative.  Here is
how it is done for fullword logical-or on the 68000:

     (define_insn "iorsi3"
       [(set (match_operand:SI 0 "general_operand" "=%m,d")
             (ior:SI (match_operand:SI 1 "general_operand" "0,0")
                     (match_operand:SI 2 "general_operand" "dKs,dmKs")))]
       ...)

The first alternative has `m' (memory) for operand 0, `0' for operand 1
(meaning it must match operand 0), and `dKs' for operand 2.  The second
alternative has `d' (data register) for operand 0, `0' for operand 1, and
`dmKs' for operand 2.  The `=' and `%' in the constraint for operand 0 are
not part of any alternative; their meaning is explained in the next section.
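
To make the comma-separated layout concrete, here is a minimal sketch in C
that pulls the letters of one alternative out of a constraint string.  The
function `constraint_alternative' is a hypothetical helper written for this
illustration; it is not part of GNU CC, and it skips only the `=' and `%'
characters mentioned above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a GNU CC function.  Copy the letters of
   alternative number WHICH (0 for the first) from CONSTRAINT into OUT.
   Return the number of letters copied, or -1 if CONSTRAINT has too few
   alternatives or OUT is too small.  */
int
constraint_alternative (const char *constraint, int which,
                        char *out, size_t outsize)
{
  const char *p = constraint;
  size_t n = 0;

  if (outsize == 0)
    return -1;

  /* Skip over WHICH commas to reach the requested alternative.  */
  while (which > 0)
    {
      p = strchr (p, ',');
      if (p == NULL)
        return -1;
      p++;
      which--;
    }

  /* Copy letters up to the next comma or the end of the string.  */
  for (; *p != '\0' && *p != ','; p++)
    {
      if (*p == '=' || *p == '%')
        continue;               /* not part of any alternative */
      if (n + 1 >= outsize)
        return -1;
      out[n++] = *p;
    }
  out[n] = '\0';
  return (int) n;
}
```

Applied to the `iorsi3' operands above, alternative 0 of `"=%m,d"' yields
`m' and alternative 1 of `"dKs,dmKs"' yields `dmKs'.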

If all the operands fit any one alternative, the instruction is valid. 
Otherwise, for each alternative, the compiler counts how many instructions
must be added to copy the operands so that that alternative applies.  The
alternative requiring the least copying is chosen.  If two alternatives
need the same amount of copying, the one that comes first is chosen.  These
choices can be altered with the `?' and `!' characters:

`?'
     Disparage slightly the alternative that the `?' appears in, as a
     choice when no alternative applies exactly.  The compiler regards this
     alternative as one unit more costly for each `?' that appears in it.

`!'
     Disparage severely the alternative that the `!' appears in.  When
     operands must be copied into registers, the compiler will never choose
     this alternative as the one to strive for.
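
The selection rule can be sketched in C as follows.  This is a simplified
model, not the compiler's actual code: the structure `alternative' and the
function `choose_alternative' are hypothetical, and the sketch assumes that
one `?' weighs the same as one added copy instruction.

```c
#include <stddef.h>

/* Hypothetical summary of one alternative for a given insn.  */
struct alternative
{
  int copies;          /* insns needed to copy operands so it applies */
  int question_marks;  /* number of `?' characters in the alternative */
  int severe;          /* nonzero if the alternative contains `!' */
};

/* Return the index of the alternative to use, or -1 if none is usable.  */
int
choose_alternative (const struct alternative *alt, size_t n)
{
  size_t i;
  int best = -1;
  int best_cost = 0;

  /* If the operands already fit some alternative, the insn is valid.  */
  for (i = 0; i < n; i++)
    if (alt[i].copies == 0)
      return (int) i;

  /* Otherwise pick the cheapest alternative; `!' ones are never a goal.  */
  for (i = 0; i < n; i++)
    {
      int cost;

      if (alt[i].severe)
        continue;
      cost = alt[i].copies + alt[i].question_marks;
      /* Strict `<' keeps the earlier alternative when costs are equal.  */
      if (best < 0 || cost < best_cost)
        {
          best = (int) i;
          best_cost = cost;
        }
    }
  return best;
}
```

With two alternatives that each need one copy, the first is chosen; adding
a `?' to the first makes the second win.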

When an insn pattern has multiple alternatives in its constraints, often
the appearance of the assembler code is determined mostly by which
alternative was matched.  When this is so, the C code for writing the
assembler code
can use the variable `which_alternative', which is the ordinal number of
the alternative that was actually satisfied (0 for the first, 1 for the
second alternative, etc.).  For example:

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "r,m")
             (const_int 0))]
       ""
       "*
       return (which_alternative == 0
               ? \"clrreg %0\" : \"clrmem %0\");
       ")


File: internals,  Node: Class Preferences,  Next: Modifiers,  Prev: Multi-Alternative,  Up: Constraints

Register Class Preferences
--------------------------

The operand constraints have another function: they enable the compiler to
decide which kind of hardware register a pseudo register is best allocated
to.  The compiler examines the constraints that apply to the insns that use
the pseudo register, looking for the machine-dependent letters such as `d'
and `a' that specify classes of registers.  The pseudo register is put in
whichever class gets the most ``votes''.  The constraint letters `g' and
`r' also vote: they vote in favor of a general register.  The machine
description says which registers are considered general.

Of course, on some machines all registers are equivalent, and no register
classes are defined.  Then none of this complexity is relevant.
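
The voting scheme can be illustrated with a short C sketch.  It is only a
model: the enumeration, the letter-to-class mapping (the 68000 letters `d'
and `a' plus `g' and `r') and the tie-breaking rule (the earlier class
wins) are assumptions made for this example, and the real compiler does
more bookkeeping.

```c
#include <stddef.h>

/* Hypothetical register classes for the example.  */
enum reg_class { DATA_REGS, ADDR_REGS, GENERAL_REGS, N_CLASSES };

/* Map a constraint letter to the class it votes for, or -1 if the
   letter does not name a register class.  */
static int
class_for_letter (char c)
{
  switch (c)
    {
    case 'd': return DATA_REGS;
    case 'a': return ADDR_REGS;
    case 'g':
    case 'r': return GENERAL_REGS;
    default:  return -1;
    }
}

/* CONSTRAINTS holds the N constraint strings that apply to one pseudo
   register.  Return the class with the most votes; with no votes at
   all the result is simply the first class.  */
enum reg_class
preferred_class (const char *const *constraints, size_t n)
{
  int votes[N_CLASSES] = { 0 };
  size_t i;
  int c, best = 0;

  for (i = 0; i < n; i++)
    {
      const char *p;

      for (p = constraints[i]; *p != '\0'; p++)
        {
          int cls = class_for_letter (*p);

          if (cls >= 0)
            votes[cls]++;
        }
    }

  for (c = 1; c < N_CLASSES; c++)
    if (votes[c] > votes[best])
      best = c;
  return (enum reg_class) best;
}
```

For a pseudo register used under the constraints `d', `dm' and `a', the
data registers collect two votes against one and are preferred.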


File: internals,  Node: Modifiers,  Next: No Constraints,  Prev: Class Preferences,  Up: Constraints

Constraint Modifier Characters
------------------------------

`='
     Means that this operand is write-only for this instruction: the
     previous value is discarded and replaced by output data.

`+'
     Means that this operand is both read and written by the instruction.

     When the compiler fixes up the operands to satisfy the constraints, it
     needs to know which operands are inputs to the instruction and which
     are outputs from it.  `=' identifies an output; `+' identifies an
     operand that is both input and output; all other operands are assumed
     to be input only.

`&'
     Means (in a particular alternative) that this operand is written
     before the instruction is finished using the input operands. 
     Therefore, this operand may not lie in a register that is used as an
     input operand or as part of any memory address.

     `&' applies only to the alternative in which it is written.  In
     constraints with multiple alternatives, sometimes one alternative
     requires `&' while others do not.  See, for example, the `movdf' insn
     of the 68000.

     `&' does not obviate the need to write `='.

`%'
     Declares the instruction to be commutative for this operand and the
     following operand.  This means that the compiler may interchange the
     two operands if that is the cheapest way to make all operands fit the
     constraints.  This is often used in patterns for addition instructions
     that really have only two operands: the result must go in one of the
     arguments.  Here, for example, is how the 68000 halfword-add
     instruction is defined:

          (define_insn "addhi3"
            [(set (match_operand:HI 0 "general_operand" "=m,r")
               (plus:HI (match_operand:HI 1 "general_operand" "%0,0")
                        (match_operand:HI 2 "general_operand" "di,g")))]
            ...)

     Note that in previous versions of GNU CC the `%' constraint modifier
     always applied to operands 1 and 2 regardless of which operand it was
     written in.  The usual custom was to write it in operand 0.  Now it
     must be in operand 1 if the operands to be exchanged are 1 and 2.

`#'
     Says that all following characters, up to the next comma, are to be
     ignored as a constraint.  They are significant only for choosing
     register preferences.

`*'
     Says that the following character should be ignored when choosing
     register preferences.  `*' has no effect on the meaning of the
     constraint as a constraint.

     Here is an example: the 68000 has an instruction to sign-extend a
     halfword in a data register, and can also sign-extend a value by
     copying it into an address register.  While either kind of register is
     acceptable, the constraints on an address-register destination are
     less strict, so it is best if register allocation makes an address
     register its goal.  Therefore, `*' is used so that the `d' constraint
     letter (for data register) is ignored when computing register
     preferences.

          (define_insn "extendhisi2"
            [(set (match_operand:SI 0 "general_operand" "=*d,a")
                  (sign_extend:SI
                   (match_operand:HI 1 "general_operand" "0,g")))]
            ...)


File: internals,  Node: No Constraints,  Prev: Modifiers,  Up: Constraints

Not Using Constraints
---------------------

Some machines are so clean that operand constraints are not required.  For
example, on the Vax, an operand valid in one context is valid in any other
context.  On such a machine, every operand constraint would be `g',
excepting only operands of ``load address'' instructions, which are written
as if they referred to a memory location's contents but actually refer to
its address.  They would have constraint `p'.

For such machines, instead of writing `g' and `p' for all the constraints,
you can choose to write a description with empty constraints.  Then you
write `""' for the constraint in every `match_operand'.  Address operands
are identified by writing an `address' expression around the
`match_operand', not by their constraints.

When the machine description has just empty constraints, certain parts of
compilation are skipped, making the compiler faster.


File: internals,  Node: Standard Names,  Next: Pattern Ordering,  Prev: Constraints,  Up: Machine Desc

Standard Names for Patterns Used in Generation
==============================================

Here is a table of the instruction names that are meaningful in the RTL
generation pass of the compiler.  Giving one of these names to an
instruction pattern tells the RTL generation pass that it can use the
pattern to accomplish a certain task.

`movM'
     Here M is a two-letter machine mode name, in lower case.  This
     instruction pattern moves data with that machine mode from operand 1
     to operand 0.  For example, `movsi' moves full-word data.

     If operand 0 is a `subreg' with mode M of a register whose natural
     mode is wider than M, the effect of this instruction is to store the
     specified value in the part of the register that corresponds to mode
     M.  The effect on the rest of the register is undefined.

     This class of patterns is special in several ways.  First of all, each
     of these names *must* be defined, because there is no other way to
     copy a datum from one place to another.

     Second, these patterns are not used solely in the RTL generation
     pass.  Even the reload pass can generate move insns to copy values from
     stack slots into temporary registers.  When it does so, one of the
     operands is a hard register and the other is an operand that can have
     a reload.

     Therefore, when given such a pair of operands, the pattern must
     generate RTL which needs no temporary registers---no registers other
     than the operands.  For example, if you support the pattern with a
     `define_expand', then in such a case you mustn't call `force_reg' or
     any other such function which might generate new pseudo registers.

     This requirement exists even for subword modes on a RISC machine where
     fetching those modes from memory normally requires several insns and
     some temporary registers.  Look in `spur.md' to see how the
     requirement is satisfied.

     The variety of operands that have reloads depends on the rest of the
     machine description, but typically on a RISC machine these can only be
     pseudo registers that did not get hard registers, while on other
     machines explicit memory references will get optional reloads.

`movstrictM'
     Like `movM' except that if operand 0 is a `subreg' with mode M of a
     register whose natural mode is wider, the `movstrictM' instruction is
     guaranteed not to alter any of the register except the part which
     belongs to mode M.

`addM3'
     Add operand 2 and operand 1, storing the result in operand 0.  All
     operands must have mode M.  This can be used even on two-address
     machines, by means of constraints requiring operands 1 and 0 to be the
     same location.

`subM3', `mulM3', `umulM3', `divM3', `udivM3', `modM3', `umodM3', `andM3', `iorM3', `xorM3'
     Similar, for other arithmetic operations.

`andcbM3'
     Bitwise logical-and operand 1 with the complement of operand 2 and
     store the result in operand 0.

`mulhisi3'
     Multiply operands 1 and 2, which have mode `HImode', and store a
     `SImode' product in operand 0.

`mulqihi3', `mulsidi3'
     Similar widening-multiplication instructions of other widths.

`umulqihi3', `umulhisi3', `umulsidi3'
     Similar widening-multiplication instructions that do unsigned
     multiplication.

`divmodM4'
     Signed division that produces both a quotient and a remainder. 
     Operand 1 is divided by operand 2 to produce a quotient stored in
     operand 0 and a remainder stored in operand 3.

`udivmodM4'
     Similar, but does unsigned division.

`divmodMN4'
     Like `divmodM4' except that only the dividend has mode M; the divisor,
     quotient and remainder have mode N.  For example, the Vax has a
     `divmoddisi4' instruction (but it is omitted from the machine
     description, because it is so slow that it is faster to compute
     remainders by the circumlocution that the compiler will use if this
     instruction is not available).

`ashlM3'
     Arithmetic-shift operand 1 left by a number of bits specified by
     operand 2, and store the result in operand 0.  Operand 2 has mode
     `SImode', not mode M.

`ashrM3', `lshlM3', `lshrM3', `rotlM3', `rotrM3'
     Other shift and rotate instructions.

     Logical and arithmetic left shift are the same.  Machines that do not
     allow negative shift counts often have only one instruction for
     shifting left.  On such machines, you should define a pattern named
     `ashlM3' and leave `lshlM3' undefined.

`negM2'
     Negate operand 1 and store the result in operand 0.

`absM2'
     Store the absolute value of operand 1 into operand 0.

`sqrtM2'
     Store the square root of operand 1 into operand 0.

`ffsM2'
     Store into operand 0 one plus the index of the least significant 1-bit
     of operand 1.  If operand 1 is zero, store zero.  M is the mode of
     operand 0; operand 1's mode is specified by the instruction pattern,
     and the compiler will convert the operand to that mode before
     generating the instruction.

`one_cmplM2'
     Store the bitwise-complement of operand 1 into operand 0.

`cmpM'
     Compare operand 0 and operand 1, and set the condition codes.  The RTL
     pattern should look like this:

          (set (cc0) (minus (match_operand:M 0 ...)
                            (match_operand:M 1 ...)))

     Each such definition in the machine description, for integer mode M,
     must have a corresponding `tstM' pattern, because optimization can
     simplify the compare into a test when operand 1 is zero.

`tstM'
     Compare operand 0 against zero, and set the condition codes.  The RTL
     pattern should look like this:

          (set (cc0) (match_operand:M 0 ...))

`movstrM'
     Block move instruction.  The addresses of the destination and source
     strings are the first two operands, and both are in mode `Pmode'.  The
     number of bytes to move is the third operand, in mode M.

`cmpstrM'
     Block compare instruction, with operands like `movstrM' except that
     the two memory blocks are compared byte by byte in lexicographic
     order.  The effect of the instruction is to set the condition codes.

`floatMN2'
     Convert operand 1 (valid for fixed point mode M) to floating point
     mode N and store in operand 0 (which has mode N).

`fixMN2'
     Convert operand 1 (valid for floating point mode M) to fixed point
     mode N as a signed number and store in operand 0 (which has mode N).
     This instruction's result is defined only when the value of operand 1
     is an integer.

`fixunsMN2'
     Convert operand 1 (valid for floating point mode M) to fixed point
     mode N as an unsigned number and store in operand 0 (which has mode
     N).  This instruction's result is defined only when the value of
     operand 1 is an integer.

`ftruncM2'
     Convert operand 1 (valid for floating point mode M) to an integer
     value, still represented in floating point mode M, and store it in
     operand 0 (valid for floating point mode M).

`fix_truncMN2'
     Like `fixMN2' but works for any floating point value of mode M by
     converting the value to an integer.

`fixuns_truncMN2'
     Like `fixunsMN2' but works for any floating point value of mode M by
     converting the value to an integer.

`truncMN'
     Truncate operand 1 (valid for mode M) to mode N and store in operand 0
     (which has mode N).  Both modes must be fixed point or both floating
     point.

`extendMN'
     Sign-extend operand 1 (valid for mode M) to mode N and store in
     operand 0 (which has mode N).  Both modes must be fixed point or both
     floating point.

`zero_extendMN'
     Zero-extend operand 1 (valid for mode M) to mode N and store in
     operand 0 (which has mode N).  Both modes must be fixed point.

`extv'
     Extract a bit-field from operand 1 (a register or memory operand),
     where operand 2 specifies the width in bits and operand 3 the starting
     bit, and store it in operand 0.  Operand 0 must have `SImode'.
     Operand 1 may have mode `QImode' or `SImode'; often `SImode' is
     allowed only for registers.  Operands 2 and 3 must be valid for
     `SImode'.

     The RTL generation pass generates this instruction only with constants
     for operands 2 and 3.

     The bit-field value is sign-extended to a full word integer before it
     is stored in operand 0.

`extzv'
     Like `extv' except that the bit-field value is zero-extended.

`insv'
     Store operand 3 (which must be valid for `SImode') into a bit-field in
     operand 0, where operand 1 specifies the width in bits and operand 2
     the starting bit.  Operand 0 may have mode `QImode' or `SImode'; often
     `SImode' is allowed only for registers.  Operands 1 and 2 must be
     valid for `SImode'.

     The RTL generation pass generates this instruction only with constants
     for operands 1 and 2.

`sCOND'
     Store zero or nonzero in the operand according to the condition
     codes.  The value stored is nonzero iff the condition COND is true.
     COND is the name of a comparison operation expression code, such as
     `eq', `lt' or `leu'.

     You specify the mode that the operand must have when you write the
     `match_operand' expression.  The compiler automatically sees which
     mode you have used and supplies an operand of that mode.

     The value stored for a true condition must have 1 as its low bit. 
     Otherwise the instruction is not suitable and must be omitted from the
     machine description.  You must tell the compiler exactly which value
     is stored by defining the macro `STORE_FLAG_VALUE'.

`bCOND'
     Conditional branch instruction.  Operand 0 is a `label_ref' that
     refers to the label to jump to.  Jump if the condition codes meet
     condition COND.

`call'
     Subroutine call instruction.  Operand 1 is the number of bytes of
     arguments pushed (in mode `SImode'), and operand 0 is the function to
     call.  Operand 0 should be a `mem' RTX whose address is the address of
     the function.

`return'
     Subroutine return instruction.  This instruction pattern name should
     be defined only if a single instruction can do all the work of
     returning from a function.

`casesi'
     Instruction to jump through a dispatch table, including bounds
     checking.  This instruction takes five operands:

       1. The index to dispatch on, which has mode `SImode'.

       2. The lower bound for indices in the table, an integer constant.

       3. The upper bound for indices in the table, an integer constant.

       4. A label to jump to if the index has a value outside the bounds.  (If the
          machine-description macro `CASE_DROPS_THROUGH' is defined, then
          an out-of-bounds index drops through to the code following the
          jump table instead of jumping to this label.  In that case, this
          label is not actually used by the `casesi' instruction, but it is
          always provided as an operand.)

       5. A label that precedes the table itself.

     The table is an `addr_vec' or `addr_diff_vec' inside of a `jump_insn'.
     The number of elements in the table is one plus the difference between
     the upper bound and the lower bound.

`tablejump'
     Instruction to jump to a variable address.  This is a low-level
     capability which can be used to implement a dispatch table when there
     is no `casesi' pattern.

     This pattern requires two operands: the address or offset, and a label
     which should immediately precede the jump table.  If the macro
     `CASE_VECTOR_PC_RELATIVE' is defined then the first operand is an
     offset which counts from the address of the table; otherwise, it is
     an absolute address to jump to.

     The `tablejump' insn is always the last insn before the jump table it
     uses.  Its assembler code normally has no need to use the second
     operand, but you should incorporate it in the RTL pattern so that the
     jump optimizer will not delete the table as unreachable code.
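
The arithmetic behind two of the entries above, `ffsM2' and `casesi', can
be modeled in plain C.  This is only a sketch of the semantics described in
the table; the function names are hypothetical, and the case in which
`CASE_DROPS_THROUGH' is defined is not modeled.

```c
/* `ffsM2': one plus the index of the least significant 1-bit of X,
   or zero when X is zero.  */
int
ffs_model (unsigned long x)
{
  int pos = 1;

  if (x == 0)
    return 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      pos++;
    }
  return pos;
}

/* `casesi': number of elements in the dispatch table.  */
long
casesi_table_size (long lower, long upper)
{
  return upper - lower + 1;
}

/* `casesi': return the table slot selected by INDEX, or -1 when INDEX
   is out of bounds and the out-of-bounds label is taken instead.  */
long
casesi_slot (long index, long lower, long upper)
{
  if (index < lower || index > upper)
    return -1;
  return index - lower;
}
```

For bounds 3 and 7 the table has five elements, index 7 selects slot 4, and
index 8 falls outside the table.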


File: internals,  Node: Pattern Ordering,  Next: Dependent Patterns,  Prev: Standard Names,  Up: Machine Desc

When the Order of Patterns Matters
==================================

Sometimes an insn can match more than one instruction pattern.  Then the
pattern that appears first in the machine description is the one used. 
Therefore, more specific patterns (patterns that will match fewer things)
and faster instructions (those that will produce better code when they do
match) should usually go first in the description.

In some cases the effect of ordering the patterns can be used to hide a
pattern when it is not valid.  For example, the 68000 has an instruction
for converting a fullword to floating point and another for converting a
byte to floating point.  An instruction converting an integer to floating
point could match either one.  We put the pattern to convert the fullword
first to make sure that one will be used rather than the other.  (Otherwise
a large integer might be generated as a single-byte immediate quantity,
which would not work.) Instead of using this pattern ordering it would be
possible to make the pattern for convert-a-byte smart enough to deal
properly with any constant value.
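
The first-match rule can be pictured with a small C sketch.  It is a toy
model rather than the recognizer GNU CC generates: the structure `pattern',
the function `recognize' and the pattern names used in the note below are
hypothetical.

```c
#include <stddef.h>

/* Hypothetical description of one instruction pattern.  */
struct pattern
{
  const char *name;
  int (*matches) (long constant);   /* nonzero if the pattern accepts it */
};

/* A predicate that accepts every integer constant, as both conversion
   patterns in the text do.  */
int
any_constant (long c)
{
  (void) c;
  return 1;
}

/* Return the name of the first pattern in TABLE that matches CONSTANT,
   or NULL if none does.  Later entries are never consulted once an
   earlier one has matched.  */
const char *
recognize (const struct pattern *table, size_t n, long constant)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (table[i].matches (constant))
      return table[i].name;
  return NULL;
}
```

If a table lists a fullword conversion pattern before a byte conversion
pattern and both accept any constant, `recognize' always returns the
fullword one; reversing the table reverses the outcome.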


File: internals,  Node: Dependent Patterns,  Next: Jump Patterns,  Prev: Pattern Ordering,  Up: Machine Desc

Interdependence of Patterns
===========================

Every machine description must have a named pattern for each of the
conditional branch names `bCOND'.  The recognition template must always
have the form

     (set (pc)
          (if_then_else (COND (cc0) (const_int 0))
                        (label_ref (match_operand 0 "" ""))
                        (pc)))

In addition, every machine description must have an anonymous pattern for
each of the possible reverse-conditional branches.  These patterns look like

     (set (pc)
          (if_then_else (COND (cc0) (const_int 0))
                        (pc)
                        (label_ref (match_operand 0 "" ""))))

They are necessary because jump optimization can turn direct-conditional
branches into reverse-conditional branches.

The compiler does more with RTL than just create it from patterns and
recognize the patterns: it can evaluate arithmetic expression codes when
constant values for their operands can be determined.  As a result,
sometimes having one pattern can require other patterns.  For example, the
Vax has no `and' instruction, but it has `and not' instructions.  Here is
the definition of one of them:

     (define_insn "andcbsi2"
       [(set (match_operand:SI 0 "general_operand" "")
             (and:SI (match_dup 0)
                     (not:SI (match_operand:SI
                               1 "general_operand" ""))))]
       ""
       "bicl2 %1,%0")

If operand 1 is an explicit integer constant, an instruction constructed
using that pattern can be simplified into an `and' like this:

     (set (reg:SI 41)
          (and:SI (reg:SI 41)
                  (const_int 0xffff7fff)))

(where the integer constant is the one's complement of what appeared in the
original instruction).
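
The folding can be checked with ordinary C arithmetic.  In a 32-bit word
the constant 0xffff7fff is the one's complement of 0x8000, so an `and not'
with 0x8000 and an `and' with 0xffff7fff give the same result.  The two
function names below are hypothetical and serve only this illustration.

```c
#include <stdint.h>

/* What an `and not' instruction computes: clear in DEST the bits
   that are set in MASK.  */
uint32_t
and_not (uint32_t dest, uint32_t mask)
{
  return dest & ~mask;
}

/* The simplified form after the complement has been folded into the
   constant: a plain `and'.  */
uint32_t
and_const (uint32_t dest, uint32_t folded)
{
  return dest & folded;
}
```

For any value `x', `and_not (x, 0x8000)' equals `and_const (x, 0xffff7fff)'.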

To avoid a fatal error, the compiler must have a pattern that recognizes
such an instruction.  Here is what is used:

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "")
             (and:SI (match_dup 0)
                     (match_operand:SI 1 "general_operand" "")))]
       "GET_CODE (operands[1]) == CONST_INT"
       "*
     { operands[1]
         = gen_rtx (CONST_INT, VOIDmode, ~INTVAL (operands[1]));
       return \"bicl2 %1,%0\";
     }")

Whereas a pattern to match a general `and' instruction is impossible to
support on the Vax, this pattern is possible because it matches only a
constant second argument: a special case that can be output as an `and not'
instruction.

A ``compare'' instruction whose RTL looks like this:

     (set (cc0) (minus OPERAND (const_int 0)))

may be simplified by optimization into a ``test'' like this:

     (set (cc0) OPERAND)

So in the machine description, each ``compare'' pattern for an integer mode
must have a corresponding ``test'' pattern that will match the result of
such simplification.

In some cases machines support instructions identical except for the
machine mode of one or more operands.  For example, there may be
``sign-extend halfword'' and ``sign-extend byte'' instructions whose
patterns are

     (set (match_operand:SI 0 ...)
          (extend:SI (match_operand:HI 1 ...)))
     
     (set (match_operand:SI 0 ...)
          (extend:SI (match_operand:QI 1 ...)))

Constant integers do not specify a machine mode, so an instruction to
extend a constant value could match either pattern.  The pattern it
actually will match is the one that appears first in the file.  For correct
results, this must be the one for the widest possible mode (`HImode',
here).  If the pattern matches the `QImode' instruction, the results will
be incorrect if the constant value does not actually fit that mode.

Such instructions to extend constants are rarely generated because they are
optimized away, but they do occasionally happen in nonoptimized compilations.
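
A short C sketch shows why the wider mode must come first.  The function
`sign_extend' is a hypothetical model of the extension, assuming 32-bit
results and a source width of 1 to 31 bits.

```c
#include <stdint.h>

/* Sign-extend the low BITS bits of VALUE to a 32-bit integer.
   BITS must be between 1 and 31.  */
int32_t
sign_extend (uint32_t value, int bits)
{
  uint32_t sign = UINT32_C (1) << (bits - 1);
  uint32_t mask = (UINT32_C (1) << bits) - 1;
  uint32_t low = value & mask;

  /* Flipping the sign bit and then subtracting it propagates the sign
     without relying on implementation-defined conversions.  */
  return (int32_t) (low ^ sign) - (int32_t) sign;
}
```

Extending the constant 300 from 16 bits gives 300, but extending it from 8
bits gives 44, because only the low byte 0x2c survives.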


File: internals,  Node: Jump Patterns,  Next: Peephole Definitions,  Prev: Dependent Patterns,  Up: Machine Desc

Defining Jump Instruction Patterns
==================================

GNU CC assumes that the machine has a condition code.  A comparison insn
sets the condition code, recording the results of both signed and unsigned
comparison of the given operands.  A separate branch insn tests the
condition code and branches or not according to its value.  The branch insns
come in distinct signed and unsigned flavors.  Many common machines, such
as the Vax, the 68000 and the 32000, work this way.

Some machines have distinct signed and unsigned compare instructions, and
only one set of conditional branch instructions.  The easiest way to handle
these machines is to treat them just like the others until the final stage
where assembly code is written.  At this time, when outputting code for the
compare instruction, peek ahead at the following branch using `NEXT_INSN
(insn)'.  (The variable `insn' refers to the insn being output, in the
output-writing code in an instruction pattern.)  If the RTL says that it is
unsigned branch, output an unsigned compare; otherwise output a signed
compare.  When the branch itself is output, you can treat signed and
unsigned branches identically.

The reason you can do this is that GNU CC always generates a pair of
consecutive RTL insns, one to set the condition code and one to test it,
and keeps the pair inviolate until the end.

To go with this technique, you must define the machine-description macro
`NOTICE_UPDATE_CC' to do `CC_STATUS_INIT'; in other words, no compare
instruction is superfluous.

Some machines have compare-and-branch instructions and no condition code. 
A similar technique works for them.  When it is time to ``output'' a
compare instruction, record its operands in two static variables.  When
outputting the branch-on-condition-code instruction that follows, actually
output a compare-and-branch instruction that uses the remembered operands.
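
The remember-then-emit technique might look like this in outline.  The
sketch works on plain strings instead of RTL operands, and the function
names and the `cbCOND' mnemonic are invented for the illustration; a real
machine description would use its own output routines and assembler syntax.

```c
#include <stdio.h>

/* Operands remembered from the most recent compare.  */
static const char *cmp_op0;
static const char *cmp_op1;

/* Called when it is time to ``output'' a compare: emit nothing, just
   record the operands for the branch that follows.  */
void
output_compare (const char *op0, const char *op1)
{
  cmp_op0 = op0;
  cmp_op1 = op1;
}

/* Called for the branch: write one compare-and-branch instruction into
   BUF using the remembered operands.  output_compare must have been
   called first.  Returns what snprintf returns.  */
int
output_branch (char *buf, size_t size, const char *cond, const char *label)
{
  return snprintf (buf, size, "cb%s %s,%s,%s",
                   cond, cmp_op0, cmp_op1, label);
}
```

Calling `output_compare ("r1", "r2")' and then `output_branch' with
condition `eq' and label `L5' produces the single line `cbeq r1,r2,L5'.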

It also works to define patterns for compare-and-branch instructions.  In
optimizing compilation, the pair of compare and branch instructions will be
combined according to these patterns.  But this does not happen if
optimization is not requested.  So you must use one of the solutions above
in addition to any special patterns you define.


File: internals,  Node: Peephole Definitions,  Next: Expander Definitions,  Prev: Jump Patterns,  Up: Machine Desc

Defining Machine-Specific Peephole Optimizers
=============================================

In addition to instruction patterns the `md' file may contain definitions
of machine-specific peephole optimizations.

The combiner does not notice certain peephole optimizations when the data
flow in the program does not suggest that it should try them.  For example,
sometimes two consecutive insns related in purpose can be combined even
though the second one does not appear to use a register computed in the
first one.  A machine-specific peephole optimizer can detect such
opportunities.

A definition looks like this:

     (define_peephole
       [INSN-PATTERN-1
        INSN-PATTERN-2
        ...]
       "CONDITION"
       "TEMPLATE")

In this skeleton, INSN-PATTERN-1 and so on are patterns to match
consecutive instructions.  The optimization applies to a sequence of
instructions when INSN-PATTERN-1 matches the first one, INSN-PATTERN-2
matches the next, and so on.

INSN-PATTERN-1 and so on look *almost* like the second operand of
`define_insn'.  There is one important difference: this pattern is an RTX,
not a vector.  If the `define_insn' pattern would be a vector of one
element, the INSN-PATTERN should be just that element, no vector.  If the
`define_insn' pattern would have multiple elements then the INSN-PATTERN
must place the vector inside an explicit `parallel' RTX.

The operands of the instructions are matched with `match_operand' and
`match_dup', as usual.  What is not usual is that the operand numbers
apply to all the instruction patterns in the definition.  So, you can check
for identical operands in two instructions by using `match_operand' in one
instruction and `match_dup' in the other.

The operand constraints used in `match_operand' patterns do not have any
direct effect on the applicability of the optimization, but they will be
validated afterward, so write constraints that are sure to fit whenever the
optimization is applied.  It is safe to use `"g"' for each operand.

Once a sequence of instructions matches the patterns, the CONDITION is
checked.  This is a C expression which makes the final decision whether to
perform the optimization (do so if the expression is nonzero).  If
CONDITION is omitted (in other words, the string is empty) then the
optimization is applied to every sequence of instructions that matches the
patterns.

The defined peephole optimizations are applied after register allocation is
complete.  Therefore, the optimizer can check which operands have ended up
in which kinds of registers, just by looking at the operands.

The way to refer to the operands in CONDITION is to write `operands[I]' for
operand number I (as matched by `(match_operand I ...)').  Use the variable
`insn' to refer to the last of the insns being matched; use `PREV_INSN' to
find the preceding insns (but be careful to skip over any `note' insns that
intervene).

When optimizing computations with intermediate results, you can use
CONDITION to match only when the intermediate results are not used
elsewhere.  Use the C expression `dead_or_set_p (INSN, OP)', where INSN is
the insn in which you expect the value to be used for the last time (from
the value of `insn', together with use of `PREV_INSN'), and OP is the
intermediate value (from `operands[I]').

Applying the optimization means replacing the sequence of instructions with
one new instruction.  The TEMPLATE controls ultimate output of assembler
code for this combined instruction.  It works exactly like the template of
a `define_insn'.  Operand numbers in this template are the same ones used
in matching the original sequence of instructions.

The result of a defined peephole optimizer does not need to match any of
the instruction patterns, and it does not have an opportunity to match
them.  The peephole optimizer definition itself serves as the instruction
pattern to control how the instruction is output.

Defined peephole optimizers are run in the last jump optimization pass, so
the instructions they produce are never combined or rearranged
automatically in any way.

Here is an example, taken from the 68000 machine description:

     (define_peephole
       [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4)))
        (set (match_operand:DF 0 "register_operand" "f")
             (match_operand:DF 1 "register_operand" "ad"))]
       "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])"
       "*
     {
       rtx xoperands[2];
       xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1);
     #ifdef MOTOROLA
       output_asm_insn (\"move.l %1,(sp)\", xoperands);
       output_asm_insn (\"move.l %1,-(sp)\", operands);
       return \"fmove.d (sp)+,%0\";
     #else
       output_asm_insn (\"movel %1,sp@\", xoperands);
       output_asm_insn (\"movel %1,sp@-\", operands);
       return \"fmoved sp@+,%0\";
     #endif
     }
     ")

The effect of this optimization is to change

     jbsr _foobar
     addql #4,sp
     movel d1,sp@-
     movel d0,sp@-
     fmoved sp@+,fp0

into

     jbsr _foobar
     movel d1,sp@
     movel d0,sp@-
     fmoved sp@+,fp0


File: internals,  Node: Expander Definitions,  Prev: Peephole Definitions,  Up: Machine Desc

Defining RTL Sequences for Code Generation
==========================================

On some target machines, some standard pattern names for RTL generation
cannot be handled with a single insn, but a sequence of RTL insns can
represent them.  For these target machines, you can write a `define_expand'
to specify how to generate the sequence of RTL.

A `define_expand' is an RTL expression that looks almost like a
`define_insn'; but, unlike the latter, a `define_expand' is used only for
RTL generation and it can produce more than one RTL insn.

A `define_expand' RTX has four operands:

   * The name.  Each `define_expand' must have a name, since the only use
     for it is to refer to it by name.

   * The RTL template.  This is just like the RTL template for a
     `define_peephole' in that it is a vector of RTL expressions each being
     one insn.

   * The condition, a string containing a C expression.  This expression is
     used to express how the availability of this pattern depends on
     subclasses of target machine, selected by command-line options when
     GNU CC is run.  This is just like the condition of a `define_insn'
     that has a standard name.

   * The preparation statements, a string containing zero or more C
     statements which are to be executed before RTL code is generated from
     the RTL template.

     Usually these statements prepare temporary registers for use as
     internal operands in the RTL template, but they can also generate RTL
     insns directly by calling routines such as `emit_insn', etc.  Any such
     insns precede the ones that come from the RTL template.

The RTL template, in addition to controlling generation of RTL insns, also
describes the operands that need to be specified when this pattern is used.
In particular, it gives a predicate for each operand.

A true operand, which needs to be specified in order to generate RTL from
the pattern, should be described with a `match_operand' in its first
occurrence in the RTL template.  This enters information on the operand's
predicate into the tables that record such things.  GNU CC uses the
information to preload the operand into a register if that is required for
valid RTL code.  If the operand is referred to more than once, subsequent
references should use `match_dup'.

The RTL template may also refer to internal ``operands'' which are
temporary registers or labels used only within the sequence made by the
`define_expand'.  Internal operands are substituted into the RTL template
with `match_dup', never with `match_operand'.  The values of the internal
operands are not passed in as arguments by the compiler when it requests
use of this pattern.  Instead, they are computed within the pattern, in the
preparation statements.  These statements compute the values and store them
into the appropriate elements of `operands' so that `match_dup' can find
them.

There are two special macros defined for use in the preparation statements:
`DONE' and `FAIL'.  Use them with a following semicolon, as a statement.

`DONE'
     Use the `DONE' macro to end RTL generation for the pattern.  The only
     RTL insns resulting from the pattern on this occasion will be those
     already emitted by explicit calls to `emit_insn' within the
     preparation statements; the RTL template will not be generated.

`FAIL'
     Make the pattern fail on this occasion.  When a pattern fails, it
     means that the pattern was not truly available.  The calling routines
     in the compiler will try other strategies for code generation using
     other patterns.

     Failure is currently supported only for binary operations (addition,
     multiplication, shifting, etc.).

     Do not emit any insns explicitly with `emit_insn' before failing.

Here is an example, the definition of left-shift for the SPUR chip:

     (define_expand "ashlsi3"
       [(set (match_operand:SI 0 "register_operand" "")
             (ashift:SI
               (match_operand:SI 1 "register_operand" "")
               (match_operand:SI 2 "nonmemory_operand" "")))]
       ""
       "
     {
       if (GET_CODE (operands[2]) != CONST_INT
           || (unsigned) INTVAL (operands[2]) > 3)
         FAIL;
     }")

This example uses `define_expand' so that it can generate an RTL insn for
shifting when the shift-count is in the supported range of 0 to 3 but fail
in other cases where machine insns aren't available.  When it fails, the
compiler tries another strategy using different patterns (such as a
library call).
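
The `(unsigned)' cast in the preparation statement is what lets a single
comparison reject negative shift counts as well as counts above 3: a
negative `int' converts to a very large unsigned value.  The following
stand-alone sketch shows only that range test, not the `CONST_INT' check;
the name `shift_count_ok' is invented for illustration and is not part of
GNU CC.

```c
/* Illustration only: `shift_count_ok' is a hypothetical helper, not a
   GNU CC function.  It mirrors the range test in the preparation
   statement: the pattern would FAIL exactly when this returns 0.  */
int
shift_count_ok (int count)
{
  /* A negative count converts to a large unsigned value, so one
     comparison covers both ends of the range 0 to 3.  */
  return (unsigned) count <= 3;
}
```

Counts of 0 through 3 are accepted; 4 and -1 are both rejected.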

If the compiler were able to handle nontrivial condition-strings in
patterns with names, then it would be possible to use a `define_insn' in
that case.  Here is another case (zero-extension on the 68000) which makes
more use of the power of `define_expand':

     (define_expand "zero_extendhisi2"
       [(set (match_operand:SI 0 "general_operand" "")
             (const_int 0))
        (set (strict_low_part 
               (subreg:HI
                 (match_dup 0)
                 0))
             (match_operand:HI 1 "general_operand" ""))]
       ""
       "operands[1] = make_safe_from (operands[1], operands[0]);")

Here two RTL insns are generated, one to clear the entire output operand
and the other to copy the input operand into its low half.  This sequence
is incorrect if the input operand refers to [the old value of] the output
operand, so the preparation statement makes sure this isn't so.  The
function `make_safe_from' copies `operands[1]' into a temporary
register if it refers to `operands[0]'.  It does this by emitting another
RTL insn.

Finally, a third example shows the use of an internal operand. 
Zero-extension on the SPUR chip is done by `and'-ing the result against a
halfword mask.  But this mask cannot be represented by a `const_int'
because the constant value is too large to be legitimate on this machine. 
So it must be copied into a register with `force_reg' and then the register
used in the `and'.

     (define_expand "zero_extendhisi2"
       [(set (match_operand:SI 0 "register_operand" "")
             (and:SI (subreg:SI
                       (match_operand:HI 1 "register_operand" "")
                       0)
                     (match_dup 2)))]
       ""
       "operands[2]
          = force_reg (SImode, gen_rtx (CONST_INT,
                                        VOIDmode, 65535)); ")


File: internals,  Node: Machine Macros,  Next: Config,  Prev: Machine Desc,  Up: Top

Machine Description Macros
**************************

The other half of the machine description is a C header file conventionally
given the name `tm-MACHINE.h'.  The file `tm.h' should be a link to it. 
The header file `config.h' includes `tm.h' and most compiler source files
include `config.h'.

* Menu:

* Run-time Target::     Defining -m options like -m68000 and -m68020.
* Storage Layout::      Defining sizes and alignments of data types.
* Registers::           Naming and describing the hardware registers.
* Register Classes::    Defining the classes of hardware registers.
* Stack Layout::        Defining which way the stack grows and by how much.
* Library Names::       Specifying names of subroutines to call automatically.
* Addressing Modes::    Defining addressing modes valid for memory operands.
* Condition Code::      Defining how insns update the condition code.
* Assembler Format::    Defining how to write insns and pseudo-ops to output.
* Misc::                Everything else.



File: internals,  Node: Run-time Target,  Next: Storage Layout,  Prev: Machine Macros,  Up: Machine Macros

Run-time Target Specification
=============================

`CPP_PREDEFINES'
     Define this to be a string constant containing `-D' options to define
     the predefined macros that identify this machine and system.

     For example, on the Sun, one can use the value

          "-Dmc68000 -Dsun -Dunix"

`extern int target_flags;'
     This declaration should be present.

`TARGET_...'
     This series of macros is to allow compiler command arguments to enable
     or disable the use of optional features of the target machine.  For
     example, one machine description serves both the 68000 and the 68020;
     a command argument tells the compiler whether it should use 68020-only
     instructions or not.  This command argument works by means of a macro
     `TARGET_68020' that tests a bit in `target_flags'.

     Define a macro `TARGET_FEATURENAME' for each such option.  Its
     definition should test a bit in `target_flags'; for example:

          #define TARGET_68020 (target_flags & 1)

     One place where these macros are used is in the condition-expressions
     of instruction patterns.  Note how `TARGET_68020' appears frequently
     in the 68000 machine description file, `m68k.md'.  Another place they
     are used is in the definitions of the other macros in the
     `tm-MACHINE.h' file.

`TARGET_SWITCHES'
     This macro defines names of command options to set and clear bits in
     `target_flags'.  Its definition is an initializer with a subgrouping
     for each command option.

     Each subgrouping contains a string constant, which defines the option
     name, and a number, which contains the bits to set in `target_flags'.
     A negative number says to clear bits instead; its negation gives the
     bits to clear.  The actual option name is made by prepending `-m' to
     the specified name.

     One of the subgroupings should have a null string.  The number in this
     grouping is the default value for `target_flags'.  Target options
     given on the command line then modify that value.

     Here is an example which defines `-m68000' and `-m68020' with opposite
     meanings, and picks the latter as the default:

          #define TARGET_SWITCHES \
            { { "68020", 1},      \
              { "68000", -1},     \
              { "", 1}}

Sometimes certain combinations of command options do not make sense on a
particular target machine.  You can define a macro `OVERRIDE_OPTIONS' to
take account of this.  This macro, if defined, is executed once just after
all the command options have been parsed.
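
To make the roles of `TARGET_SWITCHES', the null-string default and
`OVERRIDE_OPTIONS' concrete, here is a stand-alone sketch of how such a
table could be scanned.  It is a model under stated assumptions, not the
compiler's actual code: the names `struct target_switch',
`set_target_default', `decode_target_option' and `finish_target_options'
are hypothetical.

```c
#include <string.h>

/* Illustration only: every name below except target_flags,
   TARGET_68020, TARGET_SWITCHES and OVERRIDE_OPTIONS is invented.  */
int target_flags;

#define TARGET_68020 (target_flags & 1)

#define TARGET_SWITCHES \
  { { "68020", 1},      \
    { "68000", -1},     \
    { "", 1}}

struct target_switch { const char *name; int value; };

static const struct target_switch target_switches[] = TARGET_SWITCHES;

#define N_SWITCHES (sizeof target_switches / sizeof target_switches[0])

/* Load the default from the entry whose name is the null string.  */
void
set_target_default (void)
{
  size_t i;
  for (i = 0; i < N_SWITCHES; i++)
    if (target_switches[i].name[0] == '\0')
      target_flags = target_switches[i].value;
}

/* Handle the NAME part of one `-mNAME' option.
   Return 1 if NAME was recognized, 0 otherwise.  */
int
decode_target_option (const char *name)
{
  size_t i;
  for (i = 0; i < N_SWITCHES; i++)
    if (target_switches[i].name[0] != '\0'
        && strcmp (target_switches[i].name, name) == 0)
      {
        int value = target_switches[i].value;
        if (value < 0)
          target_flags &= ~(-value);   /* negative: clear these bits */
        else
          target_flags |= value;       /* positive: set these bits */
        return 1;
      }
  return 0;
}

/* Run once, after every option has been decoded.  */
void
finish_target_options (void)
{
#ifdef OVERRIDE_OPTIONS
  OVERRIDE_OPTIONS;
#endif
}
```

With this table, the default leaves `TARGET_68020' nonzero, `-m68000'
clears it, and a later `-m68020' sets it again.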


File: internals,  Node: Storage Layout,  Next: Registers,  Prev: Run-time Target,  Up: Machine Macros

Storage Layout
==============

Note that the definitions of the macros in this table which are sizes or
alignments measured in bits do not need to be constant.  They can be C
expressions that refer to static variables, such as `target_flags'. 
*note Run-time Target::.
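
As a sketch of a layout macro that is not constant, the pointer width
below is selected by a bit in `target_flags'.  The option macro
`TARGET_SHORT_POINTERS', its bit value, the two widths and the function
`pointer_size_in_units' are all invented for illustration; no real
machine description is implied.

```c
/* Illustration only.  `TARGET_SHORT_POINTERS', its bit value and the
   widths chosen are hypothetical.  */
int target_flags;

#define TARGET_SHORT_POINTERS (target_flags & 2)

#define BITS_PER_UNIT 8

/* The width of a pointer depends on a command option, so the macro
   expands to an expression rather than to a constant.  */
#define POINTER_SIZE (TARGET_SHORT_POINTERS ? 16 : 32)

/* Code that uses the macro simply sees the value currently in effect.  */
int
pointer_size_in_units (void)
{
  return POINTER_SIZE / BITS_PER_UNIT;
}
```

Because the macro is an expression, it cannot be used where C requires a
compile-time constant, such as an array bound or a `#if' test.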

`BITS_BIG_ENDIAN'
     Define this macro if the most significant bit in a byte has the lowest
     number.  This means that bit-field instructions count from the most
     significant bit.  If the machine has no bit-field instructions, this
     macro is irrelevant.

`BYTES_BIG_ENDIAN'
     Define this macro if the most significant byte in a word has the
     lowest number.

`WORDS_BIG_ENDIAN'
     Define this macro if, in a multiword object, the most significant word
     has the lowest number.

`BITS_PER_UNIT'
     Number of bits in an addressable storage unit (byte); normally 8.

`BITS_PER_WORD'
     Number of bits in a word; normally 32.

`UNITS_PER_WORD'
     Number of storage units in a word; normally 4.

`POINTER_SIZE'
     Width of a pointer, in bits.

`PARM_BOUNDARY'
     Alignment required for function parameters on the stack, in bits.

`STACK_BOUNDARY'
     Define this macro if you wish to preserve a certain alignment for the
     stack pointer at all times.  The definition is a C expression for the
     desired alignment (measured in bits).

`FUNCTION_BOUNDARY'
     Alignment required for a function entry point, in bits.

`BIGGEST_ALIGNMENT'
     Biggest alignment that any data type can require on this machine, in
     bits.

`EMPTY_FIELD_ALIGNMENT'
     Alignment in bits to be given to a structure bit field that follows an
     empty field such as `int : 0;'.

`STRUCTURE_SIZE_BOUNDARY'
     Number of bits which any structure or union's size must be a multiple
     of.  Each structure or union's size is rounded up to a multiple of this.

     If you do not define this macro, the default is the same as
     `BITS_PER_UNIT'.

`STRICT_ALIGNMENT'
     Define this macro if instructions fail to work when given data that
     is not on its nominal alignment.  If instructions merely run slower
     in that case, do not define this macro.