| @c \input texinfo | 
 | @c %**start of header | 
 | @c @setfilename agentexpr.info | 
 | @c @settitle GDB Agent Expressions | 
 | @c @setchapternewpage off | 
 | @c %**end of header | 
 |  | 
 | @c This file is part of the GDB manual. | 
 | @c | 
 | @c Copyright (C) 2003--2025 Free Software Foundation, Inc. | 
 | @c | 
 | @c See the file gdb.texinfo for copying conditions. | 
 |  | 
 | @node Agent Expressions | 
 | @appendix The GDB Agent Expression Mechanism | 
 |  | 
 | In some applications, it is not feasible for the debugger to interrupt | 
 | the program's execution long enough for the developer to learn anything | 
 | helpful about its behavior.  If the program's correctness depends on its | 
 | real-time behavior, delays introduced by a debugger might cause the | 
 | program to fail, even when the code itself is correct.  It is useful to | 
 | be able to observe the program's behavior without interrupting it. | 
 |  | 
 | Using GDB's @code{trace} and @code{collect} commands, the user can | 
 | specify locations in the program, and arbitrary expressions to evaluate | 
 | when those locations are reached.  Later, using the @code{tfind} | 
 | command, she can examine the values those expressions had when the | 
 | program hit the tracepoints.  The expressions may also denote objects | 
 | in memory --- structures or arrays, for example --- whose values GDB | 
 | should record; while visiting a particular tracepoint, the user may | 
 | inspect those objects as if they were in memory at that moment. | 
 | However, because GDB records these values without interacting with the | 
 | user, it can do so quickly and unobtrusively, hopefully not disturbing | 
 | the program's behavior. | 
 |  | 
 | When GDB is debugging a remote target, the GDB @dfn{agent} code running | 
 | on the target computes the values of the expressions itself.  To avoid | 
 | having a full symbolic expression evaluator on the agent, GDB translates | 
 | expressions in the source language into a simpler bytecode language, and | 
 | then sends the bytecode to the agent; the agent then executes the | 
 | bytecode, and records the values for GDB to retrieve later. | 
 |  | 
 | The bytecode language is simple; there are forty-odd opcodes, the bulk | 
 | of which are the usual vocabulary of C operators (addition, subtraction, | 
 | shifts, and so on) and various sizes of literals and memory reference | 
 | operations.  The bytecode interpreter operates strictly on machine-level | 
 | values --- various sizes of integers and floating point numbers --- and | 
 | requires no information about types or symbols; thus, the interpreter's | 
 | internal data structures are simple, and each bytecode requires only a | 
 | few native machine instructions to implement it.  The interpreter is | 
 | small, and strict limits on the memory and time required to evaluate an | 
 | expression are easy to determine, making it suitable for use by the | 
 | debugging agent in real-time applications. | 
 |  | 
 | @menu | 
 | * General Bytecode Design::     Overview of the interpreter. | 
 | * Bytecode Descriptions::       What each one does. | 
 | * Using Agent Expressions::     How agent expressions fit into the big picture. | 
 | * Varying Target Capabilities:: How to discover what the target can do. | 
 | * Rationale::                   Why we did it this way. | 
 | @end menu | 
 |  | 
 |  | 
 | @c @node Rationale | 
 | @c @section Rationale | 
 |  | 
 |  | 
 | @node General Bytecode Design | 
 | @section General Bytecode Design | 
 |  | 
 | The agent represents bytecode expressions as an array of bytes.  Each | 
 | instruction is one byte long (thus the term @dfn{bytecode}).  Some | 
 | instructions are followed by operand bytes; for example, the @code{goto} | 
 | instruction is followed by a destination for the jump. | 
 |  | 
 | The bytecode interpreter is a stack-based machine; most instructions pop | 
 | their operands off the stack, perform some operation, and push the | 
 | result back on the stack for the next instruction to consume.  Each | 
 | element of the stack may contain either an integer or a floating point | 
 | value; these values are as many bits wide as the largest integer that | 
 | can be directly manipulated in the source language.  Stack elements | 
 | carry no record of their type; bytecode could push a value as an | 
 | integer, then pop it as a floating point value.  However, GDB will not | 
 | generate code which does this.  In C, one might define the type of a | 
 | stack element as follows: | 
 | @example | 
 | union agent_val @{ | 
 |   LONGEST l; | 
 |   DOUBLEST d; | 
 | @}; | 
 | @end example | 
 | @noindent | 
 | where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for | 
 | the largest integer and floating point types on the machine. | 
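As a rough illustration of untyped stack elements, the following Python sketch models the union with the standard @code{ctypes} module.  The class name @code{AgentVal} and the 64-bit field widths are assumptions made for this example; a real agent would use whatever widths @code{LONGEST} and @code{DOUBLEST} have on its target.

```python
import ctypes


class AgentVal(ctypes.Union):
    # Both fields share the same eight bytes of storage (assumed widths).
    _fields_ = [("l", ctypes.c_int64), ("d", ctypes.c_double)]


val = AgentVal()
val.d = 1.0             # store the element as a floating point value
as_integer = val.l      # read the very same bits back as an integer
print(hex(as_integer))
```

Nothing in the element records which field was written last, which is why the bytecode itself must use each value consistently.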
 |  | 
 | By the time the bytecode interpreter reaches the end of the expression, | 
 | the value of the expression should be the only value left on the stack. | 
 | For tracing applications, @code{trace} bytecodes in the expression will | 
 | have recorded the necessary data, and the value on the stack may be | 
 | discarded.  For other applications, like conditional breakpoints, the | 
 | value may be useful. | 
 |  | 
 | Separate from the stack, the interpreter has two registers: | 
 | @table @code | 
 | @item pc | 
 | The address of the next bytecode to execute. | 
 |  | 
 | @item start | 
 | The address of the start of the bytecode expression, necessary for | 
 | interpreting the @code{goto} and @code{if_goto} instructions. | 
 |  | 
 | @end table | 
 | @noindent | 
 | Neither of these registers is directly visible to the bytecode language | 
 | itself, but they are useful for defining the meanings of the bytecode | 
 | operations. | 
 |  | 
 | There are no instructions to perform side effects on the running | 
 | program, or call the program's functions; we assume that these | 
 | expressions are only used for unobtrusive debugging, not for patching | 
 | the running code.   | 
 |  | 
 | Most bytecode instructions do not distinguish between the various sizes | 
 | of values, and operate on full-width values; the upper bits of the | 
 | values are simply ignored, since they do not usually make a difference | 
 | to the value computed.  The exceptions to this rule are: | 
 | @table @asis | 
 |  | 
 | @item memory reference instructions (@code{ref}@var{n}) | 
 | There are distinct instructions to fetch different word sizes from | 
 | memory.  Once on the stack, however, the values are treated as full-size | 
 | integers.  They may need to be sign-extended; the @code{ext} instruction | 
 | exists for this purpose. | 
 |  | 
 | @item the sign-extension instruction (@code{ext} @var{n}) | 
 | This clearly needs to know which portion of its operand is to be | 
 | extended to occupy the full length of the word. | 
 |  | 
 | @end table | 
 |  | 
 | If the interpreter is unable to evaluate an expression completely for | 
 | some reason (a memory location is inaccessible, or a divisor is zero, | 
 | for example), we say that interpretation ``terminates with an error''. | 
 | This means that the problem is reported back to the interpreter's caller | 
 | in some helpful way.  In general, code using agent expressions should | 
 | assume that they may attempt to divide by zero, fetch arbitrary memory | 
 | locations, and misbehave in other ways. | 
 |  | 
 | Even complicated C expressions compile to a few bytecode instructions; | 
 | for example, the expression @code{x + y * z} would typically produce | 
 | code like the following, assuming that @code{x} and @code{y} live in | 
 | registers, and @code{z} is a global variable holding a 32-bit | 
 | @code{int}: | 
 | @example | 
 | reg 1 | 
 | reg 2 | 
 | const32 @i{address of z} | 
 | ref32 | 
 | ext 32 | 
 | mul | 
 | add | 
 | end | 
 | @end example | 
 |  | 
 | In detail, these mean: | 
 | @table @code | 
 |  | 
 | @item reg 1 | 
 | Push the value of register 1 (presumably holding @code{x}) onto the | 
 | stack. | 
 |  | 
 | @item reg 2 | 
 | Push the value of register 2 (holding @code{y}). | 
 |  | 
 | @item const32 @i{address of z} | 
 | Push the address of @code{z} onto the stack. | 
 |  | 
 | @item ref32 | 
 | Fetch a 32-bit word from the address at the top of the stack; replace | 
 | the address on the stack with the value.  Thus, we replace the address | 
 | of @code{z} with @code{z}'s value. | 
 |  | 
 | @item ext 32 | 
 | Sign-extend the value on the top of the stack from 32 bits to full | 
 | length.  This is necessary because @code{z} is a signed integer. | 
 |  | 
 | @item mul | 
 | Pop the top two numbers on the stack, multiply them, and push their | 
 | product.  Now the top of the stack contains the value of the expression | 
 | @code{y * z}. | 
 |  | 
 | @item add | 
 | Pop the top two numbers, add them, and push the sum.  Now the top of the | 
 | stack contains the value of @code{x + y * z}. | 
 |  | 
 | @item end | 
 | Stop executing; the value left on the stack top is the value to be | 
 | recorded. | 
 |  | 
 | @end table | 
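The walk-through above can be sketched as a tiny interpreter.  This is only an illustration, not the agent's actual code: it handles just the seven bytecodes used in the example, and it assumes 64-bit stack elements, a little-endian target, and a flat @code{bytes} object standing in for target memory.  The names @code{run}, @code{sign_extend}, @code{regs} and @code{memory} are hypothetical.

```python
MASK = (1 << 64) - 1          # assumption: 64-bit stack elements


def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to the full stack width."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK


def run(code, regs, memory):
    """Evaluate a bytecode string and return the top of the stack."""
    stack = []
    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == 0x26:                      # reg n (16-bit operand)
            n = int.from_bytes(code[pc:pc + 2], "big")
            pc += 2
            stack.append(regs[n] & MASK)
        elif op == 0x24:                    # const32 n (32-bit operand)
            stack.append(int.from_bytes(code[pc:pc + 4], "big"))
            pc += 4
        elif op == 0x19:                    # ref32
            addr = stack.pop()
            if addr + 4 > len(memory):
                raise LookupError("inaccessible memory")
            stack.append(int.from_bytes(memory[addr:addr + 4], "little"))
        elif op == 0x16:                    # ext n (8-bit operand)
            bits = code[pc]
            pc += 1
            stack.append(sign_extend(stack.pop(), bits))
        elif op in (0x02, 0x04):            # add, mul
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b if op == 0x02 else a * b) & MASK)
        elif op == 0x27:                    # end
            return stack[-1]
        else:
            raise ValueError("opcode 0x%02x not handled by this sketch" % op)


regs = [0, 5, 3]                    # reg 1 holds x = 5, reg 2 holds y = 3
memory = bytes(8) + (-2).to_bytes(4, "little", signed=True)  # z = -2 at address 8
code = bytes([
    0x26, 0x00, 0x01,               # reg 1
    0x26, 0x00, 0x02,               # reg 2
    0x24, 0x00, 0x00, 0x00, 0x08,   # const32 (address of z)
    0x19,                           # ref32
    0x16, 32,                       # ext 32
    0x04,                           # mul
    0x02,                           # add
    0x27,                           # end
])
result = run(code, regs, memory)
signed_result = result - (1 << 64) if result >> 63 else result
```

Without the @code{ext 32} step, the fetched word would be treated as the large positive number 0xfffffffe and the product would be wrong, which is the reason the sign extension is needed.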
 |  | 
 |  | 
 | @node Bytecode Descriptions | 
 | @section Bytecode Descriptions | 
 |  | 
 | Each bytecode description has the following form: | 
 |  | 
 | @table @asis | 
 |  | 
 | @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b} | 
 |  | 
 | Pop the top two stack items, @var{a} and @var{b}, as integers; push | 
 | their sum, as an integer. | 
 |  | 
 | @end table | 
 |  | 
 | In this example, @code{add} is the name of the bytecode, and | 
 | @code{(0x02)} is the one-byte value used to encode the bytecode, in | 
 | hexadecimal.  The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows | 
 | the stack before and after the bytecode executes.  Beforehand, the stack | 
 | must contain at least two values, @var{a} and @var{b}; since the top of | 
 | the stack is to the right, @var{b} is on the top of the stack, and | 
 | @var{a} is underneath it.  After execution, the bytecode will have | 
 | popped @var{a} and @var{b} from the stack, and replaced them with a | 
 | single value, @var{a+b}.  There may be other values on the stack below | 
 | those shown, but the bytecode affects only those shown. | 
 |  | 
 | Here is another example: | 
 |  | 
 | @table @asis | 
 |  | 
 | @item @code{const8} (0x22) @var{n}: @result{} @var{n} | 
 | Push the 8-bit integer constant @var{n} on the stack, without sign | 
 | extension. | 
 |  | 
 | @end table | 
 |  | 
 | In this example, the bytecode @code{const8} takes an operand @var{n} | 
 | directly from the bytecode stream; the operand follows the @code{const8} | 
 | bytecode itself.  We write any such operands immediately after the name | 
 | of the bytecode, before the colon, and describe the exact encoding of | 
 | the operand in the bytecode stream in the body of the bytecode | 
 | description. | 
 |  | 
 | For the @code{const8} bytecode, there are no stack items given before | 
 | the @result{}; this simply means that the bytecode consumes no values | 
 | from the stack.  If a bytecode consumes no values, or produces no | 
 | values, the list on either side of the @result{} may be empty. | 
 |  | 
 | If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode | 
 | treats it as an integer.  If a value is written as @var{addr}, then the | 
 | bytecode treats it as an address. | 
 |  | 
 | We do not fully describe the floating point operations here; although | 
 | this design can be extended in a clean way to handle floating point | 
 | values, they are not of immediate interest to the customer, so we avoid | 
 | describing them, to save time. | 
 |  | 
 |  | 
 | @table @asis | 
 |  | 
 | @item @code{float} (0x01): @result{} | 
 |  | 
 | Prefix for floating-point bytecodes.  Not implemented yet. | 
 |  | 
 | @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b} | 
 | Pop two integers from the stack, and push their sum, as an integer. | 
 |  | 
 | @item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b} | 
 | Pop two integers from the stack, subtract the top value from the | 
 | next-to-top value, and push the difference. | 
 |  | 
 | @item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b} | 
 | Pop two integers from the stack, multiply them, and push the product on | 
 | the stack.  Note that, when one multiplies two @var{n}-bit numbers | 
 | yielding another @var{n}-bit number, it is irrelevant whether the | 
 | numbers are signed or not; the results are the same. | 
 |  | 
 | @item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b} | 
 | Pop two signed integers from the stack; divide the next-to-top value by | 
 | the top value, and push the quotient.  If the divisor is zero, terminate | 
 | with an error. | 
 |  | 
 | @item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b} | 
 | Pop two unsigned integers from the stack; divide the next-to-top value | 
 | by the top value, and push the quotient.  If the divisor is zero, | 
 | terminate with an error. | 
 |  | 
 | @item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b} | 
 | Pop two signed integers from the stack; divide the next-to-top value by | 
 | the top value, and push the remainder.  If the divisor is zero, | 
 | terminate with an error. | 
 |  | 
 | @item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b} | 
 | Pop two unsigned integers from the stack; divide the next-to-top value | 
 | by the top value, and push the remainder.  If the divisor is zero, | 
 | terminate with an error. | 
 |  | 
 | @item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b} | 
 | Pop two integers from the stack; let @var{a} be the next-to-top value, | 
 | and @var{b} be the top value.  Shift @var{a} left by @var{b} bits, and | 
 | push the result. | 
 |  | 
 | @item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b} | 
 | Pop two integers from the stack; let @var{a} be the next-to-top value, | 
 | and @var{b} be the top value.  Shift @var{a} right by @var{b} bits, | 
 | inserting copies of the top bit at the high end, and push the result. | 
 |  | 
 | @item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b} | 
 | Pop two integers from the stack; let @var{a} be the next-to-top value, | 
 | and @var{b} be the top value.  Shift @var{a} right by @var{b} bits, | 
 | inserting zero bits at the high end, and push the result. | 
 |  | 
 | @item @code{log_not} (0x0e): @var{a} @result{} @var{!a} | 
 | Pop an integer from the stack; if it is zero, push the value one; | 
 | otherwise, push the value zero. | 
 |  | 
 | @item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b} | 
 | Pop two integers from the stack, and push their bitwise @code{and}. | 
 |  | 
 | @item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b} | 
 | Pop two integers from the stack, and push their bitwise @code{or}. | 
 |  | 
 | @item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b} | 
 | Pop two integers from the stack, and push their bitwise | 
 | exclusive-@code{or}. | 
 |  | 
 | @item @code{bit_not} (0x12): @var{a} @result{} @var{~a} | 
 | Pop an integer from the stack, and push its bitwise complement. | 
 |  | 
 | @item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b} | 
 | Pop two integers from the stack; if they are equal, push the value one; | 
 | otherwise, push the value zero. | 
 |  | 
 | @item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b} | 
 | Pop two signed integers from the stack; if the next-to-top value is less | 
 | than the top value, push the value one; otherwise, push the value zero. | 
 |  | 
 | @item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b} | 
 | Pop two unsigned integers from the stack; if the next-to-top value is less | 
 | than the top value, push the value one; otherwise, push the value zero. | 
 |  | 
 | @item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits | 
 | Pop an unsigned value from the stack; treating it as an @var{n}-bit | 
 | twos-complement value, extend it to full length.  This means that all | 
 | bits to the left of bit @var{n-1} (where the least significant bit is bit | 
 | 0) are set to the value of bit @var{n-1}.  Note that @var{n} may be | 
 | larger than or equal to the width of the stack elements of the bytecode | 
 | engine; in this case, the bytecode should have no effect. | 
 |  | 
 | The number of source bits to preserve, @var{n}, is encoded as a single | 
 | byte unsigned integer following the @code{ext} bytecode. | 
 |  | 
 | @item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits | 
 | Pop an unsigned value from the stack; zero all but the bottom @var{n} | 
 | bits. | 
 |  | 
 | The number of source bits to preserve, @var{n}, is encoded as a single | 
 | byte unsigned integer following the @code{zero_ext} bytecode. | 
 |  | 
 | @item @code{ref8} (0x17): @var{addr} @result{} @var{a} | 
 | @itemx @code{ref16} (0x18): @var{addr} @result{} @var{a} | 
 | @itemx @code{ref32} (0x19): @var{addr} @result{} @var{a} | 
 | @itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a} | 
 | Pop an address @var{addr} from the stack.  For bytecode | 
 | @code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the | 
 | natural target endianness.  Push the fetched value as an unsigned | 
 | integer. | 
 |  | 
 | Note that @var{addr} may not be aligned in any particular way; the | 
 | @code{ref@var{n}} bytecodes should operate correctly for any address. | 
 |  | 
 | If attempting to access memory at @var{addr} would cause a processor | 
 | exception of some sort, terminate with an error. | 
 |  | 
 | @item @code{ref_float} (0x1b): @var{addr} @result{} @var{d} | 
 | @itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d} | 
 | @itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d} | 
 | @itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d} | 
 | @itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a} | 
 | Not implemented yet. | 
 |  | 
 | @item @code{dup} (0x28): @var{a} @result{} @var{a} @var{a} | 
 | Push another copy of the stack's top element. | 
 |  | 
 | @item @code{swap} (0x2b): @var{a} @var{b} @result{} @var{b} @var{a} | 
 | Exchange the top two items on the stack. | 
 |  | 
 | @item @code{pop} (0x29): @var{a} @result{} | 
 | Discard the top value on the stack. | 
 |  | 
 | @item @code{pick} (0x32) @var{n}: @var{a} @dots{} @var{b} @result{} @var{a} @dots{} @var{b} @var{a} | 
 | Duplicate an item from the stack and push it on the top of the stack. | 
 | @var{n}, a single byte, indicates the stack item to copy.  If @var{n} | 
 | is zero, this is the same as @code{dup}; if @var{n} is one, it copies | 
 | the item under the top item, etc.  If @var{n} exceeds the number of | 
 | items on the stack, terminate with an error. | 
 |  | 
 | @item @code{rot} (0x33): @var{a} @var{b} @var{c} @result{} @var{c} @var{a} @var{b} | 
 | Rotate the top three items on the stack.  The top item (@var{c}) becomes | 
 | the third item, the next-to-top item (@var{b}) becomes the top item, and | 
 | the third item from the top (@var{a}) becomes the next-to-top item. | 
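The stack-shuffling bytecodes are easy to get backwards, so here is a small sketch of @code{dup}, @code{swap}, @code{pick} and @code{rot} on a Python list whose last element is the top of the stack.  The function names simply mirror the bytecodes.  The sketch reads ``@var{n} exceeds the number of items'' as ``there is no item at depth @var{n}'', which is an interpretation rather than something the text spells out.

```python
def dup(stack):
    stack.append(stack[-1])


def swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


def pick(stack, n):
    # n counts down from the top: 0 is the top item, 1 the one below it.
    if n >= len(stack):
        raise IndexError("pick: not enough items on the stack")
    stack.append(stack[-1 - n])


def rot(stack):
    c = stack.pop()
    b = stack.pop()
    a = stack.pop()
    stack.extend([c, a, b])     # a b c  becomes  c a b


s = ["a", "b", "c"]
rot(s)
print(s)
```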
 |  | 
 | @item @code{if_goto} (0x20) @var{offset}: @var{a} @result{} | 
 | Pop an integer off the stack; if it is non-zero, branch to the given | 
 | offset in the bytecode string.  Otherwise, continue to the next | 
 | instruction in the bytecode stream.  In other words, if @var{a} is | 
 | non-zero, set the @code{pc} register to @code{start} + @var{offset}. | 
 | Thus, an offset of zero denotes the beginning of the expression. | 
 |  | 
 | The @var{offset} is stored as a sixteen-bit unsigned value, stored | 
 | immediately following the @code{if_goto} bytecode.  It is always stored | 
 | most significant byte first, regardless of the target's normal | 
 | endianness.  The offset is not guaranteed to fall at any particular | 
 | alignment within the bytecode stream; thus, on machines where fetching a | 
 | 16-bit value from an unaligned address raises an exception, you should fetch the | 
 | offset one byte at a time. | 
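A minimal sketch of the byte-at-a-time fetch follows.  The helper names @code{fetch_u16} and @code{branch_target} are hypothetical, and @code{start} is taken to be index 0 of the bytecode string.

```python
def fetch_u16(code, pc):
    # Combine two single-byte reads, most significant byte first.
    return (code[pc] << 8) | code[pc + 1]


def branch_target(code, pc, start=0):
    # pc points at the first operand byte, just past the opcode.
    return start + fetch_u16(code, pc)


code = bytes([0x20, 0x01, 0x02])    # if_goto with offset 0x0102
target = branch_target(code, 1)
```

In C the same two-read pattern avoids an unaligned 16-bit load altogether, at the cost of one shift and one or.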
 |  | 
 | @item @code{goto} (0x21) @var{offset}: @result{} | 
 | Branch unconditionally to @var{offset}; in other words, set the | 
 | @code{pc} register to @code{start} + @var{offset}. | 
 |  | 
 | The offset is stored in the same way as for the @code{if_goto} bytecode. | 
 |  | 
 | @item @code{const8} (0x22) @var{n}: @result{} @var{n} | 
 | @itemx @code{const16} (0x23) @var{n}: @result{} @var{n} | 
 | @itemx @code{const32} (0x24) @var{n}: @result{} @var{n} | 
 | @itemx @code{const64} (0x25) @var{n}: @result{} @var{n} | 
 | Push the integer constant @var{n} on the stack, without sign extension. | 
 | To produce a small negative value, push a small twos-complement value, | 
 | and then sign-extend it using the @code{ext} bytecode. | 
 |  | 
 | The constant @var{n} is stored in the appropriate number of bytes | 
 | following the @code{const}@var{b} bytecode.  The constant @var{n} is | 
 | always stored most significant byte first, regardless of the target's | 
 | normal endianness.  The constant is not guaranteed to fall at any | 
 | particular alignment within the bytecode stream; thus, on machines where | 
 | fetching a multi-byte value from an unaligned address raises an exception, you | 
 | should fetch @var{n} one byte at a time. | 
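To make the note about negative values concrete, the sketch below emits the bytes for pushing a small signed integer: a bare @code{const8} for non-negative values, and @code{const8} followed by @code{ext 8} for negative ones.  The function name @code{push_small_int} is made up for this example; the opcode values are the ones listed in this section.

```python
CONST8 = 0x22
EXT = 0x16


def push_small_int(n):
    """Return bytecode that leaves the signed 8-bit value n on the stack."""
    if not -128 <= n <= 127:
        raise ValueError("value does not fit in 8 bits")
    if n >= 0:
        return bytes([CONST8, n])
    # Push the twos-complement byte, then sign-extend it from 8 bits.
    return bytes([CONST8, n & 0xFF, EXT, 8])


minus_one = push_small_int(-1)
```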
 |  | 
 | @item @code{reg} (0x26) @var{n}: @result{} @var{a} | 
 | Push the value of register number @var{n}, without sign extension.  The | 
 | registers are numbered following GDB's conventions. | 
 |  | 
 | The register number @var{n} is encoded as a 16-bit unsigned integer | 
 | immediately following the @code{reg} bytecode.  It is always stored most | 
 | significant byte first, regardless of the target's normal endianness. | 
 | The register number is not guaranteed to fall at any particular | 
 | alignment within the bytecode stream; thus, on machines where fetching a | 
 | 16-bit value from an unaligned address raises an exception, you should fetch the | 
 | register number one byte at a time. | 
 |  | 
 | @item @code{getv} (0x2c) @var{n}: @result{} @var{v} | 
 | Push the value of trace state variable number @var{n}, without sign | 
 | extension. | 
 |  | 
 | The variable number @var{n} is encoded as a 16-bit unsigned integer | 
 | immediately following the @code{getv} bytecode.  It is always stored most | 
 | significant byte first, regardless of the target's normal endianness. | 
 | The variable number is not guaranteed to fall at any particular | 
 | alignment within the bytecode stream; thus, on machines where fetching a | 
 | 16-bit value from an unaligned address raises an exception, you should fetch the | 
 | variable number one byte at a time. | 
 |  | 
 | @item @code{setv} (0x2d) @var{n}: @var{v} @result{} @var{v} | 
 | Set trace state variable number @var{n} to the value found on the top | 
 | of the stack.  The stack is unchanged, so that the value is readily | 
 | available if the assignment is part of a larger expression.  The | 
 | handling of @var{n} is as described for @code{getv}. | 
 |  | 
 | @item @code{trace} (0x0c): @var{addr} @var{size} @result{} | 
 | Record the contents of the @var{size} bytes at @var{addr} in a trace | 
 | buffer, for later retrieval by GDB. | 
 |  | 
 | @item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr} | 
 | Record the contents of the @var{size} bytes at @var{addr} in a trace | 
 | buffer, for later retrieval by GDB.  @var{size} is a single byte | 
 | unsigned integer following the @code{trace_quick} opcode. | 
 |  | 
 | This bytecode is equivalent to the sequence @code{dup const8 @var{size} | 
 | trace}, but we provide it anyway to save space in bytecode strings. | 
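The space saving is easy to see by laying the two encodings side by side.  In this sketch the helper names are hypothetical, while the opcode values come from the descriptions in this section.

```python
DUP, CONST8, TRACE, TRACE_QUICK = 0x28, 0x22, 0x0C, 0x0D


def trace_quick(size):
    # Short form: opcode plus a one-byte size operand.
    return bytes([TRACE_QUICK, size])


def trace_long_form(size):
    # Equivalent sequence: dup, const8 size, trace.
    return bytes([DUP, CONST8, size, TRACE])


saved = len(trace_long_form(4)) - len(trace_quick(4))
print(saved)
```

Both forms leave @var{addr} on the stack afterwards, so either can be followed by further bytecodes that use the address.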
 |  | 
 | @item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr} | 
 | Identical to @code{trace_quick}, except that @var{size} is a 16-bit big-endian | 
 | unsigned integer, not a single byte.  This should probably have been | 
 | named @code{trace_quick16}, for consistency. | 
 |  | 
 | @item @code{tracev} (0x2e) @var{n}: @result{} @var{a} | 
 | Record the value of trace state variable number @var{n} in the trace | 
 | buffer.  The handling of @var{n} is as described for @code{getv}. | 
 |  | 
 | @item @code{tracenz} (0x2f): @var{addr} @var{size} @result{} | 
 | Record the bytes at @var{addr} in a trace buffer, for later retrieval | 
 | by GDB.  Stop at either the first zero byte, or when @var{size} bytes | 
 | have been recorded, whichever occurs first. | 
 |  | 
 | @item @code{printf} (0x34) @var{numargs} @var{string}: @result{} | 
 | Do a formatted print, in the style of the C function @code{printf}. | 
 | The value of @var{numargs} is the number of arguments to expect on the | 
 | stack, while @var{string} is the format string, prefixed with a | 
 | two-byte length.  The last byte of the string must be zero, and is | 
 | included in the length.  The format string includes escaped sequences | 
 | just as it appears in C source, so for instance the format string | 
 | @code{"\t%d\n"} is six characters long, and the output will consist of | 
 | a tab character, a decimal number, and a newline.  At the top of the | 
 | stack, above the values to be printed, this bytecode will pop a | 
 | ``function'' and ``channel''.  If the function is nonzero, then the | 
 | target may treat it as a function and call it, passing the channel as | 
 | a first argument, as with the C function @code{fprintf}.  If the | 
 | function is zero, then the target may simply call a standard formatted | 
 | print function of its choice.  In all, this bytecode pops 2 + | 
 | @var{numargs} stack elements, and pushes nothing. | 
 |  | 
 | @item @code{end} (0x27): @result{} | 
 | Stop executing bytecode; the result should be the top element of the | 
 | stack.  If the purpose of the expression was to compute an lvalue or a | 
 | range of memory, then the next-to-top of the stack is the lvalue's | 
 | address, and the top of the stack is the lvalue's size, in bytes. | 
 |  | 
 | @end table | 
 |  | 
 |  | 
 | @node Using Agent Expressions | 
 | @section Using Agent Expressions | 
 |  | 
 | Agent expressions can be used in several different ways by @value{GDBN}, | 
 | and the debugger can generate different bytecode sequences as appropriate. | 
 |  | 
 | One possibility is to do expression evaluation on the target rather | 
 | than the host, such as for the conditional of a conditional | 
 | tracepoint.  In such a case, @value{GDBN} compiles the source | 
 | expression into a bytecode sequence that simply gets values from | 
 | registers or memory, does arithmetic, and returns a result. | 
 |  | 
 | Another way to use agent expressions is for tracepoint data | 
 | collection.  @value{GDBN} generates a different bytecode sequence for | 
 | collection; in addition to bytecodes that do the calculation, | 
 | @value{GDBN} adds @code{trace} bytecodes to save the pieces of | 
 | memory that were used.  In outline, tracing proceeds as follows: | 
 |  | 
 | @itemize @bullet | 
 |  | 
 | @item | 
 | The user selects tracepoints in the program's code at which GDB should | 
 | collect data. | 
 |  | 
 | @item | 
 | The user specifies expressions to evaluate at each tracepoint.  These | 
 | expressions may denote objects in memory, in which case those objects' | 
 | contents are recorded as the program runs, or computed values, in which | 
 | case the values themselves are recorded. | 
 |  | 
 | @item | 
 | GDB transmits the tracepoints and their associated expressions to the | 
 | GDB agent, running on the debugging target. | 
 |  | 
 | @item | 
 | The agent arranges to be notified when a tracepoint is hit. | 
 |  | 
 | @item | 
 | When execution on the target reaches a tracepoint, the agent evaluates | 
 | the expressions associated with that tracepoint, and records the | 
 | resulting values and memory ranges. | 
 |  | 
 | @item | 
 | Later, when the user selects a given trace event and inspects the | 
 | objects and expression values recorded, GDB talks to the agent to | 
 | retrieve recorded data as necessary to meet the user's requests.  If the | 
 | user asks to see an object whose contents have not been recorded, GDB | 
 | reports an error. | 
 |  | 
 | @end itemize | 
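The last two steps can be pictured with a toy model of one trace event.  This is a sketch only: the class @code{TraceFrame} and its @code{record} and @code{read} methods are hypothetical and say nothing about how a real agent lays out its trace buffer.  It does show why a request for memory that was never collected has to fail.

```python
class TraceFrame:
    """Memory blocks recorded during one tracepoint hit."""

    def __init__(self):
        self.blocks = []        # list of (address, bytes) pairs

    def record(self, memory, addr, size):
        # What a trace bytecode would do: copy size bytes starting at addr.
        self.blocks.append((addr, bytes(memory[addr:addr + size])))

    def read(self, addr, size):
        # What the debugger does later: succeed only inside a recorded block.
        for base, data in self.blocks:
            if base <= addr and addr + size <= base + len(data):
                offset = addr - base
                return data[offset:offset + size]
        raise LookupError("memory at 0x%x was not collected" % addr)


memory = bytes(range(16))       # stand-in for target memory
frame = TraceFrame()
frame.record(memory, 4, 8)      # collect bytes 4 through 11
inside = frame.read(6, 2)
```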
 |  | 
 |  | 
 | @node Varying Target Capabilities | 
 | @section Varying Target Capabilities | 
 |  | 
 | Some targets don't support floating-point, and some would rather not | 
 | have to deal with @code{long long} operations.  Also, different targets | 
 | will have different stack sizes, and different bytecode buffer lengths. | 
 |  | 
 | Thus, GDB needs a way to ask the target about itself.  We haven't worked | 
 | out the details yet, but in general, GDB should be able to send the | 
 | target a packet asking it to describe itself.  The reply should be a | 
 | packet whose length is explicit, so we can add new information to the | 
 | packet in future revisions of the agent, without confusing old versions | 
 | of GDB, and it should contain a version number.  It should contain at | 
 | least the following information: | 
 |  | 
 | @itemize @bullet | 
 |  | 
 | @item | 
 | whether floating point is supported | 
 |  | 
 | @item | 
 | whether @code{long long} is supported | 
 |  | 
 | @item | 
 | maximum acceptable size of bytecode stack | 
 |  | 
 | @item | 
 | maximum acceptable length of bytecode expressions | 
 |  | 
 | @item | 
 | which registers are actually available for collection | 
 |  | 
 | @item | 
 | whether the target supports disabled tracepoints | 
 |  | 
 | @end itemize | 
 |  | 
 | @node Rationale | 
 | @section Rationale | 
 |  | 
 | Some of the design decisions apparent above are arguable. | 
 |  | 
 | @table @b | 
 |  | 
 | @item What about stack overflow/underflow? | 
 | GDB should be able to query the target to discover its stack size. | 
 | Given that information, GDB can determine at translation time whether a | 
 | given expression will overflow the stack.  But this spec isn't about | 
 | what kinds of error-checking GDB ought to do. | 
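For straight-line bytecode, such a translation-time check amounts to a single pass over the instructions.  The sketch below is one way to do it, not a description of what GDB does: it covers only a handful of opcodes, ignores @code{goto} and @code{if_goto}, and the table @code{EFFECTS} and function @code{max_stack_depth} are names invented for the example.

```python
# (pops, pushes, operand bytes) for a few opcodes from this appendix.
EFFECTS = dict([
    (0x02, (2, 1, 0)),      # add
    (0x04, (2, 1, 0)),      # mul
    (0x16, (1, 1, 1)),      # ext n
    (0x19, (1, 1, 0)),      # ref32
    (0x24, (0, 1, 4)),      # const32 n
    (0x26, (0, 1, 2)),      # reg n
    (0x28, (1, 2, 0)),      # dup
])
END = 0x27


def max_stack_depth(code):
    """Return the deepest stack reached; raise ValueError on underflow."""
    depth = 0
    deepest = 0
    pc = 0
    while code[pc] != END:
        pops, pushes, operand_bytes = EFFECTS[code[pc]]
        if depth < pops:
            raise ValueError("stack underflow at offset %d" % pc)
        depth += pushes - pops
        deepest = max(deepest, depth)
        pc += 1 + operand_bytes
    return deepest


# Bytecode for x + y * z from the General Bytecode Design example.
code = bytes([0x26, 0, 1, 0x26, 0, 2, 0x24, 0, 0, 0, 8,
              0x19, 0x16, 32, 0x04, 0x02, 0x27])
depth_needed = max_stack_depth(code)
```

Comparing @code{depth_needed} against the stack size reported by the target would let the translator reject an expression before sending it.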
 |  | 
 | @item Why are you doing everything in LONGEST? | 
 |  | 
 | Speed isn't important, but agent code size is; using LONGEST brings in a | 
 | bunch of support code to do things like division, etc.  So this is a | 
 | serious concern. | 
 |  | 
 | First, note that you don't need different bytecodes for different | 
 | operand sizes.  You can generate code without @emph{knowing} how big the | 
 | stack elements actually are on the target.  If the target only supports | 
 | 32-bit ints, and you don't send any 64-bit bytecodes, everything just | 
 | works.  The observation here is that the MIPS and the Alpha have only | 
 | fixed-size registers, and you can still get C's semantics even though | 
 | most instructions only operate on full-sized words.  You just need to | 
 | make sure everything is properly sign-extended at the right times.  So | 
 | there is no need for 32- and 64-bit variants of the bytecodes.  Just | 
 | implement everything using the largest size you support. | 
 |  | 
 | GDB should certainly check to see what sizes the target supports, so the | 
 | user can get an error earlier, rather than later.  But this information | 
 | is not necessary for correctness. | 
 |  | 
 |  | 
 | @item Why don't you have @code{>} or @code{<=} operators? | 
 | I want to keep the interpreter small, and we don't need them.  We can | 
 | combine the @code{less_} opcodes with @code{log_not}, and swap the order | 
 | of the operands, yielding all four asymmetrical comparison operators. | 
 | For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y < | 
 | x)}. | 
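The synthesis can be checked with a few lines of Python.  The sketch models only @code{swap}, @code{less_signed} and @code{log_not}, and uses ordinary Python integers as signed values; the wrapper functions @code{greater} and @code{less_or_equal} are hypothetical names for the generated sequences.

```python
def swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


def less_signed(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(1 if a < b else 0)


def log_not(stack):
    stack.append(1 if stack.pop() == 0 else 0)


def greater(x, y):
    stack = [x, y]
    swap(stack)             # y x
    less_signed(stack)      # y < x
    return stack.pop()


def less_or_equal(x, y):
    stack = [x, y]
    swap(stack)             # y x
    less_signed(stack)      # y < x
    log_not(stack)          # !(y < x)
    return stack.pop()
```

The remaining operator, @code{>=}, is @code{less_signed} followed by @code{log_not} with no swap.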
 |  | 
 | @item Why do you have @code{log_not}? | 
 | @itemx Why do you have @code{ext}? | 
 | @itemx Why do you have @code{zero_ext}? | 
 | These are all easily synthesized from other instructions, but I expect | 
 | them to be used frequently, and they're simple, so I include them to | 
 | keep bytecode strings short. | 
 |  | 
 | @code{log_not} is equivalent to @code{const8 0 equal}; it's used in half | 
 | the relational operators. | 
 |  | 
 | @code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8 | 
 | @var{s-n} rsh_signed}, where @var{s} is the size of the stack elements | 
 | in bits; it follows @code{ref@var{m}} and @code{reg} bytecodes when the | 
 | value should be signed.  See the next item. | 
 |  | 
 | @code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask} | 
 | bit_and}, where @var{mask} has only its low @var{n} bits set; it's used | 
 | whenever we push the value of a register, because we can't assume the | 
 | upper bits of the register aren't garbage. | 
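 |  | 
 | The two extension equivalences can be sketched in C as follows.  This is | 
 | an illustration only, assuming 64-bit stack elements (@var{s} = 64); all | 
 | function names are hypothetical.  @code{ext_by_shifts} mirrors the | 
 | shift sequence above and assumes the compiler implements right shift of | 
 | a negative value as an arithmetic shift, which C does not guarantee. | 
 |  | 

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stack elements are 64 bits wide (s = 64).  */
#define STACK_BITS 64

/* zero_ext n: keep the low n bits and clear the rest, i.e. a bitwise
   AND with a mask of n one bits.  */
static uint64_t
zero_ext (uint64_t v, unsigned n)
{
  assert (n >= 1 && n <= STACK_BITS);
  uint64_t mask = (n == STACK_BITS) ? ~UINT64_C (0)
                                    : ((UINT64_C (1) << n) - 1);
  return v & mask;
}

/* ext n, portable form: sign-extend the low n bits without shifting
   signed values.  */
static int64_t
ext (uint64_t v, unsigned n)
{
  uint64_t sign = UINT64_C (1) << (n - 1);
  uint64_t low = zero_ext (v, n);
  return (int64_t) ((low ^ sign) - sign);
}

/* ext n, shift form: lsh by s-n, then rsh_signed by s-n.  Relies on an
   arithmetic right shift for negative values (implementation-defined
   in C, but what mainstream compilers do).  */
static int64_t
ext_by_shifts (uint64_t v, unsigned n)
{
  assert (n >= 1 && n <= STACK_BITS);
  unsigned k = STACK_BITS - n;
  return (int64_t) (v << k) >> k;
}
```

 |  | 
 | For example, extending the byte 0xff from 8 bits gives -1 in both forms, | 
 | while @code{zero_ext} of the same byte leaves 255. | 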
 |  | 
 | @item Why not have sign-extending variants of the @code{ref} operators? | 
 | Because that would double the number of @code{ref} operators, and we | 
 | need the @code{ext} bytecode anyway for accessing bitfields. | 
 |  | 
 | @item Why not have constant-address variants of the @code{ref} operators? | 
 | Because that would double the number of @code{ref} operators again, and | 
 | @code{const32 @var{address} ref32} is only one byte longer. | 
 |  | 
 | @item Why do the @code{ref@var{n}} operators have to support unaligned fetches? | 
 | GDB will generate bytecode that fetches multi-byte values at unaligned | 
 | addresses whenever the executable's debugging information tells it to. | 
 | Furthermore, GDB does not know the value the pointer will have when GDB | 
 | generates the bytecode, so it cannot determine whether a particular | 
 | fetch will be aligned or not. | 
 |  | 
 | In particular, structure bitfields may be several bytes long, but follow | 
 | no alignment rules; members of packed structures are not necessarily | 
 | aligned either. | 
 |  | 
 | In general, there are many cases where unaligned references occur in | 
 | correct C code, either at the programmer's explicit request, or at the | 
 | compiler's discretion.  Thus, it is simpler to make the GDB agent | 
 | bytecodes work correctly in all circumstances than to make GDB guess in | 
 | each case whether the compiler did the usual thing. | 
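 |  | 
 | One way an interpreter can make alignment irrelevant is to assemble the | 
 | value one byte at a time.  The sketch below is a minimal example, not | 
 | the agent's actual code: @code{fetch_unaligned} and its | 
 | @code{big_endian} flag are hypothetical, and a real agent would read | 
 | target memory through its own access routine. | 
 |  | 

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fetch an n-byte unsigned value (1 <= n <= 8)
   from any address, one byte at a time, so alignment never matters.
   BIG_ENDIAN is nonzero if the most significant byte comes first.  */
static uint64_t
fetch_unaligned (const unsigned char *addr, size_t n, int big_endian)
{
  assert (n >= 1 && n <= 8);
  uint64_t value = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* Always consume bytes from most to least significant.  */
      size_t idx = big_endian ? i : n - 1 - i;
      value = (value << 8) | addr[idx];
    }
  return value;
}
```

 |  | 
 | Because no multi-byte load is ever issued, the same routine serves | 
 | packed structure members and multi-byte bitfields at odd addresses. | 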
 |  | 
 | @item Why are there no side-effecting operators? | 
 | Because our current client doesn't want them?  That's a cheap answer.  I | 
 | think the real answer is that I'm afraid of implementing function | 
 | calls.  We should re-visit this issue after the present contract is | 
 | delivered. | 
 |  | 
 | @item Why aren't the @code{goto} ops PC-relative? | 
 | The interpreter has the base address around anyway for PC bounds | 
 | checking, and it seemed simpler. | 
 |  | 
 | @item Why is there only one offset size for the @code{goto} ops? | 
 | Offsets are currently sixteen bits.  I'm not happy with this situation | 
 | either: | 
 |  | 
 | Suppose we have multiple branch ops with different offset sizes.  As I | 
 | generate code left-to-right, all my jumps are forward jumps (there are | 
 | no loops in expressions), so I never know the target when I emit the | 
 | jump opcode.  Thus, I have to either always assume the largest offset | 
 | size, or do jump relaxation on the code after I generate it, which seems | 
 | like a big waste of time. | 
 |  | 
 | I can imagine a reasonable expression being longer than 256 bytes.  I | 
 | can't imagine one being longer than 64k.  Thus, we need 16-bit offsets. | 
 | This kind of reasoning is so bogus, but relaxation is pathetic. | 
 |  | 
 | The other approach would be to generate code right-to-left.  Then I'd | 
 | always know my offset size.  That might be fun. | 
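 |  | 
 | With a single offset size, left-to-right generation reduces to ordinary | 
 | backpatching: emit the jump with a placeholder, then fill it in once the | 
 | target is known.  The C sketch below is illustrative only.  The buffer | 
 | type and helper names are hypothetical, the opcode is whatever the | 
 | caller passes in, and storing the offset most significant byte first is | 
 | an assumption of this example. | 
 |  | 

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size code buffer.  */
struct codebuf
{
  unsigned char bytes[256];
  size_t len;
};

static void
emit_byte (struct codebuf *cb, unsigned char b)
{
  assert (cb->len < sizeof cb->bytes);
  cb->bytes[cb->len++] = b;
}

/* Emit a jump opcode followed by a two-byte placeholder.  Returns the
   position of the placeholder so it can be patched later.  */
static size_t
emit_jump (struct codebuf *cb, unsigned char opcode)
{
  emit_byte (cb, opcode);
  size_t fixup = cb->len;
  emit_byte (cb, 0);
  emit_byte (cb, 0);
  return fixup;
}

/* Fill in a placeholder with TARGET, an offset from the start of the
   bytecode (not PC-relative).  Byte order is an assumption here.  */
static void
patch_jump (struct codebuf *cb, size_t fixup, size_t target)
{
  assert (fixup + 1 < cb->len);
  assert (target <= 0xffff);
  cb->bytes[fixup] = (unsigned char) (target >> 8);
  cb->bytes[fixup + 1] = (unsigned char) (target & 0xff);
}
```

 |  | 
 | A caller would keep the value returned by @code{emit_jump}, generate the | 
 | code to be skipped, and then call @code{patch_jump} with the current | 
 | buffer length as the target.  No relaxation pass is needed because the | 
 | placeholder is always two bytes. | 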
 |  | 
 | @item Where is the function call bytecode? | 
 |  | 
 | When we add side-effects, we should add this. | 
 |  | 
 | @item Why does the @code{reg} bytecode take a 16-bit register number? | 
 |  | 
 | Intel's IA-64 architecture has 128 general-purpose registers, | 
 | and 128 floating-point registers, and I'm sure it has some random | 
 | control registers. | 
 |  | 
 | @item Why do we need @code{trace} and @code{trace_quick}? | 
 | Because GDB needs to record all the memory contents and registers an | 
 | expression touches.  If the user wants to evaluate an expression | 
 | @code{x->y->z}, the agent must record the values of @code{x} and | 
 | @code{x->y} as well as the value of @code{x->y->z}. | 
 |  | 
 | @item Don't the @code{trace} bytecodes make the interpreter less general? | 
 | They do mean that the interpreter contains special-purpose code, but | 
 | that doesn't mean the interpreter can only be used for that purpose.  If | 
 | an expression doesn't use the @code{trace} bytecodes, they don't get in | 
 | its way. | 
 |  | 
 | @item Why doesn't @code{trace_quick} consume its arguments the way everything else does? | 
 | In general, you do want your operators to consume their arguments; it's | 
 | consistent, and generally reduces the amount of stack rearrangement | 
 | necessary.  However, @code{trace_quick} is a kludge to save space; it | 
 | only exists so we needn't write @code{dup const8 @var{SIZE} trace} | 
 | before every memory reference.  Therefore, it's okay for it not to | 
 | consume its arguments; it's meant for a specific context in which we | 
 | know exactly what it should do with the stack.  If we're going to have a | 
 | kludge, it should be an effective kludge. | 
 |  | 
 | @item Why does @code{trace16} exist? | 
 | That opcode was added by the customer that contracted Cygnus for the | 
 | data tracing work.  I personally think it is unnecessary; objects that | 
 | large will be quite rare, so it is okay to use @code{dup const16 | 
 | @var{size} trace} in those cases. | 
 |  | 
 | Whatever we decide to do with @code{trace16}, we should at least leave | 
 | opcode 0x30 reserved, to remain compatible with the customer who added | 
 | it. | 
 |  | 
 | @end table |