| @c \input texinfo |
| @c %**start of header |
| @c @setfilename agentexpr.info |
| @c @settitle GDB Agent Expressions |
| @c @setchapternewpage off |
| @c %**end of header |
| |
| @c This file is part of the GDB manual. |
| @c |
| @c Copyright (C) 2003--2024 Free Software Foundation, Inc. |
| @c |
| @c See the file gdb.texinfo for copying conditions. |
| |
| @node Agent Expressions |
| @appendix The GDB Agent Expression Mechanism |
| |
| In some applications, it is not feasible for the debugger to interrupt |
| the program's execution long enough for the developer to learn anything |
| helpful about its behavior. If the program's correctness depends on its |
| real-time behavior, delays introduced by a debugger might cause the |
| program to fail, even when the code itself is correct. It is useful to |
| be able to observe the program's behavior without interrupting it. |
| |
| Using GDB's @code{trace} and @code{collect} commands, the user can |
| specify locations in the program, and arbitrary expressions to evaluate |
| when those locations are reached. Later, using the @code{tfind} |
| command, she can examine the values those expressions had when the |
| program hit the trace points. The expressions may also denote objects |
| in memory --- structures or arrays, for example --- whose values GDB |
| should record; while visiting a particular tracepoint, the user may |
| inspect those objects as if they were in memory at that moment. |
| However, because GDB records these values without interacting with the |
| user, it can do so quickly and unobtrusively, hopefully not disturbing |
| the program's behavior. |
| |
| When GDB is debugging a remote target, the GDB @dfn{agent} code running |
| on the target computes the values of the expressions itself. To avoid |
| having a full symbolic expression evaluator on the agent, GDB translates |
| expressions in the source language into a simpler bytecode language, and |
| then sends the bytecode to the agent; the agent then executes the |
| bytecode, and records the values for GDB to retrieve later. |
| |
| The bytecode language is simple; there are forty-odd opcodes, the bulk |
| of which are the usual vocabulary of C operands (addition, subtraction, |
| shifts, and so on) and various sizes of literals and memory reference |
| operations. The bytecode interpreter operates strictly on machine-level |
| values --- various sizes of integers and floating point numbers --- and |
| requires no information about types or symbols; thus, the interpreter's |
| internal data structures are simple, and each bytecode requires only a |
| few native machine instructions to implement it. The interpreter is |
| small, and strict limits on the memory and time required to evaluate an |
| expression are easy to determine, making it suitable for use by the |
| debugging agent in real-time applications. |
| |
| @menu |
| * General Bytecode Design:: Overview of the interpreter. |
| * Bytecode Descriptions:: What each one does. |
| * Using Agent Expressions:: How agent expressions fit into the big picture. |
| * Varying Target Capabilities:: How to discover what the target can do. |
| * Rationale:: Why we did it this way. |
| @end menu |
| |
| |
| @c @node Rationale |
| @c @section Rationale |
| |
| |
| @node General Bytecode Design |
| @section General Bytecode Design |
| |
| The agent represents bytecode expressions as an array of bytes. Each |
| instruction is one byte long (thus the term @dfn{bytecode}). Some |
| instructions are followed by operand bytes; for example, the @code{goto} |
| instruction is followed by a destination for the jump. |
| |
| The bytecode interpreter is a stack-based machine; most instructions pop |
| their operands off the stack, perform some operation, and push the |
| result back on the stack for the next instruction to consume. Each |
| element of the stack may contain either a integer or a floating point |
| value; these values are as many bits wide as the largest integer that |
| can be directly manipulated in the source language. Stack elements |
| carry no record of their type; bytecode could push a value as an |
| integer, then pop it as a floating point value. However, GDB will not |
| generate code which does this. In C, one might define the type of a |
| stack element as follows: |
| @example |
| union agent_val @{ |
| LONGEST l; |
| DOUBLEST d; |
| @}; |
| @end example |
| @noindent |
| where @code{LONGEST} and @code{DOUBLEST} are @code{typedef} names for |
| the largest integer and floating point types on the machine. |
| |
| By the time the bytecode interpreter reaches the end of the expression, |
| the value of the expression should be the only value left on the stack. |
| For tracing applications, @code{trace} bytecodes in the expression will |
| have recorded the necessary data, and the value on the stack may be |
| discarded. For other applications, like conditional breakpoints, the |
| value may be useful. |
| |
| Separate from the stack, the interpreter has two registers: |
| @table @code |
| @item pc |
| The address of the next bytecode to execute. |
| |
| @item start |
| The address of the start of the bytecode expression, necessary for |
| interpreting the @code{goto} and @code{if_goto} instructions. |
| |
| @end table |
| @noindent |
| Neither of these registers is directly visible to the bytecode language |
| itself, but they are useful for defining the meanings of the bytecode |
| operations. |
| |
| There are no instructions to perform side effects on the running |
| program, or call the program's functions; we assume that these |
| expressions are only used for unobtrusive debugging, not for patching |
| the running code. |
| |
| Most bytecode instructions do not distinguish between the various sizes |
| of values, and operate on full-width values; the upper bits of the |
| values are simply ignored, since they do not usually make a difference |
| to the value computed. The exceptions to this rule are: |
| @table @asis |
| |
| @item memory reference instructions (@code{ref}@var{n}) |
| There are distinct instructions to fetch different word sizes from |
| memory. Once on the stack, however, the values are treated as full-size |
| integers. They may need to be sign-extended; the @code{ext} instruction |
| exists for this purpose. |
| |
| @item the sign-extension instruction (@code{ext} @var{n}) |
| These clearly need to know which portion of their operand is to be |
| extended to occupy the full length of the word. |
| |
| @end table |
| |
| If the interpreter is unable to evaluate an expression completely for |
| some reason (a memory location is inaccessible, or a divisor is zero, |
| for example), we say that interpretation ``terminates with an error''. |
| This means that the problem is reported back to the interpreter's caller |
| in some helpful way. In general, code using agent expressions should |
| assume that they may attempt to divide by zero, fetch arbitrary memory |
| locations, and misbehave in other ways. |
| |
| Even complicated C expressions compile to a few bytecode instructions; |
| for example, the expression @code{x + y * z} would typically produce |
| code like the following, assuming that @code{x} and @code{y} live in |
| registers, and @code{z} is a global variable holding a 32-bit |
| @code{int}: |
| @example |
| reg 1 |
| reg 2 |
| const32 @i{address of z} |
| ref32 |
| ext 32 |
| mul |
| add |
| end |
| @end example |
| |
| In detail, these mean: |
| @table @code |
| |
| @item reg 1 |
| Push the value of register 1 (presumably holding @code{x}) onto the |
| stack. |
| |
| @item reg 2 |
| Push the value of register 2 (holding @code{y}). |
| |
| @item const32 @i{address of z} |
| Push the address of @code{z} onto the stack. |
| |
| @item ref32 |
| Fetch a 32-bit word from the address at the top of the stack; replace |
| the address on the stack with the value. Thus, we replace the address |
| of @code{z} with @code{z}'s value. |
| |
| @item ext 32 |
| Sign-extend the value on the top of the stack from 32 bits to full |
| length. This is necessary because @code{z} is a signed integer. |
| |
| @item mul |
| Pop the top two numbers on the stack, multiply them, and push their |
| product. Now the top of the stack contains the value of the expression |
| @code{y * z}. |
| |
| @item add |
| Pop the top two numbers, add them, and push the sum. Now the top of the |
| stack contains the value of @code{x + y * z}. |
| |
| @item end |
| Stop executing; the value left on the stack top is the value to be |
| recorded. |
| |
| @end table |
| |
| |
| @node Bytecode Descriptions |
| @section Bytecode Descriptions |
| |
| Each bytecode description has the following form: |
| |
| @table @asis |
| |
| @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b} |
| |
| Pop the top two stack items, @var{a} and @var{b}, as integers; push |
| their sum, as an integer. |
| |
| @end table |
| |
| In this example, @code{add} is the name of the bytecode, and |
| @code{(0x02)} is the one-byte value used to encode the bytecode, in |
| hexadecimal. The phrase ``@var{a} @var{b} @result{} @var{a+b}'' shows |
| the stack before and after the bytecode executes. Beforehand, the stack |
| must contain at least two values, @var{a} and @var{b}; since the top of |
| the stack is to the right, @var{b} is on the top of the stack, and |
| @var{a} is underneath it. After execution, the bytecode will have |
| popped @var{a} and @var{b} from the stack, and replaced them with a |
| single value, @var{a+b}. There may be other values on the stack below |
| those shown, but the bytecode affects only those shown. |
| |
| Here is another example: |
| |
| @table @asis |
| |
| @item @code{const8} (0x22) @var{n}: @result{} @var{n} |
| Push the 8-bit integer constant @var{n} on the stack, without sign |
| extension. |
| |
| @end table |
| |
| In this example, the bytecode @code{const8} takes an operand @var{n} |
| directly from the bytecode stream; the operand follows the @code{const8} |
| bytecode itself. We write any such operands immediately after the name |
| of the bytecode, before the colon, and describe the exact encoding of |
| the operand in the bytecode stream in the body of the bytecode |
| description. |
| |
| For the @code{const8} bytecode, there are no stack items given before |
| the @result{}; this simply means that the bytecode consumes no values |
| from the stack. If a bytecode consumes no values, or produces no |
| values, the list on either side of the @result{} may be empty. |
| |
| If a value is written as @var{a}, @var{b}, or @var{n}, then the bytecode |
| treats it as an integer. If a value is written is @var{addr}, then the |
| bytecode treats it as an address. |
| |
| We do not fully describe the floating point operations here; although |
| this design can be extended in a clean way to handle floating point |
| values, they are not of immediate interest to the customer, so we avoid |
| describing them, to save time. |
| |
| |
| @table @asis |
| |
| @item @code{float} (0x01): @result{} |
| |
| Prefix for floating-point bytecodes. Not implemented yet. |
| |
| @item @code{add} (0x02): @var{a} @var{b} @result{} @var{a+b} |
| Pop two integers from the stack, and push their sum, as an integer. |
| |
| @item @code{sub} (0x03): @var{a} @var{b} @result{} @var{a-b} |
| Pop two integers from the stack, subtract the top value from the |
| next-to-top value, and push the difference. |
| |
| @item @code{mul} (0x04): @var{a} @var{b} @result{} @var{a*b} |
| Pop two integers from the stack, multiply them, and push the product on |
| the stack. Note that, when one multiplies two @var{n}-bit numbers |
| yielding another @var{n}-bit number, it is irrelevant whether the |
| numbers are signed or not; the results are the same. |
| |
| @item @code{div_signed} (0x05): @var{a} @var{b} @result{} @var{a/b} |
| Pop two signed integers from the stack; divide the next-to-top value by |
| the top value, and push the quotient. If the divisor is zero, terminate |
| with an error. |
| |
| @item @code{div_unsigned} (0x06): @var{a} @var{b} @result{} @var{a/b} |
| Pop two unsigned integers from the stack; divide the next-to-top value |
| by the top value, and push the quotient. If the divisor is zero, |
| terminate with an error. |
| |
| @item @code{rem_signed} (0x07): @var{a} @var{b} @result{} @var{a modulo b} |
| Pop two signed integers from the stack; divide the next-to-top value by |
| the top value, and push the remainder. If the divisor is zero, |
| terminate with an error. |
| |
| @item @code{rem_unsigned} (0x08): @var{a} @var{b} @result{} @var{a modulo b} |
| Pop two unsigned integers from the stack; divide the next-to-top value |
| by the top value, and push the remainder. If the divisor is zero, |
| terminate with an error. |
| |
| @item @code{lsh} (0x09): @var{a} @var{b} @result{} @var{a<<b} |
| Pop two integers from the stack; let @var{a} be the next-to-top value, |
| and @var{b} be the top value. Shift @var{a} left by @var{b} bits, and |
| push the result. |
| |
| @item @code{rsh_signed} (0x0a): @var{a} @var{b} @result{} @code{(signed)}@var{a>>b} |
| Pop two integers from the stack; let @var{a} be the next-to-top value, |
| and @var{b} be the top value. Shift @var{a} right by @var{b} bits, |
| inserting copies of the top bit at the high end, and push the result. |
| |
| @item @code{rsh_unsigned} (0x0b): @var{a} @var{b} @result{} @var{a>>b} |
| Pop two integers from the stack; let @var{a} be the next-to-top value, |
| and @var{b} be the top value. Shift @var{a} right by @var{b} bits, |
| inserting zero bits at the high end, and push the result. |
| |
| @item @code{log_not} (0x0e): @var{a} @result{} @var{!a} |
| Pop an integer from the stack; if it is zero, push the value one; |
| otherwise, push the value zero. |
| |
| @item @code{bit_and} (0x0f): @var{a} @var{b} @result{} @var{a&b} |
| Pop two integers from the stack, and push their bitwise @code{and}. |
| |
| @item @code{bit_or} (0x10): @var{a} @var{b} @result{} @var{a|b} |
| Pop two integers from the stack, and push their bitwise @code{or}. |
| |
| @item @code{bit_xor} (0x11): @var{a} @var{b} @result{} @var{a^b} |
| Pop two integers from the stack, and push their bitwise |
| exclusive-@code{or}. |
| |
| @item @code{bit_not} (0x12): @var{a} @result{} @var{~a} |
| Pop an integer from the stack, and push its bitwise complement. |
| |
| @item @code{equal} (0x13): @var{a} @var{b} @result{} @var{a=b} |
| Pop two integers from the stack; if they are equal, push the value one; |
| otherwise, push the value zero. |
| |
| @item @code{less_signed} (0x14): @var{a} @var{b} @result{} @var{a<b} |
| Pop two signed integers from the stack; if the next-to-top value is less |
| than the top value, push the value one; otherwise, push the value zero. |
| |
| @item @code{less_unsigned} (0x15): @var{a} @var{b} @result{} @var{a<b} |
| Pop two unsigned integers from the stack; if the next-to-top value is less |
| than the top value, push the value one; otherwise, push the value zero. |
| |
| @item @code{ext} (0x16) @var{n}: @var{a} @result{} @var{a}, sign-extended from @var{n} bits |
| Pop an unsigned value from the stack; treating it as an @var{n}-bit |
| twos-complement value, extend it to full length. This means that all |
| bits to the left of bit @var{n-1} (where the least significant bit is bit |
| 0) are set to the value of bit @var{n-1}. Note that @var{n} may be |
| larger than or equal to the width of the stack elements of the bytecode |
| engine; in this case, the bytecode should have no effect. |
| |
| The number of source bits to preserve, @var{n}, is encoded as a single |
| byte unsigned integer following the @code{ext} bytecode. |
| |
| @item @code{zero_ext} (0x2a) @var{n}: @var{a} @result{} @var{a}, zero-extended from @var{n} bits |
| Pop an unsigned value from the stack; zero all but the bottom @var{n} |
| bits. |
| |
| The number of source bits to preserve, @var{n}, is encoded as a single |
| byte unsigned integer following the @code{zero_ext} bytecode. |
| |
| @item @code{ref8} (0x17): @var{addr} @result{} @var{a} |
| @itemx @code{ref16} (0x18): @var{addr} @result{} @var{a} |
| @itemx @code{ref32} (0x19): @var{addr} @result{} @var{a} |
| @itemx @code{ref64} (0x1a): @var{addr} @result{} @var{a} |
| Pop an address @var{addr} from the stack. For bytecode |
| @code{ref}@var{n}, fetch an @var{n}-bit value from @var{addr}, using the |
| natural target endianness. Push the fetched value as an unsigned |
| integer. |
| |
| Note that @var{addr} may not be aligned in any particular way; the |
| @code{ref@var{n}} bytecodes should operate correctly for any address. |
| |
| If attempting to access memory at @var{addr} would cause a processor |
| exception of some sort, terminate with an error. |
| |
| @item @code{ref_float} (0x1b): @var{addr} @result{} @var{d} |
| @itemx @code{ref_double} (0x1c): @var{addr} @result{} @var{d} |
| @itemx @code{ref_long_double} (0x1d): @var{addr} @result{} @var{d} |
| @itemx @code{l_to_d} (0x1e): @var{a} @result{} @var{d} |
| @itemx @code{d_to_l} (0x1f): @var{d} @result{} @var{a} |
| Not implemented yet. |
| |
| @item @code{dup} (0x28): @var{a} => @var{a} @var{a} |
| Push another copy of the stack's top element. |
| |
| @item @code{swap} (0x2b): @var{a} @var{b} => @var{b} @var{a} |
| Exchange the top two items on the stack. |
| |
| @item @code{pop} (0x29): @var{a} => |
| Discard the top value on the stack. |
| |
| @item @code{pick} (0x32) @var{n}: @var{a} @dots{} @var{b} => @var{a} @dots{} @var{b} @var{a} |
| Duplicate an item from the stack and push it on the top of the stack. |
| @var{n}, a single byte, indicates the stack item to copy. If @var{n} |
| is zero, this is the same as @code{dup}; if @var{n} is one, it copies |
| the item under the top item, etc. If @var{n} exceeds the number of |
| items on the stack, terminate with an error. |
| |
| @item @code{rot} (0x33): @var{a} @var{b} @var{c} => @var{c} @var{a} @var{b} |
| Rotate the top three items on the stack. The top item (c) becomes the third |
| item, the next-to-top item (b) becomes the top item and the third item (a) from |
| the top becomes the next-to-top item. |
| |
| @item @code{if_goto} (0x20) @var{offset}: @var{a} @result{} |
| Pop an integer off the stack; if it is non-zero, branch to the given |
| offset in the bytecode string. Otherwise, continue to the next |
| instruction in the bytecode stream. In other words, if @var{a} is |
| non-zero, set the @code{pc} register to @code{start} + @var{offset}. |
| Thus, an offset of zero denotes the beginning of the expression. |
| |
| The @var{offset} is stored as a sixteen-bit unsigned value, stored |
| immediately following the @code{if_goto} bytecode. It is always stored |
| most significant byte first, regardless of the target's normal |
| endianness. The offset is not guaranteed to fall at any particular |
| alignment within the bytecode stream; thus, on machines where fetching a |
| 16-bit on an unaligned address raises an exception, you should fetch the |
| offset one byte at a time. |
| |
| @item @code{goto} (0x21) @var{offset}: @result{} |
| Branch unconditionally to @var{offset}; in other words, set the |
| @code{pc} register to @code{start} + @var{offset}. |
| |
| The offset is stored in the same way as for the @code{if_goto} bytecode. |
| |
| @item @code{const8} (0x22) @var{n}: @result{} @var{n} |
| @itemx @code{const16} (0x23) @var{n}: @result{} @var{n} |
| @itemx @code{const32} (0x24) @var{n}: @result{} @var{n} |
| @itemx @code{const64} (0x25) @var{n}: @result{} @var{n} |
| Push the integer constant @var{n} on the stack, without sign extension. |
| To produce a small negative value, push a small twos-complement value, |
| and then sign-extend it using the @code{ext} bytecode. |
| |
| The constant @var{n} is stored in the appropriate number of bytes |
| following the @code{const}@var{b} bytecode. The constant @var{n} is |
| always stored most significant byte first, regardless of the target's |
| normal endianness. The constant is not guaranteed to fall at any |
| particular alignment within the bytecode stream; thus, on machines where |
| fetching a 16-bit on an unaligned address raises an exception, you |
| should fetch @var{n} one byte at a time. |
| |
| @item @code{reg} (0x26) @var{n}: @result{} @var{a} |
| Push the value of register number @var{n}, without sign extension. The |
| registers are numbered following GDB's conventions. |
| |
| The register number @var{n} is encoded as a 16-bit unsigned integer |
| immediately following the @code{reg} bytecode. It is always stored most |
| significant byte first, regardless of the target's normal endianness. |
| The register number is not guaranteed to fall at any particular |
| alignment within the bytecode stream; thus, on machines where fetching a |
| 16-bit on an unaligned address raises an exception, you should fetch the |
| register number one byte at a time. |
| |
| @item @code{getv} (0x2c) @var{n}: @result{} @var{v} |
| Push the value of trace state variable number @var{n}, without sign |
| extension. |
| |
| The variable number @var{n} is encoded as a 16-bit unsigned integer |
| immediately following the @code{getv} bytecode. It is always stored most |
| significant byte first, regardless of the target's normal endianness. |
| The variable number is not guaranteed to fall at any particular |
| alignment within the bytecode stream; thus, on machines where fetching a |
| 16-bit on an unaligned address raises an exception, you should fetch the |
| register number one byte at a time. |
| |
| @item @code{setv} (0x2d) @var{n}: @var{v} @result{} @var{v} |
| Set trace state variable number @var{n} to the value found on the top |
| of the stack. The stack is unchanged, so that the value is readily |
| available if the assignment is part of a larger expression. The |
| handling of @var{n} is as described for @code{getv}. |
| |
| @item @code{trace} (0x0c): @var{addr} @var{size} @result{} |
| Record the contents of the @var{size} bytes at @var{addr} in a trace |
| buffer, for later retrieval by GDB. |
| |
| @item @code{trace_quick} (0x0d) @var{size}: @var{addr} @result{} @var{addr} |
| Record the contents of the @var{size} bytes at @var{addr} in a trace |
| buffer, for later retrieval by GDB. @var{size} is a single byte |
| unsigned integer following the @code{trace} opcode. |
| |
| This bytecode is equivalent to the sequence @code{dup const8 @var{size} |
| trace}, but we provide it anyway to save space in bytecode strings. |
| |
| @item @code{trace16} (0x30) @var{size}: @var{addr} @result{} @var{addr} |
| Identical to trace_quick, except that @var{size} is a 16-bit big-endian |
| unsigned integer, not a single byte. This should probably have been |
| named @code{trace_quick16}, for consistency. |
| |
| @item @code{tracev} (0x2e) @var{n}: @result{} @var{a} |
| Record the value of trace state variable number @var{n} in the trace |
| buffer. The handling of @var{n} is as described for @code{getv}. |
| |
| @item @code{tracenz} (0x2f) @var{addr} @var{size} @result{} |
| Record the bytes at @var{addr} in a trace buffer, for later retrieval |
| by GDB. Stop at either the first zero byte, or when @var{size} bytes |
| have been recorded, whichever occurs first. |
| |
| @item @code{printf} (0x34) @var{numargs} @var{string} @result{} |
| Do a formatted print, in the style of the C function @code{printf}). |
| The value of @var{numargs} is the number of arguments to expect on the |
| stack, while @var{string} is the format string, prefixed with a |
| two-byte length. The last byte of the string must be zero, and is |
| included in the length. The format string includes escaped sequences |
| just as it appears in C source, so for instance the format string |
| @code{"\t%d\n"} is six characters long, and the output will consist of |
| a tab character, a decimal number, and a newline. At the top of the |
| stack, above the values to be printed, this bytecode will pop a |
| ``function'' and ``channel''. If the function is nonzero, then the |
| target may treat it as a function and call it, passing the channel as |
| a first argument, as with the C function @code{fprintf}. If the |
| function is zero, then the target may simply call a standard formatted |
| print function of its choice. In all, this bytecode pops 2 + |
| @var{numargs} stack elements, and pushes nothing. |
| |
| @item @code{end} (0x27): @result{} |
| Stop executing bytecode; the result should be the top element of the |
| stack. If the purpose of the expression was to compute an lvalue or a |
| range of memory, then the next-to-top of the stack is the lvalue's |
| address, and the top of the stack is the lvalue's size, in bytes. |
| |
| @end table |
| |
| |
| @node Using Agent Expressions |
| @section Using Agent Expressions |
| |
| Agent expressions can be used in several different ways by @value{GDBN}, |
| and the debugger can generate different bytecode sequences as appropriate. |
| |
| One possibility is to do expression evaluation on the target rather |
| than the host, such as for the conditional of a conditional |
| tracepoint. In such a case, @value{GDBN} compiles the source |
| expression into a bytecode sequence that simply gets values from |
| registers or memory, does arithmetic, and returns a result. |
| |
| Another way to use agent expressions is for tracepoint data |
| collection. @value{GDBN} generates a different bytecode sequence for |
| collection; in addition to bytecodes that do the calculation, |
| @value{GDBN} adds @code{trace} bytecodes to save the pieces of |
| memory that were used. |
| |
| @itemize @bullet |
| |
| @item |
| The user selects trace points in the program's code at which GDB should |
| collect data. |
| |
| @item |
| The user specifies expressions to evaluate at each trace point. These |
| expressions may denote objects in memory, in which case those objects' |
| contents are recorded as the program runs, or computed values, in which |
| case the values themselves are recorded. |
| |
| @item |
| GDB transmits the tracepoints and their associated expressions to the |
| GDB agent, running on the debugging target. |
| |
| @item |
| The agent arranges to be notified when a trace point is hit. |
| |
| @item |
| When execution on the target reaches a trace point, the agent evaluates |
| the expressions associated with that trace point, and records the |
| resulting values and memory ranges. |
| |
| @item |
| Later, when the user selects a given trace event and inspects the |
| objects and expression values recorded, GDB talks to the agent to |
| retrieve recorded data as necessary to meet the user's requests. If the |
| user asks to see an object whose contents have not been recorded, GDB |
| reports an error. |
| |
| @end itemize |
| |
| |
| @node Varying Target Capabilities |
| @section Varying Target Capabilities |
| |
| Some targets don't support floating-point, and some would rather not |
| have to deal with @code{long long} operations. Also, different targets |
| will have different stack sizes, and different bytecode buffer lengths. |
| |
| Thus, GDB needs a way to ask the target about itself. We haven't worked |
| out the details yet, but in general, GDB should be able to send the |
| target a packet asking it to describe itself. The reply should be a |
| packet whose length is explicit, so we can add new information to the |
| packet in future revisions of the agent, without confusing old versions |
| of GDB, and it should contain a version number. It should contain at |
| least the following information: |
| |
| @itemize @bullet |
| |
| @item |
| whether floating point is supported |
| |
| @item |
| whether @code{long long} is supported |
| |
| @item |
| maximum acceptable size of bytecode stack |
| |
| @item |
| maximum acceptable length of bytecode expressions |
| |
| @item |
| which registers are actually available for collection |
| |
| @item |
| whether the target supports disabled tracepoints |
| |
| @end itemize |
| |
| @node Rationale |
| @section Rationale |
| |
| Some of the design decisions apparent above are arguable. |
| |
| @table @b |
| |
| @item What about stack overflow/underflow? |
| GDB should be able to query the target to discover its stack size. |
| Given that information, GDB can determine at translation time whether a |
| given expression will overflow the stack. But this spec isn't about |
| what kinds of error-checking GDB ought to do. |
| |
| @item Why are you doing everything in LONGEST? |
| |
| Speed isn't important, but agent code size is; using LONGEST brings in a |
| bunch of support code to do things like division, etc. So this is a |
| serious concern. |
| |
| First, note that you don't need different bytecodes for different |
| operand sizes. You can generate code without @emph{knowing} how big the |
| stack elements actually are on the target. If the target only supports |
| 32-bit ints, and you don't send any 64-bit bytecodes, everything just |
| works. The observation here is that the MIPS and the Alpha have only |
| fixed-size registers, and you can still get C's semantics even though |
| most instructions only operate on full-sized words. You just need to |
| make sure everything is properly sign-extended at the right times. So |
| there is no need for 32- and 64-bit variants of the bytecodes. Just |
| implement everything using the largest size you support. |
| |
| GDB should certainly check to see what sizes the target supports, so the |
| user can get an error earlier, rather than later. But this information |
| is not necessary for correctness. |
| |
| |
| @item Why don't you have @code{>} or @code{<=} operators? |
| I want to keep the interpreter small, and we don't need them. We can |
| combine the @code{less_} opcodes with @code{log_not}, and swap the order |
| of the operands, yielding all four asymmetrical comparison operators. |
| For example, @code{(x <= y)} is @code{! (x > y)}, which is @code{! (y < |
| x)}. |
| |
| @item Why do you have @code{log_not}? |
| @itemx Why do you have @code{ext}? |
| @itemx Why do you have @code{zero_ext}? |
| These are all easily synthesized from other instructions, but I expect |
| them to be used frequently, and they're simple, so I include them to |
| keep bytecode strings short. |
| |
| @code{log_not} is equivalent to @code{const8 0 equal}; it's used in half |
| the relational operators. |
| |
| @code{ext @var{n}} is equivalent to @code{const8 @var{s-n} lsh const8 |
| @var{s-n} rsh_signed}, where @var{s} is the size of the stack elements; |
| it follows @code{ref@var{m}} and @var{reg} bytecodes when the value |
| should be signed. See the next bulleted item. |
| |
| @code{zero_ext @var{n}} is equivalent to @code{const@var{m} @var{mask} |
| log_and}; it's used whenever we push the value of a register, because we |
| can't assume the upper bits of the register aren't garbage. |
| |
| @item Why not have sign-extending variants of the @code{ref} operators? |
| Because that would double the number of @code{ref} operators, and we |
| need the @code{ext} bytecode anyway for accessing bitfields. |
| |
| @item Why not have constant-address variants of the @code{ref} operators? |
| Because that would double the number of @code{ref} operators again, and |
| @code{const32 @var{address} ref32} is only one byte longer. |
| |
| @item Why do the @code{ref@var{n}} operators have to support unaligned fetches? |
| GDB will generate bytecode that fetches multi-byte values at unaligned |
| addresses whenever the executable's debugging information tells it to. |
| Furthermore, GDB does not know the value the pointer will have when GDB |
| generates the bytecode, so it cannot determine whether a particular |
| fetch will be aligned or not. |
| |
| In particular, structure bitfields may be several bytes long, but follow |
| no alignment rules; members of packed structures are not necessarily |
| aligned either. |
| |
| In general, there are many cases where unaligned references occur in |
| correct C code, either at the programmer's explicit request, or at the |
| compiler's discretion. Thus, it is simpler to make the GDB agent |
| bytecodes work correctly in all circumstances than to make GDB guess in |
| each case whether the compiler did the usual thing. |
| |
| @item Why are there no side-effecting operators? |
| Because our current client doesn't want them? That's a cheap answer. I |
| think the real answer is that I'm afraid of implementing function |
| calls. We should re-visit this issue after the present contract is |
| delivered. |
| |
| @item Why aren't the @code{goto} ops PC-relative? |
| The interpreter has the base address around anyway for PC bounds |
| checking, and it seemed simpler. |
| |
| @item Why is there only one offset size for the @code{goto} ops? |
| Offsets are currently sixteen bits. I'm not happy with this situation |
| either: |
| |
| Suppose we have multiple branch ops with different offset sizes. As I |
| generate code left-to-right, all my jumps are forward jumps (there are |
| no loops in expressions), so I never know the target when I emit the |
| jump opcode. Thus, I have to either always assume the largest offset |
| size, or do jump relaxation on the code after I generate it, which seems |
| like a big waste of time. |
| |
| I can imagine a reasonable expression being longer than 256 bytes. I |
| can't imagine one being longer than 64k. Thus, we need 16-bit offsets. |
| This kind of reasoning is so bogus, but relaxation is pathetic. |
| |
| The other approach would be to generate code right-to-left. Then I'd |
| always know my offset size. That might be fun. |
| |
| @item Where is the function call bytecode? |
| |
| When we add side-effects, we should add this. |
| |
| @item Why does the @code{reg} bytecode take a 16-bit register number? |
| |
| Intel's IA-64 architecture has 128 general-purpose registers, |
| and 128 floating-point registers, and I'm sure it has some random |
| control registers. |
| |
| @item Why do we need @code{trace} and @code{trace_quick}? |
| Because GDB needs to record all the memory contents and registers an |
| expression touches. If the user wants to evaluate an expression |
| @code{x->y->z}, the agent must record the values of @code{x} and |
| @code{x->y} as well as the value of @code{x->y->z}. |
| |
| @item Don't the @code{trace} bytecodes make the interpreter less general? |
| They do mean that the interpreter contains special-purpose code, but |
| that doesn't mean the interpreter can only be used for that purpose. If |
| an expression doesn't use the @code{trace} bytecodes, they don't get in |
| its way. |
| |
| @item Why doesn't @code{trace_quick} consume its arguments the way everything else does? |
| In general, you do want your operators to consume their arguments; it's |
| consistent, and generally reduces the amount of stack rearrangement |
| necessary. However, @code{trace_quick} is a kludge to save space; it |
| only exists so we needn't write @code{dup const8 @var{SIZE} trace} |
| before every memory reference. Therefore, it's okay for it not to |
| consume its arguments; it's meant for a specific context in which we |
| know exactly what it should do with the stack. If we're going to have a |
| kludge, it should be an effective kludge. |
| |
| @item Why does @code{trace16} exist? |
| That opcode was added by the customer that contracted Cygnus for the |
| data tracing work. I personally think it is unnecessary; objects that |
| large will be quite rare, so it is okay to use @code{dup const16 |
| @var{size} trace} in those cases. |
| |
| Whatever we decide to do with @code{trace16}, we should at least leave |
| opcode 0x30 reserved, to remain compatible with the customer who added |
| it. |
| |
| @end table |