------------------------------------------------------------------------------
-- --
-- GNAT COMPILER COMPONENTS --
-- --
-- E X P _ D B U G --
-- --
-- S p e c --
-- --
-- Copyright (C) 1996-2003 Free Software Foundation, Inc. --
-- --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
-- ware Foundation; either version 2, or (at your option) any later ver- --
-- sion. GNAT is distributed in the hope that it will be useful, but WITH- --
-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY --
-- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License --
-- for more details. You should have received a copy of the GNU General --
-- Public License distributed with GNAT; see file COPYING. If not, write --
-- to the Free Software Foundation, 59 Temple Place - Suite 330, Boston, --
-- MA 02111-1307, USA. --
-- --
-- GNAT was originally developed by the GNAT team at New York University. --
-- Extensive contributions were provided by Ada Core Technologies Inc. --
-- --
------------------------------------------------------------------------------
-- Expand routines for generation of special declarations used by the
-- debugger. In accordance with the Dwarf 2.2 specification, certain
-- type names are encoded to provide information to the debugger.
with Types; use Types;
with Uintp; use Uintp;
package Exp_Dbug is
-----------------------------------------------------
-- Encoding and Qualification of Names of Entities --
-----------------------------------------------------
-- This section describes how the names of entities are encoded in
-- the generated debugging information.
-- An entity in Ada has a name of the form X.Y.Z ... E where X,Y,Z
-- are the enclosing scopes (not including Standard at the start).
-- The encoding of the name follows this basic qualified naming scheme,
-- where the encoding of individual entity names is as described in
-- Namet (i.e. in particular names present in the original source are
-- folded to all lower case, with upper half and wide characters encoded
-- as described in Namet). Upper case letters are used only for entities
-- generated by the compiler.
-- There are two cases, global entities, and local entities. In more
-- formal terms, local entities are those which have a dynamic enclosing
-- scope, and global entities are at the library level, except that we
-- always consider procedures to be global entities, even if they are
-- nested (that's because at the debugger level a procedure name refers
-- to the code, and the code is indeed a global entity, including the
-- case of nested procedures.) In addition, we also consider all types
-- to be global entities, even if they are defined within a procedure.
-- The reason for treating all type names as global entities is that
-- a number of our type encodings work by having related type names,
-- and we need the full qualification to keep this unique.
-- For global entities, the encoded name includes all components of the
-- fully expanded name (but omitting Standard at the start). For example,
-- if a library level child package P.Q has an embedded package R, and
-- there is an entity in this embedded package whose name is S, the encoded
-- name will include the components p.q.r.s.
-- For local entities, the encoded name only includes the components
-- up to the enclosing dynamic scope (other than a block). At run time,
-- such a dynamic scope is a subprogram, and the debugging formats know
-- about local variables of procedures, so it is not necessary to have
-- full qualification for such entities. In particular this means that
-- direct local variables of a procedure are not qualified.
-- As an example of the local name convention, consider a procedure V.W
-- with a local variable X, and a nested block Y containing an entity
-- Z. The fully qualified names of the entities X and Z are:
-- V.W.X
-- V.W.Y.Z
-- but since V.W is a subprogram, the encoded names will end up
-- encoding only
-- x
-- y.z
-- The separating dots are translated into double underscores.
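-- As an illustration only (this is not GNAT code), the qualification
-- rules above can be sketched in Python. The function name encode_name
-- and its dynamic_scope_index parameter are hypothetical, and the Namet
-- encoding of upper half and wide characters is deliberately ignored.

```python
def encode_name(components, dynamic_scope_index=None):
    """Encode a dotted Ada name given as a list of components.

    components: the fully qualified name without Standard,
        e.g. ["V", "W", "Y", "Z"] for V.W.Y.Z.
    dynamic_scope_index: index of the innermost enclosing subprogram
        for a local entity, or None for a global entity.
    """
    if dynamic_scope_index is None:
        kept = components                      # global: keep every component
    else:
        kept = components[dynamic_scope_index + 1:]  # local: drop up to the subprogram
    # Source names are folded to lower case; dots become double underscores.
    return "__".join(part.lower() for part in kept)


print(encode_name(["P", "Q", "R", "S"]))      # p__q__r__s
print(encode_name(["V", "W", "X"], 1))        # x
print(encode_name(["V", "W", "Y", "Z"], 1))   # y__z
```

-- The three calls reproduce the examples given in the text above.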
-----------------------------
-- Handling of Overloading --
-----------------------------
-- The above scheme is incomplete with respect to overloaded
-- subprograms, since overloading can legitimately result in a
-- case of two entities with exactly the same fully qualified names.
-- To distinguish between entries in a set of overloaded subprograms,
-- the encoded names are serialized by adding one of the suffixes:
-- $nn (dollar sign)
-- __nn (two underscores)
-- where nn is a serial number (2 for the second overloaded function,
-- 3 for the third, etc.). We use $ if this symbol is allowed, and
-- double underscore if it is not. In the remaining examples in this
-- section, we use a $ sign, but the $ is replaced by __ throughout
-- these examples if $ sign is not available. A suffix of $1 is
-- always omitted (i.e. no suffix implies the first instance).
-- These names are prefixed by the normal full qualification. So
-- for example, the third instance of the subprogram qrs in package
-- yz would have the name:
-- yz__qrs$3
-- A more subtle case arises with entities declared within overloaded
-- subprograms. If we have two overloaded subprograms, and both declare
-- an entity xyz, then the fully expanded name of the two xyz's is the
-- same. To distinguish these, we add the same __n suffix at the end of
-- the inner entity names.
-- In more complex cases, we can have multiple levels of overloading,
-- and we must make sure to distinguish which final declarative region
-- we are talking about. For this purpose, we use a more complex suffix
-- which has the form:
-- $nn_nn_nn ...
-- where the nn values are the homonym numbers as needed for any of
-- the qualifying entities, separated by a single underscore. If all
-- the nn values are 1, the suffix is omitted. Otherwise the suffix
-- is present (including any values of 1). The following example
-- shows how this suffixing works.
-- package body Yz is
-- procedure Qrs is -- Name is yz__qrs
-- procedure Tuv is ... end; -- Name is yz__qrs__tuv
-- begin ... end Qrs;
-- procedure Qrs (X: Int) is -- Name is yz__qrs$2
-- procedure Tuv is ... end; -- Name is yz__qrs__tuv$2_1
-- procedure Tuv (X: Int) is -- Name is yz__qrs__tuv$2_2
-- begin ... end Tuv;
-- procedure Tuv (X: Float) is -- Name is yz__qrs__tuv$2_3
-- type m is new float; -- Name is yz__qrs__tuv__m$2_3
-- begin ... end Tuv;
-- begin ... end Qrs;
-- end Yz;
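-- A minimal Python sketch of the suffix rule, for illustration only.
-- The name homonym_suffix is hypothetical, and the sketch assumes the
-- caller supplies the homonym numbers of the overloadable entities in
-- the qualified name, outermost first (packages and types contribute
-- no number, as in the example above).

```python
def homonym_suffix(homonym_numbers, dollar_allowed=True):
    """Build the overloading suffix from a list of homonym numbers."""
    # If every number is 1, no suffix is generated at all.
    if all(n == 1 for n in homonym_numbers):
        return ""
    lead = "$" if dollar_allowed else "__"
    # Otherwise all numbers appear, including any values of 1.
    return lead + "_".join(str(n) for n in homonym_numbers)


print("yz__qrs__tuv" + homonym_suffix([1, 1]))       # yz__qrs__tuv
print("yz__qrs" + homonym_suffix([2]))               # yz__qrs$2
print("yz__qrs__tuv" + homonym_suffix([2, 1]))       # yz__qrs__tuv$2_1
print("yz__qrs__tuv__m" + homonym_suffix([2, 3]))    # yz__qrs__tuv__m$2_3
```

-- Passing dollar_allowed=False yields the double underscore form, for
-- example yz__qrs__tuv__2_1.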
--------------------
-- Operator Names --
--------------------
-- The above rules applied to operator names would result in names
-- with quotation marks, which are not typically allowed by assemblers
-- and linkers, and even if allowed would be odd and hard to deal with.
-- To avoid this problem, operator names are encoded as follows:
-- Oabs abs
-- Oand and
-- Omod mod
-- Onot not
-- Oor or
-- Orem rem
-- Oxor xor
-- Oeq =
-- One /=
-- Olt <
-- Ole <=
-- Ogt >
-- Oge >=
-- Oadd +
-- Osubtract -
-- Oconcat &
-- Omultiply *
-- Odivide /
-- Oexpon **
-- These names are prefixed by the normal full qualification, and
-- suffixed by the overloading identification. So for example, the
-- second operator "=" defined in package Extra.Messages would
-- have the name:
-- extra__messages__Oeq__2
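-- For illustration, the operator table above can be written as a simple
-- lookup. This is a sketch, not GNAT code; OPERATOR_ENCODINGS and
-- encode_operator are hypothetical names.

```python
# Operator symbol (without quotation marks) -> encoded name, per the table above.
OPERATOR_ENCODINGS = {
    "abs": "Oabs", "and": "Oand", "mod": "Omod", "not": "Onot",
    "or": "Oor", "rem": "Orem", "xor": "Oxor",
    "=": "Oeq", "/=": "One", "<": "Olt", "<=": "Ole", ">": "Ogt", ">=": "Oge",
    "+": "Oadd", "-": "Osubtract", "&": "Oconcat",
    "*": "Omultiply", "/": "Odivide", "**": "Oexpon",
}


def encode_operator(symbol):
    """Return the encoded name for an operator symbol such as "=" or "abs"."""
    return OPERATOR_ENCODINGS[symbol.lower()]


# Second "=" in package Extra.Messages, using the double underscore suffix.
name = "__".join(["extra", "messages", encode_operator("=")]) + "__2"
print(name)  # extra__messages__Oeq__2
```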
----------------------------------
-- Resolving Other Name Clashes --
----------------------------------
-- It might be thought that the above scheme is complete, but in Ada 95,
-- full qualification is insufficient to uniquely identify an entity
-- in the program, even if it is not an overloaded subprogram. There
-- are two possible confusions:
-- a.b
-- interpretation 1: entity b in body of package a
-- interpretation 2: child procedure b of package a
-- a.b.c
-- interpretation 1: entity c in child package a.b
-- interpretation 2: entity c in nested package b in body of a
-- It is perfectly legal in both cases for both interpretations to
-- be valid within a single program. This is a bit of a surprise since
-- certainly in Ada 83, full qualification was sufficient, but not in
-- Ada 95. The result is that the above scheme can result in duplicate
-- names. This would not be so bad if the effect were just restricted
-- to debugging information, but in fact in both the above cases, it
-- is possible for both symbols to be external names, and so we have
-- a real problem of name clashes.
-- To deal with this situation, we provide two additional encoding
-- rules for names:
-- First, all library subprogram names are preceded by the string
-- _ada_ (which causes no duplications, since normal Ada names can
-- never start with an underscore). This not only solves the first
-- case of duplication, but also solves another pragmatic problem
-- which is that otherwise Ada procedures can generate names that
-- clash with existing system function names. Most notably, we can
-- have clashes in the case of procedure Main with the C main that
-- in some systems is always present.
-- Second, for the case where nested packages declared in package
-- bodies can cause trouble, we add a suffix which shows which
-- entities in the list are body-nested packages, i.e. packages
-- whose spec is within a package body. The rules are as follows,
-- given a list of names in a qualified name name1.name2....
-- If none are body-nested package entities, then there is no suffix
-- If at least one is a body-nested package entity, then the suffix
-- is X followed by a string of b's and n's (b = body-nested package
-- entity, n = not a body-nested package).
-- There is one element in this string for each entity in the encoded
-- expanded name except the first (the rules are such that the first
-- entity of the encoded expanded name can never be a body-nested
-- package). Trailing n's are omitted, as is the last b (there must
-- be at least one b, or we would not be generating a suffix at all).
-- For example, suppose we have
-- package x is
-- pragma Elaborate_Body;
-- m1 : integer; -- #1
-- end x;
-- package body x is
-- package y is m2 : integer; end y; -- #2
-- package body y is
-- package z is r : integer; end z; -- #3
-- end;
-- m3 : integer; -- #4
-- end x;
-- package x.y is
-- pragma Elaborate_Body;
-- m2 : integer; -- #5
-- end x.y;
-- package body x.y is
-- m3 : integer; -- #6
-- procedure j is -- #7
-- package k is
-- z : integer; -- #8
-- end k;
-- begin
-- null;
-- end j;
-- end x.y;
-- procedure x.m3 is begin null; end; -- #9
-- Then the encodings would be:
-- #1. x__m1 (no BNPE's in sight)
-- #2. x__y__m2X (y is a BNPE)
-- #3. x__y__z__rXb (y is a BNPE, so is z)
-- #4. x__m3 (no BNPE's in sight)
-- #5. x__y__m2 (no BNPE's in sight)
-- #6. x__y__m3 (no BNPE's in sight)
-- #7. x__y__j (no BNPE's in sight)
-- #8. k__z (no BNPE's, only up to procedure)
-- #9. _ada_x__m3 (library level subprogram)
-- Note that we have instances here of both kinds of potential name
-- clashes, and the above examples show how the encodings avoid the
-- clash as follows:
-- Lines #4 and #9 both refer to the entity x.m3, but #9 is a library
-- level subprogram, so it is preceded by the string _ada_ which acts
-- to distinguish it from the package body entity.
-- Lines #2 and #5 both refer to the entity x.y.m2, but the first
-- instance is inside the body-nested package y, so there is an X
-- suffix to distinguish it from the child library entity.
-- Note that enumeration literals never need Xb type suffixes, since
-- they are never referenced using global external names.
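-- The body-nested package suffix rule can be sketched as follows. This
-- is an illustrative Python model, not GNAT code; bnpe_suffix is a
-- hypothetical name, and the caller is assumed to know which entities
-- in the encoded expanded name are body-nested packages.

```python
def bnpe_suffix(is_body_nested):
    """Compute the X suffix for an encoded expanded name.

    is_body_nested: one boolean per entity of the encoded expanded name,
        first entity included (it is skipped, since it can never be a
        body-nested package).
    """
    flags = is_body_nested[1:]
    if not any(flags):
        return ""                       # no body-nested package: no suffix
    marks = "".join("b" if flag else "n" for flag in flags)
    marks = marks.rstrip("n")           # trailing n's are omitted
    marks = marks[:-1]                  # the last b is omitted as well
    return "X" + marks


print("x__m1" + bnpe_suffix([False, False]))                    # x__m1
print("x__y__m2" + bnpe_suffix([False, True, False]))           # x__y__m2X
print("x__y__z__r" + bnpe_suffix([False, True, True, False]))   # x__y__z__rXb
```

-- The outputs match encodings #1, #2 and #3 in the list above.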
---------------------
-- Interface Names --
---------------------
-- Note: if an interface name is present, then the external name
-- is taken from the specified interface name. Given the current
-- limitations of the gcc backend, this means that the debugging
-- name is also set to the interface name, but conceptually, it
-- would be possible (and indeed desirable) to have the debugging
-- information still use the Ada name as qualified above, so we
-- still fully qualify the name in the front end.
-------------------------------------
-- Encodings Related to Task Types --
-------------------------------------
-- Each task object defined by a single task declaration is associated
-- with a prefix that is used to qualify procedures defined in that
-- task. Given
--
-- package body P is
-- task body TaskObj is
-- procedure F1 is ... end;
-- begin
-- B;
-- end TaskObj;
-- end P;
--
-- The name of subprogram TaskObj.F1 is encoded as p__taskobjTK__f1.
-- The body, B, is contained in a subprogram whose name is
-- p__taskobjTKB.
------------------------------------------
-- Encodings Related to Protected Types --
------------------------------------------
-- Each protected type has an associated record type, that describes
-- the actual layout of the private data. In addition to the private
-- components of the type, the Corresponding_Record_Type includes one
-- component of type Protection, which is the actual lock structure.
-- The run-time size of the protected type is the size of the corres-
-- ponding record.
-- For a protected type prot, the Corresponding_Record_Type is encoded
-- as protV.
-- The operations of a protected type are encoded as follows: each
-- operation results in two subprograms, a locking one that is called
-- from outside of the object, and a non-locking one that is used for
-- calls from other operations on the same object. The locking operation
-- simply acquires the lock, and then calls the non-locking version.
-- The names of all of these have a prefix constructed from the name of
-- the type, the string "PT", and a suffix which is P or N, depending on
-- whether this is the protected/non-locking version of the operation.
-- Operations generated for protected entries follow the same encoding.
-- Each entry results in two subprograms: a procedure that holds the
-- entry body, and a function that holds the evaluation of the barrier.
-- The names of these subprograms include the prefix 'E' or 'B' res-
-- pectively. The names also include a numeric suffix to render them
-- unique in the presence of overloaded entries.
-- Given the declaration:
-- protected type Lock is
-- function Get return Integer;
-- procedure Set (X: Integer);
-- entry Update (Val : Integer);
-- private
-- Value : Integer := 0;
-- end Lock;
-- the following operations are created:
-- lockPT_getN
-- lockPT_getP
-- lockPT_setN
-- lockPT_setP
-- lockPT_update1sE
-- lockPT_update2sB
----------------------------------------------------
-- Conversion between Entities and External Names --
----------------------------------------------------
No_Dollar_In_Label : constant Boolean := True;
-- True iff the target does not allow dollar signs ("$") in external names
-- ??? We want to migrate all platforms to use the same convention.
-- As a first step, we force this constant to always be True. This
-- constant will eventually be deleted after we have verified that
-- the migration does not cause any unforeseen adverse impact.
-- We chose "__" because it is supported on all platforms, which is
-- not the case of "$".
procedure Get_External_Name
(Entity : Entity_Id;
Has_Suffix : Boolean);
-- Set Name_Buffer and Name_Len to the external name of Entity.
-- The external name is the Interface_Name, if specified, unless
-- the entity has an address clause or a suffix.
--
-- If the Interface_Name is not present, or not used, the external name
-- is the concatenation of:
--
-- - the string "_ada_", if the entity is a library subprogram,
-- - the names of any enclosing scopes, each followed by "__"
-- (or "X_" if the next entity is a subunit),
-- - the name of the entity
-- - the string "$" (or "__" if target does not allow "$"), followed
-- by homonym suffix, if the entity is an overloaded subprogram
-- or is defined within an overloaded subprogram.
procedure Get_External_Name_With_Suffix
(Entity : Entity_Id;
Suffix : String);
-- Set Name_Buffer and Name_Len to the external name of Entity.
-- If Suffix is the empty string the external name is as above,
-- otherwise the external name is the concatenation of:
--
-- - the string "_ada_", if the entity is a library subprogram,
-- - the names of any enclosing scopes, each followed by "__"
-- (or "X_" if the next entity is a subunit),
-- - the name of the entity
-- - the string "$" (or "__" if target does not allow "$"), followed
-- by homonym suffix, if the entity is an overloaded subprogram
-- or is defined within an overloaded subprogram.
-- - the string "___" followed by Suffix
--
-- If this procedure is called in the ASIS mode, it does nothing. See the
-- comments in the body for more details.
--------------------------------------------
-- Subprograms for Handling Qualification --
--------------------------------------------
procedure Qualify_Entity_Names (N : Node_Id);
-- Given a node N, that represents a block, subprogram body, or package
-- body or spec, or protected or task type, sets a fully qualified name
-- for the defining entity of given construct, and also sets fully
-- qualified names for all enclosed entities of the construct (using
-- First_Entity/Next_Entity). Note that the actual modifications of the
-- names is postponed till a subsequent call to Qualify_All_Entity_Names.
-- Note: this routine does not deal with prepending _ada_ to library
-- subprogram names. The reason for this is that we only prepend _ada_
-- to the library entity itself, and not to names built from this name.
procedure Qualify_All_Entity_Names;
-- When Qualify_Entity_Names is called, no actual name changes are made,
-- i.e. the actual calls to Qualify_Entity_Name are deferred until a call
-- is made to this procedure. The reason for this deferral is that when
-- names are changed semantic processing may be affected. By deferring
-- the changes till just before gigi is called, we avoid any concerns
-- about such effects. Gigi itself does not use the names except for
-- output of names for debugging purposes (which is why we are doing
-- the name changes in the first place).
-- Note: the routines Get_Unqualified_[Decoded]_Name_String in Namet
-- are useful to remove qualification from a name qualified by the
-- call to Qualify_All_Entity_Names.
--------------------------------
-- Handling of Numeric Values --
--------------------------------
-- All numeric values here are encoded as strings of decimal digits.
-- Only integer values need to be encoded. A negative value is encoded
-- as the corresponding positive value followed by a lower case m for
-- minus to indicate that the value is negative (e.g. 2m for -2).
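-- A small Python sketch of this numeric convention, for illustration
-- only; encode_int and decode_int are hypothetical helper names.

```python
def encode_int(value):
    """Encode an integer as decimal digits, with a trailing 'm' if negative."""
    return f"{-value}m" if value < 0 else str(value)


def decode_int(text):
    """Inverse of encode_int."""
    return -int(text[:-1]) if text.endswith("m") else int(text)


print(encode_int(-2))     # 2m
print(encode_int(17))     # 17
print(decode_int("2m"))   # -2
```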
-------------------------
-- Type Name Encodings --
-------------------------
-- In the following typ is the name of the type as normally encoded by
-- the debugger rules, i.e. a non-qualified name, all in lower case,
-- with standard encoding of upper half and wide characters.
------------------------
-- Encapsulated Types --
------------------------
-- In some cases, the compiler encapsulates a type by wrapping it in
-- a structure. For example, this is used when a size or alignment
-- specification requires a larger type. Consider:
-- type y is mod 2 ** 64;
-- for y'size use 256;
-- In this case the compiler generates a structure type y___PAD, which
-- has a single field whose name is F. This single field is 64 bits
-- long and contains the actual value.
-- A similar encapsulation is done for some packed array types,
-- in which case the structure type is y___LJM and the field name
-- is OBJECT.
-- When the debugger sees an object of a type whose name has a
-- suffix not otherwise mentioned in this specification, the type
-- is a record containing a single field, and the name of that field
-- is all upper-case letters, it should look inside to get the value
-- of the field, and neither the outer structure name, nor the
-- field name should appear when the value is printed.
-----------------------
-- Fixed-Point Types --
-----------------------
-- Fixed-point types are encoded using a suffix that indicates the
-- delta and small values. The actual type itself is a normal
-- integer type.
-- typ___XF_nn_dd
-- typ___XF_nn_dd_nn_dd
-- The first form is used when small = delta. The value of delta (and
-- small) is given by the rational nn/dd, where nn and dd are decimal
-- integers.
--
-- The second form is used if the small value is different from the
-- delta. In this case, the first nn/dd rational value is for delta,
-- and the second value is for small.
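-- As an illustration of how a debugger might read this suffix, here is
-- a Python sketch. It is not taken from any debugger; parse_fixed_point
-- is a hypothetical name and the type names in the examples are made up.

```python
import re
from fractions import Fraction


def parse_fixed_point(name):
    """Return (typ, delta, small) for a typ___XF_... name, or None."""
    match = re.fullmatch(r"(.+?)___XF_(\d+)_(\d+)(?:_(\d+)_(\d+))?", name)
    if match is None:
        return None
    typ, n1, d1, n2, d2 = match.groups()
    delta = Fraction(int(n1), int(d1))
    # First form: small = delta. Second form: small is the second rational.
    small = delta if n2 is None else Fraction(int(n2), int(d2))
    return typ, delta, small


print(parse_fixed_point("money___XF_1_100"))
print(parse_fixed_point("volt___XF_1_10_1_16"))
```

-- The first call yields delta = small = 1/100. The second yields
-- delta = 1/10 and small = 1/16.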
------------------------------
-- VAX Floating-Point Types --
------------------------------
-- Vax floating-point types are represented at run time as integer
-- types, which are treated specially by the code generator. Their
-- type names are encoded with the following suffix:
-- typ___XFF
-- typ___XFD
-- typ___XFG
-- representing the Vax F Float, D Float, and G Float types. The
-- debugger must treat these specially. In particular, printing
-- these values can be achieved using the debug procedures that
-- are provided in package System.Vax_Float_Operations:
-- procedure Debug_Output_D (Arg : D);
-- procedure Debug_Output_F (Arg : F);
-- procedure Debug_Output_G (Arg : G);
-- These three procedures take a Vax floating-point argument, and
-- output a corresponding decimal representation to standard output
-- with no terminating line return.
--------------------
-- Discrete Types --
--------------------
-- Discrete types are coded with a suffix indicating the range in
-- the case where one or both of the bounds are discriminants or
-- variable.
-- Note: at the current time, we also encode compile time known
-- bounds if they do not match the natural machine type bounds,
-- but this may be removed in the future, since it is redundant
-- for most debugging formats. However, we do not ever need XD
-- encoding for enumeration base types, since here it is always
-- clear what the bounds are from the total number of enumeration
-- literals, and of course we do not need to encode the dummy XR
-- types generated for renamings.
-- typ___XD
-- typ___XDL_lowerbound
-- typ___XDU_upperbound
-- typ___XDLU_lowerbound__upperbound
-- If a discrete type is a natural machine type (i.e. its bounds
-- correspond in a natural manner to its size), then it is left
-- unencoded. The above encoding forms are used when there is a
-- constrained range that does not correspond to the size or that
-- has discriminant references or other compile time known bounds.
-- The first form is used if both bounds are dynamic, in which case
-- two constant objects are present whose names are typ___L and
-- typ___U in the same scope as typ, and the values of these constants
-- indicate the bounds. As far as the debugger is concerned, these
-- are simply variables that can be accessed like any other variables.
-- In the enumeration case, these values correspond to the Enum_Rep
-- values for the lower and upper bounds.
-- The second form is used if the upper bound is dynamic, but the
-- lower bound is either constant or depends on a discriminant of
-- the record with which the type is associated. The upper bound
-- is stored in a constant object of name typ___U as previously
-- described, but the lower bound is encoded directly into the
-- name as either a decimal integer, or as the discriminant name.
-- The third form is similarly used if the lower bound is dynamic,
-- but the upper bound is compile time known or a discriminant
-- reference, in which case the lower bound is stored in a constant
-- object of name typ___L, and the upper bound is encoded directly
-- into the name as either a decimal integer, or as the discriminant
-- name.
-- The fourth form is used if both bounds are discriminant references
-- or compile time known values, with the encoding first for the lower
-- bound, then for the upper bound, as previously described.
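-- The four forms can be told apart mechanically. The following Python
-- sketch is illustrative only; parse_discrete is a hypothetical name,
-- and None is used here to mean that the bound must be read from the
-- constant object typ___L or typ___U.

```python
import re


def _bound(text):
    """Decode a bound: a decimal integer (postfix m = negative) or a discriminant name."""
    if re.fullmatch(r"\d+m?", text):
        return -int(text[:-1]) if text.endswith("m") else int(text)
    return text


def parse_discrete(name):
    """Return (typ, lower, upper) for a typ___XD... name, or None."""
    typ, marker, rest = name.rpartition("___XD")
    if not marker:
        return None
    lower = upper = None                 # None: value held in typ___L / typ___U
    if rest.startswith("LU_"):
        low_text, high_text = rest[3:].split("__", 1)
        lower, upper = _bound(low_text), _bound(high_text)
    elif rest.startswith("L_"):
        lower = _bound(rest[2:])
    elif rest.startswith("U_"):
        upper = _bound(rest[2:])
    elif rest:
        raise ValueError(f"unrecognized XD suffix in {name!r}")
    return typ, lower, upper


print(parse_discrete("t___XD"))             # ('t', None, None)
print(parse_discrete("t___XDL_5m"))         # ('t', -5, None)
print(parse_discrete("t___XDU_10"))         # ('t', None, 10)
print(parse_discrete("t___XDLU_1__n"))      # ('t', 1, 'n')
```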
-------------------
-- Modular Types --
-------------------
-- A type declared
-- type x is mod N;
-- is encoded as a subrange of an unsigned base type with lower bound
-- 0 and upper bound N - 1. That is, there is no name encoding. We use
-- the standard encodings provided by the debugging format. Thus
-- we give these types a non-standard interpretation: the standard
-- interpretation of our encoding would not, in general, imply that
-- arithmetic on type x was to be performed modulo N (especially not
-- when N is not a power of 2).
------------------
-- Biased Types --
------------------
-- Only discrete types can be biased, and the fact that they are
-- biased is indicated by a suffix of the form:
-- typ___XB_lowerbound__upperbound
-- Here lowerbound and upperbound are decimal integers, with the
-- usual (postfix "m") encoding for negative numbers. Biased
-- types are only possible where the bounds are compile time
-- known, and the values are represented as unsigned offsets
-- from the lower bound given. For example:
-- type Q is range 10 .. 15;
-- for Q'size use 3;
-- The size clause will force values of type Q in memory to be
-- stored in biased form (e.g. 11 will be represented by the
-- bit pattern 001).
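-- A Python sketch of how a debugger might recover values of a biased
-- type, for illustration only. The names parse_biased and unbias are
-- hypothetical, and the sketch assumes compile time known integer
-- bounds as stated above.

```python
def _int(text):
    """Decode a decimal integer with the postfix 'm' convention."""
    return -int(text[:-1]) if text.endswith("m") else int(text)


def parse_biased(name):
    """Return (typ, lower, upper) for a typ___XB_lower__upper name."""
    typ, marker, rest = name.rpartition("___XB_")
    if not marker:
        return None
    low_text, high_text = rest.split("__", 1)
    return typ, _int(low_text), _int(high_text)


def unbias(stored, lower):
    """Stored values are unsigned offsets from the lower bound."""
    return stored + lower


typ, lower, upper = parse_biased("q___XB_10__15")
print(typ, lower, upper)           # q 10 15
print(unbias(0b001, lower))        # 11
print(format(15 - lower, "03b"))   # 101
```

-- With bounds 10 .. 15 the stored offsets run from 0 to 5, which is why
-- three bits are sufficient for type Q.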
----------------------------------------------
-- Record Types with Variable-Length Fields --
----------------------------------------------
-- The debugging formats do not fully support these types, and indeed
-- some formats simply generate no useful information at all for such
-- types. In order to provide information for the debugger, gigi creates
-- a parallel type in the same scope with one of the names
-- type___XVE
-- type___XVU
-- The former name is used for a record and the latter for the union
-- that is made for a variant record (see below) if that record or
-- union has a field of variable size or if the record or union itself
-- has a variable size. These encodings suffix any other encodings that
-- might be suffixed to the type name.
-- The idea here is to provide all the needed information to interpret
-- objects of the original type in the form of a "fixed up" type, which
-- is representable using the normal debugging information.
-- There are three cases to be dealt with. First, some fields may have
-- variable positions because they appear after variable-length fields.
-- To deal with this, we encode *all* the field bit positions of the
-- special ___XV type in a non-standard manner.
-- The idea is to encode not the position, but rather information
-- that allows computing the position of a field from the position
-- of the previous field. The algorithm for computing the actual
-- positions of all fields and the length of the record is as
-- follows. In this description, let P represent the current
-- bit position in the record.
-- 1. Initialize P to 0.
-- 2. For each field in the record,
-- 2a. If an alignment is given (see below), then round P
-- up, if needed, to the next multiple of that alignment.
-- 2b. If a bit position is given, then increment P by that
-- amount (that is, treat it as an offset from the end of the
-- preceding field).
-- 2c. Assign P as the actual position of the field.
-- 2d. Compute the length, L, of the represented field (see below)
-- and compute P'=P+L. Unless the field represents a variant part
-- (see below and also Variant Record Encoding), set P to P'.
-- The alignment, if present, is encoded in the field name of the
-- record, which has a suffix:
-- fieldname___XVAnn
-- where the nn after the XVA indicates the alignment value in storage
-- units. This encoding is present only if an alignment is present.
-- The size of the record described by an XVE-encoded type (in bits)
-- is generally the maximum value attained by P' in step 2d above,
-- rounded up according to the record's alignment.
-- Second, the variable-length fields themselves are represented by
-- replacing the type by a special access type. The designated type
-- of this access type is the original variable-length type, and the
-- fact that this field has been transformed in this way is signalled
-- by encoding the field name as:
-- field___XVL
-- where field is the original field name. If a field is both
-- variable-length and also needs an alignment encoding, then the
-- encodings are combined using:
-- field___XVLnn
-- Note: the reason that we change the type is so that the resulting
-- type has no variable-length fields. At least some of the formats
-- used for debugging information simply cannot tolerate variable-
-- length fields, so the encoded information would get lost.
-- Third, in the case of a variant record, the special union
-- that contains the variants is replaced by a normal C union.
-- In this case, the positions are all zero.
-- Discriminants appear before any variable-length fields that depend
-- on them, with one exception. In some cases, a discriminant
-- governing the choice of a variant clause may appear in the list
-- of fields of an XVE type after the entry for the variant clause
-- itself (this can happen in the presence of a representation clause
-- for the record type in the source program). However, when this
-- happens, the discriminant's position may be determined by first
-- applying the rules described in this section, ignoring the variant
-- clause. As a result, discriminants can always be located
-- independently of the variable-length fields that depend on them.
-- The size of the ___XVE or ___XVU record or union is set to the
-- alignment (in bytes) of the original object so that the debugger
-- can calculate the size of the original type.
-- As an example of this encoding, consider the declarations:
-- type Q is array (1 .. V1) of Float; -- alignment 4
-- type R is array (1 .. V2) of Long_Float; -- alignment 8
-- type X is record
-- A : Character;
-- B : Float;
-- C : String (1 .. V3);
-- D : Float;
-- E : Q;
-- F : R;
-- G : Float;
-- end record;
-- The encoded type looks like:
-- type anonymousQ is access Q;
-- type anonymousR is access R;
-- type X___XVE is record
-- A : Character; -- position contains 0
-- B : Float; -- position contains 24
-- C___XVL : access String (1 .. V3); -- position contains 0
-- D___XVA4 : Float; -- position contains 0
-- E___XVL4 : anonymousQ; -- position contains 0
-- F___XVL8 : anonymousR; -- position contains 0
-- G : Float; -- position contains 0
-- end record;
-- Any bit sizes recorded for fields other than dynamic fields and
-- variants are honored as for ordinary records.
-- Notes:
-- 1) The B field could also have been encoded by using a position
-- of zero, and an alignment of 4, but in such a case, the coding by
-- position is preferred (since it takes up less space). We have used
-- the (illegal) notation access xxx as field types in the example
-- above.
-- 2) The E field does not actually need the alignment indication
-- but this may not be detected in this case by the conversion
-- routines.
-- 3) Our conventions do not cover all XVE-encoded records in which
-- some, but not all, fields have representation clauses. Such
-- records may, therefore, be displayed incorrectly by debuggers.
-- This situation is not common.
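-- The position algorithm of this section (steps 1 and 2a to 2d) can be
-- sketched in Python as shown here. This is an illustration, not code
-- from gigi or any debugger. The names layout, alignment_bits and
-- round_up are hypothetical. The sketch assumes 8-bit storage units,
-- that the caller already knows each field's run-time length in bits,
-- and, for the record X above, the made-up values V1 = 3, V2 = 2 and
-- V3 = 5, a 32-bit Float, a 64-bit Long_Float and a record alignment
-- of 8 storage units.

```python
import re

STORAGE_UNIT_BITS = 8  # assumption: 8-bit storage units


def alignment_bits(encoded_name):
    """Alignment from a ___XVAnn or ___XVLnn suffix, in bits, or None."""
    match = re.search(r"___XV[AL](\d+)$", encoded_name)
    return int(match.group(1)) * STORAGE_UNIT_BITS if match else None


def round_up(value, multiple):
    """Round value up to the next multiple (ceiling division)."""
    return -(-value // multiple) * multiple


def layout(fields, record_alignment_bits):
    """fields: (encoded name, encoded bit position, length in bits, is_variant)."""
    p = 0                                   # step 1
    max_end = 0
    positions = {}
    for name, encoded_position, length, is_variant in fields:
        align = alignment_bits(name)
        if align is not None:
            p = round_up(p, align)          # step 2a
        p += encoded_position               # step 2b
        positions[name] = p                 # step 2c
        end = p + length                    # step 2d: P' = P + L
        max_end = max(max_end, end)
        if not is_variant:
            p = end
    # Record size: maximum P', rounded up to the record's alignment.
    return positions, round_up(max_end, record_alignment_bits)


fields = [
    ("A", 0, 8, False),
    ("B", 24, 32, False),
    ("C___XVL", 0, 5 * 8, False),       # String (1 .. 5)
    ("D___XVA4", 0, 32, False),
    ("E___XVL4", 0, 3 * 32, False),     # 3 Float elements
    ("F___XVL8", 0, 2 * 64, False),     # 2 Long_Float elements
    ("G", 0, 32, False),
]
positions, size = layout(fields, record_alignment_bits=64)
print(positions["B"], positions["D___XVA4"], positions["G"], size)  # 32 128 384 448
```

-- Under these assumptions C ends at bit 104, so D is rounded up to bit
-- 128 by its XVA4 suffix, while B reaches bit 32 through its encoded
-- position of 24 rather than through an alignment.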
-----------------------
-- Base Record Types --
-----------------------
-- Under certain circumstances, debuggers need two descriptions
-- of a record type, one that gives the actual details of the
-- base type's structure (as described elsewhere in these
-- comments) and one that may be used to obtain information
-- about the particular subtype and the size of the objects
-- being typed. In such cases the compiler will substitute a
-- type whose name is typically compiler-generated and
-- irrelevant except as a key for obtaining the actual type.
-- Specifically, if this name is x, then we produce a record
-- type named x___XVS consisting of one field. The name of
-- this field is that of the actual type being encoded, which
-- we'll call y (the type of this single field is arbitrary).
-- Both x and y may have corresponding ___XVE types.
-- The size of the objects typed as x should be obtained from
-- the structure of x (and x___XVE, if applicable) as for
-- ordinary types unless there is a variable named x___XVZ, which,
-- if present, will hold the size (in bits) of x.
-- The type x will either be a subtype of y (see also Subtypes
-- of Variant Records, below) or will contain no fields at
-- all. The layout, types, and positions of these fields will
-- be accurate, if present. (Currently, however, the GDB
-- debugger makes no use of x except to determine its size).
-- Among other uses, XVS types are sometimes used to encode
-- unconstrained types. For example, given
--
-- subtype Int is INTEGER range 0..10;
-- type T1 (N: Int := 0) is record
-- F1: String (1 .. N);
-- end record;
-- type AT1 is array (INTEGER range <>) of T1;
--
-- the element type for AT1 might have a type defined as if it had
-- been written:
--
-- type at1___C_PAD is record null; end record;
-- for at1___C_PAD'Size use 16 * 8;
--
-- and there would also be
--
-- type at1___C_PAD___XVS is record t1: Integer; end record;
-- type t1 is ...
--
-- Had the subtype Int been dynamic:
--
-- subtype Int is INTEGER range 0 .. M; -- M a variable
--
-- Then the compiler would also generate a declaration whose effect
-- would be
--
-- at1___C_PAD___XVZ: constant Integer := 32 + M * 8 + padding term;
--
-- Not all unconstrained types are so encoded; the XVS
-- convention may be unnecessary for unconstrained types of
-- fixed size. However, this encoding is always necessary when
-- a subcomponent type (array element's type or record field's
-- type) is an unconstrained record type some of whose
-- components depend on discriminant values.
-----------------
-- Array Types --
-----------------
-- Since there is no way for the debugger to obtain the index subtypes
-- for an array type, we produce a type that has the name of the
-- array type followed by "___XA" and is a record whose field names
-- are the names of the types for the bounds. The type of each of
-- these fields is an integer type, which is meaningless.
-- To conserve space, we do not produce this type unless one of
-- the index types is either an enumeration type, has a variable
-- upper bound, has a lower bound different from the constant 1,
-- is a biased type, or is wider than "sizetype".
-- Given the full encoding of these types (see above description for
-- the encoding of discrete types), this means that all necessary
-- information for addressing arrays is available. In some
-- debugging formats, some or all of the bounds information may
-- be available redundantly, particularly in the fixed-point case,
-- but this information can in any case be ignored by the debugger.
----------------------------
-- Note on Implicit Types --
----------------------------
-- The compiler creates implicit type names in many situations where
-- a type is present semantically, but no specific name is present.
-- For example:
-- S : Integer range M .. N;
-- Here the subtype of S is not integer, but rather an anonymous
-- subtype of Integer. Where possible, the compiler generates names
-- for such anonymous types that are related to the type from which
-- the subtype is obtained as follows:
-- T name suffix
-- where name is the name from which the subtype is obtained, using
-- lower case letters and underscores, and suffix starts with an upper
-- case letter. For example, the name for the above declaration of S
-- might be:
-- TintegerS4b
-- If the debugger is asked to give the type of an entity and the type
-- has the form T name suffix, it is probably appropriate to just use
-- "name" in the response since this is what is meaningful to the
-- programmer.
-------------------------------------------------
-- Subprograms for Handling Encoded Type Names --
-------------------------------------------------
procedure Get_Encoded_Name (E : Entity_Id);
-- If the entity is a type name, store the external name of
-- the entity as in Get_External_Name, followed by three underscores
-- plus the type encoding in Name_Buffer with the length in Name_Len,
-- and an ASCII.NUL character stored following the name.
-- Otherwise set Name_Buffer and Name_Len to hold the entity name.
--------------
-- Renaming --
--------------
-- Debugging information is generated for exception, object, package,
-- and subprogram renaming (generic renamings are not significant, since
-- generic templates are not relevant at debugging time).
-- Consider a renaming declaration of the form
-- x typ renames y;
-- There is one case in which no special debugging information is required,
-- namely the case of an object renaming where the backend allocates a
-- reference for the renamed variable, and the entity x is this reference.
-- The debugger can handle this case without any special processing or
-- encoding (it won't know it was a renaming, but that does not matter).
-- All other cases of renaming generate a dummy type definition for
-- an entity whose name is:
-- x___XR for an object renaming
-- x___XRE for an exception renaming
-- x___XRP for a package renaming
-- The name is fully qualified in the usual manner, i.e. qualified in
-- the same manner as the entity x would be. In the case of a package
-- renaming where x is a child unit, the qualification includes the
-- name of the parent unit, to disambiguate child units with the same
-- simple name and (of necessity) different parents.
-- Note: subprogram renamings are not encoded at the present time.
-- The type is an enumeration type with a single enumeration literal
-- that is an identifier which describes the renamed variable.
-- For the simple entity case, where y is an entity name,
-- the enumeration is of the form:
-- (y___XE)
-- i.e. the enumeration type has a single literal, whose name
-- matches the name y, with the XE suffix. The entity for this
-- enumeration literal is fully qualified in the usual manner.
-- All subprogram, exception, and package renamings fall into
-- this category, as well as simple object renamings.
-- For the object renaming case where y is a selected component or an
-- indexed component, the literal name is suffixed by additional fields
-- that give details of the components. The name starts as above with
-- a y___XE entity indicating the outer level variable. Then a series
-- of selections and indexing operations can be specified as follows:
-- Indexed component
-- A series of subscript values appears in sequence; the number of
-- values corresponds to the number of dimensions of the array. The
-- subscripts have one of the following two forms:
-- XSnnn
-- Here nnn is a constant value, encoded as a decimal
-- integer (pos value for enumeration type case). Negative
-- values have a trailing 'm' as usual.
-- XSe
-- Here e is the (unqualified) name of a constant entity in
-- the same scope as the renaming which contains the subscript
-- value.
-- Slice
-- For the slice case, we have two entries. The first is for
-- the lower bound of the slice, and has the form
-- XLnnn
-- XLe
-- Specifies the lower bound, using exactly the same encoding
-- as for an XS subscript as described above.
-- Then the upper bound appears in the usual XSnnn/XSe form
-- Selected component
-- For a selected component, we have a single entry
-- XRf
-- Here f is the field name for the selection
-- For an explicit dereference (.all), we have a single entry
-- XA
-- As an example, consider the declarations:
-- package p is
-- type q is record
-- m : string (2 .. 5);
-- end record;
--
-- type r is array (1 .. 10, 1 .. 20) of q;
--
-- g : r;
--
-- z : string renames g (1, 5).m (2 .. 3);
-- end p;
-- The generated type definition would appear as
-- type p__z___XR is
-- (p__g___XEXS1XS5XRmXL2XS3);
-- p__g___XE--------------------outer entity is g
-- XS1-----------------first subscript for g
-- XS5--------------second subscript for g
-- XRm-----------select field m
-- XL2--------lower bound of slice
-- XS3-----upper bound of slice
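-- As a further sketch built only from the rules above (a hypothetical
-- declaration, not actual compiler output), suppose package p also
-- contained the renaming
-- w : q renames g (2, 7);
-- and that this renaming is one that requires an encoding. The
-- generated type definition would then be expected to appear as
-- type p__w___XR is
-- (p__g___XEXS2XS7);
-- with one XS entry for each of the two dimensions of g.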
function Debug_Renaming_Declaration (N : Node_Id) return Node_Id;
-- The argument N is a renaming declaration. The result is a type
-- declaration as described in the above paragraphs. If no special
-- debug declaration is required, then Empty is returned.
---------------------------
-- Packed Array Encoding --
---------------------------
-- For every packed array, two types are created, and both appear in
-- the debugging output.
-- The original declared array type is a perfectly normal array type,
-- and its index bounds indicate the original bounds of the array.
-- The corresponding packed array type may be a modular type or an
-- array-of-bytes type (see Exp_Pakd for full details). This is the
-- type that is actually used in the generated code and for debugging
-- information for all objects of the packed type.
-- The name of the corresponding packed array type is:
-- ttt___XPnnn
-- where
-- ttt is the name of the original declared array
-- nnn is the component size in bits (1-31)
-- When the debugger sees that an object is of a type that is encoded
-- in this manner, it can use the original type to determine the bounds,
-- and the component size to determine the packing details.
-- Packed arrays are represented in tightly packed form, with no extra
-- bits between components. This is true even when the component size
-- is not a factor of the storage unit size, so that as a result it is
-- possible for components to cross storage unit boundaries.
-- The layout in storage is identical, regardless of whether the
-- implementation type is a modular type or an array-of-bytes type.
-- See Exp_Pakd for details of how these implementation types are used,
-- but for the purpose of the debugger, only the starting address of
-- the object in memory is significant.
-- The following example should show clearly how the packing works in
-- the little-endian and big-endian cases:
-- type B is range 0 .. 7;
-- for B'Size use 3;
-- type BA is array (0 .. 5) of B;
-- pragma Pack (BA);
-- BV : constant BA := (1,2,3,4,5,6);
-- Little endian case
-- BV'Address + 2 BV'Address + 1 BV'Address + 0
-- +-----------------+-----------------+-----------------+
-- | 0 0 0 0 0 0 1 1 | 0 1 0 1 1 0 0 0 | 1 1 0 1 0 0 0 1 |
-- +-----------------+-----------------+-----------------+
-- <---------> <-----> <---> <---> <-----> <---> <--->
-- unused bits BV(5) BV(4) BV(3) BV(2) BV(1) BV(0)
--
-- Big endian case
--
-- BV'Address + 0 BV'Address + 1 BV'Address + 2
-- +-----------------+-----------------+-----------------+
-- | 0 0 1 0 1 0 0 1 | 1 1 0 0 1 0 1 1 | 1 0 0 0 0 0 0 0 |
-- +-----------------+-----------------+-----------------+
-- <---> <---> <-----> <---> <---> <-----> <--------->
-- BV(0) BV(1) BV(2) BV(3) BV(4) BV(5) unused bits
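-- In this example the component size is 3 bits, so under the naming
-- rule given above the corresponding packed array type for BA would
-- be expected to have a name of the form
-- ba___XP3
-- (shown here without qualification; the lower-case spelling is an
-- assumption that follows the other examples in this unit).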
------------------------------------------------------
-- Subprograms for Handling Packed Array Type Names --
------------------------------------------------------
function Make_Packed_Array_Type_Name
(Typ : Entity_Id;
Csize : Uint)
return Name_Id;
-- This function is used in Exp_Pakd to create the name that is encoded
-- as described above. The entity Typ provides the name ttt, and the
-- value Csize is the component size that provides the nnn value.
--------------------------------------
-- Pointers to Unconstrained Arrays --
--------------------------------------
-- There are two kinds of pointers to arrays. The debugger can tell
-- which format is in use by the form of the type of the pointer.
-- Fat Pointers
-- Fat pointers are represented as a struct with two fields. This
-- struct has two distinguished field names:
-- P_ARRAY is a pointer to the array type. The name of this
-- type is the unconstrained type followed by "___XUA". This
-- array will have bounds which are the discriminants, and
-- hence are unparsable, but will give the number of
-- subscripts and the component type.
-- P_BOUNDS is a pointer to a struct, the name of whose type is the
-- unconstrained array name followed by "___XUB" and which has
-- fields of the form
-- LBn (n a decimal integer) lower bound of n'th dimension
-- UBn (n a decimal integer) upper bound of n'th dimension
-- The bounds may be any integral type. In the case of an
-- enumeration type, Enum_Rep values are used.
-- The debugging information will sometimes reference an anonymous
-- fat pointer type. Such types are given the name xxx___XUP, where
-- xxx is the name of the designated type. If the debugger is asked
-- to output such a type name, the appropriate form is "access xxx".
-- Thin Pointers
-- The value of a thin pointer is a pointer to the second field
-- of a structure with two fields. The name of this structure's
-- type is "arr___XUT", where "arr" is the name of the
-- unconstrained array type. Even though it actually points into
-- the middle of this structure, the thin pointer's type in debugging
-- information is pointer-to-arr___XUT.
-- The first field of arr___XUT is named BOUNDS, and has a type
-- named arr___XUB, with the structure described for such types
-- in fat pointers, as described above.
-- The second field of arr___XUT is named ARRAY, and contains
-- the actual array. Because this array has a dynamic size,
-- determined by the BOUNDS field that precedes it, all of the
-- information about arr___XUT is encoded in a parallel type named
-- arr___XUT___XVE, with fields BOUNDS and ARRAY___XVL. As for
-- previously described ___XVE types, ARRAY___XVL has
-- a pointer-to-array type. However, the array type in this case
-- is named arr___XUA and only its element type is meaningful,
-- just as described for fat pointers.
--------------------------------------
-- Tagged Types and Type Extensions --
--------------------------------------
-- A type C derived from a tagged type P has a field named "_parent"
-- that contains the fields inherited from P. The type of this
-- field is usually P (encoded as usual if it has a dynamic size),
-- but may be a more distant ancestor, if P is a null extension of
-- that type.
-- The type tag of a tagged type is a field named _tag, of type void*.
-- If the type is derived from another tagged type, its _tag field is
-- found in its _parent field.
-----------------------------
-- Variant Record Encoding --
-----------------------------
-- The variant part of a variant record is encoded as a single field
-- in the enclosing record, whose name is:
-- discrim___XVN
-- where discrim is the unqualified name of the variant. This field name
-- is built by gigi (not by code in this unit). In the case of an
-- Unchecked_Union record, this discriminant will not appear in the
-- record, and the debugger must proceed accordingly (basically it
-- can treat this case as it would a C union).
-- The type corresponding to this field has a name that is obtained
-- by concatenating the type name with the above string and is similar
-- to a C union, in which each member of the union corresponds to one
-- variant. However, unlike a C union, the size of the type may be
-- variable even if each of the components is of fixed size, since it
-- includes a computation of which variant is present. In that case,
-- it will be encoded as above and a type with the suffix "___XVN___XVU"
-- will be present.
-- The name of the union member is encoded to indicate the choices, and
-- is a string given by the following grammar:
-- union_name ::= {choice} | others_choice
-- choice ::= simple_choice | range_choice
-- simple_choice ::= S number
-- range_choice ::= R number T number
-- number ::= {decimal_digit} [m]
-- others_choice ::= O (upper case letter O)
-- The m in a number indicates a negative value. As an example of this
-- encoding scheme, the choice 1 .. 4 | 7 | -10 would be represented by
-- R1T4S7S10m
-- In the case of enumeration values, the values used are the
-- actual representation values in the case where an enumeration type
-- has an enumeration representation spec (i.e. they are values that
-- correspond to the use of the Enum_Rep attribute).
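-- As a further illustration (a sketch derived only from the grammar
-- above, not from actual compiler output), a hypothetical variant
-- when 2 | 5 .. 8 | -3 .. -1 =>
-- would yield the union member name
-- S2R5T8R3mT1m
-- since each range contributes R low T high, and each negative number
-- is written as its magnitude followed by m. A variant introduced by
-- "when others =>" would be named O.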
-- The type of the inner record is given by the name of the union
-- type (as above) concatenated with the above string. Since that
-- type may itself be variable-sized, it may also be encoded as above
-- with a new type with a further suffix of "___XVU".
-- As an example, consider:
-- type Var (Disc : Boolean := True) is record
-- M : Integer;
-- case Disc is
-- when True =>
-- R : Integer;
-- S : Integer;
-- when False =>
-- T : Integer;
-- end case;
-- end record;
-- V1 : Var;
-- In this case, the type var is represented as a struct with three
-- fields; the first two are "disc" and "m", representing the values
-- of these record components.
-- The third field is a union of two types, with field names S1 and O.
-- S1 is a struct with fields "r" and "s", and O is a struct with
-- the single field "t".
------------------------------------------------
-- Subprograms for Handling Variant Encodings --
------------------------------------------------
procedure Get_Variant_Encoding (V : Node_Id);
-- This procedure is called by Gigi with V being the variant node.
-- The corresponding encoding string is returned in Name_Buffer with
-- the length of the string in Name_Len, and an ASCII.NUL character
-- stored following the name.
---------------------------------
-- Subtypes of Variant Records --
---------------------------------
-- A subtype of a variant record is represented by a type in which the
-- union field from the base type is replaced by one of the possible
-- values. For example, if we have:
-- type Var (Disc : Boolean := True) is record
-- M : Integer;
-- case Disc is
-- when True =>
-- R : Integer;
-- S : Integer;
-- when False =>
-- T : Integer;
-- end case;
-- end record;
-- V1 : Var;
-- V2 : Var (True);
-- V3 : Var (False);
-- Here V2 for example is represented with a subtype whose name is
-- something like TvarS3b, which is a struct with three fields. The
-- first two fields are "disc" and "m" as for the base type, and
-- the third field is S1, which contains the fields "r" and "s".
-- The debugger should simply ignore structs with names of the form
-- corresponding to variants, and consider the fields inside as
-- belonging to the containing record.
-------------------------------------------
-- Character literals in Character Types --
-------------------------------------------
-- Character types are enumeration types at least one of whose
-- enumeration literals is a character literal. Enumeration literals
-- are usually simply represented using their identifier names. In
-- the case where an enumeration literal is a character literal, the
-- name is encoded as described in the following paragraph.
-- A name QUhh, where each 'h' is a lower-case hexadecimal digit,
-- stands for a character whose Unicode encoding is hh, and
-- QWhhhh likewise stands for a wide character whose encoding
-- is hhhh. The representation values are encoded as for ordinary
-- enumeration literals (and have no necessary relationship to the
-- values encoded in the names).
-- For example, given the type declaration
-- type x is (A, 'C', B);
-- the second enumeration literal would be named QU43 and the
-- value assigned to it would be 1.
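-- Similarly (a sketch that assumes only the rule stated above), the
-- character literal 'a', whose encoding is 16#61#, would be named
-- QU61, and a wide character literal whose encoding is 16#03C0# would
-- be named QW03c0, with lower-case hexadecimal digits in both cases.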
----------------------------
-- Effect of Optimization --
----------------------------
-- If the program is compiled with optimization on (e.g. -O1 switch
-- specified), then there may be variations in the output from the
-- above specification. In particular, objects may disappear from
-- the output. This includes not only constants and variables that
-- the program declares at the source level, but also the x___L and
-- x___U constants created to describe the lower and upper bounds of
-- subtypes with dynamic bounds. This means for example, that array
-- bounds may disappear if optimization is turned on. The debugger
-- is expected to recognize that these constants are missing and
-- deal as best it can with the limited information available.
end Exp_Dbug;