gdb: force lval_memory for value components with dynamic data location
Using the test program from the added test, I get this internal error:
$ ./gdb -q -nx --data-directory=data-directory testsuite/outputs/gdb.dwarf2/data-loc-cast/data-loc-cast -ex "with language fortran -- p (outer_type) g_outer_var" -batch
$1 = ( /home/smarchi/src/binutils-gdb/gdb/value.c:1697: internal-error: set_component_location: Assertion `this->lval () == lval_memory' failed.
The test added by this patch mimics a problem that was reported when
debugging a Fortran program. The situation is:
- The DWARF defines a Fortran-style array type, with a
DW_AT_data_location attribute.
- It then defines a structure type (outer_type) with a field
(outer_type::assoc) of that array type.
- Trying to cast a minimal symbol (g_outer_var) to that structure type
leads to the internal error shown above.
The use case for this is: for some reason, a variable of type S isn't
described in the debug info, but GDB still knows about type S. The user
is trying to say: I know that the variable at this address is an S, show
it to me.
A Fortran-style array doesn't hold the data directly. It's a structure
(a descriptor) that contains a pointer to the data and possible other
properties, like the length of the array. The DW_AT_data_location
attribute contains a DWARF expression that yields the location the
actual data, given the location of the descriptor. In GDB's type
system, this translates to a dynamic property of kind
DYN_PROP_DATA_LOCATION pointing to the DWARF expression. Before
instantiating a value with such a dynamic type, the dynamic type must
first be resolved, which is done by function resolve_dynamic_type. That
is, for each dynamic property, compute a concrete value for the specific
instance we're dealing with here. The result of resolve_dynamic_type is
a type that is just like the input type, but where all dynamic
properties now have concrete, constant values.
Here's the timeline of what happens when doing "p (outer_type)
g_outer_var":
1. We start in var_msym_value_operation::evaluate_for_cast.
2. We go in function value_cast, in this branch:
else if (arg2->lval () == lval_memory)
return value_at_lazy (to_type, arg2->address ());
... where to_type is the "outer_type" structure and arg2 is the
minsym. This effectively builds a value pretending that there
exists an instance of outer_type at the address of the minsym, which
is what we want. The resulting value is "lval_memory".
3. Somewhere inside value_at_lazy, there is a call to
resolve_dynamic_type:
#0 resolve_dynamic_type (type=0x7e0ff1cfd560, valaddr=..., addr=0x4340, in_frame=0x7bfff0b3f100) at /home/smarchi/src/binutils-gdb/gdb/gdbtypes.c:3011
#1 0x000055556687dcba in value_from_contents_and_address (type=0x7e0ff1cfd560, valaddr=0x0, address=0x4340, frame=...) at /home/smarchi/src/binutils-gdb/gdb/value.c:3669
#2 0x00005555667ec527 in get_value_at (type=0x7e0ff1cfd560, addr=0x4340, frame=..., lazy=1) at /home/smarchi/src/binutils-gdb/gdb/valops.c:992
#3 0x00005555667ec79f in value_at_lazy (type=0x7e0ff1cfd560, addr=0x4340, frame=...) at /home/smarchi/src/binutils-gdb/gdb/valops.c:1039
#4 0x00005555667e902b in value_cast (type=0x7e0ff1cfd560, arg2=0x7d0ff1c35540) at /home/smarchi/src/binutils-gdb/gdb/valops.c:645
This is good, it returns a structure type where the type of field
"assoc" has a constant DYN_PROP_DATA_LOCATION property that holds
the memory address where the data for this array resides.
4. Back in var_msym_value_operation::evaluate_for_cast, we do:
val = value_cast (to_type, val);
/* Don't allow e.g. '&(int)var_with_no_debug_info'. */
if (val->lval () == lval_memory)
{
if (val->lazy ())
val->fetch_lazy ();
val->set_lval (not_lval);
}
This is meant to make GDB behave more or less like C, where the
result of a cast is not an lvalue, of which you can't take the
address, for instance. I am not an expert in this area, but Pedro
explained that this lval thing in GDB actually conflates two things:
- where is the value (memory, register, only in GDB's mind, etc)
- is this an lvalue (is it assignable, can you take its address,
etc)
Here, we would ideally want to say that the value is not an lvalue,
but still say that it lives at a given address in memory. But since
the two concepts are conflated, we set it to "not_lval", which means
"not an lval and does not exist on target".
If there was a way to say "non-lvalue" and "in memory", I think that
the bug that follows would be hidden.
5. When printing the value, the value-printing code attempts to fetch
the "assoc" field of the struct using value::primitive_field, which
then goes into value::set_component_location. In
set_component_location there is this code, which is where we find
the assert that fails:
/* If the COMPONENT has a dynamic location, and is an
lval_internalvar_component, then we change it to a lval_memory.
Usually a component of an internalvar is created non-lazy, and has
its content immediately copied from the parent internalvar.
However, for components with a dynamic location, the content of
the component is not contained within the parent, but is instead
accessed indirectly. Further, the component will be created as a
lazy value.
By changing the type of the component to lval_memory we ensure
that value_fetch_lazy can successfully load the component.
This solution isn't ideal, but a real fix would require values to
carry around both the parent value contents, and the contents of
any dynamic fields within the parent. This is a substantial
change to how values work in GDB. */
if (this->lval () == lval_internalvar_component)
{
gdb_assert (lazy ());
m_lval = lval_memory;
}
else
gdb_assert (this->lval () == lval_memory);
I think that what this comment is really trying to say is: if a
structure is an internalvar, and a field of that structure has a
dynamic data location, then the actual data is not contained in the
internalvar, it is in memory.
The message for the commit that introduced that code (3c8c6de21da
"gdb: user variables with components of dynamic type")) confirms it,
I think. The message also goes on to explain that we could imagine
a world where the internalvar outer struct value would also capture
the (indirect) contents of the array field. The internalvar value
would then be completely self-contained. I imagine this could be
useful in some cases, but we don't have that today.
The comment makes it sound like it's a hack, but I actually think it
makes sense. This is what is really happening.
This all assumes that the result of the DW_AT_data_location is a
location in memory. I guess this is true for all practical purposes
today, but it would be possible for DW_AT_data_location to yield a
register, a composite, or even "undefined" (meaning optimized out)
as a location (which would be even easier to implement with the
upcoming DWARF 6 "Location descriptions on the DWARF stack" feature
[1]). In that case, the location that we set for the array
component should reflect whatever the DWARF expression returned.
But that is future work, for now, we assume that the data location
can only be in memory.
So, the fix is basically to always set the location of a value
sub-component to "memory", because we assume (for now) that the result
of all DW_AT_data_location expressions will be memory locations.
Now, referring back to point 4 above: if the code in
var_msym_value_operation::evaluate_for_cast could in fact set the value
to "non-lvalue in memory", then we wouldn't hit the assert in
value::set_component_location, because the subcomponent would have
inherited lval_memory from the parent structure. However, I still think
that the fix in this patch is valid on its own. Imagine that the DWARF
says that the outer struct (and thus the array descriptor) is in a
register. Once we evaluate the DW_AT_data_location of the array field,
the value describing the actual data is still in memory.
The takeaway is that regardless of the location of the descriptor, the
DW_AT_data_location expression always returns a memory location (for
now), so we should always set the location of that value to memory. In
other words, we apply the same logic as commit 3c8c6de21da regardless of
the location of the outer value".
Finally, a note about the test: I made the array content very long (100
elements), because I did spot some issues where GDB would conflate the
size of the array data with the size of the array field, within the
outer structure. The test does not expose these issues and I didn't try
to fix them here (one thing at a time), but making the array size very
large might help spot these issues in the future.
[1] https://dwarfstd.org/issues/230524.1.html
Change-Id: Ib10a77b62cd168fc7c08702e0f6dd47b5ac0f097
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33575
Approved-By: Tom Tromey <tom@tromey.com>
3 files changed