gdb/dwarf: read foreign type units

In DWARF 5, foreign type units are type units present in .dwo files.
Type units in .dwo files don't have a matching skeleton in the main
file. When a .debug_names index is present, it can't include those
type units in the regular type unit list, since that list specifies
offsets in the main file. They are instead listed in the foreign TU
list, which is basically just a list of type signatures.

In order to help the debugger locate these units (i.e. find the .dwo
file containing them), individual index entries referencing foreign type
units may also include a reference to a compile unit that the debugger
can follow to find the appropriate .dwo file.

This patch implements reading the .debug_names foreign TU list and using
these "hint" CUs to locate foreign type units. I use the term "hint"
throughout the code, but I don't mind another name if someone has a
better idea.

The first part is to read the actual foreign TU list from the
.debug_names index and create signatured_type objects out of them. This
is done in the new function create_foreign_type_units_from_debug_names.
Append the newly created signatured_type to the
mapped_debug_names_reader::foreign_type_units vector, which will be used
later to resolve DW_IDX_type_unit indices. Populate the
dwarf2_per_bfd::signatured_types set, which contains signatured_types
indexed by signature. And finally, transfer ownership of the object to
the dwarf2_per_bfd::all_units vector.
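
The three bookkeeping steps above can be sketched roughly as follows.
This is not the code from the patch: sig_type, per_bfd, names_reader and
create_foreign_type_units are simplified, hypothetical stand-ins for
signatured_type, dwarf2_per_bfd, mapped_debug_names_reader and
create_foreign_type_units_from_debug_names, and the sketch assumes the
signatures in the foreign TU list are unique.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for GDB's signatured_type.
struct sig_type
{
  explicit sig_type (std::uint64_t sig) : signature (sig) {}

  std::uint64_t signature;
  const void *section = nullptr;  // Unknown until the .dwo file is located.
};

// Hypothetical stand-in for the relevant parts of dwarf2_per_bfd.
struct per_bfd
{
  // Owns every unit.
  std::vector<std::unique_ptr<sig_type>> all_units;

  // Non-owning lookup by type signature.
  std::unordered_map<std::uint64_t, sig_type *> signatured_types;
};

// Hypothetical stand-in for mapped_debug_names_reader.
struct names_reader
{
  // Position in this vector is the foreign TU index used by the
  // index entries.
  std::vector<sig_type *> foreign_type_units;
};

// Create one unit per signature of the foreign TU list.  Assumes the
// list contains no duplicate signatures.
void
create_foreign_type_units (const std::vector<std::uint64_t> &foreign_tu_list,
                           names_reader &reader, per_bfd &bfd)
{
  for (std::uint64_t sig : foreign_tu_list)
    {
      auto unit = std::make_unique<sig_type> (sig);

      reader.foreign_type_units.push_back (unit.get ());
      bfd.signatured_types.emplace (sig, unit.get ());
      bfd.all_units.push_back (std::move (unit));
    }
}
```

The raw pointers stay valid after the ownership transfer, because the
vector owns the objects through std::unique_ptr and only the pointers
get moved around.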

Previously, all dwarf2_per_cu (including signatured_type) objects were
created with a non-nullptr section. With foreign type units, we don't
know the section at creation time. We also don't know the offset into
the section or the size of the unit. Therefore, add a
dwarf2_per_bfd::allocate_signatured_type overload that takes just the
signature. Remove the "section != nullptr" assert from the
dwarf2_per_cu constructor, but add it to the other allocate_* methods.

Since the new create_foreign_type_units_from_debug_names function adds
items to the dwarf2_per_bfd::all_units vector, the vector needs to be
sorted after create_foreign_type_units_from_debug_names runs. Remove
the finalize_all_units call from create_all_units, making the callers
responsible for calling it.

The next step is to read the hint CU attributes when scanning the index
entries. Rework mapped_debug_names_reader::scan_one_entry to remember
which kind of unit references it saw for the entry (comp, type and/or
foreign type) and then figure out what this means. The logic is:
- Did the entry reference a foreign type unit? If so, it's a foreign
  type unit. Does it also reference a hint CU? If not, drop the
  entry: there's nothing useful we can do with it.
- Otherwise, did the entry reference a (non-foreign) type unit? If
  so, it's a regular type unit, and it shouldn't also have a
  DW_IDX_compile_unit.
- Otherwise, did the entry reference a comp unit? If so, it's a comp
  unit.
- Otherwise, we don't know what unit the entry references; it's an
  error.
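
As a rough illustration of this decision logic (not the actual
scan_one_entry code), here is a minimal sketch. The entry_refs and
entry_class types and the classify_entry function are hypothetical
names. Reporting a regular type unit that also carries
DW_IDX_compile_unit as invalid is an assumption of the sketch; the
description above only says this combination shouldn't happen.

```cpp
// Which unit references were seen while scanning one index entry.
// Hypothetical type, for illustration only.
struct entry_refs
{
  bool comp_unit;          // Saw DW_IDX_compile_unit.
  bool type_unit;          // Saw a reference into the regular TU list.
  bool foreign_type_unit;  // Saw a reference into the foreign TU list.
};

enum class entry_class
{
  comp_unit,
  type_unit,
  foreign_type_unit,
  dropped,  // Foreign TU without hint CU, nothing useful to do.
  invalid,  // Malformed entry.
};

entry_class
classify_entry (const entry_refs &refs)
{
  // A foreign TU is only usable with a hint CU to locate its .dwo file.
  if (refs.foreign_type_unit)
    return (refs.comp_unit
            ? entry_class::foreign_type_unit
            : entry_class::dropped);

  // A regular TU should not also carry DW_IDX_compile_unit.  Treating
  // that case as invalid is an assumption of this sketch.
  if (refs.type_unit)
    return refs.comp_unit ? entry_class::invalid : entry_class::type_unit;

  if (refs.comp_unit)
    return entry_class::comp_unit;

  // No unit reference at all.
  return entry_class::invalid;
}
```

In the foreign type unit branch, the DW_IDX_compile_unit reference is
what the text calls the hint CU.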

Since the .debug_names index attaches hint CU attributes to individual
index entries, my initial implementation added the hint CU information
to the cooked_index_entry structure. I am not sure why DWARF 5 chose to
do it this way, as opposed to attaching one hint per foreign TU. Does
this mean that two type units with the same signature could be different
and, for a specific index entry, it would be important to find one
specific instance of the type unit over the others? I have no idea.
However, I know that the current GDB DWARF reader is not able to load
multiple type units with the same signature but different content. Once
it loads one type unit with a given signature, all subsequent references
to that signature will use that loaded type unit. I therefore chose to
have the .debug_names reader record just one hint CU per foreign TU.
This avoids growing the cooked_index_entry structure for nothing, and
having to pass this information through multiple layers.
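
To illustrate the "one hint per foreign TU" idea, here is a minimal
sketch of a table keyed by type signature. The foreign_tu_hints class
and the cu_index alias are hypothetical, and keeping the first hint
seen for a signature (rather than the last) is an assumption of the
sketch, not something taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical: index of a CU in the index's CU list.
using cu_index = std::size_t;

// Hypothetical table recording at most one hint CU per foreign TU.
class foreign_tu_hints
{
public:
  // Record HINT for SIGNATURE unless a hint is already recorded.
  // Return true if the hint was stored.
  bool record (std::uint64_t signature, cu_index hint)
  {
    return m_hints.emplace (signature, hint).second;
  }

  // Return the recorded hint for SIGNATURE, or nullptr if there is none.
  const cu_index *find (std::uint64_t signature) const
  {
    auto it = m_hints.find (signature);
    return it == m_hints.end () ? nullptr : &it->second;
  }

private:
  std::unordered_map<std::uint64_t, cu_index> m_hints;
};
```

With such a table, hints found on later index entries for the same
signature are simply ignored, so nothing needs to be stored per index
entry.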

The next step is to locate the .dwo file containing the foreign TUs when
we need them. This sometimes makes use of the hint CU, but not always.
I identified these 3 code paths:

 1. When the type unit gets expanded directly, for instance if you use
    "ptype" and there is a direct match into the type unit. This case
    is handled in load_full_type_unit, calling a new function
    fill_in_sig_entry_from_per_cu_hint. This one uses the hint recorded
    by the .debug_names reader. When a cooked index entry exists and
    refers to a foreign TU for which the section is not yet known, we
    know that there exists a hint, otherwise we wouldn't have created
    the entry in the first place.

 2. The second one is when the type unit is referenced by some other
    unit. This case is handled in follow_die_sig_1, calling another new
    function fill_in_sig_entry_from_dwo_file. In this case, we know
    which unit is referring to the TU, so we use that unit's dwo file to
    fill in the details. As explained in the comment, this is sometimes
    just an optimization, but sometimes also necessary, if the TU
    does not have a hint, due to it not containing any indexed name.

 3. Similarly, in dwarf2_base_index_functions::expand_all_symtabs, we
    might have to handle foreign type units for which we don't have a
    hint. I initially implemented something in two passes, to go dig in
    the dwo_file structures to find those TUs, but ended up choosing to
    just skip them, for the reasons explained in the comment there.

Setting a dwarf2_per_cu's section a posteriori breaks the assumed
ordering of the dwarf2_per_bfd::all_units vector. After setting the
section, re-sort the vector.
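
The re-sorting can be pictured with the following minimal sketch. The
unit type, the (section, offset) sort key and the function names are
assumptions made for illustration; GDB's actual comparison and data
structures differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <tuple>
#include <vector>

// Hypothetical, simplified unit.  A SECTION_ID of -1 means the
// section is not known yet (foreign TU not located).
struct unit
{
  int section_id;
  std::uint64_t offset;
  std::uint64_t signature;
};

using unit_up = std::unique_ptr<unit>;

// Assumed ordering: by section, then by offset within the section.
bool
unit_less (const unit_up &a, const unit_up &b)
{
  return (std::tie (a->section_id, a->offset)
          < std::tie (b->section_id, b->offset));
}

// Establish the ordering that lookups by (section, offset) rely on.
void
finalize_units (std::vector<unit_up> &units)
{
  std::sort (units.begin (), units.end (), unit_less);
}

// Fill in the location of U, which must be owned by UNITS, then
// restore the ordering that the update just broke.
void
set_unit_section (std::vector<unit_up> &units, unit &u, int section_id,
                  std::uint64_t offset)
{
  assert (u.section_id == -1);

  u.section_id = section_id;
  u.offset = offset;
  finalize_units (units);
}
```

Sorting only moves the owning pointers, so references to the units
themselves remain valid across the re-sort.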

Add a target board to exercise this new code. This board builds with:
- type units (-fdebug-types-section)
- split DWARF (-gsplit-dwarf)
- .debug_names index (created by GDB)

I ran the whole testsuite with this board file and it's not perfect, but
the results are comparable to the dwarf5-fission-debug-types board, for
instance.

There is one known failure that I am unable to get to the bottom of. It
seems orthogonal to my change though, more like an indexer or symbol
reader issue. There may be more of this kind, but this is one
example:
FAIL: gdb.ada/tick_length_array_enum_idx.exp: ptype variable_table'length (GDB internal error)
/home/smarchi/src/binutils-gdb/gdb/dwarf2/read.c:1839: internal-error: search_one: Assertion `symtab != nullptr' failed.

The problem appears to be that a cooked index lookup for symbol
variable_table says that a given TU should contain a match. But then
trying to expand the TU makes dw2_instantiate_symtab yield a nullptr
compunit_symtab, I think because the symbol reader found nothing
interesting symbol-wise. And then the assert in search_one triggers.

The issue seems sensitive to some aspects of the environment (gnat
version?). I am able to reproduce the issue on Arch Linux (gnat 15)
with:
$ make check TESTS="gdb.ada/tick_length_array_enum_idx.exp" RUNTESTFLAGS="--target_board=dwarf5-fission-debug-types-debug-names"
But it doesn't reproduce on Debian 13 (gnat 14), Ubuntu 24.04 (gnat
13) or Fedora Rawhide (gnat 16).

Change-Id: I0d4ccc1cbbce3a337794341744d24091e8549d7f
Approved-By: Tom Tromey <tom@tromey.com>