gdb/corefile: improve file backed mapping handling
This commit improves how GDB handles file-backed mappings within a
core file. Specifically, it restructures the function
core_target::build_file_mapping.
The primary motivation for this commit was to put in place the
infrastructure to support the next commit in this series, but this
commit also makes some improvements of its own.
Currently in core_target::build_file_mapping we use
gdbarch_read_core_file_mappings to iterate over the mapped regions
within a core file.
For each region a callback is invoked which is passed details of the
mapping: the file the mapping is from, the offset into the file, and
the address range at which the mapping exists. We are also passed the
build-id for the mapped file in some cases.
We are only told the build-id for the mapped region which actually
contains the ELF header of the mapped file. Other regions of the same
mapped ELF will not have the build-id passed to the callback.
Within core_target::build_file_mapping, in the per-region callback, we
try to find the mapped file based on its filename. If the file can't
be found, and if we have a build-id then we'll ask debuginfod to
download the file.
However we find the file, we cache the opened bfd object, which is
good. Subsequent mappings from the same file will not have a build-id
set, but by that point we already have a cached open bfd object, so
the lack of build-id is irrelevant.
The problem with the above is that if we find a matching file based on
the filename, then we accept that file, even if we have a build-id,
and the build-id doesn't match.
Currently, the mapped region processing is done in a single pass: we
call gdbarch_read_core_file_mappings, and for each mapping, as we see
it, we create the data structures needed to represent that mapping.
This commit changes this to a two-phase process. In the
first phase the mappings are grouped together based on the name of the
mapped file. At the end of phase one we have a 'struct mapped_file',
a new struct, for each mapped file. This struct associates an
optional build-id with a list of mapped regions.
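
The grouping done in phase one can be sketched roughly as below.  This
is only an illustration, not the code from this commit: the field
names, the 'region' and 'raw_mapping' types, and the function
'group_by_filename' are all hypothetical, and the build-id is shown as
a plain string for simplicity.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// One mapped region, as described to the per-region callback.
struct region
{
  std::uint64_t start;     // First address of the mapping.
  std::uint64_t end;       // One past the last address.
  std::uint64_t file_ofs;  // Offset of the mapping within the file.
};

// Hypothetical layout; the real 'struct mapped_file' may differ.
struct mapped_file
{
  std::optional<std::string> build_id;  // Set if any region carried one.
  std::vector<region> regions;
};

// Stand-in for the arguments the callback receives for one region.
struct raw_mapping
{
  std::string filename;
  region where;
  std::optional<std::string> build_id;
};

// Phase one: collect every region under the name of its mapped file,
// remembering the first build-id seen for that file.
std::map<std::string, mapped_file>
group_by_filename (const std::vector<raw_mapping> &mappings)
{
  std::map<std::string, mapped_file> files;
  for (const raw_mapping &m : mappings)
    {
      mapped_file &mf = files[m.filename];
      if (m.build_id.has_value () && !mf.build_id.has_value ())
        mf.build_id = m.build_id;
      mf.regions.push_back (m.where);
    }
  return files;
}
```

With this shape, a region that arrives without a build-id still ends
up associated with the build-id from the region that held the ELF
header, as long as both share a filename.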
In the second phase we try to find the file using its filename. If
the file is found, and the 'struct mapped_file' has a build-id, then
we'll compare the build-id with the file we found. This allows us to
reject on-disk files which have changed since the core file was
created.
If no suitable file was found (either no file found, or a build-id
mismatch) then we can use debuginfod to potentially download a
suitable file.
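
The decision made in phase two can be sketched as below.  Again this
is an illustration under assumptions rather than the actual
implementation: 'candidate', 'select_file', and the 'download'
callback (standing in for the debuginfod query) are hypothetical
names, and I assume here that an on-disk file with no build-id counts
as a mismatch when the core file recorded one.

```cpp
#include <functional>
#include <optional>
#include <string>

// A file found on disk by filename, with the build-id read from it.
struct candidate
{
  std::string path;
  std::optional<std::string> build_id;
};

// Hypothetical stand-in for a debuginfod lookup: given a build-id,
// return the path of a downloaded file, or nothing on failure.
using download_fn
  = std::function<std::optional<std::string> (const std::string &)>;

// Phase two: return the path of the file to use, if any.
std::optional<std::string>
select_file (const std::optional<candidate> &on_disk,
             const std::optional<std::string> &expected_build_id,
             const download_fn &download)
{
  if (on_disk.has_value ())
    {
      // Nothing to check against, so accept the file by name alone.
      if (!expected_build_id.has_value ())
        return on_disk->path;

      // Accept the on-disk file only when the build-ids agree.
      if (on_disk->build_id == expected_build_id)
        return on_disk->path;
    }

  // No file, or a build-id mismatch: try to download by build-id.
  if (expected_build_id.has_value ())
    return download (*expected_build_id);

  return std::nullopt;
}
```

Note that the download is only attempted when a build-id is known, as
there is otherwise nothing to ask for.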
NOTE: In the future we could potentially add additional sanity
checks here, for example, if a data-file is mapped, and has no
build-id, we can estimate a minimum file size based on the expected
mappings. If the file we find is not big enough then we can reject
the on-disk file. But I don't know how useful this would actually
be, so I've not done that for now.
Having found (or not) a suitable file, we can then create the data
structures for each mapped region just as we did before.
The new functionality here is the extra build-id check, and the
possibility of rejecting an on-disk file if the build-id doesn't
match.
I think this change could have been done within the existing
single-phase approach. However, in the next commit I need to have all
the mapped regions associated with the expected build-id, and the new
two-phase structure allows me to do that. This is the reason for such
an extensive rewrite in this commit.
There's a new test that exercises GDB's ability to find mapped files
via the build-id, and to download them from debuginfod.
5 files changed