gdb: hold a target_ops_ref in scoped_finish_thread_state

This commit fixes a use-after-free issue that was reported here:

  https://inbox.sourceware.org/gdb-patches/68354b98-795a-4b50-9eac-e54aa1d01b9d@simark.ca

This issue was exposed by the gdb.replay/missing-thread.exp test that
was added in this commit:

  commit 8bd08ee92c4a7bf2ad9e29c4da32a276ef2257fc
  Date:   Fri May 16 17:56:58 2025 +0100

      gdb: crash if thread unexpectedly disappears from thread list

It is worth pointing out that the use-after-free issue existed before
that commit; the commit just introduced a test that exposed the issue
when GDB is run with the address sanitizer.

It has taken a while to get this fix ready for upstream because it
depended on the recently committed patch:

  commit 43db8f70d86b2492b79f59342187b919fd58b3dd
  Date:   Thu Oct 23 16:34:20 2025 +0100

      gdbsupport: remove undefined behaviour from (forward_)scope_exit

The problem is that the first commit above introduces a test which
causes the remote target to disconnect while GDB is processing an
inferior stop event.  Specifically, within normal_stop (infrun.c), GDB
calls update_thread_list, and it is during this call that the remote
target disconnects.

When the remote target disconnects, GDB immediately unpushes the
remote target.  See remote_unpush_target and its uses in remote.c.

If this is the last use of the remote target, then unpushing it will
cause the target to be deleted.

This is a problem, because in normal_stop, we have an RAII variable
maybe_finish_thread_state, which is an optional
scoped_finish_thread_state, and in some cases, this will hold a
pointer to the process_stratum_target which needs to be finished.

So the order of events is:

  1. Call to normal_stop.

  2. Create maybe_finish_thread_state with a pointer to the current
     remote_target object.

  3. Call update_thread_list.

  4. Remote disconnects, GDB unpushes and deletes the current
     remote_target object.  GDB throws an exception.

  5. The exception propagates back to normal_stop.

  6. The destructor for maybe_finish_thread_state runs, and tries to
     make use of its cached pointer to the (now deleted) remote_target
     object.  Badness ensues.

This bug isn't restricted to normal_stop.  If a remote target
disconnects at any point where there is a scoped_finish_thread_state
in the call stack, then this issue could arise.

I think what we need to do is to ensure that the remote_target is not
actually deleted until after the scoped_finish_thread_state has been
cleaned up.

And so, to achieve this, I propose changing scoped_finish_thread_state
to hold a target_ops_ref rather than a pointer to the target_ops
object.  Holding the reference will prevent the object from being
deleted.
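
As a rough illustration of the idea (not the code in this patch), the
sketch below models the sequence of events above with simplified
stand-ins.  The names fake_target, fake_target_ref,
fake_scoped_finish_thread_state, fake_update_thread_list,
fake_normal_stop and run_scenario are all hypothetical, and
std::shared_ptr is used only as a stand-in for the reference counting
that target_ops_ref provides:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records the order in which things happen so it can be inspected.
static std::vector<std::string> event_log;

// Hypothetical stand-in for a remote target object.
struct fake_target
{
  ~fake_target () { event_log.push_back ("target deleted"); }
  void finish_thread_state ()
  { event_log.push_back ("thread state finished"); }
};

// Assumption: std::shared_ptr models the counted reference here; the
// real target_ops_ref is a different type.
using fake_target_ref = std::shared_ptr<fake_target>;

// Models the reference owned by the target stack.
static fake_target_ref target_stack;

// Simplified guard that holds a counted reference, not a raw pointer.
class fake_scoped_finish_thread_state
{
public:
  explicit fake_scoped_finish_thread_state (fake_target_ref target)
    : m_target (std::move (target))
  {}

  fake_scoped_finish_thread_state
    (const fake_scoped_finish_thread_state &) = delete;
  fake_scoped_finish_thread_state &operator=
    (const fake_scoped_finish_thread_state &) = delete;

  ~fake_scoped_finish_thread_state ()
  {
    // m_target keeps the object alive until this destructor is done.
    assert (m_target != nullptr);
    m_target->finish_thread_state ();
  }

private:
  fake_target_ref m_target;
};

// Models a disconnect during the thread list update: the target is
// unpushed (dropping the stack's reference) and an exception is thrown.
static void fake_update_thread_list ()
{
  event_log.push_back ("target unpushed");
  target_stack.reset ();
  throw std::runtime_error ("remote connection closed");
}

static void fake_normal_stop ()
{
  fake_scoped_finish_thread_state guard (target_stack);
  fake_update_thread_list ();
}

// Runs the whole scenario and returns the recorded event order.
std::vector<std::string> run_scenario ()
{
  event_log.clear ();
  target_stack = std::make_shared<fake_target> ();
  try
    {
      fake_normal_stop ();
    }
  catch (const std::runtime_error &)
    {
      event_log.push_back ("exception caught");
    }
  return event_log;
}
```

In this model the unpush no longer deletes the target, because the
guard still holds a reference.  The target is only deleted once the
guard's destructor has finished using it during stack unwinding.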

The new scoped_finish_thread_state is defined within its own file, and
is a drop-in replacement for the existing class.

On my local machine the gdb.replay/missing-thread.exp test passes
cleanly after this commit (with the address sanitizer), but when I test
on some other machines with a more recent Fedora install, I'm still
seeing test failures (both before and after this patch), though not
relating to the address sanitizer (at least, I don't see an error from
the sanitizer).  I don't think these other issues are directly related
to the problem being addressed in this commit, and so I'm proposing
this patch for inclusion anyway.  I'll continue to look at the test
and see if I can fix the other failures too.  Or maybe I'll end up
having to back out the test.

Approved-By: Simon Marchi <simon.marchi@efficios.com>
Reviewed-By: Guinevere Larsen <guinevere@redhat.com>
5 files changed