Closed Bug 1750235 Opened 3 years ago Closed 3 years ago

RLB: Panic on `glean.dispatcher` thread if calling `destroy_glean` before init is complete

Categories

(Data Platform and Tools :: Glean: SDK, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: chutten, Assigned: chutten)

Details

Attachments

(1 file)

GTest failures in Firefox Desktop contain MOZ_CRASH(Global Glean object not initialized). It shows the glean.dispatcher thread running a task that uses launch_with_glean while the main thread is in test_reset_glean.

I think I understand this to mean that test_reset_glean is draining the dispatcher which allows its tasks to execute which includes one that uses launch_with_glean. This wouldn't be a problem if there were a Global glean (ie, if init had completed), but this is happening shortly after a call to initialize meaning that, though initialize has been called, it hasn't necessarily completed.

( in at least one of these crashing cases the glean.init thread is in the crashdump, which is evidence that RLB's initalize did not in fact complete )

So I think what we need to do is come up with a dispatcher shutdown that discards queued tasks instead of executing them. And we need to call it in test_reset_glean even if initialize wasn't called yet since there may be pre-init operations that we'd like to have cleared.

(( One alternative approach of only resetting the dispatcher if init has completed probably will not work since that would make test_reset_glean not actually fully reset (pending operations will execute) ))

Or... I suppose we could have test_reset_glean block on initialize completing if initialize has been called. That might be easier to code (though we'll need to store that init thread handle someplace so we can join on it) than some sort of hardened dispatcher reset. It's not performant, but these are test-only APIs so I specifically don't care about that.

I thought I'd be clever and just tell the dispatcher to clear itself out... but that won't be enough because we need the global Glean object later to get the upload manager to drain its tasks, then again to clear the stores....

No, I will indeed have to rewrite destroy_glean to:

  1. If initialize hasn't been called? Do nothing. (already does this)
  2. If initialize has been called but not completed, wait for it to complete, then pass through to 3
  3. If initialize has completed, do all the things that are currently done in the order they are currently done. (already does this, but if initialize has been called at all, not just when completed)

To do this I think we need to store the JoinHandle someplace where it can be joined upon. With a Mutex, naturally

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: