We're now running an experiment where we trigger a "user is inactive" edge early-ish in an orderly Firefox Desktop shutdown. This really shouldn't make a difference, but it seems to be reversing the coverage of context_ids in topsites-impression pings so that Glean is once again sending more.
It's still close (Glean hears from only 0.54% of contexts that PC doesn't, and PC hears from 0.34% of contexts that Glean doesn't, a net 0.2% gain in Glean's favour), so this clearly hasn't brought it fully up to the level of topsite-click (Glean not PC at 3.2%, PC not Glean at 0.48%)... but it's heartening that there are things we can do to influence this if we so choose.
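For anyone wanting to reproduce that kind of comparison, here is a minimal sketch of the arithmetic. It assumes the percentages are taken over the union of context_ids seen by either system (the numbers above don't state their denominator), and the function name and the ids are made up for illustration.

```python
def coverage_gap(glean_ids, pc_ids):
    """Return (% seen only by Glean, % seen only by PC, net % in Glean's favour).

    Assumption: percentages are relative to every context_id seen by either side.
    """
    glean_ids, pc_ids = set(glean_ids), set(pc_ids)
    total = len(glean_ids | pc_ids)
    if total == 0:
        raise ValueError("no context_ids to compare")
    glean_not_pc = 100 * len(glean_ids - pc_ids) / total
    pc_not_glean = 100 * len(pc_ids - glean_ids) / total
    return glean_not_pc, pc_not_glean, glean_not_pc - pc_not_glean


# Made-up ids: 10,000 contexts overall, 54 seen only by Glean, 34 only by PC.
glean = range(0, 9966)
pc = range(54, 10000)
glean_not_pc, pc_not_glean, net = coverage_gap(glean, pc)
print(f"Glean not PC: {glean_not_pc:.2f}%  PC not Glean: {pc_not_glean:.2f}%  net: {net:.2f}%")
```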
But how? How did this make any difference at all?
Well, after digging around in some oldish (~3yo) code in the Glean SDK uploader, I might have an idea. On init, Glean will trigger the upload of any at-startup "events" or "metrics" pings, and the initial client-active-causing "baseline" ping. But that simply spawns the glean.upload thread so it can do its own thing.
One thing it can do is wait at most 3 times for 1s each if the upload manager is still loading pending pings from disk. (The number of waits and their length were chosen in the era when Glean only had consumers on mobile, where disk I/O was fast.) If all those upload triggers and the preinit dispatcher queue with its topsites-impression "top-sites" pings were flushed during that waiting period, and the pending pings dir took longer than 3s to scan, the glean.upload thread would exit.
If no other ping was then submitted, the glean.upload thread would never be restarted, so it would never try to upload anything.
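To make the suspected failure mode concrete, here is a toy model of it. This is not the Glean SDK's code: the class, the function, and the simulated clock are all hypothetical, and only the "3 waits of 1s each" figures come from the description above.

```python
from dataclasses import dataclass, field

MAX_WAIT_ATTEMPTS = 3  # "at most 3 times", per the description above
WAIT_SECONDS = 1.0     # "for 1s each"


@dataclass
class ToyUploadManager:
    """Hypothetical stand-in for the upload manager, with a simulated clock."""
    scan_seconds: float                 # how long the pending pings dir takes to scan
    pending: list = field(default_factory=list)
    uploaded: list = field(default_factory=list)
    clock: float = 0.0                  # simulated seconds since init

    def still_scanning(self):
        return self.clock < self.scan_seconds


def run_upload_thread(manager):
    """Model one lifetime of the upload thread; return why it exited."""
    waits = 0
    while True:
        if manager.still_scanning():
            if waits == MAX_WAIT_ATTEMPTS:
                return "gave up waiting"    # thread exits; pings stay pending
            waits += 1
            manager.clock += WAIT_SECONDS   # pretend to sleep for 1s
            continue
        if manager.pending:
            manager.uploaded.append(manager.pending.pop(0))
            continue
        return "queue drained"


# A slow disk: the scan takes 5s, longer than the 3s the thread will wait.
manager = ToyUploadManager(scan_seconds=5.0, pending=["topsites-impression"])
first_exit = run_upload_thread(manager)

# Much later (say, at shutdown) something triggers the upload again.
manager.clock = 60.0
second_exit = run_upload_thread(manager)
print(first_exit, "->", second_exit, manager.uploaded)
```

In this model the first run exits without uploading anything, and the ping only goes out once something spawns the thread a second time.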
But if we changed behaviour so that, during shutdown, we triggered a client-inactive edge, which submits some pings, which in turn triggers the upload... well, that might just kick things back into gear and give Firefox a chance to upload those submitted pings that have been lying around.
Maybe. (This code is complicated.)
But if this is so, then I think there might be an SDK change here we should try involving triggering upload after the pending pings dir has finished being scanned. It might not be a straightforward fix (since we don't want to do this every time we scan the dir, only when we're doing so as part of initializing the global Glean), but it's the closest thing to a lead I've had in some time.
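As a rough sketch of the shape such a change might take (every name here is hypothetical, and the real SDK is structured quite differently), the idea is a scan-completion hook that triggers an upload only when the scan ran as part of init:

```python
class ToyUploadManager:
    """Hypothetical model: triggers an upload once an init-time scan completes."""

    def __init__(self, trigger_upload):
        self._trigger_upload = trigger_upload  # callback that would spawn the upload thread
        self.pending = []

    def finish_scan(self, pings_found, during_init):
        """Called when a scan of the pending pings dir completes."""
        self.pending.extend(pings_found)
        # Only the init-time scan should trigger an upload, not every later rescan.
        if during_init:
            self._trigger_upload()


triggers = []
manager = ToyUploadManager(trigger_upload=lambda: triggers.append("upload"))
manager.finish_scan(["topsites-impression"], during_init=True)  # scan during init
manager.finish_scan([], during_init=False)                      # some later rescan
print(triggers)
```

The `during_init` flag is the awkward part flagged above: the scanning code would need some way of knowing why it was invoked.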