Low/no gmplugin crashes being reported on Windows




Last year


(Reporter: wlach, Unassigned)


Firefox Tracking Flags

(Not tracked)


According to error aggregates (https://docs.telemetry.mozilla.org/datasets/streaming/error_aggregates/reference.html), we have seen almost no gmplugin crashes on Windows lately:


The code to gather these crashes is quite straightforward:


And also we do see some crashes on other platforms, for example Mac:


While I would like to believe that our Windows code is now so robust that it doesn't crash anymore, I also worry that something might be wrong here.

I talked a bit in #media about this and (at least as of this writing, discussion ongoing) no one seemed quite sure why this would be the case.


Gabriel, you touched the reporting code recently (https://searchfox.org/mozilla-central/rev/877c99c523a054419ec964d4dfb3f0cadac9d497/ipc/glue/CrashReporterHost.cpp#267). Might you have any ideas what's happening?
Flags: needinfo?(gsvelto)
Some further discussion on #media, potentially useful:

10:15:46 <wlach> we seem to be lacking people with an intersection of understanding between the telemetry/crash reporting side and the gmplugin side
10:16:12 <drno> wlach: indeed that is probably a resource close to zero
10:16:41 <drno> wlach: two questions come to my mind
10:17:10 <drno> 1) is the sandbox on windows different than on the other OS’s?
10:17:58 <drno> 2) do we start a separate process for the GMP plugin on Windows? I only tested on Mac, and since I’m traveling I don’t have a Win machine at hand
10:18:01 <jld> Sandboxing is pretty much completely separate on different OSes.
10:19:29 <jld> And GMPs are always their own processes as I understand it.
10:20:31 <jld> Also, the internals of crash reporting are probably pretty different across OSes, especially Windows vs. Unix/Mac.
And yet more:

10:22:56 <drno> wlach: have a look at this https://sql.telemetry.mozilla.org/queries/52126/source
10:23:39 <drno> looks like GMP crash reporting on Win was working up till 53, then stopped working and now is back since 58.0.2
10:24:06 <wlach> drno: the 58.0.2 column is probably just one ping
10:24:21 <wlach> drno: and based on my experience, it is likely bogus
10:24:42 <wlach> but I would agree with the working up to 53 part :)
10:24:44 <drno> wlach: fair enough. So it’s broken since 54 then :)
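
(Editorial sketch, not part of the original discussion.) The reasoning in the exchange above, finding the release from which Windows counts go quiet while other platforms keep reporting and treating a lone ping as noise, could be checked mechanically along these lines. The function name, the `threshold` parameter, and every number in `EXAMPLE_COUNTS` are made up for illustration; none of them come from the linked query.

```python
from typing import Dict, Optional

# HYPOTHETICAL data: major version -> OS -> gmplugin crash count.
# These numbers are invented for illustration, not real telemetry.
EXAMPLE_COUNTS: Dict[int, Dict[str, int]] = {
    52: {"Windows": 40, "Mac": 5},
    53: {"Windows": 38, "Mac": 6},
    54: {"Windows": 0, "Mac": 4},
    55: {"Windows": 0, "Mac": 7},
    58: {"Windows": 1, "Mac": 5},  # a single ping, treated as noise
}


def first_silent_version(
    counts: Dict[int, Dict[str, int]],
    target_os: str = "Windows",
    threshold: int = 1,
) -> Optional[int]:
    """Return the earliest version from which target_os stays at or below
    threshold through the newest version, provided some other OS still
    reports more than threshold crashes in that range. Otherwise None."""
    versions = sorted(counts)
    start = None
    # Walk back from the newest version while the target OS stays quiet.
    for version in reversed(versions):
        if counts[version].get(target_os, 0) > threshold:
            break
        start = version
    if start is None:
        return None
    # Only flag the gap if other platforms kept reporting in that range.
    others_report = any(
        n > threshold
        for version in versions
        if version >= start
        for os_name, n in counts[version].items()
        if os_name != target_os
    )
    return start if others_report else None


print(first_silent_version(EXAMPLE_COUNTS))  # 54
```

With `threshold=1`, a lone ping like the one in the 58.0.2 column does not count as reporting having resumed.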
I've made a bunch of changes to our crash reporting machinery in the last year so I might have broken something myself. I'm currently on parental leave but I'll try to look at this ASAP. Leaving the NI? for now.
Quick update: I'm on parental leave for another week, but I should have some time to look into this today or tomorrow. In the meantime, I was wondering if it would be useful to enable crash pings for GMP crashes. It's just a matter of adding the relevant type in [1] for crash pings to be sent. This could be useful not only for measuring crash rates but also as a redundant data source in case crash submission isn't working properly, as seems to be the case here. Note that crash pings now carry a rich amount of information, including raw stack traces and the list of loaded modules. If there's interest, I can enable them and ensure that they're processed correctly on our back-ends.

[1] https://searchfox.org/mozilla-central/rev/8220783953b0311e1d64c2366f732a159f05ed7e/toolkit/components/crashes/CrashManager.jsm#473
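
(Editorial sketch, not part of the original comment.) The idea of "adding the relevant type" can be pictured as extending an allow-list of process types for which a crash ping is sent. The snippet below is only a Python illustration of that pattern. It does not reproduce the code at [1], and the names `PING_PROCESS_TYPES` and `should_send_crash_ping`, as well as the initial contents of the set, are assumptions.

```python
# HYPOTHETICAL allow-list; the real list lives in the file linked at [1]
# and may differ in both shape and contents.
PING_PROCESS_TYPES = frozenset({"main", "content"})


def should_send_crash_ping(process_type, allowed=PING_PROCESS_TYPES):
    """Return True if a crash in this process type should produce a ping."""
    return process_type in allowed


# Enabling pings for GMP crashes then amounts to one extra entry.
WITH_GMP = PING_PROCESS_TYPES | {"gmplugin"}

print(should_send_crash_ping("gmplugin"))            # False
print(should_send_crash_ping("gmplugin", WITH_GMP))  # True
```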
I just tried manually crashing the plugin and the crash submission toolbar showed up correctly. I clicked on the submit button and here it is:


So crash submission seems to be working fine.

However, I just realized that comment 0 specifically referred to gmplugin crashes recorded in the telemetry data gathered from the main ping. I haven't touched that code, so I'll have to dig a little deeper. Enabling gmplugin crash pings as per comment 4 would still be a good idea, though. Another thing comes to mind: Chris found a significant discrepancy between the content process crashes recorded in the main ping and the actual content crash pings. Chris, do you think we might be seeing a similar problem here, with gmplugin crashes being under-counted?
Flags: needinfo?(gsvelto) → needinfo?(chutten)
With no conclusion (yet) to bug 1413172, I don't have a theory of the mechanism by which these counts disagree. However, whatever is causing that doesn't care which OS the system is running, so I don't think it's likely to be the same mechanism that prevents gmplugin crashes from being counted only on Windows.

But I'll keep an eye out for the possibility while I keep at it.
Flags: needinfo?(chutten)