[Tracking Requested - why for this release]: +++ This bug was initially created as a clone of Bug #1291084 +++ Crashes with the "std::list<T>::clear" signature spiked on Nightly 50 around Jul 31 2016. The signature was later modified by bug 1295362, becoming "std::list<T>::clear | CDeviceChild<T>::~CDeviceChild<T>" (and others, but this is by far the most frequent one). See bug 1291084 for more details. Requesting tracking for 50, since the signature spiked in 50.
From triage with Jet: Bas says the speculative fix from bug 1291084 comment 63 was intended to fix this. That landed on the 5th, and so should have made beta 5's build... But looking at the crash signatures for [@ std::list<T>::clear | CDeviceChild<T>::~CDeviceChild<T> ] (linked up in this bug's header), I see a lot of reports from 50.0b5. So unless this actually missed beta 5, it sounds like the speculative fix didn't work.
Tracking 52+ as we need to get to the bottom of what is going on here, especially if the speculative fix did not work.
Can someone link to an exact report they want to use this bug for?
We should figure out what caused this regression in 50. https://crash-stats.mozilla.com/search/?signature=%3Dstd%3A%3Alist%3CT%3E%3A%3Aclear%20%7C%20CDeviceChild%3CT%3E%3A%3A~CDeviceChild%3CT%3E&product=Firefox&date=%3E%3D2016-04-01T18%3A08%3A00.000Z&date=%3C2016-10-12T18%3A08%3A00.000Z&_sort=-date&_facets=signature&_facets=proto_signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-proto_signature The most recurring stacks look similar, you could pick one of the reports with one of the most recurring stacks.
The first Nightly build ID to expose the regression is 20160731030203.
this would be the changelog of what has landed in 50.0a1 20160731030203 -1 day: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2ea3d51ba1bb9f5c3b6921c43ea63f70b4fdf5d2&tochange=e5859dfe0bcbd40f4e33f4a633f73ea3473a7849 and expanding the range for another 2 days: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=db3ed1fdbbeaf5ab1e8fe454780146e7499be3db&tochange=2ea3d51ba1bb9f5c3b6921c43ea63f70b4fdf5d2
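For anyone lining up build IDs against pushlog ranges: Firefox build IDs are timestamps in YYYYMMDDHHMMSS form, so the first bad build and the "-1 day" build can be derived mechanically. A minimal sketch; the helper name `parse_build_id` is hypothetical and not part of any Mozilla tool.

```python
from datetime import datetime, timedelta


def parse_build_id(build_id: str) -> datetime:
    """Interpret a 14-digit build ID as a YYYYMMDDHHMMSS timestamp."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


# First Nightly build ID exposing the regression (from the comment above).
first_bad = parse_build_id("20160731030203")

# Calendar day of the preceding nightly, i.e. the "-1 day" end of the range.
previous_day = (first_bad - timedelta(days=1)).date()

print(first_bad.isoformat(), previous_day.isoformat())
```

The two dates are what one would pass to a bisection tool as the bad and good ends of the range.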
Clearing the priority field, this cannot be a P3.
Priority: P3 → --
This is #2 top (both browser and content) crasher on beta, second only to 'OOM | small'.

(100.0% in signature vs 07.65% overall) cpu_arch = amd64
(11.03% in signature vs 77.71% overall) useragent_locale = en-US
(95.97% in signature vs 35.06% overall) "DWrite+" in app_notes = true
(95.97% in signature vs 35.06% overall) "DWrite?" in app_notes = true
(95.97% in signature vs 35.06% overall) "D2D1.1+" in app_notes = true
(99.95% in signature vs 39.88% overall) os_arch = amd64
(99.95% in signature vs 40.23% overall) platform_version = 6.1.7601 Service Pack 1
(100.0% in signature vs 43.26% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ
(04.03% in signature vs 57.33% overall) "D2D1.1-" in app_notes = true
(57.81% in signature vs 13.86% overall) adapter_vendor_id = Advanced Micro Devices, Inc. [AMD/ATI]
(100.0% in signature vs 59.53% overall) platform_pretty_version = Windows 7
(43.34% in signature vs 03.98% overall) useragent_locale = de
(22.90% in signature vs 61.60% overall) adapter_vendor_id = Intel Corporation
(78.93% in signature vs 48.05% overall) "D3D11 Layers+" in app_notes = true
(78.93% in signature vs 49.36% overall) "D3D11 Layers?" in app_notes = true
(51.70% in signature vs 35.15% overall) cpu_microcode_version = null
(36.80% in signature vs 20.34% overall) "EGL+" in app_notes = true
(36.80% in signature vs 20.74% overall) "WebGL+" in app_notes = true
(36.80% in signature vs 21.39% overall) "GL Context+" in app_notes = true
(36.80% in signature vs 21.39% overall) "GL Context?" in app_notes = true
(36.80% in signature vs 21.48% overall) "EGL?" in app_notes = true

It's a 64-bit only crash (Firefox 64-bit on 64-bit OS), only happening on Windows 7 SP1 often with locales != en-US. Looking at the URLs, it seems related to video-playing.
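For readers unfamiliar with the "in signature vs overall" figures, here is a rough sketch of how such a pair of percentages can be computed. It assumes "overall" means all reports in the sample, including those with the signature; the report dictionaries, field names and toy data below are hypothetical and are not real crash-stats output.

```python
def share(reports, predicate):
    """Percentage of reports for which predicate holds (0.0 for an empty list)."""
    if not reports:
        return 0.0
    return 100.0 * sum(1 for r in reports if predicate(r)) / len(reports)


def correlation(reports, signature, predicate):
    """Return (percent within the signature, percent across all reports)."""
    in_signature = [r for r in reports if r["signature"] == signature]
    return share(in_signature, predicate), share(reports, predicate)


# Toy data, purely illustrative.
reports = [
    {"signature": "A", "cpu_arch": "amd64"},
    {"signature": "A", "cpu_arch": "amd64"},
    {"signature": "B", "cpu_arch": "x86"},
    {"signature": "B", "cpu_arch": "amd64"},
]

sig_pct, overall_pct = correlation(
    reports, "A", lambda r: r["cpu_arch"] == "amd64"
)
print(f"({sig_pct:.2f}% in signature vs {overall_pct:.2f}% overall) cpu_arch = amd64")
```

A large gap between the two numbers, in either direction, is what makes an attribute interesting for narrowing down the affected population.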
I haven't seen any crashes with those particular crash signatures produced in Bughunter, though there have been crashes with different signatures on a few (~38) urls from Socorro which originally had those signatures. The most recent of those on Beta were due to Assertion failure: state.filterSourceGraphicTainted == isWriteOnly, which was fixed in bug 1307749 on Nightly/52. I've resubmitted all the Socorro urls (~3100) that we have tested in Bughunter where the original Socorro signature began with std::list<T>::clear. This corresponds to about 10% of the total urls from Socorro with the corresponding signature. The tests will be on 64 bit Fedora, Ubuntu, Windows 7 and Windows 10; 32 bit Windows 7 and Windows 10 for all 3 branches Beta/50, Aurora/51 and Nightly/52. Every combination will be tested with both opt and debug builds, while Linux will also test opt-asan. It will take until tomorrow to complete the testing, but if I get early results I'll let you know. They decided not to uplift bug 1307749 since the fix was for an "assertion failure". Perhaps you can revisit that.
Jeff, should this be moved to graphics? The speculative fix in bug 1291084 didn't fix it.
Virtual is able to reproduce crashes with this signature and with the signatures from bug 1294748, which makes me think that these crashes might be related. We've also seen an increase of crashes on automation (https://bugzilla.mozilla.org/buglist.cgi?keywords=intermittent-failure%2C%20&keywords_type=allwords&short_desc_type=allwordssubstr&short_desc=nvwgf2umx.dll&resolution=---&query_format=advanced&list_id=13273839). Virtual filed bug 1294748 on 2016-08-12; we've seen an increased rate of crashes on automation since 2016-08-10; and this signature spiked on Nightly on 2016-07-31, went down, and spiked again around 2016-08-10: https://crash-stats.mozilla.com/signature/?release_channel=nightly&product=Firefox&signature=std%3A%3Alist%3CT%3E%3A%3Aclear%20%7C%20CDeviceChild%3CT%3E%3A%3A~CDeviceChild%3CT%3E&date=%3E%3D2016-04-20T19%3A47%3A37.000Z&date=%3C2016-10-20T19%3A47%3A37.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#graphs. Virtual, could you try to pin the regression down with mozregression? Ryan noticed that bug 1289525 landed a patch on Jul-31 and one on Aug-10. He's going to start a try build with a backout of those patches. Virtual, could you try this build once it's ready?
Here are the build links to try: https://firstname.lastname@example.org/try-win64/firefox-50.0.en-US.win64.zip https://email@example.com/try-win64/firefox-50.0.en-US.win64.zip
Component: Audio/Video: Playback → Graphics
(In reply to Marco Castelluccio [:marco] from comment #13) > Virtual, could you try to pin the regression down with mozregression? > > Ryan noticed that bug 1289525 landed a patch on Jul-31 and one on Aug-10. > He's going to start a try build with a backout of those patches. Virtual, > could you try this build once it's ready? I'm very sorry, but right now I don't have much time to spend on finding a regression range. The steps to reproduce are very time consuming, and I'm in the middle of looking for a new job.
If there's any chance you can try out the builds from comment 14, it would be immensely helpful. We're getting short on time in the 50 cycle and are having a lot of problems reproducing on our end.
For now, the only thing I can do is advise backing out the suspected patches. Three branches of Firefox are affected by these issues (Beta, Aurora and Nightly), so each branch could be tested with at least 3 sets of reverted patches to diagnose which patches were the cause.
Hardware: Unspecified → x86_64
Version: Trunk → 50 Branch
this signature has totally ceased on 50.0b10 and 51.0a2 builds after 20161020004015. changelog beta:https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=FIREFOX_50_0b9_RELEASE&tochange=FIREFOX_50_0b10_RELEASE changelog aurora: https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchane=95567c6551ea7dfef517e7b3f50fe0be209a3a2a&tochange=10be9d40fa865be7c3c203b9cd042722ab3069ca so maybe we can tentatively change the state of this bug to fixed...
the crash is also consistently gone in 52.0a1 after build 20161019030208 - changelog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=90d8afaddf9150853b0b68b35b30c1e54a8683e7&tochange=99a239e1866a57f987b08dad796528e4ea30e622 the common denominating patch that landed in all those timeframes in beta/aurora/nightly and is most plausible to have fixed this per conversation on irc with RyanVM and marco is bug 1308418.
I still find it quite strange that bug 1308418 fixed this, but the other common bugs are even less likely (bug 1310061, bug 1309980, bug 1211270, bug 1149162, bug 1294442, bug 1311588, bug 1308259).
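The "common denominator" reasoning above amounts to a set intersection over the bug numbers that landed in each branch's range. A minimal sketch, assuming commit summaries follow the usual "Bug NNNN - description" convention; the helper names and the placeholder summaries below are hypothetical and are not the actual pushlog contents.

```python
import re

# Matches "Bug 12345" or "bug 12345" anywhere in a commit summary.
BUG_RE = re.compile(r"\bbug\s+(\d+)", re.IGNORECASE)


def bugs_in(summaries):
    """Collect every bug number mentioned in a list of commit summaries."""
    return {int(m.group(1)) for s in summaries for m in BUG_RE.finditer(s)}


def common_bugs(*ranges):
    """Bug numbers that appear in every one of the given commit ranges."""
    sets = [bugs_in(r) for r in ranges]
    return set.intersection(*sets) if sets else set()


# Placeholder summaries, purely illustrative.
beta = ["Bug 101 - placeholder fix a", "Bug 102 - placeholder fix b"]
aurora = ["Bug 102 - placeholder fix b", "Bug 103 - placeholder fix c"]
nightly = ["Bug 102 - placeholder fix b", "Bug 104 - placeholder fix d"]

print(sorted(common_bugs(beta, aurora, nightly)))
```

The intersection only narrows the candidates; deciding which of the remaining bugs is plausible still takes judgment, as in the discussion above.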
Not that surprising. Bug 1308418 had a UAF of mutexes. What kind of damage it could do is anyone's guess.
If true, this is the best surprise/unexpected win of Beta50 cycle! Great job everyone :)