startup crash in _VEC_memzero | _VEC_memzero

VERIFIED FIXED in Firefox 31

Status

defect
--
critical
VERIFIED FIXED
5 years ago
3 years ago

People

(Reporter: ashughes, Assigned: bjacob)

Tracking

({crash, topcrash-win})

31 Branch
mozilla33
x86
Windows NT
Points:
---

Firefox Tracking Flags

(firefox30 unaffected, firefox31+ verified, firefox32 verified, firefox33 verified, firefox47 affected, firefox48 affected, firefox49 affected, firefox-esr45 affected)

Details

(crash signature)

Attachments

(2 attachments)

This bug was filed from the Socorro interface and is 
report bp-c32134e2-9298-4d96-b701-7d6f62140325.
=============================================================
0 	msvcrt.dll 	_VEC_memzero 	
1 	msvcrt.dll 	_VEC_memzero 	

More reports:
https://crash-stats.mozilla.com/report/list?product=Firefox&signature=_VEC_memzero+%7C+_VEC_memzero
=============================================================

Filing this bug as "security restricted" as a precaution since this appears to be an EXCEPTION_ACCESS_VIOLATION_WRITE. The volume is enough to make this the #9 topcrash on Firefox 31 but it seems like most of these are dupes.

Top 5 URLs:
25 	https://www.google.co.in/
23 	about:sessionrestore
12 	http://www.awesomehp.com/...
11 	about:blank
3 	about:home
Robert, do you think this is something we need to worry about given the dupes?
Flags: needinfo?(kairo)
Keywords: topcrash-win

Comment 2

5 years ago
It has been around for a while. If it's not security-sensitive and not reproducible, I think it might not be worrisome.
Flags: needinfo?(kairo)
Dan can you take a look here and comment on whether this is security-sensitive?
Flags: needinfo?(dveditz)
I don't think we need to hide this bug, and I'm not worried about security attacks on startup. But 30% or so of these crashes are not at startup.
Group: core-security
Flags: needinfo?(dveditz)
Tracking this since it's still a top crasher as of today on Windows.
I think this is gone now for whatever reason. I'm seeing only two crashes with Firefox 31, both crashes with builds that are about a month old (possibly users who can't update because they are hitting this crash). We can likely just mark this bug resolved WORKSFORME. Can someone please double check?

Source:
https://crash-stats.mozilla.com/report/list?signature=_VEC_memzero+|+_VEC_memzero&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&version=Firefox%3A31.0a1&hang_type=any&date=2014-04-29+16%3A00%3A00&range_value=1#tab-reports

Updated

5 years ago
Crash Signature: [@ _VEC_memzero | _VEC_memzero] → [@ _VEC_memzero | _VEC_memzero ]
...and it's back again:
https://crash-analysis.mozilla.com/rkaiser/2014-05-05/2014-05-05.firefox.31.explosiveness.html

Any advice on what we can do to investigate this further?
Here's a better stack reconstructed by hand. All of the crashes have Intel graphics cards. Gfx team should have a look.

msvcrt!_VEC_memzero+0x36
d2d1!TextStageManager::MapTextureTransferSurface+0x101
d2d1!TextStageManager::AddStagesForSubrect+0x25
d2d1!HwGlyphRunRealizer::IssueRenderingCommands+0x5f7
d2d1!CHwSurfaceRenderTarget::DrawGlyphRunInternal+0x181
d2d1!CCommand_DrawGlyphRun::Execute+0x27
d2d1!CHwSurfaceRenderTarget::ProcessBatch+0x4c
d2d1!CBatchSerializer::FlushInternal+0x2e
d2d1!DrawingContext::FlushBatch+0x1a
d2d1!DrawingContext::Flush+0x26
d2d1!D2DRenderTargetBase<ID2D1RenderTarget>::Flush+0x52
xul!mozilla::gfx::DrawTargetD2D::Flush+0x21 
xul!gfxContext::~gfxContext+0xfb 
[Frame pointers stop making sense beyond that]

Comment 9

5 years ago
ni?naveed for an owner
Component: General → Graphics
Flags: needinfo?(nihsanullah)
needinfo'ing Milan because this is a Gfx bug, not a JS bug.
Flags: needinfo?(nihsanullah) → needinfo?(milan)
Bas, any ideas? I'm making this up, but if that is a gfxContext getting destroyed, wouldn't that only happen when nsRenderingContext is going away, and would that mean we've already decided to shutdown, and this is a startup crash?
Flags: needinfo?(milan) → needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #11)
> Bas, any ideas? I'm making this up, but if that is a gfxContext getting
> destroyed, wouldn't that only happen when nsRenderingContext is going away,
> and would that mean we've already decided to shutdown, and this is a startup
> crash?

No, we recreate gfxContexts all the time (once per frame per layer at least). They're just a 'state machine' wrapping a DrawTarget or a cairo surface.
Flags: needinfo?(bas)
Milan, could you have someone work on this? It is still occurring. Thanks
Flags: needinfo?(milan)
I don't know how far we can get with just the info in this bug, but we can probably take a look in 33 timeframe.
Flags: needinfo?(milan)
Assignee: nobody → bas
Bas, any chance this bug could be fixed for 31? 31 is going to move to beta next Monday.
Thanks
Flags: needinfo?(bas)
Not really, since we have no steps to reproduce. As Milan said, there's not a lot of information to go with here. We could try blanket blacklisting things based on the drivers/devices in crash reports but that doesn't really seem like a healthy way to go about it :).
Flags: needinfo?(bas)

Comment 17

5 years ago
Well, we need to do something here. This is firmly the #1 crash in 31 Beta now, with 5% of the total crash volume. We can't ignore it.

Comment 18

5 years ago
https://crash-stats.mozilla.com/report/list?signature=_VEC_memzero+%7C+TextStageManager%3A%3AMapTextureTransferSurface%28D2D_RECT_U+const%26%2C+unsigned+char%2A%2A%2C+unsigned+int%2A%29 is probably connected as well and is #4 with 4% on 31.0b right now, so together those are at >9% of all Beta 31 crashes at this point.
https://crash-analysis.mozilla.com/bsmedberg/bug988549-VEC-memzero-nightly.csv shows there may be a regression between:

20140318030202,68727,
20140319030201,76486,26

Although a single user crashing can affect the nightly stats disproportionately. Liz can you make a nightly regression link for that?

Updated

5 years ago
Flags: needinfo?(lhenry)
This signature is only showing up on x86.  They are nearly all still happening on startup for Windows 7.
 
Here is the graphics adapter info from the crash signature summary:

Vendor 	Adapter 	Report Count 	Percentage
0x8086 	0x0042	2236 	79.601 %
Intel Corporation 	Mobile Intel 4 Series Chipset Family Intel Mobile Graphic	235 	8.366 %
Intel Corporation 	PCI\VEN_8086&DEV_2E32&SUBSYS_31031565&REV_03 Intel G41 express graphics	228 	8.117 %
0x8086 	0x2e22	92 	3.275 %
Intel Corporation 	- Intel Q45/Q43 Express Chipset	13 	0.463 %
0x8086 	0x0046	4 	0.142 %
0x8086 	0x0102	1 	0.036 %
Flags: needinfo?(lhenry)

Updated

5 years ago
Flags: needinfo?(lhenry)
Reporter

Updated

5 years ago
See Also: → 1026074
Benjamin, near as I can tell, http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=082761b7bc54&tochange=3bc3b9e2cd99 .

What do the second and third numbers in the rows in your csv mean?
Flags: needinfo?(lhenry)
(In reply to Liz Henry :lizzard from comment #20)
> This signature is only showing up on x86.  They are nearly all still
> happening on startup for Windows 7.
>  
> Here is the graphics adapter info from the crash signature summary:
> 
> Vendor 	Adapter 	Report Count 	Percentage
> 0x8086 	0x0042	2236 	79.601 %
> Intel Corporation 	Mobile Intel 4 Series Chipset Family Intel Mobile Graphic
> 235 	8.366 %
> Intel Corporation 	PCI\VEN_8086&DEV_2E32&SUBSYS_31031565&REV_03 Intel G41
> express graphics	228 	8.117 %
> 0x8086 	0x2e22	92 	3.275 %
> Intel Corporation 	- Intel Q45/Q43 Express Chipset	13 	0.463 %
> 0x8086 	0x0046	4 	0.142 %
> 0x8086 	0x0102	1 	0.036 %

These are all Intel cards. The top correlation (0x8086 0x0042) maps to Intel Q57/H55 Clarkdale according to http://www.pcidatabase.com/vendor_details.php?id=1302
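
The adapter column in the summary mixes bare device IDs (0x0042) with full Windows hardware ID strings (PCI\VEN_8086&DEV_2E32&...), which makes it harder to group rows by device. A minimal Python sketch, assuming the VEN_/DEV_ layout seen in that one row, that extracts the vendor and device IDs from such a string. The name parse_pci_hardware_id is hypothetical and is not part of Socorro or Firefox.

```python
import re

# Matches the vendor/device part of a Windows PCI hardware ID string, e.g.
# PCI\VEN_8086&DEV_2E32&SUBSYS_31031565&REV_03 (layout assumed from the
# single example in the adapter summary).
_HWID = re.compile(r"VEN_([0-9A-Fa-f]{4})&DEV_([0-9A-Fa-f]{4})")


def parse_pci_hardware_id(text):
    """Return (vendor_id, device_id) as lowercase '0x....' strings.

    Returns None when the text carries no VEN_/DEV_ pair, as is the case
    for rows that only hold a marketing name.
    """
    match = _HWID.search(text)
    if match is None:
        return None
    vendor, device = match.groups()
    return "0x" + vendor.lower(), "0x" + device.lower()
```

With this, the "Intel G41 express graphics" row resolves to vendor 0x8086 and device 0x2e32, while rows that hold only a marketing name still need a manual lookup.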
These crashes are all on Windows 7. If a single OS and gfx series produce the overall #1 top crash, then on those particular machines it must be especially crashy. Can we just find such a machine and play around until it crashes?
Juan, can you check the lab to see if we have a machine available which matches this criteria?
Flags: needinfo?(jbecerra)
The columns are buildid,cumulative-ID, cumulative-crashes
Top graphics adapter is:
0x8086 	0x0042	2946 	83.480 % (of crashes)

No useful information there but the associated driver version in the App Notes is 8.15.10.2418 which is long enough to be useful. Googling it brings up a bunch of fairly elderly Acer Travelmates, Gateway laptops, etc. Does that help?
Thanks Laura, it doesn't hurt to have that info :)

Bas, while we're looking for the machine, is there anything you can suggest we do to get to the STR?  Is this configuration similar to what Matt had with all the problems we've seen last week?  Also, wasn't there a "it crashes the first time after X happens, but then it's OK after that" (I can't recall the details) - could this be it?
Flags: needinfo?(bas)
Marcia and I checked the machines in the QA lab including a few Lenovo Thinkpads, but we haven't been able to find one that matches the specs in hardware mentioned here.

I opened a ticket in the ServiceNow site to see if we have any of the machines in this site: http://support.lenovo.com/en_US/detail.page?LegacyDocID=MIGR-70128 (which bmoss referred me to)
Flags: needinfo?(jbecerra)
Not really :s The codepath in the stack is -really- common (i.e. to the extent I can see it, it would be called every frame for any window with text). The pages it occurs on also show little interesting except that it happens on common startup pages, which is no surprise. There really isn't any information here which is specific to any sort of action or behavior.

The one thing that's interesting here is that it's all Intel, but we didn't see this on the machine Matt had.
Flags: needinfo?(bas)
Thanks, I appreciate the details. OK, let's see if we can find one of these and reproduce it ourselves (see comment 28).

Comment 31

5 years ago
What I find interesting as well is that many of the crashes with the signature here have addresses ending in fbd0. With the https://crash-stats.mozilla.com/report/list?signature=_VEC_memzero+%7C+TextStageManager%3A%3AMapTextureTransferSurface%28D2D_RECT_U+const%26%2C+unsigned+char%2A%2A%2C+unsigned+int%2A%29 signature, the addresses vary more, but a number have similar end bytes like fb29, fb89 or fdbe.
Milan, Any progress on this? It is 4-5% of the overall crashes. thanks
Flags: needinfo?(milan)
See comment 28.  Need the hardware, the stack is too general to give us any useful information.
Flags: needinfo?(milan)
Yikes, these reports are with old drivers that have other issues too.

8.15.10.2141 	81.07 %
8.15.10.2125 	18.91 %

The newest of those is from June 2010. According to [1] those versions have known D2D crashes, and MS won't offer Windows 7 SP1 to machines with those drivers. That's confirmed on crash-stats: 96% are on plain Windows 7 (not SP1).

[1] http://support.microsoft.com/kb/2498452

Comment 35

5 years ago
The reports all say D2D+ from what I can see, should we be blocking those old drivers from D2D? Do we think that would help those crashes?
Flags: needinfo?(bas)
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #35)
> The reports all say D2D+ from what I can see, should we be blocking those
> old drivers from D2D? Do we think that would help those crashes?

I suspect it will, I thought we were blocking these from D2D anyway, but it looks like we're not.
Flags: needinfo?(bas)

Comment 37

5 years ago
(In reply to Bas Schouten (:bas.schouten) from comment #36)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #35)
> > The reports all say D2D+ from what I can see, should we be blocking those
> > old drivers from D2D? Do we think that would help those crashes?
> 
> I suspect it will, I thought we were blocking these from D2D anyway, but it
> looks like we're not.

Maybe some change in 31 unblocked them for some reason? That would explain why this regressed between 30 and 31.
Jeff, you played around with blacklisting, I believe? Any chance this happened?
Flags: needinfo?(jmuizelaar)
Bug 904266 landed in 26, I don't recall us doing much since.

Comment 40

5 years ago
Can we please check that D2D blocklisting works correctly in Firefox 31?
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #40)
> Can we please check that D2D blocklisting works correctly in Firefox 31?

Florin, can you please set up some regression testing for 31.0b7?
Flags: needinfo?(florin.mezei)
Keywords: qawanted
We have done some testing with Firefox 31 Beta 7 (BuildID: 20140703154127) on Direct2D blocklisting, by spoofing graphics system information (see bug 604771), and verifying the "about:support" page. We compared the results to those found in "browser/blocklist.xml" and those from "https://wiki.mozilla.org/Blocklisting/Blocked_Graphics_Drivers#Intel_cards" (for Win 7, Vista and 8).

Issues encountered were:

1. Different driver versions than expected (https://wiki.mozilla.org/Blocklisting/Blocked_Graphics_Drivers#Intel_cards) were seen for some of the Intel cards. See in https://docs.google.com/spreadsheets/d/12yEBqNZqaR6_2XbqFrVwRVxt_URjNN3iMmu-GX60Z0Q/edit#gid=0.	
				
2. The Direct2D setting does not always change when the same user profile is reused (I could not quite figure out when it works and when it does not). I think this means that a user may still have Direct2D enabled in the case where the driver is NOT initially blacklisted but later becomes blacklisted. See https://docs.google.com/spreadsheets/d/12yEBqNZqaR6_2XbqFrVwRVxt_URjNN3iMmu-GX60Z0Q/edit#gid=0.
					
3. "browser/blocklist.xml" - at blockID="g278", the value 0x9803 appears twice.	

Let me know if additional investigation is needed.
Flags: needinfo?(florin.mezei)
Keywords: qawanted

Comment 43

5 years ago
Bas, does the testing in comment #42 help you? If blocking doesn't work correctly, we're going to see a ton of crashing once we hit release (for now, we mostly saw this in beta).
Flags: needinfo?(bas)
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #43)
> Bas, does the testing in comment #42 help you? If blocking doesn't work
> correctly, we're going to see a ton of crashing once we hit release (for
> now, we mostly saw this in beta).

I don't see any information in there that helps. It's not clear to me from the comment whether blacklisting is working correctly or not. I'm most interested in what happens on a clean profile/clean machine. Preferably not trying to spoof graphics information but actually installing and testing the relevant drivers.
Flags: needinfo?(bas)

Comment 45

5 years ago
Florin, any chance you can help with the request in comment #44?
Flags: needinfo?(florin.mezei)
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #45)
> Florin, any chance you can help with the request in comment #44?

I've done some additional investigation and there are two big issues with blacklisting in Firefox 31, both regressions from Firefox 30. Testing was still done via spoofing, as there's no way we can do it with actual hardware, since time is extremely limited for us.

1. Different (smaller) values for minimum accepted driver versions are used in Firefox 31 compared to Firefox 30 (see lines 20, 23, 25, 29, 30, and 32 in https://docs.google.com/spreadsheets/d/12yEBqNZqaR6_2XbqFrVwRVxt_URjNN3iMmu-GX60Z0Q/edit#gid=0)

a) Intel 
- Win Vista - GMA 500 - Expected=Firefox30=7.14.10.1006 vs. Firefox31=3.0.20.32
- Win Vista - GMA 3150 - Expected=Firefox30=7.14.10.2124 vs. Firefox31=7.14.10.1910
- Win Vista - GMA X4500/HD - Expected=Firefox30=8.15.10.2202 vs. Firefox31=7.15.10.1666
- Win 7 - GMA 3150 - Expected=Firefox30=8.14.10.2117 vs. Firefox31=8.14.10.1972
- Win 7 - GMA X3000 - Expected=Firefox30=8.15.10.1930 vs. Firefox31=7.15.10.1666
- Win 7 - GMA X3000 - Expected=Firefox30=8.15.10.2202 vs. Firefox31=7.15.10.1666

------> Impact: a user with Intel GMA 500 on Win Vista, has driver 4.0.0.0 which is blocked in Firefox 30, but NOT in Firefox 31.

b) NVidia, AMD
- the behavior here is rather weird in Firefox 31, as it seems that two different values are used for blacklisting: one from "browser/blocklist.xml" and one probably from the same place where the values for Intel GPUs come from. See a specific case below, compared for Firefox 31 and Firefox 30:

Environment: Windows 7 (0x60001) + AMD Radeon™ HD 8570

STR:
i. Create "spoofed-firefox.bat" file with following content:
SET MOZ_GFX_SPOOF_WINDOWS_VERSION=60001
SET MOZ_GFX_SPOOF_VENDOR_ID=0x1002
SET MOZ_GFX_SPOOF_DEVICE_ID=0x5049
SET MOZ_GFX_SPOOF_DRIVER_VERSION=
"C:\Mozilla\Firefox\firefox.exe" -p -no-remote
ii. Set the driver version value, save the file, open it to launch Firefox, create a new profile, start Firefox, and verify "about:support" -> Graphics -> Direct2D Enabled.
iii. Execute step "ii" for Firefox 31 with the following values for the Driver version: 8.6.9.9, 8.7.0.0, 8.982.0.0, 8.982.0.1.
iv. Execute step "ii" for Firefox 30 with the following values for the Driver version: 8.7.9.9, 8.8.0.0, 8.982.0.0, 8.982.0.1.

Results:
At step iii. - Direct2D = "Blocked for your graphics driver version. Try updating your graphics driver to version 9.6 or newer." up to version ~8.6.9.9 -> "true" for version 8.7.0.0 and up (except for 8.982.0.0) -> "Blocked for your graphics driver version." for version=8.982.0.0.
At step iv. - Direct2D = "Blocked for your graphics driver version. Try updating your graphics driver to version 10.6 or newer." up to version ~8.7.9.9 -> "true" for version 8.8.0.0 and up (except for 8.982.0.0) -> "Blocked for your graphics driver version." for version=8.982.0.0

Notice above how the Firefox 31 error message says that the minimum driver version is 9.6, while Firefox 30 says the minimum version is 10.6.

------> Impact: a user with the above setup, has driver 8.7.0.0 which is blocked in Firefox 30, but NOT in Firefox 31.

2. For some driver version values, Firefox 31 enables/disables Direct2D when the new profile is created, but then no longer updates it when the driver version changes (see lines 6 to 12 in https://docs.google.com/spreadsheets/d/12yEBqNZqaR6_2XbqFrVwRVxt_URjNN3iMmu-GX60Z0Q/edit#gid=0). 

------> Impact: a user with Firefox 31 on Windows 7 + AMD Radeon™ HD 8570 has driver version 8.900.0.0 (for which Direct2D is "true" on both Firefox 30 and 31), then updates to version 8.982.0.0 (which should be blocked for both Firefox 30 and 31) => unless the user creates a new profile, Direct2D will NOT be blocked as expected.
Flags: needinfo?(florin.mezei)
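
The "minimum accepted driver version" comparisons in the comment above can be illustrated with a small Python sketch. It assumes that a driver is blocked when its dotted version, compared field by field as integers, is lower than the minimum; this is only an illustration of that reading, not Firefox's actual blocklist code, and the names parse_driver_version and is_blocked are hypothetical.

```python
def parse_driver_version(text):
    """Turn a dotted driver version such as '8.15.10.2202' into a tuple of ints.

    Assumption: versions have at most four numeric fields; shorter ones are
    padded with zeros so that '9.6' compares like '9.6.0.0'.
    """
    parts = [int(field) for field in text.split(".")]
    if len(parts) > 4:
        raise ValueError("expected at most four version fields: %r" % text)
    return tuple(parts + [0] * (4 - len(parts)))


def is_blocked(installed, minimum_allowed):
    """Return True when the installed driver is older than the minimum."""
    return parse_driver_version(installed) < parse_driver_version(minimum_allowed)


# The GMA 500 / Windows Vista case from the comment: driver 4.0.0.0.
print(is_blocked("4.0.0.0", "7.14.10.1006"))  # minimum reported for Firefox 30
print(is_blocked("4.0.0.0", "3.0.20.32"))     # minimum reported for Firefox 31
```

Under this reading the same 4.0.0.0 driver is below the Firefox 30 minimum but above the Firefox 31 one, which matches the impact described for issue 1a.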

Comment 47

5 years ago
Bas, it looks from comment #46 like we have heavily broken D2D blocking on 31, and I believe this to be one of the prime issues with our instabilities on that train. Are you the one to look into that or someone else? Note that Beta 9 is going to build on Thursday and should be the last build to land any code changes in this train.
Flags: needinfo?(bas)
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #47)
> Bas, it looks from comment #46 like we have heavily broken D2D blocking on
> 31, and I believe this to be one of the prime issues with our instabilities
> on that train. Are you the one to look into that or someone else? Note that
> Beta 9 is going to build on Thursday and should be the last build to land
> any code changes in this train.

I don't think I am, I will discuss this in the graphics daily. We'll make sure this gets addressed.
Flags: needinfo?(bas)
So just to clarify, is the problem here that we're not blacklisting hardware that we used to or that we're trying to blacklist hardware but it's not working?
Flags: needinfo?(jmuizelaar)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #49)
> So just to clarify, is the problem here that we're not blacklisting hardware
> that we used to or that we're trying to blacklist hardware but it's not
> working?

Jeff, in Firefox 31 we're not blacklisting hardware that we used to in Firefox 30 (the minimum driver versions in 31 are lower than the expected ones from Firefox 30). However, note that there seem to be more problems than just this (see comment 46).
This is a patch to backout bug 984417. We should probably take this now so that we have more time to investigate the problem and apply a better fix afterwards.
Comment on attachment 8453524 [details] [diff] [review]
Backout bug 984417

Approval Request Comment
[Feature/regressing bug #]: 984417
[User impact if declined]: Lots more crashes
[Describe test coverage new/current, TBPL]: Was in the tree until 31
[Risks and why]: Should be relatively low risk because it just backs out a blocklist change
[String/UUID change made/needed]: None
Attachment #8453524 - Flags: approval-mozilla-beta?
Attachment #8453524 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I've verified today the behavior after the backout for bug 984417, with Firefox 31 Beta 9 (BuildID: 20140710141843) on Windows 7 x64 (Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0).

Blacklisting for Intel cards now works same as for Firefox 30, and as expected.

There are still 3 blacklisting issues (regressions from Firefox 30) that may cause problems in the future, but should NOT affect the stability of Firefox 31 too much in the short term.

Remaining issues:

1. There are a few cases (non-Intel cards) where D2D is enabled for Firefox 30, but disabled for Firefox 31 (https://docs.google.com/spreadsheets/d/12yEBqNZqaR6_2XbqFrVwRVxt_URjNN3iMmu-GX60Z0Q/edit#gid=0 - lines 6 - 12 in yellow)
- drivers ADDITIONALLY blocked on Firefox 31 compared to Firefox 30:
* Win 7 - NVidia NVS 5100M - 0x10de - 0x0a6c - drivers 8.17.12.5720 to 8.17.12.5896
* Win Vista - AMD Radeon™ HD 8570? - 0x1002 - / - driver 8.982.0.0
* Win 7 - AMD Radeon™ HD 8570? - 0x1002 - driver 8.982.0.0
* Win Vista - AMD? - 0x1022 - / - driver 8.982.0.0
* Win 7 - AMD? - 0x1022 - / - driver 8.982.0.0
* Win 8 - AMD Radeon™ HD 8570? - 0x1002 - / - drivers 8.8.0.0 to 9.10.8.0
* Win 8 - AMD? - 0x1022 - / - drivers 8.8.0.0 to 9.10.8.0

Notes:
- the differences above come from the "browser\blocklist.xml" file. The file exists for Firefox 30 and other older versions, and has the same values, but it seems they were never in fact used in the previous versions, and Firefox 31 is the first version where they are used for blocking Direct2D. Note that these values are used in conjunction with the older ones from Firefox 30, without overriding them.

------> USER IMPACT: if the versions above are intended to be blocked, then there's no negative impact on the user. In any case, since we actually block more versions in 31 compared to 30, this should improve the stability.


2. Direct2D does NOT change state if a user (with Firefox 31) updates from one of the driver versions from the intervals shown above (issue #1) to any other version.

------> USER IMPACT: a user with Firefox 31 on Windows 7 + AMD Radeon™ HD 8570 has driver version 8.900.0.0 (for which Direct2D is "true" on Firefox 31), then updates to version 8.982.0.0 (which should be blocked for Firefox 31) => unless the user changes/creates a new profile, Direct2D will NOT be blocked as expected, so it may cause issues.


3. Direct2D does NOT change state if a user with Firefox 30 updates to Firefox 31 and has a driver version that is allowed on Firefox 30 but blocked in Firefox 31.

------> USER IMPACT: user with Firefox 30 on Windows 7 + NVidia NVS 5100M, has driver version 8.17.12.5800 (for which Direct2D is "true" on Firefox 30), user then updates to Firefox 31 (for which the driver version should be blocked) => Unless the user changes/creates a new profile, Direct2D will NOT be blocked as expected, so it may cause issues.

Overall I think the backout has improved blacklisting, and brought it closer to Firefox 30 and to what's intended. Out of the remaining issues, #3 worries me the most as it may still cause problems after users update to Firefox 31 (but should not make things worse I think). Even so, if blacklisting was the main problem for the crashes, then these should go down.
I'm not going to block Firefox 31.0b9 on these issues since we want to get Beta feedback on the landed changes before we build RC next week. 

Bas and Jeff, what do you think about the issues described in comment 54? Should they be addressed in this bug or do they deserve follow-up reports? Can they be addressed in Firefox 31?
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(bas)
(In reply to Anthony Hughes, QA Mentor (:ashughes) from comment #55)
> I'm not going to block Firefox 31.0b9 on these issues since we want to get
> Beta feedback on the landed changes before we build RC next week. 
> 
> Bas and Jeff, what do you think about the issues described in comment 54?
> Should they be addressed in this bug or do they deserve follow-up reports?
> Can they be addressed in Firefox 31?

Not really, but 3 also doesn't really concern me too much, we haven't blocked anything important in 31 that wasn't blocked on 30.

However this might be an issue for 32, so we should still keep an eye on it.
Flags: needinfo?(bas)
Thanks Bas, do you want bugs filed for those issues?

Comment 58

5 years ago
On the actual crash, the volume is significantly down in 31.0b9 (and there was a residual crash with this signature before with the blocklisting working correctly) so I'm pretty confident we can mark this as fixed.

The issues that Florin turned up probably warrant a separate bug being filed for tracking, I guess.
I may have missed something because I haven't read thoroughly all of the above comments, but was any analysis of the per-GPU-vendor impact of this crash done? I just did some grepping of crashdata files, and it seems that while we always had some _VEC_memzero crashes on all GPU vendors, it spiked specifically on Intel with the patch from bug 984417 while remaining unaffected by this patch on other GPU vendors. Consequently, it seems that we only need to revert the Intel-specific portion of that patch.

bjacob@bjacob-dell:~/hack/crash-stats$ zcat 20140301-pub-crashdata.csv.gz | grep _VEC_memzero | grep AdapterVendorID | sed 's/.*AdapterVendorID\:\ \([0-9a-fA-FxX]*\).*/\1/g' | sort | uniq -c | sort -rn
    127 0x8086
     34 0x1002
     31 0x10de
     12 8086
     11 0x1106
      7 0x0000
      2 1106
      2 1002
      1 10de
      1 0x5333
      1 0x1039
      1 0x102b
      1 00ba
      1 0000
bjacob@bjacob-dell:~/hack/crash-stats$ zcat 20140401-pub-crashdata.csv.gz | grep _VEC_memzero | grep AdapterVendorID | sed 's/.*AdapterVendorID\:\ \([0-9a-fA-FxX]*\).*/\1/g' | sort | uniq -c | sort -rn
    159 0x8086
     32 0x1002
     29 0x10de
     22 8086
      5 0x1039
      3 10de
      2 1002
      2 0x1106
      2 0x0000
      1 0000
bjacob@bjacob-dell:~/hack/crash-stats$ zcat 20140501-pub-crashdata.csv.gz | grep _VEC_memzero | grep AdapterVendorID | sed 's/.*AdapterVendorID\:\ \([0-9a-fA-FxX]*\).*/\1/g' | sort | uniq -c | sort -rn
    139 0x8086
     27 0x10de
     20 0x1002
      9 8086
      9 0x0000
      5 0x1039
      3 0x1106
      1 10de
      1 1002
      1 0x5333
bjacob@bjacob-dell:~/hack/crash-stats$ zcat 20140601-pub-crashdata.csv.gz | grep _VEC_memzero | grep AdapterVendorID | sed 's/.*AdapterVendorID\:\ \([0-9a-fA-FxX]*\).*/\1/g' | sort | uniq -c | sort -rn
    163 0x8086
     42 0x10de
     25 0x1002
      8 8086
      7 0x1106
      4 1002
      3 0x1039
      3 0x0000
      2 0x5333
      2 0000
      1 10de
bjacob@bjacob-dell:~/hack/crash-stats$ zcat 20140701-pub-crashdata.csv.gz | grep _VEC_memzero | grep AdapterVendorID | sed 's/.*AdapterVendorID\:\ \([0-9a-fA-FxX]*\).*/\1/g' | sort | uniq -c | sort -rn
   1345 0x8086
     43 0x10de
     32 0x1002
      8 8086
      5 0x1039
      4 0x0000
      3 0x1106
      1 1002
      1 0000
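
The pipeline above counts "0x8086" and "8086" as separate vendors. A minimal Python sketch of the same count that folds the two spellings together, assuming they denote the same vendor ID and that each crashdata line carries the signature and an "AdapterVendorID: ..." note exactly as the grep/sed patterns imply. The function name count_vendors is hypothetical.

```python
import re
from collections import Counter

# Same capture as the sed expression: the hex-ish token after "AdapterVendorID: ".
_VENDOR = re.compile(r"AdapterVendorID: ([0-9a-fA-FxX]+)")


def count_vendors(lines, signature="_VEC_memzero"):
    """Count crashes per GPU vendor, treating '8086' and '0x8086' alike."""
    counts = Counter()
    for line in lines:
        if signature not in line:
            continue
        match = _VENDOR.search(line)
        if match is None:
            continue
        raw = match.group(1).lower()
        digits = raw[2:] if raw.startswith("0x") else raw
        # Normalize to a four-digit, 0x-prefixed form.
        counts["0x" + digits.zfill(4)] += 1
    return counts
```

To run it on one of the dumps, open the file with gzip.open("20140701-pub-crashdata.csv.gz", "rt", errors="replace") and pass the file object to count_vendors; counts.most_common() then gives an ordering comparable to sort -rn.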
Posted patch backout-intel
Attachment #8455718 - Flags: review?(jmuizelaar)
Attachment #8455718 - Flags: review?(bas)
Comment on attachment 8455718 [details] [diff] [review]
backout-intel

Approval Request Comment - SEE ABOVE PATCH's APPROVAL - THIS IS JUST A SUBSET OF IT. Also, I know that normally you only approve already-landed patches, but here it saves us time to do this all at once, and again, a bigger version of this patch was already approved above.

[Feature/regressing bug #]: bug 984417
[User impact if declined]: top crasher on Intel GPUs only, on Windows
[Describe test coverage new/current, TBPL]: None since we don't have Windows test slaves with Intel graphics
[Risks and why]: Not risky at all
[String/UUID change made/needed]: none
Attachment #8455718 - Flags: approval-mozilla-beta?
Attachment #8455718 - Flags: approval-mozilla-aurora?
Flags: needinfo?(jmuizelaar)
(In reply to Benoit Jacob [:bjacob] from comment #59)

That's a good point... perhaps we can narrow it down even further? For example over 95% of these crashes are on Windows 7. Maybe it's particular to only some of those devFamily/driverVer?
Perhaps - but since at the moment I'm in "let's rush to get this fixed or else it will all get backed out" mode, I would rather land this conservative patch for now.
Attachment #8455718 - Flags: review?(bas) → review+
Firefox 31 RC has been built today, without the change from Benoit. Do we plan on still having this in Firefox 31?
Benoit's change arrived too late for the RC. If we make another build, we could take it.

Comment 66

5 years ago
I think we have pretty clearly established that this specific crash signature is old Intel drivers on Win7; we knew that for a long time. It was pretty much a surprise to us that gfx blocklisting changes were the cause. We should not run into that kind of confusion and run experiments on our beta audience, so I do not think we should take a partial backout on 32 but a full one like on 31. See bug 984417 for a more detailed comment on how I see those things.
So would you be happy with:
 - full backout on 32  (we already did a full backout on 31, right?)
 - intel-only partial backout patch on 33
Comment on attachment 8455718 [details] [diff] [review]
backout-intel

Just talked with jrmuizel. We actually think that for mozilla-aurora (soon to be early beta) we want the Intel-only backout i.e. this patch. Indeed, we have strong confidence that this will fix at least the bulk of the crashiness here, enough to make this acceptable for early beta. Therefore, keeping the mozilla-aurora request here.

For beta on the other hand, the full backout had already landed earlier during the 31 cycle there, so there is no need for anything more.
Attachment #8455718 - Flags: review?(jmuizelaar)
Attachment #8455718 - Flags: approval-mozilla-beta?
Comment on attachment 8455718 [details] [diff] [review]
backout-intel

Same as 31. Happy to take it.
Attachment #8455718 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Pushed the intel-only backout to Aurora:

https://hg.mozilla.org/releases/mozilla-aurora/rev/8d5e337d433b

At this point:

Gecko 32+ (inbound and aurora) have the Intel-only backout.
Gecko 31 (beta) already had the earlier full backout.

So at this point I'm considering this fixed, though of course I would be very interested in hearing about the crash-stats impact!
Whiteboard: [leave open]
Assignee: bas → bjacob

Comment 72

5 years ago
As I mentioned to bjacob on IRC, my main goal is to ensure we do not end up again with the stability group not knowing something has been unblacklisted in some version and be surprised about eventual crashes. Can you please make sure the stability@m.o list gets a message whenever anything is unblacklisted and tell us there exactly which cards/chips/driver-versions have been unblacklisted in which Firefox version?
With that, we will know what to look for. We should probably be good with this in 32 (but please send that message) as with bug 918386 we should be able to search for driver versions in crash-stats when needed to dig into something like that (still need to expose it in crash-stats, I'll file bugs for that).
Mail sent to Stability.

Comment 74

5 years ago
FYI, the remaining cases of bug 768395 (CDevice::DriverInternalErrorCB) is another signature that went down with this backout on 31 (also almost exclusively Intel).
Stats from the last week:
Firefox 33 has 19 crashes
Firefox 32 has 75 crashes
Firefox 31 has 72 crashes (beta9 or higher)

Crashes still obviously exist but this is far outside the topcrash range now. Can this bug report be closed or is there still work to be done?

Comment 76

5 years ago
(In reply to Anthony Hughes, QA Mentor (:ashughes) from comment #75)
> Can this bug report be closed or is there still work to be done?

Given the discussion here, I would close it and file new followups for new/other issues.
Reporter

Updated

5 years ago
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Reporter

Updated

5 years ago
Status: RESOLVED → VERIFIED
Target Milestone: --- → mozilla33
See Also: → 1045055
See Also: → 1045070
Crash volume for signature '_VEC_memzero | _VEC_memzero':
 - nightly(version 50):0 crashes from 2016-06-06.
 - aurora (version 49):1 crash from 2016-06-07.
 - beta   (version 48):34 crashes from 2016-06-06.
 - release(version 47):189 crashes from 2016-05-31.
 - esr    (version 45):64 crashes from 2016-04-07.

Crash volume on the last weeks:
            W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - nightly       0       0       0       0       0       0       0
 - aurora        0       1       0       0       0       0       0
 - beta          7       3       3       4       3       7       5
 - release      25      29      25      20      26      29      27
 - esr           4      10      11       7       7       9       4

Affected platform: Windows