Last Comment Bug 803949 - Crash [@ gfxContext::PushClipsToDT]
: Crash [@ gfxContext::PushClipsToDT]
Status: RESOLVED FIXED
[qa-]
: crash, testcase
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: Trunk
: x86_64 Windows 7
: -- critical with 1 vote (vote)
: mozilla19
Assigned To: Bas Schouten (:bas.schouten)
:
Mentors:
Depends on: 758531
Blocks: randomstyles
  Show dependency treegraph
 
Reported: 2012-10-21 02:32 PDT by Jesse Ruderman
Modified: 2012-11-29 08:59 PST (History)
9 users (show)
ryanvm: in‑testsuite-
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
fixed
fixed


Attachments
testcase (crashes Firefox when loaded) (91 bytes, text/html)
2012-10-21 02:32 PDT, Jesse Ruderman
no flags Details
stack (60.85 KB, text/plain)
2012-10-21 02:32 PDT, Jesse Ruderman
no flags Details
about:support - graphics (772 bytes, text/plain)
2012-10-21 12:09 PDT, Jesse Ruderman
no flags Details
patch for debugging (1.78 KB, patch)
2012-10-21 14:10 PDT, Jesse Ruderman
no flags Details | Diff | Review
NaNnotated stack (6.30 KB, text/plain)
2012-10-21 23:36 PDT, Jesse Ruderman
no flags Details
patch for debugging, v2 (6.97 KB, patch)
2012-10-22 11:47 PDT, Jesse Ruderman
no flags Details | Diff | Review
Do not draw when imageSize is 0 (1.10 KB, patch)
2012-10-22 15:32 PDT, Bas Schouten (:bas.schouten)
roc: review+
lukasblakk+bugs: approval‑mozilla‑aurora+
lukasblakk+bugs: approval‑mozilla‑beta+
Details | Diff | Review

Description Jesse Ruderman 2012-10-21 02:32:07 PDT
Created attachment 673671 [details]
testcase (crashes Firefox when loaded)

Only on Windows?
Comment 1 Jesse Ruderman 2012-10-21 02:32:36 PDT
Created attachment 673672 [details]
stack
Comment 2 Jesse Ruderman 2012-10-21 02:36:21 PDT
There's a topcrash with the same signature: bug 758531.

I'm on Windows 7 (64-bit), fwiw.
Comment 3 Scoobidiver (away) 2012-10-21 04:51:19 PDT
It doesn't crash for me.
What is your graphics configuration (see about:support)?
Comment 4 Bas Schouten (:bas.schouten) 2012-10-21 09:23:17 PDT
I don't reproduce either on my main dev machine. I wonder if this is NVidia only.
Comment 5 Jesse Ruderman 2012-10-21 12:09:34 PDT
Created attachment 673718 [details]
about:support - graphics
Comment 6 Jesse Ruderman 2012-10-21 14:08:14 PDT
Bas helped me debug this a bit.

In gfxContext::GetAzureDeviceSpaceClipBounds, when i==3 and c==0, the "else" path is taken.  clip.transform is

[0  0  NaN
 0  1  NaN]

These NaNs poison the rest of the calculation, so the clipBounds seen by gfxContext::PushNewDT is all NaNs.
Comment 7 Jesse Ruderman 2012-10-21 14:10:06 PDT
Created attachment 673729 [details] [diff] [review]
patch for debugging
Comment 8 Bas Schouten (:bas.schouten) 2012-10-21 20:07:57 PDT
(In reply to Jesse Ruderman from comment #7)
> Created attachment 673729 [details] [diff] [review]
> patch for debugging

So I still can't reproduce this, but the following information could really help:

1. Find out when gfxContext::ChangeTransform changes the transform to the invalid state
2. I suspect 1 will lead us to conclude NudgeToIntegers, in which case it would be great where it's going all wrong in that function!
Comment 9 Nick Cameron [:nrc] 2012-10-21 20:20:11 PDT
I tried to reproduce this too (on Nvidia HW) and couldn't. Should it just crash on opening the test case and/or when starting FF on that page? Are there any other necessary STR?
Comment 10 Jesse Ruderman 2012-10-21 23:36:08 PDT
Created attachment 673771 [details]
NaNnotated stack
Comment 11 Bas Schouten (:bas.schouten) 2012-10-22 07:40:27 PDT
(In reply to Jesse Ruderman from comment #10)
> Created attachment 673771 [details]
> NaNnotated stack

Sorry to bug you again Jesse! Could you look in layout/base/nsLayoutUtils.cpp inside ComputeSnappedImageDrawingParameters to see where these #IND is coming from?
Comment 12 Jesse Ruderman 2012-10-22 11:30:33 PDT
The bad division comes from the call at 
http://hg.mozilla.org/mozilla-central/annotate/1c3e4cb1f754/layout/base/nsLayoutUtils.cpp#l3739 
with imageSize = {width: 0., height: 768.}.
Comment 13 Jesse Ruderman 2012-10-22 11:47:39 PDT
Created attachment 673945 [details] [diff] [review]
patch for debugging, v2
Comment 14 Bas Schouten (:bas.schouten) 2012-10-22 15:32:04 PDT
Created attachment 674030 [details] [diff] [review]
Do not draw when imageSize is 0

This could be causing the crashes we're seeing since , in the past setting this would be innocent, i.e. it would cause us not to draw anything, like we should, but it wouldn't cause a crash. Now that we're making temp surfaces we'll try and create NaN size surfaces, which actually causes us to crash. The right fix seems to be to avoid this division-by-zero altogether.

I have no idea why my system does -not- get this function called with a 0 image size on this testcase.
Comment 15 Bas Schouten (:bas.schouten) 2012-10-22 15:41:01 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/c78126b31487
Comment 16 Ryan VanderMeulen [:RyanVM] 2012-10-23 03:57:58 PDT
https://hg.mozilla.org/mozilla-central/rev/c78126b31487

Should this have a test?
Comment 17 Bas Schouten (:bas.schouten) 2012-10-23 04:54:49 PDT
(In reply to Ryan VanderMeulen from comment #16)
> https://hg.mozilla.org/mozilla-central/rev/c78126b31487
> 
> Should this have a test?

Neither me nor Nick can reproduce this crash. We could add Jesse's test to crashtests, but we wouldn't know if the test machines are actually crashing on this and I suppose that would make the value of a test questionable.
Comment 18 Jesse Ruderman 2012-10-23 05:30:19 PDT
Even if the test doesn't "work" on our current set of test machines, it can still be valuable by catching regressions on developers' machines, or as a starting point for the DOM fuzzer to make further mutations.

I'm still baffled by why the test doesn't even show the divide-by-zero issue for Bas. (Bas, did you try saving the testcase and loading it from a file: URL?  Could it depend on the window size?)
Comment 19 Bas Schouten (:bas.schouten) 2012-10-23 05:51:57 PDT
(In reply to Jesse Ruderman from comment #18)
> Even if the test doesn't "work" on our current set of test machines, it can
> still be valuable by catching regressions on developers' machines, or as a
> starting point for the DOM fuzzer to make further mutations.
> 
> I'm still baffled by why the test doesn't even show the divide-by-zero issue
> for Bas. (Bas, did you try saving the testcase and loading it from a file:
> URL?  Could it depend on the window size?)

I tried a wide variety of window sizes thinking that might be the culprit, but not loading from disk.
Comment 20 Jesse Ruderman 2012-10-23 22:35:20 PDT
I filed bug 804924 to see if it makes sense to add some not-NaN assertions.
Comment 21 Scoobidiver (away) 2012-10-25 00:05:40 PDT
I think it should be uplifted to Aurora along with bug 758531. Indeed, they landed in the same build and we don't know which one was the most top crasher contributor.
Comment 22 Bas Schouten (:bas.schouten) 2012-10-26 19:06:30 PDT
Comment on attachment 674030 [details] [diff] [review]
Do not draw when imageSize is 0

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 778367
User impact if declined: Top 10 crasher
Testing completed (on m-c, etc.): m-c
Risk to taking this patch (and alternatives if risky): Very low, this fixes a bug which would cause other problems too before bug 778367. Bug 778367 just turned this into a crasher.
String or UUID changes made by this patch: None
Comment 23 Bas Schouten (:bas.schouten) 2012-10-29 17:47:57 PDT
http://hg.mozilla.org/releases/mozilla-beta/rev/ae3b21c9a569
Comment 24 Ryan VanderMeulen [:RyanVM] 2012-10-30 11:54:57 PDT
https://hg.mozilla.org/releases/mozilla-aurora/rev/e82ab79e04a4
Comment 25 Simona B [:simonab] 2012-11-07 05:29:03 PST
Verified that Firefox 17 beta 5 does not crash when loading the test case attached in the Description (I could reproduce the crash on the Nightly from 2012-10-21), but neither does Firefox 17 beta 3 (and it should). 

I verified the Socorro reports and I found a couple of crashes that happened after the fixes for this bug landed on Firefox 17 beta 4, on the latest Nightly and Aurora. The majority of them are on Windows 8 - Bug 805406 takes care of that - but there are some crash reports also on Windows 7.

The crash reports are: 
* Firefox 17 beta 4:
https://crash-stats.mozilla.com/report/index/406eb276-869c-48be-9286-e004f2121107
https://crash-stats.mozilla.com/report/index/1e8dba72-beb5-40dd-a0a9-ec2592121106
https://crash-stats.mozilla.com/report/index/aeda7489-f8ee-441f-8390-73f212121106
https://crash-stats.mozilla.com/report/index/219f2db8-0b41-4d56-bf56-ab7782121106
https://crash-stats.mozilla.com/report/index/9a639bd1-1b94-42b8-b4c1-513d22121106
https://crash-stats.mozilla.com/report/index/8e3580c7-604b-414b-91ea-383ca2121105
https://crash-stats.mozilla.com/report/index/071e19e6-9a53-43ff-a263-2adfd2121105
https://crash-stats.mozilla.com/report/index/67d51a21-869a-4040-9707-53db32121104
https://crash-stats.mozilla.com/report/index/0d547657-c793-41ae-abb2-319fe2121103
https://crash-stats.mozilla.com/report/index/d854e473-3802-490c-9107-488ce2121103
* Aurora
https://crash-stats.mozilla.com/report/index/06a1d6a6-3937-4348-baae-6a9c32121106
* Nightly
https://crash-stats.mozilla.com/report/index/109a856b-541f-43e2-bbca-b0d4b2121104
https://crash-stats.mozilla.com/report/index/e91eed83-fa42-4439-b60f-bc81e2121104

Are these crashes related with this bug? Is this expected in any way or should I file a separate bug?
Comment 26 Scoobidiver (away) 2012-11-07 05:40:33 PST
(In reply to Simona B [QA] from comment #25)
> but neither does Firefox 17 beta 3 (and it should). 
There's no evidence that this bug was an old regression. The patch landed in Beta and Aurora for safety. Bug 758531 tracks remaining crashes in 17.0 Beta.

> The majority of them are on Windows 8 - Bug 805406 takes care of that - but there
> are some crash reports also on Windows 7.
Bug 805406 takes also care of remaining crashes on Windows 7 in the trunk.
Comment 27 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-11-07 11:07:16 PST
Scoobidiver, can this bug be marked verified? If there is follow-up work on other bugs I'm not sure what more QA can do for *this* bug apart from what's already been done in comment 25.
Comment 28 Scoobidiver (away) 2012-11-07 12:22:24 PST
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #27)
> Scoobidiver, can this bug be marked verified?
Yes for the trunk, but no for Beta. This bug is based on a testcase, not on crash stats.
Comment 29 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-11-07 12:52:35 PST
(In reply to Scoobidiver from comment #28)
> Yes for the trunk, but no for Beta. This bug is based on a testcase, not on
> crash stats.

Simona said the testcase does not crash Firefox 17.0b5 in comment 25. Can this not be marked verified for Beta based on that?
Comment 30 Scoobidiver (away) 2012-11-07 13:06:06 PST
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #29)
> Simona said the testcase does not crash Firefox 17.0b5 in comment 25. Can
> this not be marked verified for Beta based on that?
It depends on the meaning of verified:
No, if you verify that a problem no longer exist.
Yes, if you verify that a problem doesn't exist.
Comment 31 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-11-07 13:09:14 PST
Well I don't see that there's anything else QA can do to test this is fixed.

Note You need to log in before you can comment on or make changes to this bug.