Intermittent test_webgl_conformance_test_suite.html | application crashed [@ mozilla::WebGLContext::ContextLossCallbackStatic(nsITimer*, void*)]

RESOLVED FIXED in Firefox 32

Status

defect
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: RyanVM, Assigned: jgilbert)

Tracking

({crash, intermittent-failure})

Trunk
mozilla34
x86_64
macOS

Firefox Tracking Flags

(firefox31 unaffected, firefox32+ fixed, firefox33+ fixed, firefox34 fixed, firefox-esr24 unaffected, firefox-esr31 wontfix, b2g-v1.4 unaffected, b2g-v2.0 fixed, b2g-v2.1 fixed)

Details

Attachments

(3 attachments, 5 obsolete attachments)

This started when cache2 was enabled in automation on Aurora and Beta. I'll leave it for the WebGL folks to summarize the email chain where it's been discussed up to this point.

https://tbpl.mozilla.org/php/getParsedLog.php?id=46498676&tree=Mozilla-Beta

Rev5 MacOSX Mountain Lion 10.8 mozilla-beta debug test mochitest-1 on 2014-08-21 13:19:19 PDT for push 53d300e03f5b
slave: talos-mtnlion-r5-057

13:38:45  WARNING -  TEST-UNEXPECTED-FAIL | /tests/content/canvas/test/webgl-conformance/test_webgl_conformance_test_suite.html | application terminated with exit code 1
13:38:45     INFO -  INFO | runtests.py | Application ran for: 0:16:50.403456
13:38:45     INFO -  INFO | zombiecheck | Reading PID log: /var/folders/48/lhw5r9md7k706sy9km0t8zkm00000w/T/tmpztqwX4pidlog
13:38:59  WARNING -  PROCESS-CRASH | /tests/content/canvas/test/webgl-conformance/test_webgl_conformance_test_suite.html | application crashed [@ mozilla::WebGLContext::ContextLossCallbackStatic(nsITimer*, void*)]
13:38:59     INFO -  Crash dump filename: /var/folders/48/lhw5r9md7k706sy9km0t8zkm00000w/T/tmptQnrJO/minidumps/EB9538EE-161D-4592-9F07-5887DC2241CD.dmp
13:38:59     INFO -  Operating system: Mac OS X
13:38:59     INFO -                    10.8.0 12A269
13:38:59     INFO -  CPU: amd64
13:38:59     INFO -       family 6 model 42 stepping 7
13:38:59     INFO -       8 CPUs
13:38:59     INFO -  Crash reason:  EXC_BAD_ACCESS / 0x0000000d
13:38:59     INFO -  Crash address: 0x0
13:38:59     INFO -  Thread 0 (crashed)
13:38:59     INFO -   0  XUL!mozilla::WebGLContext::ContextLossCallbackStatic(nsITimer*, void*) [WebGLContextLossTimer.cpp:53d300e03f5b : 46 + 0x0]
13:38:59     INFO -      rbx = 0x000000010f796800   r12 = 0x0000000123954200
13:38:59     INFO -      r13 = 0x00007fff5fbfd300   r14 = 0x0000000000000002
13:38:59     INFO -      r15 = 0x000000010052fb02   rip = 0x0000000102d34b2e
13:38:59     INFO -      rsp = 0x00007fff5fbfd1e0   rbp = 0x00007fff5fbfd1f0
13:38:59     INFO -      Found by: given as instruction pointer in context
13:38:59     INFO -   1  XUL!nsTimerImpl::Fire() [nsTimerImpl.cpp:53d300e03f5b : 618 + 0x9]
13:38:59     INFO -      rbx = 0x0000000102d34ae0   r12 = 0x0000000123954200
13:38:59     INFO -      r13 = 0x00007fff5fbfd300   r14 = 0x0000000000000002
13:38:59     INFO -      r15 = 0x000000010052fb02   rip = 0x00000001014c1e58
13:38:59     INFO -      rsp = 0x00007fff5fbfd200   rbp = 0x00007fff5fbfd270
13:38:59     INFO -      Found by: call frame info
13:38:59     INFO -   2  XUL!nsTimerEvent::Run() [nsTimerImpl.cpp:53d300e03f5b : 711 + 0x4]
13:38:59     INFO -      rbx = 0x000000010ff90350   r12 = 0x000000010052fa00
13:38:59     INFO -      r13 = 0x00007fff5fbfd397   r14 = 0x000000000000ddf3
13:38:59     INFO -      r15 = 0x000000010052fb7c   rip = 0x00000001014c2110
13:38:59     INFO -      rsp = 0x00007fff5fbfd280   rbp = 0x00007fff5fbfd2a0
13:38:59     INFO -      Found by: call frame info
13:38:59     INFO -   3  XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:53d300e03f5b : 766 + 0x5]
13:38:59     INFO -      rbx = 0x000000010052faa0   r12 = 0x000000010052fa00
13:38:59     INFO -      r13 = 0x00007fff5fbfd397   r14 = 0x000000010052fb7c
13:38:59     INFO -      r15 = 0x000000010052fb7c   rip = 0x00000001014bc76c
13:38:59     INFO -      rsp = 0x00007fff5fbfd2b0   rbp = 0x00007fff5fbfd380
13:38:59     INFO -      Found by: call frame info
13:38:59     INFO -   4  XUL!NS_ProcessPendingEvents(nsIThread*, unsigned int) [nsThreadUtils.cpp:53d300e03f5b : 210 + 0xe]
13:38:59     INFO -      rbx = 0x0000000000000000   r12 = 0x000000010052faa0
13:38:59     INFO -      r13 = 0x00007fff5fbfd397   r14 = 0x0000000000000014
13:38:59     INFO -      r15 = 0x00000000001857fb   rip = 0x0000000101429581
13:38:59     INFO -      rsp = 0x00007fff5fbfd390   rbp = 0x00007fff5fbfd3c0
13:38:59     INFO -      Found by: call frame info
13:38:59     INFO -   5  XUL!nsBaseAppShell::NativeEventCallback() [nsBaseAppShell.cpp:53d300e03f5b : 98 + 0xe]
13:38:59     INFO -      rbx = 0x000000010c21a5c0   r12 = 0x0000000000000000
13:38:59     INFO -      r13 = 0x0000000106700430   r14 = 0x000000010052faa0
13:38:59     INFO -      r15 = 0x000000010c21a500   rip = 0x00000001026124d7
13:38:59     INFO -      rsp = 0x00007fff5fbfd3d0   rbp = 0x00007fff5fbfd3f0
13:38:59     INFO -      Found by: call frame info
13:38:59     INFO -   6  XUL!nsAppShell::ProcessGeckoEvents(void*) [nsAppShell.mm:53d300e03f5b : 286 + 0x7]
13:38:59     INFO -      rbx = 0x0000000100127280   r12 = 0x0000000000000000
13:38:59     INFO -      r13 = 0x0000000106700430   r14 = 0x0000000100127298
13:38:59     INFO -      r15 = 0x000000010c21a5c0   rip = 0x00000001025c523e
13:38:59     INFO -      rsp = 0x00007fff5fbfd400   rbp = 0x00007fff5fbfd440
13:38:59     INFO -      Found by: call frame info
Assignee: nobody → jgilbert
Blocks: 1053517
Posted patch floating-timer (obsolete) — Splinter Review
Attachment #8477049 - Flags: review?(dglastonbury)
Comment on attachment 8477049 [details] [diff] [review]
floating-timer

Review of attachment 8477049 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/canvas/WebGLContextLossHandler.cpp
@@ +25,5 @@
> +void
> +WebGLContextLossHandler::StartTimer(unsigned long delayMS)
> +{
> +    RefPtr<WebGLContextLossHandler> timerRef = this;
> +    WebGLContextLossHandler* tempRefForTimer = timerRef.forget().drop();

Please comment what's going on here. It looks scary and sharp enough to cut fingers off.
Attachment #8477049 - Flags: review?(dglastonbury) → review-
Posted patch floating-timer (obsolete) — Splinter Review
Let's just do the AddRef/Release manually. The usages are extremely local, so it shouldn't be too dangerous, and it's generally easier to understand than leaking a TemporaryRef<> between scopes across a void* boundary.
Attachment #8477049 - Attachment is obsolete: true
Attachment #8477079 - Flags: review?(dglastonbury)
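For readers following along, here is a minimal sketch of the manual AddRef/Release pattern described in the comment above. It is not the attached patch: `Handler`, `FakeTimer`, `StartTimer`, `TimerCallback`, and `DemoOwnerReleasesFirst` are hypothetical stand-ins, and the timer is a plain struct rather than nsITimer. The idea is that the pending timer holds its own reference, taken just before the raw pointer crosses the void* boundary and dropped inside the callback.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not the real Gecko classes.
static int gLiveHandlers = 0;  // number of Handler objects currently alive

class Handler {
public:
    Handler() { ++gLiveHandlers; }
    void AddRef() { ++mRefCnt; }
    void Release() { if (--mRefCnt == 0) delete this; }
    void OnTimer() { /* the context-loss check would go here */ }
private:
    ~Handler() { --gLiveHandlers; }  // only Release() may destroy the object
    int mRefCnt = 0;
};

// A C-style one-shot timer: it can only store a function pointer and a void*.
struct FakeTimer {
    void (*callback)(void*) = nullptr;
    void* closure = nullptr;
    void Fire() {
        void (*cb)(void*) = callback;
        void* data = closure;
        callback = nullptr;
        closure = nullptr;
        if (cb) cb(data);
    }
};

static void TimerCallback(void* closure) {
    Handler* handler = static_cast<Handler*>(closure);
    handler->OnTimer();
    handler->Release();  // balances the AddRef in StartTimer
}

static void StartTimer(FakeTimer& timer, Handler* handler) {
    handler->AddRef();  // the pending timer now holds its own reference
    timer.callback = TimerCallback;
    timer.closure = handler;
}

struct DemoResult {
    int liveBeforeFire;
    int liveAfterFire;
};

// The owner drops its reference while the timer is still pending.
DemoResult DemoOwnerReleasesFirst() {
    FakeTimer timer;
    Handler* handler = new Handler();
    handler->AddRef();           // owner's reference
    StartTimer(timer, handler);  // timer's reference
    handler->Release();          // owner goes away; the timer's ref keeps it alive
    DemoResult result;
    result.liveBeforeFire = gLiveHandlers;
    timer.Fire();                // callback runs on a live object, then releases it
    result.liveAfterFire = gLiveHandlers;
    return result;
}
```

In this sketch, calling `DemoOwnerReleasesFirst()` shows the object surviving until the callback has run, which is the property the crash in this bug was missing. The sketch has no cancel path, so it does not address what happens when the timer is cancelled.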
Attachment #8477079 - Flags: review?(dglastonbury) → review+
Posted patch floating-timer (obsolete) — Splinter Review
r=kamidphish
Updated patch.
https://tbpl.mozilla.org/?tree=Try&rev=e0b87dc68889
Attachment #8477079 - Attachment is obsolete: true
Attachment #8477117 - Flags: review+
Posted patch floating-timer-aurora (obsolete) — Splinter Review
Ported to Aurora.
r=kamidphish
https://tbpl.mozilla.org/?tree=Try&rev=d2a1f751367a
Attachment #8477118 - Flags: review+
Posted patch floating-timer-beta (obsolete) — Splinter Review
Beta backport.
r=kamidphish
https://tbpl.mozilla.org/?tree=Try&rev=bf6ff796f6a5
Attachment #8477122 - Flags: review+
The trunk patch has an intermittent leak :(
https://tbpl.mozilla.org/php/getParsedLog.php?id=46525395&tree=Try

Haven't seen it on the Aurora/Beta runs yet, but I retriggered more debug tests to see if it reproduces there or not.
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #10)
> The trunk patch has an intermittent leak :(
> https://tbpl.mozilla.org/php/getParsedLog.php?id=46525395&tree=Try
> 
> Haven't seen it on the Aurora/Beta runs yet, but I retriggered more debug
> tests to see if it reproduces there or not.

Leaks on Aurora/Beta too.
My bad. When we cancel the timer, we never Released our ref. It's not safe to do so, however, in case a callback is already headed our way.

Fix in:
https://tbpl.mozilla.org/?tree=Try&rev=b0b23a677b64
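The leak and the constraint described above can be illustrated with a small sketch. This is one possible way to satisfy both requirements, not necessarily what the Try push does: the comment does not spell out the fix, and `LossHandler`, `QueuedCallback`, `QueueTimer`, `LossTimerCallback`, and `DemoDisableThenDeliver` are hypothetical names. The assumption here is that "cancelling" only sets a flag, and the timer's reference is released in exactly one place, the callback, which is allowed to arrive late and do nothing.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the code that landed.
static int gLiveLossHandlers = 0;  // LossHandler objects currently alive
static int gChecksRun = 0;         // times the callback did real work

class LossHandler {
public:
    LossHandler() { ++gLiveLossHandlers; }
    void AddRef() { ++mRefCnt; }
    void Release() { if (--mRefCnt == 0) delete this; }
    // "Cancel" only flips a flag. The timer's reference is left alone,
    // because a callback may already be on its way.
    void Disable() { mDisabled = true; }
    void OnTimer() {
        if (mDisabled) return;  // stale callback: do nothing
        ++gChecksRun;
    }
private:
    ~LossHandler() { --gLiveLossHandlers; }
    int mRefCnt = 0;
    bool mDisabled = false;
};

// Models a callback that has already been queued and cannot be recalled.
struct QueuedCallback {
    void (*callback)(void*);
    void* closure;
    void Deliver() { callback(closure); }
};

static void LossTimerCallback(void* closure) {
    LossHandler* handler = static_cast<LossHandler*>(closure);
    handler->OnTimer();
    handler->Release();  // the only place the timer's reference is dropped
}

static QueuedCallback QueueTimer(LossHandler* handler) {
    handler->AddRef();  // reference owned by the queued callback
    return QueuedCallback{LossTimerCallback, handler};
}

struct DisableResult {
    int liveWhilePending;
    int liveAfterDelivery;
    int checksRun;
};

DisableResult DemoDisableThenDeliver() {
    LossHandler* handler = new LossHandler();
    handler->AddRef();                             // owner's reference
    QueuedCallback queued = QueueTimer(handler);
    handler->Disable();                            // owner "cancels" the timer
    handler->Release();                            // owner drops its reference
    DisableResult result;
    result.liveWhilePending = gLiveLossHandlers;   // still alive: no use-after-free
    queued.Deliver();                              // late callback arrives anyway
    result.liveAfterDelivery = gLiveLossHandlers;  // released: no leak
    result.checksRun = gChecksRun;
    return result;
}
```

The trade-off in this sketch is that a disabled handler stays alive until its last queued callback has been delivered, so the design relies on every queued callback eventually being delivered.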
No leaks or crashes even with copious retriggers. LGTM!
Unfortunately, there's a serious issue with the previous patch, so this needs re-review.
Attachment #8477117 - Attachment is obsolete: true
Attachment #8477118 - Attachment is obsolete: true
Attachment #8477122 - Attachment is obsolete: true
Attachment #8477606 - Flags: review?(dglastonbury)
It's Saturday in Australia, but I think since we missed the go-to-build, this can wait until Monday.
(In reply to Jeff Gilbert [:jgilbert] from comment #18)
> It's Saturday in Australia, but I think since we missed the go-to-build,
> this can wait until Monday.

Given that we're building Firefox 32 RC1 on Monday and we have had a ton of trouble with beta9, I would really prefer to see this patch reviewed, landed, and in a build that is ready to go by Monday morning if possible. Sorry to ask for the weekend work but I want to mitigate the schedule risk associated with this very late landing as much as possible. If you guys can find the time to do the review and land in the next two days, I will be in your debt. If not, we will deal with this on Monday.
Attachment #8477606 - Flags: review?(dglastonbury) → review+
Hoping that this is stuck on inbound by the time that Jeff wakes up so we can get this rebased for Aurora/Beta uplift ASAP.

https://hg.mozilla.org/integration/mozilla-inbound/rev/34de348f9529
Inbound build looks good. 

Jeff - Can you please create a beta branch patch ASAP as we need to get this on beta so that we can build and merge to release in time to create today's RC.
Flags: needinfo?(jgilbert)
(In reply to Lawrence Mandel [:lmandel] from comment #23)
> Inbound build looks good. 
> 
> Jeff - Can you please create a beta branch patch ASAP as we need to get this
> on beta so that we can build and merge to release in time to create today's
> RC.

I am working on it now.
Flags: needinfo?(jgilbert)
https://hg.mozilla.org/mozilla-central/rev/34de348f9529
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Posted patch beta patch — Splinter Review
r=kamidphish
Try is closed, so I can't Try the patch.
Attachment #8478570 - Flags: review+
Posted patch aurora patch — Splinter Review
r=kamidphish
Attachment #8478571 - Flags: review+
Comment on attachment 8478570 [details] [diff] [review]
beta patch

Approval Request Comment
[Feature/regressing bug #]: This one.
[User impact if declined]: Some crashes due to race conditions, uncovered by our tests.
[Describe test coverage new/current, TBPL]: On Central
[Risks and why]: Low.
[String/UUID change made/needed]: None
Attachment #8478570 - Flags: approval-mozilla-beta?
Comment on attachment 8478571 [details] [diff] [review]
aurora patch

Approval Request Comment
See beta request.
Attachment #8478571 - Flags: approval-mozilla-aurora?
Comment on attachment 8478570 [details] [diff] [review]
beta patch

While I would normally like to see this in a try run, the patch has hit m-c and this is the last change required for the RC build today. As such, we'll see results of the RC build. Beta+

Thanks for the quick turnaround on this one, Jeff.
Attachment #8478570 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8478570 [details] [diff] [review]
beta patch

Beta 32 was merged to release earlier today. Approving for release as well.
Attachment #8478570 - Flags: approval-mozilla-release+
Attachment #8478571 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 980178
Blocks: 1174043