Closed Bug 1346426 Opened 3 years ago Closed 3 years ago

Crash in mozilla::dom::Timeout::SetWhenOrTimeRemaining

Categories

(Core :: DOM: Core & HTML, defect, critical)

53 Branch
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox52 --- unaffected
firefox-esr52 --- unaffected
firefox53 + fixed
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: philipp, Assigned: bkelly)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-f4dbb08b-0b04-40d7-9678-a8bf82170310.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::dom::Timeout::SetWhenOrTimeRemaining(mozilla::TimeStamp const&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) 	dom/base/Timeout.cpp:123
1 	xul.dll 	<lambda_88e96f87d9c25d3705196c1fe2125061>::operator() 	dom/base/TimeoutManager.cpp:1118
2 	xul.dll 	mozilla::dom::TimeoutManager::Freeze() 	dom/base/TimeoutManager.cpp:1109
3 	xul.dll 	nsGlobalWindow::FreezeInternal() 	dom/base/nsGlobalWindow.cpp:11966
4 	xul.dll 	nsGlobalWindow::SaveWindowState() 	dom/base/nsGlobalWindow.cpp:12836
5 	xul.dll 	nsDocShell::CaptureState() 	docshell/base/nsDocShell.cpp:8261
6 	xul.dll 	nsDocShell::SetupNewViewer(nsIContentViewer*) 	docshell/base/nsDocShell.cpp:9316
7 	xul.dll 	nsDocShell::Embed(nsIContentViewer*, char const*, nsISupports*) 	docshell/base/nsDocShell.cpp:7228
8 	xul.dll 	nsDocShell::CreateContentViewer(nsACString_internal const&, nsIRequest*, nsIStreamListener**) 	docshell/base/nsDocShell.cpp:9196
9 	xul.dll 	nsDSURIContentListener::DoContent(nsACString_internal const&, bool, nsIRequest*, nsIStreamListener**, bool*) 	docshell/base/nsDSURIContentListener.cpp:128
10 	xul.dll 	nsDocumentOpenInfo::TryContentListener(nsIURIContentListener*, nsIChannel*) 	uriloader/base/nsURILoader.cpp:736
11 	xul.dll 	nsDocumentOpenInfo::DispatchContent(nsIRequest*, nsISupports*) 	uriloader/base/nsURILoader.cpp:414
12 	xul.dll 	nsDocumentOpenInfo::OnStartRequest(nsIRequest*, nsISupports*) 	uriloader/base/nsURILoader.cpp:277
13 	xul.dll 	mozilla::net::nsHttpChannel::CallOnStartRequest() 	netwerk/protocol/http/nsHttpChannel.cpp:1322
14 	xul.dll 	mozilla::net::nsHttpChannel::ContinueProcessNormal(nsresult) 	netwerk/protocol/http/nsHttpChannel.cpp:2403
15 	xul.dll 	mozilla::net::nsHttpChannel::ProcessNormal() 	netwerk/protocol/http/nsHttpChannel.cpp:2338
16 	xul.dll 	mozilla::net::nsHttpChannel::ContinueProcessResponse2(nsresult) 	netwerk/protocol/http/nsHttpChannel.cpp:2233
17 	xul.dll 	mozilla::net::nsHttpChannel::ContinueProcessResponse1() 	netwerk/protocol/http/nsHttpChannel.cpp:2090
18 	xul.dll 	mozilla::net::nsHttpChannel::ProcessResponse() 	netwerk/protocol/http/nsHttpChannel.cpp:2010
19 	xul.dll 	mozilla::net::nsHttpChannel::OnStartRequest(nsIRequest*, nsISupports*) 	netwerk/protocol/http/nsHttpChannel.cpp:6509
20 	xul.dll 	nsInputStreamPump::OnStateStart() 	netwerk/base/nsInputStreamPump.cpp:524
21 	xul.dll 	nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) 	netwerk/base/nsInputStreamPump.cpp:426

this cross-platform crash signature is starting to show up in firefox 53 and subsequent builds - so far in rather low volume though.
Flags: needinfo?(bkelly)
It looks to me like we're trying to Freeze() a dummy timeout here.  I could easily add an if-statement and continue for dummy timeouts here:

https://dxr.mozilla.org/mozilla-beta/source/dom/base/TimeoutManager.cpp#1109

We have a similar check for Thaw:

https://dxr.mozilla.org/mozilla-beta/source/dom/base/TimeoutManager.cpp#1137

But its unclear to me how we can have a dummy timeout in the list given the stack in the crash.

Any thoughts?
Flags: needinfo?(bkelly) → needinfo?(ehsan)
Stack showing we are indeed triggering the MOZ_DIAGNOSTIC_ASSERT(mWindow) check in aurora/nightly:

https://crash-stats.mozilla.com/report/index/79f19f4d-68a9-403e-a728-7b3a62170310

The beta build crashes later when we try to use mWindow.
Ehsan, it seems we leave the dummy timeouts in the list if we hit this return?

https://dxr.mozilla.org/mozilla-beta/source/dom/base/TimeoutManager.cpp#570

Should that be a break instead?  Or should we remove the dummy timeouts there?
Maybe we should have a RAII stack class that removes the dummy timers when its destroyed.
I'm just going to patch the nullptr access during freeze for now.  We can clean up the dummy timers in a follow-on bug.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Flags: needinfo?(ehsan)
Long term we should figure out why dummy timers are left in the list sometimes.  For right now, though, lets just not crash if freeze() is called on one of these windows.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e61e631a05d06d240a6b0c212445a7a216a6dca8
Attachment #8846661 - Flags: review?(ehsan)
Comment on attachment 8846661 [details] [diff] [review]
Don't crash if a window is frozen while there is a dummy timer in the list. r=ehsan

Review of attachment 8846661 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry I didn't see your previous comments in time.  You're right in comment 3, we are leaving the dummy timeout in the list, which can break all sorts of things...  I'm afraid this fix may end up not being enough.  :(

We need to remove the dummy timeouts like we do here: https://dxr.mozilla.org/mozilla-beta/rev/65c3d718dd5fb6f11912820a5433ff6d13e6362f/dom/base/TimeoutManager.cpp#602 before the early return in this branch.

You can use a RAII class, but I think it may be a bit nicer to use a MakeScopeExit helper for this and use a closure so that you can access the stack variables more easily.  Up to you, either would be fine!
Attachment #8846661 - Flags: review?(ehsan) → review-
Comment on attachment 8846661 [details] [diff] [review]
Don't crash if a window is frozen while there is a dummy timer in the list. r=ehsan

Ehsan and I spoke on IRC and he agreed this is probably an ok fix to uplift:

http://logs.glob.uno/?c=content#c429406

I'll file a follow-on bug to try to track down these errant dummy timeouts on trunk.
Attachment #8846661 - Flags: review- → review+
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2b52802899ae
Don't crash if a window is frozen while there is a dummy timer in the list. r=ehsan
See Also: → 1346903
https://hg.mozilla.org/mozilla-central/rev/2b52802899ae
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment on attachment 8846661 [details] [diff] [review]
Don't crash if a window is frozen while there is a dummy timer in the list. r=ehsan

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1329284
[User impact if declined]: Low frequency crashes when navigating and going back on sites using timers.
[Is this code covered by automated tests?]: Lots of tests, but they don't trigger this race condition.
[Has the fix been verified in Nightly?]: Its landed on nightly, but difficult to verify without exact STR.
[Needs manual test from QE? If yes, steps to reproduce]:  No
[List of other uplifts needed for the feature/fix]: No
[Is the change risky?]: Minimal risk
[Why is the change risky/not risky?]: This patch adds a nullptr check similar to the one already present in an alternate code path.
[String changes made/needed]: None
Attachment #8846661 - Flags: approval-mozilla-beta?
Attachment #8846661 - Flags: approval-mozilla-aurora?
Tracking as regression in 53.
Comment on attachment 8846661 [details] [diff] [review]
Don't crash if a window is frozen while there is a dummy timer in the list. r=ehsan

fix a crash in 53 and 54
Attachment #8846661 - Flags: approval-mozilla-beta?
Attachment #8846661 - Flags: approval-mozilla-beta+
Attachment #8846661 - Flags: approval-mozilla-aurora?
Attachment #8846661 - Flags: approval-mozilla-aurora+
Setting qe-verify- based on Ben's assessment on manual testing needs (see Comment 11) and the fact that this fix has automated coverage.
Flags: qe-verify-
Depends on: 1340413
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.