crash in mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort (with abort message chrome://browser/content/sanitize.js) and Firefox no longer responding on startup

RESOLVED FIXED

Status

()

Core
General
--
critical
RESOLVED FIXED
2 years ago
a year ago

People

(Reporter: philipp, Assigned: mak)

Tracking

(Depends on: 1 bug, {crash, regression})

44 Branch
x86
Windows NT
crash, regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox44 wontfix, firefox45+ verified, firefox46+ wontfix, firefox47+ wontfix)

Details

(crash signature)

Attachments

(1 obsolete attachment)

(Reporter)

Description

2 years ago
[Tracking Requested - why for this release]:

This bug was filed from the Socorro interface and is 
report bp-a464ca53-513a-40a9-9393-7c5462160215.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	mozglue.dll 	mozalloc_abort(char const* const) 	memory/mozalloc/mozalloc_abort.cpp
1 	xul.dll 	NS_DebugBreak 	xpcom/base/nsDebugImpl.cpp
2 	xul.dll 	nsDebugImpl::Abort(char const*, int) 	xpcom/base/nsDebugImpl.cpp
3 	xul.dll 	NS_InvokeByIndex 	xpcom/reflect/xptcall/md/win32/xptcinvoke.cpp
4 	xul.dll 	XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) 	js/xpconnect/src/XPCWrappedNative.cpp
5 	xul.dll 	XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) 	js/xpconnect/src/XPCWrappedNativeJSOps.cpp
6 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
7 	xul.dll 	Interpret 	js/src/vm/Interpreter.cpp
8 	xul.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp
9 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
10 	xul.dll 	js::CallOrConstructBoundFunction(JSContext*, unsigned int, JS::Value*) 	js/src/jsfun.cpp
11 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
12 	xul.dll 	js::fun_call(JSContext*, unsigned int, JS::Value*) 	js/src/jsfun.cpp
13 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
14 	xul.dll 	js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) 	js/src/vm/Interpreter.cpp
15 	xul.dll 	js::DirectProxyHandler::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/DirectProxyHandler.cpp
16 	xul.dll 	js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) 	js/src/proxy/CrossCompartmentWrapper.cpp
17 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
18 	xul.dll 	Interpret 	js/src/vm/Interpreter.cpp
19 	xul.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp
20 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
21 	xul.dll 	js::CallOrConstructBoundFunction(JSContext*, unsigned int, JS::Value*) 	js/src/jsfun.cpp
22 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
23 	xul.dll 	js::CallOrConstructBoundFunction(JSContext*, unsigned int, JS::Value*) 	js/src/jsfun.cpp
24 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
25 	xul.dll 	js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) 	js/src/vm/Interpreter.cpp
26 	xul.dll 	nsXPCComponents_Utils::CallFunctionWithAsyncStack(JS::Handle<JS::Value>, nsIStackFrame*, nsAString_internal const&, JSContext*, JS::MutableHandle<JS::Value>) 	js/xpconnect/src/XPCComponents.cpp
27 	xul.dll 	NS_InvokeByIndex 	xpcom/reflect/xptcall/md/win32/xptcinvoke.cpp
28 	xul.dll 	XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) 	js/xpconnect/src/XPCWrappedNative.cpp
29 	xul.dll 	XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) 	js/xpconnect/src/XPCWrappedNativeJSOps.cpp
30 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
31 	xul.dll 	Interpret 	js/src/vm/Interpreter.cpp
32 	xul.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp
33 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
34 	xul.dll 	JS::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) 	js/src/jsapi.cpp
35 	xul.dll 	mozilla::dom::AnyCallback::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&) 	obj-firefox/dom/bindings/PromiseBinding.cpp
36 	xul.dll 	mozilla::dom::WrapperPromiseCallback::Call(JSContext*, JS::Handle<JS::Value>) 	dom/promise/PromiseCallback.cpp

AbortMessage 	[11276] ###!!! ABORT: file chrome://browser/content/sanitize.js, line 98

AsyncShutdownTimeout 	{"phase":"Places Database shutdown","conditions":[{"name":"sanitize.js: Sanitize","state":{"progress":{"cache":"cleared","cookies":"cleared","history":"ready","formdata":"ready","downloads":"ready","sessions":"ready"}},"filename":"chrome://browser/content/sanitize.js","lineNumber":98,"stack":["chrome://browser/content/sanitize.js:Sanitizer.prototype.sanitize<:98","resource://gre/modules/Task.jsm:TaskImpl_run:314","resource://gre/modules/Task.jsm:TaskImpl:275","resource://gre/modules/Task.jsm:createAsyncFunction/asyncFunction:249","chrome://browser/content/sanitize.js:Sanitizer.onStartup<:832","resource://gre/modules/Task.jsm:TaskImpl_run:314","resource://gre/modules/Task.jsm:TaskImpl:275","resource://gre/modules/Task.jsm:createAsyncFunction/asyncFunction:249","resource://app/components/nsBrowserGlue.js:BG__finalUIStartup:759","resource://app/components/nsBrowserGlue.js:BG_observe:283"]}]}

the signature is #11 top crasher on release & this crash variant (abort in sanitize.js, line 98) makes up 2/3 of those crashes. it is regressing in firefox 44.

multiple user comments describe that they tried to delete the firefox history and firefox was subsequently stuck in a "not responding" mode after a restart constantly:
https://crash-stats.mozilla.com/search/?signature=%3Dmozalloc_abort+|+NS_DebugBreak+|+nsDebugImpl%3A%3AAbort&abort_message=98&user_comments=~history&_facets=signature&_facets=version&_facets=user_comments&_facets=uptime&_facets=platform_pretty_version&_facets=abort_message&_facets=async_shutdown_timeout&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-user_comments

downgrading to 43 can get you out of the freezing condition on startup again apparently according to https://support.mozilla.org/questions/1110002.
There are various bugs in sanitize on shutdown we are currently handling and I think this is just the tip of the iceberg.
Bug 1243549 is a part of those, it already landed on all the branches
Bug 1244650 is another important part that shows clear on shutdown is broken from a long time
Then there's bug 1245065

Here it's also showing async shutdown is killing us while we try to clear history, that means sanitize takes more than 1 minute.
I'm not really surprised, 1 minute can be a short time if it's the time to clear *all* the sanitize clients. Especially in the async world.
plus we have bug 871908 that is quite bad, but we lack resources to fix it.
Depends on: 871908
Comment hidden (obsolete)
Depends on: 1244650
This is one of the top crash of 45, tracking.
Marco, are you going to take care of that?
tracking-firefox45: ? → +
Flags: needinfo?(mak77)
I will have a look, but first I need to do some planning with the team.
Assignee: nobody → mak77
First of all, the hang on startup is easy to explain, if we fail to sanitize on shutdown, we do on startup. We started doing that "properly" with bug 1089695, in the sense we can now detect interrupted shutdown sanitizations . We could work on avoiding jankiness, but it's a long process we already started years ago involving rewrite of sync APIs and better I/O paths.

I'd first concentrate on making sanitize on startup a rare event in this case.

I looked at the various AsyncShutdownTimeout annotation, on 45 most of them are blocked on "cookies: ready" while on 44 most are on "cookies: failed, history: ready"

The difference between 44 and 45 could be Bug 1243549. That bug still left cookies in the previous not-working state, followup bug 1245065 will fix them.
On the other side Bug 1244650 could already solve the cookies problem, since it starts sanitization at a safer point during shutdown, when the cookies service can do its job.

Now the problem is understanding whether the hang happens due to the cookies issue (we are either at cookies: failed or cookies: ready) or due to history clearing (on 44 we are always at history: ready).

Let's look at telemetry:

FX_SANITIZE_COOKIES indicates about 1% of users for which sanitizing cookies takes around 30s or more
BUT this is not just measuring cookies, indeed by looking at FX_SANITIZE_COOKIES2 it's clear the real culprit is FX_SANITIZE_PLUGINS that is part of FX_SANITIZE_COOKIES

FX_SANITIZE_HISTORY indicates about 0.2% of users for which sanitizing history takes around 30s or more

These users can easily go past the 1 minute phase timeout arbitrarily imposed by async shutdown.

The strange thing is that I'd expect people using clear history on shutdown having mostly an empty history for which sanitization should be quite fast. The assumption is clearly untrue for FX_SANITIZE_PLUGINS, since the time needed to instantiate a plugin and asking it to clear data is not related to it having data or not.

Summing up, we have a 1% of users taking about 30s to clear plugins data, that means we only have 30s left for any other shutdown work in the same phase.
Since we packed up all sanitization and components shutdown in profile-before-change, that's not a lot of time.
Bug 1244650 helps also cause it moves sanitization to profile-change-teardown and since the timeout is per phase, sanitization won't have anymore to share the timeout with components shutdown.

So, while the bug itself is not fixed, the currently landed patches (if uplifted) could already largely reduce its impact to a very tiny % of users.

What we can do further:

1. evaluate whether the 1 minute timeout was a realistic choice, considering the time some users take just to clear plugins data. Increasing the timeout could make regressions less likely to be noticed, but if the increase is well thought it may work. Something like 90s could do.

2. evaluate if we really want to instantiate plugins on shutdown just to clear them... Maybe we could always delay clearing plugins to startup? Maybe we should just stop waiting for plugins data clearing, hoping it will work? Regardless we are already ignoring any error thrown by them.

3. investigate why it takes so much to clear plugins data (I suspect we need to spawn the plugin-container process and go through IPC, that is a slow process), and what we could do to speed it up, but IIRC we already invested time in this in the recent past. I will file a bug in the proper component.

4. investigate history clearing speed up: bug 871908 is likely involved, may not be trivial and it seems to only affect 0.2% of the users clearing history

I will now go through the regressing patch to identify any possible change that could have made us cross the boundary, supposing we were already on the edge of it.
Flags: needinfo?(mak77)
differences after bug 1089695:
- cookies were the first thing to be cleared, now cache is cleared before them, telemetry tells me in general cache should take less than 10 seconds. Still it's likely starting cookies clearing before can help, considered plugins clearing time. Bug 1244650 helps here by starting all the clearing tasks in a strict loop.
- we were not waiting for most of the async sanitizations, we were basically waiting only for cookies. The previous behavior was just wrong.
- we were not yielding to avoid UI jankiness. Probably we should not do this when there's no UI (like on shutdown). I will file a bug for this.
- history was cleared synchronously. The amount of work should be the same, priority could differ since it's a different thread. The only thing that can help us here is profiling and bug 871908. Bug 1244650 helps by starting the process earlier and not concentrating all the cleanup work in a single shutdown phase.
- we are now properly handling partial sanitizations by reissuing sanitize on startup. This has likely nothing to do with the shutdown crash, is probably related to the hang on startup, but should be handled apart.
Depends on: 1249333
Depends on: 1249336

Comment 7

2 years ago
Here is my crash report:

https://crash-stats.mozilla.com/report/index/00e9a6cb-6cf3-4008-a305-72d3b2160219

Firefox's process doesn't get killed after closing Firefox. It started in Firefox 45.0b7, b6 didn't have this problem. Here is some debug output which may help:

WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"sanitize.js: Sanitize","state":{"progress":{"cache":"ready","cookies":"ready","offlineApps":"ready","history":"ready","formdata":"ready","downloads":"ready","sessions":"cleared"}},"filename":"chrome://browser/content/sanitize.js","lineNumber":79,"stack":["chrome://browser/content/sanitize.js:Sanitizer.prototype.sanitize<:79","resource://gre/modules/Task.jsm:TaskImpl_run:315","resource://gre/modules/Task.jsm:TaskImpl:276","resource://gre/modules/Task.jsm:createAsyncFunction/asyncFunction:250","chrome://browser/content/sanitizeDialog.js:gSanitizePromptDialog.sanitize:114","chrome://global/content/bindings/dialog.xml line 375 > Function:anonymous:1","chrome://global/content/bindings/dialog.xml:_fireButtonEvent:376","chrome://global/content/bindings/dialog.xml:_doButtonCommand:344","chrome://global/content/bindings/dialog.xml:_handleButtonCommand:332","chrome://browser/content/sanitize.js:Sanitizer.showUI:748","chrome://browser/content/sanitize.js:Sanitizer.sanitize:761","resource://app/components/nsBrowserGlue.js:BG_sanitize:2178","chrome://browser/content/browser.xul:oncommand:1"]}] Barrier: Places Clients shutdown
WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"Places Clients shutdown","state":"(none)","filename":"/builds/slave/rel-m-beta-l64_bld-00000000000/build/toolkit/components/places/Database.cpp","lineNumber":488,"stack":["resource://gre/components/nsAsyncShutdown.js:nsAsyncShutdownClient.prototype.addBlocker:163","chrome://browser/content/sanitize.js:Sanitizer.onStartup<:767","resource://gre/modules/Task.jsm:TaskImpl_run:315","resource://gre/modules/Task.jsm:TaskImpl:276","resource://gre/modules/Task.jsm:createAsyncFunction/asyncFunction:250","resource://app/components/nsBrowserGlue.js:BG__finalUIStartup:764","resource://app/components/nsBrowserGlue.js:BG_observe:287"]}] Barrier: profile-change-teardown
console.error: 
  Message: Error: SessionFile is closed
  Stack:
    SessionFileInternal.write@resource://app/modules/sessionstore/SessionFile.jsm:277:29
this.SessionFile.write@resource://app/modules/sessionstore/SessionFile.jsm:76:12
SessionSaverInternal._writeState@resource://app/modules/sessionstore/SessionSaver.jsm:242:12
SessionSaverInternal._saveState@resource://app/modules/sessionstore/SessionSaver.jsm:213:12
SessionSaverInternal._saveStateAsync@resource://app/modules/sessionstore/SessionSaver.jsm:226:5
SessionSaverInternal.runDelayed/this._timeoutID<@resource://app/modules/sessionstore/SessionSaver.jsm:145:40
setTimeout_timer@resource://gre/modules/Timer.jsm:30:5
Spinner.prototype.observe@resource://gre/modules/AsyncShutdown.jsm:529:9
See Also: → bug 1206393
See Also: → bug 1187196
David, does that "SessionFile is closed" error raise any idea?
Flags: needinfo?(dteller)
(In reply to Markus Popp from comment #7)
> Here is my crash report:
> 
> https://crash-stats.mozilla.com/report/index/00e9a6cb-6cf3-4008-a305-
> 72d3b2160219

Could you please tell me the builtfrom value from about:buildconfig?
Flags: needinfo?(markus.popp)
Also, did you set Clear history on shutdown, or manually clear from the dialog?
Comment hidden (obsolete)
Comment hidden (obsolete)
Blocks: 1206393
(In reply to Marco Bonardo [::mak] from comment #8)
> David, does that "SessionFile is closed" error raise any idea?

Investigating this over irc.
Flags: needinfo?(dteller)

Comment 14

2 years ago
(In reply to Marco Bonardo [::mak] from comment #9)
> Could you please tell me the builtfrom value from about:buildconfig?

Built from https://hg.mozilla.org/releases/mozilla-beta/rev/3a671727fd14120dd85362997eb84d0a7ce41a11

> Also, did you set Clear history on shutdown, or manually clear from the dialog?

I cleared manually from the dialog.
Flags: needinfo?(markus.popp)
I was able to reproduce the hang in beta, in the end the problem is, as I suspected, bug 871908.

Removing pages from moz_places table creates a long transaction that causes the wal to grow insanely and doing that asynchronously causes a deadlock with queries on the main thread (the async thread is constantly reading from disk and holds the shared mutex).
Once the main thread is blocked it doesn't matter if the other clearings can proceed, we will just timeout and abort. This is the risk when we use the same db connection across 2 threads, and there's nothing we can do until all of Places APIs are async...

For example on shutdown we may deadlock with Windows7Jumplists trying to access history.hasHistoryEntries.
The startup hang is just the bookmarks service trying to initialize synchronously.

Why did we start noticing the cost of bug 871908 only after the fix for bug 1089695?
- clear history on startup before was happening rarely, especially cause we were ignoring most shutdown errors. Now that it runs more often on startup, it clearly hangs more often.
- on shutdown we execute queries asynchronously instead of synchronously and we wait for them.

I'm looking for a low risk fix that will "hide" the issue, that is how it was before, since I don't think we have the space for a correct fix.
Depends on: 1249786
(In reply to Marco Bonardo [::mak] from comment #16)
> - clear history on startup before was happening rarely, especially cause we
> were ignoring most shutdown errors. Now that it runs more often on startup,
> it clearly hangs more often.

And the just found bug 1249786.

Comment 18

2 years ago
it was ok on or before ff45b6 but 4 shutdown crashes every time since ff45b7

https://crash-stats.mozilla.com/report/index/942692f1-75a5-4735-b22b-d29f32160219
https://crash-stats.mozilla.com/report/index/f2fa5c8e-c19f-4317-add0-e93a42160219
https://crash-stats.mozilla.com/report/index/dc995a15-c1a2-412e-ac3b-8d72d2160220
https://crash-stats.mozilla.com/report/index/a1a5d3a9-cc4b-458b-9c45-e2a332160221

not sure if these are referring to the same cause
yes, it's the same cause.
https://bugzilla.mozilla.org/show_bug.cgi?id=1244650#c38 found another reason for a timeout abort, bug 976940
Depends on: 976940
[Tracking Requested - why for this release]: Crashing on shutdown.
status-firefox44: affected → wontfix
tracking-firefox44: ? → ---
tracking-firefox46: --- → ?
tracking-firefox47: --- → ?
status-firefox46: --- → affected
status-firefox47: --- → affected
Duplicate of this bug: 1249512
Philipp mentioned it's a top crash on 45.0b8. Tracked for 46 and 47.
tracking-firefox46: ? → +
tracking-firefox47: ? → +
(Reporter)

Comment 24

2 years ago
this signature seems to have become much worse in 45.0b7 (not released) and 45.0b8 - in very early data for these builds it makes up >50% of all crashes!

some users clearly connect the issue to the latest beta update:
https://crash-stats.mozilla.com/search/?product=Firefox&version=45.0b8&signature=%3Dmozalloc_abort+|+NS_DebugBreak+|+nsDebugImpl%3A%3AAbort&_facets=signature&_facets=user_comments&_facets=version&_facets=platform_pretty_version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-user_comments
yep, to fix this we need at least:
bug 1249786 - since currently we are sanitizing at every startup, even when not needed
bug 976940 - users with disabled form history crash every time
bug 871908 - this is the less trivial fix for which I'm trying to find a solution that is small enough to be uplifted.
Depends on: 1250363
No longer depends on: 871908, 1249336
Summary: crash in mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort and Firefox no longer responding on startup → crash in mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort (with abort message chrome://browser/content/sanitize.js) and Firefox no longer responding on startup

Comment 26

2 years ago
b7 looks like this went off the rails.  would another short term strategy be to back out the changes in b7 that lead to the dramatic increase in this signature, then try to go back to work on figuring out the complex set of things that are needed to solve the bigger underlying problem and let that set of fixes ride trains.

build  rank pct-of-crashes

45.b8    1 50.61% new mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 
45.b7    1 60.46% new mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 

45.b6    6 2.31% new  mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort
45.b5    7 2.09% 0.0  mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort
45.b4   10 1.57%      mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 
45.b3    9 5 1.54% -0.mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 
45.b2   10 1.44%      mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 
45.b1    3 2.93%      mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 
44.0.2  12  1.18%     mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 
44.0.1  23 -14 	0.58% mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 
44.0    10 1.06%      mozalloc_abort | NS_DebugBreak | nsDebugImpl::Abort 

looks like there are also a number of shutdownhang | WaitForSingleObjectEx .... and 
shutdownhang | _VEC_memset signatures that were part of 44.0-45.b6 that have reduced in relative volume since this exploded.  are any of those related or is it just the case that we crash hear earlier now in shut down and avoid ever getting to the conditions of those shutdownhang bugs?
(In reply to philipp from comment #24)
> this signature seems to have become much worse in 45.0b7 (not released) and
> 45.0b8 - in very early data for these builds it makes up >50% of all crashes!

It also got much worse on nightly (similar massive spike in crashes) starting with the February 18 nightly build.  The pushlog for the regression worsening on nightly is:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=709f559b5406&tochange=1150ac4755c7

(I previously thought this bug was about that spike, and didn't notice that it was filed two days before it happened.)

Comment 28

2 years ago
b7 was never released, please ignore it. b8 is what you want to take a look at, and its data is *very* early as it was only released a few hours ago and updates were already disabled again for this bug. In any case, this very probably means that we will not release 45 on time.

Comment 29

2 years ago
bug 1249512 has STR.
(In reply to magicp from comment #29)
> bug 1249512 has STR.

the "disable form history" hang is simple to fix, bug 976940 will land as soon as the tree reopens. But I doubt all the hangs are just due to people disabling form history.

A part of them is very likely due to history clearing (bug 1250363 contains a workaround for the more complex technical debt bug we have from quite some time, needs some additional hours of testing), and a part is due to the fact we are sanitizing on startup also when not needed (bug 1249786, unfortunately hit a, so far, unexplained talos regression).

The critical point here is the beta delay, for which we may evaluate a complex backout of various changesets to go back to a situation were issues exists, but they don't cause shutdown crashes.

That involves backing out the following changesets (from the newest to the oldest):
https://hg.mozilla.org/releases/mozilla-beta/rev/32608d640a8c
https://hg.mozilla.org/releases/mozilla-beta/rev/8ade38cf0509
https://hg.mozilla.org/releases/mozilla-beta/rev/fb1653512406
https://hg.mozilla.org/releases/mozilla-beta/rev/46bf92e380db
https://hg.mozilla.org/releases/mozilla-beta/rev/b02e8e8f06da
https://hg.mozilla.org/releases/mozilla-beta/rev/466f3cc06a67
https://hg.mozilla.org/releases/mozilla-beta/rev/9a6d45264eb2

some backouts may not be trivial cause other stuff landed in the middle. This will bring us back to 44.0a1 that should also help with the spike of crashes in 44 due to Bug 1089695.
Blocks: 1250424
I cloned this bug to bug 1250424, that will track work to properly fix this in Nightly 47 and Aurora 46.

Here I'll take care of the backouts in Beta 45.
status-firefox46: affected → wontfix
status-firefox47: affected → wontfix
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5273eb9729b1
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0de004c9d029
https://treeherder.mozilla.org/#/jobs?repo=try&revision=25320fc5c047
Landed the backout, now I'll have some fun marking the backed out bugs flags...
https://hg.mozilla.org/releases/mozilla-beta/rev/d071c2a3c872
https://hg.mozilla.org/releases/mozilla-beta/rev/18a7dc1d454f
https://hg.mozilla.org/releases/mozilla-beta/rev/b9e0e84444f8
https://hg.mozilla.org/releases/mozilla-beta/rev/6eb8ca2c57ec
https://hg.mozilla.org/releases/mozilla-beta/rev/ce5224000cec
https://hg.mozilla.org/releases/mozilla-beta/rev/30d48874cea4
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox45: affected → fixed
Resolution: --- → FIXED
(Reporter)

Comment 36

2 years ago
the affected user at https://support.mozilla.org/questions/1110002 confirmed that the original behaviour from 43 (with no crashes and no hangs) was restored in the latest beta builds for 45 :-)
status-firefox45: fixed → verified
Thanks for reporting back!
You need to log in before you can comment on or make changes to this bug.