Closed Bug 1673716 Opened 3 months ago Closed 2 months ago

Crash in [@ AsyncShutdownTimeout | quit-application-granted | AboutHomeStartupCache: Writing cache]

Categories

(Firefox :: New Tab Page, defect, P3)

Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
85 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox82 --- unaffected
firefox83 --- unaffected
firefox84 --- wontfix
firefox85 --- fixed

People

(Reporter: aryx, Assigned: mconley)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Attachments

(6 files)

1 installation (OS X 10.12)

Possibly Fission-related (DOMFissionEnabled=1).

Crash report: https://crash-stats.mozilla.org/report/index/d27f47b6-89d8-4cc8-b39c-5dcf60201027

MOZ_CRASH Reason: MOZ_CRASH()

Top 10 frames of crashing thread:

0 libmozglue.dylib mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:33
1 XUL NS_DebugBreak xpcom/base/nsDebugImpl.cpp:435
2 XUL nsDebugImpl::Abort xpcom/base/nsDebugImpl.cpp:134
3 XUL NS_InvokeByIndex 
4 XUL XPCWrappedNative::CallMethod js/xpconnect/src/XPCWrappedNative.cpp:1142
5 XUL XPC_WN_CallMethod js/xpconnect/src/XPCWrappedNativeJSOps.cpp:925
6 XUL js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:598
7 XUL Interpret js/src/vm/Interpreter.cpp:3336
8 XUL js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:635
9 XUL js::Call js/src/vm/Interpreter.cpp:680
Priority: -- → P3

Looking at that crash report, we're timing out in an AsyncShutdown blocker with this metadata:

{"phase":"quit-application-granted","conditions":[{"name":"AboutHomeStartupCache: Writing cache","state":"Getting cache streams","filename":"resource:///modules/BrowserGlue.jsm","lineNumber":5194,"stack":["resource:///modules/BrowserGlue.jsm:init:5194","resource:///modules/BrowserGlue.jsm:BG__beforeUIStartup:1419","resource:///modules/BrowserGlue.jsm:BG_observe:1041"]}]}

So the AboutHomeStartupCache is timing out waiting for the privileged about content process to return the cache streams here:

https://searchfox.org/mozilla-central/rev/16d30bafd4e5276d6d3c632fb52a6c71e739cc44/browser/components/BrowserGlue.jsm#5309-5310

I wonder whether this can happen when the privileged about content process is force-killed before it has a chance to respond. We should probably update AboutHomeStartupCache so that, when the privileged about content process causes ipc:content-shutdown to fire, it checks whether a _cacheDeferred is still pending and, if so, resolves it.
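To illustrate the idea above, here is a minimal sketch, not the actual BrowserGlue.jsm code. The class name `StartupCacheSketch` and its methods are hypothetical; only `_cacheDeferred` and the ipc:content-shutdown notification come from this bug. It assumes the pending request can be settled with empty streams when the content process goes away:

```javascript
// Hypothetical stand-in for the cache object; names are illustrative only
// and do not mirror the real BrowserGlue.jsm implementation.
class StartupCacheSketch {
  constructor() {
    // Resolver for the in-flight stream request, or null if none is pending.
    this._cacheDeferred = null;
  }

  // Ask the (simulated) content process for its cache streams.
  requestStreams(sendRequest) {
    return new Promise(resolve => {
      this._cacheDeferred = resolve;
      sendRequest();
    });
  }

  // Called when the content process answers normally.
  onStreamsReceived(streams) {
    this._settle(streams);
  }

  // Called when the content process goes away (the analogue of observing
  // ipc:content-shutdown). Resolve with empty streams rather than hang.
  onContentProcessShutdown() {
    this._settle({ pageStream: null, scriptStream: null });
  }

  // Resolve the pending request at most once.
  _settle(value) {
    if (this._cacheDeferred) {
      const resolve = this._cacheDeferred;
      this._cacheDeferred = null;
      resolve(value);
    }
  }
}
```

With this shape, whichever of the two events arrives first settles the request, and the later one is a no-op, so a shutdown blocker awaiting the request cannot be left waiting on a dead process.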

Assignee: nobody → mconley
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/b38c39fa6915
Resolve the about:home startup cache request if the privileged about content process crashes. r=Gijs
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch

Can you take another look at this? There are still crash reports for recent build IDs.

Flags: needinfo?(mconley)

Yeah, this didn't fix it. :/ It's strictly an improvement, I believe, but didn't fix the overall issue.

Status: RESOLVED → REOPENED
Flags: needinfo?(mconley)
Resolution: FIXED → ---
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5fdcb0dfa86f
Try to make sure the about:home startup cache worker construction Promise always resolves. r=Gijs
Status: REOPENED → RESOLVED
Closed: 3 months ago → 2 months ago
Resolution: --- → FIXED

The severity field is not set for this bug.
:thecount, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(sdowne)

This is currently not enabled by default outside of Firefox Nightly, so setting Severity to N/A.

Severity: -- → N/A
Flags: needinfo?(sdowne)

It is theoretically possible for two about:home documents to be loading
at the same time during startup (for example, if the user passed in a series
of URLs to open via the command line, including about:home more than once).

This patch ensures that only one of those documents (the first to load)
gets to consume the cached streams.
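A rough sketch of the "first one wins" behaviour described in this patch, under the assumption that each requester can be identified by some id. `OneShotStreams` and `requesterId` are hypothetical names for illustration, not identifiers from the patch:

```javascript
// Hypothetical one-shot holder; the id stands in for whatever identifies a
// document or BrowsingContext in the real code.
class OneShotStreams {
  constructor(streams) {
    this._streams = streams;
    this._consumerId = null;
  }

  // Hands the streams to the first caller only. Later callers get null and
  // would have to fall back to loading the page without the cache.
  consume(requesterId) {
    if (this._consumerId !== null) {
      return null;
    }
    this._consumerId = requesterId;
    const streams = this._streams;
    this._streams = null;
    return streams;
  }
}
```

Dropping the internal reference after the first call means a second document cannot read from streams that have already been handed out.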

Attachment #9188629 - Attachment description: Bug 1673716 - [WIP] Make AboutHomeStartupCache handle content process crashes better. → Bug 1673716 - Make AboutHomeStartupCache ignore all but the first privileged about content process. r?Gijs
Attachment #9188733 - Attachment description: Bug 1673716 - Ensure the about:home startup cache can only be used by one document. r?Gijs! → Bug 1673716 - Ensure the about:home startup cache can only be used by one BrowsingContext. r?Gijs!
Attachment #9188629 - Attachment description: Bug 1673716 - Make AboutHomeStartupCache ignore all but the first privileged about content process. r?Gijs → Bug 1673716 - Make AboutHomeStartupCache ignore all but the first privileged about content process. r?Gijs!

The severity field is not set for this bug.
:thecount, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(sdowne)
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dc9ac32cc19b
Ensure the about:home startup cache can only be used by one BrowsingContext. r=Gijs
https://hg.mozilla.org/integration/autoland/rev/113f5e76a871
Make AboutHomeStartupCache ignore all but the first privileged about content process. r=Gijs
Status: REOPENED → RESOLVED
Closed: 2 months ago → 2 months ago
Resolution: --- → FIXED

The crash volume is unchanged since the patches landed. Shall this bug be reopened?

Flags: needinfo?(mconley)

Yep.

Status: RESOLVED → REOPENED
Flags: needinfo?(mconley)
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Flags: needinfo?(sdowne)
Target Milestone: 84 Branch → ---
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7a360fe377ea
Only wait for 1 second when trying to create about:home startup cache on shutdown. r=emalysz
Status: ASSIGNED → RESOLVED
Closed: 2 months ago → 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 85 Branch
Regressions: 1679989

Too early to say whether or not we've brought down the frequency of the AsyncShutdown crashes, but it's clear we haven't eliminated them. Reopening. sigh.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

It looks like the previous patch landings have had no effect whatsoever.

Upon examination, it appears that when a DeferredTask is finalized while a previous run of the task is already underway, it waits for that run to finish rather than starting the task again... so I suppose it's possible that one of the previous cache tasks has just died on us, which is why we're timing out here. In that case, we wouldn't enter the finalized branch of cacheNow that the last patch added. I think maybe I should move the timeout into the onShutdown method instead.

In an earlier attempt to fix this shutdown hang, a timeout was added to the cacheNow
task function to try to have a maximum of 1s of wait time during the shutdown blocker
before giving up and letting the shutdown proceed.

This didn't seem to put a dent in the shutdown hangs. It looks like DeferredTasks
that are being finalized don't actually re-enter the task if the task was already
running, which might explain why in some cases the timeout wasn't being hit. This
patch makes sure that the timeout is being used regardless of whether or not the
cache task is already underway.
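The general technique here, putting the timeout around the whole wait rather than inside the task, can be sketched as below. This is an illustration only: `withTimeout` and `onShutdownSketch` are hypothetical helpers, and the 1000 ms default simply mirrors the 1s mentioned above.

```javascript
// Unique sentinel so a timeout can never be confused with a real result.
const TIMED_OUT = Symbol("timed-out");

// Resolve with `fallback` if `promise` has not settled within `ms`.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise(resolve => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Clear the timer either way so it does not linger after the race.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical shutdown hook: the timeout wraps the wait itself, so it
// applies even when the cache task was already running and never finishes.
async function onShutdownSketch(pendingCacheWrite, timeoutMs = 1000) {
  const outcome = await withTimeout(pendingCacheWrite, timeoutMs, TIMED_OUT);
  return outcome === TIMED_OUT ? "gave up waiting" : "cache written";
}
```

Because the race is set up by the caller, it does not matter whether the task function is re-entered during finalization; a stalled earlier run is still bounded by the same timer.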

Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fc5b2058d79a
Move the AboutHomeStartupCache shutdown blocker timeout to the onShutdown method. r=emalysz
Status: REOPENED → RESOLVED
Closed: 2 months ago → 2 months ago
Resolution: --- → FIXED
Regressions: 1680191
Duplicate of this bug: 1676088
Crash Signature: [@ AsyncShutdownTimeout | quit-application-granted | AboutHomeStartupCache: Writing cache] → [@ AsyncShutdownTimeout | quit-application-granted | AboutHomeStartupCache: Writing cache] [@ NS_DebugBreak(unsigned int, char const*, char const*, char const*, int)]