[e10s] Crash in mozalloc_abort | NS_DebugBreak | ErrorLoadingBuiltinSheet when using specific user.js

VERIFIED FIXED in Firefox 50

Status

()

defect
--
critical
VERIFIED FIXED
3 years ago
3 years ago

People

(Reporter: matthieu, Assigned: heycam)

Tracking

({crash, reproducible})

48 Branch
mozilla52
Unspecified
Windows
Points:
---

Firefox Tracking Flags

(e10s?, firefox48 wontfix, firefox49 wontfix, firefox50 verified, firefox51 verified, firefox52 verified)

Details

(crash signature)

Attachments

(5 attachments)

Posted file user.js
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160823121617

Steps to reproduce:

We just  use a specific user.js for our intranet.

When the version firefox 48 is installed, firefox crashes and has to repaired.

We have to put firefox 47


Actual results:

firefox crash


Expected results:

no crash
Thanks for taking the time to report this!

Please follow the steps on http://support.mozilla.com/kb/Firefox%20crashes and report back here.
Please provide the crash ID from about:crashes : https://developer.mozilla.org/en/How_to_get_a_stacktrace_for_a_bug_report explains how to do this. When doing so, please also add the keyword "crashreportid" to the "Keywords" field of this report.
Flags: needinfo?(matthieu)
I'm able to reproduce the crash with e10s enabled.

Reporter, can you type about:config in the location bar and search for values for strings:
browser.tabs.remote.autostart
browser.tabs.remote.autostart.2

In addition, could you type about:crashes and post some crash reports links (bp-...).
Here 's the values :
browser.tabs.remote.autostart    : false
browser.tabs.remote.autostart.2  : true

I work on a Intranet where user can use Firefox or Internet Explorer.
Since yesterday, we have more than Hundred call about that. 

So I think that any config with Firefox 48 and our user.js will crash.
(In reply to Mattig from comment #3)
> browser.tabs.remote.autostart.2  : true

I think it's the issue. Could you set this string to "false", restart Firefox and test again. Crashes should stop.
I can confim the issue: every tab crash when Firefox is starting after saving this user.js in the profile.

I can reproduce it in both cases:
1)
browser.tabs.remote.autostart=true
browser.tabs.remote.autostart.2=false
2) (default scenario in FF48)
browser.tabs.remote.autostart=false
browser.tabs.remote.autostart.2=true

CR:
https://crash-stats.mozilla.com/report/index/e93f80a1-eb98-47f0-b5ea-a70912160906
https://crash-stats.mozilla.com/report/index/ba9dcde0-7dbf-421b-8bbf-d95082160906
Severity: normal → critical
Status: UNCONFIRMED → NEW
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | ErrorLoadingBuiltinSheet ]
tracking-e10s: --- → ?
Component: Untriaged → Security: Process Sandboxing
Ever confirmed: true
Keywords: crash, reproducible
OS: Unspecified → Windows
Summary: Crash with User.js and Firefox 48 → [e10s] Crash in mozalloc_abort | NS_DebugBreak | ErrorLoadingBuiltinSheet when using specific user.js
Posted file prefs.js
STR: You can create a clean profile and copy the pref.js file I attached.
Thanks.
We afraid that in a future update, the pref.js will be replaced ...

So We prefer to install firefox 47 following that link 
 https://download-installer.cdn.mozilla.net/pub/firefox/releases/47.0.1
and stop automatic update.

The problem is also on Mac OS
Did you test my suggestion in comment #4?
You can use the config file to set the prefs of all the Firefox you have to manage.
Component: Security: Process Sandboxing → DOM: Content Processes
The problem happens also when we put the content of our user.js in the pref.js 
The id of the last problem was bp-f9b11de9-8cca-465b-a215-cc5342160920

We have more than 2000 users, we try to use firefox instead of IE, but with this pb, its not easy.
See Also: → 1303764
Posted file assertion backtrace
With a debug build, I hit this assertion in the content process with the user.js in my profile and e10s enabled.  This means we haven't called nsContentUtils::Init yet.  Probably the ErrorLoadingBuiltinSheet is just the first release fatal assertion we hit when we fail to do a bunch of init stuff here.

nsContentUtils::Init is called from nsLayoutStatics::Initialize:

http://searchfox.org/mozilla-central/rev/dbbbafc979a8362a1c8e3e99dbea188fc832b1c5/layout/build/nsLayoutStatics.cpp#178

and that is called from nsLayoutModule.cpp's Initialize function.  That nsLayoutStatics::Initialize call comes after the xpcModuleCtor -> nsXPConnect::InitStatics call which we're under when we assert, though:

http://searchfox.org/mozilla-central/rev/dbbbafc979a8362a1c8e3e99dbea188fc832b1c5/layout/build/nsLayoutModule.cpp#426-431

I'm not sure what's different about the startup sequence when e10s is disabled that makes us not run into this.
I thought maybe it's expected that it's not possible to create nsIURIs at this point during startup given the current order that things are initialized in, and that we could lazily fill in mFileURIWhitelist when it's needed.  When I do that, I then run into the LoadSheetSync failure.
The nsIURI creation might be a red herring.  I can reproduce this problem just with this pref set:

network.protocol-handler.external.file = true
OK, the real problem is that we rely on reading some things from file: URIs during startup.  That's not compatible with file: being set to use an external protocol handler with network.protocol-handler.external.file=true.  For whatever reason, in the parent process (or in non-e10s mode) we look up the file: protocol handler quite early, and this is before we've loaded user prefs.  So we do return an nsIFileProtocolHandler as expected, and not an nsExternalProtocolHandler.  The nsIFileProtocolHandler gets cached with nsIOService::CacheProtocolHandler and then all subsequent file: protocol handler lookups use it, even after we load the user pref telling us to use the nsExternalProtocolHandler.  In content processes, the first file: URI look must be late enough that the pref is there and we return an nsExternalProtocolHandler.

So it seems that network.protocol-handler.external.file=true is just broken in non-e10s mode, but in content processes it happens to work, breaking our internal uses of file: URIs.

Patrick, is it OK to remove support for network.protocol-handler.external.file?
Flags: needinfo?(mcmanus)
Turns out both fixes are needed.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=92a4a1d73386&group_state=expanded
Assignee: nobody → cam
Status: NEW → ASSIGNED
Flags: needinfo?(mcmanus)
Attachment #8793625 - Flags: review?(mcmanus) → review?(jduell.mcbugs)
Comment on attachment 8793626 [details]
Bug 1300720 - Part 2: Lazily initialize nsScriptSecurityManager::mFileURIWhitelist.

https://reviewboard.mozilla.org/r/80338/#review79104

::: caps/nsScriptSecurityManager.h:129
(Diff revision 1)
>      CheckLoadURIFlags(nsIURI* aSourceURI, nsIURI* aTargetURI, nsIURI* aSourceBaseURI,
>                        nsIURI* aTargetBaseURI, uint32_t aFlags);
>  
> +    // Returns the file URI whitelist, initializing it if it has not been
> +    // initialized.
> +    const nsTArray<nsCOMPtr<nsIURI>>& GetFileURIWhitelist();

I'd call this EnsureFileURIWhitelist instead.
Comment on attachment 8793626 [details]
Bug 1300720 - Part 2: Lazily initialize nsScriptSecurityManager::mFileURIWhitelist.

https://reviewboard.mozilla.org/r/80338/#review79106
Attachment #8793626 - Flags: review?(bobbyholley) → review+
Duplicate of this bug: 1304996
Comment on attachment 8793625 [details]
Bug 1300720 - Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler.

Honza, can you take this? See especially comment 13.
Attachment #8793625 - Flags: review?(jduell.mcbugs) → review?(honzab.moz)
Mattig, why are you setting Firefox to handle file:// URLs externally?  I don't want to say its wrong.  I am finding use cases and vindication for it.  The fix is going to disable it, so it may have another bad impact for you.

In general tho, I am all for disabling external protocol support for static/local resources, definitely from chrome.  But I'm afraid the level this api works on disallow recognition of the load originator.  Hence, disable even for content.
Comment on attachment 8793625 [details]
Bug 1300720 - Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler.

https://reviewboard.mozilla.org/r/80336/#review81010

::: netwerk/base/nsIOService.cpp:508
(Diff revision 1)
>  }
>   
> +static bool
> +UsesExternalProtocolHandler(const char* aScheme)
> +{
> +    if (NS_LITERAL_CSTRING("file").Equals(aScheme)) {

yeah!

should we also add chrome and resource?
Attachment #8793625 - Flags: review?(honzab.moz)
Comment on attachment 8793625 [details]
Bug 1300720 - Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler.

https://reviewboard.mozilla.org/r/80336/#review81016
Attachment #8793625 - Flags: review+
(In reply to Honza Bambas (:mayhemer) from comment #21)
> In general tho, I am all for disabling external protocol support for
> static/local resources, definitely from chrome.  But I'm afraid the level
> this api works on disallow recognition of the load originator.  Hence,
> disable even for content.

Yeah, I guess it would be nice if we could distinguish between chrome and content uses.  But given this is all broken for file: URIs currently I think it's fine to properly disable it, and if we need a mechanism later to block file: URI access only from content we can look atit later.
Pushed by cmccormack@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8625d6dd0df7
Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler. r=honza
https://hg.mozilla.org/integration/mozilla-inbound/rev/101479389303
Part 2: Lazily initialize nsScriptSecurityManager::mFileURIWhitelist. r=bholley
(In reply to Honza Bambas (:mayhemer) from comment #21)
> Mattig, why are you setting Firefox to handle file:// URLs externally?  I
> don't want to say its wrong.  I am finding use cases and vindication for it.
> The fix is going to disable it, so it may have another bad impact for you.
> 
> In general tho, I am all for disabling external protocol support for
> static/local resources, definitely from chrome.  But I'm afraid the level
> this api works on disallow recognition of the load originator.  Hence,
> disable even for content.

In my case, we use this to be able to open file:// links pointing to a folder directly in Windows Explorer. We know it's disabled by default for security reasons, so we've set an exception rule only for our company Intranet. 
Our main tools are based on web applications, that's why it's very convenient for us to be able to do this.
Firefox has its own file explorer, but it's not as complete as Windows one (e.g drag & drop to an application).

If you disable this behavior, is there any way to achieve this?
(In reply to Alex from comment #28)
> If you disable this behavior, is there any way to achieve this?

No, the "network.protocol-handler.external.file" preference will no longer have any effect.  

But you could potentially implement an add-on (extension) that may redirect file:// URI handling to an external protocol, i.e. implement what the "network.protocol-handler.external.*" pref does as an add-on.  You can then allow that behavior only on the child process to make navigating to file:// uri be redirected to external protocol only from content (=web pages).

How exactly to write such an add-on is beyond this bug's scope.
Flags: needinfo?(matthieu)
Comment on attachment 8793625 [details]
Bug 1300720 - Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler.

https://reviewboard.mozilla.org/r/80336/#review81340

The review should be done by :mayhemer I believe.

Honza
Attachment #8793625 - Flags: review?(odvarko)
Sorry, guessed the wrong Bugzilla nick in my commit message.
https://hg.mozilla.org/mozilla-central/rev/8625d6dd0df7
https://hg.mozilla.org/mozilla-central/rev/101479389303
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Comment on attachment 8793625 [details]
Bug 1300720 - Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler.

We should uplift this, though I'm not sure how far we should uplift it.

Approval Request Comment
[Feature/regressing bug #]: e10s
[User impact if declined]: users with e10s enabled and certain prefs set will reliably crash content processes
[Describe test coverage new/current, TreeHerder]: no automated testing for the crash, but I did manually verify the fix worked; fix landed on mozilla-central yesterday
[Risks and why]: risk of fix itself causing problems is low IMO, since the patches are pretty self contained, and the effect of the patches is to move us off an uncommonly used path (file: URIs effectively disabled)
[String/UUID change made/needed]: -
Attachment #8793625 - Flags: approval-mozilla-beta?
Attachment #8793625 - Flags: approval-mozilla-aurora?
Comment on attachment 8793625 [details]
Bug 1300720 - Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler.

Taking it to aurora, will make the call for beta later.
Attachment #8793625 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Cameron, just part 1 or also part 2 for uplifts ?
Flags: needinfo?(cam)
Both parts, thanks Tomcat.
Flags: needinfo?(cam)
I would like this to stabilize on Nightly52 and Aurora51 before uplifting to Beta. This fix might make it in 50.0b6.
Comment on attachment 8793625 [details]
Bug 1300720 - Part 1: Prevent file:, chrome: and resource: URIs from using an external protocol handler.

I got a second opinion on this one from Overholt, he suggested I take it in b5. Beta50+
Attachment #8793625 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Reproduced the issue on Nightly 2016-09-06 using the STR in comment 6.
Verified fixed 50b11, 51.0a2 (2016-11-02), 52.0a1 (2016-11-02) Win 7.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.