Closed Bug 1593240 Opened 4 years ago Closed 4 years ago

Crash in [@ InvalidArrayIndex_CRASH | mozilla::ExtensionPolicyService::CheckContentScripts] with NoScript 11.0.4

Categories

(WebExtensions :: General, defect, P3)

defect

Tracking

(firefox-esr68 wontfix, firefox70 wontfix, firefox71 wontfix, firefox72 wontfix, firefox75 wontfix, firefox76 wontfix, firefox77 fixed)

RESOLVED FIXED
mozilla77
Tracking Status
firefox-esr68 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix
firefox72 --- wontfix
firefox75 --- wontfix
firefox76 --- wontfix
firefox77 --- fixed

People

(Reporter: philipp, Assigned: rpl)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-4dd7f3aa-1893-4cd5-b003-43ea90191031.

Top 10 frames of crashing thread:

0 xul.dll InvalidArrayIndex_CRASH xpcom/ds/nsTArray.cpp:27
1 xul.dll void mozilla::ExtensionPolicyService::CheckContentScripts toolkit/components/extensions/ExtensionPolicyService.cpp
2 xul.dll void mozilla::ExtensionPolicyService::CheckDocument toolkit/components/extensions/ExtensionPolicyService.cpp:465
3 xul.dll nsresult mozilla::ExtensionPolicyService::Observe toolkit/components/extensions/ExtensionPolicyService.cpp:262
4 xul.dll nsObserverService::NotifyObservers xpcom/ds/nsObserverService.cpp:291
5 xul.dll nsContentSink::NotifyDocElementCreated dom/base/nsContentSink.cpp:1540
6 xul.dll nsDocElementCreatedNotificationRunner::Run dom/base/nsDocElementCreatedNotificationRunner.h:24
7 xul.dll nsContentUtils::RemoveScriptBlocker dom/base/nsContentUtils.cpp:5201
8 xul.dll mozilla::dom::Document::EndUpdate dom/base/Document.cpp:6973
9 xul.dll void nsHtml5AutoPauseUpdate::nsHtml5AutoPauseUpdate parser/html/nsHtml5AutoPauseUpdate.h:17

these crash reports are increasing in the past couple of days - most reports show that noscript installed, the spike is corresponding to the release of version 11.0.4 of the addon.

some comments indicate that the tab crash is quite easy to reproduce while browsing around on https://www.imdb.com/.

[Tracking Requested - why for this release]:
this easily reproduces for me in a new profile when noscript is installed by just visiting the imdb homepage.

the crashes happen on firefox 70 and 68esr, but no longer reproduce in firefox 71.0b - something in this range mitigated the issue:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=267f9b233386ffbb192ae5d506bb58462dbba4c8&tochange=cbb879b04f400e947ccce656e5bb485416e1dda4
https://bugzilla.mozilla.org/buglist.cgi?bug_id=1583932%2C1403051%2C1582726%2C1570567%2C1581512%2C1577643%2C&list_id=14970321

Please check whether it's fixed in https://github.com/hackademix/noscript/releases/tag/11.0.6rc3 (There are several probably related changes, and I cannot reproduce, at least).
I've just submitted 11.0.6 stable on AMO, anyway, waiting for review.
Thanks!

Yes thanks, NoScript 11.0.6 appears to fix the crash in my case.

I think that the policy.mContentScripts is being mutated during the iteration, which invalidates the ScriptArray iterator (after which it should not be used any more, but we do):

https://searchfox.org/mozilla-central/rev/e8e4a8641f3e6728ea42ae30905ce2145bfd5568/toolkit/components/extensions/ExtensionPolicyService.cpp#494,503-504

NoScript 11.0.4 uses contentScripts.register and some synchronous XHR (an insane amount of them, in a loop) (which blocks document_start), which supports my thought of mContentScripts is being mutated.

philipp, I'm not sure if your regression range is correct. I think that it is much more likely that the spike shows up due to the popularity of NoScript and the size of the population of release.

rob, this was meant more as a "fix"-range to explain why we were not seeing the problem in 71. but since new crash data is indicating that the noscript update successfully fixed this kind of crash, we don't have to investigate this any further.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED

Extensions should not be able to crash tabs, so I'm reopening this.

In comment 4 I pointed to a hazard in our code that could be responsible for this.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

(In reply to Rob Wu [:robwu] from comment #4)

NoScript 11.0.4 uses contentScripts.register and some synchronous XHR (an insane amount of them, in a loop) (which blocks document_start), which supports my thought of mContentScripts is being mutated.

NoScript doen't use contentScripts.register() anymore for DOM-based CSP injection (as it did previously) because

  1. it doesn't exist in Chromium
  2. There would be no way to take in account the tab and the frame hierarchy, and since I switched to a pseudo-sync messaging based approach to retrieve it it made more sense getting the whole package in one interaction.

However I'd be happy to switch back to contentScripts.register() (and remove the need for synchronous XHR, in Firefox at least) if there was a way for content scripts to know about their tab id and their ancestors since the beginning of their execution (without requiring asynchronous calls).

Priority: -- → P2
Priority: P2 → P3

ExtensionPolicyService::CheckContentScripts does retrieve mContentScripts a WebExtensionPolicy instance
and it may call the ExtensionProcessScript methods PreloadContentScript or LoadContentScript while iterating
over it mContentScript.

Both PreloadContentScript and LoadContentScript are going to run some privileged JS code, and LoadContentScript
will load an extension content script. There is a chance that some of the JS code executed could call
WebExtensionPolicy::UnregisterContentScript (or RegisterContentScript) and mutate the mContentScripts array
that EPS::CheckContentScripts is already iterating over, and when that happens it is possible that once the
execution goes back to the ongoing CheckContentScripts iteration, the iterator is invalidated and
the InvalidArrayIndex_CRASH triggered.

This patch applies the following changes to avoid that:

  • Introduce a new WebExtensionPolicy::ForEachContentScript, which takes a labda with signature
    void (RefPtr<WebExtensionContentScript>&)
    meant to be used to iterate over the registered content scripts without retrieving the mContentScripts
    array.
  • Change ExtensionPolicyService::CheckContentScripts to use policy->ForEachContentScript
    instead of policy->GetContentScripts, and collect the matching script to be loaded into a
    new temporary array instead of calling ProcessScript->LoadContentScript while still iterating
    over mContentScripts.

As an additional proposed change, in WebExtensionPolicy::RegisterContentScript/UnregisterContentScript
we could also check if a call to WebExtensionPolicy::ForEachContentScript is still in progress
and throw an error instead of mutating mContentScripts (in a follow up may be also worth to collect
some telemetry to double-check if that is actual ever happening and which is the incidence).

Assignee: nobody → lgreco
Status: REOPENED → ASSIGNED

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression
Attachment #9137691 - Attachment description: Bug 1593240 - Prevent WebExtensionPolicy mContentScripts to be changed while EPS CheckContentScripts is iterating over it. → Bug 1593240 - Prevent ExtensionPolicyWebExtensionPolicy mContentScripts to be changed while EPS CheckContentScripts is iterating over it.
Attachment #9137691 - Attachment description: Bug 1593240 - Prevent ExtensionPolicyWebExtensionPolicy mContentScripts to be changed while EPS CheckContentScripts is iterating over it. → Bug 1593240 - Prevent re-entrancy issues in ExtensionPolicyService::CheckContentScripts related to EPS.loadContentScript calls

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:rpl, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(lgreco)

I did push this patch on talos, and I also did run some tests locally with a couple of test extensions which were registering a considerable number of content scripts and then collected some profile, to see if I was able to get any measurable difference with and without the patch, but didn't got anything noticeable.

I discussed with Shane on Friday and we agreed to land this today.

Flags: needinfo?(lgreco)
Pushed by luca.greco@alcacoop.it:
https://hg.mozilla.org/integration/autoland/rev/248a9758cc2f
Prevent re-entrancy issues in ExtensionPolicyService::CheckContentScripts related to EPS.loadContentScript calls r=robwu,mixedpuppy
Status: ASSIGNED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla77
You need to log in before you can comment on or make changes to this bug.