Closed Bug 1191145 Opened 9 years ago Closed 8 years ago

Stop blocking scripts when handling IPC messages

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla44

Tracking Flags:

Tracking

Status

firefox42

---

affected

firefox44

---

fixed

firefox47

---

fixed

People

(Reporter: billm, Assigned: billm)

References

Details

Attachments

(1 file)

patch 9 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) 6.26 KB, patch	dvander : review+	Details \| Diff \| Splinter Review

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Description

•

9 years ago

Attached patch patch — Details — Splinter Review

We added the script blocking to handle the case where a CPOW caused the child process to run a nested event loop somehow. That would cause a deadlock. Now we handle this case by canceling the CPOW, so I think there's no reason to have a script blocker (which often was ineffective anyway).

I think we should remove this code since it's pretty flaky. We sometimes run scripts even if there's a script blocker on the stack. We also print all sorts of NS_ASSERTIONs about this during tests, which everyone just ignores.

Attachment #8643390 - Flags: review?(dvander)

Benjamin Smedberg

Comment 1

•

9 years ago

I really don't like this. We're trading off some amount of addon compatibility with the certainty that at some point web script will enter a nested loop and the addon will experience inconsistent behavior and unexpected exceptions.

I'm also really scared about "we print all sorts of NS_ASSERTIONs about this during tests". I thought this was going to be a fatal error, and if we're experiencing this in our own automated tests without addons, there's clearly something seriously wrong that we should fix instead of ignoring-via-thrown-exceptions.

Benjamin Smedberg

Comment 2

•

9 years ago

Do we also cancel CPOWs when content script attempts to enter plugin code? That is equally dangerous in terms of deadlocks or other out-of-sync behavior.

Milan Sreckovic [:milan] (needinfo for best results)

Comment 3

•

9 years ago

Bill, and Benjamin, did you reach a conclusion on this topic?  I'm asking because David is the reviewer, and the code that implements the above described intent is trivially correct, but I'm worried about whether we agree about the intent itself.

Flags: needinfo?(wmccloskey)

Flags: needinfo?(benjamin)

Benjamin Smedberg

Comment 4

•

9 years ago

No, we reached no conclusion. I still oppose this change in general, but we haven't talked about it again.

Flags: needinfo?(benjamin)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 5

•

9 years ago

Sorry this too me so long. I've been busy and also avoiding this.

Here's the situation we're in now (without this patch):

When we dispatch a CPOW, we put a script blocker on the stack. If we try to run content JS during this time, three things can happen:
1. We delay running the JS code until after the CPOW has been processed. This is the ideal case.
2. We never run the JS code.
3. We run the JS code even though we aren't supposed to and we print a warning.

Additionally, we used to crash if you entered content JS code during a CPOW, but that made Firefox completely unusable with some add-ons installed, so I backed it out a long time ago (bug 950745). The main issue was add-ons that call dispatchEvent on some content DOM node.

There have never been any NPAPI checks. I asked you how to implement them in bug bug 950745 and you didn't respond. I'm not sure what they would be checking for. Plugins don't do synchronous communication with the chrome process, so they would have to call back into the content process to trigger problems, and I assume that wouldn't be any different than content JS at that point.

In any of the cases above where we do run content JS during a CPOW and it triggers a nested event loop, we cancel the CPOW.

In my opinion, this patch makes our behavior more consistent. Rather than randomly block some JS and not other JS, we allow it to run and rely on cancellation to avoid deadlock.

It sounds like your objection is to the idea of cancellation, since it means the add-on will sometimes work and not other times (like on pages that do sync XHRs). I think we just have a fundamental disagreement about what CPOWs are for. They're there because I think most users would rather have a sorta working add-on than no add-on. We're being pretty up front about the fact that CPOWs are terrible. Very few add-on developers that I know of are using them intentionally.

(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> I'm also really scared about "we print all sorts of NS_ASSERTIONs about this
> during tests". I thought this was going to be a fatal error, and if we're
> experiencing this in our own automated tests without addons, there's clearly
> something seriously wrong that we should fix instead of
> ignoring-via-thrown-exceptions.

The assertions happen because we're running content JS when there is a script blocker on the stack (case 3 above). That works fine as long as there's no nested event loop involved.

Again, this is about getting something to work. I would rather have these tests in the state they're in than not have them. Eventually, we'll rewrite them using BrowserTestUtils or whatever, but until then, they serve a purpose.

This patch will allow us to get more tests working since it avoids case 2 above, which usually breaks tests.

Flags: needinfo?(wmccloskey)

David Anderson [:dvander] - inactive, e-mail if emergency

Updated

•

9 years ago

Attachment #8643390 - Flags: review?(dvander) → review+

Pulsebot

Comment 6

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/344cb85a558f

Carsten Book [:Tomcat]

Comment 7

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/344cb85a558f

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox44: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla44

Pulsebot

Comment 8

•

9 years ago

Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f913b64862d8

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 9

•

9 years ago

Backed out to see if it's related to a topcrash.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Pulsebot

Comment 10

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/b237f1bffcfd

Carsten Book [:Tomcat]

Comment 11

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/b237f1bffcfd

Status: REOPENED → RESOLVED

Closed: 9 years ago → 8 years ago

status-firefox47: --- → fixed

Resolution: --- → FIXED

Daniel Holbert [:dholbert]

Updated

•

8 years ago

Blocks: 1239767

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Stop blocking scripts when handling IPC messages

Categories

(Core :: IPC, defect)

Tracking

()

People

(Reporter: billm, Assigned: billm)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Updated

Attachment

General

Description

File Name

Content Type