Closed Bug 817342 Opened 8 years ago Closed 8 years ago
Crash [@ js::GCMarker::process
Mark Stack Top(js::Slice Budget&) ] with Redirector Extension
As a result of bug 790338, there is a new crash that can be exposed when the Print Edit and Redirector extensions are present and Print Edit is invoked. bp-da0b6941-aaee-4977-aa90-fe7372121201 Regression window: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=7e6ac3bfbae9&tochange=26172ff887ae STR 1. Launch Firefox with a new profile 2. Install the following extensions and restart Firefox: a) Print Edit: https://addons.mozilla.org/en-US/firefox/addon/print-edit/ b) Redirector: https://addons.mozilla.org/en-US/firefox/addon/redirector/ 3. Click the home button to load the Firefox default home page 4. Right-click the page and choose Print... --> Print Edit 5. Wait a few seconds Crashing Thread Frame Module Signature Source 0 mozjs.dll js::GCMarker::processMarkStackTop js/src/gc/Marking.cpp:1348 1 mozjs.dll js::GCMarker::drainMarkStack js/src/gc/Marking.cpp:1407 2 mozjs.dll MarkGrayReferences js/src/jsgc.cpp:2743 3 mozjs.dll EndMarkingCompartmentGroup js/src/jsgc.cpp:3214 4 mozjs.dll SweepPhase js/src/jsgc.cpp:3441 5 mozjs.dll IncrementalCollectSlice js/src/jsgc.cpp:3879 6 mozjs.dll GCCycle js/src/jsgc.cpp:4006 7 mozjs.dll Collect js/src/jsgc.cpp:4121 8 mozjs.dll js::GCFinalSlice js/src/jsgc.cpp:4166 9 xul.dll CCTimerFired dom/base/nsJSEnvironment.cpp:3256 10 xul.dll nsTimerEvent::Run xpcom/threads/nsTimerImpl.cpp:565 11 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:627 12 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:82 13 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:208 14 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:182 15 xul.dll nsBaseAppShell::Run widget/xpwidgets/nsBaseAppShell.cpp:163 16 xul.dll nsAppShell::Run widget/windows/nsAppShell.cpp:229 17 xul.dll nsAppStartup::Run toolkit/components/startup/nsAppStartup.cpp:290 18 xul.dll XREMain::XRE_mainRun toolkit/xre/nsAppRunner.cpp:3823 19 xul.dll XREMain::XRE_main toolkit/xre/nsAppRunner.cpp:3890 20 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:4084 21 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:105 22 firefox.exe __tmainCRTStartup crtexe.c:552 23 kernel32.dll BaseThreadInitThunk 24 ntdll.dll __RtlUserThreadStart 25 ntdll.dll _RtlUserThreadStart
Excellent! I'm able to reproduce on Windows. Setting OS to "Windows 7" even though it also happens on XP.
OS: All → Windows 7
I could have sworn I tried this earlier in a debug build, but I guess I forgot. Anyway, this trips a compartment assertion in a Linux debug build for me. It seems to have something to do with the document's JS wrapper not having the same global as the window. It also nearly trips this assertion in nsDocument::SetScriptGlobalObject: NS_ASSERTION(JS_GetGlobalForObject(cx, obj) == newScope, "Wrong scope, this is really bad!"); However, the call to JS_GetGlobalForObject dies first because cx->compartment != obj->compartment(). Could a DOM person please look into this?
Here's a stack trace of the assertion. It might have something to do with printing.
So in this case we have: 1) The document has a wrapper. 2) The JSContext of the new global is not same-compartment with that wrapper. 3) In either case, the global of the document's wrapper is not the new global. Presumably the context compartment either matches the new global or matches whatever script is running on that context...
This could potentially be due to bugs in the addons themselves, maybe moving objects between compartments when they shouldn't.
Whiteboard: [js:p1] → [js:p1] could be bug(s) in the addons?
(In reply to Daniel Veditz [:dveditz] from comment #5) > This could potentially be due to bugs in the addons themselves, maybe moving > objects between compartments when they shouldn't. These addons don't have any binary components. And stuff written in JS is supposed to be safe against compartment mismatches. So I think it's our bug.
Whiteboard: [js:p1] could be bug(s) in the addons? → [js:p1]
If nobody else is looking at this, I can try to take a look.
Go for it. It would be great to get this fixed.
In addition to Print Edit, doing Print Preview or even just Print seem to cause the same problem. The Print Edit addon doesn't even seem to be needed. With just Redirector enabled, using the standard "print" command on the default homepage causes the same problem. Disabling the addon causes the problem to go away.
Also, the STR for the crash indicates that it takes a few seconds (suggesting that it needs to wait for a GC to run), but the assertion happens immediately. It might be useful to see if this trips the assert before the above regression range where the crash starts happening.
(In reply to Andrew McCreight [:mccr8] from comment #10) > Also, the STR for the crash indicates that it takes a few seconds > (suggesting that it needs to wait for a GC to run), but the assertion > happens immediately. It might be useful to see if this trips the assert > before the above regression range where the crash starts happening. I'm reasonably certain that this has been a problem forever. If that's not true, please assign it back to me.
Olli, could you take a look at this? It looks like it is a problem in printing code, and you are more familiar with it than me.
Summary: Crash [@ js::GCMarker::processMarkStackTop(js::SliceBudget&) ] with Print Edit and Redirector Extensions → Crash [@ js::GCMarker::processMarkStackTop(js::SliceBudget&) ] with Redirector Extension
Will look at this next week. Sorry about the delay.
Ghostery also seems to cause the same problem.
Yup, bug 825380 is filed on that (but is probably a dupe of this).
So we end up running script runners for <link> for example and content policy touches document which doesn't have script global object yet. Printing is pretty unique here, because of its cloning. Patch coming.
Make sure we don't run things which might cause content policy checks etc before the documents have their script globals set. It would be nice to have script blocker for longer time, but unfortunately printing related dialogs require scripts to run. Though, it is only chrome scripts which can access documents, so running scripts isn't that bad.
Attachment #700022 - Flags: review?(roc)
Attachment #700022 - Flags: review?(roc) → review+
Comment on attachment 700022 [details] [diff] [review] patch [Security approval request comment] How easily could an exploit be constructed based on the patch? Not easy, and as far as I see it requires some chrome js (like addons). Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Comment will be about delaying script runners in order to run them all at the same time. Kind of vague. Nothing about compartments though. Which older supported branches are affected by this flaw? All Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? The same patch should applies everywhere How likely is this patch to cause regressions; how much testing does it need? Changes to printing are always a bit regression risky since we don't have good tests (because for many things we just don't have the testing framework which could handle such tests).
Attachment #700022 - Flags: sec-approval?
Comment on attachment 700022 [details] [diff] [review] patch Per https://wiki.mozilla.org/Security/Bug_Approval_Process this shouldn't need approval. I'll land tomorrow.
Attachment #700022 - Flags: review+
Comment on attachment 700022 [details] [diff] [review] patch Olli cleared the wrong flag. Carrying forward roc's r+.
Attachment #700022 - Flags: sec-approval? → review+
(In reply to Olli Pettay [:smaug] from comment #21) > Comment on attachment 700022 [details] [diff] [review] > patch > > Per https://wiki.mozilla.org/Security/Bug_Approval_Process this shouldn't > need approval. That's right.
https://crash-stats.mozilla.com/report/list?signature=js%3A%3AGCMarker%3A%3AprocessMarkStackTop%28js%3A%3ASliceBudget%26%29 is a top crash across all our somewhat recent versions, what amount of those crashes do you think this patch should fix? If a major part, we should consider to uplift this to as many versions as possible.
It is difficult to say, as that is a generic stack we hit any time there is any kind of GC-related problem. This patch only affects people trying to do print-preview with certain addons (Redirector, Ghostery, probably others). In Nightly, my theory is that these errors are bucketed somewhere else due to additional runtime checks (js::CompartmentChecker::fail), so if we look at the Nightly 20 numbers, we see that 1.37% of crashes have the compartment checker assertion, and 1.14% have the processMarkStackTop stack. If we look at Aurora 19, we see that 3.19% of crashes are processMarkStackTop, which happens to be in the ballpark of 1.37+1.14. Thus, very optimistically, we could say we might get rid of half of them. On the other hand, I would guess these kinds of addons are much rarer beyond Nightly and Aurora, and thus this patch may have little overall effect. It may also be the case that this problem doesn't result in crashes all the time without the assertion, so we are overestimating the % that seem print preview related, and thus landing this fix will not help nearly so much. Another way to get an estimate would be to see what % of processMarkStackTop crashes have Ghostery or this Redirector thing, compared to the over all number. If the corrrelation is high, that suggests a decent number could also be print preview related, and thus fixed by this bug.
(In reply to Robert Kaiser (:email@example.com) from comment #25) > If a major part, we should consider to uplift this to as many versions as possible. This bug is a regression in Firefox 20 (see comment 0 and tracking flags) so it should be uplifted only in Aurora.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
I think we could uplift the patch, but I'd like to wait for few days to get at least tiny amount testing.
Nominating for tracking as a speculative partial fix for a top crash.
(In reply to Scoobidiver from comment #27) > (In reply to Robert Kaiser (:firstname.lastname@example.org) from comment #25) > > If a major part, we should consider to uplift this to as many versions as possible. > This bug is a regression in Firefox 20 (see comment 0 and tracking flags) so > it should be uplifted only in Aurora. That's not exactly true. This particular test case only reproduces in FF20. However, as far as I know the bug has been present since long before that. It just didn't reproduce in this circumstance until bug 790338. I think we should uplift everywhere we can, although it's up to Olli to evaluate the risk of that.
This needs some testing. Manual testing I'm afraid. Unfortunately printing is so rarely used feature nowadays that regression bugs may be filed months after fixing the bug.
(In reply to Olli Pettay [:smaug] from comment #32) > This needs some testing. Manual testing I'm afraid. > Unfortunately printing is so rarely used feature nowadays that regression > bugs may be filed > months after fixing the bug. Adding qawanted to fulfil this request. I've talked with Olli via IRC about this and we agreed to getting some broad, generic print testing using the Softvision team. I'll coordinate this testing and try to report back any results in the next week.
QA Contact: anthony.s.hughes
The print testing has been completed and is currently stored on a protected Google Doc. Olli, I'm working on getting the doc shared with you so you can view the raw results. If you have no questions or concerns with the results I will try to post a distilled version in this and related bugs.
So I didn't see any regressions based on that doc. Anthony, could you verify (once you're back).
Comment on attachment 700022 [details] [diff] [review] patch [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 487667 and compartments User impact if declined: Crashes Testing completed (on m-c, etc.): Landed m-c 2013-01-10 and see comment 33 and comment 35 Risk to taking this patch (and alternatives if risky): Tiny bit risky, since script runners end up running later String or UUID changes made by this patch: NA
Attachment #700022 - Flags: approval-mozilla-aurora?
Release mode assertions that this patch fixes are something like 3% of all crashes on Aurora. Every time somebody with one of a wide variety of addons tries to do print preview, they hit this assertion. The assertion is disabled in beta and later, but at least some of these will turn into real crashes.
Comment on attachment 700022 [details] [diff] [review] patch Willing to take the risk on Aurora since this might help with some topcrashers and qa should continue to verify as well as watch for regressions since we'll have time to backout if the payoff isn't what we hoped here.
Attachment #700022 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Olli Pettay [:smaug] from comment #37) > So I didn't see any regressions based on that doc. Anthony, could you verify > (once you're back). I confirm that the testing found no regressions.
You need to log in before you can comment on or make changes to this bug.