Closed Bug 1256992 Opened 9 years ago Closed 9 years ago

crash in mozilla::SandboxBroker::SandboxBroker

Categories

(Core :: Security: Process Sandboxing, defect)

x86
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox46 + fixed
firefox47 + fixed
firefox48 + fixed

People

(Reporter: MatsPalmgren_bugz, Assigned: bobowen)

References

Details

(Keywords: crash, Whiteboard: [sbwc1])

Crash Data

Attachments

(4 files, 1 obsolete file)

This bug was filed from the Socorro interface and is report bp-346eb15b-b6b1-43cf-a296-339242160312. ============================================================= This signature has relatively high volume in Firefox 46b1 crash data - it's currently at #35 in the 46b1 Top Crashers List. The number of reported crashes is about an order of magnitude higher than in v45 which seems worrying. Product Version Count Percentage Firefox 46.0b1 152 62.3% Firefox 44.0.2 34 13.9% Firefox 45.0b99 11 4.5% Firefox 45.0 10 4.1% Firefox 43.0b1 3 1.2% Firefox 38.7.0esr 2 0.8% [... long tail ...] Operating System Count Percentage Windows 7 129 86.0% Windows 8.1 10 6.7% Windows 10 7 4.7% Windows 8 2 1.3% Windows XP 2 1.3% Architecture Count Percentage x86 150 100.0% http://hg.mozilla.org/releases/mozilla-beta/annotate/fb3494d06dfb/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp#l34 Stack: mozilla::SandboxBroker::SandboxBroker() mozilla::ipc::GeckoChildProcessHost::GeckoChildProcessHost(GeckoProcessType, base::ChildPrivileges) mozilla::gmp::GMPProcessParent::GMPProcessParent(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) mozilla::gmp::GMPParent::LoadProcess() mozilla::gmp::GMPParent::EnsureProcessLoaded() mozilla::gmp::GMPParent::GetGMPContentParent(mozilla::UniquePtr<mozilla::gmp::GetGMPContentParentCallback, mozilla::DefaultDelete<mozilla::gmp::GetGMPContentParentCallback> >&&) mozilla::gmp::GeckoMediaPluginServiceParent::GetContentParentFrom(nsACString_internal const&, nsCString const&, nsTArray<nsCString> const&, mozilla::UniquePtr<mozilla::gmp::GetGMPContentParentCallback, mozilla::DefaultDelete<mozilla::gmp::GetGMPContentParentCallback> >&&) mozilla::gmp::GeckoMediaPluginService::GetGMPVideoDecoder(nsTArray<nsCString>*, nsACString_internal const&, mozilla::UniquePtr<GMPVideoGetterCallback<GMPVideoDecoderProxy>, mozilla::DefaultDelete<GMPVideoGetterCallback<GMPVideoDecoderProxy> > >&&) mozilla::GMPVideoDecoder::Init() mozilla::MediaDataDecoderProxy::InternalInit() mozilla::detail::ProxyRunnable<mozilla::MozPromise<mozilla::TrackInfo::TrackType, mozilla::MediaDataDecoder::DecoderFailureReason, 1>, mozilla::MediaDataDecoderProxy>::Run() nss3.dll@0x28bff NS_ProcessNextEvent(nsIThread*, bool) mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) MessageLoop::RunHandler()
Whiteboard: [sb?]
Flags: needinfo?(bobowen.code)
Assuming that this is some sort of threading issue this should fix it. If not, then this will prevent it from crashing while we're not sandboxing content. I didn't bother checking the env var or sandbox level, so as to not complicate things. I put the initialisation with the existing bits just so they are not scattered around. As I've mentioned before I think SandboxBroker should acutally live in xul and then we could put the other initialisation into it as well. Presumably this would make adding it into xpcshell easier. This could be a cross OS class I think, with per OS implementation. Linux already has one that lives in xul.
Attachment #8731668 - Flags: review?(aklotz)
Assignee: nobody → bobowen.code
Status: NEW → ASSIGNED
Flags: needinfo?(bobowen.code)
Whiteboard: [sb?] → [sbwc1]
This is looking more promising: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d2e9713e18e0 Not sure about the placement of the Initialize calls and if someone else needs to review as well. See comment 1 for other notes.
Attachment #8731704 - Flags: review?(aklotz)
Attachment #8731668 - Attachment is obsolete: true
Attachment #8731668 - Flags: review?(aklotz)
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Review of attachment 8731704 [details] [diff] [review]: ----------------------------------------------------------------- OK, but conditional on asking an xpcshell peer to review the XPCShellImpl.cpp part.
Attachment #8731704 - Flags: review?(aklotz) → review+
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Thanks Aaron. Bobby - would you be able to OK this change to XPCShellImpl.cpp?
Attachment #8731704 - Flags: review?(bobbyholley)
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Review of attachment 8731704 [details] [diff] [review]: ----------------------------------------------------------------- rs=me on initializing the sandbox.
Attachment #8731704 - Flags: review?(bobbyholley) → review+
Flags: needinfo?(bobowen.code)
OK, I've had to add in the initialisation to the gtest code in nsAppRunner.cpp as well. Try run with this and running all tests (I think): https://treeherder.mozilla.org/#/jobs?repo=try&revision=e54e320fb042 I've also made it set the mPolicy to nullptr if there is no sBrokerService, as on inbound it was crashing in SetSecurityLevelForGMPlugin for the gtests because mPolicy was uninitialised and not null.
Flags: needinfo?(bobowen.code)
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Sheriff note: patch that landed was slightly different. Beta doesn't quite apply cleanly. Minor context change, I'll upload a patch. Might want to wait until this has spent a couple of days on Nightly, but requesting now as I'm off on PTO. Approval Request Comment [Feature/regressing bug #]: Possible threading issue has existed since chromium sandbox code landed. [User impact if declined]: Crash if issue occurs. More instances seemed to be occurring on Beta, possibly due to e10s testing. [Describe test coverage new/current, TreeHerder]: GMP, NPAPI 64-bit and e10s test the various launching of sandboxed processes. The crash can occur even without the sandbox being enabled. [Risks and why]: Low to medium: changes are across a few files, but all are fairly simple. [String/UUID change made/needed]: None
Attachment #8731704 - Flags: approval-mozilla-beta?
Attachment #8731704 - Flags: approval-mozilla-aurora?
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Looks like this patch is for aurora. This may help with crashes!
Attachment #8731704 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8734153 [details] [diff] [review] Beta patch - Initialize Windows sandbox BrokerServices before any child processes are created [Triage Comment] Might help with many sorts of crashes. Let's try it on beta, this may land for beta 5 if all goes well, if not, then beta 6.
Attachment #8734153 - Flags: approval-mozilla-beta+
Depends on: 1260115
This may have caused bug 1260115 on aurora and possibly beta. We should back it out and potentially respin any beta build that includes it.
Flags: needinfo?(lhenry)
Wes can you back this out of aurora and beta? Thanks
Flags: needinfo?(lhenry) → needinfo?(wkocher)
Attachment #8734153 - Flags: approval-mozilla-beta+ → approval-mozilla-beta?
Switching flags to affected for 46 and 47 since we backed out the patch on those branches. Bobby can you take a look or suggest someone who might be able to help here?
Flags: needinfo?(bobbyholley)
I just rubber-stamped a small change to XPConnect here - aklotz reviewed the meat of the change.
Flags: needinfo?(bobbyholley)
bowen will be back this week.
OK, we can wait for bowen, thanks
Flags: needinfo?(aklotz)
OK, so to prevent these crashes (bug 1261261 and bug 1260115) we'd need to add null checks into these other methods. What is slightly more worrying is that for these problems to occur either Initialize isn't being called soon enough or it's failing. I'll see if I can reproduce with the patch reapplied locally for Aurora, given that it became the top crash.
We could still take a patch for beta 9 later this week. This is the #35 topcrash in 46.0b5. For now I'm taking off the approval request for beta, but still tracking.
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Looks like we don't have a viable patch for beta yet.
Attachment #8731704 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #8734153 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
I'm going to re-open this and land extra changes here as it will make it easier to uplift to Fx47, if we decide to do that. One thing I noticed, all of the crash reports that I've looked at have a very large number of threads. The initialisation for the broker does start a new thread, so maybe there is some problem with that rather than two Init()s happening at the same time. If the large number of threads is causing an issue, maybe it would be manifesting itself in other ways as well.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Clearing this as it was backed out.
Attachment #8731704 - Flags: approval-mozilla-aurora+
MozReview-Commit-ID: Fu05wLn27UG This moves the initialization into XRE_mainInit, which is probably a better place and works for gtest as well. Hopefully, the problems when I uplifted to Aurora were because I wasn't initialising early enough. If not, I've added Telemetry to log the result of SandboxBroker::Initialize. So we should know if it's failing in some circumstances. Also added null checks, to prevent the crashes I caused if the Initialize isn't called/fails. As before this is only the case if content sandboxing isn't compiled in, if it is we crash. Although we haven't seen that on Nightly as far as I can see. Do you know who (if) I need to ask for review to get the Telemetry added? Try push with this patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d69d83e42498e715664a2fc880957786f9c3abd2
Attachment #8738249 - Flags: review?(aklotz)
Attachment #8738249 - Flags: review?(aklotz) → review+
(In reply to Bob Owen (:bobowen) from comment #34) > Do you know who (if) I need to ask for review to get the Telemetry added? Until somebody says otherwise, my r+ should suffice on that given my previous role on the perf team.
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Early days, but as there are no reports of SandboxBroker::Initialize failing in the telemetry I think we can try this on Aurora again now. This time there will be the second patch as well. Approval Request Comment [Feature/regressing bug #]: Possible threading issue has existed since chromium sandbox code landed. It is also possible that another underlying problem with large numbers of threads is causing this to fail. Bug 1262964 has been filed on this. [User impact if declined]: Crash if issue occurs. More instances seemed to be occurring on Beta. [Describe test coverage new/current, TreeHerder]: GMP, NPAPI 64-bit and e10s test the various launching of sandboxed processes. The crash can occur even without the sandbox being enabled. [Risks and why]: Low to medium: changes are across a few files, but all are fairly simple. This also caused a top crash on Aurora last time, but those issues should have been addressed. [String/UUID change made/needed]: None
Attachment #8731704 - Flags: approval-mozilla-aurora?
Comment on attachment 8738249 [details] [diff] [review] Part 2: Move SandboxBroker Initialization earlier and add telemetry and extra null checks See comment 38 for main questions. Change to file toolkit/components/telemetry/Histograms.json doesn't quite apply cleanly, but only because other histograms at the end of the file aren't in Aurora.
Attachment #8738249 - Flags: approval-mozilla-aurora?
Comment on attachment 8731704 [details] [diff] [review] Initialize Windows sandbox BrokerServices before any child processes are created Taking this in Aurora with the hopes that if all goes well, we can uplift to Beta46.
Attachment #8731704 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8738249 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
has problems to apply to aurora: grafting 338383:65d070edff8b "Bug 1256992 Part 2: Move SandboxBroker Initialization earlier and add telemetry and extra null checks. r=aklotz" merging security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp merging toolkit/components/telemetry/Histograms.json merging toolkit/xre/nsAppRunner.cpp warning: conflicts while merging toolkit/components/telemetry/Histograms.json! (edit, then use 'hg resolve --mark') abort: unresolved conflicts, can't continue (use hg resolve and hg graft --continue)
Flags: needinfo?(bobowen.code)
backed out for possible problems in my conflict resolution, Bob can you take a look, thanks!
Might this still be good to uplift to beta? Are the crashes mostly e10s related, or not?
Flags: needinfo?(bobowen.code)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #45) > Might this still be good to uplift to beta? Are the crashes mostly e10s > related, or not? Yes I think so, just wanted to make sure I didn't cause a top-crash on Aurora like last time. :-) The crashes are actually generally non-e10s, because if you have e10s enabled then the broker services are getting initialised early on anyway.
Flags: needinfo?(bobowen.code)
Comment on attachment 8741397 [details] [diff] [review] Beta patch for Part 2: Move SandboxBroker Initialization earlier and add telemetry and extra null checks [Triage Comment] Please uplift to beta, we'll try this out in beta 11.
Attachment #8741397 - Flags: approval-mozilla-beta+
Attachment #8734153 - Flags: approval-mozilla-beta- → approval-mozilla-beta+
Part 1 doesn't apply cleanly to beta, is it needed or should I just land part 2?
Flags: needinfo?(bobowen.code)
(In reply to Wes Kocher (:KWierso) from comment #49) > Part 1 doesn't apply cleanly to beta, is it needed or should I just land > part 2? It needs both, they apply for me I can push if you like: https://bugzilla.mozilla.org/attachment.cgi?id=8734153 https://bugzilla.mozilla.org/attachment.cgi?id=8741397
Flags: needinfo?(bobowen.code)
(In reply to Bob Owen (:bobowen) from comment #51) > (In reply to Wes Kocher (:KWierso) from comment #49) > > Part 1 doesn't apply cleanly to beta, is it needed or should I just land > > part 2? > > It needs both, they apply for me I can push if you like: > https://bugzilla.mozilla.org/attachment.cgi?id=8734153 > https://bugzilla.mozilla.org/attachment.cgi?id=8741397 Yeah, helps if I actually pay attention and uplift the correct pair of patches... :)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: