If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Permaorange in ipc xpcshell tests on OS X when Gecko 55 merges to beta on 2017-06-12

RESOLVED FIXED in Firefox 55

Status

()

Core
Security: Process Sandboxing
RESOLVED FIXED
4 months ago
4 months ago

People

(Reporter: philor, Assigned: Alex_Gaynor)

Tracking

({regression})

55 Branch
mozilla55
regression
Points:
---

Firefox Tracking Flags

(firefox-esr45 unaffected, firefox-esr52 unaffected, firefox53 unaffected, firefox54 unaffected, firefox55blocking fixed)

Details

(Whiteboard: sb+)

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment)

(Reporter)

Description

4 months ago
Something that landed in the last week is going to result in massive failure in the unit_ipc xpcshell tests on OS X when we merge to beta next week, a la https://treeherder.mozilla.org/logviewer.html#?job_id=104748049&repo=try

[Tracking Requested - why for this release]: merge bustage, closed tree, delayed b1
I see a lot of odd errors, but it is hard to tell what is normal or not because regular XPCShell logs don't output anything.

Here's one that is actually causing a failure:
ReferenceError: can't access lexical declaration `redirectPath' before initialization at /builds/slave/test/build/tests/xpcshell/tests/dom/base/test/unit/test_bug553888.js:40

I'm not sure why we're hitting this, and only on OSX. There are a lot of JS language errors, but I can't imagine we'd change anything on OS X only.

Haik, is there something different about OS X sandboxing in XPCShell between Nightly and Beta? Maybe that is causing these issues? Or is there no sandboxing?
Flags: needinfo?(haftandilian)
this needs to be addressed before 55.0b1 so marking as blocker.
tracking-firefox55: ? → blocking
(In reply to Andrew McCreight [:mccr8] from comment #1)
> I see a lot of odd errors, but it is hard to tell what is normal or not
> because regular XPCShell logs don't output anything.
> 
> Here's one that is actually causing a failure:
> ReferenceError: can't access lexical declaration `redirectPath' before
> initialization at
> /builds/slave/test/build/tests/xpcshell/tests/dom/base/test/unit/
> test_bug553888.js:40
> 
> I'm not sure why we're hitting this, and only on OSX. There are a lot of JS
> language errors, but I can't imagine we'd change anything on OS X only.
> 
> Haik, is there something different about OS X sandboxing in XPCShell between
> Nightly and Beta? Maybe that is causing these issues? Or is there no
> sandboxing?

There should be no difference right now between Nightly and Beta. And my understanding is that xpcshell tests aren't sandboxed due to Bug 1358652 "xpcshell e10s tests aren't sandboxed".

We did land Bug 1363760 "Install specialpowers as a non-temporary addon in tests" last week, but that's not Mac-specific.
Flags: needinfo?(haftandilian)
Assignee: nobody → continuation
I was able to reproduce this locally using two patches from philor's try run, on OSX. I went through and compared a successful log to a failing log, because there is a lot of error spew even in a passing run.

The first divergent badness seems to be a failure in httpd.js:

20:11:35     INFO -  CHILD-TEST-STARTED
20:11:35     INFO -  PID 6497 | [Child 6498] WARNING: failed to bind socket: file /home/worker/workspace/build/src/netwerk/base/nsServerSocket.cpp, line 360
20:11:35     INFO -  PID 6497 | !!! could not start server on port -1: [Exception... "Component returned failure code: 0x804b000d (NS_ERROR_CONNECTION_REFUSED) [nsIServerSocket.init]"  nsresult: "0x804b000d (NS_ERROR_CONNECTION_REFUSED)"  location: "JS frame :: resource://testing-common/httpd.js :: _start :: line 550"  data: no]

We're trying to start an httpd.js in a content process, and getting refused.

Then I read over bug 1358652 comment 0, which Haik pointed out to me in comment 3, and thought, gee, that sounds a lot like what I'm seeing. Anyways, I poked around in the code a bit, and it turns out that bug 1358223 made it so that the content process sandbox level is hardcoded to 1 in non-Nightly builds.

This also applies to content processes started in XPCShell, and thus as jld predicted, these tests break. I locally confirmed that commenting out the chunk of code inside the "#if !defined(NIGHTLY_BUILD) && (defined(XP_WIN) || defined(XP_MACOSX))" block in GetEffectiveContentSandboxLevel() makes test_bug553888_wrap.js pass again.

Can you take a look at this, Alex? I'm not sure what the right way to fix this for XPCShell tests is...
Assignee: continuation → nobody
Blocks: 1358223
Component: IPC → Security: Process Sandboxing
Flags: needinfo?(agaynor)
Keywords: regression
It might also be worth looking at why these tests didn't break on Windows, too. I don't know if that's expected.
(Assignee)

Comment 6

4 months ago
Ouch. Sorry about this!

I don't know why this wouldn't have affected Windows, I'd have expected it to break Windows and macOS, but not Linux.

It looks like properly fixing up xpcshell tests to pass under the sandbox is a pretty large scale project. The options I see are either:

a) Set the |MOZ_DISABLE_CONTENT_SANDBOX| environment variable when running xpcshell tests. That should give you the same effective behavior as setting the pref to 0, and is probably the most correct way to do this. I don't know much about the xpcshell runner, so hopefully someone can chime in.
b) Backout/partially backout the patch on beta. (Partial backout could be acheived by commenting out line 19-21 of SandboxSettings.cpp)
Flags: needinfo?(agaynor)
Maybe buildCoreEnvironment in testing/xpcshell/runxpcshelltests.py is the right place to set an environment variable? Those look like pretty generic settings.
(Reporter)

Comment 8

4 months ago
Windows is broken, https://treeherder.mozilla.org/logviewer.html#?job_id=104763022&repo=try, but very lightly and only in crashreporter and underneath another separate bustage, so I hadn't yet spotted it as being the same thing.
(Assignee)

Comment 9

4 months ago
Oh, on further thinking: the thing that's failing is binding a socket, I'm not sure the Windows sandbox prohibits that. That'd explain why the bustage is macOS only.

buildCoreEnvironment sounds like a plausible location for adding the environment variable.
(Assignee)

Updated

4 months ago
Assignee: nobody → agaynor
Comment hidden (mozreview-request)
(Assignee)

Comment 11

4 months ago
Patch is small, so hopefully we can have this landed with plenty of time before the merge to beta.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa1c5078f498b58a4f31be1e06a896a2db352d6e

Comment 12

4 months ago
mozreview-review
Comment on attachment 8875404 [details]
Bug 1370438 - The MOZ_DISABLE_CONTENT_SANDBOX environment variable now works on macOS and is used in the xpcshell tests;

https://reviewboard.mozilla.org/r/146856/#review150922
Attachment #8875404 - Flags: review?(haftandilian) → review+
(Assignee)

Updated

4 months ago
Keywords: checkin-needed

Updated

4 months ago
Whiteboard: sb+

Comment 13

4 months ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/a5a453664f8f
The MOZ_DISABLE_CONTENT_SANDBOX environment variable now works on macOS and is used in the xpcshell tests; r=haik
Keywords: checkin-needed

Comment 14

4 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/a5a453664f8f
Status: NEW → RESOLVED
Last Resolved: 4 months ago
status-firefox55: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.