Closed Bug 1194533 (e10s-tests-osx) Opened 4 years ago Closed 3 years ago

Run e10s tests on OSX

Categories

(Firefox :: General, defect)

Tracking

()

RESOLVED FIXED
Firefox 49
Tracking Status
e10s + ---
firefox46 --- wontfix
firefox47 --- wontfix
firefox48 --- fixed
firefox49 --- fixed

People

(Reporter: mrbkap, Assigned: RyanVM)

References

(Depends on 3 open bugs, Blocks 1 open bug)

Details

Attachments

(1 file)

As we're getting closer to having e10s ride the trains, having good automated test coverage is going to get more and more important. We currently only run mochitests with e10s on Linux and I'd like to extend that to OSX and Windows.

I understand that there's a cost in machine time to this, so we might want to figure out a throttling strategy to not cause too much drag on treeherder. What needs to be done to make this happen?

I have a try run at https://treeherder.mozilla.org/#/jobs?repo=try&revision=1cad3dcad6ff
Flags: needinfo?(ryanvm)
The main issue I see here is capacity. OSX 10.10 is our smallest slave pool by a large margin (the next-smallest pool - OSX 10.6 - has ~50% more capacity). RelOps has new OSX hardware in the budget for this year, but it's difficult to say how long it will take for it to be ordered, delivered, installed, validated for production duty, etc.

Joel Maher and Kim Moir have done a lot of work on SETA (Search for Extraneous Test Automation) to reduce the amount of test load by leveraging historical data to force-coalesce certain test suites. However, we need said data before it can be used to reduce load :). Already we see OSX 10.10 test backlog on a daily basis under typical load conditions. And we've already had to resort to ghastly hacks like not running 10.10 tests by default on Try to even have things be as good as they currently are.

Windows isn't in much better shape, though it isn't quite as dire as 10.10. We have a small ray of hope of increasing the Windows test slave capacity in the relatively near future by pushing Windows builds into the cloud and repurposing those machines as test slaves instead. However, I don't know what the current status of that work is.

I'm sorry to be a stick in the mud about this. I can fully appreciate why you want to see this done, but I'm afraid that adding another 10 jobs per platform (assuming opt+debug) per push is going to cause tremendous pain from a test backlog standpoint.

I've added a few RelEng/RelOps people to the CC list in case they want to chime in as well.
No longer depends on: 1194550
Flags: needinfo?(ryanvm)
Depends on: 1194550
FYI the order for the new OS X machines has been placed, but it will likely be at least a month before we see any capacity increase there, and that will be gradual as we perform rolling replacements of the existing legacy hardware.

The Windows build machines cannot be repurposed as test machines (they run on very different, very old hardware that can't even support our needs for tests), so there won't be any increase in capacity coming from that quarter.
We can crank up SETA again now that we have more tools in place.  How many new OSX machines did we order?  We keep adding jobs, I think we should be targeting 50% capacity growth.  With that said, do we have a pending order to increase our windows capacity?
Based on growth numbers from the past 4 years, we've ordered 200 minis (roughly double what we have now for 10.10), but that capacity has to last us 3 years. We're also shifting the try load from 10.6 to 10.10, so there will be a pretty immediate increase in load due to that.

There are no plans or budget to increase Windows test hardware capacity this year.
thanks for the info :arr.  Sounds like we need to look more into selective test running per commit with things like SETA or other tools if we want to turn this stuff on.
I figured that this wouldn't be as easy as disabling a couple of tests and flipping a switch :-)

That being said, this is definitely something that needs to happen, preferably before e10s hits the beta channel. There's no way that it makes sense to ship e10s without full automated test coverage on all of the platforms we ship on.

Out of curiosity, what do the numbers look like for Windows, which is probably the most important platform to run tests on?
:ryanvm, do you have the numbers that :mrbkap asked for in comment 6, or do we need to ask someone else for those?
Flags: needinfo?(ryanvm)
All I know is what slave health tells me - 172 WinXP, 171 Win7, 180 Win8.
Flags: needinfo?(ryanvm)
Note that those are the absolute numbers, not accounting for any disabled/loaner/decommed/whatever slaves not taking jobs.
Oh, sorry, yes, I could have provided those, but I presumed that :mrbkap was asking about utilization and available capacity to run jobs.
Besides "awful" I don't have a great answer. We're constantly backlogged on Windows as it is. Even with inbound closed for a good portion of the day today, XP currently has 386 pending, 7 has 659, and 8 has 534.

We're going to need some seriously aggressive SETA to withstand turning on 10 more jobs per platform.
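
For a rough sense of scale, the machine counts and pending counts quoted above can be turned into a pending-jobs-per-machine ratio. This is only a back-of-the-envelope sketch: the dictionary layout and the `pending_per_machine` helper are made up for illustration, and the machine counts are the absolute slave health numbers, so the real ratios are probably worse once disabled and loaner machines are excluded.

```python
# Machine counts and pending jobs as quoted in the comments above.
# The machine counts are absolute (they include disabled/loaner/decommed
# machines), so the true per-machine backlog is likely higher.
POOLS = {
    "WinXP": {"machines": 172, "pending": 386},
    "Win7": {"machines": 171, "pending": 659},
    "Win8": {"machines": 180, "pending": 534},
}


def pending_per_machine(pools):
    """Return pending jobs divided by machine count for each pool."""
    return {name: p["pending"] / p["machines"] for name, p in pools.items()}


if __name__ == "__main__":
    ratios = pending_per_machine(POOLS)
    # Print the most backlogged pool first.
    for name in sorted(ratios, key=ratios.get, reverse=True):
        print(f"{name}: {ratios[name]:.2f} pending jobs per machine")
```

By this measure Win7 is the most backlogged pool, at a bit under four pending jobs per machine.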
I think our only real option is to enable the tests and suffer worse wait times / coalescing. 

The real question is how to minimize the impact.

Can we start by running the e10s tests infrequently? Or conversely, run the non-e10s tests infrequently?
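
To make the "run them infrequently" idea concrete, here is a minimal sketch of scheduling the e10s suites on every Nth push only. It is not how buildbot or SETA actually implement coalescing; `should_run_e10s`, `jobs_scheduled`, and the default of every 5th push are hypothetical, and the 10 jobs per push figure is the estimate from earlier in this bug.

```python
def should_run_e10s(push_index, every_n=5):
    """Return True when the e10s suites should be scheduled for this push.

    push_index is a zero-based counter of pushes to a branch; every_n is a
    hypothetical throttle setting (1 means run on every push).
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    return push_index % every_n == 0


def jobs_scheduled(num_pushes, jobs_per_push=10, every_n=5):
    """Count the extra e10s jobs scheduled over num_pushes pushes."""
    return sum(
        jobs_per_push for i in range(num_pushes) if should_run_e10s(i, every_n)
    )


if __name__ == "__main__":
    print("unthrottled:", jobs_scheduled(100, every_n=1))
    print("every 5th push:", jobs_scheduled(100, every_n=5))
```

The trade-off is the usual one with coalescing: fewer jobs per push, but a regression can land anywhere in the skipped range and needs backfilling to pin down.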
We have lots of new OSX hardware, where do we stand on standing up more e10s test coverage now?
Flags: needinfo?(jgriffin)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
> We have lots of new OSX hardware, where do we stand on standing up more e10s
> test coverage now?

We haven't started; I've been prioritizing Windows but can start looking at things on OSX.
Flags: needinfo?(jgriffin)
Here's what a current try run looks like with e10s enabled: https://treeherder.mozilla.org/#/jobs?repo=try&revision=544469683953&selectedJob=17293727

It's not too bad; we can get to green by selectively disabling pretty easily, I hope.
Depends on: 1237146
Depends on: 1252207
Depends on: 1252223
Depends on: 1252230
Summary: Run e10s mochitest on OSX → Run e10s tests on OSX
Depends on: 1252237
Depends on: 1252241
Depends on: 1252254
Depends on: 1252263
Depends on: 1252266
Depends on: 1242085
Depends on: 1252273
Depends on: 1252278
Depends on: 1250058
No longer depends on: 1252278
Depends on: 1252283
Depends on: 1252345
Depends on: 1252348
Depends on: 1252349
Depends on: perf-e10s-leak
Depends on: 1179542
No longer depends on: 1252241
Depends on: 1238707
Depends on: 1160011
Keywords: leave-open
Depends on: 1252875
Depends on: 1253035
Depends on: 1253037
Depends on: 1210117
Depends on: 1159963
Depends on: 1206424
Depends on: 1252201
Quick status update:

I've landed test disabling patches for a lot of the deps from this bug and things are starting to shape up to a point where we could conceivably have them running soon in some capacity in production. Mochitest-JP is broken across the board (and my understanding is that's the case in general for them on e10s). Mochitest-push is failing 20-30% of the time on opt/debug, hitting two different known intermittents. Mochitest-other is broken, but that's the case for e10s in general.

OPT: Mochitests are mostly green at this point, except for a never-ending parade of devtools intermittent whack-a-mole. Some websocket tests seem to be pretty orange-prone (see deps). Crashtests and jsreftests are green and stable. Reftests are permafail on 10.6 due to APZ issues related to scrollbar drawing. If push came to shove, we could probably just annotate those away, but kats is actively investigating them. Reftests are better on 10.10, but still fail about 20% of the time.

DEBUG: Mochitest-plain is mostly green, save for a smattering of gfx leaks across different directories and a semi-frequent service worker leak-the-world that affects Windows too. Crashtests and mochitest-gl are currently perma-assert and/or crashing. I could annotate around the crashtest failures if need be, but it looks like the gfx team is already working on a fix anyway. Reftests are in the same boat as their opt counterparts. Devtools have been a never-ending game of docShell leaks, and disabling one test just makes another one pop up instead.
Depends on: 1253238
Depends on: 1235056
Depends on: 1253252
Depends on: 1246453
Depends on: 1253690
Depends on: 1253710
Depends on: 1253714
Depends on: 1254447
No longer depends on: 1254447
Depends on: 1254814
Alias: e10s-tests-osx
No longer depends on: 1206424
Depends on: 1264059
Depends on: 1264073
Depends on: 1268169
Depends on: 1268319
No longer depends on: 1179542
The new OSX 10.10 test machines are online now. I think it's time to get OSX debug e10s tests enabled across the board instead of only on Ash. There's one devtools-specific permafail that will need addressing (bug 1268319), but I'll happily skip that directory on e10s if that's all that's blocking at this point.
We're already running all the opt suites we care about in production, so we only need to turn on the remaining debug tests. I'm targeting 48+ since that's the version we're targeting for wider release.

Builders added:
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test crashtest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test jsreftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-1
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-2
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-3
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-4
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-5
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-1
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-2
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-3
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-4
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-5
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-6
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-7
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-e10s-devtools-chrome-8
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-gl-e10s
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test mochitest-media-e10s
+ Rev7 MacOSX Yosemite 10.10.5 fx-team debug test reftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test crashtest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test jsreftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-1
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-2
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-3
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-4
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-5
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-1
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-2
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-3
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-4
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-5
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-6
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-7
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-e10s-devtools-chrome-8
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-gl-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test mochitest-media-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-aurora debug test reftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test crashtest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test jsreftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-1
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-2
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-3
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-4
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-5
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-1
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-2
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-3
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-4
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-5
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-6
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-7
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-e10s-devtools-chrome-8
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-gl-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test mochitest-media-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-central debug test reftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test crashtest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test jsreftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-1
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-2
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-3
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-4
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-5
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-1
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-2
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-3
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-4
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-5
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-6
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-7
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-e10s-devtools-chrome-8
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-gl-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test mochitest-media-e10s
+ Rev7 MacOSX Yosemite 10.10.5 mozilla-inbound debug test reftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 try debug test crashtest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 try debug test jsreftest-e10s
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-1
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-2
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-3
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-4
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-5
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-1
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-2
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-3
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-4
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-5
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-6
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-7
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-e10s-devtools-chrome-8
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-gl-e10s
+ Rev7 MacOSX Yosemite 10.10.5 try debug test mochitest-media-e10s
+ Rev7 MacOSX Yosemite 10.10.5 try debug test reftest-e10s
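
The builder list above is just the cross product of 5 branches and 18 suites. As a sanity check, it can be regenerated with a short sketch like the one below. This is not the actual buildbot-configs code; the `suites` and `builders` helpers are invented here purely to reproduce the names as listed.

```python
from itertools import product

PLATFORM = "Rev7 MacOSX Yosemite 10.10.5"
BRANCHES = ["fx-team", "mozilla-aurora", "mozilla-central", "mozilla-inbound", "try"]


def suites():
    """Return the 18 debug e10s suite names in the order listed above."""
    names = ["crashtest-e10s", "jsreftest-e10s"]
    names += [f"mochitest-e10s-{i}" for i in range(1, 6)]
    names += [f"mochitest-e10s-devtools-chrome-{i}" for i in range(1, 9)]
    names += ["mochitest-gl-e10s", "mochitest-media-e10s", "reftest-e10s"]
    return names


def builders():
    """Return one builder name per (branch, suite) pair, branch-major."""
    return [
        f"{PLATFORM} {branch} debug test {suite}"
        for branch, suite in product(BRANCHES, suites())
    ]


if __name__ == "__main__":
    for name in builders():
        print("+", name)
```

That works out to 90 new builders in total, which matches the list.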
Assignee: mrbkap → ryanvm
Status: NEW → ASSIGNED
Attachment #8752290 - Flags: review?(rail)
Attachment #8752290 - Flags: review?(rail) → review+
Comment on attachment 8752290 [details] [diff] [review]
enable remaining OSX debug e10s suites on Gecko 48+

https://hg.mozilla.org/build/buildbot-configs/rev/343ac36a1864
Attachment #8752290 - Flags: checkin+
\m/
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Keywords: leave-open
Target Milestone: --- → Firefox 49