Closed
Bug 994920
Opened 10 years ago
Closed 6 years ago
Run media mochitests on B2G emulators on faster VMs
Categories
(Firefox OS Graveyard :: General, defect, P2)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: jgriffin, Unassigned)
References
Details
(Keywords: ateam-b2g-big, Whiteboard: [leave open])
Attachments
(5 files)
848 bytes, patch (ahal: review+)
1.12 KB, patch (ahal: review+)
2.24 KB, patch (jlund: review+)
1.82 KB, text/plain
2.74 KB, text/plain
Currently, media mochitests (and possibly some others) on B2G emulators are very CPU-bound when run on the existing EC2 instances. This causes timeouts and other problems.

There are a couple of potential solutions. One is to run these tests on beefier EC2 instances (bug 985650), but according to Armen, that is a lot of work and not likely to happen soon. Therefore, in the short term, I propose we run the media mochitests on IX hardware slaves instead of EC2 instances.

There are several pieces here: splitting the media tests into their own chunk (probably similar to how we split devtools mochitests for desktop in bug 984930), and then landing the mozharness and buildbot-configs changes needed to schedule these in TBPL.
Reporter
Comment 1 • 10 years ago
Maire, can you provide some context to help determine priority?
Reporter
Comment 2 • 10 years ago
Joel, do you think the subsuite approach you're using in bug 984930 is the best way to go here, as far as splitting out media tests into a separate chunk?
Flags: needinfo?(jmaher)
Comment 3 • 10 years ago
We have this problem with mochitest-gl (the WebGL sub-harness in mochitest-plain) for Android.

Caveat:
* subsuite is on/off; there are no conditions for build type or platform right now. This means that if we defined subsuite=media, all of those tests would no longer run by default on desktop.

Ideas:
* skip-if = b2g the tests, then create a specific job type for b2g-media (a mozharness target using --test-path or --manifest) and a buildbot builder that runs only that job on hardware. This isn't a reality until we get the build to stop filtering tests based on skip-if conditions (bug 989583).
* add platform conditions to the subsuite (subsuite-if), so we can skip this for b2g only. The problem is that this muddies the otherwise fairly clean manifest syntax.

No matter which route we take, we would need to create a mozharness target and a buildbot builder to run it.
Flags: needinfo?(jmaher)
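The two manifest approaches weighed in comment 3 can be illustrated with a sketch (hypothetical test file names; annotations follow the manifestparser conventions discussed above):

```ini
# Option 1: skip on b2g in the regular suites, then run these tests in a
# dedicated b2g-media job via --test-path or --manifest.
[test_peerConnection_basicAudio.html]
skip-if = b2g

# Option 2: tag the test with a subsuite. As noted above, subsuite is
# currently unconditional, so this would also pull the test out of the
# default desktop runs.
[test_getUserMedia_basicVideo.html]
subsuite = media
```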
Reporter
Comment 4 • 10 years ago
Another option is to run _all_ of the B2G mochitests on IX slaves, until we either have the ability to run them on faster EC2 nodes or skip-if filtering is addressed. From an e-mail from Maire, I think this is needed for B2G 2.0, so it's not a fire drill, and we may be able to wait for a good solution. Needinfo'ing her so she can comment on deadlines.
Flags: needinfo?(mreavy)
Reporter
Updated • 10 years ago
Priority: -- → P2
Comment 5 • 10 years ago
WebRTC is the headline feature for v2.0. We're targeting getting all our patches landed by mid-May so that we are solid by FC on June 9. This week the "media quality" part of our schedule was badly hurt by B2G emulator timeouts that were not problems in our code, but problems with the slowness of the current emulator. Anything (even hackish) that you can do in the very near term to improve B2G emulator perf will get the full support (and many thanks!) of me and the WebRTC team.

Randell Jesup posted to dev-platform detailing the issues he ran into: https://groups.google.com/forum/#!topic/mozilla.dev.platform/qzyz-NzLqT0

Thanks!
Flags: needinfo?(mreavy)
Reporter
Comment 6 • 10 years ago
For the record, the plan is:

* skip-if = b2g the media mochitests, then create a specific job type for b2g-media (a mozharness target using --test-path or --manifest) and a buildbot builder that runs only that job on hardware. This isn't a reality until we get the build to stop filtering tests based on skip-if conditions (bug 989583).

If bug 989583 gets bogged down, we'll go with the subsuite approach, which is more work on the infrastructure side.
Depends on: 989583
Reporter
Comment 7 • 10 years ago
FYI, I think bug 989583 should be resolved this week, which will allow us to proceed with this.
Comment 8 • 10 years ago
Thanks, Jonathan. Do you have an ETA for this bug?
Comment 9 • 10 years ago
To give some more context about the urgency: in a comment in bug 1016498, I described that the logs show random delays between two WebRTC API calls in our test ranging from 0.5 s up to 13 s, with an average of 5.5 s. We are getting so many oranges related to these problems on the B2G emulator that I'm inclined to deactivate all of our tests on the B2G emulator which involve a network connection. The two other options I see are:
- run the tests on some form of dedicated hardware
- create special builds for the B2G emulator with very large timeout values to avoid test failures; I don't like this approach, though, because we would then no longer be testing what we ship to customers
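As a rough illustration (not code from the test suite), the delay figures above amount to measuring the gaps between consecutive API-call timestamps in a log; the timestamp values below are made up to mirror the reported 0.5–13 s range:

```python
# Hypothetical sketch: compute delays between consecutive WebRTC API
# calls, given their timestamps (in seconds) extracted from a test log.

def call_gaps(timestamps):
    """Return the delays between consecutive timestamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

ts = [0.0, 0.5, 3.5, 16.5]  # made-up sample timestamps
gaps = call_gaps(ts)
print(min(gaps), max(gaps), sum(gaps) / len(gaps))  # 0.5 13.0 5.5
```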
Reporter
Comment 10 • 10 years ago
Once bug 989583 is resolved (ETA: ~1 week) it should take us about a week to move these tests to IX slaves, which are real hardware slaves. So, about 2 weeks total.
Reporter
Updated • 10 years ago
Assignee: nobody → jgriffin
Reporter
Comment 11 • 10 years ago
Steps we need to do here:
1. create a separate mochitest-media job on cedar (on emulators initially)
2. land a patch on cedar to skip-if = b2g the media mochitests, and verify it doesn't prevent the mochitest-media job from running normally
3. move the mochitest-media job to IX hardware slaves
4. green up tests, as needed
5. schedule the mochitest-media job on all trunk branches on IX hardware slaves
6. land the patch from step 2 on trunk branches
Reporter
Comment 12 • 10 years ago
Attachment #8438077 - Flags: review?(ahalberstadt)
Reporter
Comment 13 • 10 years ago
Attachment #8438078 - Flags: review?(ahalberstadt)
Reporter
Comment 14 • 10 years ago
Attachment #8438079 - Flags: review?(jlund)
Comment 15 • 10 years ago
Jonathan, let me know if I can help in any way with any of the steps you outlined.
Comment 16 • 10 years ago
Comment on attachment 8438077 [details] [diff] [review]
Add --test-path to in-tree B2G mochitest config

I'm confused by this; didn't you say we were going to create a new mochitest job? This will affect the other jobs too. How will we specify different test_path config variables while re-using mochitest_options for the new job?
Attachment #8438077 - Flags: review?(ahalberstadt) → review-
Comment 17 • 10 years ago
Comment on attachment 8438078 [details] [diff] [review]
Add --test-path support to B2G mochitest mozharness script

This part looks good.
Attachment #8438078 - Flags: review?(ahalberstadt) → review+
Reporter
Comment 18 • 10 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #16)
> I'm confused by this; didn't you say we were going to create a new mochitest
> job? This will affect the other jobs too. How will we specify different
> test_path config variables while re-using mochitest_options for the new job?

By default, test_path will be None, so it will be excluded from other jobs here:
http://hg.mozilla.org/build/mozharness/file/aa104dcaf661/scripts/b2g_desktop_unittest.py#l185

This is similar to how we handle the browser_arg argument, which isn't used for most runs. See the third patch for the related buildbot config that actually creates and schedules the new job.
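The None-means-omit pattern described here can be sketched as follows; build_mochitest_options is a hypothetical stand-in for the mozharness code linked above, not the actual source:

```python
# Hypothetical sketch: the --test-path option is only added when the
# config actually sets test_path, so ordinary mochitest jobs (which
# leave it as None) are unaffected.

def build_mochitest_options(config):
    options = ["--console-level=INFO"]  # placeholder for the shared options
    test_path = config.get("test_path")  # None for ordinary jobs
    if test_path:
        options.extend(["--test-path", test_path])
    return options

print(build_mochitest_options({"test_path": "media/"}))
# ['--console-level=INFO', '--test-path', 'media/']
print(build_mochitest_options({}))
# ['--console-level=INFO']
```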
Comment 19 • 10 years ago
Comment on attachment 8438077 [details] [diff] [review]
Add --test-path to in-tree B2G mochitest config

Oh heh, that's kind of dirty :p. I was thinking that the string interpolation would convert None to 'None' and therefore be truthy, but I guess that has already been taken care of. In that case, this looks good too!
Attachment #8438077 - Flags: review- → review+
Comment 20 • 10 years ago
Comment on attachment 8438079 [details] [diff] [review]
Schedule mochitest-media on B2G emulators on cedar

::: mozilla-tests/b2g_config.py
@@ +1068,5 @@
> + 'mochitest-media': {
> +     'extra_args': [
> +         '--cfg', 'b2g/emulator_automation_config.py',
> +         '--test-suite', 'mochitest',
> +         '--test-path', 'media/',

So I think overall the whole patch works. However, I just want to sanity-check that we do not want to specify a chunk here. I'm not sure what happens when you add '--test-path' to the mochitest run_tests.py call, but I noticed we are not chunking, so I'd assume this will run all the chunks.
Attachment #8438079 - Flags: review?(jlund) → review+
Reporter
Comment 21 • 10 years ago
(In reply to Jordan Lund (:jlund) from comment #20)
> However, just want to sanity check that we do not want to call a specific
> chunk here.

Right; without chunking, we will just run all the tests in the specified path in one chunk.
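A trimmed-down sketch of the scheduling entry under review (hypothetical constant name; the real definition lives in mozilla-tests/b2g_config.py) shows why everything under media/ runs as a single chunk: no --total-chunks/--this-chunk arguments are passed to the harness.

```python
# Hypothetical, simplified version of the buildbot-configs suite entry.
MOCHITEST_MEDIA = {
    "extra_args": [
        "--cfg", "b2g/emulator_automation_config.py",
        "--test-suite", "mochitest",
        "--test-path", "media/",
    ],
}

def is_chunked(suite):
    """True if the suite passes chunking arguments to the harness."""
    return "--total-chunks" in suite["extra_args"]

print(is_chunked(MOCHITEST_MEDIA))  # False: the whole path runs as one chunk
```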
Reporter
Comment 22 • 10 years ago
Comment on attachment 8438078 [details] [diff] [review]
Add --test-path support to B2G mochitest mozharness script

https://hg.mozilla.org/build/mozharness/rev/8cb7108d657e
Reporter
Comment 23 • 10 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #22)
> https://hg.mozilla.org/build/mozharness/rev/8cb7108d657e

Pushed to production.
Reporter
Comment 24 • 10 years ago
Comment on attachment 8438077 [details] [diff] [review]
Add --test-path to in-tree B2G mochitest config

https://hg.mozilla.org/integration/mozilla-inbound/rev/ae4cb4a3a02c
Comment 25 • 10 years ago
In prod with reconfig on 2014-06-12 10:46 PT
Reporter
Updated • 10 years ago
Whiteboard: [leave open]
Reporter
Comment 27 • 10 years ago
Comment on attachment 8438079 [details] [diff] [review]
Schedule mochitest-media on B2G emulators on cedar

https://hg.mozilla.org/build/buildbot-configs/rev/b7ae1079631d
Comment 28 • 10 years ago
buildbot-config patch live in production :)
Reporter
Comment 29 • 10 years ago
I've run into a problem here, and that is that emulators don't like IX slaves. Although we may be able to resolve that problem, IX slaves are already somewhat overloaded. Instead, we're going to experiment with running the tests on faster VM nodes; see bug 1026802.
Comment 30 • 10 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=41912476&tree=Try&full=1
(duration = 112 mins, while others take about 1 hour)

Is that the reason why mochitest-3 on b2g opt sometimes runs much slower?
Reporter
Comment 31 • 10 years ago
(In reply to JW Wang [:jwwang] from comment #30) > https://tbpl.mozilla.org/php/getParsedLog.php?id=41912476&tree=Try&full=1 > (duration=112mins, while others are about 1 hour) > Is that the reason why sometimes mochitest-3 on b2g opt runs much slower? Very likely that's a significant contributing factor.
Reporter
Updated • 10 years ago
Summary: Run media mochitests on B2G emulators on IX hardware slaves → Run media mochitests on B2G emulators on faster VMs
Comment 32 • 10 years ago
Hi Jonathan -- what's the reason for changing from hardware slaves to faster VMs? Have we tested faster VMs already? Thanks.
Flags: needinfo?(jgriffin)
Comment 33 • 10 years ago
Sounds like we don't have enough hardware. Obviously faster VMs should be the first attempt here. But are there any fallback plans/ideas, e.g. executing only a subset of tests on real hardware?
Reporter
Comment 34 • 10 years ago
I have experimented with both real hardware and faster VMs; see bug 1026800. The emulator does not currently run on the real hardware we have. Although we could probably fix this, the pool of available machines is not large, and adding additional load is undesirable because it will increase wait times. We don't have this problem with VMs.

I've tried the tests on a faster VM and they seem to run OK. Since the timeouts for the media tests are intermittent, we'll have to wait to see how they perform in production, but we have the option of moving to even faster VMs if the problem persists.
Flags: needinfo?(jgriffin)
Reporter
Comment 35 • 10 years ago
(In reply to Nils Ohlmeier [:drno] from comment #33)
> But are there any fallback plans/ideas... e.g. execute only a sub-set of
> tests on real hardware?

Yes, if we can't get consistently green tests on very fast VMs, we can fall back to real hardware, which will take some additional configuration work.
Comment 36 • 10 years ago
Where are we at with this? The names of the test slaves for the B2G emulator suggest that everything still runs on small machines. We are disabling WebRTC tests in bug 1059867 and discussing alternatives for the future in bug 1059878.
Reporter
Comment 37 • 10 years ago
I didn't realize this was still pressing. I have a set of media tests running on faster VMs on cedar, but they're actually not getting triggered; I think support for --test-path in the B2G mochitest runner may not be working. I'll expedite a fix for this, and then we can see if the faster VMs solve this problem.
Reporter
Comment 38 • 10 years ago
We were passing the wrong test-path to the media mochitest job on cedar; I've fixed this in https://hg.mozilla.org/build/buildbot-configs/rev/917020f08255, but it won't roll out until the next buildbot reconfig.
Comment 39 • 10 years ago
patch(es) in production for this bug :)
Reporter
Comment 40 • 10 years ago
Maire: can I get a list of tests we want run on the faster VM type? I was thinking it was the tests in content/media, but looking at bug 1059867, I think it may be dom/media instead! Or maybe it's both...
Flags: needinfo?(mreavy)
Comment 41 • 10 years ago
I'll attach some lists, but basically: most webrtc tests (dom/media/tests/mochitest), some content/media/tests (I'm guessing a bit there without reading and understanding each one; jwwang can likely do a better job filtering that list), and all of content/media/webaudio/tests (some likely don't need to move, but it's far simpler to take all of them). We could take all of dom/media/tests/mochitest; most of the CPU time is in the tests I've already nominated to move.

Side note: dom/media/tests/crashtests may want to move as well.
Comment 42 • 10 years ago

Comment 43 • 10 years ago
FYI, Maire asked me to filter the list, so I'm clearing the needinfo to her.
Flags: needinfo?(mreavy)
Comment 44 • 10 years ago
I am making progress in enabling content/media/tests on B2G debug by removing per-token-exactGC in manifest.js (see https://tbpl.mozilla.org/?tree=Try&rev=319879ff1197). Most of them run well for now. Please exclude content/media/tests from the list for now.
Reporter
Comment 45 • 10 years ago
So I think the thing to do here is:

- point the media mochitest job on cedar (which is now running green) from content/media/tests to dom/media/tests, so we can see whether we can reproduce the failures that led to the recent test disabling there
- implement conditional subsuites, so we can make the media job contain tests from multiple directories without resorting to extra manifests that would eventually create confusion
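For illustration only, a conditional subsuite (the feature requested in bug 1061982) might look roughly like this in a manifest; the syntax below is an assumption for the sake of the sketch, not the design that landed:

```ini
[test_peerConnection_basicAudio.html]
# Hypothetical conditional-subsuite annotation: the test joins the media
# subsuite only when the condition holds, and stays in its default suite
# everywhere else.
subsuite = media,os == "b2g"
```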
Reporter
Comment 46 • 10 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #45) > So I think the thing to do here is: > > - point the media mochitest job on cedar (which is now running green) from > content/media/tests to dom/media/tests, so we can see if we can reproduce > the failures that led to the recent test disabling there https://hg.mozilla.org/build/buildbot-configs/rev/dd784899e53c
Reporter
Comment 47 • 10 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #45) > > - implement conditional subsuites so we can make the media job contain tests > from multiple directories without resorting to extra manifests that would > eventually create confusion bug 1061982
Comment 48 • 10 years ago
Merged to production, and deployed.
Reporter
Comment 49 • 10 years ago
So we can now run the dom/media tests on cedar on the faster VM. Strangely, they're all perma-fail with bug 1035011. See: https://tbpl.mozilla.org/?tree=Cedar&showall=1&rev=11c440c3fec3&jobname=emulator

The media tests are also running again in chunk 7 of the regular mochitest chunks on cedar; they exhibit this error as well. So this faster VM doesn't seem to be helping. I'm curious, though, why these tests are perma-fail; were they also perma-fail on trunk before being disabled? Logfile: https://tbpl.mozilla.org/php/getParsedLog.php?id=47499158&tree=Cedar&full=1

If nothing fishy appears to be happening here, we can try bumping the VM size again to see if it makes any difference. Going the real-hardware route is going to be more time-consuming. :(
Reporter
Comment 50 • 10 years ago
Checking if these tests are also perma-fail on try: https://tbpl.mozilla.org/?tree=Try&rev=b23bd3185420
Reporter
Comment 51 • 10 years ago
(In reply to Jonathan Griffin (:jgriffin) from comment #50)
> Checking if these tests are also perma-fail on try:
> https://tbpl.mozilla.org/?tree=Try&rev=b23bd3185420

These are similar to the failures on cedar.

Nils, I thought the problem we were trying to solve was clearing frequent intermittents caused by CPU contention; is it actually to solve a perma-fail? Or is it resolving a set of frequent intermittents that cumulatively amount to a perma-fail? Or...? Right now, based on these results, it looks like the faster VMs have had no effect on these, but I'd like someone else to confirm.
Flags: needinfo?(drno)
Reporter
Comment 52 • 10 years ago
Maire, do you have any input on comment #51?
Flags: needinfo?(mreavy)
Comment 53 • 10 years ago
When fixing bug 707777 (test_bug493187.html), I came to realize that the test requires a faster machine (for faster decoding) in order to pass. Can we put the whole content/media/test/ folder on a faster VM?
Comment 54 • 10 years ago
Hi Jonathan, thanks for your help with this! When we started talking about moving to faster machines (about 6-9 months ago), it was to solve frequent media/WebRTC intermittents that indeed were shown to be caused by CPU contention. If we ran the same suite of tests on faster hardware, we didn't see failures. AFAIK no one has measured how much faster the hardware needs to be to avoid failures.

A number of tests were disabled over the last 6 months after clear indications in the logs that connection attempts were timing out. There are links in the mochitest.ini file to the bugs that led to disabling each specific test. NOTE: Due to the slowness of the emulator, we worked around some problems by dramatically reducing the generation rate for fake audio. We'd love to get rid of this type of kludge.

About a month ago, as we were trying to land bug 991037, we noticed that the refactored code in that bug's patches caused many of the existing frequent intermittents to permafail. This is when we decided to disable the WebRTC tests on the B2G emulator (roughly 3 weeks ago). Our goal is to get all of these tests re-enabled (and remove any kludges) once we believe they can consistently give us accurate results.
Flags: needinfo?(mreavy)
Reporter
Comment 55 • 10 years ago
I'm adding James into the loop here, since his plans impact decisions here. It looks like faster VMs aren't sufficient, unless we move to _much_ faster VMs. If the tests were going to remain in buildbot long-term, we might do the work needed to run these on real hardware. However, James wants to transition B2G tests to TaskCluster soon, and TaskCluster doesn't support real hardware at the moment, only VMs.

Given that, I propose we try to find a fast VM this works on (apparently the work in bug 991037 rendered earlier attempts at this invalid) and do the buildbot work needed to support it, so that these can eventually be transitioned to TaskCluster. Doing this might require a much more expensive VM than we currently use, but I think it's the only viable option here. Needinfo'ing James and Catlee for their opinions.
Flags: needinfo?(jlal)
Flags: needinfo?(drno)
Flags: needinfo?(catlee)
Reporter
Comment 57 • 10 years ago
(In reply to James Lal [:lightsofapollo] from comment #56)
> Hrm- Have we tried instances with gpus yet?

We haven't; those are roughly 6x more expensive than the "faster" VMs currently in use. But I agree, that's probably what we need to get these tests running well.
Comment 58 • 10 years ago
I see two options at this point.

1) Run these tests on real hardware in buildbot for now. This would be on the same machines that run our Linux performance tests, and it will obviously delay getting these tests moved over to TaskCluster. These machines aren't particularly powerful, so there's no guarantee this will work; perhaps it's worth a quick experiment to verify.

2) Investigate reducing the CPU requirements of these tests. It's always better to have tests that are less resource-intensive. Is there some fundamental aspect of these tests that makes them expensive to run? If so, we could also look at not running them per-push to keep costs under control.
Flags: needinfo?(catlee)
Reporter
Comment 59 • 10 years ago
I don't think reducing the CPU requirements of these tests is a viable option. Unfortunately, the emulator itself is very CPU-intensive, and it's quite easy to end up CPU-bound on the VMs we're currently using.

I'd like to propose trying these on the g2.2xlarge node type (I'll file a bug for this) and, if they work OK there, moving the tests to that VM type. Although prices for this instance type are relatively high, only one job would need to run on it (at least for now), so the price per push shouldn't change much. And we can always look at scheduling these less frequently if needed.
Reporter
Comment 60 • 10 years ago
(In reply to Chris AtLee [:catlee] from comment #58) > I see two options at this point. > > 1) run these tests on real hardware in buildbot for now. This would be on > the same machines that do our Linux performance tests. This will obviously > delay getting these tests moved over to Task Cluster. These machines aren't > particularly powerful, so there's no guarantee this will work. Perhaps worth > a quick experiment to verify. Unfortunately, the emulator doesn't run on the current Linux hardware slaves, and it will take some work to figure out why and how to fix that. So a quick experiment here won't be very quick.
Reporter
Comment 61 • 10 years ago
Good news. The dom/media tests that have been disabled (with the exception of test_dataChannel_bug1013809.html) work well on an AWS VM of instance type g2.2xlarge. I'll file a bug to stand this up as a platform within releng infra.
Reporter
Comment 62 • 9 years ago
Not actively working on this, since it requires bug 1090612 to be implemented, so unassigning.
Assignee: jgriffin → nobody
Comment 63 • 6 years ago
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX