Closed Bug 1059878 Opened 10 years ago Closed 7 years ago

How to run B2G WebRTC tests in a reliable way

Categories

(Core :: WebRTC, defect, P3)

ARM
Gonk (Firefox OS)
defect

RESOLVED WONTFIX

People

(Reporter: drno, Unassigned)

References

Details

(Whiteboard: [webrtc-mochitest])

As bug 1059867 is going to disable all WebRTC tests on the B2G emulator, we should discuss what we can do instead to execute WebRTC tests for B2G in a more reliable way in the future.
So catlee wrote me yesterday:

If the slave name for the job is 'tst-linux64-XXX' then it's m1.medium. If it's 'tst-emulator64-XXX' then it's c3.xlarge IIRC.
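
For checking a try run against that rule, here is a minimal Python sketch. The prefix-to-instance mapping is taken only from the note above, which is itself marked "IIRC", so treat it as unverified. The names `INSTANCE_BY_PREFIX` and `guess_instance_type` are hypothetical and not part of any build tooling.

```python
# Assumption: mapping copied from the quoted note above, not from any
# authoritative inventory of the test pool.
INSTANCE_BY_PREFIX = {
    "tst-linux64-": "m1.medium",
    "tst-emulator64-": "c3.xlarge",
}


def guess_instance_type(slave_name):
    """Return the instance type implied by the slave name prefix, or None."""
    for prefix, instance_type in INSTANCE_BY_PREFIX.items():
        if slave_name.startswith(prefix):
            return instance_type
    # Unknown prefix: make no claim about the instance type.
    return None


if __name__ == "__main__":
    for name in ("tst-linux64-XXX", "tst-emulator64-XXX"):
        print(name, "->", guess_instance_type(name))
```

Returning None for an unrecognized prefix keeps the helper from guessing about pools the note does not mention.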

But in the try run I only see tst-linux64-XXX getting used. So I guess that means:

1) Move the WebRTC mochitests to faster EC2 instances, which are guaranteed to be used only by us (dedicated machines).

2) Move the WebRTC mochitests to real phone devices, like others use for their testing.

3) Run the WebRTC mochitests split across two emulators via steeplechase. As the B2G emulator is limited to a single core, this should at least reduce the load within one emulator.
How quickly could we get the tests running on the better instances?
What changed that made the tests so much more CPU intensive? Are these new tests?
Blocks: 991037
They have always been somewhat CPU intensive, but the real problem is that we aren't getting any kind of real-time treatment from the scheduler. WebRTC _cannot_ function if the instance says "No cycles for you!" for 20-200 seconds, according to its whim.
And no, these are not new tests. A large number of them have already been disabled due to these problems, and anytime the timing of something needs to change even slightly, the navel-gazing that the instance does can suddenly become a problem for more tests.
See Also: → 994920
To put things in perspective: of the 56 tests we have, I disabled 17 in bug 1059867. So basically there was not much left.

The engineering effort for debugging intermittent problems on B2G emu is really high. But the current solution has not caught many real bugs yet.

There were some efforts on this topic in the past, like in bug 994920.
By "really high", Nils is talking man-months just on the WebRTC team, by the way. This latest fiasco around bug 991037 has cost me almost a week, and probably that much in aggregate for drno and mt. Simultaneously, the media side of the WebRTC team has been fighting a similar struggle on other bugs. This is a common state of affairs.
Whiteboard: [webrtc-mochitest]
backlog: --- → webRTC+
Rank: 35
Priority: -- → P3
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX