Bug 1031375 Opened 10 years ago Closed 10 years ago

How many devices do we need for perf testing?

Categories

(Firefox OS Graveyard :: Infrastructure, defect, P2)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: retornam, Unassigned)

References

Details

(Keywords: perf, Whiteboard: [c=automation p= s= u=])

At today's b2g-ateam-perf sync meeting we discussed setting up more Flames to run b2gperf, make test-perf, and possibly bisection of failed runs. This bug is to figure out how many more devices we need to allocate to perf jobs in the QA Lab build out.
Blocks: 1019792
Keywords: perf
Priority: -- → P2
Whiteboard: [c=automation p= s= u=]
At least with b2gperf, we need 4-5 devices to keep up with every gaia commit, and 2-3 devices to auto-bisect in case Gecko regresses. In total, 8 devices should be enough to test every gaia commit and auto-bisect gecko commits using b2gperf, which measures startup time only.
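The device-count estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the commit rate and per-run duration below are illustrative assumptions, not figures stated in this bug.

```python
import math

def devices_needed(commits_per_hour, runs_per_commit, minutes_per_run):
    """Devices required to keep up with incoming commits.

    Each commit needs runs_per_commit test iterations, each taking
    minutes_per_run minutes; one device executes one run at a time.
    """
    device_minutes_per_hour = commits_per_hour * runs_per_commit * minutes_per_run
    return math.ceil(device_minutes_per_hour / 60)

# Hypothetical inputs: 2 gaia commits/hour, 20 b2gperf runs per commit
# (the run count suggested later in this bug), ~7 minutes per startup run.
print(devices_needed(2, 20, 7))  # -> 5, in line with the 4-5 device estimate
```

With those assumed numbers the math lands in the same 4-5 device range quoted above; plug in the real commit rate and run duration to firm it up.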

We need to figure out how many devices we need to run make test-perf for the other tests. In theory, once we have all the start up tests with the new events, we shouldn't have to run b2gperf anymore and can rely only on make test-perf.

30 runs seemed like overkill for b2gperf; 20 runs was enough to detect regressions using BackoutBot. Leaving a needinfo on hub to see if he knows how many devices we need to run make test-perf.
Flags: needinfo?(hub)
Hey Dave, we have 8 devices at the moment in the Flame build out. I know we have 4-5 devices to run the actual tests, and 2-3 devices to auto-bisect. Do you have a bug for the auto-bisect, or for setting up the whole automated regression detection system? Thanks!
Flags: needinfo?(dhuseby)
I have no idea how many devices we need. We could do the calculation, but I'm working on changes that will make the tests take longer, so we should wait for those before calculating.
Flags: needinfo?(hub)
Mason and Hub, I just brain-dumped the b2ghaystack piece of the automated regression detection into Bug 1038293. There's some work remaining around making b2ghaystack smart enough to run bisect steps in parallel, feeding the correct parameters to b2ghaystack and the bisection jenkins jobs, and setting up jenkins jobs for running b2ghaystack and the bisection test passes.
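For scoping the bisection jobs, the number of build-and-test passes a bisection needs grows only logarithmically with the size of the regression window. A minimal sketch of that arithmetic (the window sizes below are illustrative, not taken from this bug):

```python
import math

def bisect_steps(window_commits):
    """Build/test passes needed to bisect a regression window of
    N commits with a standard binary search (one pass per step)."""
    if window_commits <= 1:
        return 0
    return math.ceil(math.log2(window_commits))

# Illustrative window sizes:
print(bisect_steps(64))   # -> 6
print(bisect_steps(100))  # -> 7
```

Since each pass is sequential in a plain binary search, 2-3 bisect devices mainly buy the ability to bisect multiple regressions at once, or to test several candidate builds per step in parallel as proposed for b2ghaystack.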
Flags: needinfo?(dhuseby)
After rethinking this, I recommend that we only test on 273MB Flame devices. I think having Flames with more memory available is redundant. The only exception would be if we need more RAM to run some apps/features (e.g. WebRTC video conferencing?).
I think just testing the minimum memory configuration for Flame devices works for me. If this is our reference device and our reference workload, it should be good! What I really want to know is whether 8 devices are enough to test every gaia commit, and some number of gecko commits, for all the make test-perf startup tests.
Component: WebQA → Infrastructure
Product: Testing → Firefox OS
We currently have 10 devices allocated for performance testing. This isn't enough to avoid queuing in the current configuration; however, I would prefer to raise separate bugs (limiting memory configurations, reducing iterations for b2gperf, etc.) to address that.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED