Closed Bug 1019056 Opened 6 years ago Closed 5 years ago

Investigate noise level in Android Talos test tcheck2

Categories

(Firefox for Android :: Testing, defect)

x86_64
Linux
defect
Not set

RESOLVED WORKSFORME

People

(Reporter: gbrown, Unassigned)

Attachments

(1 file)

In bug 1015576, several people are concerned about the level of noise in tcheck2 measurements running on Android 4.0. It was better in the distant past, and no one is sure why things are different now -- someone should check and see if the noise can be reduced.
To get started on this I did a try push with some additional logging to see where the noise is coming from (could be number of frames rendered, or checkerboarding per frame, or time elapsed per checkerboarded-frame). Hopefully the try pushes will give me a full logcat so I can analyze the data.

https://tbpl.mozilla.org/?tree=Try&rev=9e31fa7564e9
That didn't work either, looks like. gbrown, do you know how to get large amounts of log data out of talos tests? It doesn't look like MOZ_UPLOAD_DIR is exposed to Java-land.
I know there are some problems with MOZ_UPLOAD_DIR for Android unittests -- likely similar limitations for Talos. I cannot think of a quick way forward, but will try to have a closer look later today.
Flags: needinfo?(gbrown)
Yes, Comment 5 worked.
Flags: needinfo?(gbrown)
Attached file checkerboards.tgz
Sorry, I forgot to update this bug. Comment 5 did in fact work, and I retriggered the test a bunch of times to get some data. The output from individual runs of rck2 is attached (5 runs per tbpl job x 7 jobs = 35 total). I did some rudimentary analysis on the numbers but didn't see any obvious reason why the noise level is high.

For the number of frames recorded:
Samples: 35
Average:     791.46
Stddev:      14.98
Max: 819
Min: 759

For the average checkerboarding per frame in the run:
Samples: 35
Average:       0.06
Stddev:       0.02
Max: 0.12
Min: 0.04

For the max checkerboard value in the run:
Samples: 35
Average:       0.99
Stddev:       0.04
Max: 1
Min: 0.775974
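
For anyone who wants to redo these summaries from the per-run values, a minimal sketch follows. It assumes the values for one metric have already been pulled out of the attached logs into a plain list; the `summarize` helper and the example numbers are hypothetical, and whether the Stddev figures above used the sample or the population formula is not stated, so the sample form here is a guess.

```python
import statistics


def summarize(values):
    """Return the five summary figures for one metric across runs."""
    return {
        "Samples": len(values),
        "Average": statistics.mean(values),
        # Assumption: sample standard deviation (n - 1 denominator).
        "Stddev": statistics.stdev(values),
        "Max": max(values),
        "Min": min(values),
    }


# Hypothetical frame counts, for illustration only; not the attached data.
example_frames = [780, 800, 790]
for label, value in summarize(example_frames).items():
    print(f"{label}: {value}")
```

Swapping `statistics.stdev` for `statistics.pstdev` gives the population form if that turns out to be the one used.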

All of these values seem fairly tightly clustered. I suspect the noise might just be because the test results are highly sensitive to timing factors, so even small hiccups here and there will throw them off quite a bit. I'm not sure if there's a good way to deal with that.
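
One rough way to put a number on how tightly each metric clusters is to divide its stddev by its average. The sketch below does that with the summary figures reported above; the `relative_stddev` helper is hypothetical, and since the per-frame figures are rounded to two decimals, that ratio in particular is only approximate.

```python
# Summary figures copied from the comment above: (average, stddev).
reported = {
    "frames recorded": (791.46, 14.98),
    "avg checkerboarding per frame": (0.06, 0.02),
    "max checkerboard value": (0.99, 0.04),
}


def relative_stddev(average, stddev):
    """Return stddev as a fraction of the average."""
    return stddev / average


for name, (average, stddev) in reported.items():
    print(f"{name}: {relative_stddev(average, stddev):.1%}")
```

With these inputs the frame count and the max checkerboard value vary by a few percent of their averages, while the average checkerboarding per frame varies by roughly a third of its average, though the rounding makes that last ratio coarse.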
See Also: → 851861
Noise was significantly reduced by bug 1097318.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME