I've been working to get reftests running on pandaboards, but have hit a significant slowdown in the past few weeks. With an Aug 7th nightly, chunk 1 of 5 of the reftests takes 28 minutes on my pandaboard. With an Aug 10th nightly, the test suite times out after an hour, having completed only 58% of the chunk. This does not appear to be test harness related, as running the Aug 7th apk with the Aug 10th tests zip works fine. Potentially graphics related?
What's the status here? Has this been corrected by fixing some other bug?
The problematic revision seems to be http://hg.mozilla.org/integration/mozilla-inbound/rev/c5946a8bcd5b, which is a merge of fx-team to m-c. I'll go through the revisions that were part of that merge to try to identify the problem.
The x86 emulator jsreftests (and perhaps other reftests) take significantly longer to complete than they did in July...possibly related?
This is probably caused by SkiaGL. If you want to try without it, set gfx.canvas.azure.accelerated = false. SkiaGL is not particularly good at the kind of work reftests do (which is draw a gigantic 800x1000 image into the canvas), so this isn't too surprising. We may want to just disable it for reftests entirely.
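For anyone who wants to try the pref change above locally, here is a minimal sketch, assuming the pref is set through a `user.js` line in the test profile. The helper name `format_user_pref` is hypothetical and not part of the reftest harness; only the pref name comes from this comment.

```python
import json


def format_user_pref(name, value):
    """Build one user.js line for a pref (hypothetical helper, not harness code).

    json.dumps produces JavaScript-compatible literals: False becomes false,
    strings are double-quoted, and numbers are left as-is.
    """
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


# Pref name taken from the comment above; False turns SkiaGL canvas off.
SKIAGL_OFF = format_user_pref("gfx.canvas.azure.accelerated", False)
```

Appending the `SKIAGL_OFF` line to the profile's `user.js` before starting the run should have the same effect as flipping the pref by hand.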
Yes, I narrowed it down to the change that removed the vendor check for Nvidia and started enabling SkiaGL based on canvas size. I was checking to see if tweaking the minimum dimension to enable skia would help, but it doesn't seem to. Disabling skia certainly fixes the slowdown, so if no one has any objections, that is what I'll do.
Comparing https://tbpl.mozilla.org/?tree=Try&rev=73f46a6d0f5e&showall=1 (SkiaGL) to https://tbpl.mozilla.org/?tree=Try&rev=5123e124b540&showall=1 (no SkiaGL), disabling SkiaGL is a significant improvement on Pandas. Note especially R3 (603 vs 1576 tests complete in 60 minutes). These are still running significantly slower than Tegras, and will need to run in more chunks, unless we find another way to speed them up. For x86 (sorry, no logs -- I'm testing on a loaner), I see no significant change with/without SkiaGL.
With the preference changed, running reftests in five chunks completes 91% of the first chunk before timing out after an hour, compared with 58% previously. That is still more than twice as slow as the August 7th build, where the first chunk took 28 minutes, so other factors must be contributing as well. Splitting into six chunks lets me run the first chunk in ~57 minutes locally, which might be enough to get panda reftests running again, although seven or eight chunks might be safer. blassey, do you have any objections to running the reftests with skiaGL disabled?
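The chunk counts above can be sanity-checked with a rough extrapolation. This is only a sketch: it assumes tests run at a constant rate and that all chunks take about as long as the first one, which the R3 numbers suggest is not strictly true. The function names and the 45-minute headroom budget are illustrative assumptions, not harness settings.

```python
import math


def estimate_chunk_minutes(percent_done, elapsed_minutes):
    """Extrapolate a full chunk's runtime from a partial run (constant rate assumed)."""
    return elapsed_minutes * 100.0 / percent_done


def chunks_needed(total_minutes, budget_minutes):
    """Smallest number of equal chunks that each fit in the per-chunk time budget."""
    return math.ceil(total_minutes / budget_minutes)


# 91% of chunk 1 of 5 completed in 60 minutes (figures from the comment above).
full_chunk = estimate_chunk_minutes(91, 60)   # about 66 minutes
total = full_chunk * 5                        # about 330 minutes for the whole suite

six_chunk_minutes = total / 6                 # about 55 minutes per chunk
at_timeout = chunks_needed(total, 60)         # chunks needed to just fit the 60-minute limit
with_headroom = chunks_needed(total, 45)      # chunks needed with an assumed 45-minute budget
```

Under these assumptions six chunks just fit within the hour, close to the ~57 minutes measured locally, and leaving some headroom pushes the count to eight.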
I think I can answer that for Brad: yes, it is OK to run reftests with skiaGL disabled. The point of reftests is not to exercise canvas back-ends. For that, we have the canvas mochitests. It is unfortunate that, because they use very small canvases below the threshold size for Skia/GL, our canvas mochitests do not currently exercise Skia/GL (bug 905217), but that is a separate bug.
Created attachment 805240 [details] [diff] [review] Patch to set pref to disable skiaGL
Backed out for causing the same Android 2.2 reftest failures that the Try push in comment 5 hit. https://hg.mozilla.org/integration/mozilla-inbound/rev/ca7407d33047
Sorry! I think we were so focused on the times, we neglected to look at the failures!
Sorry, I should have double-checked that. gbrown, do you think we should just do this for the pandas, or tweak the fuzziness for the failing test? They do render differently, but it does not look like a "significant" difference to me.
I would try tweaking the fuzziness, but I don't have a strong opinion.
Created attachment 806722 [details] [diff] [review] Patch to disable skiaGl and adjust fuzziness for failing reftest Try run here: https://tbpl.mozilla.org/?tree=Try&rev=11897a567b18&showall=1
Comment on attachment 806722 [details] [diff] [review] Patch to disable skiaGl and adjust fuzziness for failing reftest [Approval Request Comment] We should uplift this to keep SkiaGL behavior similar between trunk and aurora. Very low risk, only changes reftest preferences.