Android reftests run in a wacky environment that is not representative of anything

NEW
Assigned to

Status

Testing
Reftest
3 years ago
3 years ago

People

(Reporter: kats, Assigned: droeh)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

I finally got android reftests running on a loaner machine in order to debug mysterious behaviour in bug 1147038. As part of my investigation I realized that the reftests don't really run in a sane environment.

The bootstrap script loads the reftest.jsm into the same nsWindow holding Fennec's browser.xul, [1]. The reftest.jsm then rips out everything inside browser.xul's window [2][3] and injects its own xul:browser element which is used to load the reftest pages. It doesn't use Fennec's normal tab-loading code, and depending on how the listeners and triggers in Fennec's browser.js are registered, they may or may not run on the reftest pages.

For example the DOMMetaAdded listener in ViewportHandler is registered on the top-level window and so does fire, but does nothing since there's no tab. Listeners registered on the deck don't fire at all.

Basically, some stuff in browser.js is running but because the environment that they're running in (browser.xul gutted, no deck or tabs) breaks a lot of assumptions they basically do random things. I'm pretty sure that reftest failures on Android, at least those related to APZ/viewport/displayport stuff are nonsensical and not really representative of what happens in any production environment.

[1] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/bootstrap.js?rev=e51a54f1817d#28
[2] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js?rev=e51a54f1817d#279
[3] http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/browser.xul?rev=cddde1da19ae#8
This is terrifying.
No longer blocks: 1147038
Assignee: nobody → droeh
In recent months I have noticed that when I run reftests locally, only about:home is displayed and shows constant history / top-sites updates as tests run. The tests pass, but are not visible. 

But today -- all's well! Tests are visible again! I haven't run reftests locally for at least a week -- I'm not sure when this change happened, or why -- but it sure seems like a step in the right direction.
I'm not confident this is the way to fix this, but it's an idea:

1) Stop ripping out everything in browser.js (on Fennec, at least)
2) Use BrowserApp to open a tab with whatever url
3) set gBrowser to the browser for that tab
Created attachment 8643827 [details] [diff] [review]
WIP

In my local environment applying this patch seems to have the desired effect of using the <browser> element that Fennec creates on startup. I tried running the reftest-sanity tests with this, and the first few ran and passed, and then the browser crashed with this:

08-05 14:18:53.806 I/Gecko   (24024): FATAL ERROR: Non-local network connections are disabled and a connection attempt to tracking.services.mozilla.com (52.26.18.160) was made.
08-05 14:18:53.806 I/Gecko   (24024): You should only access hostnames available via the test networking proxy (if running mochitests) or from a test-specific httpd.js server (if running xpcshell tests). Browser services should be disabled or redirected to a local server.
08-05 14:18:53.806 F/MOZ_CRASH(24024): Hit MOZ_CRASH(Attempting to connect to non-local address!) at /home/kats/zspace/mozilla-git/netwerk/base/nsSocketTransport2.cpp:1253

I'm assuming there's something in Fennec that's trying to connect to tracking.services.m.c and I need to set a pref or something to disable it. mfinkle, any suggestions?
Flags: needinfo?(mark.finkle)
Created attachment 8643865 [details] [diff] [review]
WIP v2

This seems to do the job. With this I can get through the reftest-sanity suite, although there are some failures:

REFTEST INFO | Result summary:
REFTEST INFO | Successful: 84 (77 pass, 7 load only)
REFTEST INFO | Unexpected: 16 (14 unexpected fail, 0 unexpected pass, 2 unexpected asserts, 0 unexpected fixed asserts, 0 failed load, 0 exception)
REFTEST INFO | Known problems: 43 (19 known fail, 0 known asserts, 2 random, 22 skipped, 0 slow)

I usually get a few failures while running locally anyway so I'll do a try push with this to see what happens.
Attachment #8643827 - Attachment is obsolete: true
Flags: needinfo?(mark.finkle)
Comment on attachment 8643865 [details] [diff] [review]
WIP v2

Review of attachment 8643865 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/tools/reftest/bootstrap.js
@@ +28,5 @@
>                  Components.utils.import("chrome://reftest/content/reftest.jsm");
>                  win.addEventListener("pageshow", function() {
>                      win.removeEventListener("pageshow", arguments.callee); 
>                      // We add a setTimeout here because windows.innerWidth/Height are not set yet;
> +                    win.setTimeout(function () {OnRefTestLoad(win);}, 1000);    // on Fennec we need to give the browser some time to create a tab

I think you can listen to 'TabOpen' here instead. Something like:

win.BrowserApp.deck.addEventListener("TabOpen", function(){ OnRefTestLoad(win); }, false);
Attachment #8643865 - Flags: feedback+
Updated try push still not looking so hot, with a bunch of blue starting to show up: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0343ec6ae766

Geoff, any thoughts on what's going on?
Flags: needinfo?(gbrown)
The blues all look like infra problems -- nothing related to your patch.
Flags: needinfo?(gbrown)
Hm, ok. There's still a lot of failures, looks mostly like in one of the images stuff is blurry and in the other it's sharp. I'm guessing it's a race condition where the reftest snapshot is taken during initial draw or something. I can look into it a bit more.
Most of the differences I see are scale: The test image is a little bit larger than the reference image.
Actually, I see more blur than scale for a lot of tests too.

And this is interesting: The scale difference varies by platform. Look at tests/image/test/reftest/pngsuite-basic-n/basn0g02.png in reftest analyzer for an example. On Android 2.3, the test image is smaller than the reference image; on Android 4.3, the test image is larger than the reference image.
Depends on: 1191901
Forcing the resolution to 1.0 and running the reftest-sanity suite locally produced better results (only 3 failures, 2 of which were assertion failures), but I did a try push with that at https://treeherder.mozilla.org/#/jobs?repo=try&revision=e82df4e9d971 and it seems to be timing out/crashing a lot.
See Also: → bug 1192616
I don't think I have cycles to keep looking into this. Feel free to steal my WIP and carry on.
You need to log in before you can comment on or make changes to this bug.