922680 - [Tracking] Run b2g reftests out of process

Reporter

Description

•

12 years ago

This bug will track the progress towards getting b2g reftests running out of process. The mechanism to do so is already in place, but we'll have some dependent bugs and test fixes to work through.

Jonas, as a sanity check I'd like to get your opinion on how running remote reftests currently works (it's been over a year since it was written and things may have changed).

A. The harness sets the prefs 'browser.tabs.remote: true' and 'reftest.browser.iframe.enabled: true' [1].
B. These prefs cause reftest.js to use an <iframe mozbrowser remote> instead of a <xul:browser> [2].
C. reftest.js proceeds to start reftests in the normal way.

[1] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftestb2g.py#409
[2] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js#260

Chris Jones once mentioned that he had a hunch that this might not be what we want to do. He suggested that I replace shell.xul with reftest.xul entirely. I looked into this a bit in bug 807970 but ran into a series of roadblocks. If you think this is the better thing to do, I can re-investigate.
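For readers unfamiliar with the flow in steps A and B, here is a rough sketch of the decision those two prefs drive. It is not the code in reftest.js: `chooseBrowserElement` is a hypothetical helper, and the split of responsibilities between the two prefs (one selecting the iframe, the other adding `remote`) is an assumption made for illustration.

```javascript
// Sketch only: chooseBrowserElement is a hypothetical helper, not code from
// reftest.js. The pref names are the ones quoted in the description; which
// pref controls which attribute is an assumption made for illustration.
function chooseBrowserElement(prefs) {
  const useIframe = prefs["reftest.browser.iframe.enabled"] === true;
  const remote = prefs["browser.tabs.remote"] === true;

  if (!useIframe) {
    // Default path: the harness hosts tests in a <xul:browser>.
    return { tag: "xul:browser", attributes: {} };
  }

  // Remote path: an <iframe mozbrowser>, marked remote when requested.
  const attributes = { mozbrowser: "true" };
  if (remote) {
    attributes.remote = "true";
  }
  return { tag: "iframe", attributes };
}
```

Calling it with both prefs set to true describes an `<iframe mozbrowser remote>`; calling it with an empty object falls back to `<xul:browser>`.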

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Flags: needinfo?(jonas)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 1

•

12 years ago

This sounds good to me. Are you opening the reftest harness inside the <iframe>, or the individual reftest files themselves?

Flags: needinfo?(jonas)

Andrew Halberstadt [:ahal]

Reporter

Comment 2

•

12 years ago

Yeah, the whole harness is opened inside the iframe.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 3

•

12 years ago

Sounds good

Andrew Halberstadt [:ahal]

Reporter

Updated

•

12 years ago

Depends on: 927568

Jonathan Griffin (:jgriffin)

Comment 4

•

12 years ago

I ran this locally; I get the same result that's showing on cedar. Namely:

1) 14:36:24 INFO - System JS : ERROR chrome://marionette/content/marionette-listener.js:172 - NS_ERROR_INVALID_POINTER: Component returned failure code: 0x80004003 (NS_ERROR_INVALID_POINTER) [nsIContentFrameMessageManager.content]

2) 14:36:24 INFO - [Parent 658] WARNING: waitpid failed pid:709 errno:10: file ../../../gecko/ipc/chromium/src/base/process_util_posix.cc, line 254
14:36:24 INFO - [Parent 658] WARNING: waitpid failed pid:709 errno:10: file ../../../gecko/ipc/chromium/src/base/process_util_posix.cc, line 254
14:36:24 INFO - [Parent 658] WARNING: Failed to deliver SIGKILL to 709!(3).: file ../../../gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc, line 118
14:36:24 INFO - *** UTM:SVC TimerManager:notify - notified @mozilla.org/b2g/webapps-update-timer;1
14:36:24 INFO - System JS : ERROR resource://gre/modules/FileUtils.jsm:63 - NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]

1) is a really strange problem; it indicates that the 'content' variable which is made available to frame scripts is invalid. This may or may not be a Marionette bug, but the stack trace shows that it's probably harmless to Marionette itself, and isn't likely itself to be the cause of the overall failure.

2) Tests run until they attempt to draw a non-empty page, and then they fail with these errors. Tests that compare empty pages pass, for example:

REFTEST TEST-LOAD | about:blank | 0 / 1866 (0%)
REFTEST TEST-PASS | data:text/html,<body> | image comparison (==)

It's possible we're not setting up the OOP environment correctly.
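When triaging a log like the one above, it can help to pull out only the REFTEST status lines and tally them. The following is a minimal sketch that assumes status lines have the shape `REFTEST <status> | <test> | <detail>`, as in the two examples quoted in this comment. `parseReftestLine` and `summarize` are hypothetical helpers, not part of the harness.

```javascript
// Sketch only: hypothetical helpers for reading a reftest log.
// Assumes lines of the form "REFTEST <status> | <test> | <detail>",
// optionally preceded by a timestamp/level prefix.
function parseReftestLine(line) {
  const marker = "REFTEST ";
  const start = line.indexOf(marker);
  if (start === -1) {
    return null; // not a reftest status line (e.g. a waitpid warning)
  }
  const parts = line.slice(start + marker.length).split(" | ");
  if (parts.length < 2) {
    return null;
  }
  return {
    status: parts[0].trim(),
    test: parts[1].trim(),
    detail: parts.slice(2).join(" | ").trim(),
  };
}

// Count how many lines were seen for each status.
function summarize(lines) {
  const counts = {};
  for (const line of lines) {
    const parsed = parseReftestLine(line);
    if (parsed) {
      counts[parsed.status] = (counts[parsed.status] || 0) + 1;
    }
  }
  return counts;
}
```

Feeding it the full log would show at a glance how many tests got past TEST-LOAD before the child process went away.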

Andrew Halberstadt [:ahal]

Reporter

Comment 5

•

12 years ago

I read somewhere that <iframe mozbrowser> elements can only be created within a mozapp (is this true?). In light of this and comment 4, I decided to put the reftests under the same test-container app, checked into gaia, that the mochitests and other suites use. I managed to load the reftest harness in the test-container's contentWindow, but I ran into a dead end: the reftest harness is strongly rooted in the assumption that it is running in a chrome context (reftest.jsm in chrome communicates with reftest-content.js in content). Now I feel I'm in a bit of a catch-22: the <iframe mozbrowser> needs to run within a mozapp to run OOP properly, but the reftest harness needs to run from the chrome context to set up its message manager properly. I'm hoping I'm wrong, or that there's some easy solution I'm overlooking.

Andrew Halberstadt [:ahal]

Reporter

Comment 6

•

12 years ago

Maybe instead of using the test-container app, I can just turn the iframe that the reftest harness creates into a mozapp. I'll play around some more.

Andrew Halberstadt [:ahal]

Reporter

Comment 7

•

12 years ago

I had success by setting the test-container app as the homescreen and then making the reftest harness use the 'systemapp' iframe instead of creating a new one. This seems to work, with a few caveats:

A) I'm not sure if this is testing exactly what we want to test.
B) There are a lot of error messages of the form:

************************************************************
* Call to xpconnect wrapped JSObject produced this error: *
[Exception... "'[JavaScript Error: "aNetwork is null" {file: "resource://gre/modules/NetworkStatsService.jsm" line: 192}]' when calling method: [nsINetworkStatsServiceProxy::saveAppStats]" nsresult: "0x80570021 (NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS)" location: "native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0" data: yes]
************************************************************

though the tests seem to pass anyway, so maybe they can be ignored? If we do go this route, it would be good to fix them anyway, if only to make the log more readable.

Andrew Halberstadt [:ahal]

Reporter

Comment 8

•

12 years ago

I added a check by doing the following in reftest-content.js:

return Cc["@mozilla.org/xre/app-info;1"].
         getService(Ci.nsIXULRuntime).
         processType == Ci.nsIXULRuntime.PROCESS_TYPE_DEFAULT;

And apparently it isn't running out of process after all. Back to the drawing board :/
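The check above returns true when the script is running in the default (parent) process, which is the case that means the test is not out of process. A standalone sketch of the same comparison, runnable outside Gecko, might look like this. `isRunningInParentProcess` is a hypothetical helper, and the numeric constants are stand-ins for `Ci.nsIXULRuntime.PROCESS_TYPE_DEFAULT` and `Ci.nsIXULRuntime.PROCESS_TYPE_CONTENT`, included only so the example is self-contained.

```javascript
// Stand-ins for the nsIXULRuntime constants so this runs outside Gecko.
// In a real frame script, read them from Ci.nsIXULRuntime instead.
const PROCESS_TYPE_DEFAULT = 0;
const PROCESS_TYPE_CONTENT = 2;

// Hypothetical helper: `runtime` is any object exposing `processType`,
// such as the nsIXULRuntime service fetched in reftest-content.js.
function isRunningInParentProcess(runtime) {
  return runtime.processType === PROCESS_TYPE_DEFAULT;
}

// The inverse question the comment is really asking.
function isRunningOutOfProcess(runtime) {
  return !isRunningInParentProcess(runtime);
}
```

With a stub such as `{ processType: PROCESS_TYPE_CONTENT }`, `isRunningOutOfProcess` reports true; the result described in this comment corresponds to the stub `{ processType: PROCESS_TYPE_DEFAULT }`.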

Andrew Halberstadt [:ahal]

Reporter

Comment 9

•

12 years ago

Attached patch: Patch 1.0 - enable ability to run reftest oop

I had this patch for a while, but forgot to update this bug. It's currently landed on cedar. It works, but there are a lot of failures and crashes when oop gets turned on. This patch adds the ability to turn oop on, but doesn't do so just yet. There's a try run to see what effect it has on in-process reftests: https://tbpl.mozilla.org/?tree=Try&rev=ec04b4205793

Andrew Halberstadt [:ahal]

Reporter

Comment 10

•

12 years ago

Comment on attachment 8341771: Patch 1.0 - enable ability to run reftest oop

Try run looks good. I'd like to get this landed just to avoid bitrot. When the failures are sorted out on Cedar it'll just be a matter of setting the oop pref.

Attachment #8341771 - Flags: review?(jgriffin)

Jonathan Griffin (:jgriffin)

Comment 11

•

12 years ago

Comment on attachment 8341771: Patch 1.0 - enable ability to run reftest oop

lgtm, thanks for the misc cleanup too

Attachment #8341771 - Flags: review?(jgriffin) → review+

Andrew Halberstadt [:ahal]

Reporter

Comment 12

•

12 years ago

https://hg.mozilla.org/integration/b2g-inbound/rev/94b8161eb20a Leaving open as a tracking bug for oop related failures.

Whiteboard: [leave-open]

Carsten Book [:Tomcat]

Comment 13

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/94b8161eb20a

Andrew Halberstadt [:ahal]

Reporter

Comment 14

•

12 years ago

Hey Milan, just want to inform you that the B2G emulator reftests are running out of process on the Cedar branch: https://tbpl.mozilla.org/?tree=Cedar&showall=1

The jobs are almost all failing; I'm not sure if the failures are due to faulty tests, platform bugs or test harness bugs. I also don't know the priority of greening these up (Jonas or Gregor would be the ones to ask about that). Feel free to direct any inquiries about how they are run to me.

P.S. You'll notice there are two jobs per chunk. One is running on a physical machine, the other on a VM in AWS. You can tell them apart by looking at the "using slave" section in the bottom left: 'talos-r3-fed-xxx' for the former and 'tst-linux64-spot-xxx' for the latter.
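If you are scanning many jobs, the naming rule above can be applied mechanically. This is only an illustrative sketch: `classifySlave` is a hypothetical helper, and the two prefixes are taken from the names quoted in this comment.

```javascript
// Sketch only: classifySlave is a hypothetical helper. The prefixes come
// from the slave names quoted in this comment; anything else is "unknown".
function classifySlave(name) {
  if (name.startsWith("talos-r3-fed-")) {
    return "physical";
  }
  if (name.startsWith("tst-linux64-spot-")) {
    return "aws-vm";
  }
  return "unknown";
}
```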

Milan Sreckovic [:milan] (needinfo for best results)

Comment 15

•

12 years ago

I'm going to needinfo myself on this to keep an eye on it. I'm sure it's a mixture of reasons, and I imagine there are a lot of actual bugs, though I'm not sure how often those bugs show up outside the tests. CC-ing CJ as well.

Flags: needinfo?(milan)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 16

•

12 years ago

Getting the reftests running out of process is definitely a high priority. Compared to the code tested by other harnesses, like mochitest and xpcshell tests, the out-of-process gfx code has quite often seen regressions on desktop, most likely because of the unfortunate combination of being platform-specific and not used by any shipping products. So we should definitely get tests up and running if we can. If you have even a small set of tests consistently passing, please do disable any failing or intermittent tests and let's get anything that's there running. That said, getting James Lal's integration tests up and running is higher priority. But I'd say that OOP reftests are second after that.

Andrew Halberstadt [:ahal]

Reporter

Comment 17

•

12 years ago

(In reply to Jonas Sicking (:sicking) needinfo? or feedback? encouraged. from comment #16)
> So we should definitely get tests up-and-running if we can. If you even have
> a small set of tests consistently passing, please do disable any failing or
> intermittent tests and lets get anything that's there running.

Can do. Though because this isn't a new suite, disabling reftests means losing pre-existing coverage. One possibility is to create a second suite on tbpl that runs them oop. Though because emulator reftests are so slow, doubling the number of jobs running them will make a lot of people who are concerned about infrastructure load unhappy. I'll see what it would take to green them up; maybe the loss of coverage wouldn't be too great.

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

12 years ago

Flags: needinfo?(milan)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 18

•

12 years ago

Losing coverage is ok if it means going from in-process to out-of-process.

Milan Sreckovic [:milan] (needinfo for best results)

Comment 19

•

12 years ago

(In reply to Jonas Sicking (:sicking) vacation until Jan 20 from comment #18)
> Loosing coverage is ok if it means going from in-process to out-of-process.

Would we keep running the in-process tests "on the side" in the meantime? Or is that impractical?

Andrew Halberstadt [:ahal]

Reporter

Comment 20

•

12 years ago

(In reply to Milan Sreckovic [:milan] from comment #19)
> (In reply to Jonas Sicking (:sicking) vacation until Jan 20 from comment #18)
> > Loosing coverage is ok if it means going from in-process to out-of-process.
>
> Would we keep running the in-process tests "on the side" in the meantime?
> Or is that impractical?

As always we need to balance infrastructure load with usefulness of the tests. The emulator is notoriously slow, so running the same suite twice would be costly. I can't really comment on the usefulness of keeping in-process coverage though.

Ivan Tsay (:ITsay)

Comment 21

•

12 years ago

Hi Andrew,

Comment 14 mentioned that there were a lot of test failures after the OOP tests were up and running for gfx. May I know if the situation remains the same right now?

I am asking because the graphics team is looking for specific follow-up items from this case. The items are probably about resolving the test failures, the bugs found by the tests, or both. Thank you!!

Flags: needinfo?(ahalberstadt)

Andrew Halberstadt [:ahal]

Reporter

Comment 22

•

12 years ago

(In reply to Ivan Tsay (:ITsay) from comment #21)
> Hi Andrew,
>
> It was mentioned in the comment 14 that there were a lot of test failures
> after the OOP test has been up-and-running for gfx. May I know if the
> situation remains the same right now?
>
> I am asking because graphic team is looking for the specific follow-up items
> from this case. The items are probably on resolving test failure issue, the
> bugs that are found from the tests, or both. Thank you!!

Yeah, the situation is still the same. I haven't investigated or filed bugs about any of the issues yet, but if you look at the emulator reftests on cedar (https://tbpl.mozilla.org/?tree=Cedar&jobname=b2g) you'll see what I'm talking about. You'll notice there are two jobs per chunk: the first one is running on physical fedora machines (we are trying to retire these), the second is running on our Amazon EC2 instances. I'd recommend using the first chunk as your baseline since those are failures we know for sure are caused by oop.

Let me know if you have questions about how the tests are being set up.

Flags: needinfo?(ahalberstadt)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 23

•

12 years ago

Is there any dependency, or preferred ordering of this and bug 818968? Btw, :cjku is going to manage this from the graphics end of things.

Flags: needinfo?(ahalberstadt)

Andrew Halberstadt [:ahal]

Reporter

Comment 24

•

12 years ago

(In reply to Milan Sreckovic [:milan] from comment #23)
> Is there any dependency, or preferred ordering of this and bug 818968?
> Btw, :cjku is going to manage this form the graphics end of things.

I think bug 818968 is probably the higher priority since I don't think the pool they are currently running on will exist after the Mountain View office moves (though I might be wrong?).

:cjku, thanks for helping out with this. Let me know if you have any questions about how they are run or how to reproduce locally.

Flags: needinfo?(ahalberstadt)

u459114

Assignee

Comment 25

•

12 years ago

Sure. NI to myself

Flags: needinfo?(cku)

Andrew Halberstadt [:ahal]

Reporter

Comment 26

•

12 years ago

Un-assigning myself as the ability to run them oop has landed.

Assignee: ahalberstadt → nobody

Status: ASSIGNED → NEW

Abel Lin(alin, abel)

Comment 27

•

12 years ago

I have been looking into the issue for a while. The failed reftests (REFTEST TEST-UNEXPECTED-FAIL) fall into the following types:

1. image difference
   a. timing issue like scrolling
   b. static rendering (most cases)
   c. animation
2. xul load failed

I'll keep looking into why most of the cases failed.

u459114

Assignee

Updated

•

12 years ago

Assignee: nobody → alin

Flags: needinfo?(cku)

u459114

Assignee

Comment 28

•

12 years ago

Abel, #2 is not an image-diff failure, so you can handle it separately. For #1, my suggestion is to have the reftest harness automatically create two diff map images whenever a difference is detected. The first diff map contains the differing pixels of the first canvas image, while the second contains the differing pixels of the second canvas image. This would not only save you debugging time, but also help other developers figure out reftest problems in the future.
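To make the two-diff-map idea concrete, here is a minimal sketch operating on raw RGBA pixel buffers (the layout you get from a canvas `getImageData` call). It is not harness code: `buildDiffMaps` is a hypothetical helper, and leaving matching pixels transparent in both maps is an assumption about how the maps would be displayed.

```javascript
// Sketch only: buildDiffMaps is a hypothetical helper, not harness code.
// Both inputs are RGBA byte arrays of equal length (4 bytes per pixel).
function buildDiffMaps(first, second) {
  if (first.length !== second.length || first.length % 4 !== 0) {
    throw new Error("images must be RGBA buffers of the same size");
  }
  // Typed arrays start zero-filled, so matching pixels stay transparent.
  const firstMap = new Uint8ClampedArray(first.length);
  const secondMap = new Uint8ClampedArray(second.length);
  let differing = 0;

  for (let i = 0; i < first.length; i += 4) {
    const same =
      first[i] === second[i] &&
      first[i + 1] === second[i + 1] &&
      first[i + 2] === second[i + 2] &&
      first[i + 3] === second[i + 3];
    if (same) {
      continue;
    }
    differing++;
    // Copy the differing pixel from each source into its own map.
    for (let c = 0; c < 4; c++) {
      firstMap[i + c] = first[i + c];
      secondMap[i + c] = second[i + c];
    }
  }
  return { firstMap, secondMap, differing };
}
```

Each map can then be painted back onto a canvas to show exactly which pixels each rendering contributed to the mismatch.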

Peter Chang[:pchang]

Comment 29

•

12 years ago

(In reply to Abel Lin(alin, abel) from comment #27)
> Looking into the issue for a while,
> the type of failed reftests(REFTEST TEST-UNEXPECTED-FAIL) are as below:
> 1. image difference
> a. timing issue like scrolling
> b. static rendering(most cases)
> c. animation
> 2. xul load failed
>
> keep looking into why the most cases failed.

Abel, here is another suggestion for debugging #1. You got some failures from the border-radius test cases, whose pixels are generated from canvas and from CSS styles under OOP mode. Please check the pixel results under non-OOP mode to identify whether it is the canvas pixels or the CSS-style pixels that are wrong under OOP mode.

By the way, please also list some of the failing test cases in this bug. You can attach the details or use http://www.pastebin.mozilla.org.

u459114

Assignee

Comment 30

•

12 years ago

Abel, can you still reproduce these errors after applying the patch in bug 916350?

Patch 1.0 - enable ability to run reftest oop (8.31 KB, patch), Andrew Halberstadt [:ahal], 12 years ago, jgriffin: review+
skip-fail-test.diff (44.25 KB, patch), Vincent Chen [:vichen], 11 years ago, ahal: review+
skip-fail-test-2.diff (65.10 KB, patch), Vincent Chen [:vichen], 11 years ago
skip-fail-test-3.patch (63.63 KB, patch), Vincent Chen [:vichen], 11 years ago, ahal: review+
enable_oop.patch (1.81 KB, patch), Vincent Chen [:vichen], 11 years ago
922680-enable-oop.patch (1.51 KB, patch), Vincent Chen [:vichen], 11 years ago, ahal: review+
922680-skip-quit.patch (760 bytes, patch), Vincent Chen [:vichen], 11 years ago
Enable oop on b2g emulator reftests (2.45 KB, patch), Andrew Halberstadt [:ahal], 11 years ago, jgriffin: review+