[Tracking] Run b2g reftests out of process

RESOLVED FIXED in mozilla32

Status

RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: ahal, Assigned: u459114)

Tracking

unspecified
mozilla32
ARM
Gonk (Firefox OS)
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 6 obsolete attachments)

(Reporter)

Description

5 years ago
This bug will track the progress towards getting b2g reftests running out of process. The mechanism to do so is already in place, but we'll have some dependent bugs and test fixes to work through.

Jonas, as a sanity check I'd like to get your opinion on how running remote reftests currently works (it's been over a year since it was written and things may have changed).

A. The harness sets the prefs 'browser.tabs.remote: true' and 'reftest.browser.iframe.enabled: true' [1].
B. These prefs cause reftest.js to use an <iframe mozbrowser remote> instead of a <xul:browser> [2]
C. reftest.js proceeds to start reftests in the normal way

[1] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftestb2g.py#409
[2] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js#260

Chris Jones once mentioned that he had a hunch that this might not be what we want to do. He suggested that I replace shell.xul with reftest.xul entirely. I looked into this a bit in bug 807970 but ran into a series of roadblocks. If you think this is the better thing to do, I can re-investigate.
(Reporter)

Updated

5 years ago
Flags: needinfo?(jonas)
This sounds good to me.

Are you opening the reftest harness inside the <iframe>, or the individual reftest files themselves?
Flags: needinfo?(jonas)
(Reporter)

Comment 2

5 years ago
Yeah, the whole harness is opened inside the iframe.
(Reporter)

Updated

5 years ago
Depends on: 927568
I ran this locally; I get the same result that's showing on cedar.

Namely:

1) 14:36:24     INFO -  System JS : ERROR chrome://marionette/content/marionette-listener.js:172 - NS_ERROR_INVALID_POINTER: Component returned failure code: 0x80004003 (NS_ERROR_INVALID_POINTER) [nsIContentFrameMessageManager.content]

2) 14:36:24     INFO -  [Parent 658] WARNING: waitpid failed pid:709 errno:10: file ../../../gecko/ipc/chromium/src/base/process_util_posix.cc, line 254
14:36:24     INFO -  [Parent 658] WARNING: waitpid failed pid:709 errno:10: file ../../../gecko/ipc/chromium/src/base/process_util_posix.cc, line 254
14:36:24     INFO -  [Parent 658] WARNING: Failed to deliver SIGKILL to 709!(3).: file ../../../gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc, line 118
14:36:24     INFO -  *** UTM:SVC TimerManager:notify - notified @mozilla.org/b2g/webapps-update-timer;1
14:36:24     INFO -  System JS : ERROR resource://gre/modules/FileUtils.jsm:63 - NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]

1) is a really strange problem; it indicates that the 'content' variable which is made available to frame scripts is invalid.  This may or may not be a Marionette bug, but the stack trace shows that it's probably harmless to Marionette itself, and isn't likely itself to be the cause of the overall failure.

2) Tests run until they attempt to draw a non-empty page, and then they fail with these errors.  Tests that compare empty pages pass, for example:

REFTEST TEST-LOAD | about:blank | 0 / 1866 (0%)
REFTEST TEST-PASS | data:text/html,<body> | image comparison (==)

It's possible we're not setting up the OOP environment correctly.
(Reporter)

Comment 5

5 years ago
I read somewhere that <iframe mozbrowser>'s can only be created within a mozapp (is this true?). In light of this and comment 4, I decided to put the reftests under the same test-container app that is checked into gaia which the mochitests + others use.

I managed to load the reftest harness in the test-container's contentWindow, though I ran into a dead end because the reftest harness is strongly rooted in the assumption that it is running in a chrome context (reftest.jsm in chrome communicates with reftest-content.js in content).

Now I feel I am at a bit of a catch-22.. The <iframe mozbrowser> needs to run within a mozapp to run OOP properly, but the reftest harness needs to run from the chrome context to set up it's message manager properly.

I'm hoping I'm wrong, or there's some easy solution I'm overlooking.
(Reporter)

Comment 6

5 years ago
Maybe instead of using the test-container app, I can just turn the iframe that the reftest harness creates into a mozapp. I'll play around some more.
(Reporter)

Comment 7

5 years ago
I have success by setting the test-container app as the homescreen and then making the reftest harness use the 'systemapp' iframe instead of creating a new one.

This seems to work with a few caveats:

A) I'm not sure if this is testing exactly what we want to test
B) There are a lot of error messages of the form:
************************************************************
* Call to xpconnect wrapped JSObject produced this error:  *
[Exception... "'[JavaScript Error: "aNetwork is null" {file: "resource://gre/modules/NetworkStatsService.jsm" line: 192}]' when calling method: [nsINetworkStatsServiceProxy::saveAppStats]"  nsresult: "0x80570021 (NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS)"  location: "native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0"  data: yes]
************************************************************

though the tests seem to pass anyway, so maybe it can be ignored? If we do go this route, would be good to fix anyway if only to make the log more readable.
(Reporter)

Comment 8

5 years ago
I added a check by doing in reftest-content.js:
       return Cc["@mozilla.org/xre/app-info;1"].
                getService(Ci.nsIXULRuntime).
                processType == Ci.nsIXULRuntime.PROCESS_TYPE_DEFAULT;

And apparently it isn't running out of process after all. Back to the drawing board :/
(Reporter)

Comment 9

5 years ago
Created attachment 8341771 [details] [diff] [review]
Patch 1.0 - enable ability to run reftest oop

I had this patch for awhile, but forgot to update this bug. It's currently landed on cedar. It works, but there are a lot of failures and crashes when oop gets turned on.

This patch adds the ability to turn oop on, but doesn't do so just yet. There's a try run to see what affect it has on in process reftests:
https://tbpl.mozilla.org/?tree=Try&rev=ec04b4205793
(Reporter)

Comment 10

5 years ago
Comment on attachment 8341771 [details] [diff] [review]
Patch 1.0 - enable ability to run reftest oop

Review of attachment 8341771 [details] [diff] [review]:
-----------------------------------------------------------------

Try run looks good. I'd like to get this landed just to avoid bitrot. When the failures are sorted out on Cedar it'll just be a matter of setting the oop pref.
Attachment #8341771 - Flags: review?(jgriffin)
Comment on attachment 8341771 [details] [diff] [review]
Patch 1.0 - enable ability to run reftest oop

Review of attachment 8341771 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm, thanks for the misc cleanup too
Attachment #8341771 - Flags: review?(jgriffin) → review+
(Reporter)

Comment 12

5 years ago
https://hg.mozilla.org/integration/b2g-inbound/rev/94b8161eb20a

Leaving open as a tracking bug for oop related failures.
Whiteboard: [leave-open]
(Reporter)

Comment 14

5 years ago
Hey Milan, just want to inform you that the B2G emulator reftests are running out of process on the Cedar branch:
https://tbpl.mozilla.org/?tree=Cedar&showall=1

The jobs are almost all failing, I'm not sure if the failures are due to faulty tests, platform bugs or test harness bugs. I also don't know the priority of greening these up (Jonas or Gregor would be the ones to ask about that). Feel free to direct any inquiries about how they are run to me.

p.s You'll notice there are two jobs per chunk. One is running on a physical machine, the other is a VM in AWS. You can tell them apart by looking at the "using slave" section in the bottom left, 'talos-r3-fed-xxx' for the former and 'tst-linux64-spot-xxx' for the latter.
I'm going to needinfo me on this to keep an eye on it.  I'm sure it's a mixture of reasons, and I imagine there is a lot of actual bugs, though I'm not sure how often those bugs show up outside the tests.  CC-ing CJ as well.
Flags: needinfo?(milan)
Getting the reftests running out of process is definitely a high priority. Compared to the code tested by other harneses, like mochitest and xpcshell tests, the out-of-process gfx code has quite often seen regressions on desktop.

Most likely because of the unfortunate combination of being platform specific and not used by any shipping products.

So we should definitely get tests up-and-running if we can. If you even have a small set of tests consistently passing, please do disable any failing or intermittent tests and lets get anything that's there running.

That said, getting James Lal's integration tests up-and-running is higher priority. But I'd say that OOP reftests are second after that.
(Reporter)

Comment 17

5 years ago
(In reply to Jonas Sicking (:sicking) needinfo? or feedback? encouraged. from comment #16)
> So we should definitely get tests up-and-running if we can. If you even have
> a small set of tests consistently passing, please do disable any failing or
> intermittent tests and lets get anything that's there running.

Can do. Though because this isn't a new suite, disabling reftests means losing pre-existing coverage. One possibility is to create a second suite on tbpl that runs them oop. Though because emulator reftests are so slow, doubling the number of jobs running them will make a lot of people who are concerned about infrastructure load unhappy.

I'll see what it would take to green them up, maybe the loss of coverage wouldn't be too great.
Flags: needinfo?(milan)
Loosing coverage is ok if it means going from in-process to out-of-process.
(In reply to Jonas Sicking (:sicking) vacation until Jan 20 from comment #18)
> Loosing coverage is ok if it means going from in-process to out-of-process.

Would we keep running the in-process tests "on the side" in the meantime?  Or is that impractical?
(Reporter)

Comment 20

5 years ago
(In reply to Milan Sreckovic [:milan] from comment #19)
> (In reply to Jonas Sicking (:sicking) vacation until Jan 20 from comment #18)
> > Loosing coverage is ok if it means going from in-process to out-of-process.
> 
> Would we keep running the in-process tests "on the side" in the meantime? 
> Or is that impractical?

As always we need to balance infrastructure load with usefulness of the tests. The emulator is notoriously slow, so running the same suite twice would be costly. I can't really comment on the usefulness of keeping in-process coverage though.

Comment 21

5 years ago
Hi Andrew,

It was mentioned in the comment 14 that there were a lot of test failures after the OOP test has been up-and-running for gfx. May I know if the situation remains the same right now?

I am asking because graphic team is looking for the specific follow-up items from this case. The items are probably on resolving test failure issue, the bugs that are found from the tests, or both. Thank you!!
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 22

5 years ago
(In reply to Ivan Tsay (:ITsay) from comment #21)
> Hi Andrew,
> 
> It was mentioned in the comment 14 that there were a lot of test failures
> after the OOP test has been up-and-running for gfx. May I know if the
> situation remains the same right now?
> 
> I am asking because graphic team is looking for the specific follow-up items
> from this case. The items are probably on resolving test failure issue, the
> bugs that are found from the tests, or both. Thank you!!

Yeah, the situation is still the same. I haven't investigated or filed bugs about any of the issues yet, but if you look at the emulator reftests on cedar (https://tbpl.mozilla.org/?tree=Cedar&jobname=b2g) you'll see what I'm talking about. You'll notice there are two jobs per chunk, the first one is running on physical fedora machines (we are trying to retire these), the second is running on our Amazon ec2 instance. I'd recommend using the first chunk as your baseline since those are failures we know for sure are caused by oop.

Let me know if you have questions about how the tests are being set up.
Flags: needinfo?(ahalberstadt)
Is there any dependency, or preferred ordering of this and bug 818968?
Btw, :cjku is going to manage this form the graphics end of things.
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 24

5 years ago
(In reply to Milan Sreckovic [:milan] from comment #23)
> Is there any dependency, or preferred ordering of this and bug 818968?
> Btw, :cjku is going to manage this form the graphics end of things.

I think bug 818968 is probably the higher priority since I don't think the pool they are currently running on will exist after the mountain view office moves (though I might be wrong?).

:cjku, thanks for helping out with this. Let me know if you have any questions about how they are run or how to reproduce locally.
Flags: needinfo?(ahalberstadt)
(Assignee)

Comment 25

5 years ago
Sure. NI to myself
Flags: needinfo?(cku)
(Reporter)

Comment 26

5 years ago
Un-assigning myself as the ability to run them oop has landed.
Assignee: ahalberstadt → nobody
Status: ASSIGNED → NEW
Looking into the issue for a while,
the type of failed reftests(REFTEST TEST-UNEXPECTED-FAIL) are as below:
1. image difference
   a. timing issue like scrolling
   b. static rendering(most cases)
   c. animation 
2. xul load failed

keep looking into why the most cases failed.
(Assignee)

Updated

5 years ago
Assignee: nobody → alin
Flags: needinfo?(cku)
(Assignee)

Comment 28

5 years ago
Abel,
#2 is not because of diff test fail, you may handle it separately.
For #1, my opinion is to create two diff map images automatically in ref testing while diff detected.
The first diff map contains diff pixels of the first canvas image, while the second one contains diff pixel of the second canvas image.

It does not only save your debugging time, but also help any developers to figure out ref test problem in the future.
(In reply to Abel Lin(alin, abel) from comment #27)
> Looking into the issue for a while,
> the type of failed reftests(REFTEST TEST-UNEXPECTED-FAIL) are as below:
> 1. image difference
>    a. timing issue like scrolling
>    b. static rendering(most cases)
>    c. animation 
> 2. xul load failed
> 
> keep looking into why the most cases failed.

Abel,
Another opinion to debug #1, you got some fail results from border-radius test case.
And the pixels of border-radius test case were generated from canvas and css style under OOP mode.

Please check the pixel result under non-OOP mode to identify the pixel from canvas is wrong or the pixel from css style is wrong under OOP mode.


By the way, please also list some fail test cases in bug. You can attach the detail or using http://www.pastebin.mozilla.org.
(Assignee)

Comment 30

5 years ago
Abel,
Can you still find these error after apply patch in  Bug 916350?
Abel,
In font-inflation test cases, it (layout/reftests/font-inflation/reftest.list) set 3 preference to enable font inflation (https://bugzilla.mozilla.org/show_bug.cgi?id=707855), the preferences are: test-pref(font.size.inflation.emPerLine,15) test-pref(font.size.inflation.forceEnabled,true) test-pref(font.size.inflation.lineThreshold,0)

But the render result looks like no font inflation is enabled. We should check how to set preference correctly in OOP.
Hi vingtetun,

There is a reftest module(layout/reftests/font-inflation) is related to issue(940036) after reftest enabled OOP. The patch(http://hg.mozilla.org/mozilla-central/rev/f33904f2323d) affected the preference settings. Most failure tests(52/81) resulted from preference settings. In non-OOP, reftests are
executed in b2g, so the condition wouldn't hit the patch. However, OOP enabled, reftests are executed in content process, treated as a app, which resulted in failure to enable font inflation.
So I want figure out what the patch fixed.
Do you have any suggestion for this?
Flags: needinfo?(21)
(In reply to Abel Lin(alin, abel) from comment #32)
> Hi vingtetun,
> 
> There is a reftest module(layout/reftests/font-inflation) is related to
> issue(940036) after reftest enabled OOP. The
> patch(http://hg.mozilla.org/mozilla-central/rev/f33904f2323d) affected the
> preference settings. Most failure tests(52/81) resulted from preference
> settings. In non-OOP, reftests are
> executed in b2g, so the condition wouldn't hit the patch. However, OOP
> enabled, reftests are executed in content process, treated as a app, which
> resulted in failure to enable font inflation.
> So I want figure out what the patch fixed.
> Do you have any suggestion for this?

Font inflation has been disabled for mobile websites that use a viewport in bug 706198. 

The patch in http://hg.mozilla.org/mozilla-central/rev/f33904f2323d force a fake viewport for applications that are running on top of B2G. It has been done for backward compatbility purpose, and we hope to be able to remove it in the future (but we need to make sure all apps from the Marketplace are updated to use a viewport before that).

So this code is only useful for backward compatibility on some apps from the marketplace that have been built before the switch to APZC. So imho it won't hurt if this code is hidden behind a pref, or #ifndef TEST or something similar. It may look a bit weird but the only objective for this code in the long term is to be removed.
Flags: needinfo?(21)
Hi, Andrew
I update code on 2/13 and modify the flags as described in https://bugzilla.mozilla.org/show_bug.cgi?id=922680#c5, but it still run reftest in non-oop. It seems bug 960783 https://bugzilla.mozilla.org/show_bug.cgi?id=960783#c29 change the other flags and make reftest not switch to OOP. I'm not sure my understanding is correct or not. Please help to check.

Updated

5 years ago
Depends on: 972697
(Reporter)

Comment 35

5 years ago
Hi Vincent,

Looks like Bill answered you in the other bug.

(In reply to Bill McCloskey (:billm) from comment #33)
> If you want to change b2g reftests so that they run out of process, then you
> should set browser.tabs.remote and browser.tabs.remote.autostart to true.

We can set this pref here:
http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/runreftestb2g.py#417

I'll land it on Cedar.
(Reporter)

Comment 36

5 years ago
Landed: https://hg.mozilla.org/projects/cedar/rev/23fb2aae7aad

I also enabled verbosity which was hiding the message from reftest-content.js indicating whether it was in process or not.
(Assignee)

Updated

5 years ago
No longer depends on: 972697
(Assignee)

Comment 37

5 years ago
Our strategy is to enable passed test case piece by piece here. At the same time, we address and fix bugs of reftest fail, caused by OOP, in Bug 973835, which is a catalog of bugs that we need to fix to enable all the reftest.

After fixing one bug in 973835, we enable the associated reftest modules here(by changing reftest list).
This bug should keep open and upload new white list periodically according to the status of bug 973835, until all the bugs in 973835 fixed.
(In reply to C.J. Ku[:CJKu] from comment #37)
> Our strategy is to enable passed test case piece by piece here. At the same
> time, we address and fix bugs of reftest fail, caused by OOP, in Bug 973835,
> which is a catalog of bugs that we need to fix to enable all the reftest.
> 
> After fixing one bug in 973835, we enable the associated reftest modules
> here(by changing reftest list).
> This bug should keep open and upload new white list periodically according
> to the status of bug 973835, until all the bugs in 973835 fixed.

Bug 773482 seems to do the same thing, but do reftest without OOP enable.
According to bug 973835 (reftests with OOP enabled) and bug 773482 (reftests with OOP disabled) there are many reftests fail and skip in B2G but only few fails/skip in other platforms. Abel and I also found the reftests fails varies when disable Async Pan Zoom in B2G/OOP enable (the fails are not filed to Bugzilla yet).

There should be many different root cause for fail in reftests. I need all your help to improve the fix plan. Here is my assumption, please correct me if I have mistake/misunderstanding:
1. The reftests is used to help us to make sure render result is as design.
2. The TPBL is used to help us narrow down which patch make render wrong.
3. The reftests should all pass no matter we do it in OOP or non-OOP.

Here are two major steps for further improvement:
1. Prevent produce more bugs
2. Fix existing bugs efficiently

To prevent produce more bugs, we'll do:
a. Skip known reftests fails in B2G.
b. Let TPBL works for engineers to make sure each patch does not introduce any side-effect.
c. Discuss with QA team to enable more reftests in different situation like OOP enable/disable or APZ enable/disable.

To fix existing bugs, we'll do:
a. Group potential root cause of reftest fail together, no matter it's OOP or non-OOP, APZ or non-APZ.
b. Need tools like LayerScope to help on reftests
(In reply to Vincent Chen [:vichen] from comment #39)
> To fix existing bugs, we'll do:
> a. Group potential root cause of reftest fail together, no matter it's OOP
> or non-OOP, APZ or non-APZ.

I think going through the existing failures is best done by working through the list of failures and trying to address them.  In other words:

 1. look through a few failures at the beginning of the list (or in various parts of the list) to see if there's a common pattern behind the failures.  (looking in reftest-analyzer is helpful.)

 2. either pick one relatively simple example of that common pattern, or just pick any failure that is one of the simpler failures remaining

 3. debug and fix that failure

Then you see how many tests that fix fixed, and repeat, until most or all of the failures are gone.

> b. Need tools like LayerScope to help on reftests

I'm not sure what LayerScope is, but layout/tools/reftest/reftest-analyzer.xhtml is useful for analyzing reftest logs.
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-8) from comment #40)
> (In reply to Vincent Chen [:vichen] from comment #39)
> > To fix existing bugs, we'll do:
> > a. Group potential root cause of reftest fail together, no matter it's OOP
> > or non-OOP, APZ or non-APZ.
> 
> I think going through the existing failures is best done by working through
> the list of failures and trying to address them.  In other words:
> 
>  1. look through a few failures at the beginning of the list (or in various
> parts of the list) to see if there's a common pattern behind the failures. 
> (looking in reftest-analyzer is helpful.)
> 
>  2. either pick one relatively simple example of that common pattern, or
> just pick any failure that is one of the simpler failures remaining
> 
>  3. debug and fix that failure
> 
> Then you see how many tests that fix fixed, and repeat, until most or all of
> the failures are gone.
> 
> > b. Need tools like LayerScope to help on reftests
> 
> I'm not sure what LayerScope is, but
> layout/tools/reftest/reftest-analyzer.xhtml is useful for analyzing reftest
> logs.

Thank you, dbaron.

Bug 974783 is used to put all fail reftests, one for OOP and one for non-OOP. Will discuss with QA team if we can do reftest for:
1. All green with skip known fail in B2G
2. Current known fail (OOP and non-OOP) for me to check daily status.
We found one group of reftests fail is due to .xul load test (bug 974239). According to bug 782655, we want to enable remote xul test. As remote xul load test will have 15 minutes timeout and increase total reftests test and we suggest to:
1. Add B2G_OOP flag (Bug 974776)
2. Do .xul load test once and skip others (Bug 974780)

Please advise if this is correct.
alah, NI you because bug 782655 assigned to you.
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 44

5 years ago
Don't take bug 782655 to mean we actually care about xul tests. That was a long time ago and I was just trying to get as many running as possible. In fact I'm pretty sure we don't care about them at all on B2G, but you'd probably want to double check with someone like Fabrice or Sicking.
Flags: needinfo?(ahalberstadt)
Fabrice, Sicking
Is it ok if we skip all .xul reftests in B2G for both OOP and non-OOP?
Flags: needinfo?(jonas)
Flags: needinfo?(fabrice)
I'll defer to dbaron. Is there a particular reason you want to do that? I.e. are the XUL tests failing?
Flags: needinfo?(jonas)
(In reply to Vincent Chen [:vichen] from comment #39)
> According to bug 973835 (reftests with OOP enabled) and bug 773482 (reftests
> with OOP disabled) there are many reftests fail and skip in B2G but only few
> fails/skip in other platforms. Abel and I also found the reftests fails
> varies when disable Async Pan Zoom in B2G/OOP enable (the fails are not
> filed to Bugzilla yet).
> 
> There should be many different root cause for fail in reftests. I need all
> your help to improve the fix plan. Here is my assumption, please correct me
> if I have mistake/misunderstanding:
> 1. The reftests is used to help us to make sure render result is as design.
> 2. The TPBL is used to help us narrow down which patch make render wrong.
> 3. The reftests should all pass no matter we do it in OOP or non-OOP.
> 
> Here are two major steps for further improvement:
> 1. Prevent produce more bugs
> 2. Fix existing bugs efficiently
> 
> To prevent produce more bugs, we'll do:
> a. Skip known reftests fails in B2G.
> b. Let TPBL works for engineers to make sure each patch does not introduce
> any side-effect.
> c. Discuss with QA team to enable more reftests in different situation like
> OOP enable/disable or APZ enable/disable.
> 
> To fix existing bugs, we'll do:
> a. Group potential root cause of reftest fail together, no matter it's OOP
> or non-OOP, APZ or non-APZ.
> b. Need tools like LayerScope to help on reftests

Sicking, Fabrice,
Also need your help to comment this. As I'm new at Mozilla and really need 2nd opinion.

From our investigation these days, here is short summary:
1. There are about 7000 reftests
2. About 770-800 is skipped in B2G in current code, but only few is skipped in other platforms. Some of the skip is filed here bug 773482
3. There should be more than 100 reftested fail after enable OOP B2G reftests, these fails are filed here bug 973835.
From 2, 3, we have total 900 reftests fails in B2G, more than 10% of total reftests and much more than other platforms, it looks terrible.

I agree Sicking's comment on https://bugzilla.mozilla.org/show_bug.cgi?id=922680#c16 to disable known fails, in the mean while, we plan to dig into root cause of each reftests fail. Currently, we use bug 973835 to tracking the fix status.
Flags: needinfo?(jonas)
(In reply to Jonas Sicking (:sicking) from comment #46)
> I'll defer to dbaron. Is there a particular reason you want to do that? I.e.
> are the XUL tests failing?

Yes, it's due to XUL reftest fail.

In non-OOP, the fails looks like render wrong http://dxr.mozilla.org/mozilla-central/source/layout/reftests/bugs/reftest.list#155. Many .xul reftests are skipped in B2G, we'll check one by one if we can fix them.

In OOP, the fails looks like no permission and is by design no support now. Bug 782655 and comment #44
Font-inflation reftests fail is followed up here https://bugzilla.mozilla.org/show_bug.cgi?id=972697#c4
So, overall I don't care too much about xul tests on b2g since we don't use xul except for the audio/video controls.

But that's still strange that they fail so much. One reason could be the custom css rules we have to render scrollbars and a few other elements - Can you point me to one expected result and what we actually render?
Flags: needinfo?(fabrice)
(In reply to Fabrice Desré [:fabrice] from comment #50)
> But that's still strange that they fail so much. One reason could be the
> custom css rules we have to render scrollbars and a few other elements - Can
> you point me to one expected result and what we actually render?

Here is xul render wrong: bug 975287 the different part is at scroll bar.
Some fails in OOP, but not XUL, has similar pattern: bug 975254
(Reporter)

Comment 52

5 years ago
(In reply to Vincent Chen [:vichen] from comment #47)
> Sicking, Fabrice,
> Also need your help to comment this. As I'm new at Mozilla and really need
> 2nd opinion.
> 
> From our investigation these days, here is short summary:
> 1. There are about 7000 reftests
> 2. About 770-800 is skipped in B2G in current code, but only few is skipped
> in other platforms. Some of the skip is filed here bug 773482
> 3. There should be more than 100 reftested fail after enable OOP B2G
> reftests, these fails are filed here bug 973835.
> From 2, 3, we have total 900 reftests fails in B2G, more than 10% of total
> reftests and much more than other platforms, it looks terrible.
> 
> I agree Sicking's comment on
> https://bugzilla.mozilla.org/show_bug.cgi?id=922680#c16 to disable known
> fails, in the mean while, we plan to dig into root cause of each reftests
> fail. Currently, we use bug 973835 to tracking the fix status.

Yes the state of b2g reftests is still pretty poor, here's some background on how we got to this point. Basically back in 2012 there was 0 B2G test coverage in tbpl so we were under a lot of pressure to get at least *something* running. Chris Jones ranked each set of reftests in order of importance [1], and I used this list to determine which sets of tests to enable. I didn't even bother trying to enable many of the lower priority ones. I don't have a graphics background, so initially, I just disabled anything that failed. After the harness had stabilized, I tried to get some help looking into all the various failures, but a few disabled reftests were understandably way down low on the priority list at the time. Since then a few more sets of reftests have been enabled, but no one has gone through and looked at all the initially disabled tests. There's a good chance that many of the currently disabled tests would actually pass if we re-enabled them.

Anyway, thanks for looking into all this, and don't be afraid of aggressively re-enabling tests if you have the time. I'm excited that there is some good progress being made on this.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=811779#c5
Note that cjones put the xul tests in the "Just a waste of CPU cycles, please don't run" bucket. I agree we don't need those.
Created attachment 8383391 [details] [diff] [review]
skip-fail-test.diff

ahal,
Here is list which makes reftest/OOP fail based on code 2014/2/27. From my investigation these days:
1. Total about 10500 reftests tests and >4900 tests are tested in B2G, OOP is not enabled.
2. Skip about >200 tests in B2G/OOP (the attached skip list).
3. Two fail types are identified:
- bug 974780: xul load fail
- bug 972697: font-inflation fail
4. The results of reftests in B2G/OOP fail varies among versions, I suggest to skip known fails to help us find side effect. Bug 973835 is used to track fail on B2G/OOP.

We'll keep working on fixing reftests fail to increase test coverage.
Attachment #8383391 - Flags: review?(ahalberstadt)
(Reporter)

Comment 55

5 years ago
Comment on attachment 8383391 [details] [diff] [review]
skip-fail-test.diff

Review of attachment 8383391 [details] [diff] [review]:
-----------------------------------------------------------------

So this passes on try? If so that's awesome! If you want I can also land it on Cedar for you.

::: layout/tools/reftest/runreftestb2g.py
@@ +415,4 @@
>          prefs["browser.firstrun.show.localepicker"] = False
>          prefs["browser.homescreenURL"] = "app://test-container.gaiamobile.org/index.html"
>          prefs["browser.manifestURL"] = "app://test-container.gaiamobile.org/manifest.webapp"
> +        prefs["browser.tabs.remote.autostart"] = True

Is browser.tabs.remote set to True by default? Does it hurt to leave it in anyway to show intent?

@@ +420,5 @@
>          prefs["dom.mozBrowserFramesEnabled"] = True
>          prefs["font.size.inflation.emPerLine"] = 0
>          prefs["font.size.inflation.minTwips"] = 0
>          prefs["network.dns.localDomains"] = "app://test-container.gaiamobile.org"
> +#        prefs["reftest.browser.iframe.enabled"] = False

Just delete this
Attachment #8383391 - Flags: review?(ahalberstadt) → review+
(In reply to Andrew Halberstadt [:ahal] from comment #55)
> Comment on attachment 8383391 [details] [diff] [review]
> skip-fail-test.diff
> 
> Review of attachment 8383391 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> So this passes on try? If so that's awesome! If you want I can also land it
> on Cedar for you.
> 
> ::: layout/tools/reftest/runreftestb2g.py
> @@ +415,4 @@
> >          prefs["browser.firstrun.show.localepicker"] = False
> >          prefs["browser.homescreenURL"] = "app://test-container.gaiamobile.org/index.html"
> >          prefs["browser.manifestURL"] = "app://test-container.gaiamobile.org/manifest.webapp"
> > +        prefs["browser.tabs.remote.autostart"] = True
> 
> Is browser.tabs.remote set to True by default? Does it hurt to leave it in
> anyway to show intent?
> 
> @@ +420,5 @@
> >          prefs["dom.mozBrowserFramesEnabled"] = True
> >          prefs["font.size.inflation.emPerLine"] = 0
> >          prefs["font.size.inflation.minTwips"] = 0
> >          prefs["network.dns.localDomains"] = "app://test-container.gaiamobile.org"
> > +#        prefs["reftest.browser.iframe.enabled"] = False
> 
> Just delete this

Found another reftests fail in try server. I'll check further.

Another problem in try server is Bug 930884 - Intermittent PROCESS-CRASH | Shutdown | application crashed [@ 0x0] [@ egl_window_surface_t::~egl_window_surface_t] [@ GLContextEGL::~GLContextEGL]
Return code: 1. This problem will not happen in my Ubuntu.
(In reply to Vincent Chen [:vichen] from comment #56)
> (In reply to Andrew Halberstadt [:ahal] from comment #55)
> > Comment on attachment 8383391 [details] [diff] [review]
> > skip-fail-test.diff
> > 
> > Review of attachment 8383391 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > So this passes on try? If so that's awesome! If you want I can also land it
> > on Cedar for you.
> > 
> > ::: layout/tools/reftest/runreftestb2g.py
> > @@ +415,4 @@
> > >          prefs["browser.firstrun.show.localepicker"] = False
> > >          prefs["browser.homescreenURL"] = "app://test-container.gaiamobile.org/index.html"
> > >          prefs["browser.manifestURL"] = "app://test-container.gaiamobile.org/manifest.webapp"
> > > +        prefs["browser.tabs.remote.autostart"] = True
> > 
> > Is browser.tabs.remote set to True by default? Does it hurt to leave it in
> > anyway to show intent?
> > 
> > @@ +420,5 @@
> > >          prefs["dom.mozBrowserFramesEnabled"] = True
> > >          prefs["font.size.inflation.emPerLine"] = 0
> > >          prefs["font.size.inflation.minTwips"] = 0
> > >          prefs["network.dns.localDomains"] = "app://test-container.gaiamobile.org"
> > > +#        prefs["reftest.browser.iframe.enabled"] = False
> > 
> > Just delete this
> 
> Found another reftests fail in try server. I'll check further.
> 
> Another problem in try server is Bug 930884 - Intermittent PROCESS-CRASH |
> Shutdown | application crashed [@ 0x0] [@
> egl_window_surface_t::~egl_window_surface_t] [@ GLContextEGL::~GLContextEGL]
> Return code: 1. This problem will not happen in my Ubuntu.

CJ,
The crash do happen on my Ubutu. Need someone to help on Bug 930884.
Flags: needinfo?(cku)
(Assignee)

Comment 58

5 years ago
Bug 930884 assigned to Jerry
Flags: needinfo?(cku)
Created attachment 8385776 [details] [diff] [review]
skip-fail-test-2.diff

1. Pass reftests in try server with only exceptiopn Bug 930884 Intermittent PROCESS-CRASH at the end of reftests.
2. Follow comment #55:
  a. Keep browser.tabs.remote=false (default value).
  b. remove extra comment line
Attachment #8383391 - Attachment is obsolete: true
Attachment #8385776 - Flags: review?(ahalberstadt)
(Reporter)

Comment 60

5 years ago
Comment on attachment 8385776 [details] [diff] [review]
skip-fail-test-2.diff

Review of attachment 8385776 [details] [diff] [review]:
-----------------------------------------------------------------

Is the intent of this patch to enable oop across all branches? If so then you need to set browser.tabs.remote to True. If not then this looks good (though I can't comment on the test enabling/disabling.. if it passes try, wfm).
(Reporter)

Updated

5 years ago
Flags: needinfo?(vichen)
Created attachment 8386501 [details] [diff] [review]
skip-fail-test-3.patch

ahal,
I remove runreftestb2g.py from patch and this patch is used to skip the failed reftests in B2G/OOP based on the version I test and it has passed try server with browser.tabs.remote.autostart=True and reftest.browser.iframe.enabled=true. Supposely, we should pass reftest on OOP or non-OOP or other considion set.

The patch is used to reduce test coverage for engineers to check if new patch introduce new reftests from try server (cedar or others). The fix of reftests fail is on bug 973835. If my understand is correct, the test environment (runreftestb2g.py) will be set by you, so I remove it from patch.
Attachment #8385776 - Attachment is obsolete: true
Attachment #8385776 - Flags: review?(ahalberstadt)
Flags: needinfo?(vichen)

Updated

5 years ago
Attachment #8386501 - Flags: review?(ahalberstadt)
(Assignee)

Updated

5 years ago
Assignee: alin → vichen
(Reporter)

Comment 62

5 years ago
Comment on attachment 8386501 [details] [diff] [review]
skip-fail-test-3.patch

Review of attachment 8386501 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, thanks for clearing that up! Actually, if it passes with iframe.enabled = True, let's land it. One less thing that could potentially break and go unnoticed in the future. Consider this an r+ for your previous patch that you just obsoleted.
Attachment #8386501 - Flags: review?(ahalberstadt) → review+
(Reporter)

Updated

5 years ago
Depends on: 973835
(In reply to Vincent Chen [:vichen] from comment #47)
> From our investigation these days, here is short summary:
> 1. There are about 7000 reftests
> 2. About 770-800 is skipped in B2G in current code, but only few is skipped
> in other platforms. Some of the skip is filed here bug 773482
> 3. There should be more than 100 reftested fail after enable OOP B2G
> reftests, these fails are filed here bug 973835.
> From 2, 3, we have total 900 reftests fails in B2G, more than 10% of total
> reftests and much more than other platforms, it looks terrible.
> 
> I agree Sicking's comment on
> https://bugzilla.mozilla.org/show_bug.cgi?id=922680#c16 to disable known
> fails, in the mean while, we plan to dig into root cause of each reftests
> fail. Currently, we use bug 973835 to tracking the fix status.

It makes me very happy that you think it's terrible that we've disabled over 10% of test. I agree it's terrible! But it's better that we have 90% of the tests running out-of-process on TBPL than 0% of the tests running out-of-process on TBPL which is the current status.

So yeah, let's disable what's needed and then use the tracking bugs that you've linked to to get the remaining tests working. It's much easier to re-enable tests once they are on TBPL anyway. That way you don't have to worry about spending a bunch of work on re-enabling them only to have someone break them again a couple of days later.

So yes, lets disable what's needed and then switch to running reftest out-of-process on TBPL. Then spend time on reenabling as much as we can.
Flags: needinfo?(jonas)
(In reply to Jonas Sicking (:sicking) from comment #63)
> (In reply to Vincent Chen [:vichen] from comment #47)
> > From our investigation these days, here is short summary:
> > 1. There are about 7000 reftests
> > 2. About 770-800 is skipped in B2G in current code, but only few is skipped
> > in other platforms. Some of the skip is filed here bug 773482
> > 3. There should be more than 100 reftested fail after enable OOP B2G
> > reftests, these fails are filed here bug 973835.
> > From 2, 3, we have total 900 reftests fails in B2G, more than 10% of total
> > reftests and much more than other platforms, it looks terrible.
> > 
> > I agree Sicking's comment on
> > https://bugzilla.mozilla.org/show_bug.cgi?id=922680#c16 to disable known
> > fails, in the mean while, we plan to dig into root cause of each reftests
> > fail. Currently, we use bug 973835 to tracking the fix status.
> 
> It makes me very happy that you think it's terrible that we've disabled over
> 10% of test. I agree it's terrible! But it's better that we have 90% of the
> tests running out-of-process on TBPL than 0% of the tests running
> out-of-process on TBPL which is the current status.
> 
> So yeah, let's disable what's needed and then use the tracking bugs that
> you've linked to to get the remaining tests working. It's much easier to
> re-enable tests once they are on TBPL anyway. That way you don't have to
> worry about spending a bunch of work on re-enabling them only to have
> someone break them again a couple of days later.
> 
> So yes, lets disable what's needed and then switch to running reftest
> out-of-process on TBPL. Then spend time on reenabling as much as we can.

Thank you. I have to update some data:
1. The 10% skip is previous report is by text search and due to a) I miss one reftests folder b) Some test are skip by folder, so, the 10% is under-estimate
2. The data from real reftests are:
  a. We have total 10493 tests
  b. 49xx is tested in B2G/InProcess
  c. 47xx is tested in B2G/OOP (about 200 will be skipped after enable OOP)
So, the test coverage is less than 50% now.
B
There is one good news: I have another reftests last week, and found many tests are over-skipped. I'm summarizing the log and reenable these tests this week.
I wouldn't worry too much about the percentage. The first step is to get something enabled on the TBPL front page. Once we have that we should definitely start worrying about getting more (all!) tests turned on.
If a reftest is failing, it should be marked using fails-if(), not skip-if().  skip-if() should be used only for crashes or hangs.  When a reftest is marked using fails-if(), and somebody fixes the problem that is causing it to fail, the harness will report a TEST-UNEXPECTED-PASS that will tell them them to remove the fails-if() marking, which will mean that we'll catch any future regressions of that test.  With skip-if() we never run the test, so we don't know to enable it when somebody fixes it.
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-8) from comment #66)
> If a reftest is failing, it should be marked using fails-if(), not
> skip-if().  skip-if() should be used only for crashes or hangs.  When a
> reftest is marked using fails-if(), and somebody fixes the problem that is
> causing it to fail, the harness will report a TEST-UNEXPECTED-PASS that will
> tell them them to remove the fails-if() marking, which will mean that we'll
> catch any future regressions of that test.  With skip-if() we never run the
> test, so we don't know to enable it when somebody fixes it.

For skip-fail-test-3.patch, 
1. the xul related test still keep skip-if() as we do not support it in B2G/OOP.
2. For the other skip-if(), as there are many skip-if() in current reftests. I'll check them one-by-one and change the failed test to fail-if() in later patch.

dbaron,
Do you agree?
Flags: needinfo?(dbaron)
Yes, that sounds good.  (But I suspect they can be checked faster than one-by-one.)
Flags: needinfo?(dbaron)
(In reply to Andrew Halberstadt [:ahal] from comment #62)
> Comment on attachment 8386501 [details] [diff] [review]
> skip-fail-test-3.patch
> 
> Review of attachment 8386501 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Ok, thanks for clearing that up! Actually, if it passes with iframe.enabled
> = True, let's land it. One less thing that could potentially break and go
> unnoticed in the future. Consider this an r+ for your previous patch that
> you just obsoleted.

ahal,
Please help to land it.
(In reply to Vincent Chen [:vichen] from comment #69)
> (In reply to Andrew Halberstadt [:ahal] from comment #62)
> > Comment on attachment 8386501 [details] [diff] [review]
> > skip-fail-test-3.patch
> > 
> > Review of attachment 8386501 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Ok, thanks for clearing that up! Actually, if it passes with iframe.enabled
> > = True, let's land it. One less thing that could potentially break and go
> > unnoticed in the future. Consider this an r+ for your previous patch that
> > you just obsoleted.
> 
> ahal,
> Please help to land it.

Add NI.
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 71

5 years ago
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-8) from comment #66)
> If a reftest is failing, it should be marked using fails-if(), not
> skip-if().

Agreed. Though many of them were crashes/hangs and most of the rest were intermittents. IMO random-if is the same thing as skip-if except it wastes more machine time.

(In reply to Vincent Chen [:vichen] from comment #69)
> ahal,
> Please help to land it.

Sure. I think you said you pushed it to try, but even if so that was 5 days ago, so here's another run (I'll fix patch authorship before pushing to m-i):
https://tbpl.mozilla.org/?tree=Try&rev=7f2f79b2a38a
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 72

5 years ago
Wait, before I land this.. did you want me to land it on cedar? Or mozilla-inbound?
Flags: needinfo?(vichen)
(In reply to Andrew Halberstadt [:ahal] from comment #72)
> Wait, before I land this.. did you want me to land it on cedar? Or
> mozilla-inbound?

If no other concern, let's land to mozilla-inbound. Two reasons:
1. No code change, just enable reftests. Should be no harm.
2. Need reftests result to do analysis.

edmorley reminds me at Bug 981477 comment #1, to run 20 times of tests to catch intermittent failures.
From my tests these two weeks, the intermittent failures comes from:
1. CSS animation, like scroll bar (Bug 975254)
2. Reftests only capture one page (No bug#)
3. timed out waiting for reftest-wait to be removed (Bug 975911), but in other run of reftests test, the fails case in 975911 changed.

Is any other intermittent we have found but not list above? If yes, please provide me Bug # or some hint.
In the above 3 cases, the cause of fails is not the reftest itself, but the way we do reftest and we should not skip-if() or fails-if() the reftest. 

It would be helpful if we can get reftests result from Cedar or M-I.
Flags: needinfo?(vichen)
(In reply to Vincent Chen [:vichen] from comment #73)
> (In reply to Andrew Halberstadt [:ahal] from comment #72)
> > Wait, before I land this.. did you want me to land it on cedar? Or
> > mozilla-inbound?
> 
> If no other concern, let's land to mozilla-inbound. Two reasons:
> 1. No code change, just enable reftests. Should be no harm.
> 2. Need reftests result to do analysis.
> 
> edmorley reminds me at Bug 981477 comment #1, to run 20 times of tests to
> catch intermittent failures.
> From my tests these two weeks, the intermittent failures comes from:
> 1. CSS animation, like scroll bar (Bug 975254)
> 2. Reftests only capture one page (No bug#)
> 3. timed out waiting for reftest-wait to be removed (Bug 975911), but in
> other run of reftests test, the fails case in 975911 changed.
> 
> Is any other intermittent we have found but not list above? If yes, please
> provide me Bug # or some hint.
> In the above 3 cases, the cause of fails is not the reftest itself, but the
> way we do reftest and we should not skip-if() or fails-if() the reftest. 
> 
> It would be helpful if we can get reftests result from Cedar or M-I.

Bug 982010 is random fail. Different in color.
(Reporter)

Comment 75

5 years ago
(In reply to Vincent Chen [:vichen] from comment #73)
> Is any other intermittent we have found but not list above? If yes, please
> provide me Bug # or some hint.
> In the above 3 cases, the cause of fails is not the reftest itself, but the
> way we do reftest and we should not skip-if() or fails-if() the reftest. 
> 
> It would be helpful if we can get reftests result from Cedar or M-I.

In theory I agree that if it isn't a problem with the reftest itself it shouldn't be skipped. In practice if enabling a test causes intermittent orange, no matter what the cause, we can't land it on mozilla-inbound. Sounds like the tests you want to enable here have some known intermittents and we need to land the fix first (or at the same time) before we can turn them back on.

I'll go ahead and land them on Cedar.. feel free to push potential fixes to Cedar (no review required) and feel free to retrigger tests to get data about them (click on a job and click the blue '+' in the bottom left to retrigger it).

Once we're satisfied that there are no (or very sporadic) intermittents, we can then proceed to land it on mozilla-inbound
(Reporter)

Comment 76

5 years ago
Actually the re-triggers of that last try run were quite green, so just pushed it directly to inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1bcd1fea2e73
(Reporter)

Comment 78

5 years ago
Sigh. Looks like the previous try run didn't have the runreftestb2g.py change in it.
(In reply to Vincent Chen [:vichen] from comment #73)
> edmorley reminds me at Bug 981477 comment #1, to run 20 times of tests to
> catch intermittent failures.
> From my tests these two weeks, the intermittent failures comes from:
> 1. CSS animation, like scroll bar (Bug 975254)
> 2. Reftests only capture one page (No bug#)
> 3. timed out waiting for reftest-wait to be removed (Bug 975911), but in
> other run of reftests test, the fails case in 975911 changed.
> 
> Is any other intermittent we have found but not list above? If yes, please
> provide me Bug # or some hint.
> In the above 3 cases, the cause of fails is not the reftest itself, but the
> way we do reftest and we should not skip-if() or fails-if() the reftest. 

I'm not sure if we're both agreeing here already - so apologies if the following is redundant, but just to be clear:

If landing a change causes us to now have a >5% failure rate, then that change cannot land, even if the failures are due to harness/hardware/whatever & not directly the test(s) being enabled. 

Does the failure rate increase with the landings being discussed in this bug, and if so, from what and to what?

Thanks :-)
(Reporter)

Comment 80

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/730ba60ac7ed

Landed again with the runreftestb2g.py changes removed. This time it's actually the same as the green try run above.
(In reply to Andrew Halberstadt [:ahal] from comment #80)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/730ba60ac7ed
> 
> Landed again with the runreftestb2g.py changes removed. This time it's
> actually the same as the green try run above.

ahal,
Would you please modify runreftestb2g.py to enable OOP and test on try server with both emulator (ics) and emulator-JB? I found we have fixed emulator crash egl_window_surface_t::~egl_window_surface_t (Bug 930884) at Bug 776742.
Flags: needinfo?(ahalberstadt)
BTW, keep in mind that your commit message should be stating what the patch is doing, i.e. "Annotate reftests that fail OOP on B2G". As landed, it's very unclear.
https://developer.mozilla.org/en-US/docs/Developer_Guide/Committing_Rules_and_Responsibilities#Checkin_comment
(Reporter)

Comment 83

5 years ago
(In reply to Vincent Chen [:vichen] from comment #81)
> (In reply to Andrew Halberstadt [:ahal] from comment #80)
> > https://hg.mozilla.org/integration/mozilla-inbound/rev/730ba60ac7ed
> > 
> > Landed again with the runreftestb2g.py changes removed. This time it's
> > actually the same as the green try run above.
> 
> ahal,
> Would you please modify runreftestb2g.py to enable OOP and test on try
> server with both emulator (ics) and emulator-JB? I found we have fixed
> emulator crash egl_window_surface_t::~egl_window_surface_t (Bug 930884) at
> Bug 776742.

Vincent, reftests are already running oop on Cedar. Do you have commit access? You can also push to Cedar without my permission if you need to :)
Flags: needinfo?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #83)
> (In reply to Vincent Chen [:vichen] from comment #81)
> > (In reply to Andrew Halberstadt [:ahal] from comment #80)
> > > https://hg.mozilla.org/integration/mozilla-inbound/rev/730ba60ac7ed
> > > 
> > > Landed again with the runreftestb2g.py changes removed. This time it's
> > > actually the same as the green try run above.
> > 
> > ahal,
> > Would you please modify runreftestb2g.py to enable OOP and test on try
> > server with both emulator (ics) and emulator-JB? I found we have fixed
> > emulator crash egl_window_surface_t::~egl_window_surface_t (Bug 930884) at
> > Bug 776742.
> 
> Vincent, reftests are already running oop on Cedar. Do you have commit
> access? You can also push to Cedar without my permission if you need to :)

ahal,
It seems we only enable reftests on emulator (ICS) but not on JB. Can we enable both? I have some progress on the crash and need to verify it in try server and it's related to platform we port. This is what I test in try server: https://tbpl.mozilla.org/?tree=Try&rev=bfe897476327

Updated

5 years ago
Duplicate of this bug: 974780
(Reporter)

Comment 87

5 years ago
We don't run any tests on emulator-jb at the moment, and there aren't really any plans to do that. We'll probably be skipping ahead to emulator-kk. So you shouldn't need to worry about it.
ahal,
Do you know how to specify symbolsPath from ./mach reftest-remote ?
It seems we miss --symbols-path parameter.

Updated

5 years ago
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 89

5 years ago
It should get passed in automatically if you have symbols built. Just run "make -C objdir-gecko buildsymbols" first. (Side note: I'm trying to get the gecko |mach buildsymbols| commmand available for B2G).
Flags: needinfo?(ahalberstadt)
Created attachment 8396905 [details] [diff] [review]
enable_oop.patch

The biggest problem to enable B2G/OOP reftest is crash after reftests is done. Bug 983489 and Bug 988132. The potential reason for Bug 983489 should be the sequence of cleanup resource. This patch is a work-around to prevent the crash by not ask Chrome process quit after reftests is done.
Suggest use this work-around to enable B2G/OOP reftest first.
Flags: needinfo?(cku)
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 91

5 years ago
As much as I don't like to, I agree that we should land a work around rather than block on the shutdown crashes. B2G mochitests suffer from the same problem and they have been enabled all along.

We can't land your patch as is though because it affects all platforms. You'd need an "#ifdef REFTEST_B2G" around it.
Flags: needinfo?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #89)
> It should get passed in automatically if you have symbols built. Just run
> "make -C objdir-gecko buildsymbols" first. (Side note: I'm trying to get the
> gecko |mach buildsymbols| commmand available for B2G).

FYI "./build.sh buildsymbols" should work.
(In reply to Andrew Halberstadt [:ahal] from comment #91)
> As much as I don't like to, I agree that we should land a work around rather
> than block on the shutdown crashes. B2G mochitests suffer from the same
> problem and they have been enabled all along.
> 
> We can't land your patch as is though because it affects all platforms.
> You'd need an "#ifdef REFTEST_B2G" around it.

ahal,
I should have found root cause of the crash, I'm testing the patch on try server: https://tbpl.mozilla.org/?tree=Try&rev=b527e30e4264

There are two problems:
1. Missing /home/cltbld/talos-slave/test/build/tests/config/mozharness/b2g_emulator_config.py I encounter this problem before (maybe another .py missed), I have to repush patch again.
2. No reftest on debug emulator (to verify Bug 987251).
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 94

4 years ago
The first problem is just because you pushed to try from an old <rev> of m-c. Do a 'pull -u' and push again and it will work (see my post on dev.platform for more details).

There are reftests on debug emulators on Cedar, but they might be busted for completely different reasons than here. If you want you can try landing your changes there, but I would recommend just ignoring them for now and focusing on getting opt OOP. Once that is done, if debug reftests are the next priority we can shift focus to those. I think trying to do both at the same time is too confusing.
Flags: needinfo?(ahalberstadt)
(Assignee)

Comment 95

4 years ago
I think Andrew had answer your question. Instead of blocking ref testing, workaround shutdown crash and file bugs keep tracing/fix those problem is better approach.
Flags: needinfo?(cku)
Created attachment 8403725 [details] [diff] [review]
922680-enable-oop.patch

Updated

4 years ago
Depends on: 981477

Updated

4 years ago
Depends on: 983489
Comment on attachment 8403725 [details] [diff] [review]
922680-enable-oop.patch

Review of attachment 8403725 [details] [diff] [review]:
-----------------------------------------------------------------

To enable B2G/OOP reftest, need to fix 4 bugs:
1. Bug 922680, just enable flag in this patch.
2. Bug 983489, crash when release texture. Fixed, under review.
3. Bug 981477, skip known fail reftests. We plan to enable some, but dbaron did this on Bug 986409, this patch is to skip the fails under B2G/OOP based on dbaron's result.
4. Bug 988132, crash when release NetdClient. This bug is not fixed complete, but the patch reduce the possibility. Network team is working on it.
https://tbpl.mozilla.org/?tree=Try&rev=6b5abea83eed
The only fail is R15 has two fail:
1. Unexpected-pass
2. Crash, it's Bug 988132.
Attachment #8403725 - Flags: review?(ahalberstadt)
ahal,

INFO -  WARNING | leakcheck | refcount logging is off, so leaks can't be detected!

As we have this message at the end of reftest, it seems we have leak detection while doing reftest. Do we have plan to enable it?
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 99

4 years ago
Comment on attachment 8403725 [details] [diff] [review]
922680-enable-oop.patch

Review of attachment 8403725 [details] [diff] [review]:
-----------------------------------------------------------------

Lgtm. I guess we are still blocked on the other three bugs you listed though?
Attachment #8403725 - Flags: review?(ahalberstadt) → review+
(Reporter)

Comment 100

4 years ago
(In reply to Vincent Chen [:vichen] from comment #98)
> ahal,
> 
> INFO -  WARNING | leakcheck | refcount logging is off, so leaks can't be
> detected!
> 
> As we have this message at the end of reftest, it seems we have leak
> detection while doing reftest. Do we have plan to enable it?

I don't know if there are plans to, but I think it would be a good thing to do.. Might be worth looking into after this is all wrapped up.
Flags: needinfo?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #99)
> Comment on attachment 8403725 [details] [diff] [review]
> 922680-enable-oop.patch
> 
> Review of attachment 8403725 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Lgtm. I guess we are still blocked on the other three bugs you listed though?

Yes, still wait for the other 3 bug fixed. As we have fixed 2 of them and I have request for review, and the last is Bug 988132 which is related to network. Network team is fixing it, I've a discussion with Network team if we cannot fix it soon, I'll suggest a work-around to disable this service (USB tethering) when doing reftest.

While doing reftests on local, I cannot test all 10485k tests and it crashed at middle. The crash does not happen in TBPB as we divided reftests to 15 groups and it is not problem of the last reftest when crash happen. It should be accumulated error and make some reftests crash, leak is one of the potential cause.

dbaron,
What's your suggestion if I want to check the leak when doing reftests?
Flags: needinfo?(dbaron)
(In reply to Vincent Chen [:vichen] from comment #101)
> While doing reftests on local, I cannot test all 10485k tests and it crashed
> at middle. The crash does not happen in TBPB as we divided reftests to 15
> groups and it is not problem of the last reftest when crash happen. It
> should be accumulated error and make some reftests crash, leak is one of the
> potential cause.
> 
> dbaron,
> What's your suggestion if I want to check the leak when doing reftests?

I think the first thing to do is try to figure out whether the problem is a leak, rather than debug a leak that you're not sure exists.  Does "MOZ_IGNORE_NUWA_PROCESS=1 ./tools/get_about_memory.py" (that script is in a B2G checkout) work?  I know that script works on devices; I'm not sure how to use it on the emulator.

You should also beable to run the 15 chunks locally just as they run on TBPL, though I'm not sure what the syntax is.

(Is this bug about emulator reftests, B2G desktop reftests, or both?)
Flags: needinfo?(dbaron)
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-7) from comment #102)
> (In reply to Vincent Chen [:vichen] from comment #101)
> > While doing reftests on local, I cannot test all 10485k tests and it crashed
> > at middle. The crash does not happen in TBPB as we divided reftests to 15
> > groups and it is not problem of the last reftest when crash happen. It
> > should be accumulated error and make some reftests crash, leak is one of the
> > potential cause.
> > 
> > dbaron,
> > What's your suggestion if I want to check the leak when doing reftests?
> 
> I think the first thing to do is try to figure out whether the problem is a
> leak, rather than debug a leak that you're not sure exists.  Does
> "MOZ_IGNORE_NUWA_PROCESS=1 ./tools/get_about_memory.py" (that script is in a
> B2G checkout) work?  I know that script works on devices; I'm not sure how
> to use it on the emulator.
> 
> You should also beable to run the 15 chunks locally just as they run on
> TBPL, though I'm not sure what the syntax is.
> 
> (Is this bug about emulator reftests, B2G desktop reftests, or both?)

The crash on my local is not related to leak, it's seccomp sandbox violation. I only test on emulator.
Record here: Bug 981949 Comment 8
Bug 988100 has similar symptom.

Updated

4 years ago
Keywords: checkin-needed

Updated

4 years ago
Attachment #8396905 - Attachment is obsolete: true
Comment on attachment 8386501 [details] [diff] [review]
skip-fail-test-3.patch

Bug 981477 provide an updated list, ignore this one.
Attachment #8386501 - Attachment is obsolete: true
I don't see any recent Try runs here. Please provide a link to one before requesting checkin.
Keywords: checkin-needed
ahal,

This is try result on 4/8, no reftest fails (only one crash by Network module and is fixed now)
https://tbpl.mozilla.org/?tree=Try&rev=6b5abea83eed

This is try result on 4/18
https://tbpl.mozilla.org/?tree=Try&rev=489c14c8e04d

All I change is enable OOP and fix crash after reftests is complete, it means some patch between 4/8 to 4/18 introduce the reftests (>200) fail.

I can skip the failed reftests for enabling OOP, but if we have any method to check which patch make the reftests fail?
Flags: needinfo?(ahalberstadt)
(In reply to Vincent Chen [:vichen] from comment #106)
> ahal,
> 
> This is try result on 4/8, no reftest fails (only one crash by Network
> module and is fixed now)
> https://tbpl.mozilla.org/?tree=Try&rev=6b5abea83eed
> 
> This is try result on 4/18
> https://tbpl.mozilla.org/?tree=Try&rev=489c14c8e04d
> 
> All I change is enable OOP and fix crash after reftests is complete, it
> means some patch between 4/8 to 4/18 introduce the reftests (>200) fail.
> 
> I can skip the failed reftests for enabling OOP, but if we have any method
> to check which patch make the reftests fail?

Too many fails, will check why then update new patch.
(Reporter)

Comment 108

4 years ago
Ouch :(

I think manually bisecting the range is the only option here. Either by running the tests locally (if you can reproduce it locally) or pushing to try based off an old tip.

If you can reproduce against a downloaded nightly, you can use a tool called mozregression to help you out:
https://github.com/mozilla/mozregression

The downside to this is the mach targets won't work with a downloaded nightly.
Flags: needinfo?(ahalberstadt)
(Reporter)

Comment 109

4 years ago
Oh sorry, ignore my mozregression comment, looks like it doesn't work with b2g yet.
(Reporter)

Comment 110

4 years ago
Here's a try push with your patches based on a changeset from Apr 11th (the actual regression range based on the parents of your previous try pushes is Apr 7th - Apr 16th). Note that the nsIAppStartup crash patch you had pushed was empty.

How did I do this?

1) get the pushlog date range of the regression (use the 'parent' changeset of each try push to get the fromchange and tochange values):
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5405d6f4e3c6&tochange=b735e618c2a8

Note that you can't use the patch dates as these are not chronological.

2) pick a changeset roughly halfway between, let's call it <rev>

3) in your mozilla-central clone:
$ hg qpop -a
$ hg update -C <rev>
$ hg qpush <patches>
$ hg push -f try

4) Wait for results, if they pass, edit the 'fromchange' of the above url to be <rev>. If they fail, edit the 'tochange'.

5) go to step 2 until regression range is small enough.
(Reporter)

Comment 111

4 years ago
Also, please update your try syntax to be:
try: -b o -p emulator -u reftest -t none

When you set everything to all, you waste a lot of machine time testing things you don't care about which costs mozilla quite a bit of money. For the future, you can use this site to figure out what syntax you should use based on what you need to test:
http://trychooser.pub.build.mozilla.org/
(Reporter)

Comment 112

4 years ago
Forgot to paste the aforementioned try push (sorry for bug spam):
https://tbpl.mozilla.org/?tree=Try&rev=37b4a8c66a7f
(Assignee)

Updated

4 years ago
Status: NEW → ASSIGNED

Updated

4 years ago
Depends on: 1000722
https://tbpl.mozilla.org/?tree=Try&rev=c88c51437df2

Bug 983489 is fixed.
Bug 981477 mark reftests fail during 4/8-4/18, the failed reftests is recorded in Bug 1000722.
Created attachment 8419928 [details] [diff] [review]
922680-skip-quit.patch

CJ,
As our discussion, I have no confidence to fix Bug 983489 soon. To enable B2G/OOP reftest we can apply this patch as a work-around solution. Please help on this.
Flags: needinfo?(cku)

Updated

4 years ago
Assignee: vichen → cku
(In reply to Andrew Halberstadt [:ahal] from comment #89)
> It should get passed in automatically if you have symbols built. Just run
> "make -C objdir-gecko buildsymbols" first. (Side note: I'm trying to get the
> gecko |mach buildsymbols| commmand available for B2G).

File to Bug 1010726

Updated

4 years ago
Attachment #8419928 - Attachment is obsolete: true
(Reporter)

Comment 119

4 years ago
The crashtest is my fault for forgetting to include it in try. The reftest perma-fail is related to the bitrot I dealt with post try-run. I think new tests were added either today or yesterday that are similar to other ones this patch was skipping.
(Reporter)

Updated

4 years ago
Depends on: 1014192
(Reporter)

Comment 120

4 years ago
Created attachment 8427005 [details] [diff] [review]
Enable oop on b2g emulator reftests

This uses the new command line option from bug 1014192 to enable oop on reftests only.
Attachment #8403725 - Attachment is obsolete: true
Attachment #8427005 - Flags: review?(jgriffin)
Comment on attachment 8427005 [details] [diff] [review]
Enable oop on b2g emulator reftests

Review of attachment 8427005 [details] [diff] [review]:
-----------------------------------------------------------------

This forces emulator reftests to be OOP if you use the mach command; I guess this is what we want.
Attachment #8427005 - Flags: review?(jgriffin) → review+
(Reporter)

Comment 122

4 years ago
I could add an argument to disable it, but I'm not sure why anyone would want to do that.
(In reply to Andrew Halberstadt [:ahal] from comment #122)
> I could add an argument to disable it, but I'm not sure why anyone would
> want to do that.

There's probably no need, at least as soon as TBPL is running them OOP.
No longer depends on: 1015200
(Assignee)

Updated

4 years ago
Flags: needinfo?(cku)
https://hg.mozilla.org/mozilla-central/rev/6ae6efd7f5e6
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Blocks: 1025961
You need to log in before you can comment on or make changes to this bug.