Closed Bug 914584 Opened 11 years ago Closed 8 years ago

[meta] Get tests passing on OOP B2G Desktop

Categories

(Firefox OS Graveyard :: GonkIntegration, defect)

All
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: fabrice, Unassigned)

References

Details

(Whiteboard: [systemsfe])

Attachments

(2 files, 4 obsolete files)

Attached patch linux-oop.patch (obsolete) — Splinter Review
We can have nice things on Gonk and Linux desktop!
Attachment #802240 - Flags: review?(jst)
Comment on attachment 802240 [details] [diff] [review]
linux-oop.patch

Yup, we need to do this, having a mode where we don't use OOP is not at all useful for us...
Attachment #802240 - Flags: review?(jst) → review+
Blocks: 914843
Gregor, can you check that these prefs work on mac?
Assignee: nobody → fabrice
Attachment #802240 - Attachment is obsolete: true
Attachment #825014 - Flags: review?(anygregor)
No longer blocks: 914843
Depends on: 914843
Depends on: 891882
Comment on attachment 825014 [details] [diff] [review]
enable oop on desktop for mac and linux

Review of attachment 825014 [details] [diff] [review]:
-----------------------------------------------------------------

\o/ works for me!
Attachment #825014 - Flags: review?(anygregor) → review+
Is this ready to land? Anything I can do to help it along, if not? It will soon (not quite yet) be blocking me landing a bunch of stuff.
Flags: needinfo?(fabrice)
Depends on: 935672
(In reply to Nick Cameron [:nrc] from comment #5)
> Is this ready to land? Anything I can do to help it along, if not? It will
> soon (not quite yet) be blocking me landing a bunch of stuff.

We are also blocked on bug 891882.
Flags: needinfo?(fabrice)
Blocks: 924403
Pushed to try, will land to unblock gfx people if all is well (the situation will not be worse than non-oop wrt to touch events):

https://tbpl.mozilla.org/?tree=Try&rev=4ebdcadefd81
(In reply to Fabrice Desré [:fabrice] from comment #7)
> Pushed to try, will land to unblock gfx people if all is well (the situation
> will not be worse than non-oop wrt to touch events):
> 
> https://tbpl.mozilla.org/?tree=Try&rev=4ebdcadefd81

The build red looks to be transient, but the Gu red looks real and I guess related to this. I have no idea how to interpret the logs.
(In reply to Fabrice Desré [:fabrice] from comment #7)
> Pushed to try, will land to unblock gfx people if all is well (the situation
> will not be worse than non-oop wrt to touch events):
> 
> https://tbpl.mozilla.org/?tree=Try&rev=4ebdcadefd81

Is the red on Try a concern? Or do you think there is an easy fix? Do you need any help from gfx people?
Flags: needinfo?(fabrice)
We need to fix the Gu failures, yes. If someone has time to help investigate that would be great!
Flags: needinfo?(fabrice)
Milan is there anyone who knows things about b2g desktop who could take a look at this? It is blocking removing MT OpenGL and all the things that that blocks. If not, then I will investigate next week.
Flags: needinfo?(milan)
@fabrice: https://github.com/mozilla-b2g/gaia/pull/13941 <- this should also work as a hacky way to test (with existing builds) correct?
@jgriffin: Is there someone you can spare to help us debug/fix?

--- 

Started looking into this:

Seems like there are a few bugs I found (either in the test or elsewhere) when using:

# touch events

(no error messages yet)


# app transitions (like activities)
  b2g-desktop stdout +1ms 1385108586446	Marionette	DEBUG	Got request: switchToFrame, data: {"name":"switchToFrame","parameters":{"element":"{397b3276-f248-2c45-8bf0-ed1e7b8e0d0d}"},"to":"0","session":"6-b2g"}, id: {f41ec015-d974-b749-ab20-4e952a70579a}

  b2g-desktop stdout +1ms 1385108586447	Marionette	INFO	Switched to frame: {"frameValue":"{397b3276-f248-2c45-8bf0-ed1e7b8e0d0d}"}

  b2g-desktop stdout +0ms 1385108586447	Marionette	INFO	frame-manager load script: [object ChromeMessageSender]

  b2g-desktop stderr +1s [Parent 27988] WARNING: waitpid failed pid:27991 errno:10: file /builds/slave/try-osx64_g-000000000000000000/build/ipc/chromium/src/base/process_util_posix.cc, line 254
[Parent 27988] WARNING: Failed to deliver SIGKILL to 27991!(3).: file /builds/slave/try-osx64_g-000000000000000000/build/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc, line 118

^^^ this should not hang but we never get a response back from marionette


# window.close:

We send:

  b2g-desktop stdout +1ms 1385108354751	Marionette	DEBUG	Got request: execute, data: {"name":"executeScript","parameters":{"script":"return (function () { window.close(); }.apply(this, arguments));","args":[]},"to":"0","session":"6-b2g"}, id: {c205dbca-5727-344a-b11c-f0d79a45c062}

And marionette never responds (hangs forever) which means window.close is busted or...

---

I hope this does not have to be said but we can't break a large majority of the tests we have in gaia this far into 1.3. We need to dig into some of these more and (hopefully) we only have to xfail some portion of them.
Flags: needinfo?(jgriffin)
Flags: needinfo?(fabrice)
Try push with just OMTC enabled: https://tbpl.mozilla.org/?tree=Try&rev=8a1c8852c47e

Should be interesting to see if it is this or the other prefs which causes the problems. And this is the only bit that blocks my work :-)
Flags: needinfo?(milan)
The app transitions error looks like a child process crash; it doesn't look at first glance like it's caused by Marionette, although it may be exposed by it.

How can we reproduce the window.close() error...is there a specific test which causes this?
Flags: needinfo?(jgriffin)
(In reply to Nick Cameron [:nrc] from comment #13)
> Try push with just OMTC enabled:
> https://tbpl.mozilla.org/?tree=Try&rev=8a1c8852c47e
> 
> Should be interesting to see if it is this or the other prefs which causes
> the problems. And this is the only bit that blocks my work :-)

That looks pretty bad, but it is because I pushed my patch on top of inbound, not m-c, and got a bad pull. Interestingly, Linux b2g desktop Gu is green, so I think it is OK to land this and that the OMTC pref is not the culprit for the above errors. I'll do another try push to confirm...
Try push looks good (https://tbpl.mozilla.org/?tree=Try&rev=72b8efac2ee7).

This patch just enables OMTC and ignores the other prefs for now. I would like to land this asap since it doesn't interfere with debugging the other issues.
Attachment #8337294 - Flags: review?(fabrice)
Comment on attachment 8337294 [details] [diff] [review]
Turn on OMTC only for b2g desktop

Moving to bug 936339
Attachment #8337294 - Attachment is obsolete: true
Attachment #8337294 - Flags: review?(fabrice)
No longer blocks: 924403
https://hg.mozilla.org/mozilla-central/rev/b1dd4d55f3f0
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
That landed with the wrong bug number (should have been bug 936339). Reopening.
Status: RESOLVED → REOPENED
Flags: needinfo?(fabrice)
Resolution: FIXED → ---
Attached patch enable-desktop-oop.patch (obsolete) — Splinter Review
Rebase, carrying r+
Attachment #825014 - Attachment is obsolete: true
Attachment #8342519 - Flags: review+
Whats the status here?
(In reply to Gregor Wagner [:gwagner] from comment #22)
> Whats the status here?

Waiting on bug 891882 to get working touch events.
(In reply to Fabrice Desré [:fabrice] from comment #23)
> (In reply to Gregor Wagner [:gwagner] from comment #22)
> > Whats the status here?
> 
> Waiting on bug 891882 to get working touch events.

Are we good to land this now since the dependency has landed?
Flags: needinfo?(fabrice)
(In reply to Jason Smith [:jsmith] from comment #24)
> (In reply to Fabrice Desré [:fabrice] from comment #23)
> > (In reply to Gregor Wagner [:gwagner] from comment #22)
> > > Whats the status here?
> > 
> > Waiting on bug 891882 to get working touch events.
> 
> Are we good to land this now since the dependency has landed?

Yep, I'm doing a sanity check on a local build and then I'll land.
Flags: needinfo?(fabrice)
(In reply to Fabrice Desré [:fabrice] from comment #26)
> Looks fine locally.
> 
> https://hg.mozilla.org/integration/b2g-inbound/rev/5ed4a1817540

Backed out for turning Gu tests permanently orange in https://hg.mozilla.org/integration/b2g-inbound/rev/60da449145be
Depends on: 973107
Blocks: 971135
No longer blocks: 971135
Blocks: 971135
No longer blocks: 971135
Blocks: 971135
No longer blocks: 971135
Assignee: fabrice → kyle
Turning on OOP and running a try for linux/linux-64 b2g desktop with all b2g tests on just to see what happens.

https://tbpl.mozilla.org/?tree=Try&rev=a5711ebea019
Ok apparently try didn't parse that last one right so here's another run

https://tbpl.mozilla.org/?tree=Try&rev=3b2a765634be
Looks like network usage and cost control unit tests fall over. Those are basically the same thing so I'm going to guess there's a shared mock that's not working OOP or something.
I tested this manually by downloading the build and running it against a Gaia built profile. When the keyboard is opened the whole screen goes black, covering everything including the status bar.

The UI tests don't seem to be affected by this so we should be wary of false positives on the Gi and Gu suites for these Try runs.

Shout out if I can be of more help testing locally. I can do a "Travis-like" try run too.
blocking-b2g: --- → backlog
Whiteboard: [systemsfe]
(In reply to Zac C (:zac) from comment #32)
> The UI tests don't seem to be affected by this so we should be wary of false
> positives on the Gi and Gu suites for these Try runs.

Huh, weird, my local build ran fine. I'll check out what tbpl generated. Just curious, what distro/version are you on?
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #33)
> (In reply to Zac C (:zac) from comment #32)
> > The UI tests don't seem to be affected by this so we should be wary of false
> > positives on the Gi and Gu suites for these Try runs.
> 
> Huh, weird, my local build ran fine. I'll check out what tbpl generated.
> Just curious, what distro/version are you on?

Ubuntu 13.04. Tried comparing your local build to the TBPL one, perhaps some difference in the Gaia profile? However I did use try the profile that comes with the TBPL build and also a self-build profile the same as which Travis uses (as I was concerned about unexpected Travis bustage when this goes in).
Integration tests all seem to run fine, so may just be dealing with unit tests and possible graphics bug issues.
Reran try. Down to 1 error in clock's onblocking event when trying to delete the db.

https://tbpl.mozilla.org/?tree=Try&rev=64723e712869
Hmm, might want to use -u all in the try syntax; the current run doesn't seem to be picking up gaia-ui-tests correctly (which is probably a bug in infrastructure somewhere).
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #38)
> Reran try. Down to 1 error in clock's onblocking event when trying to delete
> the db.

Have you fixed issues in Gaia tests or was it issues in your patch ?

> 
> https://tbpl.mozilla.org/?tree=Try&rev=64723e712869
To help here, I made a patch in bug 980251 to make it a lot easier to run tests in B2G Desktop.

After applying the patch, you can do:
* run "bin/gaia-test -d" ("d" is for "Desktop", ie "B2G Desktop")
* run "make test-agent-test" (with an optional "APP=<appname>" if you want it)
Kyle, with your Try build again it's still coming up with a black box covering the whole window when you open the keyboard. It definitely does not do that on a nightly releng build for example as I compared them back to back with even the same Gaia profile.

Is there anything else I can try/debug for you?
(In reply to Zac C (:zac) from comment #42)
> 
> Is there anything else I can try/debug for you?

Thanks for covering that Zac, I've actually seen that too while playing with the OOP build locally now. If you could file and follow up with gfx (and make that bug block this one) that'd be awesome.
(In reply to Julien Wajsberg [:julienw] from comment #40)
> (In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #38)
> > Reran try. Down to 1 error in clock's onblocking event when trying to delete
> > the db.
> 
> Have you fixed issues in Gaia tests or was it issues in your patch ?
> 
> > 
> > https://tbpl.mozilla.org/?tree=Try&rev=64723e712869

I did nothing new when running that test. Just rebased the patch on top of b2g-i and reran the try to see if anything changed. Obviously it did. Gonna try again now and see what happens, because that many tests turning into heisenbugs would just be nuts.
Gu is available for that build, it's just hidden (append &showall=1). It looks like b2g crashed which is why it's hidden and not related to this patch.
(In reply to Zac C (:zac) from comment #45)
> Gu is available for that build, it's just hidden (append &showall=1). It
> looks like b2g crashed which is why it's hidden and not related to this
> patch.

Actually looking again the crash could be related; there is a pattern; all tests that use the Messages app failed to load. Let me try this manually.

Link to the report:
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/bcfdc8fc3d5e6b961e33f28bdaadf06d68493a15ce93a39d884a934a3ca8f5d9836303c01c90290f265b4f1663e5943e90efde74ff1267e3b977075202af9090
Yep just verified locally that the Messages app fails to launch in this Try build. It starts to open but then crashes instantly.

The other failure (test_system_message) is checking that a system message can open the Test Container app but this can be replicated manually just by clicking the app's icon. The symptoms are the same as Messages app.
Depends on: 980462
(In reply to Zac C (:zac) from comment #47)
> Yep just verified locally that the Messages app fails to launch in this Try
> build. It starts to open but then crashes instantly.

Ok, I just verified this too. Now I'm really confused as to why the sms app integration tests I ran didn't flag this.
qdot, in Gu you can ignore the Settings is undefined failures, they were fixed when the tree was reopened yesterday.

For the other failures it looks like the Messages app is crashing on launch (same as c#47).
Component: General → GonkIntegration
I'm going on vacation and figured I should list out the errors I'm still seeing locally. Thus all of the new dependencies for this bug. The major problem here is that I'm not sure who else will be able to replicate these, as I can't replicate them on tbpl (I get /maybe/ 1-2 unit test errors per run there), and I haven't taken the time to set up my own travis to run OOP there yet. I may try that when I get back, or if someone else wants to take that on, be my guest.
https://tbpl.mozilla.org/?tree=Try&rev=02d5c81b4d91&showall=1

Unit tests pass, UI Tests have lots of timeouts though. Whether those are intermittents, I'm not sure sure.
Kyle, FYI, I fixed all unit test failures when running in the current B2G Desktop. Last unlanded is bug 993961. Should make things easier for you.
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #52)
> https://tbpl.mozilla.org/?tree=Try&rev=02d5c81b4d91&showall=1
> UI Tests have lots of timeouts though. Whether those are
> intermittents, I'm not sure sure.

These look like new failures compared to B2g-i at the moment. There aren't actually many known intermittents at the moment
Unfortunately the html report is broken (https://bugzilla.mozilla.org/show_bug.cgi?id=986032) so it's a bit trickier to diagnose this locally.
However it definitely looks like the Messages/SMS app is still failing to load and that would account for about 4 failures.
One appears to be a new one around touch events but we know about that on device, too ( https://bugzilla.mozilla.org/show_bug.cgi?id=995989 )

The remaining two I am not sure of, they'd need some local investigation.
Aw hell. I totally flaked on applying the messages patch from bug 962988 before that try. I'll run it again now with that to see what it changes.
Agh. Sorry for the wrong link... Any idea why it ran all of our tests 5-6 times?

Looks like the wallpaper test (bug 987400) is definitely still an issue.
I asked the sheriffs to get involved when I was trying to find out why the original link didn't work, they might have retriggered the tests.

On Gu it looks like the SMS patch had no effect.

This test:
test_calendar_flick_through_months we know to be failing on device because of a Marionette problem (bug 995989) with touch events. But it is also passing on TBPL on non-OOP builds which is why it is enabled here. I'm going to ni? mdas about whether this bug is sensitive to OOP/non-OOP and then we can decide if it's valid to disable it or not.
Flags: needinfo?(mdas)
(In reply to Zac C (:zac) from comment #59)
> I asked the sheriffs to get involved when I was trying to find out why the
> original link didn't work, they might have retriggered the tests.

Ah, sorry about that, typo'd while pasting. :/

> On Gu it looks like the SMS patch had no effect.

Well, it did have an effect. SMS does at least come up without crashing now. You just can't actually /do/ anything in it. I built it locally with the patch applied, and none of the buttons in the app work due to the fact that we have no mozMobileMessaging object. 

I'm wondering if we can turn SMS ui tests off for OOP at the moment?
I thought they were already disabled on TBPL for B2G Desktop for the same reason. I think they are enabled for the emulator, and even then not all of them because some of them need to actually send SMS.
(In reply to Kyle Machulis [:kmachulis] [:qdot] from comment #60)
> 
> Well, it did have an effect. SMS does at least come up without crashing now.
> You just can't actually /do/ anything in it. I built it locally with the
> patch applied, and none of the buttons in the app work due to the fact that
> we have no mozMobileMessaging object. 
> 
> I'm wondering if we can turn SMS ui tests off for OOP at the moment?

Progress!

Some SMS app tests that don't send SMS are enabled on TBPL and Travis. I don't think we should just disable the tests; desktopb2g underpins Travis and TBPL so it's not something we can just regress a whole app's functionality and say we'll deal with it later. 

There are also more tests on Travis that cannot be enabled on TBPL and it's possible some of those will fail too; we really need all the test coverage we can get here.
(In reply to Zac C (:zac) from comment #62)

> There are also more tests on Travis that cannot be enabled on TBPL and it's
> possible some of those will fail too; we really need all the test coverage
> we can get here.

I've been operating under the assumption that travis was going away, so I just haven't worried about it. Has this changed?
This will eventually happen but we're saying this since we started using Travis, so, well...
(In reply to Zac C (:zac) from comment #59)
> This test:
> test_calendar_flick_through_months we know to be failing on device because
> of a Marionette problem (bug 995989) with touch events. But it is also
> passing on TBPL on non-OOP builds which is why it is enabled here. I'm going
> to ni? mdas about whether this bug is sensitive to OOP/non-OOP and then we
> can decide if it's valid to disable it or not.

That bug is fallout from us changing our code to deal with APZ (async pan and zoom). It's likely that APZ works differently in non-oop environments, so that's probably why it passes there.
Flags: needinfo?(mdas)
thanks mdas, I'll add that bug as a blocker.
Depends on: 995989
As bug 1007918 points out, we've now got OOP builds running on cedar. Copying over :armenzg's comment:

Running in here:
https://tbpl.mozilla.org/?tree=Cedar&jobname=oop&rev=6f73745c79b0

A log:
https://tbpl.mozilla.org/php/getParsedLog.php?id=39104509&tree=Cedar&full=1

11 tests fail out of 52 with TimeoutExceptions and MarionetteExceptions.

Most of these tests already have bugs here, so we need to start trimming those down in order to get these running on b2g-i/m-i/m-c.

raise NoSuchElementException(message=message, status=status, stacktrace=stacktrace)
TEST-UNEXPECTED-FAIL | test_clock_set_alarm.py test_clock_set_alarm.TestClockSetAlarm.test_clock_set_alarm | NoSuchElementException: NoSuchElementException: Unable to locate element: repeat-menu
Bug 972688 - Intermittent test_clock_add_alarm_multiple_times.py test_clock_add_alarm_multiple_times.TestClockAddAlarmMultipleTimes.test_clock_add_alarm_multiple_times | test_clock_set_alarm.py test_clock_set_alarm.TestClockSetAlarm.test_clock_set_alarm | TEST-UNEXPECTED-FAIL | test_sms_add_contact.py test_sms_add_contact.TestSmsAddContact.test_sms_add_contact | TimeoutException: TimeoutException: Connection timed out
Bug 982995 - Intermittent TEST-UNEXPECTED-FAIL | test_sms_add_contact.py test_sms_add_contact.TestSmsAddContact.test_sms_add_contact | (after TimeoutException: TimeoutException: Timed out after 10.0 seconds) TEST-UNEXPECTED-FAIL | test_sms_add_contact.py test_sms_add_contact.TestSmsAddContact.test_sms_add_contact | MarionetteException: MarionetteException: Please start a session
Bug 982995 - Intermittent TEST-UNEXPECTED-FAIL | test_sms_add_contact.py test_sms_add_contact.TestSmsAddContact.test_sms_add_contact | (after TimeoutException: TimeoutException: Timed out after 10.0 seconds) TEST-UNEXPECTED-FAIL | test_sms_contact_input_validation.py test_sms_contact_input_validation.TestContactValidation.test_sms_contact_validation | TimeoutException: TimeoutException: Connection timed out
TEST-UNEXPECTED-FAIL | test_sms_contact_input_validation.py test_sms_contact_input_validation.TestContactValidation.test_sms_contact_validation | MarionetteException: MarionetteException: Please start a session
TEST-UNEXPECTED-FAIL | test_sms_contact_match.py test_sms_contact_match.TestContactMatch.test_contact_match | TimeoutException: TimeoutException: Connection timed out
TEST-UNEXPECTED-FAIL | test_sms_contact_match.py test_sms_contact_match.TestContactMatch.test_contact_match | MarionetteException: MarionetteException: Please start a session
TEST-UNEXPECTED-FAIL | test_mms_add_subject.py test_mms_add_subject.TestMmsAddSubject.test_mms_add_subject | TimeoutException: TimeoutException: Connection timed out
TEST-UNEXPECTED-FAIL | test_mms_add_subject.py test_mms_add_subject.TestMmsAddSubject.test_mms_add_subject | MarionetteException: MarionetteException: Please start a session
TEST-UNEXPECTED-FAIL | test_inbox_to_settings.py test_inbox_to_settings.TestSettingsFromInbox.test_settings_from_inbox | TimeoutException: TimeoutException: Connection timed out
TEST-UNEXPECTED-FAIL | test_inbox_to_settings.py test_inbox_to_settings.TestSettingsFromInbox.test_settings_from_inbox | MarionetteException: MarionetteException: Please start a session
Return code: 10
Blocks: 999719
Context of this bug has changed pretty drastically since it started, but I'd rather not lose the history and dependencies, so I'm gonna adjust the title to be a little more accurate.
Summary: Turn on OOP on all b2g linux builds → Get unit/integration tests passing on OOP B2G Desktop
Comment on attachment 8342519 [details] [diff] [review]
enable-desktop-oop.patch

Marking patch obsolete since this can now be chosen via a command line option.
Attachment #8342519 - Attachment is obsolete: true
Kyle, I think all unit tests are now passing in OOP, can you confirm?
Hi Julien,
Is this where you're looking at?
https://tbpl.mozilla.org/?tree=Cedar&jobname=oop&rev=cc0f17eb478f

Kyle,
Does this bug also cover fixing the mochitest-1-oop? (bug 1007916)
Does now! :)
Depends on: 1007916
Summary: Get unit/integration tests passing on OOP B2G Desktop → [meta] Get tests passing on OOP B2G Desktop
Depends on: 1016705
Depends on: 1016713
blocking-b2g: backlog → ---
Assignee: kyle → nobody
As far as I'm aware, B2G desktop is dead in favor of Mulet. Closing this, reopen if I'm wrong.
Status: REOPENED → RESOLVED
Closed: 11 years ago8 years ago
Resolution: --- → FIXED
Fixing resolve reason.
Resolution: FIXED → WONTFIX
You need to log in before you can comment on or make changes to this bug.