Closed Bug 1035226 Opened 10 years ago Closed 9 years ago

Running Gaia integration tests against Mulet

Categories

(Firefox OS Graveyard :: Gaia, defect)

All
Android
defect
Not set
normal

Tracking

(firefox42 fixed)

RESOLVED FIXED
FxOS-S4 (07Aug)
Tracking Status
firefox42 --- fixed

People

(Reporter: gerard-majax, Assigned: gerard-majax)

References

Details

(Whiteboard: [systemsfe])

Attachments

(6 files, 5 obsolete files)

We want to run the Gaia integration tests against a Mulet build.
This bug is for tracking other bugs.
Depends on: 1016030
Depends on: 1035231
Attached file integration-mulet.log (obsolete) —
So, current status:

> 292 passing (1h)
>  48 pending
>  26 failing

That's not that bad!
Whiteboard: [systemsfe]
Depends on: 1036415
Attached file mulet-gijs.log
Result of a run against current master, with the fix for bug 1053185 relanded:

>  310 passing (2h)
>  65 pending
>  64 failing

This was run by:
> $ xvfb-run -a make test-integration RUNTIME=/home/alex/codaz/Mozilla/b2g/gecko/obj-mulet/dist/firefox/firefox shared/test/integration/tbpl-manifest.json

What should we do now?
Attachment #8452154 - Attachment is obsolete: true
Flags: needinfo?(poirot.alex)
Can you try running it with:
  TEST_MANIFEST=./shared/test/integration/tbpl-manifest.json
?

I don't understand why the tests would pass when running against b2g desktop. Do they?
By default we use local-manifest.json, which contains no blacklist at all,
so I expect to see the same tests failing when running on b2g desktop,
since tbpl-manifest.json explicitly disables many tests on desktop due to the lack of shims.
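
To illustrate why the manifest choice changes which tests run, here is a minimal sketch. The manifest shape (a top-level "blacklist" array of test paths), the helper name `filter_tests` and the test paths are assumptions made for illustration only; they are not taken from the Gaia test runner.

```python
import json


def filter_tests(test_files, manifest_text):
    """Return the tests that remain after applying a manifest blacklist.

    Assumes (hypothetically) that the manifest is a JSON object with an
    optional top-level "blacklist" list of test paths.
    """
    manifest = json.loads(manifest_text)
    blacklist = set(manifest.get("blacklist", []))
    return [path for path in test_files if path not in blacklist]


# Hypothetical test paths, for illustration only.
tests = ["apps/a/test/marionette/x_test.js", "apps/b/test/marionette/y_test.js"]

# A manifest without a blacklist keeps every test...
local_manifest = "{}"
# ...while a manifest with a blacklist drops the listed ones.
tbpl_manifest = json.dumps({"blacklist": ["apps/b/test/marionette/y_test.js"]})

print(filter_tests(tests, local_manifest))
print(filter_tests(tests, tbpl_manifest))
```

With an empty manifest every test is attempted, so tests that need missing shims would be expected to fail on any desktop runtime.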
Flags: needinfo?(poirot.alex)
This modifies our mozharness script to pass RUNTIME to 'make test-integration' for Mulet.
Attachment #8499066 - Flags: review?(poirot.alex)
Assignee: lissyx+mozillians → jgriffin
Builders added:
+ Ubuntu VM 12.04 x64 Mulet cedar opt test gaia-js-integration-1
+ Ubuntu VM 12.04 x64 Mulet cedar opt test gaia-js-integration-2
+ Ubuntu VM 12.04 x64 Mulet cedar opt test gaia-js-integration-3
+ Ubuntu VM 12.04 x64 Mulet cedar opt test gaia-js-integration-4
Attachment #8499076 - Flags: review?(jlund)
Oops, didn't mean to re-assign this, just doing the infra bits.
Assignee: jgriffin → lissyx+mozillians
Comment on attachment 8499066 [details] [diff] [review]
Pass RUNTIME to Gij for Mulet,

Review of attachment 8499066 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just some aesthetic comments.

::: scripts/gaia_integration.py
@@ +50,5 @@
> +        ]
> +
> +        # for Mulet
> +        if 'firefox' in self.binary_path:
> +            cmd += ['RUNTIME=%s' % self.binary_path]

Two comments:
- Can't we use `env` instead of hacking cmd like that?
  env['RUNTIME'] = self.binary_path
Or is RUNTIME one of those variables that can only be passed as a make argument?
- And maybe do it no matter what the binary path is? (It sounds less magical to pass an explicit path to the runtime under test than to depend on the binary being placed in an arbitrary path.)
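
A minimal sketch of the two options discussed here, for comparison. The helper name `build_invocation` and its parameters are hypothetical and are not part of mozharness. One fact about GNU make is relevant: a variable given on the command line takes precedence over an assignment inside the Makefile, whereas a variable inherited from the environment is overridden by an ordinary Makefile assignment (unless make runs with -e). Whether Gaia's Makefile assigns RUNTIME itself is not established in this thread.

```python
import os


def build_invocation(binary_path, via_env=False, base_env=None):
    """Build (cmd, env) for 'make test-integration' with RUNTIME set.

    Hypothetical helper: via_env=False passes RUNTIME as a make argument,
    via_env=True passes it through the process environment instead.
    """
    cmd = ["make", "test-integration"]
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    if via_env:
        env["RUNTIME"] = binary_path
    else:
        cmd.append("RUNTIME=%s" % binary_path)
    return cmd, env


# Placeholder path, for illustration only.
cmd, env = build_invocation("/path/to/firefox", base_env={})
print(cmd)
```

The resulting pair could then be handed to something like `subprocess.call(cmd, env=env)`; the sketch deliberately stops short of running make.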
Attachment #8499066 - Flags: review?(poirot.alex) → review+
(In reply to Alexandre Poirot [:ochameau] from comment #8)
> Comment on attachment 8499066 [details] [diff] [review]
> Pass RUNTIME to Gij for Mulet,
> 
> Review of attachment 8499066 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good, just some aesthetics comments.
> 
> ::: scripts/gaia_integration.py
> @@ +50,5 @@
> > +        ]
> > +
> > +        # for Mulet
> > +        if 'firefox' in self.binary_path:
> > +            cmd += ['RUNTIME=%s' % self.binary_path]
> 
> Two comments:
> - Can't we use `env` instead of hacking cmd string like that?
>   env['RUNTIME'] = self.binary_path
> Or is RUNTIME one of those variables that can only be passed as a make argument?

I haven't tried it, but some of the arguments are of that type, so it seems to make sense to be consistent.

> - And maybe do it no matter what the binary path is? (It sounds less magical
> to pass an explicit path to the runtime under test than to depend on
> the binary being placed in an arbitrary path.)

Yep, I'll make this non-conditional.
Attachment #8499076 - Flags: review?(jlund) → review+
(In reply to Jonathan Griffin (:jgriffin) from comment #5)
> Created attachment 8499066 [details] [diff] [review]
> Pass RUNTIME to Gij for Mulet,
> 
> This modifies our mozharness script to pass RUNTIME to 'make
> test-integration' for Mulet.

Mozharness patch, with changes suggested above: https://hg.mozilla.org/build/mozharness/rev/5c900c15ff70
(In reply to Jonathan Griffin (:jgriffin) from comment #10)
> (In reply to Jonathan Griffin (:jgriffin) from comment #5)
> > Created attachment 8499066 [details] [diff] [review]
> > Pass RUNTIME to Gij for Mulet,
> > 
> > This modifies our mozharness script to pass RUNTIME to 'make
> > test-integration' for Mulet.
> 
> Mozharness patch, with changes suggested above:
> https://hg.mozilla.org/build/mozharness/rev/5c900c15ff70

I restored the original patch and landed, since passing RUNTIME for the b2gdesktop case broke the tests on cedar:

https://hg.mozilla.org/build/mozharness/rev/97f6c98a5437
Committed code from this bug has been rolled out to production.
Committed code from this bug has been rolled out to production.
(whoops, sorry for writing that twice!)
Can we close this? I'm unable to run those as of now, as documented in bug 1100345, although it was working previously.
Depends on: 1100345
Note that we may just end up running tests on the new taskcluster infra, as it just works in a matter of minutes...
I'll submit taskcluster patch to bug 1099238.
If there is value in running them on the old infra, please shout.
I'm not sure whether there's value in fixing these here or not, but we probably should until they're running successfully in TC and being sheriffed.
Attachment #8572742 - Flags: review?(jlund)
Assignee: lissyx+mozillians → jgriffin
Status: NEW → ASSIGNED
Sorry, didn't mean to reassign.
Assignee: jgriffin → lissyx+mozillians
Comment on attachment 8572742 [details] [diff] [review]
Pass a config file when running Gij on mulet on cedar,

Review of attachment 8572742 [details] [diff] [review]:
-----------------------------------------------------------------

reading bug history suggests that I missed this in my earlier review. who needs configs anyway!?
Attachment #8572742 - Flags: review?(jlund) → review+
What is the status of this bug? I've hacked a little bit and it seems green enough, doesn't it?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b95f3c102aa5&exclusion_profile=false
Flags: needinfo?(jgriffin)
Flags: needinfo?(coop)
It does look fine.  You should ask James Lal to turn it on; he's handling B2G-specific TC scheduling, I believe.
Flags: needinfo?(jgriffin)
James? See comment 25 :)
Flags: needinfo?(jlal)
Hrm, what is the question here? You do not need permission from me (or anyone actually :)) to land this if it's green, aside from the reviews you have already gotten.
Flags: needinfo?(jlal)
(In reply to James Lal [:lightsofapollo] from comment #28)
> Hrm- what is the question here ? You do not need permission from me (or
> anyone actually :)) to land this if it's green aside from the reviews you
> have already gotten.

^^ This.
Flags: needinfo?(coop)
(In reply to James Lal [:lightsofapollo] from comment #28)
> Hrm- what is the question here ? You do not need permission from me (or
> anyone actually :)) to land this if it's green aside from the reviews you
> have already gotten.

Well I want to know:
 - if people are working on this
 - if the patch in this try request is the proper way to deal with it
 - if the triggered run looks okay to you
Here is a recent mass retrigger: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1524decd5e94

So it looks like Gij 9 would need some love. Ryan, what is your advice on the failure rate of the others? There's a lot of blue that ends up green, so technically it's not orange, but it probably means there are intermittents there. Would too many of them mean wasting resources?
Flags: needinfo?(ryanvm)
Attached patch Run Gij on Mulet r=... (obsolete) — Splinter Review
Rebased on current master.
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=81480b83ce3e
Attached patch Run Gij on Mulet r=... (obsolete) — Splinter Review
Typo in the groupName
Attachment #8638902 - Attachment is obsolete: true
(In reply to Alexandre LISSY :gerard-majax from comment #33)
> Created attachment 8638902 [details] [diff] [review]
> Run Gij on Mulet r=...
> 
> Rebased on current master.
> Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=81480b83ce3e

That has much much more green \o/
Comment on attachment 8638909 [details] [diff] [review]
Run Gij on Mulet r=...

Greg, can you review this? Also, once it lands, what are the steps to make it visible and useful?
Attachment #8638909 - Flags: review?(garndt)
I've spent the last two weeks trying to get Gij running green enough to turn off the auto-retries, as they've been a disaster from a test stability standpoint. So no, I'm not willing to regress that on a new platform. Feel free to disable your way to victory, though. Seems that's the most-likely way you're going to get to green in a timely manner with this suite.
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #37)
> I've spent the last two weeks trying to get Gij running green enough to turn
> off the auto-retries, as they've been a disaster from a test stability
> standpoint. So no, I'm not willing to regress that on a new platform. Feel
> free to disable your way to victory, though. Seems that's the most-likely
> way you're going to get to green in a timely manner with this suite.

Does that mean we need only green, or are one or two blue jobs okay?
Flags: needinfo?(ryanvm)
Ryan, I suspect this new try run contains the results of your efforts: https://treeherder.mozilla.org/#/jobs?repo=try&revision=81480b83ce3e

I see more blue on Mulet than on B2G Desktop for Gij jobs 3, 4, 7, 9, 11, 12, 17 and 20. Would it be possible that we:
 - land the current changes to run Gij on Mulet for the trees you sheriff
 - only show, for now, the test suites that are 100% green
 - let me continue to work on either fixing or disabling the remaining failures that cause blue?
(In reply to Alexandre LISSY :gerard-majax from comment #38)
> Does that mean we need only green, or are one or two blue jobs okay?

Consider anything blue to be orange w/ auto-retries disabled. At that point, it comes down to meeting the Job Visibility requirements (<5% failure rate, etc).
https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy
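
As a rough sketch of that rule: count every blue (auto-retried) job as a failure and compare the resulting rate against the 5% figure quoted above. The function names and the green/orange/blue string encoding are assumptions for illustration; the actual policy has further requirements (the "etc" above) that are not modelled here.

```python
def failure_rate(results):
    """Fraction of jobs that did not pass on their first attempt.

    `results` is a list of "green", "orange" or "blue" strings, where
    "blue" marks a job that was automatically retried. Blue is counted
    as a failure, as it would be with auto-retries disabled.
    """
    if not results:
        return 0.0
    failed = sum(1 for result in results if result in ("orange", "blue"))
    return float(failed) / len(results)


def meets_failure_threshold(results, max_rate=0.05):
    """True when the failure rate is strictly below max_rate."""
    return failure_rate(results) < max_rate


# Illustrative data only: 2 retried jobs out of 40 is exactly 5%,
# which does not satisfy a strict "<5%" requirement.
sample = ["green"] * 38 + ["blue"] * 2
print(failure_rate(sample), meets_failure_threshold(sample))
```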

(In reply to Alexandre LISSY :gerard-majax from comment #39)
> Would it be possible that we:
>  - land the current changes to run Gij on Mulet for the trees you sheriff
>  - only show for now the test suites that are 100% green
>  - that I continue to work on either fixing or disabling the remaining
> failures that make blue ?

Sounds fine to me.

By chance, do you know what needs to be done to turn off the auto-retries? If it can be done per-suite, let's get that done on Mulet first as well.
Flags: needinfo?(ryanvm)
I think it's the 'rerun' field of the yml file. I'll continue working on this with this field removed. Let's go with the plan then! Thanks!
Attached patch Run Gij on Mulet (obsolete) — Splinter Review
This time, let's disable reruns so we only get either green or orange. That
should help in deciding on the job visibility criteria.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6715020461d5
Attachment #8638909 - Attachment is obsolete: true
Attachment #8638909 - Flags: review?(garndt)
Attachment #8638927 - Flags: review?(garndt)
OOC, why does the groupName need to be changed for this? Mulet is already included elsewhere in the platform, so it seems redundant to include it there as well. It actually makes things like filtering on Treeherder more complicated, so it'd be preferable to avoid changing it unless you absolutely have to for some reason.
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #43)
> OOC, why does the groupName need to be changed for this? Mulet is already
> included elsewhere in the platform, so it seems redundant to include it
> there as well. It actually makes things like filtering on Treeherder more
> complicated, so it'd be preferable to avoid changing it unless you
> absolutely have to for some reason.

I'd be happy for you to tell me the value you would like; I don't have any specific reason for this other than exposing Mulet's name. But if you already get that information from elsewhere and this makes your life harder, no problem :).

So, should I keep "groupName: Gaia JS Integration test"?
Flags: needinfo?(ryanvm)
(In reply to Alexandre LISSY :gerard-majax from comment #42)
> Created attachment 8638927 [details] [diff] [review]
> Run Gij on Mulet
> 
> This time, let's disable reruns so we only get either green or orange. That
> should help deciding regarding the job visibility criteria
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=6715020461d5

So: Gij 3, Gij 11, Gij 17, Gij 19 and Gij 20 are in bad shape.
(In reply to Alexandre LISSY :gerard-majax from comment #42)
> Created attachment 8638927 [details] [diff] [review]
> Run Gij on Mulet
> 
> This time, let's disable reruns so we only get either green or orange. That
> should help deciding regarding the job visibility criteria
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=6715020461d5

On Gij 3, it's FTU PseudoLocalization that is intermittent.

On Gij 11, it's the System LockScreen related tests that are intermittent.

On Gij 17, it's the localization tests in verticalhome that are intermittent.

For Gij 19 and 20, there is no test reported in the failures, so I'd consider them related to the Gaia test infra unless someone can explain otherwise :)

I think each of you is responsible for one of the impacted apps. Could you please have a look at those intermittents?
Flags: needinfo?(sfoster)
Flags: needinfo?(kgrandon)
Flags: needinfo?(gweng)
Flags: needinfo?(gaye)
Flags: needinfo?(aus)
(In reply to Alexandre LISSY :gerard-majax from comment #46)
> On Gij 17, it's localization tests in verticalhome that are intermittent.

I've pushed an attempted fix for this in bug 1117630.
Flags: needinfo?(kgrandon)
(In reply to Alexandre LISSY :gerard-majax from comment #46)
> On Gij 3, it's FTU PseudoLocalization that is intermittent.

For this one there appears to be no stack, but my guess is that we need to change the wait after selecting a new language.

Currently we wait for mozL10n.readyState === 'complete', but it's possible to be in that state before selecting a new language. An alternative approach might be to use executeAsyncScript(), calling marionetteScriptFinished() after a 'localized' event.
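
The race described here can be sketched with a small, deterministic pure-Python simulation. It does not call Marionette or mozL10n; `FakeL10n` and all of its methods are hypothetical stand-ins, and "fr" is just a placeholder language code. It only shows why a state check can succeed too early while an event-based wait cannot.

```python
class FakeL10n(object):
    """Hypothetical stand-in for a localization object (not mozL10n)."""

    def __init__(self):
        self.ready_state = "complete"  # already complete from initial load
        self.language = "en-US"
        self._pending = None
        self._listeners = []

    def request_language(self, lang):
        # The switch is applied later, like an asynchronous re-translation.
        self._pending = lang

    def once_localized(self, callback):
        # Register a one-shot listener for the simulated 'localized' event.
        self._listeners.append(callback)

    def run_pending(self):
        # Simulate the event-loop turn in which the new language is applied
        # and the 'localized' event fires.
        if self._pending is not None:
            self.language, self._pending = self._pending, None
            listeners, self._listeners = self._listeners, []
            for callback in listeners:
                callback(self.language)


l10n = FakeL10n()
l10n.request_language("fr")

# State-based wait: the condition already holds, so the test would carry
# on before the language has actually changed.
seen_by_state_check = l10n.language if l10n.ready_state == "complete" else None

# Event-based wait: only resolves once the 'localized' event fires.
seen_by_event = []
l10n.once_localized(seen_by_event.append)
l10n.run_pending()

print(seen_by_state_check, seen_by_event)
```

In the simulation the state check observes the old language, while the event listener only ever observes the new one.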
(In reply to Kevin Grandon :kgrandon from comment #48)
> (In reply to Alexandre LISSY :gerard-majax from comment #46)
> > On Gij 3, it's FTU PseudoLocalization that is intermittent.
> 
> For this one there appears to not be a stack, but my guess is that we need
> to change the wait after selecting a new language.
> 
> Currently we wait for mozL10n.readyState === 'complete', but it's possible
> to be in that state before selecting a new language. An alternative approach
> might be to use executeAsyncScript(), calling marionetteScriptFinished()
> after a 'localized' event.

I'm currently trying this fix in bug 1181419.
Let's handle the FTU one in bug 1181419.
Depends on: 1117630, 1181419
Flags: needinfo?(sfoster)
(In reply to Kevin Grandon :kgrandon from comment #47)
> (In reply to Alexandre LISSY :gerard-majax from comment #46)
> > On Gij 17, it's localization tests in verticalhome that are intermittent.
> 
> I've pushed an attempted fix for this in bug 1117630.

Thanks, let's check this here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d93af9c3c2ee
(In reply to Alexandre LISSY :gerard-majax from comment #51)
> (In reply to Kevin Grandon :kgrandon from comment #47)
> > (In reply to Alexandre LISSY :gerard-majax from comment #46)
> > > On Gij 17, it's localization tests in verticalhome that are intermittent.
> > 
> > I've pushed an attempted fix for this in bug 1117630.
> 
> Thanks, let's check this here:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=d93af9c3c2ee

Kevin, seems like it is not fixed :(
Flags: needinfo?(kgrandon)
Attachment #8638927 - Flags: review?(garndt) → review+
So, setting checkin-needed for attachment 8638927 [details] [diff] [review]. This should make those tests run on all try, and once we have that we will be able to turn on suites to be displayed for sheriffs :)
Keywords: checkin-needed
Depends on: 1187714
Depends on: 1187715
Depends on: 1187716
Depends on: 1187717
Depends on: 1187718
Added dependent bugs to track greening each test suite.
Since you NI'd me in the individual bug, I'm clearing this one.
Flags: needinfo?(gweng)
(In reply to Alexandre LISSY :gerard-majax from comment #44)
> So, I should keep "groupName: Gaia JS Integration test" ?

I think that the groupName/job name, etc, should be as generic as possible. Metadata like platform/flavor shouldn't be included as that information is already available elsewhere. Adding redundant information to the name just makes filtering rules, data mining, etc, more difficult to do.

Does that make sense? :)
Flags: needinfo?(ryanvm)
Attached patch Run Gij on Mulet (obsolete) — Splinter Review
Fixed groupName
Attachment #8638927 - Attachment is obsolete: true
Attachment #8639270 - Flags: review+
Depends on: 1187886
Attached patch Run Gij on Mulet — Splinter Review
Added the missing bits so that it is run on all trees
Attachment #8639270 - Attachment is obsolete: true
Attachment #8639446 - Flags: review+
Keywords: checkin-needed
(In reply to Alexandre LISSY :gerard-majax from comment #52)
> Kevin, seems like it is not fixed :(

Will try to handle this in bug 1187716. Thanks.
Flags: needinfo?(kgrandon)
https://hg.mozilla.org/mozilla-central/rev/71ad4ff41fe7
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I'm going to reset my ni? flag here since the actual specific issues are filed into separate bugs. I'll report my findings there.
Flags: needinfo?(aus)
Depends on: 1189286
Target Milestone: --- → FxOS-S4 (07Aug)
Flags: needinfo?(gaye)