Closed
Bug 1081529
Opened 10 years ago
Closed 9 years ago
Un-hide Marionette(Mnw) tests on B2G when they meet visibility standards
Categories
(Tree Management Graveyard :: Visibility Requests, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: RyanVM, Unassigned)
Basically perma-fail and unowned. Both Mn and Mnw have been hidden on trunk. Aurora34/b2g34 likely to follow.
Reporter
Comment 1•10 years ago
John, I'm hiding these on Try but am leaving Gaia Try alone for now. Let me know if you'd prefer me to follow suit there as well.
Flags: needinfo?(jhford)
Reporter
Updated•10 years ago
OS: Windows 8.1 → Gonk (Firefox OS)
Updated•10 years ago
Summary: Un-hide Marionette tests on B2G when they meet visibility standards → Un-hide Marionette(Mnw) tests on B2G when they meet visibility standards
Comment 2•10 years ago
Splitting this into Mn/Mnw even though they might share a similar root cause.
Updated•10 years ago
Flags: needinfo?(jhford)
Reporter
Comment 3•10 years ago
FYI, I ran a Try push off recent b2g-inbound, and Mn is currently permafailing in test_click_scrolling.py (lots of bug 1078177, plus another failure). Mnw is currently sitting at almost exactly a 10% failure rate (98/1000 runs were orange/red).
Example Mn failure log:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3034902&repo=try
There appear to be some common oranges on Mnw (bug 1025284, bug 1078276, bug 1025289, bug 1029296, and bug 1020930 to name a few), but there's still a pretty long tail of one-off failures too. If someone wants to go through them, here's a link to get you started:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=f1c72c1a3203&searchQuery=webapi (you'll need to click the visibility toggle button in the upper-right to actually see anything)
Overall, I'd say things aren't as bleak as they were when this bug was first filed (10% is lower than I expected for Mnw, TBH), but timeouts are still the death of us in webapi. While there appear to be some tests that are more susceptible to failure than others, my expectation is that disabling commonly-failing tests will just move the failures to other ones instead. There still appears to be an underlying issue with the harness and/or emulator environment that contributes to these problems.
Updated•10 years ago
Component: Marionette → Visibility Requests
Product: Testing → Tree Management
Version: unspecified → ---
Comment 4•10 years ago
Hi Ryan, MNW is really important for b2g webapi. Do you think we could un-hide MNW first? :)
I know there are still some oranges in MNW [1]. We have already landed two fixes to improve it: bug 1143596 for test_getthreads.js and bug 1143628 for test_massive_incoming_delete.js. Let's keep improving it.
[1] https://treeherder.mozilla.org/#/jobs?repo=b2g-inbound&exclusion_profile=false&filter-searchStr=MNW
Flags: needinfo?(ryanvm)
Reporter
Comment 5•10 years ago
What's the current failure rate? Looks like we're still well north of 5% based on the link you gave?
https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy#Low_intermittent_failure_rate
Sorry for not including a link to the policy in this bug previously. Anyway, that page should give you a good feel for what needs to be done for Mnw to be unhidden again.
Flags: needinfo?(ryanvm)
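The check being asked for here is simple arithmetic, and a minimal sketch of it follows. It uses the 98/1000 figure from comment 3. The 5% ceiling is an assumption read off the "north of 5%" remark above; the linked policy page is the authoritative source for the real number. The names `failure_rate`, `meets_rate_standard` and `ASSUMED_MAX_RATE` are hypothetical and not part of any Mozilla tooling.

```python
def failure_rate(failures: int, runs: int) -> float:
    """Return the fraction of runs that failed (orange or red)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return failures / runs


# Assumed ceiling, inferred from the "north of 5%" remark in this bug.
# Check the Job Visibility Policy for the authoritative value.
ASSUMED_MAX_RATE = 0.05


def meets_rate_standard(failures: int, runs: int,
                        max_rate: float = ASSUMED_MAX_RATE) -> bool:
    """True when the observed failure rate is at or below the ceiling."""
    return failure_rate(failures, runs) <= max_rate


# Figures from comment 3: 98 of 1000 Mnw runs were orange/red.
rate = failure_rate(98, 1000)
print(f"{rate:.1%} meets standard: {meets_rate_standard(98, 1000)}")
```

With those inputs the rate is just under 10%, so the job would not pass under the assumed ceiling.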
Comment 6•10 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #5)
> What's the current failure rate? Looks like we're still well north of 5%
> based on the link you gave?
> https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy#Low_intermittent_failure_rate
>
> Sorry for not including a link to the policy in this bug previously. Anyway,
> that page should give you a good feel for what needs to be done for Mnw to
> be unhidden again.
I see, thanks for this information.
Updated•10 years ago
Blocks: HiddenAutomationTests
Comment hidden (obsolete, typo)
Comment 10•9 years ago
(In reply to Josh Cheng [:josh] from comment #8)
> Hi Edgar,
> Are you still working on this?
> Thanks!
Yes, I am still working on this.
Quick status update: we found an emulator adb hang issue (bug 1207039) that contributes to the random timeout problems (bug 1153709, bug 1154215). I believe the Marionette tests will improve significantly once bug 1207039 is fixed.
Comment 11•9 years ago
Hi Ryan, we have fixed a number of bugs, which improves the stability of MNW a lot. Most importantly, we fixed the random timeout issue (bug 1153709, bug 1154215), so if any test is still not stable enough, you can simply disable it without moving the failures to other tests.
MNW is now at a 5/129 (~3.9%) failure rate (https://treeherder.mozilla.org/#/jobs?repo=try&revision=769d200ca8ca&exclusion_profile=false&group_state=expanded). Could MNW be unhidden again based on the current status?
Thank you.
Flags: needinfo?(ryanvm)
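A sample of 129 runs leaves a fair amount of uncertainty around the observed rate. As a rough illustration only, the sketch below computes a 95% Wilson score interval for 5 failures in 129 runs. The Wilson interval is a standard statistical technique chosen here for illustration; nothing in this bug or the visibility policy prescribes it, and `wilson_interval` is a hypothetical helper name.

```python
import math


def wilson_interval(failures: int, runs: int, z: float = 1.96):
    """Approximate Wilson score interval for a failure proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = failures / runs
    denom = 1 + z * z / runs
    centre = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return centre - half, centre + half


# Figures from the comment above: 5 failures in 129 Mnw runs.
low, high = wilson_interval(5, 129)
print(f"observed {5 / 129:.1%}, plausible range {low:.1%} to {high:.1%}")
```

The plausible range comes out at roughly 2% to 9%, which suggests retriggering more runs before treating a figure this close to the limit as settled.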
Reporter
Comment 12•9 years ago
Nice work! A few things:
1) Can you please make sure those remaining failures get filed so that they're starrable once Mnw is made visible again?
2) Has Mnw only been greened up on trunk/master or is the intent to get it unhidden on the release branches as well?
3) Can you please respond to a few items from the Job Visibility Page [1] checklist (with links where applicable) to verify that we're not missing anything?
* Has an active owner
* Has sufficient documentation
* Must avoid patterns known to cause non deterministic failures
Thanks!
[1] https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy
Flags: needinfo?(ryanvm) → needinfo?(echen)
Comment 13•9 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #12)
> Nice work! A few things:
>
> 1) Can you please make sure those remaining failures get filed so that
> they're starrable once Mnw is made visible again?
Done: bug 1224986, bug 1224990 and bug 1224992.
>
> 2) Has Mnw only been greened up on trunk/master or is the intent to get it
> unhidden on the release branches as well?
Only on master. Most of the fixes are not landed in release branches.
Comment 14•9 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #12)
> 3) Can you please respond to a few items from the Job Visibility Page [1]
> checklist (with links where applicable) to verify that we're not missing
> anything?
> * Has an active owner
Ken Chang would be the active owner.
> * Has sufficient documentation
> * Must avoid patterns known to cause non deterministic failures
Here is the link I found from MDN: https://developer.mozilla.org/en-US/docs/Mozilla/QA/Marionette/Marionette_JavaScript_Tests
Flags: needinfo?(echen)
Reporter
Comment 15•9 years ago
Tomcat/Wes, please look this information (and recent trunk results) over and decide if Mnw is ready for unhiding or not.
Flags: needinfo?(wkocher)
Flags: needinfo?(cbook)
Mnw seems to have failed nine times out of the most recent 90ish runs on inbound, so it's still got about a 10% failure rate. However, most of those are timing out in the same test: https://treeherder.mozilla.org/logviewer.html#?job_id=18854065&repo=mozilla-inbound
If you can disable test_mobile_operator_names_plmnlist.js and if that doesn't move the timeout to some other test, the failure rate would be around 2%, which would be fine for unhiding, imo.
Flags: needinfo?(wkocher) → needinfo?(echen)
Flags: needinfo?(cbook)
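The projection in the comment above can be sketched as follows, under its own stated condition that disabling the test does not simply move the timeout elsewhere. The comment only says "most" of the nine failures came from test_mobile_operator_names_plmnlist.js, so the count of 7 used here is a hypothetical split chosen to match the "around 2%" estimate. `projected_rate` is likewise a hypothetical helper.

```python
def projected_rate(runs: int, failures: int, from_disabled_test: int) -> float:
    """Failure rate expected once one test's failures are removed.

    Assumes the removed failures do not reappear in other tests.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= from_disabled_test <= failures <= runs:
        raise ValueError("counts must satisfy 0 <= removed <= failures <= runs")
    return (failures - from_disabled_test) / runs


# 9 failures in about 90 runs per the comment; 7 is an assumed split.
before = projected_rate(90, 9, 0)
after = projected_rate(90, 9, 7)
print(f"before: {before:.1%}, after disabling: {after:.1%}")
```

The run count stays the same because a run that failed only in the disabled test would still execute and would now pass.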
Comment 17•9 years ago
(In reply to Wes Kocher (:KWierso) from comment #16)
> Mnw seems to have failed nine times out of the most recent 90ish runs on
> inbound, so it's still got about a 10% failure rate. However, most of those
> are all timing out in the same test:
> https://treeherder.mozilla.org/logviewer.html#?job_id=18854065&repo=mozilla-inbound
>
> If you can disable test_mobile_operator_names_plmnlist.js and if that
> doesn't move the timeout to some other test, the failure rate would be
> around 2%, which would be fine for unhiding, imo.
The failure rate of test_mobile_operator_names_plmnlist.js is higher than I thought. I filed bug 1234746 for the timeout and have disabled the test for now.
Flags: needinfo?(echen)
Comment 18•9 years ago
(In reply to Wes Kocher (:KWierso) from comment #16)
> Mnw seems to have failed nine times out of the most recent 90ish runs on
> inbound, so it's still got about a 10% failure rate. However, most of those
> are all timing out in the same test:
> https://treeherder.mozilla.org/logviewer.html#?job_id=18854065&repo=mozilla-inbound
>
> If you can disable test_mobile_operator_names_plmnlist.js and if that
> doesn't move the timeout to some other test, the failure rate would be
> around 2%, which would be fine for unhiding, imo.
I have disabled test_mobile_operator_names_plmnlist.js. Is MNW ready to be unhidden?
https://treeherder.allizom.org/#/jobs?repo=b2g-inbound&exclusion_profile=false&filter-searchStr=mnw&group_state=expanded&fromchange=8ad77c0ff487
Thank you.
Flags: needinfo?(wkocher)
Looks much better, thanks! I retriggered a bunch. Out of 325 runs, there were 11 failures, around 3%. Unhidden.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(wkocher)
Resolution: --- → FIXED
Updated•6 years ago
Product: Tree Management → Tree Management Graveyard