Closed Bug 1354937 Opened 7 years ago Closed 7 years ago

Intermittent devtools/client/aboutdebugging/test/browser_service_workers_status.js | Service worker is currently running - Got Stopped, expected Running

Categories

(DevTools :: about:debugging, defect, P2)

defect

Tracking

(firefox55 wontfix, firefox56 wontfix, firefox57 fixed, firefox58 fixed)

RESOLVED FIXED
Firefox 58
Tracking Status
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- fixed
firefox58 --- fixed

People

(Reporter: aryx, Assigned: jdescottes)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed:timing])

Attachments

(1 file)

+++ This bug was initially created as a clone of Bug #1348547 +++

https://treeherder.mozilla.org/logviewer.html#?job_id=89942056&repo=mozilla-central

[task 2017-04-09T19:29:59.571437Z] 19:29:59     INFO - Console message: [JavaScript Error: "TypeError: this.conn is null" {file: "resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/server/actors/root.js" line: 515}]
[task 2017-04-09T19:29:59.572157Z] 19:29:59     INFO - Console message: [JavaScript Error: "TypeError: this.conn is null" {file: "resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/server/actors/root.js" line: 515}]
[task 2017-04-09T19:29:59.575029Z] 19:29:59     INFO - Console message: [JavaScript Error: "TypeError: this.conn is null" {file: "resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/server/actors/root.js" line: 515}]
[task 2017-04-09T19:29:59.575168Z] 19:29:59     INFO - Buffered messages finished
[task 2017-04-09T19:29:59.575535Z] 19:29:59     INFO - TEST-UNEXPECTED-FAIL | devtools/client/aboutdebugging/test/browser_service_workers_status.js | Service worker is currently running - Got Stopped, expected Running
[task 2017-04-09T19:29:59.575715Z] 19:29:59     INFO - Stack trace:
[task 2017-04-09T19:29:59.576081Z] 19:29:59     INFO -     chrome://mochikit/content/browser-test.js:test_is:928
[task 2017-04-09T19:29:59.576590Z] 19:29:59     INFO -     chrome://mochitests/content/browser/devtools/client/aboutdebugging/test/browser_service_workers_status.js:null:44
[task 2017-04-09T19:29:59.577067Z] 19:29:59     INFO -     Tester_execTest@chrome://mochikit/content/browser-test.js:752:9
[task 2017-04-09T19:29:59.577487Z] 19:29:59     INFO -     Tester.prototype.nextTest</<@chrome://mochikit/content/browser-test.js:672:7
[task 2017-04-09T19:29:59.577669Z] 19:29:59     INFO -     SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<@chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:791:59
Low frequency intermittent.
Aboutdebugging triage. Filter on TRIAGE-JD201705

The way this test works, we load a service worker which should automatically timeout after 2000ms.
The test wants to observe the transition from the state "Running" to "Stopped" (which should occur after the 2 seconds timeout).
On a slow platform, it's possible we can miss the "Running" step and directly go to "Stopped".

We should try to increase the sw timeout [1] to see if it improves the situation.

Try pushes:
- with same timeout https://treeherder.mozilla.org/#/jobs?repo=try&revision=108792cea15d54f3d153e14d82bd791f5cc12795
- with 5000ms https://treeherder.mozilla.org/#/jobs?repo=try&revision=6428755c237241f1b484957c6f4aef3f96ade254

[1] http://searchfox.org/mozilla-central/rev/b0e1da2a90ada7e00f265838a3fafd00af33e547/devtools/client/aboutdebugging/test/browser_service_workers_status.js#10
Priority: -- → P3
See Also: → 1352586
Is it possible this is failing due to multi-e10s being turned on?  Can we configure the test to only run in single process mode?
Flags: needinfo?(jdescottes)
We are calling normally forcing a single content process before the test by doing:

> return new Promise(done => {
>   let options = { "set": [
>     // Enable service workers.
>     ["dom.serviceWorkers.enabled", true],
>     // Accept workers from mochitest's http.
>     ["dom.serviceWorkers.testing.enabled", true],
>     // Force single content process.
>     ["dom.ipc.processCount", 1],
>   ]};
>   SpecialPowers.pushPrefEnv(options, done);
>   Services.ppmm.releaseCachedProcesses();
> });

called at http://searchfox.org/mozilla-central/rev/1a054419976437d0778a2b89be1b00207a744e94/devtools/client/aboutdebugging/test/browser_service_workers_status.js#15
Flags: needinfo?(jdescottes)
on August 27th, this started failing often:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1354937

sort of all over the map, but linux64* debug seems to be the dominant failure pattern.  I will ni? if this picks up in frequency.
:jdescottes, this is failing quite often on debug configurations, can you help look into this (or find someone to look into it)?
Flags: needinfo?(jdescottes)
Whiteboard: [stockwell needswork]
Let's try to stop relying on mutations and to increase the Service Worker timeout.
try baseline: https://treeherder.mozilla.org/#/jobs?repo=try&revision=187082501439b53a4ad0b8f2f1e7b2b01c58ed55
try with fix: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5de5aaa262b825cf368d7866b4869ecefe1d815a
Assignee: nobody → jdescottes
Status: NEW → ASSIGNED
Flags: needinfo?(jdescottes)
Priority: P3 → P2
Seems ok with a slight twist: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b5642eef2691039562d6726b4579cdb10a1cf525

Can't guarantee we will go over the "registering" step so use a relaxed predicate in waitUntil().
Comment on attachment 8910990 [details]
Bug 1354937 - stop using mutations in browser_service_workers_status.js;

https://reviewboard.mozilla.org/r/182454/#review188306

The issue with mutation observers and React is that it is hard to predict how many mutations will be really done until we get the expected value. Ideally only one, to assert that we do not render() too many times.
So you can see these checks as both feature and performance testing.

A middle step approach would be to usage "mutation observer + until" instead of "settimeout + until".
So that instead of waiting for arbitrary time, you would wait for next mutation before checking for a value.

But it sounds reasonable to use waitUntil if that ends up being too hard to maintain.

::: devtools/client/aboutdebugging/test/browser_service_workers_status.js:51
(Diff revision 1)
> +    info("Service worker is currently running");
> +    yield waitUntil(() => status.textContent != "Running", 100);
> +  }
>  
> -  yield waitForMutation(serviceWorkersElement, { attributes: true, subtree: true });
>    is(status.textContent, "Stopped", "Service worker is currently stopped");

Shouldn't we replace these all three checks with:
`yield waitUntil(() => status.textContent != "Stopped", 100);`
?
Attachment #8910990 - Flags: review?(poirot.alex) → review+
(In reply to Alexandre Poirot [:ochameau] from comment #27)
> Comment on attachment 8910990 [details]
> Bug 1354937 - stop using mutations in browser_service_workers_status.js;
> 
> https://reviewboard.mozilla.org/r/182454/#review188306
> 
> The issue with mutation observers and React is that it is hard to predict
> how many mutations will be really done until we get the expected value.
> Ideally only one, to assert that we do not render() too many times.
> So you can see these checks as both feature and performance testing.
> 
> A middle step approach would be to usage "mutation observer + until" instead
> of "settimeout + until".
> So that instead of waiting for arbitrary time, you would wait for next
> mutation before checking for a value.
> 
> But it sounds reasonable to use waitUntil if that ends up being too hard to
> maintain.

A waitUntil based on mutations sounds like a good fit for aboutdebugging, it should be as stable as waitUntil but less intensive. The only potential "instability" would be caused by selecting a bad root node for listening to the mutations. It will require a bit of testing, but that sounds promising. 
> 
> ::: devtools/client/aboutdebugging/test/browser_service_workers_status.js:51
> (Diff revision 1)
> > +    info("Service worker is currently running");
> > +    yield waitUntil(() => status.textContent != "Running", 100);
> > +  }
> >  
> > -  yield waitForMutation(serviceWorkersElement, { attributes: true, subtree: true });
> >    is(status.textContent, "Stopped", "Service worker is currently stopped");
> 
> Shouldn't we replace these all three checks with:
> `yield waitUntil(() => status.textContent != "Stopped", 100);`
> ?

Correct :) I kept them around for "documentation" rather than assertion. But it's true that it might give the wrong impression. I will remove the intermediary checks asserts and add a comment explaining that we can't really verify intermediary stages due to potential race conditions.
Pushed by jdescottes@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f9a826fd9a94
stop using mutations in browser_service_workers_status.js;r=ochameau
https://hg.mozilla.org/mozilla-central/rev/f9a826fd9a94
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 58
Whiteboard: [stockwell needswork] → [stockwell fixed:timing]
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.