Open Bug 1425175 Opened 2 years ago Updated 2 days ago

Intermittent /service-workers/service-worker/skip-waiting-using-registration.https.html | Test skipWaiting while a client is using the registration - assert_equals: Controller state should be activating expected "activating" but got "activated"

Categories

(Core :: DOM: Service Workers, defect, P2)

Tracking

Tracking Status
firefox60 --- disabled
firefox61 --- disabled

People

(Reporter: intermittent-bug-filer, Assigned: aiakab, NeedInfo)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, leave-open, Whiteboard: [stockwell disabled])

Attachments

(3 files, 3 obsolete files)

Filed by: ncsoregi [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=151555341&repo=autoland

https://queue.taskcluster.net/v1/task/BBcGh38PT32pC0wUFQXXSA/runs/0/artifacts/public/logs/live_backing.log

15:36:20     INFO - TEST-UNEXPECTED-FAIL | /service-workers/service-worker/skip-waiting-using-registration.https.html | Test skipWaiting while a client is using the registration - assert_equals: Controller state should be activating expected "activating" but got "activated"
15:36:20     INFO - saw_controllerchanged<@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:25:11
15:36:20     INFO - promise callback*@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:14:33
15:36:20     INFO - Test.prototype.step@https://web-platform.test:8443/resources/testharness.js:1489:20

This intermittent is due to bug 1293277.  Unfortunately there is not a good solution to fix it until we finish moving SWM to the parent process.

When skipWaiting() activates a waiting worker it requires us to:

1. Post a task to fire the controller change event.
2. Post a task to update the worker's state to activated.

Posting two tasks like this guarantees ordering.

After bug 1293277, though, step (1) requires us to round trip some IPC through the parent-process.  The worker state update, though, is still just a normal runnable.  This means sometimes the ordering is incorrect.
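
For illustration only, here is a minimal sketch of the ordering problem described above. It is a toy model and not Gecko code: the single FIFO queue, the function names, and the way the IPC round trip is modelled (the event dispatch is re-queued behind whatever is already pending) are all assumptions. It only shows the losing interleaving; in practice the race is intermittent.

```python
from collections import deque


def run(controllerchange_needs_ipc: bool) -> list[str]:
    """Return the order in which the page observes the two effects."""
    observed: list[str] = []
    queue: deque = deque()  # toy FIFO task queue for the content process

    def fire_controllerchange() -> None:
        observed.append("controllerchange")

    def set_state_activated() -> None:
        observed.append("state=activated")

    def ipc_round_trip() -> None:
        # Assumption: the reply from the parent process arrives later, so
        # the real event dispatch lands behind tasks that are already queued.
        queue.append(fire_controllerchange)

    # Step 1, then step 2, posted in that order.
    queue.append(ipc_round_trip if controllerchange_needs_ipc else fire_controllerchange)
    queue.append(set_state_activated)

    while queue:
        queue.popleft()()
    return observed


print(run(controllerchange_needs_ipc=False))
print(run(controllerchange_needs_ipc=True))
```

In the second run the state update is observed before the controllerchange event, which corresponds to the test seeing "activated" where it expected "activating".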

Eventually we will move the SWM to the parent process and the state update will also require IPC.  That should fix this.

Note, guaranteeing ordering like this, though, may be another reason to use PBackground for ServiceWorkerManager actors.  FYI :asuth.
Priority: P5 → P3
In the last 7 days we have 30 failures.
They occur on Linux (opt), Linux x64 (pgo), linux32-stylo-disabled, linux64-stylo-disabled (opt), macosx64-nightly (opt), OS X 10.10 (opt), and Windows 7 (opt, pgo).

Recent log example: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=162512713&lineNumber=7204

and a relevant part of it:

13:08:19     INFO - TEST-UNEXPECTED-FAIL | /service-workers/service-worker/skip-waiting-using-registration.https.html | Test skipWaiting while a client is using the registration - assert_equals: Controller state should be activating expected "activating" but got "activated"
13:08:19     INFO - saw_controllerchanged<@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:25:11
13:08:19     INFO - promise callback*@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:14:33
13:08:19     INFO - Test.prototype.step@https://web-platform.test:8443/resources/testharness.js:1494:20
13:08:19     INFO - promise callback*promise_test@https://web-platform.test:8443/resources/testharness.js:539:31
13:08:19     INFO - @https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:9:1
13:08:19     INFO - TEST-OK | /service-workers/service-worker/skip-waiting-using-registration.https.html | took 318ms
Flags: needinfo?(mdaly)
Whiteboard: [stockwell needswork]
Please see comment 2.  I have already investigated this failure and it needs our multi-e10s refactor to be fixed.
Flags: needinfo?(mdaly)
Priority: P3 → P2
is there someone working on the multi-e10s work?  I don't see a specific bug we are waiting on
Flags: needinfo?(bkelly)
Bug 1231208.  I can't set the dependency without creating a cycle, etc.
Flags: needinfo?(bkelly)
There have been 34 failures in the last week.

Failures per platform:
- OS X 10.10: 22
- Linux x64: 7
- linux32-stylo-disabled: 3
- linux64-stylo-disabled: 1
- Linux: 1

Failures per build type:
- opt: 31
- pgo: 3

Here is a recent log file and a snippet with the failure:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=165689405&lineNumber=6455

16:28:12     INFO - TEST-START | /service-workers/service-worker/skip-waiting-using-registration.https.html
16:28:13     INFO - PID 1281 | 
16:28:13     INFO - PID 1281 | ###!!! [Parent][DispatchAsyncMessage] Error: PClientSourceOp::Msg___delete__ Route error: message sent to unknown actor ID
16:28:13     INFO - PID 1281 | 
16:28:13     INFO - 
16:28:13     INFO - TEST-UNEXPECTED-FAIL | /service-workers/service-worker/skip-waiting-using-registration.https.html | Test skipWaiting while a client is using the registration - assert_equals: Controller state should be activating expected "activating" but got "activated"
16:28:13     INFO - saw_controllerchanged<@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:25:11
16:28:13     INFO - promise callback*@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:14:33


Waiting for a fix on Bug 1231208.
In the last 7 days, we have 54 failures.
They occur on Linux32, Linux x64, linux32-stylo-disabled, OS X 10.10, and Windows 7. The affected build types are opt, pgo, and debug.

Recent log example: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=167460047&lineNumber=6617

and a relevant part of it:

11:20:36     INFO - TEST-START | /service-workers/service-worker/skip-waiting-using-registration.https.html
11:20:37     INFO - 
11:20:37     INFO - TEST-UNEXPECTED-FAIL | /service-workers/service-worker/skip-waiting-using-registration.https.html | Test skipWaiting while a client is using the registration - assert_equals: Controller state should be activating expected "activating" but got "activated"
11:20:37     INFO - saw_controllerchanged<@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:25:11
11:20:37     INFO - promise callback*@https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:14:33
11:20:37     INFO - Test.prototype.step@https://web-platform.test:8443/resources/testharness.js:1494:20
11:20:37     INFO - promise_test/tests.promise_tests<@https://web-platform.test:8443/resources/testharness.js:543:27
11:20:37     INFO - promise callback*promise_test@https://web-platform.test:8443/resources/testharness.js:539:31
11:20:37     INFO - @https://web-platform.test:8443/service-workers/service-worker/skip-waiting-using-registration.https.html:9:1

Waiting for Bug 1231208 to be fixed.
:jmaher, should we disable this test until Bug 1231208 is solved?
Flags: needinfo?(jmaher)
yes, let's do that!
Flags: needinfo?(jmaher)
:jmaher, could you please double check the windows version? I am not sure about it.
Thank you.
Assignee: nobody → aiakab
Attachment #8959761 - Flags: review?(jmaher)
Comment on attachment 8959761 [details] [diff] [review]
Disabled on Linux, OSX opt, Windows 7 pgo

Review of attachment 8959761 [details] [diff] [review]:
-----------------------------------------------------------------

windows version is correct, but we do not have options for 'opt' or 'pgo'

::: testing/web-platform/meta/service-workers/service-worker/skip-waiting-using-registration.https.html.ini
@@ +1,4 @@
> +[skip-waiting-using-registration.https.html]
> +  disabled:
> +    if (os == "linux"): https://bugzilla.mozilla.org/show_bug.cgi?id=1425175
> +    if opt and (os == "mac"): https://bugzilla.mozilla.org/show_bug.cgi?id=1425175

!debug

@@ +1,5 @@
> +[skip-waiting-using-registration.https.html]
> +  disabled:
> +    if (os == "linux"): https://bugzilla.mozilla.org/show_bug.cgi?id=1425175
> +    if opt and (os == "mac"): https://bugzilla.mozilla.org/show_bug.cgi?id=1425175
> +    if pgo and (os == "win") and (version == "6.1.7600"): https://bugzilla.mozilla.org/show_bug.cgi?id=1425175

!debug
Attachment #8959761 - Flags: review?(jmaher) → review-
Whatever bug gets filed to re-enable these tests, please make it block bug 1231208.  Thanks.
Modified to use "not debug". I searched Searchfox (https://tinyurl.com/ycuas322) and found only "not debug", never "! debug", in the web-platform tests.
Attachment #8962017 - Flags: review?(jmaher)
Comment on attachment 8962017 [details] [diff] [review]
Disabled Linux, OSX opt, Windows 7 pgo platforms

Review of attachment 8962017 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/web-platform/meta/service-workers/service-worker/skip-waiting-using-registration.https.html.ini
@@ +1,2 @@
> +[skip-waiting-using-registration.https.html]
> +  disabled

you need a : after disabled, here is an example:
https://searchfox.org/mozilla-central/source/testing/web-platform/meta/IndexedDB/idbindex_getAllKeys.html.ini#2
Attachment #8962017 - Flags: review?(jmaher) → review-
Updated the patch.
Attachment #8962575 - Flags: review?(jmaher)
Attachment #8959761 - Attachment is obsolete: true
Attachment #8962017 - Attachment is obsolete: true
Comment on attachment 8962575 [details] [diff] [review]
Disabled Linux, OSX opt, Windows 7 pgo platforms

Review of attachment 8962575 [details] [diff] [review]:
-----------------------------------------------------------------

thanks!
Attachment #8962575 - Flags: review?(jmaher) → review+
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e0edfda61ef2
skip-waiting-using-registration.https.html.ini disabled on Linux, OSX opt, Windows 7 pgo platforms r=jmaher
Keywords: checkin-needed
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Any idea why we're still seeing Win7 failures on trunk here?
Flags: needinfo?(jmaher)
ack, we check for version == "6.1.7600", not version == "6.1.7601".

Eliza, can you fix the patch?
Flags: needinfo?(jmaher) → needinfo?(ebalazs)
I changed it to version == "6.1.7601".
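
For illustration only, a small sketch of why the earlier condition never matched on the Win7 machines. This is not wptrunner's implementation: the helper `win7_rule` and the `run_info` dictionary are hypothetical, and the `not debug` and `os == "win"` parts of the condition are assumed from the review comments above. The point is just that the version is compared as an exact string.

```python
def win7_rule(run_info: dict, version: str) -> bool:
    """Hypothetical stand-in for a 'disabled' condition in the .ini file."""
    return (
        not run_info["debug"]
        and run_info["os"] == "win"
        and run_info["version"] == version  # exact string match, no prefix logic
    )


# Assumption: the Win7 test machines report version "6.1.7601".
win7_machine = {"os": "win", "version": "6.1.7601", "debug": False}

print(win7_rule(win7_machine, "6.1.7600"))  # condition in the first landed patch
print(win7_rule(win7_machine, "6.1.7601"))  # condition after the fix
```

With "6.1.7600" the rule evaluates to false on such a machine, so the test keeps running there; with "6.1.7601" it evaluates to true and the test is disabled.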
Flags: needinfo?(ebalazs)
Attachment #8967256 - Flags: review?(jmaher)
Attachment #8962575 - Attachment is obsolete: true
Comment on attachment 8967256 [details] [diff] [review]
Skipped on Linux, OSX opt, Windows 7 pgo platforms

Review of attachment 8967256 [details] [diff] [review]:
-----------------------------------------------------------------

thanks
Attachment #8967256 - Flags: review?(jmaher) → review+
Pushed by ebalazs@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b02cae8c720e
Disable skip-waiting-using-registration.https.html on Linux, OSX opt, Windows 7 pgo platforms. r=jmaher
Any update on this?
Flags: needinfo?(aiakab)
@Marion Daly 

These failed on Windows 10 and osx-10 debug, for which the test is not disabled. It is too early to say that this needs disabling, since it has failed only once on the mentioned platforms.

:jmaher what do you think?
Flags: needinfo?(aiakab) → needinfo?(jmaher)
as we have 2 failures in the last 30 days, I don't think there is a next step here other than to fix the test and reenable it where we have disabled it.
Flags: needinfo?(jmaher)

This is failing constantly on android-em-7-0-x86_64 opt & debug, macosx1014-64 & windows7-32 debug: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-07-21&endday=2019-08-20&tree=trunk&bug=1425175
Since there has been no action here for a long time and the test was disabled on other platforms long ago, can we make this skip-if = true?
Joel, what do you think?

Flags: needinfo?(jmaher)

I think we can make this:
linux
mac
android

and leave windows alone?

In general this should be:

expected:
  if os == "mac": ["FAIL", "PASS"]
  if os == "android": ["FAIL", "PASS"]

and remove the mac line from disabled. This gets us from disabled to finding crashes or timeouts.
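
For illustration only, a simplified sketch of what a list-valued expectation buys us compared with disabling. This is a toy model, not the harness's actual logic: the helper `is_unexpected` is hypothetical, and the set of status strings tried below is an assumption.

```python
def is_unexpected(status: str, expected: list[str]) -> bool:
    """A result is reported only if its status is not in the expected list."""
    return status not in expected


# Expectation proposed above for mac and android.
expected = ["FAIL", "PASS"]

for status in ("PASS", "FAIL", "TIMEOUT", "CRASH"):
    label = "unexpected" if is_unexpected(status, expected) else "expected"
    print(f"{status}: {label}")
```

Under this model the known intermittent FAIL no longer turns the job orange, while a TIMEOUT or CRASH would still be flagged, which a disabled test can never do.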

Flags: needinfo?(jmaher)

See comment 85 re this now being perma-failing. There's probably a test bug that our recent changes have made more consistently fail.

Flags: needinfo?(perry)
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7d3f524bc6a6
Update expectations because of frequent failures. r=jmaher
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b1c39cc82f19
Adjust test expectations. a=test-only

This bug failed 89 times in the last 7 days. Occurs on android-em-7-0-x86_64, macosx1014-64 and macosx1014-64-shippable on debug and opt build types.

Recent log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=263725139&repo=autoland&lineNumber=12462

Csabou: Could you please take a look at this bug? It looks to me that the failure is still frequent.

Flags: needinfo?(csabou)

The expectations here were recently changed with: https://hg.mozilla.org/mozilla-central/diff/6fdb1282586def80e1a0d8c4c85f84e44a860b66/testing/web-platform/meta/service-workers/service-worker/skip-waiting-using-registration.https.html.ini

I'll reinstate those expectations in a patch and send it to jgraham for review.

Flags: needinfo?(csabou)
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dc8cd8acd5ad
Update test expectations that were deleted with the wpt-sync. r=jgraham
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b880da0d308d
Fix subtest expectations. a=test-only

Clearing ni? because I won't be able to look at this right now.

Flags: needinfo?(perry)

:aiakab, are you still planning to work on this? Is there anything actionable left to do?

Flags: needinfo?(aiakab)