Perma Assertion failure: mDocGroup->MatchesKey(docGroupKey), at /builds/worker/checkouts/gecko/dom/base/Document.cpp:3752 when Gecko 78 merges to Beta on 2020-06-01
Categories
(Core :: DOM: postMessage, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox76 | --- | unaffected |
firefox77 | --- | unaffected |
firefox78 | + | fixed |
firefox79 | --- | verified |
People
(Reporter: aryx, Assigned: tt)
References
(Regression)
Details
(Keywords: intermittent-failure, regression)
Attachments
(2 files, 1 obsolete file)
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
central-as-beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=a7415c718528d68105906a186ba8a78a05e4fc43&selectedTaskRun=biHcQ4jMRjC8ncoG25OKAg-0
Haven't verified it's from bug 1601594.
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=304256296&repo=try
[task 2020-05-29T12:48:37.075Z] 12:48:37 INFO - TEST-START | /workers/postMessage_clone_port.htm
[task 2020-05-29T12:48:37.076Z] 12:48:37 INFO - Clearing pref browser.tabs.remote.useCrossOriginOpenerPolicy
[task 2020-05-29T12:48:37.092Z] 12:48:37 INFO - Clearing pref browser.tabs.remote.useCrossOriginEmbedderPolicy
[task 2020-05-29T12:48:37.095Z] 12:48:37 INFO - Clearing pref javascript.options.shared_memory
[task 2020-05-29T12:48:37.102Z] 12:48:37 INFO - Clearing pref dom.postMessage.sharedArrayBuffer.withCOOP_COEP
[task 2020-05-29T12:48:37.118Z] 12:48:37 INFO - Closing window 45
[task 2020-05-29T12:48:37.139Z] 12:48:37 INFO - PID 3660 | Assertion failure: mDocGroup->MatchesKey(docGroupKey), at /builds/worker/checkouts/gecko/dom/base/Document.cpp:3752
[task 2020-05-29T12:48:37.170Z] 12:48:37 INFO - STDOUT: Initializing stack-fixing for the first stack frame, this may take a while...
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - Traceback (most recent call last):
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/tests/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py", line 126, in run_test
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - return self.executor.run_test(test)
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/tests/web-platform/tests/tools/wptrunner/wptrunner/executors/base.py", line 255, in run_test
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - self.on_environment_change(test.environment)
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/tests/web-platform/tests/tools/wptrunner/wptrunner/executors/executormarionette.py", line 707, in on_environment_change
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - self.protocol.testharness.load_runner(new_environment["protocol"])
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/tests/web-platform/tests/tools/wptrunner/wptrunner/executors/executormarionette.py", line 136, in load_runner
[task 2020-05-29T12:48:37.600Z] 12:48:37 ERROR - self._close_windows()
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/tests/web-platform/tests/tools/wptrunner/wptrunner/executors/executormarionette.py", line 176, in _close_windows
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - self.marionette.switch_to_window(runner_handle)
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1361, in switch_to_window
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - {"focus": focus, "name": handle, "handle": handle})
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py", line 26, in _
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - return func(*args, **kwargs)
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 602, in _send_message
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - self._handle_error(err)
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 622, in _handle_error
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - raise errors.lookup(error)(message, stacktrace=stacktrace)
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - NoSuchWindowException: Unable to locate window: 7
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - stacktrace:
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - WebDriverError@chrome://marionette/content/error.js:175:5
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - NoSuchWindowError@chrome://marionette/content/error.js:409:5
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - GeckoDriver.prototype.switchToWindow@chrome://marionette/content/driver.js:1602:11
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - despatch@chrome://marionette/content/server.js:305:40
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - execute@chrome://marionette/content/server.js:275:16
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - onPacket/<@chrome://marionette/content/server.js:248:20
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - onPacket@chrome://marionette/content/server.js:249:9
[task 2020-05-29T12:48:37.601Z] 12:48:37 ERROR - _onJSONObjectReady/<@chrome://marionette/content/transport.js:501:20
Reporter | ||
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 1•4 years ago
|
||
Hitting this assertion means the document belongs to two different doc groups. And it's probably because the value CrossOriginIsolated
gets change during the lifetime of the document.
I can reproduce this issue with ./mach web-platform-tests --headless testing/web-platform/tests/workers/postMessage_block.https.html testing/web-platform/tests/workers/postMessage_clone_port.htm
and central-as-beta sim in my local machine.
The same setting and command didn't hit the assertion for Nightly, so I lean to this is an issue related to Nighly only stuff.
postMessage_block.https.html
itself has headers (COOP, COEP) set and thus it should be run in webCoop_Coep process.
postMessage_clone_port.htm
itself doesn't have headers set.
Only ./mach web-platform-tests --headless testing/web-platform/tests/workers/postMessage_block.https.html
or ./mach web-platform-tests --headless testing/web-platform/tests/workers/postMessage_clone_port.htm
doesn't hit the assertion
The interesting frames are that:
0:57.45 TEST_START: /workers/postMessage_block.https.html
...
1:29.18 TEST_END: Test OK. Subtests passed 1/1. Unexpected 0
1:29.18 TEST_START: /workers/postMessage_clone_port.htm
1:29.19 INFO Clearing pref browser.tabs.remote.useCrossOriginOpenerPolicy
1:29.30 INFO Clearing pref browser.tabs.remote.useCrossOriginEmbedderPolicy
1:29.37 INFO Clearing pref javascript.options.shared_memory
1:29.44 INFO Clearing pref dom.postMessage.sharedArrayBuffer.withCOOP_COEP
1:29.54 INFO Closing window 28
1:29.79 pid:30091 Assertion failure: mDocGroup->MatchesKey(docGroupKey), at /home/shes050117/Work/mozilla-central/dom/base/Document.cpp:3752
I wonder if the cause can be changing the prefs after the next test starts. If so, we can enable the prefs for all the tests in the directory to temporarily stop the crash and looking into the issue. (we still need to find the root cause, though)
I wonder if it's something like:
postMessage_block.https.html
creates a webCoop_Coep process and use it for the test
postMessage_clone_port.htm
reuses the process but it shouldn't because it doesn't have the headers set.
Anyway, I have recorded it with rr and will take a closer look next week.
Comment hidden (Intermittent Failures Robot) |
Updated•4 years ago
|
Assignee | ||
Comment 3•4 years ago
|
||
// Hit the assertion
(rr) p docGroupKey
$1 = {<nsTString<char>> = "https://web-platform.test", static kStorageSize = 64, mInlineCapacity = 63,
mStorage = "https://web-platform.test\000", '\344' <repeats 15 times>, "䪪", '\252' <repeats 20 times>}
(rr) p NodePrincipal()->mURI->GetSpecOrDefault()
$2 = "https://web-platform.test:8443/workers/postMessage_block.https.html"
(rr) p GetBrowsingContext()->CrossOriginIsolated()
$3 = false
(rr) p GetBrowsingContext()->GetOpenerPolicy()
$6 = (const nsILoadInfo::CrossOriginOpenerPolicy &) @0x7f826111c1f8: nsILoadInfo::OPENER_POLICY_SAME_ORIGIN_EMBEDDER_POLICY_REQUIRE_CORP
(rr) p ContentChild::GetSingleton()->GetRemoteType()
$7 = (const nsAString &) @0x7f82808ca888: {<mozilla::detail::nsTStringRepr<char16_t>> = {mData = 0x7f826128bd88 u"webCOOP+COEP=https://web-platform.test", mLength = 38,
mDataFlags = (mozilla::detail::StringDataFlags::TERMINATED | mozilla::detail::StringDataFlags::REFCOUNTED), mClassFlags = mozilla::detail::StringClassFlags::NULL_TERMINATED},
static kMaxCapacity = 1073741817}
So, it looks like the document for postMessage_block
is about to be destructed. However, the value of CrossOriginIsolated()
was false
(which was expected to be true
) and thus the docGroupKey
was the value of SiteOrigin
rather than Origin
of the NodePrincipal
. So that it hit the assertion.
In short, the issue is that the pref changes during the lifetime of a document. It didn't matter before bug 1601594 because it only affects the result of CrossOriginIsolated()
. However, we link CrossOriginIsolated
and DocGroup
together in bug 1601594 and thus we hit this assertion here.
I guess we can fix this by only evaluating the pref when the browser starts so that the value of CrossOriginIsolated
won't be affected by changing the value the pref (dom.postMessage.sharedArrayBuffer.withCOOP_COEP
).
Assignee | ||
Comment 4•4 years ago
|
||
(In reply to Tom Tung [:tt, :ttung] from comment #3)
I guess we can fix this by only evaluating the pref when the browser starts so that the value of
CrossOriginIsolated
won't be affected by changing the value the pref (dom.postMessage.sharedArrayBuffer.withCOOP_COEP
).
It seems that this approach doesn't work well with the current web-platform test since that appears to set the pref after the browser startup.
I'm considering caching the pref value on the cpp object itself instead.
Assignee | ||
Comment 5•4 years ago
|
||
Assignee | ||
Comment 6•4 years ago
|
||
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4530435be2c4ad29121f0b7b129664ee3828faad
From the result of that, it seems that the issue is fixed by the patch.
Assignee | ||
Comment 7•4 years ago
|
||
Assignee | ||
Comment 8•4 years ago
|
||
Depends on D78282
Assignee | ||
Comment 9•4 years ago
|
||
try for make the pref unchangeable: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9c4cc2b011d74f95a7289e91cae7abbcf2afcde6
Updated•4 years ago
|
Comment hidden (Intermittent Failures Robot) |
Reporter | ||
Updated•4 years ago
|
Comment 11•4 years ago
|
||
Comment 12•4 years ago
|
||
Backed out for perma faliures.
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=305588982&repo=autoland&lineNumber=12989
Backout: https://hg.mozilla.org/integration/autoland/rev/5ba531c2c1b78182a00f38862e9481a6ee793ce2
Comment 13•4 years ago
|
||
Assignee | ||
Comment 14•4 years ago
|
||
Ah, it's probably because I removed setting dom.postMessage.sharedArrayBuffer.bypassCOOP_COEP.insecure.enabled
to true
in the second patch and I shouldn't. I will verify that. Sorry for this.
Comment 15•4 years ago
|
||
Reporter | ||
Comment 16•4 years ago
|
||
Thank you for fixing this. Please request beta approval.
Assignee | ||
Comment 17•4 years ago
|
||
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #16)
Thank you for fixing this. Please request beta approval.
I will tomorrow (just to ensure there is no other issue comes out)
Comment 18•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/aa72c8fda60e
https://hg.mozilla.org/mozilla-central/rev/156de44a647f
Assignee | ||
Comment 19•4 years ago
|
||
Comment on attachment 9154188 [details]
Bug 1641874 - Update test expectations;
Beta/Release Uplift Approval Request
- User impact if declined: If they flip the pref during runtime, it might cause a document to find an incorrect doc group. Note that this wouldn't affect most users/websites. It should only affect websites that have applied COOP and COEP stuff.
Note that we don't have a test to directly test this behavior, but we flip the pref in wpt tests at runtime. So I put "unknown" below.
- Is this code covered by automated tests?: Unknown
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): It only makes the pref be evaluated once at startup so should not be risky.
- String changes made/needed:
Assignee | ||
Updated•4 years ago
|
Comment 20•4 years ago
|
||
Comment on attachment 9154187 [details]
Bug 1641874 - Make the pref unchangeable at runtime;
approved for 78.0b6
Updated•4 years ago
|
Comment 21•4 years ago
|
||
bugherder uplift |
Reporter | ||
Comment 22•4 years ago
|
||
Verified fixed in latest beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Crunnable&revision=41d33c6c9caa97604eb73694179e1ec0d5d426f7
Comment hidden (Intermittent Failures Robot) |
Updated•4 years ago
|
Description
•