Intermittent toolkit/components/antitracking/test/browser/browser_partitionedMessaging.js | We have first-party set on storagePrincipal - "" == "example.net" -
Categories
(Core :: Privacy: Anti-Tracking, defect, P2)
Tracking
()
Fission Milestone | M4.1 |
People
(Reporter: intermittent-bug-filer, Assigned: jya)
References
(Regression)
Details
(Keywords: intermittent-failure, regression, Whiteboard: [stockwell disabled])
Attachments
(10 files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
Filed by: nbeleuzu [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=270940365&repo=autoland
Full log: https://queue.taskcluster.net/v1/task/Dim-wOaiS8yVOqCD2B3vyQ/runs/0/artifacts/public/logs/live_backing.log
[task 2019-10-11T21:16:19.292Z] 21:16:19 INFO - TEST-PASS | toolkit/components/antitracking/test/browser/browser_partitionedMessaging.js | We don't have first-party set on nodePrincipal - "" == "" -
[task 2019-10-11T21:16:19.293Z] 21:16:19 INFO - Buffered messages finished
[task 2019-10-11T21:16:19.296Z] 21:16:19 INFO - TEST-UNEXPECTED-FAIL | toolkit/components/antitracking/test/browser/browser_partitionedMessaging.js | We have first-party set on storagePrincipal - "" == "example.net" -
[task 2019-10-11T21:16:19.297Z] 21:16:19 INFO - Stack trace:
[task 2019-10-11T21:16:19.298Z] 21:16:19 INFO - resource://testing-common/content-task.js line 110 > eval:ifr.onload:13
[task 2019-10-11T21:16:19.299Z] 21:16:19 INFO - Sending code to the 3rd party content
[task 2019-10-11T21:16:19.300Z] 21:16:19 INFO - GECKO(5691) | ++DOCSHELL 0x7f29dd87c800 == 7 [pid = 5818] [id = {e693c255-1ac0-47a0-903a-69ff4b885b24}]
[task 2019-10-11T21:16:19.301Z] 21:16:19 INFO - GECKO(5691) | ++DOMWINDOW == 20 (0x7f29df202f20) [pid = 5818] [serial = 768] [outer = (nil)]
[task 2019-10-11T21:16:19.302Z] 21:16:19 INFO - GECKO(5691) | ++DOMWINDOW == 21 (0x7f29dfc58c00) [pid = 5818] [serial = 769] [outer = 0x7f29df202f20]
[task 2019-10-11T21:16:19.303Z] 21:16:19 INFO - GECKO(5691) | ++DOMWINDOW == 22 (0x7f29dfc5c400) [pid = 5818] [serial = 770] [outer = 0x7f29df202f20]
[task 2019-10-11T21:16:19.304Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 36 (0x7f102e922980) [pid = 5763] [serial = 198] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html]
[task 2019-10-11T21:16:19.306Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 35 (0x7f102e9225c0) [pid = 5763] [serial = 195] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html]
[task 2019-10-11T21:16:19.307Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 34 (0x7f102e923e20) [pid = 5763] [serial = 208] [outer = (nil)] [url = http://example.net/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.308Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 33 (0x7f102e922b60) [pid = 5763] [serial = 203] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html]
[task 2019-10-11T21:16:19.309Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 32 (0x7f102f07fd40) [pid = 5763] [serial = 227] [outer = (nil)] [url = http://example.net/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.310Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 31 (0x7f102f07f020) [pid = 5763] [serial = 217] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html?0.042698885081182025]
[task 2019-10-11T21:16:19.312Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 30 (0x7f102e9227a0) [pid = 5763] [serial = 192] [outer = (nil)] [url = http://example.net/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.313Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 29 (0x7f102e922d40) [pid = 5763] [serial = 200] [outer = (nil)] [url = http://example.net/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.314Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 28 (0x7f102fbca7a0) [pid = 5763] [serial = 187] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.315Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 27 (0x7f102f07f980) [pid = 5763] [serial = 221] [outer = (nil)] [url = http://example.net/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.316Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 26 (0x7f102f07f3e0) [pid = 5763] [serial = 214] [outer = (nil)] [url = http://example.net/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.319Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 25 (0x7f102fbca5c0) [pid = 5763] [serial = 184] [outer = (nil)] [url = http://example.net/browser/toolkit/components/antitracking/test/browser/page.html]
[task 2019-10-11T21:16:19.320Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 24 (0x7f102f07fb60) [pid = 5763] [serial = 230] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html]
[task 2019-10-11T21:16:19.321Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 23 (0x7f102e922f20) [pid = 5763] [serial = 206] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html]
[task 2019-10-11T21:16:19.322Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 22 (0x7f102e9223e0) [pid = 5763] [serial = 190] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html]
[task 2019-10-11T21:16:19.324Z] 21:16:19 INFO - GECKO(5691) | --DOMWINDOW == 21 (0x7f102f07f7a0) [pid = 5763] [serial = 224] [outer = (nil)] [url = http://not-tracking.example.com/browser/toolkit/components/antitracking/test/browser/localStorageEvents.html]
[task 2019-10-11T21:16:19.326Z] 21:16:19 INFO - Console message: [JavaScript Warning: "Request to access cookie or storage on “http://tracking.example.org/browser/toolkit/components/antitracking/test/browser/3rdPartyStorage.html” was blocked because it came from a tracker and content blocking is enabled." {file: "http://example.net/browser/toolkit/components/antitracking/test/browser/page.html" line: 27 column: 19 source: "http://tracking.example.org/browser/toolkit/components/antitracking/test/browser/3rdPartyStorage.html"}]
[task 2019-10-11T21:16:19.327Z] 21:16:19 INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://tracking.example.org/browser/toolkit/components/antitracking/test/browser/3rdPartyStorageWO.html" line: 0}]
[task 2019-10-11T21:16:19.329Z] 21:16:19 INFO - GECKO(5691) | ++DOCSHELL 0x7f29dd7e6000 == 8 [pid = 5818] [id = {2862571f-3ea3-46b7-aec5-7330853a8afa}]
[task 2019-10-11T21:16:19.331Z] 21:16:19 INFO - GECKO(5691) | ++DOMWINDOW == 23 (0x7f29df203100) [pid = 5818] [serial = 771] [outer = (nil)]
[task 2019-10-11T21:16:19.332Z] 21:16:19 INFO - GECKO(5691) | ++DOMWINDOW == 24 (0x7f29dfc60000) [pid = 5818] [serial = 772] [outer = 0x7f29df203100]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 8•5 years ago
|
||
A pernosco trace for this is here: https://pernos.co/debug/byaKmg98a4Os1nUqeI0bhA/index.html#f{m[A9QH,B5f4_,t[AYI,CPY_,f{e[A9QH,B5fu_,s{af1ybykAA,bASE,oGaz+GQ,uGaqIuQ___
It looks like we depend on the channel being classified from within HttpChannelChild::OnStartRequest. Stack for that is this:
::HttpChannelChild::RecvOnStartRequest () at /HttpChannelChild.cpp:488
::ChannelEventQueue::RunOrEnqueue () at /ChannelEventQueue.h:210
::StartRequestEvent::Run (this=0x7f5c94eb5000) at /HttpChannelChild.cpp:426
::HttpChannelChild::OnStartRequest () at /HttpChannelChild.cpp:607
::HttpChannelChild::DoOnStartRequest () at /HttpChannelChild.cpp:682
nsDocumentOpenInfo::OnStartRequest () at /nsURILoader.cpp:292
nsDocumentOpenInfo::DispatchContent () at /nsURILoader.cpp:413
nsDocumentOpenInfo::TryContentListener () at /nsURILoader.cpp:741
nsDSURIContentListener::DoContent () at /nsDSURIContentListener.cpp:184
nsDocShell::CreateContentViewer () at /nsDocShell.cpp:7873
nsDocShell::NewContentViewerObj () at /nsDocShell.cpp:8128
nsContentDLF::CreateInstance ()
nsContentDLF::CreateDocument () at /nsContentDLF.cpp:329
nsHTMLDocument::StartDocumentLoad () at /nsHTMLDocument.cpp:487
::Document::StartDocumentLoad () at /Document.cpp:2960
nsHTMLDocument::Reset () at /nsHTMLDocument.cpp:178
::Document::Reset () at /Document.cpp:2310
nsScriptSecurityManager::GetChannelResultPrincipals () at /nsScriptSecurityManager.cpp:283
::StoragePrincipalHelper::Create () at /StoragePrincipalHelper.cpp:80
::(anonymous namespace)::ChooseOriginAttributes () at /StoragePrincipalHelper.cpp:21
On the parent side, we receive HttpChannelParent::NotifyClassificationFlags before HttpChannelParent::OnStartRequest, but the notification flags go over the background channel, and OnStartRequest stays on the main-thread.
Looks like this was originally added in bug 1015466, to ensure that the flags arrived before OnStopRequest (which is also on the bg channel). It looks like the classification process is async, so it does seem possible that it could be later.
I think the regression here is that NotifyClassificationFlags is now delivered to DocumentChannelParent, which then forwards it on to HttpChannelParent, immediately before OnStartRequest (in FinishReplacementChannelSetup). I think that means it was just an existing race, which we're now losing more often.
baku, any ideas what we should be doing here? Should we be trying to guarantee that classification is done before OnStartRequest (and then send it over the normal main-thread channel so that it arrives on the child before too)? Or should we not be reading classification status until OnStopRequest (and can we enforce that somehow?).
Updated•5 years ago
|
Comment 9•5 years ago
|
||
Hmm, it looks like nsHttpChannel::AsyncOpen will wait for the classification callback (but didn't when bug 1015466 was written?).
I guess that means we can depend on all classification messages arriving before OnStartRequest, so we should move the IPDL bits to the PHttpChannel class, so that guarantee extends to the content process.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•5 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment 16•5 years ago
|
||
Comment 17•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 18•5 years ago
|
||
PrepentEvent was already moving the argument passed by reference. This makes it clear that the object will be moved.
Assignee | ||
Comment 19•5 years ago
|
||
Remove the need to create separate runnable classes. It's far more readable and remove the need to duplicate lots of code.
We unfortunately need to capture "this" in a ref counter to get around the static analyzer complaining about capturing this by value, even thouch the ChannelEventQueue guarantees that this will outlive the event.
Depends on D52260
Assignee | ||
Comment 20•5 years ago
|
||
We can guarantee that those callbacks will occur prior the call to OnStartRequest.
But running them all on the PHttpChannel we remove the potential race between the PHttpChannel which runs on the main thread and PHttpBackroundChannel which runs on the necko thread.
Depends on D52261
Assignee | ||
Comment 21•5 years ago
|
||
The information is now passed to the parent channel once replacement is completed.
Depends on D52262
Assignee | ||
Comment 22•5 years ago
|
||
This reverts bug 1590608 and bug 1590649.
Depends on D52263
Assignee | ||
Comment 23•5 years ago
|
||
And amend comment on other bugs failing for the same reasons, linking to bug that is the cause.
Comment 24•5 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 28•5 years ago
|
||
Assignee | ||
Comment 29•5 years ago
|
||
Fix style consistency while at it and a couple of constness issues.
Depends on D52503
Comment hidden (Intermittent Failures Robot) |
Comment 31•5 years ago
|
||
Comment 32•5 years ago
|
||
Backed out for wpt leaks on ChannelEvent/ChannelEventQueue
Backout link: https://hg.mozilla.org/integration/autoland/rev/2f19e7b646e0a52fa855b75c868f0a3f3a990ad3
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=275958267&repo=autoland&lineNumber=102012
Assignee | ||
Comment 33•5 years ago
|
||
We have a cycle in mUnknownDecoderEventQ as it now contains a strong reference to itself and unless UnknownDecoderInvolvedOnStartRequestCalled is called, it will never be cleared.
Comment 34•5 years ago
|
||
(In reply to Jean-Yves Avenard [:jya] from comment #33)
We have a cycle in mUnknownDecoderEventQ as it now contains a strong reference to itself and unless UnknownDecoderInvolvedOnStartRequestCalled is called, it will never be cleared.
Comment 35•5 years ago
|
||
Jean-Yves - I will ask you one thing, please bisect at least which directory or even the particular test (or set of sub tests) cause the leak; I can't apply your patches locally on m-c with phab patch (even -a here
), so those will need a rebase. It's hard to understand the code paths and the patch may be tricky. This involves diversion and it's always fragile when touched.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 37•5 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #35)
Jean-Yves - I will ask you one thing, please bisect at least which directory or even the particular test (or set of sub tests) cause the leak; I can't apply your patches locally on m-c with phab patch (even
-a here
), so those will need a rebase. It's hard to understand the code paths and the patch may be tricky. This involves diversion and it's always fragile when touched.
I've never managed to get moz-phab to apply patches unfortunately.
Patches where based on central already.
The best is to use one of my try push.
In any case, removing the cycle due to HttpChannelChild::mUnknownDecoderEventQ will still cause leaks due to mEventQ not running some events / being cleared like here:
https://treeherder.mozilla.org/logviewer.html#?job_id=276141931&repo=try
leak at mozilla::net::HttpChannelChild::OnStopRequest, mozilla::net::ChannelEventQueue::FlushQueue, CompleteResume, mozilla::net::ChannelEventQueue::ResumeInterna
Those issues weren't used before because not running events or clearing the event queue wouldn't cause side effects.
Assignee | ||
Comment 38•5 years ago
|
||
:ehsan ; the static analyser will always complain whenever you are capturing a pointer to a refcounted object by value.
There are legitimate cases on which you would want to do so; like the ChannelEventQueue ; it guarantees that the object in the queue will not outlive its owner.
Could we add a way to bypass this error? it makes for frustrating time sometimes, as we now need to deal with reference cycles that shouldn't exist in the first place.
Comment 39•5 years ago
•
|
||
(In reply to Jean-Yves Avenard [:jya] from comment #38)
:ehsan ; the static analyser will always complain whenever you are capturing a pointer to a refcounted object by value.
There are legitimate cases on which you would want to do so; like the ChannelEventQueue ; it guarantees that the object in the queue will not outlive its owner.
Could we add a way to bypass this error? it makes for frustrating time sometimes, as we now need to deal with reference cycles that shouldn't exist in the first place.
Do you mean something effectively like UnsafePtr
in https://phabricator.services.mozilla.com/D52261 for https://hg.mozilla.org/integration/autoland/rev/8d2cf0fc6b9b#l1.35?
(How come the version of the patch that landed there was different than the one in Phabricator BTW?!)
If I understand what you want to achieve here correctly, then yes I think that should be easily possible. We could do so by adding an attribute hidden behind a macro that would allow you to write the code like this:
template <typename T>
NeckoTargetChannelFunctionEvent(T* aChild, std::function<void()>&& aCallback)
: ChannelFunctionEvent(
[MOZ_UNCHECKED child = aChild]() {
MOZ_ASSERT(child);
return child->GetNeckoTarget();
},
std::move(aCallback)) {}
Or something to that effect. That way usage of refcounted pointers with lambdas is safe by default and if you know the lifetime of your objects precisely and would like to opt out of the compiler-enforced safety checks you could do so and make it clear to the reader of the code that the specific lambda capture in question isn't being checked.
If this sounds good please file a bug and CC Andi, he should be able to help you out. :-)
Assignee | ||
Comment 40•5 years ago
|
||
This is blocking further fission work
Assignee | ||
Comment 41•5 years ago
|
||
nsTArray::InsertElementsAt is infallible. So the test checking that nullptr wasn't returned would always be false.
Seeing that queuing an event is also an infallible operation, there's no point for prepentEvents to be.
Assignee | ||
Comment 42•5 years ago
|
||
(In reply to :ehsan akhgari from comment #39)
(In reply to Jean-Yves Avenard [:jya] from comment #38)
:ehsan ; the static analyser will always complain whenever you are capturing a pointer to a refcounted object by value.
There are legitimate cases on which you would want to do so; like the ChannelEventQueue ; it guarantees that the object in the queue will not outlive its owner.
Could we add a way to bypass this error? it makes for frustrating time sometimes, as we now need to deal with reference cycles that shouldn't exist in the first place.
Do you mean something effectively like
UnsafePtr
in https://phabricator.services.mozilla.com/D52261 for https://hg.mozilla.org/integration/autoland/rev/8d2cf0fc6b9b#l1.35?
yes
(How come the version of the patch that landed there was different than the one in Phabricator BTW?!)
Because I had hoped to resolve the cycle first rather than getting around the problem. That a cycle cause a leak is the apparent effect of that event taskqueue not always running a task. The problem has now appeared to be more complex than first thought and I have no particular interest in fixing it anymore as it's been a long problem (even if the effect were not noticed). So instead I've created bug 1596295.
If I understand what you want to achieve here correctly, then yes I think that should be easily possible. We could do so by adding an attribute hidden behind a macro that would allow you to write the code like this:
template <typename T> NeckoTargetChannelFunctionEvent(T* aChild, std::function<void()>&& aCallback) : ChannelFunctionEvent( [MOZ_UNCHECKED child = aChild]() { MOZ_ASSERT(child); return child->GetNeckoTarget(); }, std::move(aCallback)) {}
yes, indeed that's precisely what I'm after; it makes it much easier to use automated capture with [=]
Or something to that effect. That way usage of refcounted pointers with lambdas is safe by default and if you know the lifetime of your objects precisely and would like to opt out of the compiler-enforced safety checks you could do so and make it clear to the reader of the code that the specific lambda capture in question isn't being checked.
If this sounds good please file a bug and CC Andi, he should be able to help you out. :-)
will do.
Comment 43•5 years ago
|
||
Comment hidden (Intermittent Failures Robot) |
Comment 45•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/2a6684436b37
https://hg.mozilla.org/mozilla-central/rev/a00f164555ff
https://hg.mozilla.org/mozilla-central/rev/2dda521c0591
https://hg.mozilla.org/mozilla-central/rev/e9ca1ee307d6
https://hg.mozilla.org/mozilla-central/rev/354253d41c6a
https://hg.mozilla.org/mozilla-central/rev/c70d60f8293c
https://hg.mozilla.org/mozilla-central/rev/693144fd8c54
https://hg.mozilla.org/mozilla-central/rev/0ba862a839d1
Comment 46•5 years ago
|
||
Comment 47•5 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Updated•5 years ago
|
Updated•5 years ago
|
Updated•3 years ago
|
Description
•