Closed Bug 1705910 Opened 3 years ago Closed 3 years ago

Navigating iframes between origins frequently fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210 (that warning being a downstream symptom, not the root of the issue)

Categories

(Core :: DOM: Navigation, defect, P3)

defect

Tracking

()

RESOLVED FIXED
91 Branch
Fission Milestone M8
Tracking Status
firefox91 --- fixed

People

(Reporter: dholbert, Assigned: kmag)

Details

Attachments

(2 files, 1 obsolete file)

EDIT: I'm adjusting comment 0 to include some STR:
(1) Import the attached patch which just adds a mochitest:
hg import https://bug1705910.bmoattachments.org/attachment.cgi?id=9216643
(2) ./mach build to pick up the new mochitest
(3) Run the included mochitest, with fission enabled:

./mach mochitest --headless --setpref "fission.autostart=true" dom/base/test/test_bug1705910.html

EXPECTED RESULTS:
mochitest should complete successfully (as it does if you run it without fission)

ACTUAL RESULTS:
The mochitest times out at some point, due to one of its iframe navigations having failed. Also, this warning seems to always appear just before that happens:

[Parent 1391, Main Thread] WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210

[My original bug description follows below.]


I'm filing this issue for a bug I ran into when investigating some test failures in bug 1703735.

Short version: occasionally, if you have <iframe src="http://example.org"> and then removeAttribute("src") and then set src to some other origin (e.g. http://example.com), the navigation fails, with this log-spew:

[task 2021-04-17T15:35:31.144Z] 15:35:31     INFO - GECKO(1544) | [Parent 1544, Main Thread] WARNING: IPC message discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:532
[task 2021-04-17T15:35:31.152Z] 15:35:31     INFO - GECKO(1544) | ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2021-04-17T15:35:31.152Z] 15:35:31     INFO - GECKO(1544) | ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2021-04-17T15:35:31.167Z] 15:35:31     INFO - GECKO(1544) | [Parent 1544, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/dom/security/FramingChecker.cpp:210

NOTE: at this point, the iframe's "load" event fires (in the other document); however, the iframe is blank, and it never runs JS code from the HTML file that we told it to load.

Then here's some more logging that appears just after that, in case it's useful:

[task 2021-04-17T15:35:31.443Z] 15:35:31     INFO - GECKO(1544) | [Child 1771, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3361
[task 2021-04-17T15:35:31.463Z] 15:35:31     INFO - GECKO(1544) | [Child 1771, Main Thread] WARNING: '!obs', file /builds/worker/checkouts/gecko/toolkit/components/sessionstore/RestoreTabContentObserver.cpp:58
[task 2021-04-17T15:35:31.483Z] 15:35:31     INFO - GECKO(1544) | [Child 1771, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4442
[task 2021-04-17T15:35:31.660Z] 15:35:31     INFO - GECKO(1544) | ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /tmp/tmpBdU2GQ.mozrunner/runtests_leaks_tab_pid1879.log
[task 2021-04-17T15:35:31.661Z] 15:35:31     INFO - GECKO(1544) | [1879, Main Thread] WARNING: XPCOM_MEM_BLOAT_LOG is set, disabling native allocations.: file /builds/worker/checkouts/gecko/tools/profiler/core/platform.cpp:254
[task 2021-04-17T15:35:31.736Z] 15:35:31     INFO - GECKO(1544) | [Child 1879, Main Thread] WARNING: could not set real-time limit in CubebUtils::InitLibrary: file /builds/worker/checkouts/gecko/dom/media/CubebUtils.cpp:619
[task 2021-04-17T15:35:36.783Z] 15:35:36     INFO - GECKO(1544) | [Child 1672, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/caps/BasePrincipal.cpp:1349

This comes from this Try run:
https://firefoxci.taskcluster-artifacts.net/cfYA0IMuR5KJST4fE5DnOg/0/public/logs/live_backing.log
https://treeherder.mozilla.org/jobs?repo=try&selectedTaskRun=cfYA0IMuR5KJST4fE5DnOg.0&revision=56b0517813e3efb0373cd9dc4b2fd49346a794c6&searchStr=debug%2Ctv-fis

I haven't been able to reproduce this issue locally (even with --verify), but it reproduces very reliably in this particular TV-fis treeherder job.

I'll post a patch here later to add a standalone testcase that reproduces the issue (when run in this TV-fix treeherder job).

[ni=me to simplify & post a testcase here that can be used to demonstrate the bug]

Flags: needinfo?(dholbert)
Attached patch patch with tedstcase (obsolete) — Splinter Review

Here's a patch with a mochitest to demonstrate the bug.

This test passes locally, but it fails on the Try test-verify (TV) task, with what looks like a navigation failing for some reason.

Here's a Try run, which I anticipate will have oranges showing this:
https://treeherder.mozilla.org/jobs?repo=try&revision=6531ca57494f4a5ded6961dd314abbdc35f60649

Assignee: nobody → dholbert
Flags: needinfo?(dholbert)

(er, didn't mean to self-assign.)

Assignee: dholbert → nobody

Here's a slightly better testcase.

Also: I found that I can actually reproduce this locally if I just manually enable the fission pref when starting mochitests. e.g. if I run:

./mach mochitest --setpref "fission.autostart=true" dom/base/test/test_bug1705910.html

...then the test times out partway through due to one of its iframe-navigation operations apparently failing.

Attachment #9216621 - Attachment is obsolete: true

Try run showing failures in this test, for reference:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=334c57ac83d841cdd01a7cb0130f2a36d8c8325d

In a debug build, the navigation that fails always seems to be accompanied by this diagnostic output:

[Parent 1391, Main Thread] WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210
Summary: [fission] Navigating iframes between origins occasionally fails → [fission] Navigating iframes between origins occasionally fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210

In addition to some unrelated intermittents, that Try run shows this test -- test_bug1705910.html -- failing [as shown by it timing out] in Mochitest-Fission runs on all platforms.

e.g. here's a mochitest-fission log from linux-debug showing the failure:
https://treeherder.mozilla.org/logviewer?job_id=336926105&repo=try&lineNumber=3990
...and a TV-fission log from linux-debug:
https://treeherder.mozilla.org/logviewer?job_id=336926085&repo=try&lineNumber=2548

Summary: [fission] Navigating iframes between origins occasionally fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210 → [fission] Navigating iframes between origins frequently fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210

dragana or ckerchb, perhaps one of you could take a look here?

Looks like the FramingChecker.cpp warning that's firing is in some code that you added/reviewed in bug 1593321, so I'm hoping maybe you could follow the breadcrumbs from there. (I'm betting the real failure is happening a bit earlier & the FramingChecker.cpp warning is just a symptom of whatever the real issue was.)

Flags: needinfo?(dd.mozilla)

GetResponseStatus() is failing to return a response status because mResponseHead is null. Why? A canceled navigation?

needinfo'ing kmag because he said will try to reproduce and grab a stack trace. He has seen these warnings without Fission, but perhaps Fission process switching makes this problem more likely.

Flags: needinfo?(kmaglione+bmo)

I am pretty sure we can not be sure that we have a mResponseHead. This is present only if a channel succeeds to get a response.

We probably cancel the channel that fires that OnStartReques notification and it is likely that mResponseHead is not present.

I think the debug assert is wrong and should be removed.
It would be worth taking a look at logs, it would be interested to look why it succeeds sometimes.

Moving to DOM: Security component because this appears to be a problem in dom/security/FramingChecker.cpp.

This bug is not Fission-only.

No longer blocks: fission
Component: DOM: Navigation → DOM: Security
Flags: needinfo?(kmaglione+bmo)
Summary: [fission] Navigating iframes between origins frequently fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210 → Navigating iframes between origins frequently fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210

(In reply to Chris Peterson [:cpeterson] from comment #10)

This bug is not Fission-only.

Are you sure / can you elaborate? My STR here show this failing in an extremely fission-dependent way.

(I could imagine this being an issue that's not-fission-specific-but-just-much-easier-to-trigger-with-fission -- maybe that's what you mean? -- but I'm not sure we've demonstrated/determined that...)

Flags: needinfo?(cpeterson)

Note that dragana suggested removing the debugging assertion (by which I think she means the WARNING: 'NS_FAILED(rv) logging that's attributed to FramingChecker.cpp).

But the core issue here isn't that logging -- the core issue is an iframe navigation failure, which is a user-visible issue that reproduces in both debug and opt builds, as shown by the "TV-fis" oranges on my Try run[1] with the attached test-patch.

[1] https://treeherder.mozilla.org/jobs?repo=try&revision=334c57ac83d841cdd01a7cb0130f2a36d8c8325d

(In reply to Daniel Holbert [:dholbert] from comment #11)

(In reply to Chris Peterson [:cpeterson] from comment #10)

This bug is not Fission-only.

Are you sure / can you elaborate? My STR here show this failing in an extremely fission-dependent way.

@ kmag: you said in our Fission bug triage meeting that you had seen some tests that reliably reproduced these same FramingChecker.cpp errors without Fission. Do you recall which tests those were?

Fission Milestone: --- → ?
Flags: needinfo?(cpeterson) → needinfo?(kmaglione+bmo)
Flags: needinfo?(ckerschb)

(In reply to Daniel Holbert [:dholbert] from comment #11)

Are you sure / can you elaborate? My STR here show this failing in an extremely fission-dependent way.

Daniel, thanks for providing the testcase - I can reproduce with fission enabled. At this point I am confident to say that the FramingChecker code in question. Is not causing the problem here. In fact, if the FramingChecker can not query the reponseHead then we return early, allowing the load to succeed.

Surely we could remove the NS_WARN_IF part based on Dragana's suggestion, but it's not the code that causes the frame navigation to be blocked.

Flags: needinfo?(ckerschb)

Based on Comment 14 I am moving this back to DOM: Navigation, but here to help investigate if People think it's still related to DomSec

Component: DOM: Security → DOM: Navigation
Summary: Navigating iframes between origins frequently fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210 → Navigating iframes between origins frequently fails, with WARNING: 'NS_FAILED(rv)', file dom/security/FramingChecker.cpp:210 (that warning being a downstream symptom, not the root of the issue)
Flags: needinfo?(dd.mozilla)

Assigning to kmag

Assignee: nobody → kmaglione+bmo
Severity: -- → S3
Fission Milestone: ? → M8
Priority: -- → P3
Flags: needinfo?(kmaglione+bmo)
Pushed by maglione.k@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/faa061d58d37
Remove NS_WARN_IF for expected case in FramingChecker. ckerschb r=ckerschb
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: