Closed Bug 1113196 Opened 9 years ago Closed 8 years ago

Don't pass the existing document for top-level loads to the loadinfo

Categories

(Core :: DOM: Security, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
e10s m8+ ---
firefox46 + wontfix
firefox47 --- fixed

People

(Reporter: ckerschb, Assigned: mrbkap)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 3 obsolete files)

When debugging https://bugzilla.mozilla.org/show_bug.cgi?id=1094156#c35 we encountered the following problem:

TestPage:
https://github.com/PagueVeloz/ClientAPI/blob/master/python/pagueveloz/api/base.py
Click: 'Raw' Button


We have to investigate why the requestingNode->NodePrincipal() in 
http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#10315
is different for e10s and regular mode.

In fact, in e10s, the principal is:
  https://github.com/PagueVeloz/ClientAPI/blob/master/python/pagueveloz/api/base.py
whereas in regular mode is
  SystemPrincipal

The Principal should be the same in both cases!
Depends on: 1094156
Blocks: 1006868
No longer depends on: 1094156
Still depending on bug 1094156.
Depends on: 1094156
Assignee: nobody → mrbkap
Blake, is this important enough to track?
Flags: needinfo?(mrbkap)
This definitely would have been, but has already been fixed.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(mrbkap)
Resolution: --- → DUPLICATE
Oops, I think I acted too quickly.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
I've been looking at the principals in nsDocShell::DoURILoad in the non-e10s case and this is what I have so far - https://etherpad.mozilla.org/docShellPrincipals
In reference to <http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp?rev=d639e397f2f0#10281>:

The difference in the top level case between e10s and non-e10s is that in regular Firefox, the top-level content docshell is in a <xul:browser>, which counts as a frame element. That's where the system principal comes from in comment 0. In e10s Firefox, we don't have a <xul:browser> in the child process so we fall into the GetExtantDocument case.

I don't really understand the call to GetExtantDocument, actually. If this is a load for a document that isn't in a frame, the requesting node doesn't seem like it should be the existing document (which is about to be replaced anyway). Can we use null there?
Status: REOPENED → NEW
Flags: needinfo?(mozilla)
(In reply to Blake Kaplan (:mrbkap) (please use needinfo!) from comment #6)
> In reference to
> <http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.
> cpp?rev=d639e397f2f0#10281>:
> 
> The difference in the top level case between e10s and non-e10s is that in
> regular Firefox, the top-level content docshell is in a <xul:browser>, which
> counts as a frame element. That's where the system principal comes from in
> comment 0. In e10s Firefox, we don't have a <xul:browser> in the child
> process so we fall into the GetExtantDocument case.

Thanks for the analysis - that's great to know.

> I don't really understand the call to GetExtantDocument, actually. If this
> is a load for a document that isn't in a frame, the requesting node doesn't
> seem like it should be the existing document (which is about to be replaced
> anyway). Can we use null there?

I think we can use null as the node, but in general I guess we want to to have the same node, and also node->NodePrincipal() in e10s and non-e10s. Jonas, what do you think? How would like to proceed?
Flags: needinfo?(mozilla) → needinfo?(jonas)
For what it's worth, we don't have a <xul:browser> in the child process at all. We could create some sort of hidden chrome page to parent currently top-level content docshells in child processes, but that seems very complicated to me.
I think the solution that I mentioned on the call yesterday would fix this.

I.e. we should, after calling mScriptGlobal->GetFrameElementInternal, check that the return element belongs to a docshell of the same "type" as the current docshell. If it does not, then we should behave as if GetFrameElementInternal had returned null.

I'm honestly not sure why we have the GetExtantDoc call there. But I think that's the topic for a different bug.
Flags: needinfo?(jonas)
See Also: → 1105556
In bug 1105556 we are discussing setting the loadingPrincipal to null for Top level loads.  This would fix this bug and make the behavior consistent for top level loads.  Although we should double check that we get the same loadInfo values for frame loads and for the triggeringPrincipal in the e10s and non-e10s cases.
Depends on: 1105556
Summary: ReqeustingNode->NodePrincipal() should return the same principal in e10s → RequestingNode->NodePrincipal() should return the same principal in e10s
This depends on bug 1105556, which is on Tanvi's list for next quarter. Given that we don't know of any immediate ill effects of this bug, I'd like to push this to m8.
Hey Blake, as discussed in person yesterday, maybe we indeed shouldn't do |requestingNode = mScriptGlobal->GetExtantDoc();| here [1]. Did you end up pushing that change to try?

Maybe that is the source of all evil here. Unfortunately I can't trace it down anymore, but apparently Boris mentioned something about that change in the review when that line landed [2]. If that is indeed the case I am kean to get that fixed and happy to help out.

[1] http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#10524
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1038756#c248
Flags: needinfo?(mrbkap)
Hey Christoph, we almost certainly shouldn't do that and that's actually part of what Tanvi is going to be doing in bug 1105556. This bug and that are entirely interwoven.
Flags: needinfo?(mrbkap)
Some documentation on the docshell behavior for e10s:
https://etherpad-mozilla.org/docShellPrincipals-e10s

Can be compared to non-e10s here:
https://etherpad-mozilla.org/docShellPrincipals2

Looks like the first thing that needs to be done is to remove the GetExtantDoc call and figure out what to replace it with https://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#10412.  In the non-e10s case, I've only seen that codepath called for xul (ex: Show History or on startup).
Giving this to Tanvi since she's doing all of the work here.
Assignee: mrbkap → tanvi
Tanvi, are you making progress here?
Flags: needinfo?(tanvi)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #16)
> Tanvi, are you making progress here?

Yes, I am working on this.  It will probably be fixed by https://bugzilla.mozilla.org/show_bug.cgi?id=1105556.
Flags: needinfo?(tanvi)
(In reply to Tanvi Vyas [:tanvi] from comment #17)
> (In reply to Brad Lassey [:blassey] (use needinfo?) from comment #16)
> > Tanvi, are you making progress here?
> 
> Yes, I am working on this.  It will probably be fixed by
> https://bugzilla.mozilla.org/show_bug.cgi?id=1105556.

What's the ETA? This is blocking e10s
Flags: needinfo?(tanvi)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #18)
> (In reply to Tanvi Vyas [:tanvi] from comment #17)
> > (In reply to Brad Lassey [:blassey] (use needinfo?) from comment #16)
> > > Tanvi, are you making progress here?
> > 
> > Yes, I am working on this.  It will probably be fixed by
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1105556.
> 
> What's the ETA? This is blocking e10s

I'm not sure this will make it to FF46.  The proposed changes haven't been signed off yet.  I've started the implementation but there are likely to be many bugs and corner cases.  We can discuss further next week over irc or vidyo and see if we can meet the needs of e10s even if we don't completely finish bug 1105556 in time for FF 46.  (Note that we shouldn't try and uplift that bug.)
Flags: needinfo?(tanvi)
NI to Blake to talk to Tanvi and get a plan for 46
Flags: needinfo?(mrbkap)
Status: NEW → ASSIGNED
This is basically the same as bug 1105556 at this point. The part that we need to do for the initial rollout of e10s has been moved to bug 1248089.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago8 years ago
Resolution: --- → DUPLICATE
I have a patch that seems like it could be uplifted here. It gets rid of the GetExtantDocument call but instead of passing null for the loading principal (which is a more proper fix, but also more invasive) it passes a codebase principal of the page that we're about to load.
Assignee: tanvi → mrbkap
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Summary: RequestingNode->NodePrincipal() should return the same principal in e10s → Don't pass the existing document for top-level loads to the loadinfo
Attached patch Proposed patch (obsolete) — Splinter Review
The one thing I didn't do here that I could was to force non-e10s to follow this behavior. Tanvi, what do you think?
Attachment #8719940 - Flags: review?(tanvi)
(In reply to Blake Kaplan (:mrbkap) (please use needinfo!) from comment #23)
> Created attachment 8719940 [details] [diff] [review]
> Proposed patch
> 
> The one thing I didn't do here that I could was to force non-e10s to follow
> this behavior. Tanvi, what do you think?

Jonas and I discussed yesterday, and we prefer to add another loadinfo constructor.  Something like this:


+/* Constructor takes an outer window, but no loadingNode or loadingPrincipal.
+ * This constructor should only be used for TYPE_DOCUMENT loads, since they
+ * have a null loadingNode and loadingPrincipal.
+*/
+LoadInfo::LoadInfo(nsIPrincipal* aTriggeringPrincipal,
+                   nsSecurityFlags aSecurityFlags,
+                   nsPIDOMWindowOuter* aOuterWindow)
https://pastebin.mozilla.org/8860304

But thinking about this more, this constructor is good for bug 1105556, but not for this bug since you have a loadingPrincipal.  Maybe we could add loadingPrincipal here and then remove it in bug 1105556.
Comment on attachment 8719940 [details] [diff] [review]
Proposed patch

See comment 25.  Also see loadinfo patch on bug 1105556.  You can tweak the loadinfo.cpp and loadinfo.h changes there for this bug. https://bugzilla.mozilla.org/attachment.cgi?id=8722787
Attachment #8719940 - Flags: review?(tanvi) → review-
Attached patch Patch v2 (obsolete) — Splinter Review
This is based on top of Tanvi's patch in the other bug with Jonas and
Christoph's comments addressed. I had to back off on the assertions
asked for (we still create LoadInfos with securityFlags set to 0 because
not all call sites have been converted to asyncOpen2 and the hidden
window is a window with a null frameElement but a non-null scriptable
parent).
Attachment #8725467 - Flags: review?(tanvi)
Attachment #8725467 - Flags: review?(jonas)
Attachment #8719940 - Attachment is obsolete: true
Comment on attachment 8725467 [details] [diff] [review]
Patch v2

Review of attachment 8725467 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with that fixed.

::: docshell/base/nsDocShell.cpp
@@ +10605,5 @@
> +    // happen for top-level loads in e10s (or for top-level XUL window loads in
> +    // the parent process, in that case, we don't want to create a codebase
> +    // principal with a chrome: URI, so avoid doing that here).
> +    nsIScriptSecurityManager* ssm = nsContentUtils::GetSecurityManager();
> +    ssm->GetDocShellCodebasePrincipal(aURI, this, getter_AddRefs(loadingPrincipal));

Can you assert that aContentPolicyType is TYPE_DOCUMENT

@@ +10609,5 @@
> +    ssm->GetDocShellCodebasePrincipal(aURI, this, getter_AddRefs(loadingPrincipal));
> +  } else {
> +    // This is a top-level chrome load, use a system principal for the
> +    // loadingPrincipal.
> +    loadingPrincipal = nsContentUtils::GetSystemPrincipal();

Same here.

@@ +10617,5 @@
> +    requestingNode ?
> +      new LoadInfo(loadingPrincipal, triggeringPrincipal, requestingNode,
> +                   securityFlags, aContentPolicyType) :
> +      new LoadInfo(loadingPrincipal, triggeringPrincipal, securityFlags,
> +                   requestingWindow);

Nit, could you make requestingWindow the first argument for this ctor? It maps pretty closely to the loadingPrincipal (and eventually will replace the loadingPrincipal).

::: netwerk/base/nsNetUtil.cpp
@@ -374,5 @@
> -                                 nsINode            *aLoadingNode,
> -                                 nsIPrincipal       *aLoadingPrincipal,
> -                                 nsIPrincipal       *aTriggeringPrincipal,
> -                                 nsSecurityFlags     aSecurityFlags,
> -                                 nsContentPolicyType aContentPolicyType,

I'd rather have both the take-all-parts function, and the take-a-loadInfo function. I'd rather not force callers to create a loadinfo under normal circumstances.
Attachment #8725467 - Flags: review?(jonas) → review+
(In reply to Jonas Sicking (:sicking) from comment #29)
> I'd rather have both the take-all-parts function, and the take-a-loadInfo
> function. I'd rather not force callers to create a loadinfo under normal
> circumstances.

Both versions of this function exist. The diff is just a bit confusing.
That patch is orange on try because we can have docshells with null mScriptGlobal.
Attached patch Orange fixes v1 (obsolete) — Splinter Review
This applies on top of the previous patch. The logic in docshell is getting complicated because of the !mScriptGlobal case. Jonas, any thoughts? I decided to keep calling the old constructor if !mScriptGlobal.
Attachment #8725525 - Flags: review?(jonas)
Attached patch Final patchSplinter Review
This has r=sicking in person. The logic is slightly cleaned up from the previous patch but should otherwise be the same. If this is green on try I'll push to inbound tomorrow. This should be very safe. Even though the changes look big, they're actually only affecting the specific case that we care about in this bug.
Attachment #8726008 - Flags: review+
Attachment #8725467 - Attachment is obsolete: true
Attachment #8725467 - Flags: review?(tanvi)
Attachment #8725525 - Attachment is obsolete: true
Attachment #8725525 - Flags: review?(jonas)
I had accidentally set mIsThirdPartyContext to true in the TYPE_DOCUMENT constructor. That's fixed now. This should be the final try run.
Looks like once this lands it should uplift to 46 along with the work in bug 1105556. 
Tracking for 46 since this is one of the last blockers for e10s staged rollout on beta.
https://hg.mozilla.org/integration/mozilla-inbound/rev/750b61b5d681dacc485e53201612b717b95a6d07
Bug 1113196 - Pass a sane set of parameters to loadinfo for top-level loads in e10s. r=sicking
OK. I was going from some earlier comments and may have misunderstood. thanks Blake
Comment on attachment 8726008 [details] [diff] [review]
Final patch

># HG changeset patch
># User Blake Kaplan <mrbkap@gmail.com>
>
>Bug 1113196 - Pass a sane set of parameters to loadinfo for top-level loads in e10s.
>
>diff --git a/docshell/base/nsDocShell.cpp b/docshell/base/nsDocShell.cpp
>index 587d6a8..f8dfc99 100644
>--- a/docshell/base/nsDocShell.cpp
>+++ b/docshell/base/nsDocShell.cpp
>@@ -10556,21 +10556,49 @@ nsDocShell::DoURILoad(nsIURI* aURI,
>     }
>   }
> 
>   // open a channel for the url
>   nsCOMPtr<nsIChannel> channel;
> 
>   bool isSrcdoc = !aSrcdoc.IsVoid();
> 
>+  // There are three cases we care about:
>+  // * Null mScriptGlobal: shouldn't happen but does (see bug 1240246). In this
>+  //   case, we create a loadingPrincipal as for a top-level load, but we leave
>+  //   requestingNode and requestingWindow null.
So in the null mScriptGlobal case, we use the original LoadInfo constructor (not the one that takes the outerwindow).  Correct?

>+  // * Top-level load (GetFrameElementInternal returns null). In this case,
>+  //   requestingNode is null, but requestingWindow is our mScriptGlobal.
>+  //   TODO we want to pass null for loadingPrincipal in this case.
>+  // * Subframe load: requestingWindow is null, but requestingNode is the frame
>+  //   element for the load. loadingPrincipal is the NodePrincipal of the frame
>+  //   element.
requestingWindow isn't technically "null" - it does exist, it is just not set.  Maybe we should make that a little clearer in the comments.
Comment on attachment 8726008 [details] [diff] [review]
Final patch

Looks good Blake!  A few comment suggestions in Comment 41.  Should we run this by bz?
Attachment #8726008 - Flags: review+
Depends on: 1253792
https://hg.mozilla.org/mozilla-central/rev/750b61b5d681
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
I think we should. Let's let it (and the followups like bug 1253792) bake for a few more days and then uplift.
Flags: needinfo?(mrbkap)
Comment on attachment 8726008 [details] [diff] [review]
Final patch

Approval Request Comment
[Feature/regressing bug #]: n/a
[User impact if declined]: Possibly misbehaving security code.
[Describe test coverage new/current, TreeHerder]: Lots of existing coverage on tests (see the dependencies).
[Risks and why]: Medium risk (docshell code), but this has shaken out on Treeherder and all of the known regressions have been fixed.
[String/UUID change made/needed]: n/a
Attachment #8726008 - Flags: approval-mozilla-aurora?
I need another person to assess the risk/reward on this one. I have requested Bz's opinion here https://bugzilla.mozilla.org/show_bug.cgi?id=1240246#c15
This is showing up as fixed on aurora 47 already (landed on March 5 in comment 43)  Are you asking for uplift to beta or is this additional stuff that needs to land for 47?  I set the status-firefox47 flag back to ? since I couldn't tell.

Bz says in the other bug,  "Seems to be not insane".  Not a hearty endorsement and also since we are not enabling e10s for 46 release, wontfixing for 46. If you strongly disagree please let me know!
Flags: needinfo?(mrbkap)
Sure.
Flags: needinfo?(mrbkap)
Comment on attachment 8726008 [details] [diff] [review]
Final patch

This seems to have landed on 47 when it was in central.
Attachment #8726008 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
You need to log in before you can comment on or make changes to this bug.