Closed Bug 1319111 Opened 8 years ago Closed 7 years ago

Expose result principal URL ("final channel URL") on LoadInfo, convert current consumers of LOAD_REPLACE

Categories

(Core :: Security: CAPS, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox53 --- wontfix
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- fixed

People

(Reporter: mayhemer, Assigned: mayhemer)

References

(Blocks 1 open bug)

Details

(Keywords: addon-compat)

Attachments

(1 file, 43 obsolete files)

90.72 KB, patch
mayhemer
: review+
Details | Diff | Splinter Review
https://bugzilla.mozilla.org/show_bug.cgi?id=1256122#c55

LOAD_REPLACE flag is misused for two different purposes.  One of them is which of the channel's URLs to use for security checks - if either URL or originalURL.  That interferes with redirects.

Instead, expose this URL on the LoadInfo of the channel and change all consumers to use that instead of playing with originalURL and LOAD_REPLACE.
Summary: Expose security check URL ("final channel URL") on LoadInfo, convert current consumer of LOAD_REPLACE → Expose security check URL ("final channel URL") on LoadInfo, convert current consumers of LOAD_REPLACE
Just mirroring the priority of bug 1256122.
Priority: -- → P2
For my own ref: https://hg.mozilla.org/mozilla-central/rev/60613f18435b
We no longer need to retain the LOAD_REPLACE flag in NS_NewChannelInternal.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Boris, what is the deadline here?  This turns out to be a bit harder than I expected.  I had to turn to some top crashes and other bugs too.  And I'm facing some issues here which, as I'm not an expert, are a bit harder to overcome (will refer later on what exactly).

If this is urgent to fix, probably someone else should work on it now.  And also verify the idea from https://bugzilla.mozilla.org/show_bug.cgi?id=1256122#c55 is actually the way to go.  For instance, I think function of originalURL should be preserved as it is now (to be a "display" url).
Flags: needinfo?(bzbarsky)
I don't know.  I assume this is as urgent as bug 1256122, right?  Kris might have a better idea of how urgent that is.

I can pick this up, but I'd have to drop other things to do it...  Kris, please let me know if I should do that, or try to find someone else who can pick this up.
Flags: needinfo?(bzbarsky) → needinfo?(kmaglione+bmo)
Jason, is there someone other than Honza available to work on the necko end of this?
Flags: needinfo?(jduell.mcbugs)
Boris, despite that the base change may be in necko, this touches the whole platform as well as end products.

Anyway, I may get to this during the second half of the next week.  But I will have to spend some time on studying all the tentacles.
> Boris, despite that the base change may be in necko, this touches the whole platform as well as end products.

Sure, because we have to check all the protocol handlers and see whether they set an originalURI distinct from URI, right?  And those are scattered all over the tree...

Andy, blassey told me I should probably be pinging you, not Kris, for an urgency evaluation on this one.  He also suggested that if someone on your team is interested in learning more about necko and docshell and whatnot it might be a good idea to have them look at this bug and then the rest of bug 1256122.  I would be a available to mentor and review, obviously.
Flags: needinfo?(kmaglione+bmo) → needinfo?(amckay)
Thanks for asking, it isn't urgent, but its not something we can ignore for too long.
Flags: needinfo?(amckay)
Can we translate that into a timeframe?  Days?  Weeks?  Months?

Also, _are_ there people on your team who would be interested in working on this sort of thing?
Flags: needinfo?(amckay)
As I said, I can start in the middle of the next week or week after, tops.  If that's OK, then this bug can stay with me.
(In reply to Honza Bambas (:mayhemer) from comment #11)
> As I said, I can start in the middle of the next week or week after, tops. 
> If that's OK, then this bug can stay with me.

I won't have time to get to it before then either, and it would probably take anyone else on my team longer than that to get up to speed, so that's probably best.

(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #10)
> Can we translate that into a timeframe?  Days?  Weeks?  Months?

From the discussions we've had so far, I'd say weeks. This seems to be blocking a fair number of developers.

> Also, _are_ there people on your team who would be interested in working on
> this sort of thing?

I am, yes.
Flags: needinfo?(amckay)
There's no one in necko who's likely to be free to work on this before Honza gets to it.
Flags: needinfo?(jduell.mcbugs)
Attached patch wip1 (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4c8d8489e465fc0b12cb883f0d532ff70840dae1

- LOAD_REPLACE only used for redirected channels (removing it too is part of a different bug)
- where LOAD_REPLACE was being set to distinguish actual url from the original url, we set actual url as securityCheckURL on load info and don't set LOAD_REPLACE anymore ; original URL value is preserved as before this patch
- where LOAD_REPLACE was being removed from a channel, we now clear securityCheckURL (set null) from load info (probably a no-op, tho)
- originalURL is NO LONGER set for redirect target channels because otherwise it caused two isses:
  1. address bar was still showing the source url (because NS_GetFinalChannelURL no longer picked url but originalURL on redirected channels)
  2. redirect cross origin checks were broken (I can't provide much data here why, this was a bit like try-and-see way of development for me)

I felt like "I didn't know what I was doing" when writing this patch.  Hence, this is a wip w/o deep understanding of the change I've made.

General f? - if anyone interested.
Attachment #8819673 - Flags: feedback?
(In reply to Honza Bambas (:mayhemer) from comment #14)
> Created attachment 8819673 [details] [diff] [review]
> wip1
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=4c8d8489e465fc0b12cb883f0d532ff70840dae1

Ehm... pretty much orange...
Hmm.  So I have some questions about that patch:

1)  Why the removals of SetOriginalURI in the HTTP redirect code?  I thought the idea was to preserve it?

2)  I assume all the docshell/shistory changes are so we can propagate the "security check uri" though to history loads?  Are we already propagating the LOAD_REPLACE state in some way right now?  This part will need session restore changes too...

3)  I thought the idea was to stop using originalURI as the "security check uri".  That means we need to fix any protocol handler that does SetOriginalURI() in newChannel, right?  Some of those are fixed in this patch, but nsHostObjectProtocolHandler, nsJSProtocolHandler, nsSubstituringProtocolHandler are not.  Also not sure about the SetOriginalURI in nsHTMLDocument.cpp in the wyciwyg case, the multimixedconv bits, etc.  Basically, we should look at all the SetOriginalURI callers and decide what they should be doing.

4)  During HTTP redirects, we should probably not be propagating the "security check URI" from the pre-redirect channel, right?  In fact, we should be picking it up directly from the post-redirect channel or something...

5)  NS_GetFinalChannelURI has null-derefs when it's doing the (*uri)->GetAsciiSpec(spec) thing before securityCheckURI.forget(uri);
(In reply to Honza Bambas (:mayhemer) from comment #14)
> - originalURL is NO LONGER set for redirect target channels because
> otherwise it caused two isses:
>   1. address bar was still showing the source url (because
> NS_GetFinalChannelURL no longer picked url but originalURL on redirected
> channels)
>   2. redirect cross origin checks were broken (I can't provide much data
> here why, this was a bit like try-and-see way of development for me)

This is the cause of why the try is completely orange.  Some pieces of code are using original URL for mapping.  E.g. dom/html/test/test_imports_redirect.html fails because an ImportLoader is not found in ImportManager.  The related loader is added under the URI taken directly from the markup.  Hence, if it has to be found later, it has to be looked for by originalURL on a channel.

For the scenario from bug 1256122, there is a sequence of channel #1: http://foo: 302 -> channel #2 moz-extension://foo which internally points to a file:///foo URL.  The original URL here must remain the http://foo URL.  For the security checking we must override to the moz-extension:// URL.  So, channel #2 should look like:
- URL = file:///foo 
- originalURL = http://foo
- loadinfo->securityCheckURL = moz-extension://foo 

The wip1 patch were clearing securityCheckURL on the load info where the LOAD_REPLACE flag was originally dropped.  But that's just the same approach we have now implemented a different way!  Stupid me.. :)
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #16)
> Hmm.  So I have some questions about that patch:
> 
> 1)  Why the removals of SetOriginalURI in the HTTP redirect code?  I thought
> the idea was to preserve it?

I think comment 17 answers this.

> 
> 2)  I assume all the docshell/shistory changes are so we can propagate the
> "security check uri" though to history loads?  Are we already propagating
> the LOAD_REPLACE state in some way right now?  This part will need session
> restore changes too...

We do preserve LOAD_REPLACE already.  And good point about the session restore changes..

> 
> 3)  I thought the idea was to stop using originalURI as the "security check
> uri".  That means we need to fix any protocol handler that does
> SetOriginalURI() in newChannel, right?  Some of those are fixed in this
> patch, but nsHostObjectProtocolHandler, nsJSProtocolHandler,
> nsSubstituringProtocolHandler are not.  Also not sure about the
> SetOriginalURI in nsHTMLDocument.cpp in the wyciwyg case, the multimixedconv
> bits, etc.  Basically, we should look at all the SetOriginalURI callers and
> decide what they should be doing.

I think comment 17 applies here too?

> 
> 4)  During HTTP redirects, we should probably not be propagating the
> "security check URI" from the pre-redirect channel, right?  In fact, we
> should be picking it up directly from the post-redirect channel or
> something...

Yes, the secCheckURI is specific to a channel.  A redirect (= a new channel) should drop it in it's cloned loadinfo.

> 
> 5)  NS_GetFinalChannelURI has null-derefs when it's doing the
> (*uri)->GetAsciiSpec(spec) thing before securityCheckURI.forget(uri);

just debugging code, forgot to remove it before the try push.

I'll try to have a new patch yet today, or only after a new year.
(In reply to Honza Bambas (:mayhemer) from comment #18)
> > 3)  I thought the idea was to stop using originalURI as the "security check
> > uri".  That means we need to fix any protocol handler that does
> > SetOriginalURI() in newChannel, right?  Some of those are fixed in this
> > patch, but nsHostObjectProtocolHandler, nsJSProtocolHandler,
> > nsSubstituringProtocolHandler are not.  Also not sure about the
> > SetOriginalURI in nsHTMLDocument.cpp in the wyciwyg case, the multimixedconv
> > bits, etc.  Basically, we should look at all the SetOriginalURI callers and
> > decide what they should be doing.
> 
> I think comment 17 applies here too?

comment 17 doesn't apply here.  I will walk this one by one.
Attached patch wip2 (backup) (obsolete) — Splinter Review
still stuff missing (e.g. new tab puts chrome:// url in the address bar) but redirects and some test work now

- purpose of the originalURL left unchanged for all cases
- the new "security check URL" property on load info is now set every time, to either URL or originalURL value according how LOAD_REPLACE was set prior this patch ; done on most of the places where LOAD_REPLACE was used, still need to check on others
- during redirects secCheckURI is set to the target URL _before_ we create the target channel
- this allows creation of the redirect target channel freely override the value to something else 
- after the redirect is done and the originalURL is re-set to the redirect origin URL, we end up with all proper 3 distinct URLs setup on the channel as described in comment 17

TODO:
- session store
- do a deeper check on how protocols set original URI and expect it to be used for sec checking
- and probably some more :)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=974d6eff6ca097ad2644565f0d53c06430c9ca4c
Attachment #8819673 - Attachment is obsolete: true
Attachment #8819673 - Flags: feedback?
> I think comment 17 answers this.

I can't make sense of comment 17 or how it applies here...  But the wip2 drops those bits, so good.  ;)
Wondering if there's any update on this bug. Asking because if I go back up the tree it blocks bug 1276048 which is a P1 for add-ons.
(In reply to Andy McKay [:andym] from comment #23)
> Wondering if there's any update on this bug. Asking because if I go back up
> the tree it blocks bug 1276048 which is a P1 for add-ons.

This bug is on top of my list.  There would have been an update last week, but I caught flue, I'm currently recovering.  So, start of the next week is most likely for me to move forward here.
Great thanks, sorry to hear about the flu.
Attached patch v1 (obsolete) — Splinter Review
reason why wip3 didn't work was that we must set the sec check uri after we create the underlying channel (specially in aboutredirector code).  otherwise the desired URI will be rewritten by the channel creation logic.

will ask for r based on try results.
Attachment #8820339 - Attachment is obsolete: true
Honza, any update on this?
Flags: needinfo?(honzab.moz)
(In reply to Shane Caraveo (:mixedpuppy) from comment #29)
> Honza, any update on this?

If there is anyone else willing to work on this, please feel free to take this bug.  It's not very efficient me working on this anyway.  I'd love to move another small bit forward here this week, but I can't say for sure.
Flags: needinfo?(honzab.moz)
Attached patch v2 (obsolete) — Splinter Review
Attachment #8831269 - Attachment is obsolete: true
Attached patch v3 (obsolete) — Splinter Review
This still makes a lot of tests fail:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fa1b2eccd377e681e0524d4591d302a7644fcc41&filter-classifiedState=unclassified&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception

But before I go on, I'd like to ask for a feedback that I'm on the right track.  

Then, any help with how to proceed with diagnosing any of the failing tests would be very much appreciated.  Thanks!
Attachment #8839236 - Attachment is obsolete: true
Attachment #8842482 - Flags: feedback?
Honza, was that feedback request meant for me or for someone else?
Flags: needinfo?(honzab.moz)
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #33)
> Honza, was that feedback request meant for me or for someone else?

Yes, it was.  If you have time, it would be helpful if you could at least quick look.  The try run may be interesting to overlook too.  Thanks.
Flags: needinfo?(honzab.moz)
Flags: needinfo?(bzbarsky)
Comment on attachment 8842482 [details] [diff] [review]
v3

Sorry for the lag here.  I had to page all this back in.

So the general idea is that we need to change all the places that do SetOriginalURI to also set the security check URI?

Is there a reason to not simply have SetOriginalURI affect the security check URI?  And in the few places (redirects, other places that used to set LOAD_REPLACE) where we want to update them independently, save the securityCheckURI before doing SetOriginalURI and restore it afterward, or set it to whatever we want?

The other option, of course, is to go through all the SetOriginalURI callers and hope we didn't miss any.

The sessionstore changes may need a fallback for what to do if aEntry.securityCheckURI is not set in deserialization.

In NS_GetFinalChannelURI, you want to do GetURI, not GetOriginalURI, if securityCheckURI is null, right?  And it's worth checking whether the !loadInfo case is being hit for the test failures...

The change to NS_NewChannelInternal is ok because protocol handlers should never be setting LOAD_REPLACE now?  Can we assert there if it's set on the channel already, in case someone is messing up?

In HttpBaseChannel::CloneLoadInfoForRedirect, why are we calling SetSecurityCheckURI with a non-null thing at all?  I'd assume we want to set it to null, then let the protocol handler for the new channel override as needed, right?

We should probably add docs explaining that the last arg to SubstituteChannel() is an inout param that is already set and addrefed on entry into the function....

In SubstitutingProtocolHandler::NewChannel2, do we want to set the securityCheckURI before or after the SubstituteChannel call?  Note that that call can create a new channel, and hence set securityCheckURI itself.  And I think we want to override it, so make our own call later.

In nsViewSourceChannel, shouldn't we SetSecurityCheckURI to our own security check uri, instead of our originalURI?  Or should we set it at all?  It's not clear to me that we should.  In general, seems like no one outside protocol handlers should be setting securityCheckURI...

In terms of the test failures, it's probably worth fixing the things I mention above and then seeing what try looks like.
Flags: needinfo?(bzbarsky)
Attachment #8842482 - Flags: feedback? → feedback+
Any progress here? This bugfix is very critical for porting my Add-on to WebExtensions.
Flags: needinfo?(honzab.moz)
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #35)
> Comment on attachment 8842482 [details] [diff] [review]
> v3
> 
> Sorry for the lag here.  I had to page all this back in.

Thanks, I have the same lag problem...  Sorry, I am pretty busy with CDP and other bugs on my plate these days.

> 
> So the general idea is that we need to change all the places that do
> SetOriginalURI to also set the security check URI?

The patch seems to turn out to exactly that.  If that is the right way to do is yet to be proven.

> 
> Is there a reason to not simply have SetOriginalURI affect the security
> check URI?  And in the few places (redirects, other places that used to set
> LOAD_REPLACE) where we want to update them independently, save the
> securityCheckURI before doing SetOriginalURI and restore it afterward, or
> set it to whatever we want?

I was already thinking about it.  If it turns out to be less code to update and viable thing to do, I will do it that way.

> 
> In NS_GetFinalChannelURI, you want to do GetURI, not GetOriginalURI, if
> securityCheckURI is null, right?  

I'm not sure about this.  

The very original idea was that setting loadinfo.securi = URI would be equal to channel.loadflags |= LOAD_REAPLACE and channel.uri = URI, leading to NS_GetFinalChannelURI -> URI from channel.uri.  

When loadinfo.securi = null (in the first version of the patch) the effect would be equal channel.loadflags &= ~LOAD_REPLACE, NS_GetFinalChannelURI -> channel.originalURI.

It turned out we have to set the URI every time to the one we want NS_GetFinalChannelURI to return, since it sometimes may get set to something else when a channel is created by a protocol handler (comment 26).

> And it's worth checking whether the
> !loadInfo case is being hit for the test failures...
> 

Will try.

> In HttpBaseChannel::CloneLoadInfoForRedirect, why are we calling
> SetSecurityCheckURI with a non-null thing at all?  I'd assume we want to set
> it to null, then let the protocol handler for the new channel override as
> needed, right?

No, and this should be explained in comment 20.

> 
> We should probably add docs explaining that the last arg to
> SubstituteChannel() is an inout param that is already set and addrefed on
> entry into the function....
> 
> In SubstitutingProtocolHandler::NewChannel2, do we want to set the
> securityCheckURI before or after the SubstituteChannel call?  Note that that
> call can create a new channel, and hence set securityCheckURI itself.  And I
> think we want to override it, so make our own call later.

That could be source of the test failures.  yes, we should override securi here.

> 
> In nsViewSourceChannel, shouldn't we SetSecurityCheckURI to our own security
> check uri, instead of our originalURI?  Or should we set it at all?  It's
> not clear to me that we should.  

I will try to find out, maybe we shouldn't do anything here.

> In general, seems like no one outside
> protocol handlers should be setting securityCheckURI...
> 
> In terms of the test failures, it's probably worth fixing the things I
> mention above and then seeing what try looks like.

Thank you.
Flags: needinfo?(honzab.moz)
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #35)
> The sessionstore changes may need a fallback for what to do if
> aEntry.securityCheckURI is not set in deserialization.

The only thing I can think of is: shEntry.securityCheckURI = entry.loadReplace ? shEntry.URI : shEntry.originalURI; when entry.securityCheckURI is undefined.  Not sure it's correct for all cases, but we can't throw away the whole saved sessions just because of this...  OTOH, we only want to preserve behavior of GetFinalChannelURI as it used to be before this patch for stored session data.  So it should work.
(In reply to Honza Bambas (:mayhemer) from comment #38)
Any idea when you can fix this bug?
> I'm not sure about this.  

I mean, it depends on the semantics of the security check URI. But with the state in the patch as attached, the security check URI is set any time original URI is set.  When security check URI is _not_ set in that patch, that corresponds to the current LOAD_REPLACE case, at least for HTTP redirects.  Unless I'm misreading something.

> No, and this should be explained in comment 20.

Hmm.  So is that part trying to preserve the current (buggy, really) behavior and then we will fix it in 
1256122 to the behavior we really want at the end of that day?  OK, that makes sense.
> OTOH, we only want to preserve behavior of GetFinalChannelURI as it used to be before this
> patch for stored session data.

Right, that is the key.
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #40)
> > I'm not sure about this.  
> 
> I mean, it depends on the semantics of the security check URI. But with the
> state in the patch as attached, the security check URI is set any time
> original URI is set.  When security check URI is _not_ set in that patch,
> that corresponds to the current LOAD_REPLACE case, at least for HTTP
> redirects.  Unless I'm misreading something.

The idea is to set securi EVERY TIME, regardless if LOAD_REPLACE used to or not be set.  But the patch is probably incomplete at this right now.  I'll see what the update with your suggested changes will do.

> 
> > No, and this should be explained in comment 20.
> 
> Hmm.  So is that part trying to preserve the current (buggy, really)
> behavior and then we will fix it in 
> 1256122 to the behavior we really want at the end of that day?  OK, that
> makes sense.

Yep, it should be an intermediate state only, fixed in that bug, if any other fix for it will be necessary.
Attached patch v3 -> v4 interdiff (obsolete) — Splinter Review
I tried few of the failing tests from the previous run locally with this patch and they are fixed.  Let's see what the patch does now.


https://treeherder.mozilla.org/#/jobs?repo=try&revision=90339875ed5b17ce2293a41f932c505fc1e7ab9d
Attached patch v4 (obsolete) — Splinter Review
Attachment #8842482 - Attachment is obsolete: true
Attached patch v4 (obsolete) — Splinter Review
(this one is merged to today m-c)
Attachment #8849217 - Attachment is obsolete: true
> Yep, it should be an intermediate state only, fixed in that bug, if any other fix for it will be necessary.

It will be necessary.  But yes, makes sense to reduce risk like that.
(In reply to Honza Bambas (:mayhemer) from comment #46)
> Comment on attachment 8849222 [details] [diff] [review]
> v4
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=9ae4785fcaac7f01896dd1695151d32f426cc3d0

No help with the suggested changes from comment 35 :(
OK.  Someone is going to need to sit down and debug those testcases.  If you don't have time to do that, does someone else on the necko team?  This is something we really need to fix...
Flags: needinfo?(jduell.mcbugs)
Attached patch v4 -> v5 (obsolete) — Splinter Review
- forgot about one place where LOAD_REPLACE was used in doc shell!
 (caught by devtools/client/styleeditor/test/browser_styleeditor_sv_resize.js)
- view source changes needed, but moved to the handler
 (caught by toolkit/components/viewsource/test/browser/browser_viewsourceprefs.js)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b23cebb179325311acf6180774e435c453899e90
Attachment #8849216 - Attachment is obsolete: true
Attached patch v5 (obsolete) — Splinter Review
The patch may need some polishing prior r?.  It turns out we mostly set sec-uri when original URI is changed on a channel.  Hence, instead of updating all the places calling originalURI = <something>, make the channel change it's own load info.  It will probably be fewer changes.
Attachment #8849222 - Attachment is obsolete: true
I think we are close enough now...  Fingers crossed.
Flags: needinfo?(jduell.mcbugs)
Attached patch v5 -> v6 (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=15c325aa2bcbc30acec84d0d37b3c8e055b02a12

definitely not everything, but getting better.
(In reply to Honza Bambas (:mayhemer) from comment #53)
> Created attachment 8850156 [details] [diff] [review]
> v5 -> v6
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=15c325aa2bcbc30acec84d0d37b3c8e055b02a12
> 
> definitely not everything, but getting better.

   if (aOriginalURI && !aLoadReplace) {
     loadInfo->SetSecurityCheckURI(aOriginalURI);
-  } else {
-    loadInfo->SetSecurityCheckURI(aURI);
   }

breaks again about:devtools* uris.  But this change is needed for moz:// protocol...
Attachment #8842482 - Attachment is obsolete: false
Attached patch v3 -> v6 (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a9ab09bb8a7bf2a0cea7c1d4fdbb3a769b17dfd
Attachment #8849661 - Attachment is obsolete: true
Attachment #8849666 - Attachment is obsolete: true
Attachment #8850156 - Attachment is obsolete: true
The docshell #if 0 changes should go away, keeping them around just to remember the spot, if it happens to need a resurrection.

Still a bit WIP overall.
Depends on: 1349987
Attached patch v3 -> v7 (obsolete) — Splinter Review
Had to return the multipart LOAD_REPLACE flag usage.  When the response is multipart/replace loadgroup may complain the top level document is wrongly replaced by each part.  Will be handled in bug 1319110.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb7b40add0248dac1e74ef091da85a2e7ed16e1e
Attachment #8850503 - Attachment is obsolete: true
Comment on attachment 8850602 [details] [diff] [review]
v3 -> v7

We... are... GREEEEEN!! :)  (at least 93% at the moment I'm pressing enter here)

Boris, please check on this interdiff.  
- The changes are mostly fixes of typos, misplaced setSecURI additions and updates of a code introduce after I've started to work on this bug a long ago
- I hope I've incorporated all of your feedback, at least the major parts
- The docshell changes under #if 0 will go away
- There are also some // commented lines that should go away in the final patch

If there is anything here to polish (like moving setSecURI to channel::setOriginalURL or so) then I'd very much prefer doing it in followups.  I've already spent few full days on this and I can't afford spending more time on this.

thanks.
Attachment #8850602 - Flags: feedback?
Attached patch v7 (obsolete) — Splinter Review
I will look tomorrow first thing when I can focus and read carefully.  In the meantime, I have the one lingering question I want to make sure we're very clear on: who is responsible for setting securityCheckURI and to what and in what circumstances?  It seems to me that conceptually (the HTTP redirect bug we talked about aside), securityCheckURI should be set in situations in which:

1)  LOAD_REPLACE _is_not_ set in our code today and channel.originalURI is set to a URI different from channel.URL.  This is a must.  This requires more or less auditing all originalURI setters, which is more or less what this patch did, right?

2)  LOAD_REPLACE _is_ set; in this case channel.securityCheckURI should be set to channel.URL to match today's behavior.  This seems optional to me, given that if securityCheckURI is null then channel.URL is used.  I see a bunch of places in the patch that set this anyway; I'd like to understand why.

Are there other situations I'm missing, or something else I'm missing in the analysis above?
Flags: needinfo?(honzab.moz)
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #60)
> I will look tomorrow first thing when I can focus and read carefully.  In
> the meantime, I have the one lingering question I want to make sure we're
> very clear on: who is responsible for setting securityCheckURI and to what
> and in what circumstances?  

The patch turns out to put all these changes (except redirects) to protocol handlers after the channel is created.  We can turn that to a rule, I believe.

> It seems to me that conceptually (the HTTP
> redirect bug we talked about aside), securityCheckURI should be set in
> situations in which:
> 
> 1)  LOAD_REPLACE _is_not_ set in our code today and channel.originalURI is
> set to a URI different from channel.URL.  This is a must.  This requires
> more or less auditing all originalURI setters, which is more or less what
> this patch did, right?

Exactly.

> 
> 2)  LOAD_REPLACE _is_ set; in this case channel.securityCheckURI should be
> set to channel.URL to match today's behavior.  This seems optional to me,
> given that if securityCheckURI is null then channel.URL is used.  I see a
> bunch of places in the patch that set this anyway; I'd like to understand
> why.

I think they all are just leftovers from before the patch has settled down the way it is now.

Pushed a try removing those changes: https://treeherder.mozilla.org/#/jobs?repo=try&revision=852e6ec7475187db27dfb1e65fcc4e0c7184f84e

> 
> Are there other situations I'm missing, or something else I'm missing in the
> analysis above?

I think it's complete.
Flags: needinfo?(honzab.moz)
(In reply to Honza Bambas (:mayhemer) from comment #61)
> I think they all are just leftovers from before the patch has settled down
> the way it is now.
> 
> Pushed a try removing those changes:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=852e6ec7475187db27dfb1e65fcc4e0c7184f84e

The changes to devtools/client/framework/about-devtools-toolbox.js are needed.  In this case the URL is actually an original URL.  What confused me here was that there were originally no setting of original URI on the resulting channel.  Not sure it's intentional or a bug.  Tho, prior this patch it works w/o setting the original URI.  I think this omission (not setting orig URI in about-devtools-toolbox.js) made me to add the (now removed) docshell changes originally.  And I think that the original URL on about:devtools-toolbox channels is set in docshell.
(In reply to Honza Bambas (:mayhemer) from comment #63)
> One more push with changes from comment 62:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=b30a33d71c792fae2f1b2bba5368a31b96b979c6

Green.  I will finish the patch and provide v3->v8 and v8 patches.  The interdiff will probably be longer than the full patch now :)
Attached patch v3 -> v8 (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=59a7858f415299ba2d89b018c66b5a32295ae689
Attachment #8850602 - Attachment is obsolete: true
Attachment #8850690 - Attachment is obsolete: true
Attachment #8850602 - Flags: feedback?
Attachment #8851057 - Flags: feedback?
Attached patch v8 (obsolete) — Splinter Review
> And I think that the original URL on about:devtools-toolbox channels is set in docshell.

Yeah, seems quite likely.  The devtools code relying on that is annoying.  ;(

Looking at this now.  Sorry for the lag...
Comment on attachment 8851057 [details] [diff] [review]
v3 -> v8

This is looking pretty good.  A few remaining issues:

1)  The docs about the third arg of SubstituteChannel should go on its decl, not at the callsite.

2)  In nsViewSourceHandler::NewSrcdocChannel the comment talks about |uri|, but we're dealing with aURI there.

3)  I guess nsFileChannel::nsFileChannel does not need to do something with the security check URI in this case:

     SetURI(targetURI);
     SetOriginalURI(uri);

because it used to be setting LOAD_REPLACE?

4)  I'm still a little confused by the HTTP channel code.  In the new setup, we allow the protocol handler for the post-redirect thing to override the securityCheckURI.  But then why do we need to set it anyway?  That would only affect cases when the protocol handler _doesn't_ set it, which are only the cases when it ought to match the channel URI anyway.  And in the cases when it does NOT (e.g. some of the about: cases which don't set securityCheckURI), it looks like we're making a breaking change: before this patch we'd always end up with the post-redirect channel URI, not we'll end up with the post-redirect channel originalURI (at the point when the protocol handler created it).  I really think that this is wrong and we should simply be nulling out securityCheckURI on our cloned loadinfo...

The rest is looking pretty good to me, I think.  I'm probably OK with having a "set the security check URI whenever originalURI is set" followup.  We'll also want a followup to remove the loadreplace args on various docshell-internal functions (which are now ignored) and from nsISHEntry and whatnot..
Flags: needinfo?(bzbarsky)
Attachment #8851057 - Flags: feedback? → feedback+
(In reply to Honza Bambas (:mayhemer) from comment #66)
> Created attachment 8851060 [details] [diff] [review]
> v8

Nice way to hide a real crash behind a reported intermittent:
https://treeherder.mozilla.org/logviewer.html#?job_id=86291114&repo=try

This needs a fix first.

I'll answer on Monday.

Thanks Boris!
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #68)
> Comment on attachment 8851057 [details] [diff] [review]
> v3 -> v8
> 3)  I guess nsFileChannel::nsFileChannel does not need to do something with
> the security check URI in this case:
> 
>      SetURI(targetURI);
>      SetOriginalURI(uri);
> 
> because it used to be setting LOAD_REPLACE?

The interdiff backs out all the changes I've previously added to nsFileChannel.  There is no longer any change to nsFileChannel in the v8 patch except removal of LOAD_REPLACE flag setting.

> 
> 4)  I'm still a little confused by the HTTP channel code.  In the new setup,
> we allow the protocol handler for the post-redirect thing to override the
> securityCheckURI.  But then why do we need to set it anyway?  That would
> only affect cases when the protocol handler _doesn't_ set it, which are only
> the cases when it ought to match the channel URI anyway.  And in the cases
> when it does NOT (e.g. some of the about: cases which don't set
> securityCheckURI), it looks like we're making a breaking change: before this
> patch we'd always end up with the post-redirect channel URI, not we'll end
> up with the post-redirect channel originalURI (at the point when the
> protocol handler created it).  I really think that this is wrong and we
> should simply be nulling out securityCheckURI on our cloned loadinfo...

Probably again a left over (had to address that already.. sorry).  You are right, nullifying it from the cloned info prior giving it to NS_NewChannelInternal is the way to go.  Comment will be added to the code.

> 
> The rest is looking pretty good to me, I think.  I'm probably OK with having
> a "set the security check URI whenever originalURI is set" followup.  We'll
> also want a followup to remove the loadreplace args on various
> docshell-internal functions (which are now ignored) and from nsISHEntry and
> whatnot..

I'll file them.

Thanks!
Blocks: 1350899
Attached patch v8 -> v9 (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=81a6063efc01711f4b0a1a5d2485fac6691ecdbd



Not asking f? for now, since there seems to be a raise in one test failure [1] that is tracked, but doesn't happen that often on the base changeset and seems pretty much related to this patch.

Note: [1] is an older version of the v8->v9 patch (had two omittances), not sure they would be a cause of that failure raise, though.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=8a75f05f05c19b4a51bd948a7a464841471addcc&selectedJob=86684678
Attached patch v9 (obsolete) — Splinter Review
Boris, please see v8 -> v9 idiff for the final set of changes.  I think this is mature enough to ask for a full review.  Try for this final patch from comment 71 is 75% done now and looks positive.
Attachment #8842482 - Attachment is obsolete: true
Attachment #8851057 - Attachment is obsolete: true
Attachment #8851060 - Attachment is obsolete: true
Flags: needinfo?(bzbarsky)
Blocks: 1351411
Attachment #8851791 - Flags: review?(bzbarsky)
Comment on attachment 8851791 [details] [diff] [review]
v9

So a few general comments:

1)  Now that I read this thing as a whole, I'm unsure about the "securityCheckURI" naming.  It's more like the "uriForResultPrincipal" or "principalURI" or something...

2)  Please make sure to get someone who works on sessionstore to carefully review the sessionstore bits, especially with an eye for upgraded and downgrade compat.

Specific comments:

>+    // Drop security info from the cloned load info because we want NS_GetFinalChannelURI
>+    // return either URI of the target channel or anything that target protocol handler

s/return either/to return either the/ 

and 

s/anything that/anything that the/

>+  // |result| is an in-out argument that can change.

The important part is that when the function is called *result must be non-null and must be already addrefed, and that this function will preserve that invariant if it succeeds.  That needs to be documented here.

>+++ b/docshell/base/nsAboutRedirector.cpp
>+      // channel's load info which forces the channel owner to reflect the

I know the preexisting comment said that, but "channel principal" would be better than "channel owner" here.

>+++ b/docshell/base/nsDocShell.cpp

I think it would be better to put the new thing next to the existing (URI/originalURI) things in argument lists here.

>+  // Always rewrite, including null value of aSecurityCheckURI
>+  // for consistency.
>+  loadInfo->SetSecurityCheckURI(aSecurityCheckURI);

This should probably document that it has to come _before_ the NS_NewChannelInternal call, so the protocol handler can override if it wants to, right?

>+                             nullptr,                   // unknown?

 // defer to protocol handler

perhaps?  It's a bit weird, because the protocol handler _can_ always override, so this is really about the cases when it _chooses_ not to.

>+++ b/docshell/shistory/nsISHEntry.idl
>+     * TODO

Needs to be done.

>+++ b/dom/jsurl/nsJSProtocolHandler.cpp

I don't understand the comments here.  What do they actually mean?  Why does this channel always want to use whatever its originalURI is?

>+++ b/modules/libjar/nsJARChannel.cpp

So what is the resulting behavior of jar: when the inner channel gets redirected?

>+++ b/netwerk/protocol/res/SubstitutingProtocolHandler.cpp
>+  // We must set security check URI prior calling SubstituteChannel

"prior to calling"
Flags: needinfo?(bzbarsky)
Attachment #8851791 - Flags: review?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #73)
> Comment on attachment 8851791 [details] [diff] [review]
> v9
> 
> So a few general comments:
> 
> 1)  Now that I read this thing as a whole, I'm unsure about the
> "securityCheckURI" naming.  It's more like the "uriForResultPrincipal" or
> "principalURI" or something...

resultPrincipalURI ?

> >+  // Always rewrite, including null value of aSecurityCheckURI
> >+  // for consistency.
> >+  loadInfo->SetSecurityCheckURI(aSecurityCheckURI);
> 
> This should probably document that it has to come _before_ the
> NS_NewChannelInternal call, so the protocol handler can override if it wants
> to, right?

I think this has to change.  If only the protocol handlers are allowed to change this URI, then it should be setup correctly after NS_NewChannelInternal here.  But still, this URI could have been changed afterwards on the channel we are trying to reproduce here from either the session store or as a reload.  Hence, I think it should be set AFTER NS_NewChannelInternal call.  If no one has set the result principal URI on the original channel after it has been created, then the stored one (aSecurityCheckURI) will be, by value, equal to the one we will find set in the load info after the NS_NewChannelInternal call.  So, this will be a no-op.  If the URI has been changed, then we will get to the right state.  Unless... we in the future fix some error in some protocol handler that was setting a wrong result princ URI...  That error would be persisted this way.

> >+++ b/dom/jsurl/nsJSProtocolHandler.cpp
> 
> I don't understand the comments here.  What do they actually mean?  Why does
> this channel always want to use whatever its originalURI is?

It seems to be a leftover.  Will be removed.

> >+++ b/modules/libjar/nsJARChannel.cpp
> 
> So what is the resulting behavior of jar: when the inner channel gets
> redirected?

I think there is no need to change the jar channel code at all.  Again, a leftover.  

LOAD_REPLACE can now only indicate a redirect or a part of a multimixed content (both subject to change in a different bug).  In both cases we want to set the flag also on the jar channel.  And I believe that leaving the load info unchanged here is the way to go.  If the target principal URI on the jar channel's load info has been set to something, by an overlying protocol handler - if there is any, it should be left as is.  If it's null, we will correctly use a post-redirected-updated mJarURI (exposed as jarchannel.uri) as a result of NS_GetFinalChannelURI, the same result we would get w/o this patch.


Try-pushed as https://treeherder.mozilla.org/#/jobs?repo=try&revision=30db8db623bfbc45dfa7c87bac649672f8906e55
> resultPrincipalURI ?

That sounds good, thank you.

> If no one has set the result principal URI on the original channel after it has been created, then the stored one (aSecurityCheckURI)

The problem is that there may not be a stored one at all.  There's only a stored one for history loads and session restore.  The common case of all other navigations is that aSecurityCheckURI is null here.

But more importantly... why would someone other than a protocol handler ever change the resultPrincipalURI?  The whole point is for the protocol handler or equivalent thereof to communicate some information about cases in which the channel URI (which the protocol handler fully controls) doesn't correctly capture the principal the channel should actually have.  And it's only relevant in the cases in which no one else messes with that principal directly via the loadinfo so we end up falling through to "get the principal from the channel URI" cases.  Maybe we should make this more explicit somehow...

Agreed on the jarchannel/jsprotocolhandler comments.
So, it seems like there is no need to store the target principal URI at all, as it's always a product of a channel creation with no additional stored information needed.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd4e275a807efbadc77ebd4e8211f1df239ac600

This removes the sessions store bits completely and is a candidate to remove the aTargetPrincipalURI from various places in docshell as it's unused.
Attached patch v10 (obsolete) — Splinter Review
Diff from v9:

- new name: result principal URI
- removes all the changes from docshell
- removes all the changes from session store
- removes nsJSChannel changes
- removes nsJARChannel changes

- there still are no explicit checks or API adjustments preventing modification of resultPrincipalURI outside protocol handlers, note there is one place we need to null it out (channel redirect)


This turns out to only be a removal of LOAD_REPLACE flag set from protocol handlers and adding loadInfo.resultPrincipalURI = channel.originalURI to protocol handlers where originalURI is set to something different from the URI while LOAD_REPLACE was not set.


https://treeherder.mozilla.org/#/jobs?repo=try&revision=a48a5c9eb9cbf0806c5c5d4ebcc2032a35b11102
Attachment #8851753 - Attachment is obsolete: true
Attachment #8851791 - Attachment is obsolete: true
Attachment #8855324 - Flags: review?(bzbarsky)
I'm sorry for the lag here; I will try to get to this tomorrow.
Honza, within Bug 1286838 I am trying to deprecate the owner of the channel and hence would need a very similar structure that you are about to add within this Bug. Within Bug 1286838 I have problems to return the correct principal within GetChannelResultPrincipal() and by looking at your bug you are adding exactly those bits to the loadInfo that I am seeking, but as a URI instead of a principal. (At least as far as I can tell).

Just wondering, is there a particular reason you are adding a URI instead of a principal or could we update to your patch to add loadinfo.channelResultPrincipal?
Flags: needinfo?(honzab.moz)
(In reply to Christoph Kerschbaumer [:ckerschb] from comment #79)
> Honza, within Bug 1286838 I am trying to deprecate the owner of the channel
> and hence would need a very similar structure that you are about to add
> within this Bug. Within Bug 1286838 I have problems to return the correct
> principal within GetChannelResultPrincipal() and by looking at your bug you
> are adding exactly those bits to the loadInfo that I am seeking, but as a
> URI instead of a principal. (At least as far as I can tell).
> 
> Just wondering, is there a particular reason you are adding a URI instead of
> a principal or could we update to your patch to add
> loadinfo.channelResultPrincipal?

The main purpose of this patch was to change source of data for NS_GetFinalChannelURI function, which returns only a URI.  The patch after few iterations ended up as 'if a protocol handler sets originalURI on a channel, it also sets resultPrincipalURI to the same URI on the channel's load info ; if not, leave it null'.  

What you want means two things:
- set loadInfo.resultPrincipal every time, now we set loadInfo.resultPrincipalURI only when a channel is set originalURI and wants to use it for security checks
- have a principal in hands, which is something I'm not sure I always have in nsIProtocolHandler::NewChannel(2)

I'm also a bit afraid that setting the resultPrincipal (or just the URI) on load info every time may break some assumptions about that property manipulations.  There are few cases a protocol handler just delegates to a different (inner) protocol handler.  The top level channel and the associated load info is sometimes adjusted before and sometimes conditionally after it's given to the inner protocol handler.  And when the inner protocol rewrites loadInfo.resultPrincipal(URI) unexpectedly, we could end up with something we don't want.  But that's all probably fixable, just needs to carefully look at all the places I've modified in my patch.

Note that this patch cost me a lot of time already, and what you suggest means to dive into this and write most of it again.  I'm afraid I don't have cycles anymore.
Flags: needinfo?(honzab.moz)
(In reply to Honza Bambas (:mayhemer) from comment #80)
> The main purpose of this patch was to change source of data for
> NS_GetFinalChannelURI function, which returns only a URI.  The patch after
> few iterations ended up as 'if a protocol handler sets originalURI on a
> channel, it also sets resultPrincipalURI to the same URI on the channel's
> load info ; if not, leave it null'.  
> What you want means two things:
> - set loadInfo.resultPrincipal every time, now we set
> loadInfo.resultPrincipalURI only when a channel is set originalURI and wants
> to use it for security checks
> - have a principal in hands, which is something I'm not sure I always have
> in nsIProtocolHandler::NewChannel(2)
> 
> I'm also a bit afraid that setting the resultPrincipal (or just the URI) on
> load info every time may break some assumptions about that property
> manipulations.  There are few cases a protocol handler just delegates to a
> different (inner) protocol handler.  The top level channel and the
> associated load info is sometimes adjusted before and sometimes
> conditionally after it's given to the inner protocol handler.  And when the
> inner protocol rewrites loadInfo.resultPrincipal(URI) unexpectedly, we could
> end up with something we don't want.  But that's all probably fixable, just
> needs to carefully look at all the places I've modified in my patch.
> 
> Note that this patch cost me a lot of time already, and what you suggest
> means to dive into this and write most of it again.  I'm afraid I don't have
> cycles anymore.

Honza, thanks for your answer. I see what you are saying and I am not asking that you should rewrite the patch. But going forward I suppose we could add similar semantics for the result principal. E.g. add loadinfo.channelResultPrincipal which we could set in certain ProtoclHandlers. If that is set, we then return that principal when calling scriptSecurityManager::GetChannelResultPrincipal and if left null we could just fall back and do what we are currently doing. But I guess that's more a discussion for Bug 1286838.
OK, I am no expert to principals, but from how you describe it, it sounds like NS_GetFinalChannelURI and scriptSecurityManager::GetChannelResultPrincipal may do very similar thing.  Should that be duplicated, we might need to think about de-duplication here.
Summary: Expose security check URL ("final channel URL") on LoadInfo, convert current consumers of LOAD_REPLACE → Expose result principal URL ("final channel URL") on LoadInfo, convert current consumers of LOAD_REPLACE
Comment on attachment 8855324 [details] [diff] [review]
v10

One overall problem I just realized: in a bunch of places where we're setting resultPrincipalURI we might have a null loadinfo.  It's probably OK to just throw from newChannel in those case, so the JS ones need no changes, but the C++ ones need to null-check and either throw or silently not set resultPrincipalURI.  A few of the C++ callees already do the latter...  I'm not quite sure which is better.

>+++ b/browser/components/about/AboutRedirector.cpp
>       // If tempURI links to an external URI (i.e. something other than
>+      // chrome:// or resource://) then set the result principal URL on the
>+      // channel's load info which forces the channel principal to reflect the
>+      // displayed URL rather then being the systemPrincipal.

etc.  This comment is backwards.  We set the result principal URL if tempURI links to chrome:// or resource://.  If it is _not_ set, then the principal will reflect the displayed URL, right?

>       rv = NS_NewChannelInternal(getter_AddRefs(tempChannel),
>                                  tempURI,
>                                  aLoadInfo,
>                                  nullptr, // aLoadGroup
>                                  nullptr, // aCallbacks
>-                                 loadFlags);
>+                                 nsIChannel::LOAD_NORMAL);

LOAD_NORMAL is the default value.  This can just become:

       rv = NS_NewChannelInternal(getter_AddRefs(tempChannel),
                                  tempURI,
                                  aLoadInfo);

>+++ b/devtools/client/framework/about-devtools-toolbox.js

Do we really want to override whatever resultPrincipalURI the protocol handler for this.uri decided to set?  If we do, that probably deserves a comment explaining why.  If not, we should set aLoadInfo.resultPrincipalURI before the newChannelFromURIWithLoadInfo call, right?

>+++ b/docshell/base/nsAboutRedirector.cpp

Same comments apply here as to browser/components/about/AboutRedirector.cpp.

Also, the comment completely ignores the isAboutBlank thing.

>+++ b/docshell/base/nsDocShell.cpp
> It's better
>+    // to set the flag on more occurences than on less. 

I don't follow why.  Presumably this whole chunk of code should go away in a followup.  Please make sure this followup is filed and its bug number mentioned in this comment.

>+++ b/netwerk/base/nsILoadInfo.idl
>+   * If set to a non-null value, this property carries the URL
>+   * we should do security checks against on this channel as well
>+   * as it carries the URL that should appear in the address bar.

How about:

  If this is non-null, this property represents two things: (1) the
  URI to be used for the principal if the channel with this loadinfo
  gets a principal based on URI and (2) the URI to use for a
  document created from the channel with this loadinfo.

>+HttpBaseChannel::CloneLoadInfoForRedirect(nsIURI * newURI, uint32_t redirectFlags)

Could probably early-return null if !mLoadInfo and outdent some code.

Please make sure that you merge to the bug 1353975 fix correctly and it doesn't get lost.

>+++ b/netwerk/protocol/res/ExtensionProtocolHandler.h
>+  // Caller must pass non-null and add-reffed |result|.  |result| will be
>+  // preserved invariant when this method succeeds.

That second sentence doesn't make sense.  result can get changed on success, no?

I would think this comment should say something like:

  // result is an inout param.  On entry to this function, *result
  // is expected to be non-null and already addrefed.  This function 
  // may release the object stored in *result on entry and write
  // a new pointer to already addrefed channel to *result.

or so.

r=me with those all fixed.  I'm really sorry for the lag here last week.  :(
Attachment #8855324 - Flags: review?(bzbarsky) → review+
Thanks Boris for catching all these details.  I'm currently in a work week, hence I may get to an update of the patch here no sooner than the next week anyway.
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #83)
> >+++ b/docshell/base/nsDocShell.cpp
> > It's better
> >+    // to set the flag on more occurences than on less. 
> 
> I don't follow why.  Presumably this whole chunk of code should go away in a
> followup.  Please make sure this followup is filed and its bug number
> mentioned in this comment.

I changed the comment a bit.

to explain:

For channels that have been redirected and are restored from the session store we want to make sure the flag is correctly set (as an indication of a redirect) to not break any code expecting the flag be set - i.e. not setting it may IMO cause harm.  

If the flag has originally been set on a channel as an intention to change the result of NS_GetFinalChannelURL, it won't have any effect if we still set it here - it will cause no harm.

There is bug 1319110 for this.
Great, that's much clearer.
Attached patch v10.4 (obsolete) — Splinter Review
- the about:devtools-toolbox proto handler must set resultPrincipalURI after creation of the channel (comment added)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a3e081b43c130c4ad6619d99c15e6cf1acf69b6
Attachment #8862060 - Attachment is obsolete: true
Attachment #8862420 - Flags: review+
Ready to land!
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/88c2a0b2dde5
Exposing URI to make security check against on LoadInfo (no LOAD_REPLACE flag). r=bz
Keywords: checkin-needed
Backed out for Talos g2 failures (looks like they run more than 1h now, the test before 36m):

https://hg.mozilla.org/integration/mozilla-inbound/rev/99fec9a1c20452e524e1b28b7114275173c1d4db

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=88c2a0b2dde53837f770d76050f35aaa18c03bbd&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=94946116&repo=mozilla-inbound

13:24:15     INFO -  PID 3304 | [#50] console.bulklog  Cycles:25  Average:184.50  Median:166.53  stddev:36.40 (21.9%)  stddev-sans-first:36.51
13:24:15     INFO -  PID 3304 | Values: 217.4  166.5  160.6  215.0  163.4  200.7  163.4  162.2  164.7  191.9  179.6  160.2  161.4  165.0  155.2  212.0  183.9  157.2  170.9  293.1  186.8  282.2  153.6  164.0  181.6
13:24:15     INFO -  PID 3304 |
13:24:15     INFO -  PID 3304 | [#51] console.streamlog  Cycles:25  Average:31.32  Median:31.00  stddev:2.64 (8.5%)  stddev-sans-first:2.68
13:24:15     INFO -  PID 3304 | Values: 30.0  30.0  34.0  31.0  34.0  35.0  30.0  30.0  31.0  30.0  30.0  28.0  31.0  30.0  31.0  30.0  31.0  33.0  28.0  31.0  31.0  32.0  36.0  27.0  39.0
13:24:15     INFO -  PID 3304 | -------- Summary: end --------
13:24:15     INFO -  PID 3304 |
13:24:16     INFO -  TEST-INFO | 3304: exit 0
13:24:17     INFO -  TEST-OK | damp | took 1676884ms
13:24:17     INFO -  TEST-START | tps
13:24:17     INFO -  operating with platform_type : w7_
13:24:17     INFO -  Initialising browser for tps test...
13:24:17     INFO -  TEST-INFO | started process 2388 (C:\slave\test\build\application\firefox\firefox -profile c:\users\cltbld\appdata\local\temp\tmpdo_jhf\profile http://localhost:49893/getInfo.html)
13:24:30     INFO -  PID 2388 | __metrics	Screen width/height:1600/1200
13:24:30     INFO -  PID 2388 | 	colorDepth:24
13:24:30     INFO -  PID 2388 | 	Browser inner width/height: 1010/676
13:24:30     INFO -  PID 2388 | __metrics
13:24:31     INFO -  PID 2388 | Unable to read VR Path Registry from C:\Users\cltbld\AppData\Local\openvr\openvrpaths.vrpath
13:24:31     INFO -  PID 2388 | JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 673: Error: Phase "quit-application-granted" is finished, it is too late to register completion condition "SessionStore: flushing all windows"
13:24:31     INFO -  PID 2388 | JavaScript error: resource:///modules/BrowserUsageTelemetry.jsm, line 334: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]
13:24:32     INFO -  TEST-INFO | 2388: exit 0
13:24:32     INFO -  Browser initialized.
13:24:32     INFO -  Running cycle 1/1 for tps test...
13:24:32     INFO -  TEST-INFO | started process 1900 (C:\slave\test\build\application\firefox\firefox -profile c:\users\cltbld\appdata\local\temp\tmpdo_jhf\profile -tp file:\C:\slave\test\build\tests\talos\talos\tests\tabswitch\tps.manifest.develop -tpchrome -tpnoisy -tploadnocache -tpcycles 1 -tppagecycles 5)
13:24:44     INFO -  PID 1900 | RSS: Main: 137089024
13:24:44     INFO -  PID 1900 |
13:29:33     INFO -  PID 1900 | *************************
13:29:33     INFO -  PID 1900 | A coding exception was thrown and uncaught in a Task.
13:29:33     INFO -  PID 1900 |
13:29:33    ERROR -  PID 1900 | Full message: TypeError: NetworkError when attempting to fetch resource.
13:29:33     INFO -  PID 1900 | Full stack:
13:29:33     INFO -  PID 1900 | *************************
13:29:33     INFO -  PID 1900 | *************************
13:29:33     INFO -  PID 1900 | A coding exception was thrown and uncaught in a Task.
13:29:33     INFO -  PID 1900 |
13:29:33    ERROR -  PID 1900 | Full message: TypeError: NetworkError when attempting to fetch resource.
13:29:33     INFO -  PID 1900 | Full stack:
13:29:33     INFO -  PID 1900 | *************************
13:29:33     INFO -  PID 1900 | *************************
13:29:33     INFO -  PID 1900 | A coding exception was thrown and uncaught in a Task.
13:29:33     INFO -  PID 1900 |
13:29:33    ERROR -  PID 1900 | Full message: TypeError: NetworkError when attempting to fetch resource.
13:29:33     INFO -  PID 1900 | Full stack:
13:29:33     INFO -  PID 1900 | *************************
13:29:33     INFO -  PID 1900 | *************************
13:29:33     INFO -  PID 1900 | A coding exception was thrown and uncaught in a Task.
13:29:33     INFO -  PID 1900 |
13:29:33    ERROR -  PID 1900 | Full message: TypeError: NetworkError when attempting to fetch resource.
13:29:33     INFO -  PID 1900 | Full stack:
13:29:33     INFO -  PID 1900 | *************************
14:24:32     INFO -  Timeout waiting for test completion; killing browser...
14:24:33     INFO -  Terminating psutil.Process(pid=1900, name=u'firefox.exe')
14:24:33     INFO -  Unable to kill process
14:24:33    ERROR -  Traceback (most recent call last):
14:24:33     INFO -    File "C:\slave\test\build\tests\talos\talos\talos_process.py", line 144, in run_browser
14:24:33     INFO -      return_code = context.kill_process()
14:24:33     INFO -    File "C:\slave\test\build\tests\talos\talos\talos_process.py", line 41, in kill_process
14:24:33     INFO -      procs = self.process.children()
14:24:33     INFO -    File "C:\slave\test\build\venv\lib\site-packages\psutil\__init__.py", line 299, in wrapper
14:24:33     INFO -      raise NoSuchProcess(self.pid, self._name)
14:24:33     INFO -  NoSuchProcess: psutil.NoSuchProcess process no longer exists (pid=1900, name=u'firefox.exe')
14:24:33     INFO -  mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/XHpUwE37TzCMkHUU0c5ntQ/artifacts/public/build/firefox-55.0a1.en-US.win32.crashreporter-symbols.zip
14:24:50     INFO -  mozcrash Copy/paste: C:\slave\test\build\win32-minidump_stackwalk.exe c:\users\cltbld\appdata\local\temp\tmpdo_jhf\profile\minidumps\e2ba91ac-db31-48fa-ad51-27d031c696da.dmp c:\users\cltbld\appdata\local\temp\tmp44ng1e
14:24:58     INFO -  mozcrash Saved minidump as C:\slave\test\build\blobber_upload_dir\e2ba91ac-db31-48fa-ad51-27d031c696da.dmp
14:24:58     INFO -  PROCESS-CRASH | tps | application crashed [unknown top frame]
14:24:58     INFO -  Crash dump filename: c:\users\cltbld\appdata\local\temp\tmpdo_jhf\profile\minidumps\e2ba91ac-db31-48fa-ad51-27d031c696da.dmp
14:24:58     INFO -  Operating system: Windows NT
14:24:58     INFO -                    6.1.7601 Service Pack 1
14:24:58     INFO -  CPU: x86
14:24:58     INFO -       GenuineIntel family 6 model 30 stepping 5
14:24:58     INFO -       8 CPUs
14:24:58     INFO -  GPU: UNKNOWN
14:24:58     INFO -  No crash
14:24:58     INFO -  Process uptime: 3601 seconds
Flags: needinfo?(honzab.moz)
This bug touches pdf.js code that is maintained upstream and then updated in-tree via periodic syncs. Please be sure to coordinate with the upstream maintainers to make sure that this change doesn't get clobbered next time we update too.
See Also: → 1358539
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #91)
> Backed out for Talos g2 failures (looks like they run more than 1h now, the
> test before 36m):

It looks to me like the g2 timeouts continued after this backout, until bug 1337062 was backed out -- this bug may not have been at fault.
Is there some way to get more detailed console logs from the run?  Or at least know which exact test case is hanging it?  (presuming not as these are opt builds.) 

According comment 94, it's clear this causes a perma orange of g2.
Flags: needinfo?(aryx.bugmail)
> Is there some way to get more detailed console logs from the [Talos] run?  Or at
> least know which exact test case is hanging it?  (presuming not as these are
> opt builds.) 
Joel knows better about Talos.
Flags: needinfo?(aryx.bugmail) → needinfo?(jmaher)
there is no easy way to get data- this looks like it is failing on all platforms, can you run talos locally: |mach talos-test -a tps| and see what happens?

if g2 is failing and on tps, then there is probably something in the tps code triggering this:
https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/tabswitch
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher) from comment #97)
> there is no easy way to get data- this looks like it is failing on all
> platforms, can you run talos locally: |mach talos-test -a tps| and see what
> happens?
> 
> if g2 is failing and on tps, then there is probably something in the tps
> code triggering this:
> https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/
> tabswitch

Thanks for this clue!  I think I need to repack [1] since it needs to reflect changes to [2] introduced in the patch.

Since the name 'tabswitch-signed.xpi' suggests this needs a signing cert, how exactly should that be done?


[1] https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/tabswitch/tabswitch-signed.xpi
[2] https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/tabswitch/content/tabswitch-content-process.js
Flags: needinfo?(honzab.moz) → needinfo?(jmaher)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #92)
> This bug touches pdf.js code that is maintained upstream and then updated
> in-tree via periodic syncs. Please be sure to coordinate with the upstream
> maintainers to make sure that this change doesn't get clobbered next time we
> update too.

Ryan, is there a separate repo this change should land on too?  Similarly to how we are landing NSPR changes?  Otherwise I'm not sure what exactly to do here...
Flags: needinfo?(ryanvm)
we have some documentation on how to resign the talos addons up here:
https://wiki.mozilla.org/EngineeringProductivity/HowTo/SignExtensions#Use_Temporary_Addons

Let me know if you need me to do that.
Flags: needinfo?(jmaher)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #100)
> https://github.com/mozilla/pdf.js/

Thanks!  https://github.com/mozilla/pdf.js/pull/8367
(In reply to Joel Maher ( :jmaher) from comment #101)
> we have some documentation on how to resign the talos addons up here:
> https://wiki.mozilla.org/EngineeringProductivity/HowTo/
> SignExtensions#Use_Temporary_Addons
> 
> Let me know if you need me to do that.

Thanks, but I'm not sure what exactly you mean in the document to use.  It's more about how to avoid signing.

Mike, could you please help me with resigning the addon at https://dxr.mozilla.org/mozilla-central/source/testing/talos/talos/tests/tabswitch/tabswitch-signed.xpi?  

I'm not sure how to proceed, thanks.
Flags: needinfo?(mconley)
Hey mayhemer,

I suspect jmaher meant to point you at https://wiki.mozilla.org/EngineeringProductivity/HowTo/SignExtensions#Signing_an_Addon

Those instructions are pretty straight forward, but if you don't have web-ext or something installed, let me know, and I can sign it for you.
Flags: needinfo?(mconley)
(In reply to Mike Conley (:mconley) from comment #104)
> Hey mayhemer,
> 
> I suspect jmaher meant to point you at
> https://wiki.mozilla.org/EngineeringProductivity/HowTo/
> SignExtensions#Signing_an_Addon
> 
> Those instructions are pretty straight forward, but if you don't have
> web-ext or something installed, let me know, and I can sign it for you.

Thanks, this helped a lot!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=076ee88b683cef10d5d56ac0545972440b138aeb
- repacked and resigned talos/talos/tests/tabswitch/tabswitch-signed.xpi causing g2 perma failures

For a try push of that particular change see comment 105
Attachment #8862420 - Attachment is obsolete: true
Attachment #8864549 - Flags: review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f9abb9c83452
Expose URI to make security check against on LoadInfo (no LOAD_REPLACE flag). r=bz
Keywords: checkin-needed
Is this bugfix already included in the nightly from 2017-05-05?
No, it even hasn't been merged to mozilla-central yet.
https://hg.mozilla.org/mozilla-central/rev/f9abb9c83452
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Depends on: 1362458
Depends on: 1362829
Depends on: 1362461
Could this have caused bug 1362827? If so, this bug needs to be flagged as addon-compat
Flags: needinfo?(honzab.moz)
Yes, it absolutely could have.

The basic problem is that the old behavior more or less defaulted to "use the originalURI unless LOAD_REPLACE is set" and the new behavior more or less defaults to "use the URI unless resultPrincipalURI is set".

What that means is that any addons that set originalURI should probably be changed to also set resultPrincipalURI on the loadinfo.... 

I sort of wish we could have held off on landing this until we drop addon support, but we need this to fix some webextensions issues.  :(
Depends on: 1362827
Keywords: addon-compat
Alternately, if we did the "assume setting originalURI implies setting resultPrincipalURI" thing that would make most addons work, except the ones that implement their own channel classes...
Boris, before the effect chain of this bug spreads too much, I'm trying to think of why I haven't implemented this backwards: 

1. convert |= LOAD_REPLACE to resultPrincipalURI = URI 
2. convert &= ~LOAD_REPLACE to explicit resultPrincipalURI = nullptr
3. leave the default be originalURI of the channel (when no load info or no resultPrincipalURI)

When we also change MOZ_ASSERT(!channelLoadFlags & nsIChannel::LOAD_REPLACE) in NS_NewChannelInternal to a DIAGNOSTIC_ASSERT, we may catch most add-on handlers that still set this flag.  I believe there is way smaller number of handlers setting LOAD_REPLACE than setting originalURI.

I went through out the patch and the suggested change would work.  I would also be a way less changes.


What do you think?
Flags: needinfo?(honzab.moz) → needinfo?(bzbarsky)
Hmm.  I'm not seeing an obvious reason we didn't do it that way, other than the discussion in bug 1256122 somehow ending up on the setup we ended up doing in bug 1256122 comment 55.

I agree that doing what you propose would likely be more extension-compatible and should have the same behavior as what we have now for non-extension cases.  Are you willing to write the corresponding patch?
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #115)
> Hmm.  I'm not seeing an obvious reason we didn't do it that way, other than
> the discussion in bug 1256122 somehow ending up on the setup we ended up
> doing in bug 1256122 comment 55.
> 
> I agree that doing what you propose would likely be more
> extension-compatible and should have the same behavior as what we have now
> for non-extension cases.  Are you willing to write the corresponding patch?

I definitely am!
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I will back introduce two new patches:
- backout of 10.5
- "reverse approach" implementation according comment 114

and land them together.
Status: REOPENED → ASSIGNED
Attached patch v10.5 backout (obsolete) — Splinter Review
Attachment #8864549 - Attachment description: v10.5 → v10.5 (resultPrincipalURI = originalURI, default: URI)
> 3. leave the default be originalURI of the channel (when no load info or no resultPrincipalURI)

I remember the problem with this now: What is the originalURI for an HTTP post-redirect channel?  If the plan was to have the originalURI propagated across HTTP redirects (and I think that was the plan) then using it for security anything is no good.  If a patch that does this passes tests, then we need more tests.

And "convert |= LOAD_REPLACE to resultPrincipalURI = URI" doesn't save us, because that clobbers the resultPrincipalURI during an HTTP redirect in the cases when the target protocol handler has already set it, right?  In other words, it breaks the thing we're trying to fix in bug 1256122.
Though I guess that if HTTP sets up the resultPrincipalURI to the URI it's redirecting to _before_ calling the target protocol handler and then the target protocol handler overrides it, that would probably work.  But that's not what the patches I see on try are doing.
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #123)
> > 3. leave the default be originalURI of the channel (when no load info or no resultPrincipalURI)
> 
> I remember the problem with this now: What is the originalURI for an HTTP
> post-redirect channel?  

During the on-redirect notification, it's the URL as set by the protocol handler during its creation.  After the on-redirect notification it's set to the first URL in the chain.

> If the plan was to have the originalURI propagated
> across HTTP redirects (and I think that was the plan) 

The plan and necessity is to preserve function of originalURI to be the same as before this patch.

> then using it for
> security anything is no good.  If a patch that does this passes tests, then
> we need more tests.
> 
> And "convert |= LOAD_REPLACE to resultPrincipalURI = URI" doesn't save us,
> because that clobbers the resultPrincipalURI during an HTTP redirect in the
> cases when the target protocol handler has already set it, right?  

Yes, but I'm then not sure if that's bad or not.

What the latest patch (v11) does is that after the channel has been created the resultPrincipalURI is set to the URI we are redirecting to.  In case

> In other
> words, it breaks the thing we're trying to fix in bug 1256122.

I think opposite (but I may still miss something..)  With v11 patch here NS_GetFinalChannelURI for the new channel will return the moz-extension:// URI, since that is the one we redirect to and - as I understand it - is also the principal for the channel.

See also comment 17 in this bug, specifically the "For the scenario from bug 1256122" part - is it actually correct?  There is one exception to it: the URI of the channel is meant to be jar:// with inner URL pointing to file:// and not just file:// URL.

If not, then we should again state clearly what has to be done here.

Thank you again (and again and again :))) for you feedback, Boris.
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #124)
> Though I guess that if HTTP sets up the resultPrincipalURI to the URI it's
> redirecting to _before_ calling the target protocol handler and then the
> target protocol handler overrides it, that would probably work.  But that's
> not what the patches I see on try are doing.

Maybe yes.  I have to go now (it's too late here) and this needs some thinking.  Will answer on Thursday.
> What the latest patch (v11) does is that after the channel has been created the resultPrincipalURI
> is set to the URI we are redirecting to.

That means the target protocol handler can't control things, right?  It's effectively forced into the resultPrincipalURI being whatever the thing was that was passed to newChannel.

In particular, the about: cases that want to set the resultPrincipalURI to something that is _not_ the about: URI would fail to do so in this setup.

I do agree that doing things that way would make the moz-extension case work, since that case does want the URI passed to newChannel to be the resultPRincipalURI.
(In reply to Honza Bambas (:mayhemer) from comment #125)

> See also comment 17 in this bug, specifically the "For the scenario from bug
> 1256122" part - is it actually correct?  There is one exception to it: the
> URI of the channel is meant to be jar:// with inner URL pointing to file://
> and not just file:// URL.

My understanding is that moz-ext: will resolve to either a jar: or a file:, depending on how the extension is loaded, file: if loaded from about:debugging.  That makes me think I need to add another test for loads that happen using file:.
(In reply to Honza Bambas (:mayhemer) from comment #119)
> Created attachment 8865909 [details] [diff] [review]
> v10.5 backout

So will anyone ever land this? Or do we have to contend with constant crashing until the new approach lands?
(In reply to avada from comment #129)
> (In reply to Honza Bambas (:mayhemer) from comment #119)
> > Created attachment 8865909 [details] [diff] [review]
> > v10.5 backout
> 
> So will anyone ever land this? Or do we have to contend with constant
> crashing until the new approach lands?

No.  What we want is bug 1362829.
(In reply to Honza Bambas (:mayhemer) from comment #130)
> No.  What we want is bug 1362829.

To be more precise, I'd rather first spend time figuring out if we are going to backout or not at all.  I hope to have a resolution soon here.
Comment on attachment 8865909 [details] [diff] [review]
v10.5 backout

Boris, this is a clean backout of the v10.5 patch which is crashy.  I think it may take some time before I finish the work on the v11.  I also found some details missing even in this patch that need fixing, one more reason to take it out.
Attachment #8865909 - Flags: review?(bzbarsky)
(In reply to Honza Bambas (:mayhemer) from comment #134)
> Boris, this is a clean backout of the v10.5 patch which is crashy.
Just out of interest, can't you just back out anything landed recently without review?
(In reply to Jorg K (GMT+2) from comment #135)
> (In reply to Honza Bambas (:mayhemer) from comment #134)
> > Boris, this is a clean backout of the v10.5 patch which is crashy.
> Just out of interest, can't you just back out anything landed recently
> without review?

What exactly do you have in mind?
Well, in cases I wanted stuff backed out, I talked to a sheriff and they backed it out for me (with commit message: Backed out at developer's request, or words to that extent). Or I've seen people push a backout to m-i. Maybe in this case it's harder since the original patch touched many files and can't be backed out easily, hence the need for a review, I assume.
The patch is in for few days and I want to make sure Boris doesn't know about anything that should stop me.  I probably can back this out even without an r+, tho.
Comment on attachment 8865909 [details] [diff] [review]
v10.5 backout

r=me, though I just skimmed this.
Attachment #8865909 - Flags: review?(bzbarsky) → review+
Attachment #8864549 - Attachment description: v10.5 (resultPrincipalURI = originalURI, default: URI) → v10.5 [backed out comment 142] (resultPrincipalURI = originalURI, default: URI)
I'm going to back out the backout for the talos issues.
Backout by kwierso@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d7711b69b013
Backed out changeset 95ff98f1c2e5 for talos failures a=backout
(In reply to Wes Kocher (:KWierso) from comment #143)
> This backout appears to have caused Windows talos (g2) tests to start
> failing: 
> https://treeherder.mozilla.org/#/jobs?repo=mozilla-
> inbound&noautoclassify&fromchange=823615e72783e6563d72b20374f68b25030a88d5&fi
> lter-
> searchStr=windows%20(g2&tochange=9730f88bdbd3d167d3c5b1acb0618cd2533bb570

Grr.. from some reason I absolutely don't follow the backout patch lost the changes to talos/tests/tabswitch/tabswitch-signed.xpi.

A better one will be provided ASAP.  Thanks and sorry for trouble, this was intended as a clean thing.
Flags: needinfo?(honzab.moz)
Why not to let channels rewrite loadinfo.resultPrincipalURI when their originalURI is set: function of originalURI is going to be preserved as the first URL in a redirect chain, i.e. the originally navigated URL that started the load.

OriginalURI on a redirect target channel is set (to that value) by the redirect source nsHttpChannel _after_ redirect veto checks have been done.  If we let channels modify resultPrincipalURI when originalURI is touched, this would ruin the carefully set resultPrincipalURI and go against comment 17.
It also turns out it's a bad idea to set loadinfo.resultPrincipalURI = null when (before this patch) a protocol handler is dropping the LOAD_REPLACE flag.  

That will cause that the channel's originalURI will be used for result principal and when we are _redirecting to_ such a channel, that will be overwritten after redirect vetoing is done to the originating URL (source URL) which is defintiely unwanted.

Hence, such handlers MUST set result principal URI to the desired value (not null), in that case the same URL that is passed to result channel's originalURI.
Attachment #8864549 - Attachment description: v10.5 [backed out comment 142] (resultPrincipalURI = originalURI, default: URI) → v10.5 [backed out of m-i comment 147] (resultPrincipalURI = originalURI, default: URI)
(In reply to Honza Bambas (:mayhemer) from comment #149)
> Hence, such handlers MUST set result principal URI to the desired value (not
> null), in that case the same URL that is passed to result channel's
> originalURI.

Errr... no.  The latest research shows we have to save the result principal URI and restore it after we create the inner channel.  This applies to two of our handlers that are dropping LOAD_REPLACE and resets orignalURI - chrome and substituting protocol handlers.
Backout (looks like the sheriff forgot):
https://hg.mozilla.org/mozilla-central/rev/b25962573b3a
I think it's mature enough to get a review.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e7e0a4f117d929d14a595f3fb3e4ec9e870403a6
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9c6836f4c604ecee1bfea133c9c2163126593408


- NS_GetFinalChannelURI defaults to the channel's originalURI (turns out to be less number of changes in our codebase and likely in add-ons code too)

- Adds serialization and deserialization of result principal URI in the load info object over IPC (was missing in the <=v10 patches)

- HTTP redirect target loadinfo resultPrincipalURI is pre-set to the target channel's URI (originalURI at its creation time) before the channel is being created

reason: 
* if the target protocol handler doesn't change the result principal URI (most cases), result principal URI will correctly reflect the target URI (the URI we redirect to), but this approach also gives the handler a chance to change the result URI at will

(note that originalURI on an HTTP redirect target channel is set to the first URI of the redirect chain _after_ the vetoing and security checking is done ; this bit has not been changed by the patch)

- protocol handlers previously setting LOAD_REPLACE flag to change NS_GetFinalChannelURI result now set the desired URI on loadinfo.resultPrincipalURI unconditionally (side by also changing the channel's URI)

- protocol handlers previously dropping LOAD_REPLACE flag after an "inner" channel creation are saving loadinfo.resultPrincipalURI before creating the "inner" channel and then restoring it back

reason: 
* it can't be set to the value being set as the inner channel's originalURI because it might override any previously set result URI (mainly in case of a redirect to this channel) to a different value (e.g. in case there are more nesting levels when creating channels this way - like substitution)
* it can't be set to null to make NS_GetFinalChannelURI return the value from the result channel's originalURI (whatever value it may end up during a nested channel creation) because in case this is a redirect target channel, the channel's originalURI will be changed to the redirect target URI which is likely different from the currently set result principal URI.  

Hence, the only way is to preserve the result principal URI spanning the inner channel creation.
Attachment #8867494 - Attachment is obsolete: true
Attachment #8867685 - Flags: review?(bzbarsky)
Comment on attachment 8867685 [details] [diff] [review]
v11 (resultPrincipalURI = redirectToURI, resultPrincipalURI = URI or preserved, default = originalURI)

>+++ b/chrome/nsChromeProtocolHandler.cpp
>+    // since we want either |aURI| or anything pre-set by upper layers prevail.

"to prevail".

>+++ b/netwerk/protocol/file/nsFileChannel.cpp
>+  NS_ENSURE_STATE(mLoadInfo);

This is going to break people who create file:// channels via newChannel instead of newChannel2, right?  Have we checked whether extensions are doing that?  I'm not sure I see a good way around this, short of maybe only doing this check on the codepath that _uses_ loadinfo (the symlink path).  That would make the common case work, but make the failure, when it does happen, that much harder to diagnose.... :(

>+      NS_SUCCEEDED(NS_NewLocalFile(fileTarget, PR_TRUE,

Want to s/PR_TRUE/true/ while you're here?

>+      NS_SUCCEEDED(NS_NewNativeLocalFile(fileTarget, PR_TRUE,

And here.

>+++ b/netwerk/protocol/file/nsFileProtocolHandler.cpp
>+    // set the loadInfo on the new channel, must do this

s/,/;/

>+++ b/netwerk/protocol/res/SubstitutingProtocolHandler.cpp
>+  // We don't want to allow the inner protocol handler modify the result principal URI

"to modify"

>+  // since we want either |uri| or anything pre-set by upper layers prevail.

"to prevail"

r=me.  Thank you for doing this!
Attachment #8867685 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] (work week until 5/26) (if a patch has no decent message, automatic r-) from > >+++ b/netwerk/protocol/file/nsFileChannel.cpp
> >+  NS_ENSURE_STATE(mLoadInfo);
> 
> This is going to break people who create file:// channels via newChannel
> instead of newChannel2, right?  

Definitely.

> Have we checked whether extensions are doing
> that?  

No, that would be an overkill to do IMO.

> I'm not sure I see a good way around this, short of maybe only doing
> this check on the codepath that _uses_ loadinfo (the symlink path). That
> would make the common case work, but make the failure, when it does happen,
> that much harder to diagnose.... :(

Exactly my thinking.  I'll leave it to fail every time.  We are killing xpcom addons soon anyway.  Maybe some devs will be pushed by sudden consistent failures to migrate to webext.


> r=me.  Thank you for doing this!


Thank you for all the insight and reviews :)
better commit message that somehow become screwed..
Attachment #8870432 - Attachment is obsolete: true
Attachment #8870436 - Flags: review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7f28c1084c47
Expose 'result principal URI' on LoadInfo as a source for NS_GetFinalChannelURI (removes some use of LOAD_REPLACE flag). r=bz
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/7f28c1084c47
Status: ASSIGNED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Depends on: 1367586
Depends on: 1367656
backed out as:

https://hg.mozilla.org/integration/mozilla-inbound/rev/3f7f4882f9d4ea7f583dcbf5f1b11af0da4e4be7
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #8870446 - Attachment description: v11.1 (resultPrincipalURI = redirectToURI, resultPrincipalURI = URI or preserved, default = originalURI) → v11.1 [backout comment 162] (resultPrincipalURI = redirectToURI, resultPrincipalURI = URI or preserved, default = originalURI)
Status: REOPENED → ASSIGNED
Target Milestone: mozilla55 → ---
Blocks: 1367814
Blocks: 1368110
Attached patch v12 (obsolete) — Splinter Review
reviewed in bug 1367586, also see that bug for details on this patch.

carried from v11:
- |= LOAD_REPLACE turned to result principal URI = channel URI
- &= ~LOAD_REPLACE turned to result principal URI is preserved over inner channel creation
- NS_GetFinalChannelURI returns result principal URI or originalURI

new in v12:
- redirected channel's result principal URI is ensured to be either non-null (set by the handler) or set to the redirect target channel original URI (the creation URI) after the channel has been created and before redirect security checks are made
- result principal URI is stored to the session store
- result principal URI is unconditionally restored on replay load in docshell
- test added to not regress (bug 1367586)

merged to the current m-c
Attachment #8864549 - Attachment is obsolete: true
Attachment #8867453 - Attachment is obsolete: true
Attachment #8870446 - Attachment is obsolete: true
Attachment #8872687 - Flags: review+
(In reply to Honza Bambas (:mayhemer) from comment #171)
> Created attachment 8872708 [details] [diff] [review]
> v12
> 
> Which actually builds...
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=5ef470d03b473e3f1ee7df09d19691d1d63a5e4e


TEST-UNEXPECTED-FAIL | toolkit/components/mozprotocol/tests/browser_mozprotocol.js | Test timed out - [log…]
TEST-UNEXPECTED-FAIL | toolkit/components/mozprotocol/tests/browser_mozprotocol.js | Found a tab after previous test timed out: moz://a - [log…]
Fwiw, the v11->v12 interdiff looks right to me.  Please let me know if you want a hand looking into the browser_mozprotocol.js failure!
(In reply to Honza Bambas (:mayhemer) from comment #172)> TEST-UNEXPECTED-FAIL |
> toolkit/components/mozprotocol/tests/browser_mozprotocol.js | Test timed out
> - [log…]
> TEST-UNEXPECTED-FAIL |
> toolkit/components/mozprotocol/tests/browser_mozprotocol.js | Found a tab
> after previous test timed out: moz://a - [log…]

Reason: 
the doc shell changes in v12, that always overwrite the rpURI to whatever is passed to nsDocShell::DoURILoad, wipe out the one that has been set by the the moz-protocol handler.

It's nearly unbelievable that only such a totally unessential test can catch such a huge mistake.  

I would expect everything just gets broken with this.  But apparently only very few handlers in only very few situations set the rpURI to channel.URI (similarly as they were setting the LOAD_REPLACE flag) and go through this code in docshell (being loaded as top level docs).  Definitely about and file do that, none of the tests yell about it being wrong..

We should probably have few tests that check all the code paths on artificially built test protocols.  Navigation to a page (top level), redirecting to such a URI (the protocol must be implemented in C for e10s :/ ), going back and forward bypassing bfcache.



To fix this:
We want the replay always rewrite.  But we don't want to rewrite to null for a normal load.  hence we need another argument passed to InternalLoad whether to rewrite or not.  This means to add this parameter also to nsIDocShellLoadInfo.

Fixes the moz_protocol test failure, passes the nav-back test added in v12.

v13 is coming :)  Is it going to be the lucky one? ;)
I'm not going to proceed with landing this (these) patch(es) for 55 since I believe it's too late (less than a week to feature freeze).

According Shane, bug 1256122 won't land on 55 as well and it was the only seriously blocked bug.
> hence we need another argument passed to InternalLoad whether to rewrite or not.

No, we need to make the argument a Maybe<nsIURI*> with a comment about how it might be null when present.  Having an extra boolean arg to communicate the presence of absence of this one is too fragile.
(In reply to Boris Zbarsky [:bz] (if a patch has no decent message, automatic r-) from comment #177)
> > hence we need another argument passed to InternalLoad whether to rewrite or not.
> 
> No, we need to make the argument a Maybe<nsIURI*> with a comment about how
> it might be null when present.  Having an extra boolean arg to communicate
> the presence of absence of this one is too fragile.

I wanted that too!  But it's an IDL :(
But it's noscript.  So nothing stops you from defining it as the right thing using "native" types.  See http://searchfox.org/mozilla-central/rev/b318c7dca7392bd16c0b11929f55b1be133f0b31/image/imgIContainer.idl#64 for an example.
(In reply to Boris Zbarsky [:bz] (if a patch has no decent message, automatic r-) from comment #179)
> But it's noscript.  So nothing stops you from defining it as the right thing
> using "native" types.  See
> http://searchfox.org/mozilla-central/rev/
> b318c7dca7392bd16c0b11929f55b1be133f0b31/image/imgIContainer.idl#64 for an
> example.

Thanks!! :)
(In reply to Boris Zbarsky [:bz] (if a patch has no decent message, automatic r-) from comment #179)
> But it's noscript.  So nothing stops you from defining it as the right thing
> using "native" types.  See
> http://searchfox.org/mozilla-central/rev/
> b318c7dca7392bd16c0b11929f55b1be133f0b31/image/imgIContainer.idl#64 for an
> example.

Hmm.. but nsIDocShellLoadInfo isn't noscript.  And that needs to carry that info as well, if I'm not mistaken...
Yes, for nsIDocShellLoadInfo things are a bit more complicated.  We need an API that allows setting and getting the thing in scriptable form that preserves the "is it present?" thing.  We could also have inline APIs in a C++ block taking/returning Maybe that are built on top of the ugly scriptable API.
Attached patch v12->v14 (obsolete) — Splinter Review
same as v12->v13 but using Maybe where possible.  nsIDocShellLoadInfo is scriptable, so there is "IsSome" attribute and helpers for Maybe 'conversion' from/to C++ added.

note that the 'isSome' bit doesn't need to be carried to session store.  we assume isSome=true when restoring from sessions store every time.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fe62e6281a0e445d8b4cc5bec4e1931ef83fcf9b
Attachment #8872672 - Attachment is obsolete: true
Attachment #8873005 - Attachment is obsolete: true
Attachment #8878123 - Flags: review?(bzbarsky)
Comment on attachment 8878123 [details] [diff] [review]
v12->v14

>+++ b/docshell/base/nsDocShell.cpp
>@@ -5416,17 +5416,18 @@ nsDocShell::LoadErrorPage(nsIURI* aURI, 
>+  Maybe<nsCOMPtr<nsIURI>> noResultPrincipalURI;

Could just pass Nothing() directly, unless you think this is clearer...

>@@ -5506,19 +5507,23 @@ nsDocShell::Reload(uint32_t aReloadFlags
>+    emplacedResultPrincipalURI.emplace(mozilla::Move(resultPrincipalURI));

No need for "mozilla::".

>@@ -11160,19 +11168,21 @@ nsDocShell::DoURILoad(nsIURI* aURI,
>+  if (aResultPrincipalURI.isSome()) {

Any reason to not just test |if (aResultPrincipalURI)|?

>+    // Unconditionally override, we want the replay be equal to what has

Preexisting, but "to be equal".

>@@ -14236,19 +14248,20 @@ nsDocShell::OnLinkClickSync(nsIContent* 
>+  Maybe<nsCOMPtr<nsIURI>> noResultPrincipalURI;

Again, Nothing() at the callsite would be fine.

>+++ b/docshell/base/nsDocShellLoadInfo.cpp
>+nsresult
>+GetMaybeResultPrincipalURI(nsIDocShellLoadInfo* aLoadInfo, Maybe<nsCOMPtr<nsIURI>>& aRPURI)

No one ever looks at the return value.  Either this should be void or they should look at it.  Pick one.  Given the existing code flow, void makes sense to me.

>+  rv = aLoadInfo->GetResultPrincipalURIIsSome(&isSome);

Can you please file a followup to make nsIDocShellLoadInfo builtinclass and make a bunch of the attributes infallible, so we could have nicer code here?

>+  aRPURI.emplace(Move(uri));

This is wrong if aRPURI is already emplaced, right?  You probably want to do the reset() unconditionally, then early-return if not isSome instead.

>+nsresult
>+SetMaybeResultPrincipalURI(nsIDocShellLoadInfo* aLoadInfo, Maybe<nsCOMPtr<nsIURI>> const& aRPURI)

Again, void.  Callers don't check, and callee should really be infallible.

>+++ b/docshell/base/nsDocShellLoadInfo.h
>+  bool mResultPrincipalURIIsSome;
>   nsCOMPtr<nsIPrincipal> mTriggeringPrincipal;
>   bool mLoadReplace;

Please put the bools together so they will pack better.

r=me with the above fixed
Attachment #8878123 - Flags: review?(bzbarsky) → review+
Attached patch v14 (obsolete) — Splinter Review
Thanks Boris for enhancing my Maybe-fu :)
Attachment #8872708 - Attachment is obsolete: true
Attachment #8878123 - Attachment is obsolete: true
Attachment #8878432 - Flags: review+
Comment on attachment 8878432 [details] [diff] [review]
v14

Tim, can you please take a look at the changes to toolkit/modules/sessionstore/SessionHistory.jsm?  This patch adds a new property (.resultPrincipalURI) and we need to 'simulate' it when reading sessions stored by code before this patch.  There should be a lot of explanation in the code comments what this is trying to do.

I'm not sure how this is supposed to handle the case when session store data created by the new code is read by the old code (when people migrate from a newer version to an older one - not an uncommon thing).  

The new code changes the meaning of the loadReplace property a bit.  This flag is still stored to sessions store, though.

(Note that we have a followup bug that will likely remove the LOAD_REPLACE (and entry.loadReplace) flag altogether.  No immediate milestone planned, tho)

Writing this comment, I may tend to enhance this patch to store a new property (loadReplace2) with the new meaning and also - for compatibility - store the old loadReplace with more-or-less the original meaning to not break migration back.

Note that these properties are highly security sensitive!


What do you think?
Thanks.
Attachment #8878432 - Flags: review?(ttaubert)
Comment on attachment 8878432 [details] [diff] [review]
v14

Review of attachment 8878432 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM but I'll let Mike do the final call, he handles SessionStore these days :)
Attachment #8878432 - Flags: review?(ttaubert)
Attachment #8878432 - Flags: review?(mdeboer)
Attachment #8878432 - Flags: feedback+
Comment on attachment 8878432 [details] [diff] [review]
v14

Review of attachment 8878432 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me too! Thanks letting me check it ;-)

::: netwerk/ipc/NeckoChannelParams.ipdlh
@@ +40,5 @@
>    OptionalPrincipalInfo       requestingPrincipalInfo;
>    PrincipalInfo               triggeringPrincipalInfo;
>    OptionalPrincipalInfo       principalToInheritInfo;
>    OptionalPrincipalInfo       sandboxedLoadingPrincipalInfo;
> +  OptionalURIParams     resultPrincipalURI;

nit: I think you'll want to align this properly with the others.

::: toolkit/modules/sessionstore/SessionHistory.jsm
@@ +378,5 @@
>      if (entry.originalURI) {
>        shEntry.originalURI = Utils.makeURI(entry.originalURI);
>      }
> +    if (typeof entry.resultPrincipalURI === "undefined" && entry.loadReplace) {
> +      // This is a backward compatibility code for stored sessions saved prior to

nit: s/a //
Attachment #8878432 - Flags: review?(mdeboer) → review+
Attached patch v14 -> v15 (obsolete) — Splinter Review
As suggested in comment 186, we now store to loadReplace2 and also store a synthesized value for loadReplace to let the old code recognize it.  

I've tested this with about:credentials that redirects from about: url to https:// url.  W/o this patch when going back to an unpatched browser, the URL in the address bar is about:credential on restore (bad!).  With the v15 patch going back, back-forward and forward always ends up with the correct https:// url in the address bar.
Attachment #8879135 - Flags: review?(mdeboer)
Comment on attachment 8879135 [details] [diff] [review]
v14 -> v15

Review of attachment 8879135 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me, thanks!

::: toolkit/modules/sessionstore/SessionHistory.jsm
@@ +398,5 @@
>      } else if (entry.resultPrincipalURI) {
>        shEntry.resultPrincipalURI = Utils.makeURI(entry.resultPrincipalURI);
>      }
> +    if (entry.loadReplace2) {
> +      shEntry.loadReplace = entry.loadReplace2;

Ah, smart! This way we can contain `loadReplace2` to this file only.
Attachment #8879135 - Flags: review?(mdeboer) → review+
Attached patch v15 (obsolete) — Splinter Review
Thank everyone for reviews!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=526d6e7d75c3808a00da774d59e06a86898d5b70
Attachment #8878432 - Attachment is obsolete: true
Attachment #8879135 - Attachment is obsolete: true
Attachment #8879168 - Flags: review+
Attached patch v15Splinter Review
Fixed build issue on linux...

https://treeherder.mozilla.org/#/jobs?repo=try&revision=868c5a30a9cbcfdcb79ea96010f1ba6b12c00f52
Attachment #8879168 - Attachment is obsolete: true
Attachment #8879227 - Flags: review+
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d909485099a
Expose 'result principal URI' on LoadInfo as a source for NS_GetFinalChannelURI (removes some use of LOAD_REPLACE flag). r=bz, r=mikedeboer
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/6d909485099a
Status: ASSIGNED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
No longer depends on: 1362827
Depends on: 1362827
Depends on: 1378760
Depends on: 1380012
Depends on: 1403998
Depends on: 1405199
Depends on: 1442784
No longer depends on: 1442784
Depends on: 1531289
Depends on: 1532752
No longer depends on: 1531289
No longer depends on: 1532752
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: