Closed Bug 1391277 Opened 7 years ago Closed 6 years ago

Firefox randomly redirects to non-existent https version of page

Categories

(Core :: Networking: HTTP, defect, P2)

55 Branch
defect

Tracking

()

RESOLVED DUPLICATE of bug 1419222

People

(Reporter: fonttax, Assigned: mayhemer)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(3 files, 2 obsolete files)

Attached file harfiles.zip
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0
Build ID: 20170815141045

Steps to reproduce:

I have been browsing on this page with desktop nightly (57.0a1 (2017-08-17) 64bit) and desktop release (55.0.2 64bit) firefox: http://wiki.teamliquid.net/dota2/MinD_ContRoL, trying to edit the page/see the source code of the page (it is a MediaWiki wiki). I've seen this both on nightly and on the current release version of firefox, and I've also gotten similar reports from other users of the page. My guess is that somehow one of the ads does an upograde of insecure requests and that ripples through to the parent frame, but i can't 100% confirm that. I recorded HAR files of page loads where it happened both for nightly and release firefox (see attachment). I sincerely hope someone can help me with this, because i'm out of ideas.


Actual results:

When clicking edit, sometimes i go to a https version of the page, which does not exist as the server does not answer on https.


Expected results:

I should not have gone to a https version but stayed on http.
Component: Untriaged → Networking: HTTP
Product: Firefox → Core
The har files don't show any issues (other than the failed load of the https page), and I can't reproduce. Reporter, can you try to reproduce this with http logging enabled? https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(fonttax)
(I'll note, this also sounds suspiciously similar to bug 1384201)
Attached file log.txt-main.zip (obsolete) —
Here is a http log from about networking, it is fairly short but has both a port:80 call and a port:443 call, i had the tab in question open, then started the logging, then it failed loading the page and i stopped the logging again. (Sorry it's in a zip again, but i run into upload size restrictions else)
Flags: needinfo?(fonttax)
can you please change MOZ_LOG to:
MOZ_LOG=timestamp,nsHttp:5,nsSocketTransport:5,nsSSService:5

we need SiteSecurityService logging.

Thank you.
Flags: needinfo?(fonttax)
Attached file log.txt-main.zip
Same process as last time, but with the settings you specified
Attachment #8898707 - Attachment is obsolete: true
Flags: needinfo?(fonttax)
(In reply to FO-nTTaX from comment #5)
> Created attachment 8898825 [details]
> log.txt-main.zip
> 
> Same process as last time, but with the settings you specified

The log is not long enough, the beginning is missing. I think that at some point FF gets header:
Content-Security-Policy: upgrade-insecure-requests
and decides to upgrade request.

Can you look at the console there should be an warning message:
"Content Security Policy: upgrade-insecure-requests ..."
Is this one long enough? http://stuff.fo-nttax.de/log.txt-main.6684

I reloaded the page, got the ad loading that i identified to trigger it, then clicked a link and it redirects me to https again. (Can't upload a file of that size here i think)
WRT "Content Security Policy: upgrade-insecure-requests" i have tons of them, but they come from the ad, it is literally spamming the console like crazy (and for ad reasons the console should be filterable by domain, but that's something for another day) http://stuff.fo-nttax.de/console.txt
(In reply to FO-nTTaX from comment #7)
> Is this one long enough? http://stuff.fo-nttax.de/log.txt-main.6684

The channel's loadInfo was from some reason called SetUpgradeInsecureRequests() what makes us upgrade at [1].  Now we have to find the origin where this happens.

Could you add CSPContext:5 module to the list and reproduce once more?  Thanks.


[1] https://dxr.mozilla.org/mozilla-central/rev/a9d372645a32b8d23d44244f351639af9d73b96a/netwerk/base/nsNetUtil.cpp#
Unfortunately our ad guy killed the ad from the rotation, so i can't easily reproduce it anymore. I'll try but can't guarantee anything :(
I was looking for a header but there is non. It can also be that we get a meta tag '<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">'

If ff receives such a tag or a header it will upgrade to https.
I know for a fact that the main website doesn't have such a meta tag, as for the ad in the iframe I can't say
(In reply to Dragana Damjanovic [:dragana] from comment #11)
> I was looking for a header but there is non. It can also be that we get a
> meta tag '<meta http-equiv="Content-Security-Policy"
> content="upgrade-insecure-requests">'
> 
> If ff receives such a tag or a header it will upgrade to https.

From the nsSSService module in the log there is no info we would ever store HSTS headers for the domain, there is no HSTS cached entry.  

The request to update likely comes somewhere from the top level code (docshell/csp).   CC'ing Kate in case she could think of something.

FO-nTTaX, can you please try to reproduce as suggested in comment 9?  Thanks.
Flags: needinfo?(fonttax)
I got it again with a different ad: http://stuff.fo-nttax.de/log.txt-main.19744 (Caution this file is 300MB)
Flags: needinfo?(fonttax)
needinfo me not to forget.
Flags: needinfo?(dd.mozilla)
Flags: needinfo?(dd.mozilla)
Priority: -- → P3
Whiteboard: [necko-triaged]
Flags: needinfo?(dd.mozilla)
(In reply to FO-nTTaX from comment #14)
> I got it again with a different ad:
> http://stuff.fo-nttax.de/log.txt-main.19744 (Caution this file is 300MB)

What URL to look for?
Flags: needinfo?(fonttax)
http://wiki.teamliquid.net/counterstrike/Main_Page
Flags: needinfo?(fonttax)
CSPContext, nsSSService and nsHttp logging turned on and I do not see any log that help us figure out what is happening.

I do not know CSPContext, but I think it should show when a policy is added.

Christoph, do you have any other idea? Can you douple check CSPContext logging in the log from comment 14.
Flags: needinfo?(dd.mozilla) → needinfo?(ckerschb)
(In reply to Dragana Damjanovic [:dragana] from comment #18)
> Christoph, do you have any other idea? Can you douple check CSPContext
> logging in the log from comment 14.

I could not reproduce the problem, but I am happy to take another look if we have a site where the problem is reproduce-able. Two things regarding upgrade-insecure-requests (probably those are known facts, but I list them anyway):
a) Upgrade-insecure-requests applies to all nested iframes (not the other way round), but
b) Upgrade-insecure-requests also applies to same-origin navigations.

The reporter mentioned that when clicking 'edit' the problem occurs. So it's possible that there is problem within IsConsideredSameOriginForUIR() [1] or it's callsite [2] where we just (for whatever reason) take the wrong principal (the principal of the iframe that ships with UIR) and then try to upgrade the top-level page. Obviously that's only a guess, but probably not too far off.

Using CSPContext for logging should work and should also print the applied CSP.

[1] https://dxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#10950
[2] https://dxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#11272
Flags: needinfo?(ckerschb)
(In reply to Christoph Kerschbaumer [:ckerschb] from comment #19)
> (In reply to Dragana Damjanovic [:dragana] from comment #18)
> > Christoph, do you have any other idea? Can you douple check CSPContext
> > logging in the log from comment 14.
> 
> I could not reproduce the problem, but I am happy to take another look if we
> have a site where the problem is reproduce-able. Two things regarding
> upgrade-insecure-requests (probably those are known facts, but I list them
> anyway):
> a) Upgrade-insecure-requests applies to all nested iframes (not the other
> way round), but
> b) Upgrade-insecure-requests also applies to same-origin navigations.
> 
> The reporter mentioned that when clicking 'edit' the problem occurs. So it's
> possible that there is problem within IsConsideredSameOriginForUIR() [1] or
> it's callsite [2] where we just (for whatever reason) take the wrong
> principal (the principal of the iframe that ships with UIR) and then try to
> upgrade the top-level page. Obviously that's only a guess, but probably not
> too far off.
> 
> Using CSPContext for logging should work and should also print the applied
> CSP.
> 
> [1]
> https://dxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.
> cpp#10950
> [2]
> https://dxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.
> cpp#11272

I was expecting to see some log in CSPContext. That could help me figure out which request is setting this, but I do not see any.

grep CSPContect on the log file gives:

2017-08-28 16:42:32.444000 UTC - [Main Thread]: D/CSPContext nsCSPContext::nsCSPContext
2017-08-28 16:42:32.444000 UTC - [Main Thread]: D/CSPContext No Document in SetRequestContext; can not query loadgroup; sending reports may fail.
2017-08-28 16:42:44.682000 UTC - [Main Thread]: D/CSPContext nsCSPContext::nsCSPContext
2017-08-28 16:42:44.683000 UTC - [Main Thread]: D/CSPContext No Document in SetRequestContext; can not query loadgroup; sending reports may fail.
2017-08-28 16:42:44.685000 UTC - [Main Thread]: D/CSPContext nsCSPContext::~nsCSPContext
2017-08-28 16:42:51.052000 UTC - [Main Thread]: D/CSPContext nsCSPContext::~nsCSPContext
There are two requests made to http://wiki.teamliquid.net/counterstrike/Main_Page.

The first one is cached, but we reload since Vary doesn't match.  There is no redirect.  nsHttpChannel @1e71482a000 created 2017-08-28 16:42:28.549

The second one is redirected.  What's kinda weird for that channel is that we are not opening the cache entry, but it's probably just because the code path to check for hsts redir happens sooner.  nsHttpChannel @1e71694e000 created 2017-08-28 16:42:45.039.


But what is most interesting is this:

2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Seeking HSTS entry for wiki.teamliquid.net
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Parsing state from 
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService  is not a valid SiteHSTSState
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Parsing state from 
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService  is not a valid SiteHSTSState
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService no HSTS data for wiki.teamliquid.net found, walking up domain
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Seeking HSTS entry for teamliquid.net
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Parsing state from 
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService  is not a valid SiteHSTSState
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Parsing state from 
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService  is not a valid SiteHSTSState
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService no HSTS data for teamliquid.net found, walking up domain
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Seeking HSTS entry for net
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Parsing state from 
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService  is not a valid SiteHSTSState
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService Parsing state from 
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService  is not a valid SiteHSTSState
2017-08-28 16:42:47.874000 UTC - [Main Thread]: D/nsSSService no HSTS data for net found, walking up domain

there are 48 lookups for hsts data on wiki.teamliquid.net (and the upper domain).  All are negative, there is no "storing HSTS site entry for .*teamliquid.net" found in the log.


were there any child logs produced as well?
This is the only log my browser produced following the steps I had been asked to do
(In reply to FO-nTTaX from comment #22)
> This is the only log my browser produced following the steps I had been
> asked to do

Yeah..  Thanks for it.  

There is a known sandboxing issue regarding logging and use of about:networking#logging.  

If you could use environment variables according [1], it would generate a child log too.  It could give us some clue.  

The log will be large (you may need disk space of few gigabytes), zipping or xz'ing it will make it way smaller and sharable.  You can also add a size limit (not preferred until the file is 500M since we could lose the important info) with an added module "rotate:500".

Thanks again.

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging#Logging_HTTP_activity_by_manually_setting_environment_variables
I wish i had an easily reproducible case, but i can't reproduce it right now. It was an ad, and the ad is gone now :(
(In reply to FO-nTTaX from comment #24)
> I wish i had an easily reproducible case, but i can't reproduce it right
> now. It was an ad, and the ad is gone now :(

Yes, these problem are very difficult to reproduce, I understand.  I'm just curious how an add (presuming from a third party origin?) would affect http://wiki.teamliquid.net/ origin.
Yeah we don't host them ourselves, we get our ads from third party ad partners
Not sure if this could be related to bug 1417901.

So, when there is a request in the context of the top level page that upgrades, it might be that we share either loadinfo or some other object or path (have to check) between this request and the load context we set the "do upgrade" flag on.  On next top-level load we do upgrade based on it.

According comment 14 this seems likely, since we encounter upgrades ONLY between the first (non upgraded) and the second (upgraded) request to the main page.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: P3 → P2
See Also: → 1417901
Ok so i managed to reproduce it again, this time with environment variables for logging. Main log and child logs are here: http://stuff.fo-nttax.de/logs.tar.gz, the URL to look for is http://liquipedia.net/pubg/index.php?title=Main_Page&action=edit
Oh, and i logged like this: export MOZ_LOG=timestamp,nsHttp:5,nsSocketTransport:5,nsSSService:5,CSPContext:5
The upgrade must come from [1].  We don't find STS records for http://stuff.fo-nttax.de at all, definitely not for the channel in question.  

Hence, either we use a wrong triggering principal in nsDocShell::DoURILoad or we set the UPGRADE_IF_INSECURE_DIRECTIVE directive on its CSP object somewhere.  I cannot see it (not saying it's not there somewhere) in the log, tho.

[1] https://hg.mozilla.org/mozilla-central/annotate/351c75ab74c9a83db5c0662ba271b49479adb1f1/docshell/base/nsDocShell.cpp#l11429
stuff.fo-nttax.de is just my file hosting, the domain this happened on is http://liquipedia.net/pubg/index.php?title=Main_Page&action=edit as i wrote in #28
The simplest hypothesis I have given the above information is that some ad adds a upgrade-insecure-requests <meta> CSP to a document that is sharing principals with the main document (e.g. an about:blank iframe in the main document).  That is, I am guessing this is basically like bug 1419222.

Could try creating a build with logging for that case and seeing whether that's actually what happens...
(In reply to FO-nTTaX from comment #31)
> stuff.fo-nttax.de is just my file hosting, the domain this happened on is
> http://liquipedia.net/pubg/index.php?title=Main_Page&action=edit as i wrote
> in #28

Sorry, just typo.  I meant liquipedia.net.

(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #32)
> The simplest hypothesis I have given the above information is that some ad
> adds a upgrade-insecure-requests <meta> CSP to a document that is sharing
> principals with the main document (e.g. an about:blank iframe in the main
> document).  That is, I am guessing this is basically like bug 1419222.
> 
> Could try creating a build with logging for that case and seeing whether
> that's actually what happens...

Thanks for this hint.  I think we could land a log for that into m-c, if there is already a module or even log into web console, perhaps.
Attached patch investigative logs (v1) (obsolete) — Splinter Review
based on comment 32.  hopefully we will catch it.  tested using a modified case at [1]

not sure how more we could identify where from the csp comes and where its actually being set.  logging the two obvious properties of the context, please feel free to suggest anything smarter.  thanks.

[1] https://bugzilla.mozilla.org/attachment.cgi?id=8930300&action=edit
Assignee: nobody → honzab.moz
Attachment #8941911 - Flags: review?(bzbarsky)
Comment on attachment 8941911 [details] [diff] [review]
investigative logs (v1)

r=me.

The other thing you could do is add logging in HTMLMetaElement::BindToTree around the AppendPolicy call logging the document URI of aDocument.  Correlating that with the logging from inside AppendPolicy might be fruitful.
Attachment #8941911 - Flags: review?(bzbarsky) → review+
HTMLMetaElement::BindToTree added some logging as well, thanks Boris for a quick r!
Attachment #8941911 - Attachment is obsolete: true
Attachment #8941919 - Flags: review+
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8e08a2cf1b2d
Investigative logging in CSP: log when 'upgrade-insecure-requests' CSP is added to the CSP context, r=bz
Keywords: checkin-needed
FO-nTTaX, I have added a new log to Nigthly that might answer what is the cause of this bug.  If you find time again to help to investigate, please run logging once more with the following modules:

MOZ_LOG=timestamp,nsHttp:5,nsMetaElement:5,CSPContext:5

Thanks!
Flags: needinfo?(fonttax)
FO-nTTaX, any news?  I'll otherwise close this bug as incomplete for lack of data.  Thanks.
See Also: → 1419222
I've only been able to reproduce it with ads, and I haven't encountered one since, so it's hard for me. If I had a reliable way to reproduce it, I'd do it in a second :(
(In reply to FO-nTTaX from comment #42)
> I've only been able to reproduce it with ads, and I haven't encountered one
> since, so it's hard for me. If I had a reliable way to reproduce it, I'd do
> it in a second :(

I know this is not easy.  Our best theory now is that this is a duplicate of bug 1419222.  I'll close this for now.  If you manage to reproduce and capture a log, I would be interested to confirm it, but for now I think we are OK with duplicating.

Thank you!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Sounds likely. If I manage to get a log, I'll make sure to add it.
Flags: needinfo?(fonttax)
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: