Closed Bug 1273882 Opened 3 years ago Closed 3 years ago

enabled prefetch network predictor causes loosing session/login status in some cases

Categories

(Core :: Networking, defect, critical)

48 Branch
x86_64
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla50
Tracking Status
firefox47 --- unaffected
firefox48 --- disabled
firefox49 + disabled
firefox50 + verified

People

(Reporter: Virtual, Assigned: u408661)

References

Details

(Keywords: dataloss, nightly-community, regression, Whiteboard: [necko-active])

Attachments

(7 files)

Attached file credentials.txt
From about 3 weeks Firefox behaves oddly, especially on sites which requires login.
Being logged in on forums and marking all posts and threads as already read logouts you immediately from the website page. Also it's visible when you have remembered a few accounts on GMail login website page, just login into one and next logout to see that all accounts are gone. I finally found the culprit today.


STR:
1. Go to http://board.pl.ogame.gameforge.com/index.php?form=UserLogin
2. Login with attached credentials
3. Open "Forum" or use this URL: http://board.pl.ogame.gameforge.com/index.php?page=Index
4. Go to the bottom of the website page and press "Mark all Forums as read"
5. and see that you will be logged off


Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=a603640aa82d54b224b2325d8a445fd14dd00bc6&tochange=b3b5be1d2027f915c3c54b966fb41de8444923ee


Workaround:
-Set "network.predictor.enable-prefetch" to "false" in about:config
-Use "Private Mode"
-Restart browser


[Tracking Requested - why for this release]: Regression
Flags: needinfo?(hurley)
Summary: enabled prefetch network predictor causes loosing session in some cases → enabled prefetch network predictor causes loosing session/login status in some cases
I want to add that, there is no any error or warning related to this in the Browser Console with all options enabled.
Hrm... just at first glance (reading the STR) without actually trying, I suspect this has something to do with query strings controlling the page that's shown, but I could be totally off-base. Ugh... well, at least it was found on nightly, and not, like, beta :)
Flags: needinfo?(hurley)
So after trying the STR... I can't reproduce with a fresh profile. :Virtual - is this a site you visit and browse on a regular basis? (In other words, might I need to browse around the site some in order to make this problem occur?)
Flags: needinfo?(bernesb)
Yes. It's not reproducible with clean new fresh profile without any addons (extensions and plugins) at first usage, but after using Firefox for a few days or maybe in faster way for just a few hours, the issue will be reproducible with about 25% (GMail login) - 100% (OGame forum) rate.
Flags: needinfo?(bernesb)
Bugzilla is also affected, but reproducible rate is about 25%, like GMail login.
Tracking this regression for 49. Is the tracking flag for 48 supposed to be set to disabled?
Flags: needinfo?(bernesb)
@ Marcia Knous - Yes.
Prefetch network predictor feature is disabled in Firefox 48 per bug #1268208, so that's why I didn't requested it, as probably it isn't needed, but feel free to "+" it too.

@ Nicholas Hurley - Is there any build, maybe some specific debug one, which will log useful data for diagnosing what's going on? Because like I said in Comment 1
> there is no any error or warning related to this in the
> Browser Console with all options enabled.
If not, I could probably set the Firefox Portable and make this bug reproducible on it and attach it here in last resort.
Flags: needinfo?(bernesb) → needinfo?(hurley)
:Virtual - just turning on HTTP and Predictor logging should do the trick (logging is available in any nightly build).

https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging has the directions for turning on logging, but instead of following the settings on that page for NSPR_LOG_MODULES, use the following:

NSPR_LOG_MODULES=timestamp,nsHttp:5,NetworkPredictor:5

I was also planning on doing that today, but if you can reproduce it quicker, that will get me off to a faster start figuring out what's going on (there may be more logging that I need to add)
Assignee: nobody → hurley
Flags: needinfo?(hurley)
Whiteboard: [necko-active]
Attached file log.7z
Done.
Just STR with my profile with disabled all addons.
Status: NEW → ASSIGNED
Has Regression Range: --- → yes
Has STR: --- → yes
Thanks. I'll take a look at the log and see if I can find anything useful
:Virtual - what (if any) addons do you have installed? The log you provided isn't giving me enough clues, and I can't reproduce this behavior at all locally. The one strange thing I'm noticing is that all the prefetches created by the predictor are for URIs that have no path component (in other words, they're all for "http://host/") - which is exceedingly odd. I'm wondering if any addons you have installed could be helping trigger this behavior.
(In reply to Nicholas Hurley [:nwgh][:hurley] from comment #11)
> :Virtual - what (if any) addons do you have installed?
I attached this information in Comment #12
I also attached all dump from about:support in Comment #13

(In reply to Nicholas Hurley [:nwgh][:hurley] from comment #11)
> The log you provided  isn't giving me enough clues,
> and I can't reproduce this behavior at all locally.
I hope you're testing this on Windows 7 (64bit) or at least on Windows, as I don't know it's reproducible on other operating systems like OS X, Linux and BSD.
How the issue looks like it's in Comment 15 and how it looks like without issue is in Comment 14.

> The one strange thing I'm noticing is that all the prefetches
> created by the predictor are for URIs that have no path component (in other
> words, they're all for "http://host/") - which is exceedingly odd.
> I'm wondering if any addons you have installed could be helping
> trigger this behavior.
Like I written in my
(Comment #9)
> Just STR with my profile with disabled all addons.
I had all addons (extensions, plugins, themes) disabled. I could also reproduce the same effect by going in Safe Mode to negate my changes in about:config.
Thanks for the info, I'll see if that gets me anywhere. The one thing to note is that I've only (so far) been testing on OS X and Linux (mostly Linux, hoping I'll be able to debug with rr). If this is windows-specific, I'll obviously need a machine like that (which is supposed to be on order for me, but has so far taken almost 2 weeks longer than promised).
If you will have more ideas and more settings to try in the logger, just say, I will be glad to help in fast diagnosing and next fixing this.
seeing this also on win7 with bugzilla and mozillazine
Quick update - I have been able to repro this (finally!) now that I've gotten a windows machine. From my tests, it looks like the login is not actually lost, but the page that's displayed looks like the login was lost - clicking through to another page results in continuing to be logged in.

There's still some oddities going on around this that don't make any sense (for example, the page looking logged-out shouldn't be anything that's prefetched, as it's the base URI, not a resource, so why it's being prefetched at all is mysterious). Perhaps figuring that out will solve the issue.
(In reply to Nicholas Hurley [:nwgh][:hurley] from comment #21)
> Created attachment 8761407 [details]
> Bug 1273882 - Don't prefetch on origin predictions.
> 
> Review commit: https://reviewboard.mozilla.org/r/58624/diff/#index_header
> See other reviews: https://reviewboard.mozilla.org/r/58624/

Please describe the patch.
Flags: needinfo?(hurley)
(In reply to Honza Bambas (:mayhemer) from comment #23)
> (In reply to Nicholas Hurley [:nwgh][:hurley] from comment #21)
> > Created attachment 8761407 [details]
> > Bug 1273882 - Don't prefetch on origin predictions.
> > 
> > Review commit: https://reviewboard.mozilla.org/r/58624/diff/#index_header
> > See other reviews: https://reviewboard.mozilla.org/r/58624/
> 
> Please describe the patch.

It's right there in the commit message :) We were allowing prefetch when doing predictions based just on origin (not full URI), which causes things to go funky. This makes it so prefetch will only occur with full URIs (which is the right thing to do even if this bug didn't exist, it makes no sense to prefetch an origin).
Flags: needinfo?(hurley)
Tracking 50+ as preserving the login is important.
Having network.predictor.enable-prefetch=true breaks login on Typo3 with RSA authentication (https://docs.typo3.org/typo3cms/extensions/rsaauth/)! Firefox makes a second request resulting in a new key-pair, a public key for the login form and a private key stored in the session. The rendered page with the login formu uses the public key from the first request. But this key does not match the private key from the second request made by the network predictor. The problem was described three years ago at https://forge.typo3.org/issues/38660#note-7. The bug report was for Typo3 4.7 but the situation is now the same with Typo3 6.1 or newer.

A header or meta-tag would be required to give website owner/admins the possibility to disable prefetching.
Comment on attachment 8761407 [details]
Bug 1273882 - Don't prefetch on origin predictions.

https://reviewboard.mozilla.org/r/58624/#review56416

can you please explain how exactly is this patch fixing this bug?

::: netwerk/base/Predictor.cpp:1659
(Diff revision 1)
>        // predictor tests should NEVER call Learn with LEARN_LOAD_TOPLEVEL.
>        // The predictor xpcshell test needs this branch, however, because we
>        // have no real page loads in xpcshell, and this is how we fake it up
>        // so that all the work that normally happens behind the scenes in a
>        // page load can be done for testing purposes.
> -      if (fullUri) {
> +      if (fullUri && mDoingTests) {

can you explain this change please?  sounds odd you suddenly do something only during testing, probably needs a comment
Attachment #8761407 - Flags: review?(honzab.moz) → review+
Just another bit of explanation here for how the actual issue comes around, for posterity:

The user goes to the login page, and logs in. When loading the page for the login, we end up predictively prefetching (in the case of this bug) the root path on the origin (which by itself redirects to the landing page in a form that is not logged in, as the prefetch happens before we get our login cookie). Then, when you click "mark all read" on the forums, what ends up happening is you get redirected to the landing page - which has been cached and forced valid in a not-logged-in state. The HTTP channel for that happily uses the forced-valid, not-logged-in page, which makes it appear that you have been logged out. In reality you haven't (your login cookie still exists, and if you go to another page, you will see that you are, in fact, still logged in), but the perception that you have been logged out is at least as bad as actually being logged out :)
Pushed by hurley@todesschaf.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1d4260332e43
Don't prefetch on origin predictions. r=mayhemer
https://hg.mozilla.org/mozilla-central/rev/1d4260332e43
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
this patch gave us a 2-4% win in tp5o across the board:
https://treeherder.mozilla.org/perf.html#/alerts?id=1555

Thanks for the perf fix :)
(In reply to Joel Maher (:jmaher) from comment #31)
> this patch gave us a 2-4% win in tp5o across the board:
> https://treeherder.mozilla.org/perf.html#/alerts?id=1555
> 
> Thanks for the perf fix :)

Joel - this seems to match up perfectly with the regression listed in bug 1266735 - do you think we can close that one?
Flags: needinfo?(jmaher)
About 22 network probes just changed [1]. Any chance that this might be the cause?

[1]: https://groups.google.com/forum/#!forum/mozilla.dev.telemetry-alerts (all the ones from June 22)
Flags: needinfo?(hurley)
Verified as fixed in latest Nightly.
Thank you very much! \o/
Status: RESOLVED → VERIFIED
good point, I closed the talos regression bug!
Flags: needinfo?(jmaher)
(In reply to Chris H-C :chutten from comment #34)
> About 22 network probes just changed [1]. Any chance that this might be the
> cause?
> 
> [1]: https://groups.google.com/forum/#!forum/mozilla.dev.telemetry-alerts
> (all the ones from June 22)

It's conceivable, but most likely only if those probes also changed when bug 1016628 first landed. Either way, none of those telemetry changes look particularly bad (some of them are, in fact, good - we're taking less time to do things more often!)
Flags: needinfo?(hurley)
Well, quite a few of them did [1], so I guess that's alright then.

I am heartened to see improvements, but I do like to have them accounted for at least a bit so we know where to look if they change back :)

[1]: https://bugzilla.mozilla.org/show_bug.cgi?id=1016628#c47
You need to log in before you can comment on or make changes to this bug.