Closed Bug 236772 Opened 20 years ago Closed 20 years ago

Sprint Local Bill Redirection Limit Exceeded

Categories

(Core :: Networking: Cookies, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.7final

People

(Reporter: glc_bugs, Assigned: darin.moz)

References

()

Details

(Keywords: fixed1.7, regression, Whiteboard: fixed-aviary1.0)

Attachments

(8 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7b) Gecko/20040226 Firefox/0.8.0+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7b) Gecko/20040226 Firefox/0.8.0+

Login stopped working sometime between 20031020 and 20031029

Reproducible: Always
Steps to Reproduce:
1. Visit localbill.sprint.com
2. Try to login with a valid username/password


Actual Results:  
"Redirect Loop Error: Redirection limit for this URL exceeded. Unable to load
the requested page. This may be caused by cookies that are blocked."

Expected Results:  
Login normally.

This regressed quite a while ago. I never filed it since it was only a minor
inconvenience once a month when I try to pay my bill. I did some drilling back
and it worked in a build from 20031020 and is broken in 20031029. All tested
with new profiles.

I realize this is hard to test since Sprint doesn't have local service in very
many locations. If anyone has access to (non-SSE) nightlies in that range, I
might be able to narrow it down to a fewer range of days.

The site claims to work with Netscape 7+.
Whoops, this should be Browser. Adding regression keyword.
Component: General → Browser-General
Keywords: regression
Product: Firefox → Browser
Version: unspecified → Trunk
*** Bug 240499 has been marked as a duplicate of this bug. ***
Attached file cookie log
Since the error message mentions cookies, I've attached a cookie log. When
creating the log I didn't get the error message; it just hung there. I closed
Mozilla after nothing had changed for several minutes.
Setting blocking 1.7? since this works on the 1.4 branch. It would be a shame to
see the new stable branch (and therefore Netscape 7.2) be broken on this site
that advertises compatibility with Netscape 7.
Flags: blocking1.7?
Is this actually broken with Mozilla builds on the 1.7 branch? Has someone
tested that? (clean profile, clean install, etc.)
and what is this bug doing in Browser-General? -->Networking (possibly cookies)
Assignee: firefox → darin
Component: Browser-General → Networking
QA Contact: benc
Yes, still broken with same message.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040507
Fresh from the oven, new profile, etc.
*** Bug 243525 has been marked as a duplicate of this bug. ***
Status: UNCONFIRMED → NEW
Ever confirmed: true
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.7final
Greg,

Thank you for supplying us with a cookie log, but could you please capture an
HTTP log in addition?  All you'd need to do is modify NSPR_LOG_MODULES to look
like this:

  NSPR_LOG_MODULES=nsHttp:5,cookie:5

Then the log file will include output for both HTTP and the cookies module. 
Thanks!!

See http://www.mozilla.org/projects/netlib/http/http-debugging.html for more
info on HTTP logging.
Attached file HTTP and cookie log
HTTP and cookie log. Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7)
Gecko/20040521 Firefox/0.8.0+ Used a clean profile.

Had to zip it since it exceeds bugzilla's 300k limit.
Cookies did have some changes landed in the 2003-10-20 to 10-29 time range. bug
143939 and bug 221185 changed things quite a bit (10/21), bug 223365 (10/25).
Not much network change, although there was one crash fix in
nsHostResolver::GetHostToLookup (bug 221491)
While I don't know what the problem is, i noticed a few things from the log:
- The cookies aren't rejected or something. All the cookies that are set are
also returned.
- There are two session ids: SMSESSION and JSESSIONID. jsession is the same
every request, smsession is renewed. That makes me guess that is the problematic
cookie
- The hashtable checkin may have changed the order in which the cookies are
returned, but that shouldn't be a problem to any decent server side script.

Greg, can you create the same log with a working build, so we can see if there
are any differences? Thanks.
(In reply to comment #12)
> - The hashtable checkin may have changed the order in which the cookies are
> returned, but that shouldn't be a problem to any decent server side script.

the hash checkin shouldn't have affected that. (if it did, it's a bug.)
I do think the difference is in the order in which the cookies are send.

from working log:
0[3f44a8]:   Cookie: SMTRYNO=1;
FORMCRED=1cC0ENU06bzNFvWP2ARu2e/T7ttp9hXpe4amaX3dDsl0JbOCUsd7hXMbER9ThC2YGy3ur1b6+46gDpx4O0QQZBQKVX7eqof2QC1a7TrNhsyfrBXp/pl4xpnJJyaCaPz2

from not working:
0[274490]:   Cookie:
FORMCRED=M4+L+Fjoi98gffdjnVCHkDIJe/oO/t72CfqE5TJV5xtL/rHXziRgfNZ+VJEdp6O3gsbvZOkGO/U3iNH+qEoIHoL9rRClov/9EIfa0rm+vENZlnjA6FwOXDIYnob78gsB;
SMTRYNO=1

Same happens for the SMSESSION and JSESSIONID cookies. I think. I can check,
because the cookies are long, and the log is truncated.

afaik, the spec only defines the order if the paths are different.
JSESSIONID has path=/, SMSESSION has no path. We both interpret that as /. Maybe
we shouldn't do that for sorting purposes.
ohh, nice. that could be the problem. :)
> afaik, the spec only defines the order if the paths are different.

The world of cookies operates on de facto reverse engineered "specs" as much as
it does on any RFC. If Navigator and IE have "always" returned cookies in a
certain order then we may need to consider that the "spec" whether it's written
down somewhere or not.
so, first things first. ;)

nspr logging is truncating long lines to 512 chars... making these logs
difficult to interpret. i've filed bug 244478 about that.

secondly, the paths aren't the problem here. both the SMSESSION and JSESSIONID
cookies have their paths explicitly specified as '/', so the order in which
these cookies get sent back is technically undefined. if the order really is the
problem, then maybe we need to add another sorting criteria, but hashes
certainly aren't the culprit. (paths have always been the sole sorting criteria
for cookies; hashes just changed the order in memory, which happens to change
the order in this example.)

i'm not convinced the order is the problem...
The server would be pretty broken if the order really is the problem, but that
is the only difference i can find.
I think the old order the cookies were send out was fifo, so new cookies got
added to the end of the list.
After the hastable patch, new cookies are at the head of the list.
http://lxr.mozilla.org/seamonkey/source/netwerk/cookie/src/nsCookieService.cpp#2039

I think it is worth a try to add them to the end of the list. See if that helps.
(I can't try for myself, as i dont have anything to do with sprint, let alone
have an account)
right; the old approach was fifo, the new approach is lifo. we could try hacking
up a fifo storage order for hashtables, to test if order really is the
problem... but since hashes are fundamentally different from lists, it wouldn't
work in all cases, so we'd have to do something more extensive for checkin.

for instance, within one hash node (which is one domain, e.g. .sprint.com) we
could make the order fifo. but what if we have two nodes with cookies to send
back, e.g. .sprint.com and .foo.sprint.com? then, we're combining the cookies
from two nodes, and sorting by path length - so intranode, the list is fifo, but
internode it will not be.

we would have to store a timestamp for when the cookie was set, and do a
secondary sort on that, to guarantee fifo always.
Bob, can you help us figure out if this is likely to be more widespread than
just sprint? 

MVL, and Dwitte, time is pretty short on 1.7 and I think we probably need to fix
this for the 1.7 release. How quickly do you think you'll be able to test your
theory and if it's pans out, then how soon before a fix?
It is difficult to determine the scope of the effects the change from fifo to
lifo order in the document.cookie string will have on the web in general however
we can make some educated guesses about the scope and location of this problem.

I can reproduce this bug in Mozilla 1.6 and Mozilla 1.7 on Windows XP but not in
Mozilla 1.4 or Mozilla 1.5. Depending on the share of Mozilla 1.6/Firefox 0.8
and later versus 1.5 and earlier you might feel comfortable that this bug does
not affect most sites. However remember many "intranet" and "web app" type sites
only officially support Netscape 7.x which is on either 1.0.x or 1.4.x.
Companies currently using Netscape 7.x/Gecko 1.4 and earlier might be in for a
surprise when upgrading to Mozilla 1.7.

In a quick cursory search of web-based cookie documentation I haven't found any
mention of a fixed fifo order for document.cookie but the use of undocumented
features can be common. "Normal" use of a set of cookies might be expected to
follow the documentation of examples on the web where the cookie string is
parsed on ; then =. Even for cases where the programmer was aware of the
de-facto fifo nature of cookies, it would be difficult to use this information
considering the different possible orderings of the cookies and the potential
variable length of the cookie values. 

My guess is that effects of this change from fifo to lifo will be limited to
cases where the programmer can control the web application entry point, has a
fixed order for setting cookies and a fixed format for cookies which have fixed
length values and has "optimized" his application to take advantage of the fact
he could retrieve cookie values using offset/lengths without having to
scan/parse the entire cookie string.

If one programmer has done this on one site/app, he has done it elsewhere. If
one programmer has done this you can count on others doing it as well. The fact
that the lifo cookie ordering is *unique* to Mozilla 1.6/1.7 while previous
releases of Mozilla, Internet Explorer, and Opera all use fifo makes lifo
ordering a non-starter in my opinion.
Flags: blocking1.7? → blocking1.7+
asa/bc: the limiting factor here is how quickly we can determine if this is
indeed an ordering problem... once we know it is, i can hack up a "quick fix"
for 1.7, but i think a proper fix will be 1.8 material (it might involve
changing the cookie file format).

re bc's comment, i assume that when you say you can reproduce this, you're
talking about using your sprint account? if so, you can help us debug this...
mvl, can you give bc some help?

also, in regard to your comment about Internet Explorer and Opera using fifo
ordering, what makes you say that? have you tested them? if you know for sure
that they do, then that's an important datapoint.
> re bc's comment, i assume that when you say you can reproduce this, you're
> talking about using your sprint account? if so, you can help us debug this...
> mvl, can you give bc some help?

No I did not test Sprint, I tested the fifo/lifo behavior using a simple page
which set two cookies then document.write'd the value of document.cookie

> also, in regard to your comment about Internet Explorer and Opera using fifo
> ordering, what makes you say that? have you tested them? if you know for sure
> that they do, then that's an important datapoint.

With the simple test case where 

      document.cookie = "a=1";
      document.cookie = "b=2";
      document.write(document.cookie);

shows a=1; b=2 in Mozilla 1.4/1.5, MSIE 6 and Opera 7.5 on Windows XP and b=2;
a=1 in Mozilla 1.6/1.7.
> No I did not test Sprint, I tested the fifo/lifo behavior using a simple page
> which set two cookies then document.write'd the value of document.cookie

ah. well, technically, that's not "reproducing this bug" since we don't actually
know what this bug _is_, yet. and we can't call the lifo behavior a bug, since
we don't know if it actually has any effect on servers in the wild.

the datapoints for IE and opera are handy though, thanks for testing those!
Whiteboard: still testing and evaluating options
Time is getting really short for 1.7. Is there any chance this is going to be
moving forward in the next couple of days?
1. Tried to log in, fails.
2. Deleted SMSESSION cookie.
3. Tried to log in again, still fails.

Works the same if I delete the JSESSION cookie. If you also want a log testing
that, let me know.
i still see only cookie strings that start with the JSESSIONID cookie. I think
my trick didn't work.
dwitte, can you try to write a patch that adds the cookies to the end of the
list? I know it wouldn't work in all cases, but at least it will help finding
out if it works for this site.
Comment on attachment 149602 [details] [diff] [review]
quick hack for trunk

>+    for (; iter.current; ++iter);
I personally prefer a while construct.
while(iter.current)
 ++iter;

clearer, more common, not abusing for :)

I'd say lets try on trunk, see if it works. I'm not sure we really need the
full-blown 'correct' patch. This already is a work-around that shouldn't be
needed. if there isn't any evidence there are site needing inter-node fifo
order, lets not add unneeded bloat.

r=mvl
Attachment #149602 - Flags: review+
Comment on attachment 149602 [details] [diff] [review]
quick hack for trunk

yeah, this fix might be okay on its own, but certainly not ideal. :/

so let's get this in on the trunk and get a nightly generated, so our reporter
can test it. bz, since darin's away, any chance you could stamp this? (it's
really only for testing purposes; regardless of whether it works or not, i'll
back it out after we get a nightly build, and work on a more optimal fix.)
Attachment #149602 - Flags: superreview?(bzbarsky)
Comment on attachment 149602 [details] [diff] [review]
quick hack for trunk

I'd really be happier if you found someone to roll a Windows build with this
patch so reporter can download the zip and test...

If you absolutely can't find anyone to do that, sr=bzbarsky for trying this out
on trunk, I guess.  :(
Attachment #149602 - Flags: superreview?(bzbarsky) → superreview+
If this is checked into the trunk, I can try the nightly for May 30. If someone
otherwise wants to roll a win32 build with the patch that's cool too; just drop
me an email with a link. Either way, I should be at home to test it Sunday night
(EDT) sometime at the latest.
Greg, a firefox build with dwitte's hack is available for testing at
http://home.bluemarble.net/~walk84/firefox/builds/
Hack didn't work.
So, it seems dwitte's hack didn't do anything.
from the log:

0[2f4eb8]: ===== COOKIE ACCEPTED =====
0[2f4eb8]: cookie string: SMSESSION=pIVCAMX1SQoIHUbKGQyApKV2jehTg....
0[2f4eb8]: ===== COOKIE ACCEPTED =====
0[2f4eb8]: cookie string: JSESSIONID=QLpzeqO20DmxbH6rRCmfPZ4x2C1i....
0[2f4eb8]: ===== COOKIE SENT =====
0[2f4eb8]: cookie string: JSESSIONID=QLpzeqO20DmxbH6rRCmfPZ4x2C1i...;
SMSESSION=pIVCAMX1SQoIHUbKGQyApKV2jehTg....
 
So, still not fifo. weird.

greg: Can you try again, but delete all the cookies before trying? A good way
might be to use a clean profile.
I always test with a new, clean profile.
(In reply to comment #15)
> afaik, the spec only defines the order if the paths are different.
> JSESSIONID has path=/, SMSESSION has no path. We both interpret that as /. Maybe
> we shouldn't do that for sorting purposes.

So, I verified that dwitte's patch correctly changes LIFO to FIFO when the
pathes are the same.  And, I confirmed that when one of the cookies has a path
of / and the other has no path that the order remains LIFO.  So, it seems to me
that we should try to make the cookie code distinguish path=/ from no path in at
least this case.
Maybe i have been fooled by nspr logging truncating long lines, and both cookies
do have a path.

I tried the following javascript code:
document.cookie="a=b";
document.cookie="c=d";
document.cookie="a=e";
document.write(document.cookie);

That gives me a=e; c=d as output (without the patch). I think that means that
cookies are not modifed, but we create a new one. So, even if the order is fifo
for newly created cookies, as soon as a cookie is modified or just set again,
the order will change again. I think that is what bit us here.
I don't know how to change that behaviour. At least not easily.
mvl:

i tried your example, and i get the same behavior in both the mozilla trunk as
well as mozilla 1.4.
ok, here's what i think is going on.  dwitte's patch didn't help because
SMSESSION is for ".sprint.com" and JSESSIONID is for "localbillga.sprint.com"

so, it seems that we need to preserve FIFO ordering except when a cookie is
modified :-(

that sounds complicated to implement given the hash table.
Attached file reduced testcase
here's a reduced testcase
dwitte: do you have time to work on a better patch for this bug?  can you think
of a way around this problem without discarding the hashtable altogether? 
creation timestamps might be ideal... do you want to cut a patch that implements
that?  note: we'd want to preserve the creation time when modifying a cookie (i
think).
Attached patch possible patch (obsolete) — Splinter Review
here's a quick patch (hack) that implements my suggested fix.  it makes us
behave like 1.4, but i'm not sure that it will really fix the sprint site.  can
someone (swalker?) please generate a test build with this patch (instead of the
previous patch)?  thanks!!
Argh, i never looked at the domains! we even knew the patch wouldn't work in
that case...
But in the testcase, i do see test_a before test_b, on a build without any of
the two patches.
mvl: hmm.. so do i if i just use my default profile w/ a recent firefox build. 
hmm.. but, when i run a 1.7rc2 seamonkey build with a new profile, i can repro
the problem.  if i use a 1.4 seamonkey build with a new profile, i cannot repro
the problem.  i'm not sure why my firefox build doesn't exhibit the same
problem... maybe it has something to do with all the other b.m.o cookies?!
A firefox sea with darin's hack is up at
http://home.bluemarble.net/~walk84/firefox/builds/
(In reply to comment #47)
> A firefox sea with darin's hack is up at
> http://home.bluemarble.net/~walk84/firefox/builds/

thanks stephen!  i confirmed that your build passes my testcase.
And (drumroll please...) 

the patch fixes the Sprint site too. Thanks Darin!
okay, i'm back from holiday now... thanks for helping out while i was gone!

so yeah, creation time would be the way to go here... and for 1.7 it has to be
noninvasive, so darin's patch looks ok for that. it won't persist creation time
across browser sessions, but hopefully we don't need that for sites like sprint
to work. (sprint's cookies are session-only; but other sites might play the same
dirty tricks with persistent ones).

if that hope proves to be false, the alternative i see for 1.7 is a backout of
hashtables... which will be moderately messy at this point, given the length of
time it's been in. a third option would be to extend darin's patch to persist
creation time in cookies.txt, by dropping the LRU sort order (for allowing LRU
cookie deletion) and using creation time instead. fairly simple to do... it just
depends which behavior we consider the lesser of the two evils. ;)

in any case, i'd say a proper fix for 1.8 would entail a file format change.
> it won't persist creation time across browser sessions

i doubt any site could depend on the order of persistent cookies because the
browser historically could have evicted cookies in unpredicable order.  i could
be wrong, but it just seems less likely.

also, iirc cookies were not previously persisted in order of creation time. 
they were sorted in some other fashion in cookies.txt, no?  do you recall the
details?


> if that hope proves to be false, the alternative i see for 1.7 is a backout of
> hashtables... which will be moderately messy at this point, given the length of
> time it's been in.

i agree.  we want to try to avoid backing out this patch as much as possible.


> a third option would be to extend darin's patch to persist
> creation time in cookies.txt, by dropping the LRU sort order (for allowing LRU
> cookie deletion) and using creation time instead. fairly simple to do... it just
> depends which behavior we consider the lesser of the two evils. ;)

let's not go there unless and until we know that we have no other choice.


> in any case, i'd say a proper fix for 1.8 would entail a file format change.

let's definitely not go there if we can help it ;-)
Whiteboard: still testing and evaluating options → fix getting close
>also, iirc cookies were not previously persisted in order of creation time. 
>they were sorted in some other fashion in cookies.txt, no?  do you recall the
>details?

they were sorted with a primary order of path length, and a secondary order of
creation time (or update time, i can't recall). so, creation order was
persistent - but i agree, it's less likely that servers in the wild would depend
on order being preserved. since we do want to solve this as noninvasively as
possible, i'm happy to go along with your patch. if/when we do roll a new
fileformat, it'd be nice to solve this bug in a neater way.
dwitte: ok.. sounds good.  i'll clean up this patch and post it for your review.
Attached patch v1 patch - added comments (obsolete) — Splinter Review
This is the same patch with some additional comments to explain what's going
on.

dwitte: I thought about changing the name of compareCookiesByPath to something
like compareCookiesByPathThenCT, but i wasn't sure what you'd think of that. 
please let me know if you'd like me to change the name of that function to
better reflect the comparison algorithm used.  we could also name it something
like compareCookiesForSending.	or maybe you have a better suggestion?
Attachment #149845 - Attachment is obsolete: true
Attachment #149958 - Flags: review?(dwitte)
Comment on attachment 149958 [details] [diff] [review]
v1 patch - added comments

yeah, it's only used in the one place, so compareCookiesForSending sounds good
to me.

>Index: nsCookie.cpp
>===================================================================
>+// This is a counter that is incremented each time we allocate a new nsCookie.
>+// The value of the counter is stored with each nsCookie so that we can sort
>+// cookies by creation time (within the current browser session).
>+static PRUint32 gLastCreationTime;

static vars are inited to 0 by default, right? so no need for an explicit init
here?

looks nice, thanks for the patch, and for doing the hard work for me. :)

r=dwitte
Attachment #149958 - Flags: review?(dwitte) → review+
Component: Networking → Networking: Cookies
OS: Windows XP → All
Hardware: PC → All
ok, compareCookiesForSending... does that mean that you also like
compareCookiesForWriting instead of compareCookiesByLRU?

yes, static data is always initialized to zero.
Attached patch v1.1 patchSplinter Review
same patch with comparision functions renamed

  compareCookiesByPath -> compareCookiesForSending
  compareCookiesByLRU  -> compareCookiesForWriting

dwitte: please let me know if you approve of this change.  thanks!
Attachment #149958 - Attachment is obsolete: true
Attachment #150007 - Flags: review?(dwitte)
Comment on attachment 150007 [details] [diff] [review]
v1.1 patch

sure, why not :)
Attachment #150007 - Flags: review?(dwitte) → review+
Attachment #150007 - Flags: superreview?(dveditz)
Comment on attachment 150007 [details] [diff] [review]
v1.1 patch

>+  // compare by cookie length in accordance with RFC2109
>+  int rv = cookie2->Path().Length() - cookie1->Path().Length();

"by cookie length" -> "by cookie path length"

sr=dveditz
Attachment #150007 - Flags: superreview?(dveditz) → superreview+
Attachment #150007 - Flags: approval1.7?
Comment on attachment 150007 [details] [diff] [review]
v1.1 patch

a=chofmann for 1.7
Attachment #150007 - Flags: approval1.7? → approval1.7+
Whiteboard: fix getting close → approved patch ready to land
fixed-on-trunk and fixed1.7
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Keywords: fixed1.7
Resolution: --- → FIXED
Whiteboard: approved patch ready to land → needed-aviary1.0
Whiteboard: needed-aviary1.0 → fixed-aviary1.0
filed bug 248590 about the possibility of persisting creation time across
sessions, as noted in comment 52.
*** Bug 249025 has been marked as a duplicate of this bug. ***
*** Bug 246303 has been marked as a duplicate of this bug. ***
(In reply to comment #64)
> *** Bug 246303 has been marked as a duplicate of this bug. ***

Hi, sorry for this (in your eyes probably) lame question, but if this bug is
fixed how can I patch my FF 1.0. I still get this problem with a couple of
sites... www.kpn.com
www.hollandandbarrett.com (when clicking on the the "My Favourites" link)
Just curious as the lack of functionality is annoying.
Cheers
Paul
paulj@paullange.com.au
This bug was fixed on the branches and the trunk, therefore should be fixed in
Firefox 1.0. Try a recent trunk build to see if your issue works on the trunk.
If so, it will be fixed for Firefox 1.1 (which isn't far off).

If it's still broken, please file a new bug; you might want to mention this bug
in the new bug, as it might be a related issue.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: