Closed Bug 134203 Opened 22 years ago Closed 22 years ago

Some credit unions (including mine) can not be accessed with current Mozilla

Categories

(Core :: Networking: HTTP, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla1.0

People

(Reporter: mks, Assigned: darin.moz)

References

()

Details

(Keywords: regression, Whiteboard: [adt2])

Attachments

(4 files)

I am not yet sure where the problem lives, but accessing my Citadel CreditUnion
account fails with Mozilla.  There seems to be some internal error due to
cookies or other form of problem.

I have tried thinks like changing the user agent string to match those of other
browsers that do work (and I have not found any other browsers that fail in my
tests on Windows, Linux, FreeBSD, and MacOSX)  I have tried, so far, the user
agent strings from Netscape 4.x (Linux, FreeBSD, Win), Netscape 3.x (Win), Opera
(Mac), OmniWeb (Mac), IE (Mac, Win), and, of course Mozilla.

All of these other browsers had no problems with the site (or the site had no
problems with them).

I really don't want to give out my account information (;-) so we will need to
work with Citadel (and the other credit unions that use the same web banking
code base) in order to find the problem.

It seems that this is related to some problem in java script or cookies (or the
combination of those) since when the error happens the web server thinks I have
not logged in when I actually have - and even got some of my web page up (they
use a rather nasty multi-frame setup).

Watching the cookie requests shows the cookie being set at login and then asked
to be modified just before the server then shows an error.
These are the pages as saved from Mozilla - both the main login page and the
failed frame set page.

I have changed the html in one of the pages where it referenced my account
number to no longer do so.  The number was changed to '000000' just so that it
still is there but is not my account or anyone else's account.
Note that I do not know how to move forward without some support from the Credit
Union.  This may require more than just my talking with them.

Basic contact information is:

   Citadel Federal Credit Union
   POBox 147
   Thorndale, PA 19372
   1-800-666-0191 / 1-215-383-5700
Oops - forgot that the area code changed.  It is 610 rather than 215.
confirming...

would it be possible for you to attach some excerpts from the javascript that
the site uses?
Status: UNCONFIRMED → NEW
Ever confirmed: true
BTW, I've seen this bug and worked with Mike on verification.

Note Mike's comment that a bunch of credit unions use the same webpage codebase
for online account access, so this may affect a bunch of sites. 

It's also possible that NSS might be involved here.

I'm going to leave the component until we know that where the issue really lies.
  What we really need is investigation; I can work on this with Mike if we know
what needs to be checked.  CCing some JS/etc people for ideas, if you know
someone who might have an idea on what to check, please CC them.

Mike: you uploaded that as octet-stream, what is the real file type? .tar.gz?
Severity: normal → major
Sorry, when I did the upload, I let the system auto-detect.  It is a .tar.gz
file from the saves of the we pages/javascript from Mozilla.  (Well, most of it,
at least in the failure case)

I have yet to get the non-failure case cleaned up such that I can send it.
This looks like a regression.  I just tried the version of Mozilla that comes
with RedHat 7.2 (Mozilla 0.9.2.1) and it worked just fine.

I will track this down some more as I get the chance.
This must be a regression since 0.9.9 seems to work fine too (Just downloaded
the linux build 2002031312 for testing)

I have re-verified that a current build from trunk (0.9.9+) does not work.

Is there something else I should try?
Since this is a regression from 0.9.9, and it affects probably many credit
unions and the ability of people to access their accounts, nominating for 1.0.

Adding asa.
Until we can determine where the regression is, reassigning to
Asa/Browser-general.  We need some eyes on this to find out where the problems are.
Component: US Banks → Browser-General
Product: Tech Evangelism → Browser
Version: unspecified → other
Actaully reassigning to Asa
Assignee: aruner → asa
QA Contact: bclary → doronr
The tech support indicated that it's a cookie error message...
I have now tested the following Linux builds from the nightly builds tree:

2002031521 - fails
2002031423 - fails
2002031321 - fails
2002031312 - works  (Note that this is "0.9.9" release)
2002031221 - fails
2002031121 - fails
2002031021 - fails

So, what is happening here?  It is not just my systems that seem to build
versions the fail since these are the pre-packaged binaries of the form:
2002-03-15-21-trunk/mozilla-i686-pc-linux-gnu.tar.gz that are failing.  The
2002031312 was pulled from the release tree on the ftp site.

So, what do I do now?  If I can believe the build IDs, the builds around the
0.9.9 release all fail but the 0.9.9 release works.

To me, this sounds like a compiler issue - somewhere some code happens to work
on the compiler that the release was built with but not on the systems where the
nightly builds are made.  (Again, assuming that the build numbers can be trusted
on the dates)
Priority: -- → P1
I suspect that this is something that changed (on the trunk) in the period just
after the 0.9.9 branchpoint (and before 3/10/2002)
per comment #12 -> cookies
Assignee: asa → morse
Component: Browser-General → Cookies
QA Contact: doronr → tever
What's the setting of your cookie-acceptance pref?  Specifically, do you have it 
set to accept cookies from originating server only?  If so, change that to 
accept all and see if that makes the difference?
I normally run with cookies set to "Enable cookies for the originating web site
only" and do so on all browsers.

I thought that I had check that this did not matter, but it seems that there is
some issue with the builds that fail since if I do change that the site does
work.  (Sorry I did not notice that earlier)

Anyway, in 0.9.9 and previous releases and in IE (which has a simular option)
this has worked in the past.

I have just reverified that 0.9.9 works even with this set to "Enable cookies
for the originating web site only" (and it works with accept all cookies)

I have also checked as to what the cookie is and it seems to always be the same
server/site name of "pcu.citadelfcu.org" - no other cookies get set during the
session.

Is there some tracking or other information I could collect for you?
The third-party cookie code in mozilla has been non-functional for a long time 
and was just recently enabled.  That is why you did not see the problem before.

The third-party cookie test in mozilla is stronger in mozilla than in n4 (and 
probably IE) which might explain why you do not see the problem on those 
browsers.  In particular, n4 will block third-parties from setting cookies but 
will obediently release cookies to third-party sites.  This was a privacy hole 
that was fixed in mozilla.

If you don't want to give out your account information, then you'll have to do 
the debugging for us.  Go to extensions/cookie/nsCookies.cpp and comment out the 
lines that say "ifdef DEBUG_morse" and their corresponding endifs (there are two 
pairs of these) so that the print statements that they bracket will execute.  
Then run the browser and let me know what values are getting printed out.
The following are the two runs, one with cookies restricted to the originating
site and the other with the cookies not restricted.

(A simple ASCII file)
Could you put in print statements for when the cookie_IsInDomain routine returns 
false and for when the cookie_SameDomain routine returns 0.  That will tell us 
which of these pairs from your previous printout resulted in the rejected 
cookie.

My guess is that it is when the 1stURL is coming up null.  Can you verify that 
that is indeed the case when cookies are being rejected.  And, if so, could you 
then set some breakpoints and determine who it is that is calling the cookie 
module with a firstAddress of null, and why.

Thanks.
I will get some of that to you as soon as possible - I may be distracted for a
bit as work (the bill-paying kind) gets in the way :-(
While looking at the code, I think it is obvious that the problem happens when
1st URL is NULL.  In fact, when that happens, lots of tests do not run.

The question, as you put it, is why 1st URL is NULL rather than the web site
the page is from.  I wonder if it has to do with JavaScript generated requests.

I will see if I can look into this later (or maybe you can do so on a normal
server with simular type of page layout as what I submitted above)

I have wrappered the IsInDomain() call with a print for when it fails and it
shows that it never fails incorrectly.  However, it is not called in any of the
cases where 1st URL is NULL.  (Which I think I noticed in the code when I was
looking at how to get the best output for that case.)  Anyway, having the 1st
URL NULL may wish to be what I set a break point on.  (Arg - I knew I did not
want to get pulled into another large project - last time I did this I ended up
becoming one of the Blackdown core team for Java 1.1 :-)
Ok - I have this thing in the debugger and am tracing back where the NULL came
from for the firstAddress.

First off, all of the image loads seem to have firstAddress == null during the
loading of the left hand frame.  Now, those images are coming from a part of
the server that does not need the cookies, but why are those image loads coming
up with a null firstAddress?  It looks like it is due to the fact that
JavaScript is doing the load requests (see the first uploaded tar.gz file in
the bug report for details as to what is in the left hand frame)

I set a breakpoint in the GetCookieFromHttp that only fires if firstAddress
is null.  During the loading of the pages after login I get lots of hits
due to the images (which are loaded by the JavaScript)  For example:

Breakpoint 2, COOKIE_GetCookieFromHttp (address=0x891e8b0
"https://pcu.citadelfcu.org/images/System_AdvertisementBackground.jpg",
firstAddress=0x0, ioService=0x80ffca8) at nsCookies.cpp:944
Breakpoint 2, COOKIE_GetCookieFromHttp (address=0x87991e0
"https://pcu.citadelfcu.org/images/System_Nav_PersonalFinance.gif",
firstAddress=0x0, ioService=0x80ffca8) at nsCookies.cpp:944

The images that get loaded (via JavaScript) are not the parts causing the
problem.  However, the next nested frame structure seems to be the
problem.  This page is reqested to be loaded from, what looks like,
JavaScript.  I have not found that script yet (I need to turn on network
tracing or something like that) but the breakpoint looks like this and the
call tree follows:

Breakpoint 2, COOKIE_GetCookieFromHttp (address=0x89a7558
"https://pcu.citadelfcu.org/asp/Login/System_WholeCenterAreaFrameStruct.asp",
firstAddress=0x0, ioService=0x80ffca8) at nsCookies.cpp:944

(gdb) backtrace
#0  COOKIE_GetCookieFromHttp (address=0x89a7558
"https://pcu.citadelfcu.org/asp/Login/System_WholeCenterAreaFrameStruct.asp",
firstAddress=0x0, ioService=0x80ffca8) at nsCookies.cpp:944
#1  0x41733871 in nsCookieService::GetCookieStringFromHttp (this=0x86421f8,
aURL=0x89a7020, aFirstURL=0x0, aCookie=0xbfffe7bc) at
../../dist/include/string/nsString.h:72
#2  0x417409d7 in nsCookieHTTPNotify::OnModifyRequest (this=0x812cae8,
aHttpChannel=0x89a70e8) at ../../dist/include/xpcom/nsCOMPtr.h:650
#3  0x401eda59 in XPTC_InvokeByIndex (that=0x812cae8, methodIndex=3,
paramCount=1, params=0x89a7540) at xptcinvoke_unixish_x86.cpp:130
#4  0x401cf1cc in nsProxyObject::Post (this=0x89a7520, methodIndex=3,
methodInfo=0x8643ad8, params=0xbfffe958, interfaceInfo=0x8642110) at
../../../dist/include/xpcom/nsCOMPtr.h:839
#5  0x401d2c02 in nsProxyEventObject::CallMethod (this=0x89a7500, methodIndex=3,
info=0x8643ad8, params=0xbfffe958) at ../../../dist/include/xpcom/nsCOMPtr.h:650
#6  0x401edcd7 in PrepareAndDispatch (self=0x89a7500, methodIndex=3,
args=0xbfffea14) at xptcstubs_unixish_x86.cpp:95
#7  0x401edd4a in nsXPTCStubBase::Stub3 (this=0x89a7500) at
../../../../../../dist/include/xpcom/xptcstubsdef.inc:5
#8  0x408f1502 in nsHttpHandler::OnModifyRequest (this=0x80fe6e0,
chan=0x89a70e8) at ../../../../dist/include/xpcom/nsCOMPtr.h:650
#9  0x4090a0c7 in nsHttpChannel::AsyncOpen (this=0x89a70e8, listener=0x897fcf8,
context=0x0) at nsHttpHandler.h:78
#10 0x40904c77 in nsHttpChannel::ProcessRedirection (this=0x897f828,
redirectType=302) at ../../../../dist/include/xpcom/nsCOMPtr.h:650
#11 0x408ffcde in nsHttpChannel::ProcessResponse (this=0x897f828) at
nsHttpChannel.cpp:514
#12 0x4090be8d in nsHttpChannel::OnStartRequest (this=0x897f828,
request=0x8982cf4, ctxt=0x0) at nsHttpChannel.cpp:2742
#13 0x40931f43 in nsOnStartRequestEvent::HandleEvent (this=0x87a77d0) at
../../../dist/include/xpcom/nsCOMPtr.h:650
#14 0x4089b955 in nsARequestObserverEvent::HandlePLEvent (plev=0x87a77d0) at
nsRequestObserverProxy.cpp:115
#15 0x401c334b in PL_HandleEvent (self=0x87a77d0) at plevent.c:596
#16 0x401c31ab in PL_ProcessPendingEvents (self=0x80e9eb8) at plevent.c:526
#17 0x401c52ba in nsEventQueueImpl::ProcessPendingEvents (this=0x8096230) at
nsEventQueue.cpp:388
#18 0x417b8baa in event_processor_callback (data=0x8096230, source=7,
condition=GDK_INPUT_READ) at nsAppShell.cpp:184
#19 0x417b8808 in our_gdk_io_invoke (source=0x8135870, condition=G_IO_IN,
data=0x8195588) at nsAppShell.cpp:77
#20 0x40430f9e in g_io_unix_dispatch () from /usr/lib/libglib-1.2.so.0
#21 0x40432773 in g_main_dispatch () from /usr/lib/libglib-1.2.so.0
#22 0x40432d39 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#23 0x40432eec in g_main_run () from /usr/lib/libglib-1.2.so.0
#24 0x4034d333 in gtk_main () from /usr/lib/libgtk-1.2.so.0
#25 0x417b96c5 in nsAppShell::Run (this=0x815ade0) at nsAppShell.cpp:364
#26 0x41777125 in nsAppShellService::Run (this=0x815a588) at
../../../dist/include/xpcom/nsCOMPtr.h:650
#27 0x08051107 in main1 (argc=3, argv=0xbffff624, nativeApp=0x8092dd0) at
../../dist/include/xpcom/nsCOMPtr.h:650
#28 0x0804f0ad in main (argc=3, argv=0xbffff624) at nsAppRunner.cpp:1763
#29 0x40591627 in __libc_start_main (main=0x804ef20 <main>, argc=3,
ubp_av=0xbffff624, init=0x804e1a0 <_init>, fini=0x805c8b8 <_fini>,
rtld_fini=0x4000dcc4 <_dl_fini>, stack_end=0xbffff61c) at
../sysdeps/generic/libc-start.c:129

At which point, as I continue, the site then tries to send a new cookie.
When I don't restrict the cookies, only one cookie gets set during the
whole session (until I log out).  It seems that fact that we don't
provide the cookie when loading the URL above, it gets upset and thus
sends a new cookie that is an "invalid/logged out" cookie.

I will attempt to find out exactly how this happens (code path) but 
maybe someone who already knows this area better can find the problem.
So, from reading of the code, could someone explain why 1stURL (aFirstURL)
should ever be NULL?  And if it is NULL, what does that mean for things like the
cookie restrictions?

My gut feeling here is that this really should not be NULL as it seems to be
used as the url the "user" sees as the base of the document.  Would not the
JavaScript loaded images still have the same base URL as the document?
Ok - I think I know the problem...

That JavaScript comment was a red herring as it was related to the images but
not the root cause of the problem.

In nsHttpChannel::ProcessRedirection() the newChannel (nsIChannel) is not setup
with the base document during the redirect.  Around line 1469 there is the
following code:

  // update the DocumentURI indicator since we were just redirected
  if (newURI && (mURI == mDocumentURI))
      httpChannel->SetDocumentURI(newURI);

The problem is that the second half of this test is too strict.  If, while
running at the back site and I get the redirect that causes the problem, I can
force the httpChannel->SetDocumentURI(newURI) and the site works, even with
restricted cookies.

The question now is what is the correct comparison operation.

BTW - I checked, and the redirect is just to another location within the same
site.  (darn gdb just failed on me and I have to get back to where I was...)

Anyway, I think I am close to a solution here - if someone who knows what the
behavior should be could help out that would be great.  I looks like there needs
to be a different test there.

The net result is that SetDocumentURI() needs to be called on the new channel
otherwise it will end up with a null mDocumentURI which will make it pass a null
for 1stURL which then makes the cookie not be sent which then makes the site
complain since the cookie is how it identifies your session/login rights.
s/back site/bank site/
In response to Mike's analysis, reassigning to networking:http.  Adding morse to
the cc list.  This could affect quite a few sites, since it appears to be tied
to redirection, and testing shows that it happens on more than just this one
fetch.  Targetting for 1.0 tentatively to make sure it's on radar.

There might also be security concerns here; it's hard to be sure yet.
Assignee: morse → darin
Severity: major → critical
Component: Cookies → Networking: HTTP
Target Milestone: --- → mozilla1.0
the code probably needs to do this:

  // update the DocumentURI indicator since we were just redirected
  if (newURI && (mURI == mDocumentURI))
      httpChannel->SetDocumentURI(newURI);
+ else
+     httpChannel->SetDocumentURI(mDocumentURI);

i believe this code is trying to update the document URI if it has changed.  but
if any subcontent redirects, we also need to update the document URI.  but for
subcontent, the document URI is never mURI.  so i think my patch should do the
trick.
Attached patch v1 patchSplinter Review
here's the patch i suggested.  can someone test this out and let me know if it
works?	thx!!
Darin - that is exactly the patch I had just made here - after following the
code, it looks like that is what was actually intended.  This would keep a
subdocument (frames, etc) within the master document as far as original site.

A few more tracings into the code for side effects shows that this even works
correctly for those bad attempts at frame-ing other sites and trying to peek at
information that should not be available.
oops - forgot to say that the patch works (I sort of implied that but I wanted
to make sure it was stated directly)

BTW - mid-air collisions are no fun, even in bugzilla.  I twice had to retype my
message about the patch.
darin & michael,
Thank you both for looking into this.  You saved me from doing a lot of 
investigating here.  I was afraid that I was going to get some of the 
originating URI code wrong in a few places, and I guess this was one of them.

r=morse on the patch, providing it's been tested and works.
OK, I'm glad to hear that it works.  My comment also mid-air collided.
Here is how I read the code and a recommened updated patch that has a comment
about why the code is like it is:

Well, after I found the area where the problem was, I played with the code some
more and figured out that the newChannel really wanted to be a clone of the
current channel where the redirect is being processed from with some changes
based on the redirect.

However, when creating a new channel, the mDocumentURI is set to nsnull (null). 
Now, the special code for the mURI == mDocumentURI is there for a top-level
redirect such that the new URI replaces the mDocumentURI when it is a top-level
redirect.  However, in sub-documents (such as images and frames) the mURI does
not match the mDocumentURI and thus it was never setting the mDocumentURI in the
new channel.  What it should have been is the same as the current channel. 
There really should be a comment there about why this is being done.  I propose
a patch that looks like:

-        // update the DocumentURI indicator since we were just redirected
+        // Set mDocumentURI to the newURI if this was a top-level
+        // redirect, otherwise just pass along the current mDocumentURI
         if (newURI && (mURI == mDocumentURI))
             httpChannel->SetDocumentURI(newURI);
+       else
+           httpChannel->SetDocumentURI(mDocumentURI);
Here is how I read the code and a recommened updated patch that has a comment
about why the code is like it is:

Well, after I found the area where the problem was, I played with the code some
more and figured out that the newChannel really wanted to be a clone of the
current channel where the redirect is being processed from with some changes
based on the redirect.

However, when creating a new channel, the mDocumentURI is set to nsnull (null).
 Now, the special code for the mURI == mDocumentURI is there for a top-level
redirect such that the new URI replaces the mDocumentURI when it is a top-level
redirect.  However, in sub-documents (such as images and frames) the mURI does
not match the mDocumentURI and thus it was never setting the mDocumentURI in
the new channel.  What it should have been is the same as the current channel. 
There really should be a comment there about why this is being done.  I propose
a patch like this one.
Comment on attachment 77965 [details] [diff] [review]
v1 patch

adding r=morse to this patch.  i'll add some comments explaining why this is
the right thing to do before checking in.
Attachment #77965 - Flags: review+
Comment on attachment 77977 [details] [diff] [review]
Same patch but with a better comment

msinz: your indentation is busted... i'll add your comments before checking in
;-)
Attachment #77977 - Flags: needs-work+
Sorry about that - I think my editor put in a hard tab there.
OS: Linux → All
Hardware: PC → All
Keywords: nsbeta1nsbeta1+
Whiteboard: [adt2]
Attachment #77965 - Flags: superreview+
Keywords: adt1.0.0
Comment on attachment 77965 [details] [diff] [review]
v1 patch

a=rjesup@wgate.com for drivers.  I personally prefer adding the comment mike
suggested as well.
Attachment #77965 - Flags: approval+
It sounds like testing was done on this for cookies. Darin can you do some unit
testing on other areas this might affect in http?
this only effects cookie-blocking.  morse is running his full suite of cookie
blocking tests.

i've run the browser with this patch in my tree and it seems to work fine. 
there don't appear to be any negative side-effects.
I have been running with this patch (well, my patch, but it is only a comment
and tab difference :-) since Friday.  I have had no problems - in fact, things
now work more than before.  (AKA the bank :-)
I just applied the patch and ran my suite of third-party cookie tests.  They all 
worked fine -- no regressions.
adt1.0.0+ (on ADT's behalf) for approval to checkin to 1.0. Pls check this in today.
Keywords: adt1.0.0adt1.0.0+
fixed-on-trunk
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
*** Bug 134625 has been marked as a duplicate of this bug. ***
Michael, could you please test this with your Citadel account and mark verified
if it works.  Thanks.
*** Bug 134955 has been marked as a duplicate of this bug. ***
*** Bug 131232 has been marked as a duplicate of this bug. ***
I have been running with this patch.  I have had no problems - in fact, things
now work more than before.  (AKA the bank :-)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: