Need a way to determine third-party sites.

VERIFIED FIXED in mozilla1.0

Status

()

Core
Networking
P2
major
VERIFIED FIXED
17 years ago
16 years ago

People

(Reporter: Stephen P. Morse, Assigned: Darin Fisher)

Tracking

Trunk
mozilla1.0
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 9 obsolete attachments)

(Reporter)

Description

17 years ago
This bug report is an offshoot of bug 87388.  The discussion in that report is 
quite lengthy and sometimes off the topic.  So I've opened this report to focus 
on what the problem is.

What is needed is a networking API that allows us to determine the originating 
site, where "originating site" is defined as follows:

- If no redirects occur, originating site is the site that the user typed into 
location field or the site that the user clicked a link to.

- If redirects occur, originating site is the last site that redirect was to.

Examples:

1. user clicks on link to site A
   A does a redirect to site B
   site B attempts to set a cookie
   [answer should be B in this case]

2. user clicks on link to site C
   site C fetches an image from site D
   site D attempts to set a cookie
   [answer should be C in this case]

The correct result used to be what was being returned by GetURI.  But changes 
have occured in the networking layer so that this is no longer the case.  In 
particular, it returns D in example 2 which is not what I want.

--------------------------

Now let me summarize the discussion in bug 87388:

Darin suggested that I use GetOriginalURI.  But he then admitted that it will 
return A in example 1 which is not what I want.

Gagan suggested that I get the URI through the default channel of the load group 
(whatever that means).  Darin then said that Gagan's idea won't work because 
imagelib doesn't associate loadgroups with the http channel it creates.  Gagan 
admitted his error and then said that the problem will be solved if imagelib 
sets the toplevel document on the dummy loadgroup's URI onto the dummy loadgroup 
it creates.  At this point Gagan said that the problem will be fixed as soon as 
bug 87388 gets resolved.

Yesterday Pavlov and Darin decided not to fix bug 87388.  That's fine with me.  
But then networking needs to provide some other API for the cookie module to use 
to determine the originating server.  Otherwise we will need to abandon our 
tests for third-party cookies which will be a major switch from 4.x.
(Reporter)

Updated

17 years ago
Blocks: 87388
Note that there is no reason that the top level page has to be http -
ftp/file/etc pages can load images from http too.

Can't you just ask the docshell somehow? Or do we (understandably) not want
cookies to depend on docshell?
(Assignee)

Comment 2

17 years ago
cc'ing rpotts.
Actually, I have a better idea. [Side note, how do frames/iframes interact with
this model? I'm going to ignore them for the example below. Doing so may make
this not work, though]

When cookies gets called, get the loadflags on the channel. If the load flags
include LOAD_DOCUMENT_URI or LOAD_RETARGETTED_DOCUMENT_URI, then you have a top
level document. In this case, call GetURI. (This is case 1)

If neither of those load flags are present, then you are being loaded from
somewhere. In that case, get the referrer. The http SetReferrer logic then needs
to be changed for only keeping the referrer sometimes, to always keeping the
referrer, but only sending it sometimes. (This is case 2)

So, if we want this extended for frames then someone needs to tell me if
LOAD*DOCUMENT_URI is set for frames and/or iframes.
I haven't actually tested this, though...
(Reporter)

Comment 4

16 years ago
Raising severity to major.  Gagan, we need to know what your plans are for this 
bug.
Severity: normal → major

Comment 5

16 years ago
Is this same problem also breaking 3rd party image blocking? Can't find the bug
now, but it's broken as well. I believe there's additional discussion there
about determining the requested site.
OS: Windows NT → All
Hardware: PC → All
Target Milestone: --- → mozilla1.0
(Reporter)

Comment 6

16 years ago
Yes it is.

Comment 7

16 years ago
after speaking to darin and pavlov, I agree that there is no clean way of
implementing this. Bradley's suggestion would work as a hack fix but would fail
for cases where the referer is different. But that might be a way to get started
in the right direction. 
How can the referrer be different to the loading page?

I'm not sure who to ask about the loadflags on frames, and I don't have time to
dig into it ATM.
(Reporter)

Comment 9

16 years ago
Gagan, I don't understand why this is so difficult.  The old behavior of GetURI 
used to do exactly what I need.  Then somebody changed something and third-party 
cookies have not worked since.  So all I'm asking for is another API that does 
exactly what GetURI used to do.

Comment 10

16 years ago
bradley: the referrer will be different if the same image is referenced from two
different pages/frames.
Keywords: mozilla1.0

Comment 11

16 years ago
In a frame then, can't you check the parent's referrer?
gagan: But that doesn't matter. If we have one page from which loading is
permitted, and one from which it is not, then we will only load one of them anyway.

I suppose my plan may go wrong in combination with no-store and the same image
on multiple pages. I suspect that that case is really rare, though.

Did the old code handle that case?
(Assignee)

Comment 13

16 years ago
i vote for going with bradley's plan... it will get a solution that is right 99%
of the time.

Comment 14

16 years ago
Didn't the old 3rd-party code handle letting www.foo.com load images from
images.foo.com? Was com/net/org special, to avoid letting nicesite.co.uk load
content from annoyingsite.co.uk?
(Reporter)

Comment 15

16 years ago
There's some confusion in the comments above.  This has nothing to do with 
loading or not loading of images.  That's image blocking and this bug is not at 
all about blocking images.

The images are merely being used as vehicles to set third party cookies.
morse: s/image/subobject/ in my comments above. My argument applies to <object>,
<applet>.

It applies to <frame> and <iframe> if and only if my assumnptiopn about the
loadflags are correct.

Comment 17

16 years ago
bradley: not sure if the old code handled it ok or not but I agree that that is
a small percentage of cases that will be hurt by it. I second your idea.
Someone needs to check the <frame> stuff, then.

Comment 19

16 years ago
Is there any chance of getting this support by the end of 0.9.8?  AFAIK, we have
no approval to drop the 3rd party cookie blocking we've had since 4.x.  However,
we also cannot afford to have any affect on Gecko after 0.9.8.
(Reporter)

Comment 20

16 years ago
Attaching a patch that solves the problem without using the "referrer" hack 
described above.  Instead it introduces a new instance variable, 
mLastRedirectedURI.  That variable provides exactly what is needed to determine 
third-party cookies, as defined in the description of this report.

One of the files included in that patch is nsCookieHTTPNotify.cpp.  Strictly 
speaking, the change to that file is not part of this bug report but rather 
belongs to bug 87388 (this report is for an api change to netwerking to allow 
determination of a third-party site, and bug 87388 is to use that api change for 
determining if a cookie is from a third party site).  But since these two 
reports are so closely related, I'm including the entire patch here and will 
close out both reports as fixed once this patch is checked in.
(Reporter)

Comment 21

16 years ago
Created attachment 63584 [details] [diff] [review]
introduce instance variable to allow for determination of 3rd-party sites
(Reporter)

Comment 22

16 years ago
dveditz, darin, please review
(Reporter)

Comment 23

16 years ago
Created attachment 63763 [details] [diff] [review]
missing an NS_IF_ADDREF in nsHttpChannel::GetLastRedirectURI
Attachment #63584 - Attachment is obsolete: true
(Assignee)

Comment 24

16 years ago
morse: can you explain how your patch works.
(Reporter)

Comment 25

16 years ago
Sure.  I create another instance variable that I initialize to the URL, just 
like mURL and mOriginalURL are initialized.  Whenever an image is fetched and a 
new channel is created, I call on a method of that new channel to modify this 
new instance variable to be the URL of the parent channel rather than the url of 
the image.  That's how I solve requirement 2 of this report.

As for redirection, whenever that occurs I again call on the same method to 
update the new instance variable to be the url to which the redirection was 
made.  That solves requirement 1.

Incidentally, there are still two more problems that I am working on.  But I'd 
like to get this work checked in while I'm trying to solve those problems.  One 
has to do with frames (baetz' comment of using LOAD_DOCUMENT_URI doesn't work) 
and the other with cookies set via javascript.
(Reporter)

Comment 26

16 years ago
OK, there's a subtle logical problem with my patch.  Let me explain.

When the browser wants to fetch an image that is on a page, we get to the 
imgLoader module which will create a new channel for the image and then 
request it.  That was the call to imgLoader::LoadImage.  The modification that I 
made was to introduce a LoadImage2 instead which does the same thing but also 
calls a method in the new channel that sets the new instance variable that I 
introduced.  And it's this instance variable that my cookie module will use to 
determine that the image is from a third party.

The steps are as follows:

1. imgLoader::LoadImage2 creates newChannel
2. imgLoader::LoadImage2 calls newChannel->SetLastRedirectURI to be parent URL
3. channel builds up the http request
4. cookie module's OnModifyRequest is called.  It uses LastRedirectURI to 
   test for third-party site and, based on that, decides whether or not to add
   cookies to the request.
5. response comes in
6. cookie modules OnExamineResponse handler is called.  It uses LastRedirectURI
   to test for third-party site and, base on that, decides whether or not to
   accept any set-cookies found in the response.

All looks good, and the test in step 6 works fine.  However the test in step 4 
does not work.  Reason is that step 2 does not occur immediately after step 1 
(as I would have hoped) but actually occurs after step 4.

This can be explained by looking at the coding in LoadImage2:

    nsCOMPtr<nsIChannel> newChannel;
    ioserv->NewChannelFromURI(aURI, getter_AddRefs(newChannel));
    if (!newChannel) return NS_ERROR_FAILURE;

    nsCOMPtr<nsIHttpChannel> newHttpChannel = do_QueryInterface(newChannel);
    if (newHttpChannel) {
      newHttpChannel->SetLastRedirectURI(parentURI);
    }

I was expecting the SetLastRedirectURI to occur as soon as the newChannel was 
created but before the http request was built.  Unfortunately the request gets 
built as part of the call to NewChannelFromURI, so the call to 
SetLastRedirectURI is occuring too late.

So what do we do about it?  Here are some suggestions.  Other suggestions would 
be appreciated:

1. Change the call to NewChannelFromURI so that it passes in the LastRedirectURI 
value.  This way the new channel's init routine can set that value before it 
proceeds to build the http request.  But this will require changes not only to 
the NewChannelFromURI interface but to several other interfaces as well in order 
to allow this additional parameter to reach the new channel's init routine.

2. Don't worry about testing for third-party cookies on the request and do it 
only on the response.  This is not as bad as it sounds because it turns out that 
4.x does the third-party cookie test only on the response and not on the 
request.  That means we don't set cookies from third-party sites but we still 
send cookies to them if they managed to get their cookies set.  If our 
requirement is to at least be compatible with 4.x (and not worry about improving 
on it), then this solution could be acceptable.
(Assignee)

Comment 27

16 years ago
morse: why would checking for 3rd party sites on request be an improvement over
4x?  surely the user would not see any difference, correct?
(Reporter)

Comment 28

16 years ago
It's a privacy concern that was expressed by privacy advocate Richard Smith.  
Here's one scenerio.

- user visits third-party site and cookie is accepted
- user later becomes security conscious, sets pref to disable 3rd party cookies
- user returns to third-party site
- his cookies are sent to site

Another scenario is:

- user has pref set, goes to site directly (e.g., doubleclick)
- cookie gets set
- user goes to some other site that obtains images from doubleclick
- cookies are sent out to doubleclick even though it is now a third party

And I believe there were other cases as well but I don't recall.
(Assignee)

Comment 29

16 years ago
okay, that makes sense.  perhaps we need to delay the call to OnModifyRequest
until AsyncOpen is called.

also, i think "lastRedirectURI" is a bad name.  since there are not always
redirects involved.  perhaps what we really need is nsIChannel::documentURI. 
that way every channel could have a documentURI associated with it.  we'd need
to make sure docshell sets this attribute when it sets the load group, and we'd
also need to make the imgLoader set this attribute on channel's it creates.  and
then there's CSS and JS scripts to worry about.

another solution would be to create an additional loadgroup for each document
that could be used with image requests only.  then cookies could just check the
URI of the loadgroup's default request.  i like this approach better since it
doesn't involve an API change.  but, i know pavlov probably won't like it.
(Reporter)

Comment 30

16 years ago
I didn't like the name LastRedirectURI for the same reasons you gave.  I was 
just too lazy to change it since it would have required another clobber build 
before I could post the patch.  But I agree that a better name should be used.

I don't care which of the two solutions you outlined above gets implemented, as 
long as the end result allows the cookie module to determine third-party 
cookies.  So you and pav will have to come to some agreement on that.
(Assignee)

Comment 31

16 years ago
ok, talked with pavlov... we can't depend on loadgroups cuz some image requests
don't even get loadgroups... moreover for some image requests there isn't even a
default request... consider onmouseover image requests.  the document has long
since loaded.  this rules out depending on a loadgroup's default request for
solving this bug.

now as for adding nsIChannel::documentURI, i think that may be just overkill and
besides it'd be a lot of work to add that.  and it just becomes another field
clients must all be sure to set.  i don't like it for this reason.

now, there is a striking similarity between nsHttpChannel::mReferrer and morse's
mLastRedirectURI.  i think we can just use mReferrer.  to do this i have put
together a patch that modifies nsHttpChannel::SetReferrer to always store
mReferrer (but only add a Referer header to the actual http request if
yadda-yadda).  my patch also moves the call site of OnModifyRequest from
nsHttpChannel::Init to nsHttpChannel::AsyncOpen to allow for third-party cookie
blocking on http requests.

this is not a complete solution.  i think imagelib still needs to be given the
baseURI in a manner similar to morse's patch.  however, LoadImage2 is bad... we
should make LoadImage take an extra nsIURI parameter for this.  LoadImage should
use this when setting the http referrer... if baseURI is not set, then LoadImage
should resort to looking for the default request in its loadgroup (if present).

i'm going to attach the http portion of this patch now... and then i'll put
together a patch for libpr0n.

-> me
Assignee: neeti → darin
(Assignee)

Comment 32

16 years ago
Created attachment 64052 [details] [diff] [review]
http changes to make using nsIHttpChannel::GetReferrer work
(Reporter)

Comment 33

16 years ago
That's terrific.  I'll test it out as soon as the patch appears.

I should mention that I was trying to find a simpler solution than my original 
patch.  The issue is not that important as to warrent anything complicated.  The 
idea of blocking cookies from third-party sites is flawed in that sites know 
about this and get around it by simply doing a redirect and redirect back.

For example, suppose you want to visit www.vendor.com and he is being serviced 
by marketing site doubleclick.com.  In the old days, vendor.com would have an 
image for doubleclick and doubleclick would set a cookie.  Then we added 
third-party cookie blocking.  Then vendor.com realized this and changes his site 
to do a redirect to http://doubleclick.com?vendor.com/page2.  Then doubleclick 
gets called as a first-party site and sets the cookie.  Now doubleclick uses the 
query string it received to determine who to redirect back to.

OK, if this is so flawed, why are we even bothering to support it.  Ideally we 
should just fess up to the fact that it's useless and remove the third-party 
pref from the pref panel.  There are two reasons why we can't simply do that.  
First is that the press won't understand it and they'll say that we are no 
longer interested in protecting the user's privacy.  Second is that microsoft's 
p3p stragegy involves special blocking for third-party cookies if the site's 
privacy policy is not strong enough.  And we are forced to implement a similar 
p3p stragegy, otherwise we'll again take a PR hit in the press for not being as 
good as IE on privacy.  It's not a matter of whether or not there is any real 
privacy issue involved with third party cookies -- the user's will perceive that 
there is.

Well that was a long-winded way of saying that we really don't need a difficult 
or complicated patch here.  Just one that "appears" to work and will allow us to 
keep saying that we are in some sense still offering the ability to block 
third-party cookies.

Comment 34

16 years ago
> Another scenario is: user has pref set, goes to site directly, cookie
> gets set, user goes to some other site, cookie gets sent back

This is even worse when you consider right clicking an image and selecting 'View
Image' constitutes "go[ing] to [the] site directly".

> sites know about this and get around it by simply doing a redirect and
> redirect back

I thought one of the reasons this bug was filed was so third party detection for
images and cookies could detect things just like that.

If I load http://foo.com/, and it has <image src=ad.png>, Mozilla GETs ad.png.
If the HTTP GET returns a 3xx redirect to http://valueclick.com/ad.png, we check
the load third party images pref. If it's OK, we HTTP GET
http://valueclick.com/ad.png. If it sends a cookie, you compare foo.com vs.
valueclick.com, see it doesn't match, and don't save the cookie.

All you ever need is the hostname of where you're loading from, and the same
hostname that's in the URL bar (location.href). If location.href isn't available
in the networking code, we need a variable that's set and updated in the same
way, but is available. location.href updates when the main page requested is a
redirect, but never again until you load a new "main" document. That's what
should be compared to.

If neither of the current patches does, or will do, what I described, then they
are defective, though I do appreciate the work and attention this bug is now
getting.
(Reporter)

Comment 35

16 years ago
dolan: I think you misunderstood the redirect scenario that I was describing 
above.  It's not that the site has an image and the GET for that image does a 
redirect.  Rather the GET for the original URL that you are trying to reach has 
the redirect.  So here's what happens:

1. request: GET http://www.vendor.com
2. response: 302 redirected to http://www.doubleclick.com?vendor.com/page2.html
3. request: GET http://www.doubleclick.com?http://www.vendor.com
4. response: set-cookie, redirect to http://www.vendor.com/page2
5. request: GET http://www.vendor.com/page2

Now the request in step 3 (and the response in step 4) are consider as going to 
the original host.  We have to do that, otherwise if a cookie-setting site has 
really moved and issues a redirect, all accesses to that site through the old 
URL will treat their cookies as third-party cookies.
(Assignee)

Comment 36

16 years ago
-> 0.9.8
Priority: -- → P2
Target Milestone: mozilla1.0 → mozilla0.9.8
(Reporter)

Comment 37

16 years ago
I'll give an r=morse for darin's 1-8-02 patch, as far as it goes.  Of course it 
still needs a corresponding patch for ibpr0n as he indicated.
(Reporter)

Comment 38

16 years ago
Comment on attachment 64052 [details] [diff] [review]
http changes to make using nsIHttpChannel::GetReferrer work

r=morse
Attachment #64052 - Flags: review+
(Assignee)

Comment 39

16 years ago
Created attachment 64412 [details] [diff] [review]
imagelib patch

this patch is needed in addition to the http patch.  it ensures that the http
channel has a referrer set even if there isn't a loadgroup, which is actually a
separate bug altogether.

pavlov: can you review this patch?
Attachment #63763 - Attachment is obsolete: true
(Assignee)

Updated

16 years ago
Keywords: patch
(Reporter)

Comment 40

16 years ago
I applied the patches, then tried a test and got the crash indicated below.  The 
test consisted of going to the following site:

   http://ad.doubleclick.net/ad/N2870.ex/B42626.21;sz=468x60;ord=?

Specifically the crash is occuring in the destructor for imgLoader in 
imgLoader.cpp

imgLoader::~imgLoader() line 90 + 10 bytes
imgLoader::`scalar deleting destructor'(unsigned int 48714112) + 15 bytes
ImageListener::OnStartRequest(ImageListener * const 0x040f3b80, nsIRequest * 
0x040eb630, nsISupports * 0x00000000) line 178 + 96 bytes
nsDocumentOpenInfo::OnStartRequest(nsDocumentOpenInfo * const 0x040e6b60, 
nsIRequest * 0x040eb630, nsISupports * 0x00000000) line 228 + 34 bytes
nsHttpChannel::OnStartRequest(nsHttpChannel * const 0x040eb634, nsIRequest * 
0x040905e4, nsISupports * 0x00000000) line 2376 + 45 bytes
nsOnStartRequestEvent::HandleEvent() line 161 + 53 bytes
nsARequestObserverEvent::HandlePLEvent(PLEvent * 0x040e1934) line 116
PL_HandleEvent(PLEvent * 0x040e1934) line 590 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00a7e1e0) line 520 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x002905fc, unsigned int 49526, unsigned int 0, 
long 11002336) line 1071 + 9 bytes
USER32! 77e71268()
00a7e1e0()
(Assignee)

Comment 41

16 years ago
morse: thanks for the feedback... i'll investigate this right away.
(Reporter)

Comment 42

16 years ago
Ignore my comment above about a crash.  It was my error -- I forgot to do a 
clobber build, which was necessary because the interface changed.  I'm no longer 
seeing the crash now that I've done the clobber build.

I just did some preliminary testing and can report the following:

1. Page itself: referer is <null> which is correct
2. Image on page: referrer is the page which is correct
3. Frame on page: referrer is the page which is correct
4. Image in frame: referrer is the frame -- incorrect, should be the page

So case 4 will fail the third-party cookie test.  That's something we can live 
with if we have to, but it would be nice if we can detect this case.
(Assignee)

Comment 43

16 years ago
why is it incorrect?  you can enclose any website in a frame... eg. click on a
link from a hotmail account.  the toplevel page served up is in the hotmail.com
domain, with one of the frames containing the website of the link the user
clicked on.  in this case is it correct to consider the website the third-party?
 seems like it should always be the frame that is the first party.

Comment 44

16 years ago
Frames, in line or not, are subcontents. If I go to cnn.com and they have
ads.aol.com frame/iframes, I don't want cookies to be sent/accepted to those
clearly third party servers.

Yes, this affects crappy sites that open up off-site links in their own
frameset, but that was always awful UI anyway, usually used so they can continue
serving you ads as you surf elsewhere. Open link in new window/tab will allow it
to be first-party.

Comment 45

16 years ago
Comment on attachment 64412 [details] [diff] [review]
imagelib patch

I hate to do this, but can aReferer be the 2nd param (after aURI)?  
other than that, it looks fine :-)
(Reporter)

Comment 46

16 years ago
I think that Dolan is saying it right, but let me clarify a bit.

Case 1:

   page is from www.firstparty.com
   frame on page is from www.firstparty.com/secondpage

   Here I want referrer to be www.firstparty.com.  The cookie code will test for
   domain inclusion, find that the frame is in the right domain, and consider
   any cookies that the frame sets to be first-party cookies.

Case 2:

   page is from www.firstparty.com
   frame on page is from www.doubleclick.com

   Again I want referrer to be www.firstparty.com.  When cookie code makes its
   domain test, it will find that the frame is not in the same domain and
   consider the frames cookies to be third party.

But all that is working correctly.  What is not working is when there are images 
in the frames.

Case 3:

   page is from www.firstparty.com
   frame on page is from www.firstparty.com/secondpage
   frame contains image from www.firstparty.com/secondpage/image.jpg

   Now I still want the referrer to be www.firstparty.com.  Again the cookie
   test for domain inclusion will pass and the cookie set by image.jpg will
   be considered as first party.

   In this case you are giving me the wrong referrer (namely
   www.firstparty.com/page2) but that doesn't affect the result of the cookie
   test.

Case 4:

   page is from www.firstparty.com
   frame on page is from www.firstparty.com/secondpage
   frame contains image from www.doubleclick.com/image.jpg

   Once again I want referrer to be www.firstparty.com but you are giving me
   www.firstparty.com/secondpage instead.  But once again it won't affect the
   outcome of the cookie test -- I will detect the wrong domain and consider
   cookies from doubleclick to be third party.

But now for the clincher.

Case 5

   page is from www.firstparty.com
   frame on page is from www.doubleclick.com
   frame contains image from www.doubleclick.com/image.jpg

   Again I want the referrer to be www.firstparty.com but you are giving me
   www.doubleclick.com instead.  So when image.jpg goes to set a cookie, my
   cookie test will report that it is in the same domain and consider it to
   be a first-party cookie.  THIS IS THE ERROR!
(Assignee)

Comment 47

16 years ago
pavlov: no problem.. the order of the parameters can easily be changed ;-)  i'm
more concerned about the fact that it seems like the referrer is not giving us
what we need/want.
(Assignee)

Comment 48

16 years ago
morse: can you try backing out the imagelib portion of this patch and see what
that changes?  i'm wondering if the default load request matches the baseURL
that i'm passing into the imagelib.  thx!
(Reporter)

Comment 49

16 years ago
Here's the result of backing out the imagelib portion of the patch.  I didn't 
actually back out the patch (because that would have required a new 
clobberbuild) but instead bypassed the new processing that it would do.  Namely 
in imgLoader.cpp I changed

   if (aReferrer)

to

   if (0)

Here are the original results with the new results being indicated in angle 
brackets.

1. Page itself: referer is <null> which is correct
   <<now referrer is the page itself.  I can live with that>>

2. Image on page: referrer is the page which is correct
   <<same result>>

3. Frame on page: referrer is the page which is correct
   <<same result>>

4. Image in frame: referrer is the frame -- incorrect, should be the page
   <<same result -- still incorrect>>
(Assignee)

Comment 50

16 years ago
OK, then we need extra help from outside necko/imagelib to determine the
toplevel document URL.  anyone know how to get at this?
(Assignee)

Comment 51

16 years ago
-> 0.9.9 (looks like we're not going to have a solution for this by 0.9.8)
Target Milestone: mozilla0.9.8 → mozilla0.9.9

Updated

16 years ago
Keywords: nsbeta1+
(Assignee)

Comment 52

16 years ago
-> 1.0
Target Milestone: mozilla0.9.9 → mozilla1.0
(Reporter)

Comment 53

16 years ago
Trying to get this one back on track.  Looks like darin's patches hit a dead 
end.  So I'm resurecting my last patch and taking it a few steps further.  
Specifically it now allows for the recognition of third-party cookies in frames.

In fact, it now recognizes all of the following cookie-setting cases correctly:

1. Cookie set by server of page (first party)
2. Cookie set by server of image in page (third party)
3. Cookie set by server of frame in page (third party)
4. Cookie set by server of image in frame in page (third party)
5. Cookie set by redirected server (first party)

The only cases for which I am not yet recognizing third-party cookies are 
javascript cookies and cookies set with meta tags.
(Reporter)

Comment 54

16 years ago
Created attachment 70246 [details] [diff] [review]
recognizes third-party from images and frames correctly
Attachment #64052 - Attachment is obsolete: true
Attachment #64412 - Attachment is obsolete: true
(Reporter)

Comment 55

16 years ago
Posting a better patch.  This patch allows the cookie module to determine 
third-party cookies in all of the above cases including javascript cookies and 
meta-tag cookies.
(Reporter)

Comment 56

16 years ago
Created attachment 70477 [details] [diff] [review]
recognizes third-party cookies correctly in all cases
Attachment #70246 - Attachment is obsolete: true
(Assignee)

Comment 57

16 years ago
morse: looks like you're trying to associate a document URI with every http
channel.  i think that is probably the right thing to do, since the referrer
clearly doesn't correspond to the toplevel document URI.  it's unfortuate that
we have to add extra parameters to nsIDocShell::LoadURI and imgILoader::LoadImage,
but i'm not sure what else we can do... rpotts? pavlov?  you guys "ok" with
these API changes?
(Reporter)

Comment 58

16 years ago
See companion bug 87388 and the patch in the cookie module that I just posted 
in there.  That bug is really the one that recognizes third-party cookies.  This 
patch more accurately creates the infrastructure that the cookie module uses.

To test this out, apply both patches (the one here and the one in bug 87388), 
then run the tests at either of the following and note the printf output on the 
consule

   http://warp/u/morse/cookies/  -- inside netscape firewall
   http://people.netscape.com/morse/cookies/ -- outside firewall

The reason I am giving an address inside the firewall is that there is currently 
an error in the external version and the correction that I posted for it won't 
be visible outside the firewall until tomorrow.

Note: before running the test, you need to change DEBUG_morse to just DEBUG in 
the patch in bug 87388 (nsCookies.cpp).

Comment 59

16 years ago
please add the parentURI after the aURI param in LoadImage.. other than that, i
have no problem adding the extra param to loadImage
(Reporter)

Comment 60

16 years ago
Created attachment 70621 [details] [diff] [review]
satisfy pavlov's concern
(Reporter)

Comment 61

16 years ago
Comment on attachment 70621 [details] [diff] [review]
satisfy pavlov's concern

sr=dveditz (from private e-mail)
Attachment #70621 - Flags: superreview+
(Assignee)

Comment 62

16 years ago
Comment on attachment 70621 [details] [diff] [review]
satisfy pavlov's concern

some comments on the HTTP changes...


>Index: nsHTMLDocument.h
>+  nsIHttpChannel*     mHttpChannel;

nsCOMPtr??


>Index: nsIHttpChannel.idl

>+    /* keeping track of the document uri */
>+    readonly attribute nsIURI documentURI;
>+    void setDocumentURI(in nsIURI documentURI);

please change this to:

   /**
    * An http channel can own a reference to the document URI
    */
   attribute nsIURI documentURI;


>Index: nsHttpChannel.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/netwerk/protocol/http/src/nsHttpChannel.cpp,v
>retrieving revision 1.86
>diff -u -r1.86 nsHttpChannel.cpp
>--- nsHttpChannel.cpp	6 Feb 2002 08:41:16 -0000	1.86
>+++ nsHttpChannel.cpp	21 Feb 2002 00:47:36 -0000
>@@ -115,6 +115,7 @@
> 
>     mURI = uri;
>     mOriginalURI = uri;
>+    mDocumentURI = uri;

is this really correct?  shouldn't it be initialized to NULL to indicate that 
the document URI is unknown?


>@@ -1168,6 +1165,7 @@

>+    nsCOMPtr<nsIURI> newURI = nsnull;

no need to initialize a nsCOMPtr.


>@@ -1229,6 +1226,9 @@
> 
>     nsCOMPtr<nsIHttpChannel> httpChannel = do_QueryInterface(newChannel);
>     if (httpChannel) {
>+        // update the DocumentURI indicator since we were just redirected
>+        if (newURI)
>+            httpChannel->SetDocumentURI(newURI);

image loads can be redirected too.  why would the document URI be the newURI?
shouldn't this only happen if mURI == mDocumentURI?

>@@ -2014,6 +2014,10 @@

>+    // Notify nsIHttpNotify implementations
>+    rv = nsHttpHandler::get()->OnModifyRequest(this);
>+    NS_ASSERTION(NS_SUCCEEDED(rv), "OnModifyRequest failed");

I'm somewhat worried about this since it changes the contract about when
OnModifyRequest is called.  before it was called at initialization time,
allowing someone who creates a channel to see what the default set of
headers would be.  now they won't get a chance to intercept these at all.

of course i understand that why this change is being made.  it seems like
a channel should be initialized w/ a document URI but i don't think we want
to add that to nsIProtocolHandler::newChannel :-P

so, i guess this is the way it has to be.  not great, but we can live with
it.
(Reporter)

Comment 63

16 years ago
Attaching patch that addresses the 2nd through 5th items in darin's list.  
However I have not done anything about the following because I'm not sure what 
was meant.  Can you be more specific.

>Index: nsHTMLDocument.h
>+  nsIHttpChannel*     mHttpChannel;

nsCOMPtr??



(Reporter)

Comment 64

16 years ago
Created attachment 70737 [details] [diff] [review]
addresses darin's issues in comment #62
Attachment #70477 - Attachment is obsolete: true
Attachment #70621 - Attachment is obsolete: true
(Assignee)

Comment 65

16 years ago
morse: sorry, i meant: "should that be stored using a nsCOMPtr<nsIHttpChannel>
instead?"  is it referenced elsewhere?  how did you get a nsIHttpChannel pointer
without an owning reference?  etc...
(Reporter)

Comment 66

16 years ago
Created attachment 70912 [details] [diff] [review]
addresses darin's point in comment #65
(Reporter)

Updated

16 years ago
Attachment #70737 - Attachment is obsolete: true
(Assignee)

Comment 67

16 years ago
>Index: nsHTMLDocument.cpp

>+    mHttpChannel(nsnull),

nit: no need to initialize a nsCOMPtr


>Index: nsIHttpChannel.idl

>+   /**
>+    * An http channel can own a reference to the document URI
>+    */
>+   attribute nsIURI documentURI;

nit: indentation off by one space.


>Index: nsHttpChannel.cpp


r=darin on the http related changes.
(Reporter)

Comment 68

16 years ago
Created attachment 70990 [details] [diff] [review]
addresses darin's last two points in comment #67
Attachment #70912 - Attachment is obsolete: true

Comment 69

16 years ago
Comment on attachment 70990 [details] [diff] [review]
addresses darin's last two points in comment #67

instead of passing in a new boolean for firstParty, should that just be made an
extra load flag? the load flag parameter already aggregates various properties
about the load current load. 

Or should we not mix the two together? Rick might have an opinion on this.

Comment 70

16 years ago
hey steve,

i'm ok with adding an extra argument to nsIDocShell::LoadURI(...) and friends...  

i agree with mscott that it would be nicer to have it passed as an additional
load flag.  however, currently, the load flags withing the docshell are *really*
messed up -- there are 3 different versions, so making this change is NOT trivial !!

i'm happy adding the extra argument for now, and then changing it to a load flag
when i attach 'load flag unification' within the docshell ;-)

my one concern is whether the 'firstParty' information needs to be recorded for
sesion history loads?  it doesn't look like it is part of the session history
info...  Could you get the wrong info if the page is loaded via the BACK button??

-- rick
(Reporter)

Comment 71

16 years ago
Rick,

I just took a wild stab at the value for session history, mainly because I 
didn't understand what it was doing relative to cookies.  But now that you point 
out that it has to do with hitting the back button, then I guess we need for it 
to have the same first/third party indicator that it had initially.  Of course 
we don't save that.  So the wise thing to do here is to make it third-party so 
it can't set any cookies that it couldn't set the first time it was loaded (of 
course the cookies that it could set the first time will still be there so 
there's no need to set additional ones).  There might be hypothetical cases in 
which this is not correct, but it's not worth worrying abou them.

So consider the patch to nsSHistory.cpp to have PR_FALSE in instead of PR_TRUE.

Comment 72

16 years ago
Comment on attachment 70990 [details] [diff] [review]
addresses darin's last two points in comment #67

wfm :-)
r=rpotts@netscape.com
Attachment #70990 - Flags: review+
Comment on attachment 70990 [details] [diff] [review]
addresses darin's last two points in comment #67

sr=dveditz
Attachment #70990 - Flags: superreview+

Comment 74

16 years ago
Comment on attachment 70990 [details] [diff] [review]
addresses darin's last two points in comment #67

a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #70990 - Flags: approval+
(Drive-by nit: shouldn't that parameter be named aFirstParty, not firstParty?)

/be
(Reporter)

Comment 76

16 years ago
Fix checked in.
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED

Comment 77

16 years ago
qa to tom.

This mostly affects networking features that he tests.
QA Contact: benc → tever

Comment 78

16 years ago
verified - checked using lxr that patch is checked in 
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.