Closed Bug 537207 Opened 15 years ago Closed 5 years ago

Trailing "/" in path attribute not handled like other browsers

Categories

(Core :: Networking: Cookies, defect, P3)

defect

Tracking

()

RESOLVED INVALID

People

(Reporter: abarth-mozilla, Unassigned)

References

Details

(Whiteboard: [necko-backlog])

Attachments

(1 file)

== Trailing "/" in path attribute ==

http://home.example.org:8888/cookie-parser?path0014: FAIL

Set-Cookie: foo=bar; path=/cookie-parser-result/foo/qux/
and then try to read the cookie from /cookie-parser-result/foo/qux?path0014

Expected behavior: The cookie is not sent
Passing browsers: IE, Safari, Chrome, Opera
Failing browsers: Firefox

Recommendation: Change Firefox to match the other browsers
Attached patch work-in-progressSplinter Review
Fixing this bug turns out to be trickier than expected.  The problem is that the SQL database represents default paths with a trailing "/", which makes it hard to distinguish a default path from an explicit path attribute with a trailing slash.

I need to write some more tests to probe the data model for default paths.  We might need to write a schema migration to convert old cookies with an extra trailing slash.
Comment on attachment 419532 [details] [diff] [review]
work-in-progress

>@@ -2131,25 +2112,25 @@
>   if (aCookieAttributes.path.IsEmpty()) {
>     // strip down everything after the last slash to get the path,
>     // ignoring slashes in the query string part.
>     // if we can QI to nsIURL, that'll take care of the query string portion.
>     // otherwise, it's not an nsIURL and can't have a query string, so just find the last slash.
>     nsCOMPtr<nsIURL> hostURL = do_QueryInterface(aHostURI);
>     if (hostURL) {
>       hostURL->GetDirectory(aCookieAttributes.path);

Do you know whether GetDirectory() will include the trailing '/'? It sounds like we want it to.

> PRBool
>+nsCookieService::PathCanReadCookie(const nsCString &aPath, nsCookie *aCookie)
>+{
>+    PRUint32 cookiePathLen = aCookie->Path().Length();
>+
>+    // if the nsIURI path is shorter than the cookie path, don't send it back
>+    if (!StringBeginsWith(aPath, aCookie->Path()))
>+      return PR_FALSE;
>+
>+    if (aPath.Length() == cookiePathLen)
>+      return PR_TRUE; // The two are identical.
>+
>+    if (cookiePathLen > 0 &&
>+        ispathdelimiter(aCookie->Path().CharAt(cookiePathLen - 1)))
>+      return PR_TRUE;

Hmm. Do we really want to test ispathdelimiter() here, or just '/'? In other words - do other browsers accept 'path=foo?' and send it back to 'http://qux.baz/foo?bar' but not 'http://qux.baz/foo/bar'? I guess ispathdelimiter() is safe, but I'm curious.

Also, can use aCookie->Path().Last() for syntactic sugar.

A comment above this case would be nice, for clarity - "If the cookie path already terminates with a path delimiter, we're done", and for the following case, "Otherwise, make sure the next char in the URI path is a delimiter."

>+    if (aPath.Length() > cookiePathLen &&
>+        ispathdelimiter(aPath.CharAt(cookiePathLen))) {

No need to check aPath.Length() here, because we know it's longer than cookiePathLen.

>+      /*
>+       * |ispathdelimiter| tests four cases: '/', '?', '#', and ';'.
>+       * '/' is the "standard" case; the '?' test allows a site at host/abc?def
>+       * to receive a cookie that has a path attribute of abc.  this seems
>+       * strange but at least one major site (citibank, bug 156725) depends
>+       * on it.  The test for # and ; are put in to proactively avoid problems
>+       * with other sites - these are the only other chars allowed in the path.
>+       */

As above, I also wonder what other browsers do for '#' and ';', but that's not critical to this patch. :)

wrt the database, I wouldn't worry about touching it at all. The only case where things will break is where a) the cookie has a default path with a trailing '/' and b) the site wanted that to path-match a '?' terminator. Is that right? In which case, the cookie simply won't get sent, the site will re-set it, and the old cookie will eventually get evicted from the DB. (Although the two may collide for other URI paths.) Such cases should be quite rare, no?
(In reply to comment #2)
> Hmm. Do we really want to test ispathdelimiter() here, or just '/'? In other
> words - do other browsers accept 'path=foo?' and send it back to
> 'http://qux.baz/foo?bar' but not 'http://qux.baz/foo/bar'?

Whoops, scratch the 'foo/bar' one.
> Do you know whether GetDirectory() will include the trailing '/'? It sounds
> like we want it to.

I'll investigate this.

> >+    if (cookiePathLen > 0 &&
> >+        ispathdelimiter(aCookie->Path().CharAt(cookiePathLen - 1)))
> >+      return PR_TRUE;
> 
> Hmm. Do we really want to test ispathdelimiter() here, or just '/'? In other
> words - do other browsers accept 'path=foo?' and send it back to
> 'http://qux.baz/foo?bar' but not 'http://qux.baz/foo/bar'? I guess
> ispathdelimiter() is safe, but I'm curious.

We probably just want '/'.  The only browser that seems to treat ? and # specially in the path attribute is IE, but it thinks that ? and # terminate the path even when embedded in the middle of the attribute.

> >+      /*
> >+       * |ispathdelimiter| tests four cases: '/', '?', '#', and ';'.
> >+       * '/' is the "standard" case; the '?' test allows a site at host/abc?def
> >+       * to receive a cookie that has a path attribute of abc.  this seems
> >+       * strange but at least one major site (citibank, bug 156725) depends
> >+       * on it.  The test for # and ; are put in to proactively avoid problems
> >+       * with other sites - these are the only other chars allowed in the path.
> >+       */
> 
> As above, I also wonder what other browsers do for '#' and ';', but that's not
> critical to this patch. :)

I can test in more detail, but the intent of this code is correct, it's just a strange place to put the logic.  This code is about pulling out the "path" part of a URL.  I'd expect that to be handled by nsIURI.  Maybe we're not calling the right nsIURI API?  It seems odd that nsIURI::GetPath would return "foo?bar" from "http://example.com/foo?bar".  I'd expect the path to be "foo" and the query to be "bar".

> wrt the database, I wouldn't worry about touching it at all. The only case
> where things will break is where a) the cookie has a default path with a
> trailing '/' and b) the site wanted that to path-match a '?' terminator. Is
> that right?

The case I'm worried about is that you have page at a URL like:

http://example.com/foo/bar/

that sets a cooke:

Set-Cookie: baz=qux; Max-Age=39489

The old database will have a cookie with a path of "/foo/bar/" and we'll add a new cookie with a path of "/foo/bar".  Because the paths are different, the new cookie won't overwrite the old cookie, and the site will receive both cookies until the first one expires.
(In reply to comment #4)
> We probably just want '/'.  The only browser that seems to treat ? and #
> specially in the path attribute is IE, but it thinks that ? and # terminate the
> path even when embedded in the middle of the attribute.

Should we be checking the path attribute for these chars and truncating/rejecting it? I guess you're saying the other browsers don't.

> I can test in more detail, but the intent of this code is correct, it's just a
> strange place to put the logic.  This code is about pulling out the "path" part
> of a URL.

See http://mxr.mozilla.org/mozilla-central/source/netwerk/cookie/src/nsCookieService.cpp#2132 ... we could do something like that.

> The old database will have a cookie with a path of "/foo/bar/" and we'll add a
> new cookie with a path of "/foo/bar".  Because the paths are different, the new
> cookie won't overwrite the old cookie, and the site will receive both cookies
> until the first one expires.

Oh, I misread this change. :(

-        aCookieAttributes.path.Truncate(slash + 1);
+        aCookieAttributes.path.Truncate(slash);

Why are we taking the slash off? (And I suppose we'd need to explicitly add a slash in the hostURL->GetDirectory() case a few lines above, if necessary.) Leaving it on solves the above problem, but means that a cookie set from 'http://baz.com/foo/bar/' wouldn't get sent to 'http://baz.com/foo/bar?baz'. Is that a big deal?
> Why are we taking the slash off?

I can test to to confirm my understanding, but consider the following cases:

1) From http://example.com/foo/bar/ we receive the following header:

Set-Cookie: baz1=qux

2) From http://example.com/foo/bar/ we receive the following header:

Set-Cookie: baz2=qux; Path=/foo/bar/

Now, what cookie gets sent to "http://example.com/foo/bar" ?  I think we're supposed to send baz1 but not baz2.

Also, what happens when we generate a cookie string for "http://example.com/foo/bar/duck" ?  Again, I need to test, but I think we're supposed to send both baz1 and baz2.  The current implementation will clobber baz1 with baz2.
(Oops, the clobbering only takes place if the two cookies have the same name.)
With the schema migration, things are going to bust no matter what we do. Let's say we remove all trailing slashes. For cookies with a non-default path with a trailing slash, the possibility that we'll send it in slightly more cases probably isn't a big deal.

But for a site to modify or delete the cookie, it won't be able to do so since the path is different from what it expects. So it will either continue getting the cookie back after it's tried to delete it, or end up with two cookies. In the latter case, the new cookie will be sorted and sent before the old cookie, so this *might* not break much. I'm guessing the former case isn't such a big deal either, since the site will eventually end up setting a new cookie, and we're back to the latter case.

For cookies with a default path, it's really the same deal as above, just that we're generating the path instead of the site.

If we don't remove slashes, then those existing cookies with a default path will get sorted and sent back before the new ones. And things will definitely break.

In my cookies.sqlite, I have about 2600 cookies, with only a handful of non-'/' paths. Let's say most of these have been set with an explicit 'path=/', since I'm guessing that most URLs do not have a root path. (I might be wrong!) Then we could migrate by leaving '/' paths alone, and removing slashes for everything else. What do you think?

We could get smarter. Add logic in AddInternal() to ignore trailing slashes when comparing paths, so cookies can be expired or replaced properly. Wouldn't break much at all, I think, and we could remove the code once most of our users are on 3.7. I like this option better!
Whiteboard: [necko-backlog]
See Also: → 1255938
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3

This bug is invalid. Firefox passes WPT path0014.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: