www.foo.com/~bar and www.foo.com/%7Ebar don't properly share history

NEW
Unassigned

Core
Networking
P5
normal
13 years ago
7 months ago

People

(Reporter: Chris Thomas (CTho) [formerly cst@andrew.cmu.edu cst@yecc.com], Unassigned)

Tracking

Trunk
x86
Windows XP

Details

(Whiteboard: [necko-would-take])

Attachments

(1 attachment)

1. Visit http://fly.cc.fer.hr/%7Eunreal/theredbook/
2. Click some links
3. Visit http://fly.cc.fer.hr/~unreal/theredbook/

Actual results:
No links are visited

Expected results:
The links that were clicked should appear visited

This is particularly annoying because hovering over links on the %7E page shows
~ in the status bar, yet loads %7E.

Bug 269751 might be related.

<bz> CTho: file a bug on necko?  Those should be testing equal as URIs, I would
think.

Comment 1

13 years ago
The problem here is that nsStandardURL::Equals does not equate '~' and %7E.
Status: UNCONFIRMED → NEW
Ever confirmed: true

Comment 2

13 years ago
What could possibly be done here (I have not had a good look at
nsStandardURL::Equals for quite some time) is that after parsing the escaped URL
we unescape the parts and then compare them. This would catch this problem but
might get us into a plethora of others.

Comment 3

13 years ago
Yeah... or, we could also make nsStandardURL::Equals be smart about equating
escaped bytes and non-escaped bytes.

Comment 4

13 years ago
But remember sometimes escaping a character is done to mask something that would
otherwise be parsed wrongly or equaled when it is not. Parsing and breaking into
components has to come first in my opinion. Maybe a reason to pull a fresh tree,
will see if my old login still works ...

Comment 5

13 years ago
(In reply to comment #4)
> But remember sometimes escaping a character is done to mask something that 
> would otherwise be parsed wrongly or equaled when it is not. Parsing and 
> breaking into components has to come first in my opinion.

I was only proposing a change to nsStandardURL::Equals -- not to any of our
parsing or componentization logic.  Right now, Equals is a series of strncmp
calls.  Those could be replaced by more intelligent URL char iterators that
unescape as they advance through each URL string.  BTW, I'm not sure why the
current code doesn't just strncmp the entire URL spec.  I suspect the current
code is carry-over from the days when we stored the URL components instead of
the URL string itself.
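
To make the iterator idea concrete, here is a minimal sketch of a comparator that unescapes as it advances through two segment strings.  It is not the nsStandardURL code; the names HexValue, NextByte and SegmentsEqualUnescaped are hypothetical, and it assumes a segment is available as a plain byte string.

```cpp
#include <cstddef>
#include <string_view>

// Value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Returns the next logical byte of seg starting at pos and advances pos.
// A well-formed %XX triplet is decoded into one byte; a '%' that is not
// followed by two hex digits is returned literally.
static unsigned char NextByte(std::string_view seg, std::size_t& pos) {
  if (seg[pos] == '%' && pos + 2 < seg.size()) {
    int hi = HexValue(seg[pos + 1]);
    int lo = HexValue(seg[pos + 2]);
    if (hi >= 0 && lo >= 0) {
      pos += 3;
      return static_cast<unsigned char>(hi * 16 + lo);
    }
  }
  return static_cast<unsigned char>(seg[pos++]);
}

// Hypothetical helper: true when both segments yield the same byte
// sequence after unescaping (each triplet is decoded exactly once).
bool SegmentsEqualUnescaped(std::string_view a, std::string_view b) {
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size()) {
    if (NextByte(a, i) != NextByte(b, j)) return false;
  }
  // Equal only if both segments were consumed completely.
  return i == a.size() && j == b.size();
}
```

With this sketch, SegmentsEqualUnescaped("/~bar", "/%7Ebar") is true.  Note that it also equates every other escaped byte with its literal form, so whether that is acceptable for characters that act as delimiters is a separate decision.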


> Maybe a reason to pull a fresh tree, will see if my old login still works ...

Let me know if you have any trouble! :)

Comment 6

13 years ago
Ah, just after posting that question, I answered it for myself.  It is necessary
to compare the segments so that we ensure that we are parsing the segments
properly.  In fact, now it should be correct and safe to run the comparator that
I was describing over the individual segments.  It would be wrong to do it over
the length of the entire URL spec.  Case in point:

  http://foo.com/bar#baz  !=  http://foo.com/bar%23baz

But, if we run the comparator over the URL segments then we should be OK.  Sound
right?
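
The difference between the two approaches can be sketched as follows.  This is a simplification that assumes only two segments (everything before the ref, and the ref itself); the real class tracks more segments than that, and the names Unescape, WholeSpecEqual, SplitAtRef and SegmentwiseEqual are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decodes every well-formed %XX triplet in s; malformed ones are copied as-is.
std::string Unescape(std::string_view s) {
  std::string out;
  for (std::size_t i = 0; i < s.size(); ++i) {
    if (s[i] == '%' && i + 2 < s.size()) {
      int hi = HexValue(s[i + 1]);
      int lo = HexValue(s[i + 2]);
      if (hi >= 0 && lo >= 0) {
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;
        continue;
      }
    }
    out.push_back(s[i]);
  }
  return out;
}

// Wrong: unescaping the whole spec turns %23 into a '#' delimiter.
bool WholeSpecEqual(std::string_view a, std::string_view b) {
  return Unescape(a) == Unescape(b);
}

struct SplitSpec {
  std::string_view beforeRef;  // everything before the first literal '#'
  std::string_view ref;        // everything after it (empty if none)
  bool hasRef;
};

// Splits on the first literal '#', before any unescaping happens.
SplitSpec SplitAtRef(std::string_view spec) {
  std::size_t hash = spec.find('#');
  if (hash == std::string_view::npos) return {spec, std::string_view(), false};
  return {spec.substr(0, hash), spec.substr(hash + 1), true};
}

// Parse first, then compare each segment after unescaping.
bool SegmentwiseEqual(std::string_view a, std::string_view b) {
  SplitSpec sa = SplitAtRef(a);
  SplitSpec sb = SplitAtRef(b);
  return sa.hasRef == sb.hasRef &&
         Unescape(sa.beforeRef) == Unescape(sb.beforeRef) &&
         Unescape(sa.ref) == Unescape(sb.ref);
}
```

For the pair above, WholeSpecEqual reports a match, while SegmentwiseEqual keeps the two URLs distinct because only one of them has a ref.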

Comment 7

13 years ago
Created attachment 175915 [details] [diff] [review]
v1 patch

Here's a first cut patch that implements the comparator I was thinking of for
URL segments.  It might make better sense to move the function into
nsEscape.{h,cpp} given that I'm reusing a macro from that module.  At any rate,
this doesn't actually fix the bug since I don't think that Mork (or RDF) knows
how to use nsIURI::Equals to compare URI strings.  That's one reason why we
might want to prefer implementing some better normalization algorithm.  We
could perhaps try unescaping segments provided the resulting chars form valid
UTF-8 (or just limit ourselves to unescaping an ASCII subset).
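
As a rough illustration of the ASCII-subset option, the sketch below unescapes only bytes in the RFC 3986 "unreserved" set (letters, digits, '-', '.', '_', '~') and leaves every other escape in place, uppercasing its hex digits.  Choosing that particular subset is an assumption made for this example, not something the patch does, and NormalizeEscapes is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// RFC 3986 unreserved characters: ALPHA / DIGIT / "-" / "." / "_" / "~".
static bool IsUnreserved(unsigned char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' ||
         c == '~';
}

// Hypothetical normalizer: decodes %XX only when the byte is unreserved;
// all other well-formed escapes are kept, with uppercase hex digits.
std::string NormalizeEscapes(std::string_view s) {
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(s.size());
  for (std::size_t i = 0; i < s.size(); ++i) {
    if (s[i] == '%' && i + 2 < s.size()) {
      int hi = HexValue(s[i + 1]);
      int lo = HexValue(s[i + 2]);
      if (hi >= 0 && lo >= 0) {
        unsigned char byte = static_cast<unsigned char>(hi * 16 + lo);
        if (IsUnreserved(byte)) {
          out.push_back(static_cast<char>(byte));
        } else {
          out.push_back('%');
          out.push_back(kHex[hi]);
          out.push_back(kHex[lo]);
        }
        i += 2;
        continue;
      }
    }
    out.push_back(s[i]);  // literal byte or malformed escape
  }
  return out;
}
```

Under this sketch both "/%7Eunreal/theredbook/" and "/~unreal/theredbook/" normalize to the same string, so a consumer that compares raw URI strings would see them as one entry, provided the normalization ran before the string was stored.  Escapes of delimiter characters such as %2F and %23 are left untouched.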

Comment 8

13 years ago
Are %2F and '/' the same in URI filepaths?

Comment 9

13 years ago
Yeah, I wonder about that too.  If so, then it would seem that we should compare
the entire filepath as one.  If not, then we have to do more work to compare
individual file path segments.  I suspect the former is true.

Updated

12 years ago
Assignee: darin → nobody
QA Contact: benc → networking
Whiteboard: [necko-would-take]