Closed Bug 225910 Opened 21 years ago Closed 8 years ago

Remove hand-parsing for refs in content sinks and docshell

Categories

(Core :: Networking, defect, P5)

defect

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: bzbarsky, Assigned: dragana)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [necko-backlog])

Attachments

(1 file, 3 obsolete files)

The content sinks and docshell do this hand-parsing of refs... Now that nsJARURI
implements nsIURL, I do not think that's necessary any longer.  I propose to
remove that code.

Refs:

http://lxr.mozilla.org/seamonkey/source/content/html/document/src/nsHTMLContentSink.cpp#3811
http://lxr.mozilla.org/seamonkey/source/content/xml/document/src/nsXMLContentSink.cpp#883
http://lxr.mozilla.org/seamonkey/source/docshell/base/nsDocShell.cpp#5702

I think I can use getRelativeSpec to simplify the docshell code (if the relative
spec of aURI relative to mCurrentURI starts with a '#', then it just differs by
an anchor).

Thoughts?
Alright, looks like getRelativeSpec will in fact not help me one whit here.  I
may have to temporarily set the refs on mCurrentURI and aURI to nothing,
compare, then unset them back...
Priority: -- → P1
Summary: Remove hand-parsing for refs in content sinks and docshell → [FIX]Remove hand-parsing for refs in content sinks and docshell
Target Milestone: --- → mozilla1.7alpha
Attached patch Something like this (obsolete) — Splinter Review
This should also resolve any case-sensitivity issues we had there...
Comment on attachment 135700 [details] [diff] [review]
Something like this

darin, what do you think?
Attachment #135700 - Flags: review?(darin)
Priority: P1 → --
Target Milestone: mozilla1.7alpha → ---
Oops ... sorry. I didnt intend to clobber the priority and milestone.
Priority: -- → P1
Target Milestone: --- → mozilla1.7alpha
Actually, what I'd really love is if someone with access to a reasonable Tp
setup could run the Tp tests on this patch... that would indicate whether we
could in fact use something like this or whether I should just leave the
image-specific fix I put in and be content.
Er, ignore comment 5.  Wrong bug.  ;)
i presume we do not care about scrolling to named anchors in content generated
from a data: URL or even a wyciwyg: URI?  hmm... wyciwyg could be a real issue, no?
Attachment #135700 - Flags: review?(darin) → review+
For a data: uri, I think that would be incorrect behavior, plain and simple. See
URL field of this bug.

You're right about wyciwyg.  Is there any call to make that an nsIURL?
no, unfortunately wyciwyg just uses nsSimpleURI.
I realize that. The question is whether it should.  ;)
spoke with jst a bit about this... it seems like it might make sense for
nsSimpleURI to implement nsIURL.  in fact, it seems like this whole distinction
between what a URI is and what a URL is is just pretty bogus.  RFC 2396 says
that all URIs should support reference fragments.  section 2.4.3 says this:

  The character "#" is [...] used to delimit a URI from a fragment identifier in
  URI references.

perhaps we can make nsSimpleURI simply not implement the other nsIURL methods. 
this would be consistent with the way nsSimpleURI implements GetHost and SetHost
for example.

it's a shame nsIURI doesn't have the Get/SetRef methods.

btw, what makes you think that data: URIs cannot have a reference fragment?  RFC
2396 says that '#' is a character used as a URI delimiter (part of the "delims"
set).  that implies that it should be escaped if included in the URi itself.

jst also mentioned that he has seen some pages that depend on javascript: URIs
supporting reference fragments.
> it's a shame nsIURI doesn't have the Get/SetRef methods.

Yeah.  :(

> btw, what makes you think that data: URIs cannot have a reference fragment? 

Well, either they can't or our rendering of that testcase in the URL field is
buggy... ;)

> jst also mentioned that he has seen some pages that depend on javascript: URIs
> supporting reference fragments.

Oh, lord...  OK.  Perhaps we should in fact implement nsIURL on nsSimpleURI...
That will involve a sweep of the tree for people who assume that once you QI to
nsIURL you can get things like paths with impunity and no error returns... :(
Perhaps we should just wontfix this, btw, with nice comments explaining why
people shouldn't bother....
Note that the "something like this" patch breaks <a href=#"> -- it treats it as
not having a ref at all.
Blocks: 376535
Summary: [FIX]Remove hand-parsing for refs in content sinks and docshell → Remove hand-parsing for refs in content sinks and docshell
Priority: P1 → P5
Target Milestone: mozilla1.7alpha → ---
(In reply to comment #12)
> > it's a shame nsIURI doesn't have the Get/SetRef methods.
> 
> Yeah.  :(

Now they do! :) (as of bug 308590 landing)

(In reply to comment #11)
> jst also mentioned that he has seen some pages that depend on
> javascript: URIs supporting reference fragments.

They also support Get/SetRef, as of bug 308590 & bug 659698!

I don't know if there's anything that remains to be done here (since this bug has been mostly untouched for 6 years), but perhaps we can use nsIURI::Get/SetRef methods to fix this now?

(FWIW, I stumbled across this bug due to a comment inside of nsContentUtils::SplitURIAtHash mentioning the bug #)
We can't quite use GetRef/SetRef, sadly.  The docshell code needs to be able to tell apart URIs ending in just '#' with nothing after it and URIs with no ref at all, and nsIURI does not expose that difference.  :(

But yes, the docshell code is still doing hand-parsing of IRIs, and I would still love to get rid of it.
(In reply to comment #16)
> We can't quite use GetRef/SetRef, sadly.  The docshell code needs to be able
> to tell apart URIs ending in just '#' with nothing after it and URIs with no
> ref at all, and nsIURI does not expose that difference.  :(

We can tell that pretty easily by inspecting the last character of GetSpec() in that case, though.

Perhaps a helper-method "nsContentUtils::CompareURIs()" that returns an enum value to distinguish base-uri-differs vs. just-the-ref-differs  (vs. uris-exactly-equal) would be useful here?  (see end of bug 662094 comment 1 for more)
Depends on: 677260
Whiteboard: [necko-backlog]
In the current code, this parsing is only done in nsDocShell. I could not find any in parsing in nsHTMLContentSink.cpp or nsXMLContentSink.cpp.

Only use is at:
http://mxr.mozilla.org/mozilla-central/source/docshell/base/nsDocShell.cpp#9982

nsContentUtils::SplitURIAtHash is used now and I am not convinced that changing that to GetRef/GetHasRef would be faster or nicer (ok, we do noth need to parse it multiple times for a single uri). 

Boris, should we change this?
Flags: needinfo?(bzbarsky)
See comment 16 for the issue with GetRef here.  Does GetHasRef let us distinguish between "http://foo" and "http://foo#"?
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] from comment #19)
> See comment 16 for the issue with GetRef here.  Does GetHasRef let us
> distinguish between "http://foo" and "http://foo#"?

Yes, for http://foo# hasRef is true and for http://foo it is false
Attached patch bug_225910_v1.patch (obsolete) — Splinter Review
Assignee: bzbarsky → dd.mozilla
Attachment #135700 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8735117 - Flags: review?(bzbarsky)
Comment on attachment 8735117 [details] [diff] [review]
bug_225910_v1.patch

Please don't ignore failure returns from GetRef/GetHasRef/EqualsExceptRef.  The old code is careful about not assuming nsIURI methods succeed, and even has an explicit comment to that effect.  If those fail, set sameExceptHashes to false.

I'd like to see an updated patch with that fixed.

>+                            ((curURIHasRef != newURIHasRef) || !curHash.Equals(newHash));

That first != is over-parenthesized.

The rest looks good; I hope we have good enough test coverage here and fear we might not...

Note that it seems like we could fix nsGlobalWindow::DispatchAsyncHashchange to do something similar as well, and nix SplitURIAtHash completely.
Attachment #8735117 - Flags: review?(bzbarsky)
Attached patch bug_225910_v2.patch (obsolete) — Splinter Review
Attachment #8736214 - Flags: review?(bzbarsky)
Attachment #8735117 - Attachment is obsolete: true
Comment on attachment 8736214 [details] [diff] [review]
bug_225910_v2.patch

>+    nsresult rvURIOld;

Could you please push this down into the two blocks where it's actually used, so it won't be visible in a scope where it might not even be initialized?

> nsGlobalWindow::DispatchAsyncHashchange(nsIURI *aOldURI, nsIURI *aNewURI)

>+  NS_ENSURE_STATE(NS_SUCCEEDED(aOldURI->GetRef(oldHash)) &&
>+                  NS_SUCCEEDED(aNewURI->GetRef(newHash)) &&
>+                  !oldHash.Equals(newHash));

This looks wrong to me: it will bail out if aOldURI is "http://foo" and aNewURI is "http://foo#", no?  Please check that this either failed our existing tests or add a testcase...
Attachment #8736214 - Flags: review?(bzbarsky) → review+
Attachment #8736214 - Attachment is obsolete: true
Attachment #8737145 - Flags: review+
(In reply to Boris Zbarsky [:bz] from comment #27)
> 
> >+  NS_ENSURE_STATE(NS_SUCCEEDED(aOldURI->GetRef(oldHash)) &&
> >+                  NS_SUCCEEDED(aNewURI->GetRef(newHash)) &&
> >+                  !oldHash.Equals(newHash));
> 
> This looks wrong to me: it will bail out if aOldURI is "http://foo" and
> aNewURI is "http://foo#", no?  Please check that this either failed our
> existing tests or add a testcase...

There is a test for this (sorry I wanted to run it on try but try was closed on that day for a long time)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/170e0ac6b292
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
You need to log in before you can comment on or make changes to this bug.