data URL with hash - content doesn't match location.

VERIFIED FIXED in mozilla6



6 years ago
3 days ago


(Reporter: John P Baker, Assigned: dholbert)


(Blocks: 1 bug, {addon-compat, regression})

addon-compat, regression
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags



(Whiteboard: bz nominated near comment 3, URL)


(3 attachments, 3 obsolete attachments)



6 years ago
1. Go to data:text/html,<h1 id="a">AA</h1>#a
2. Content shows |AA #a|
3. Location bar shows |data:text/html,<h1 id="a">AA</h1>|

Mozilla/5.0 (Windows NT 5.0; rv:6.0a1) Gecko/20110522 Firefox/6.0a1
This also happens when I use character reference (e.g., &#x41;) in data URL.
Per RFC 3986, URIs can contain '#' characters even if they start with scheme.
> URI-reference = URI / relative-ref
> URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
> relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
RFC 2397 should be considered as an explanation about the absolute-URI (not URI) syntax for the data scheme per RFC 3986 4.3:
>      absolute-URI  = scheme ":" hier-part [ "?" query ]
>   URI scheme specifications must define their own syntax so that all
>   strings matching their scheme-specific syntax will also match the
>   <absolute-URI> grammar.  Scheme specifications will not define
>   fragment identifier syntax or usage, regardless of its applicability
>   to resources identifiable via that scheme, as fragment identification
>   is orthogonal to scheme definition.

> if (aBaseURI && !spec.IsEmpty() && spec[0] == '#') {
this test should be:
> if (aBaseURI && spec.Find('#') != kNotFound) {
And the following clause should be rewritten in response to this change.
No, that test is fine.  But we do need to change the "absolute data uri" parser to stop at '#' and toss the rest into the ref.  Daniel, do you want to do that?

That said, why the heck is the location bar not showing the '#a' part?  That's really weird...
tracking-firefox6: --- → ?
Keywords: regression
Yeah, I can look into this.
Component: General → Networking
OS: Windows XP → All
QA Contact: general → networking
Hardware: x86 → All
(side note, not really related to fixing this: if I paste the data URL into my location bar, hit enter, paste it again, hit enter again, THEN the "#a" shows up at the end of the URLbar.  That's probably just because we hit a newURL.equalsExceptRef(currentURL) special-case somewhere.)
Created attachment 534530 [details] [diff] [review]
wip fix

Here's a WIP fix, basically doing what bz suggested in comment 3. It fixes this part:
> 2. Content shows |AA #a|
but not this part:
> 3. Location bar shows |data:text/html,<h1 id="a">AA</h1>|

(also, interstingly, *after* you've got '#a' showing up in comment 5 -- if you switch tabs away & back, the #a disappears)
Assignee: nobody → dholbert
As noted with an XXXdholbert comment in the wip fix -- I'm not immediately sure what we want to do with the now-parsed-out #ref in nsDataChannel::OpenContentStream().
> but not this part:

Any idea why?  That's really odd...
Figured it out -- nsSimpleURI::Clone fails to copy mIsRefValid right now.  So, clones think they don't have a .ref.  New patch coming soon with that fixed.

(In reply to comment #1)
> This also happens when I use character reference (e.g., &#x41;) in data URL.

@masayuki:  That's a different issue, I think.  If you need to include a "#" character in a data URL, you'll need to encode it as "%23" (as is the case in any normal URL).  If that doesn't work, please file a bug on that.
Created attachment 534594 [details] [diff] [review]
patch 1: fix

Here's the fix. Overview of what it does:
 - Teaches nsDataHandler::ParseURI about #ref
 - Transfers mIsRefValid flag in nsSimpleURI::Clone (per prev comment)
 - Maintains (and asserts) the invariant that whenever we clear mIsRefValid, we also clear mRef.  (Not strictly necessary, but good for sanity & for memory savings)
Created attachment 534597 [details] [diff] [review]
patch 2: tests

This tests patch includes the reftest from before (making sure that a data URI with #ref appended doesn't render any differently), along with an addition to xpcshell test test_URIs.js for the clone() issue from comment 9.

I verified that both new tests fail with no patch & pass with the patch.

(In the interest of thoroughness, I also extended the testcases in test_URIs.js a few additional URIs that I encountered while debugging this.)
Attachment #534530 - Attachment is obsolete: true
Attachment #534597 - Flags: review?(bzbarsky)
Comment on attachment 534594 [details] [diff] [review]
patch 1: fix

Note that the fix still has this in nsDataChannel::OpenContentStream:
>      rv = nsDataHandler::ParseURI(spec, contentType, contentCharset,
> -                                 lBase64, dataBuffer);
> +                                 lBase64, dataBuffer, hashRef);
> +
> +    // XXXdholbert What should we do with 'hashRef' here?

IIUC, this is the point at which we actually decode the data portion of the data URI.  In a "real" URL, we'd want to treat hashRef as a "seek-to-this-point-in-the-document" indicator.  So if we ignore it here, that presumably won't work. (but I don't think it did before, and I don't think much else would break...?)  So perhaps this part can be handled in a follow (if there's anything that even needs handling?)
Attachment #534594 - Flags: review?(bzbarsky)
(In reply to comment #12)
> So perhaps this part can be handled in a
> follow (if there's anything that even needs handling?)

er s/follow/followup/
BTW, TryServer says the patches for this bug + bug 659177 currently fail some mochitest-oth tests, including test_moz_document_rules.html[1] -- I'll start investigating those failures after my local rebuild finishes.

(If bz gets to this review before I figure those out[2], I can post the test-failures-fix as a separate patch, if that helps.)


[2] (I'll be AFK for a chunk of tonight at my brother's musical performance; should be back around 11pm PDT.  Will do my best to figure out the tryserver orange before that starts...)
Comment on attachment 534594 [details] [diff] [review]
patch 1: fix

>+    // XXXdholbert What should we do with 'hashRef' here?

Absolutely nothing.  The scroll-to-ref code just asks the URI for the ref directly elsewhere.

Which means I don't think we need to return the hashRef at all here. Just make sure to not include it in dataBuffer.

Note that I don't think you need the separate SetRef call in NewURI, since SetSpec should do the right thing, I would think.

r=me with those changes.
Attachment #534594 - Flags: review?(bzbarsky) → review+
test_moz_document_rules does this:

49         var rule = "@-moz-document " + urltests +
50                    " { #display { z-index: " + zIndex + " } }";
51         var sheeturi = "data:text/css," + encodeURI(rule);

and JS encodeURI does not escape '#'.  So this test just needs to be fixed to do that.

Similar issues in browser/base/content/test/browser_bug517902.js and browser/base/content/test/tabview/browser_tabview_bug594958.js
Comment on attachment 534597 [details] [diff] [review]
patch 2: tests

Don't you want about:blank rather than moz-safe-about:blank here?  I think you do.

r=me with that, assuming it still tests what you want to test.  If not, why not?
Attachment #534597 - Flags: review?(bzbarsky) → review+
(In reply to comment #17)
> Don't you want about:blank rather than moz-safe-about:blank here?  I think
> you do.

I already have about:foo in that test (to cover all "about:xxx" type URLs), so about:blank is basically already tested there.

I added "moz-safe-about:blank" because "moz-safe-about:" was a new type of scheme that I hadn't seen before, until today.

> and JS encodeURI does not escape '#'.  So this test just needs to be fixed to do that.

Ah! Great, thanks.  (particularly because I didn't end up getting very far since comment 14 - sadly my home machine has gotten really slow at doing rebuilds for some reason...  Need to look into that)

I'll be sure to fix those tests up before landing.
moz-safe-about is used as the inner URI scheme for safe about: URIs like "about:blank" and a few others.  It's NOT used for random unknown about:foobar URIs.  I'm actually sort of surprised newURI succeeds for "about:foobar"...

Point being that testing "about:blank" should test whatever you wanted tested there.
Landed, starting with a "patch 0" that just fixes up some existing tests (s/#/%23/ in data URIs) to fix the try failures from comment 14 (inferring rubber-stamp=bz on that test-only patch, based on comment 16.)

patch 0:
patch 1:
patch 2:
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla6
Backed out due to orange on Windows Reftests.
Resolution: FIXED → ---
(see bug 623423 comment 6 regarding the orange)
Figured out the orange. We need one more s/#/%23/-within-data-URI fix:
>37 <?xml-stylesheet type="text/css" href="data:text/css,
>39 #_box_windowsDefaultTheme:-moz-system-metric(windows-default-theme) {
>40   display: none;
>41 }
Created attachment 534759 [details] [diff] [review]
patch 0: fix existing tests (and now also reftest.xul) (for checkin)

Here's the same version of patch 0 that I pushed before, but now with one final s/#/%23/ in the reftest.xul chunk quoted above.
Created attachment 534761 [details] [diff] [review]
patch 1: fix (for checkin)

(and here's the ready-to-land version of patch 1 that I landed before -- posting it here for convenience w/ relanding, since volkmar has graciously offered to re-land the patches here)
Attachment #534594 - Attachment is obsolete: true
Created attachment 534764 [details] [diff] [review]
patch 2: tests (for checkin)

and here's patch 2, ready for checkin (identical to what landed before)
Attachment #534597 - Attachment is obsolete: true
Attachment #534759 - Attachment description: patch 0: fix existing tests (and now also reftest.xul) → patch 0: fix existing tests (and now also reftest.xul) (for checkin)
Last Resolved: 6 years ago6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Blocks: 659466


6 years ago
Depends on: 659650


6 years ago
Whiteboard: bz nominated near comment 3

Comment 28

6 years ago
not going to track this but if you think it's safe, please nominate the patch with a risk analysis.
tracking-firefox6: ? → -
Comment on attachment 534761 [details] [diff] [review]
patch 1: fix (for checkin)

Requesting approval to land on aurora.

 (1) Fixes regression from bug 308590, caused by forgetting to set a flag in one case.
 (2) Makes our data-URI-parsing code consistent about handling "#ref" identifiers on full data URIs.
 (3) Includes tests.

 - Low, though note that this _intentionally_ breaks some existing content.  In particular -- in data URIs that contain a '#' character, everything after the "#" will now be considered a reference, rather than part of the document.  Bug 308590 was already expected to break such content, but it didn't until this bug here was fixed, because of (2) above.

(Such content is easy to fix with s/#/%23/, which I've done for all cases in mozilla-central, in patch 0 here and in bug 659466 .)
Attachment #534761 - Flags: approval-mozilla-aurora?
Comment on attachment 534761 [details] [diff] [review]
patch 1: fix (for checkin)

(oops, I forgot that this is already in aurora, from landing before the branch point.  Canceling approval request.)
Attachment #534761 - Flags: approval-mozilla-aurora?
Mozilla/5.0 (Windows NT 5.1; rv:6.0) Gecko/20100101 Firefox/6.0

Verified issue using FF 6.0b3 on Windows XP, Windows 7, Ubuntu 11.04 and Mac OS X 10.6 ('#a' part is properly shown in the location bar).

Setting resolution to VERIFIED FIXED.
Mentioned on per request in bug 659650.
Keywords: addon-compat
Depends on: 1312050


7 months ago
Depends on: 1312700


3 days ago
Blocks: 1367965
You need to log in before you can comment on or make changes to this bug.