Last Comment Bug 658949 - data URL with hash - content doesn't match location.
: data URL with hash - content doesn't match location.
Status: VERIFIED FIXED
bz nominated near comment 3
: addon-compat, regression
Product: Core
Classification: Components
Component: Networking (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla6
Assigned To: Daniel Holbert [:dholbert]
:
Mentors:
data:text/html,<h1 id="a">AA</h1>#a
Depends on: 659650
Blocks: 308590 659466
  Show dependency treegraph
 
Reported: 2011-05-23 00:48 PDT by John P Baker
Modified: 2013-12-27 14:20 PST (History)
9 users (show)
mounir: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-


Attachments
wip fix (5.60 KB, patch)
2011-05-23 13:10 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Review
patch 1: fix (8.52 KB, patch)
2011-05-23 15:42 PDT, Daniel Holbert [:dholbert]
bzbarsky: review+
Details | Diff | Review
patch 2: tests (2.84 KB, patch)
2011-05-23 15:46 PDT, Daniel Holbert [:dholbert]
bzbarsky: review+
Details | Diff | Review
patch 0: fix existing tests (and now also reftest.xul) (for checkin) (8.16 KB, patch)
2011-05-24 07:19 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Review
patch 1: fix (for checkin) (8.06 KB, patch)
2011-05-24 07:23 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Review
patch 2: tests (for checkin) (3.30 KB, patch)
2011-05-24 07:25 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Review

Description John P Baker 2011-05-23 00:48:22 PDT
1. Go to data:text/html,<h1 id="a">AA</h1>#a
2. Content shows |AA #a|
3. Location bar shows |data:text/html,<h1 id="a">AA</h1>|

Mozilla/5.0 (Windows NT 5.0; rv:6.0a1) Gecko/20110522 Firefox/6.0a1
Comment 1 Masayuki Nakano [:masayuki] (Mozilla Japan) 2011-05-23 01:04:40 PDT
This also happens when I use character reference (e.g., &#x41;) in data URL.
Comment 2 Masatoshi Kimura [:emk] 2011-05-23 02:05:19 PDT
Per RFC 3986, URIs can contain '#' characters even if they start with scheme.
> URI-reference = URI / relative-ref
> URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
> relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
RFC 2397 should be considered as an explanation about the absolute-URI (not URI) syntax for the data scheme per RFC 3986 4.3:
>      absolute-URI  = scheme ":" hier-part [ "?" query ]
>
>   URI scheme specifications must define their own syntax so that all
>   strings matching their scheme-specific syntax will also match the
>   <absolute-URI> grammar.  Scheme specifications will not define
>   fragment identifier syntax or usage, regardless of its applicability
>   to resources identifiable via that scheme, as fragment identification
>   is orthogonal to scheme definition.

So,
> if (aBaseURI && !spec.IsEmpty() && spec[0] == '#') {
this test should be:
> if (aBaseURI && spec.Find('#') != kNotFound) {
And the following clause should be rewritten in response to this change.
Comment 3 Boris Zbarsky [:bz] (Out June 25-July 6) 2011-05-23 06:06:05 PDT
No, that test is fine.  But we do need to change the "absolute data uri" parser to stop at '#' and toss the rest into the ref.  Daniel, do you want to do that?

That said, why the heck is the location bar not showing the '#a' part?  That's really weird...
Comment 4 Daniel Holbert [:dholbert] 2011-05-23 08:47:58 PDT
Yeah, I can look into this.
Comment 5 Daniel Holbert [:dholbert] 2011-05-23 11:30:31 PDT
(side note, not really related to fixing this: if I paste the data URL into my location bar, hit enter, paste it again, hit enter again, THEN the "#a" shows up at the end of the URLbar.  That's probably just because we hit a newURL.equalsExceptRef(currentURL) special-case somewhere.)
Comment 6 Daniel Holbert [:dholbert] 2011-05-23 13:10:27 PDT
Created attachment 534530 [details] [diff] [review]
wip fix

Here's a WIP fix, basically doing what bz suggested in comment 3. It fixes this part:
> 2. Content shows |AA #a|
but not this part:
> 3. Location bar shows |data:text/html,<h1 id="a">AA</h1>|


(also, interstingly, *after* you've got '#a' showing up in comment 5 -- if you switch tabs away & back, the #a disappears)
Comment 7 Daniel Holbert [:dholbert] 2011-05-23 13:11:48 PDT
As noted with an XXXdholbert comment in the wip fix -- I'm not immediately sure what we want to do with the now-parsed-out #ref in nsDataChannel::OpenContentStream().
Comment 8 Boris Zbarsky [:bz] (Out June 25-July 6) 2011-05-23 13:23:01 PDT
> but not this part:

Any idea why?  That's really odd...
Comment 9 Daniel Holbert [:dholbert] 2011-05-23 15:15:38 PDT
Figured it out -- nsSimpleURI::Clone fails to copy mIsRefValid right now.  So, clones think they don't have a .ref.  New patch coming soon with that fixed.

(In reply to comment #1)
> This also happens when I use character reference (e.g., &#x41;) in data URL.

@masayuki:  That's a different issue, I think.  If you need to include a "#" character in a data URL, you'll need to encode it as "%23" (as is the case in any normal URL).  If that doesn't work, please file a bug on that.
Comment 10 Daniel Holbert [:dholbert] 2011-05-23 15:42:41 PDT
Created attachment 534594 [details] [diff] [review]
patch 1: fix

Here's the fix. Overview of what it does:
 - Teaches nsDataHandler::ParseURI about #ref
 - Transfers mIsRefValid flag in nsSimpleURI::Clone (per prev comment)
 - Maintains (and asserts) the invariant that whenever we clear mIsRefValid, we also clear mRef.  (Not strictly necessary, but good for sanity & for memory savings)
Comment 11 Daniel Holbert [:dholbert] 2011-05-23 15:46:32 PDT
Created attachment 534597 [details] [diff] [review]
patch 2: tests

This tests patch includes the reftest from before (making sure that a data URI with #ref appended doesn't render any differently), along with an addition to xpcshell test test_URIs.js for the clone() issue from comment 9.

I verified that both new tests fail with no patch & pass with the patch.

(In the interest of thoroughness, I also extended the testcases in test_URIs.js a few additional URIs that I encountered while debugging this.)
Comment 12 Daniel Holbert [:dholbert] 2011-05-23 15:51:44 PDT
Comment on attachment 534594 [details] [diff] [review]
patch 1: fix

Note that the fix still has this in nsDataChannel::OpenContentStream:
>      rv = nsDataHandler::ParseURI(spec, contentType, contentCharset,
> -                                 lBase64, dataBuffer);
> +                                 lBase64, dataBuffer, hashRef);
> +
> +    // XXXdholbert What should we do with 'hashRef' here?

IIUC, this is the point at which we actually decode the data portion of the data URI.  In a "real" URL, we'd want to treat hashRef as a "seek-to-this-point-in-the-document" indicator.  So if we ignore it here, that presumably won't work. (but I don't think it did before, and I don't think much else would break...?)  So perhaps this part can be handled in a follow (if there's anything that even needs handling?)
Comment 13 Daniel Holbert [:dholbert] 2011-05-23 15:52:44 PDT
(In reply to comment #12)
> So perhaps this part can be handled in a
> follow (if there's anything that even needs handling?)

er s/follow/followup/
Comment 14 Daniel Holbert [:dholbert] 2011-05-23 19:00:21 PDT
BTW, TryServer says the patches for this bug + bug 659177 currently fail some mochitest-oth tests, including test_moz_document_rules.html[1] -- I'll start investigating those failures after my local rebuild finishes.

(If bz gets to this review before I figure those out[2], I can post the test-failures-fix as a separate patch, if that helps.)

[1] http://tbpl.mozilla.org/?tree=Try&rev=c6c035ce5552
http://tinderbox.mozilla.org/showlog.cgi?log=Try/1306194559.1306198023.19416.gz

[2] (I'll be AFK for a chunk of tonight at my brother's musical performance; should be back around 11pm PDT.  Will do my best to figure out the tryserver orange before that starts...)
Comment 15 Boris Zbarsky [:bz] (Out June 25-July 6) 2011-05-23 19:18:07 PDT
Comment on attachment 534594 [details] [diff] [review]
patch 1: fix

>+    // XXXdholbert What should we do with 'hashRef' here?

Absolutely nothing.  The scroll-to-ref code just asks the URI for the ref directly elsewhere.

Which means I don't think we need to return the hashRef at all here. Just make sure to not include it in dataBuffer.

Note that I don't think you need the separate SetRef call in NewURI, since SetSpec should do the right thing, I would think.

r=me with those changes.
Comment 16 Boris Zbarsky [:bz] (Out June 25-July 6) 2011-05-23 19:22:31 PDT
test_moz_document_rules does this:

49         var rule = "@-moz-document " + urltests +
50                    " { #display { z-index: " + zIndex + " } }";
51         var sheeturi = "data:text/css," + encodeURI(rule);

and JS encodeURI does not escape '#'.  So this test just needs to be fixed to do that.

Similar issues in browser/base/content/test/browser_bug517902.js and browser/base/content/test/tabview/browser_tabview_bug594958.js
Comment 17 Boris Zbarsky [:bz] (Out June 25-July 6) 2011-05-23 19:24:48 PDT
Comment on attachment 534597 [details] [diff] [review]
patch 2: tests

Don't you want about:blank rather than moz-safe-about:blank here?  I think you do.

r=me with that, assuming it still tests what you want to test.  If not, why not?
Comment 18 Daniel Holbert [:dholbert] 2011-05-23 19:51:16 PDT
(In reply to comment #17)
> Don't you want about:blank rather than moz-safe-about:blank here?  I think
> you do.

I already have about:foo in that test (to cover all "about:xxx" type URLs), so about:blank is basically already tested there.

I added "moz-safe-about:blank" because "moz-safe-about:" was a new type of scheme that I hadn't seen before, until today.

> and JS encodeURI does not escape '#'.  So this test just needs to be fixed to do that.

Ah! Great, thanks.  (particularly because I didn't end up getting very far since comment 14 - sadly my home machine has gotten really slow at doing rebuilds for some reason...  Need to look into that)

I'll be sure to fix those tests up before landing.
Comment 19 Boris Zbarsky [:bz] (Out June 25-July 6) 2011-05-23 19:57:36 PDT
moz-safe-about is used as the inner URI scheme for safe about: URIs like "about:blank" and a few others.  It's NOT used for random unknown about:foobar URIs.  I'm actually sort of surprised newURI succeeds for "about:foobar"...

Point being that testing "about:blank" should test whatever you wanted tested there.
Comment 20 Daniel Holbert [:dholbert] 2011-05-24 03:48:25 PDT
Landed, starting with a "patch 0" that just fixes up some existing tests (s/#/%23/ in data URIs) to fix the try failures from comment 14 (inferring rubber-stamp=bz on that test-only patch, based on comment 16.)

patch 0:  http://hg.mozilla.org/mozilla-central/rev/7e8a5585577b
patch 1:  http://hg.mozilla.org/mozilla-central/rev/d632d15ccb11
patch 2:  http://hg.mozilla.org/mozilla-central/rev/89f5c8191f3b
Comment 21 Mounir Lamouri (:mounir) 2011-05-24 06:51:27 PDT
Backed out due to orange on Windows Reftests.
See:
http://tbpl.mozilla.org/?tree=Firefox&rev=89f5c8191f3b
Comment 22 Daniel Holbert [:dholbert] 2011-05-24 06:58:00 PDT
(see bug 623423 comment 6 regarding the orange)
Comment 23 Daniel Holbert [:dholbert] 2011-05-24 07:12:21 PDT
Figured out the orange. We need one more s/#/%23/-within-data-URI fix:
http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.xul?rev=04e11820280c&mark=37-41#37
>37 <?xml-stylesheet type="text/css" href="data:text/css,
>38 
>39 #_box_windowsDefaultTheme:-moz-system-metric(windows-default-theme) {
>40   display: none;
>41 }
Comment 24 Daniel Holbert [:dholbert] 2011-05-24 07:19:24 PDT
Created attachment 534759 [details] [diff] [review]
patch 0: fix existing tests (and now also reftest.xul) (for checkin)

Here's the same version of patch 0 that I pushed before, but now with one final s/#/%23/ in the reftest.xul chunk quoted above.
Comment 25 Daniel Holbert [:dholbert] 2011-05-24 07:23:45 PDT
Created attachment 534761 [details] [diff] [review]
patch 1: fix (for checkin)

(and here's the ready-to-land version of patch 1 that I landed before -- posting it here for convenience w/ relanding, since volkmar has graciously offered to re-land the patches here)
Comment 26 Daniel Holbert [:dholbert] 2011-05-24 07:25:31 PDT
Created attachment 534764 [details] [diff] [review]
patch 2: tests (for checkin)

and here's patch 2, ready for checkin (identical to what landed before)
Comment 28 Asa Dotzler [:asa] 2011-05-31 14:37:27 PDT
not going to track this but if you think it's safe, please nominate the patch with a risk analysis.
Comment 29 Daniel Holbert [:dholbert] 2011-05-31 14:45:10 PDT
Comment on attachment 534761 [details] [diff] [review]
patch 1: fix (for checkin)

Requesting approval to land on aurora.

Reward:
 (1) Fixes regression from bug 308590, caused by forgetting to set a flag in one case.
 (2) Makes our data-URI-parsing code consistent about handling "#ref" identifiers on full data URIs.
 (3) Includes tests.

Risk:
 - Low, though note that this _intentionally_ breaks some existing content.  In particular -- in data URIs that contain a '#' character, everything after the "#" will now be considered a reference, rather than part of the document.  Bug 308590 was already expected to break such content, but it didn't until this bug here was fixed, because of (2) above.

(Such content is easy to fix with s/#/%23/, which I've done for all cases in mozilla-central, in patch 0 here and in bug 659466 .)
Comment 30 Daniel Holbert [:dholbert] 2011-06-01 08:47:24 PDT
Comment on attachment 534761 [details] [diff] [review]
patch 1: fix (for checkin)

(oops, I forgot that this is already in aurora, from landing before the branch point.  Canceling approval request.)
Comment 31 Simona B [:simonab](PTO) 2011-07-28 02:10:05 PDT
Mozilla/5.0 (Windows NT 5.1; rv:6.0) Gecko/20100101 Firefox/6.0

Verified issue using FF 6.0b3 on Windows XP, Windows 7, Ubuntu 11.04 and Mac OS X 10.6 ('#a' part is properly shown in the location bar).

Setting resolution to VERIFIED FIXED.
Comment 32 Matthew N. [:MattN] (PTO Jun. 29 - Jul. 1) 2012-05-10 23:16:52 PDT
Mentioned on https://developer.mozilla.org/en/Firefox/Updating_add-ons_for_Firefox_6#URL.C2.A0handling per request in bug 659650.

Note You need to log in before you can comment on or make changes to this bug.