534 bytes, text/html
Part 1: Add iid parameter to nsISeralizable::Read to determine a serialization format and add a type-safe utility function to nsIObjectInputStream.idl
31.16 KB, patch
|Details | Diff | Splinter Review|
3.41 KB, patch
|Details | Diff | Splinter Review|
2.22 KB, patch
|Details | Diff | Splinter Review|
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:18.104.22.168) Gecko/2008120122 Firefox/3.0.5 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:22.214.171.124) Gecko/2008120122 Firefox/3.0.5 The encodeURIComponent doesn't encode the "|" character to "%7C", yet all the major browsers (Chrome, IE, Opera, Safari) does, this way makes cross-browser development harder. Reproducible: Always Steps to Reproduce: 1. Created a variable containing a hash (the "|" character's role in the hash is significant) 2. Used the mentioned function on a the variable. 3. Replaced the current hash with the variable Actual Results: The current URL changed to this -> "blog.php#page:1|1" Expected Results: The current URL should have changed to this -> "blog.php#page:1%7C1"
Works for me: js1.9.0> encodeURIComponent("|") %7C js1.9.1> encodeURIComponent("|") %7C js1.9.2> encodeURIComponent("|") %7C Do you have a testcase that demonstrates the problem? Perhaps the difference in behavior is at some other level, or under different circumstances.
Need a testcase. Sounds like this involves URL processing by Necko or Gecko or even the Location Bar. It's not JS engine as Gavin shows. /be
Sorry, I just realized that the problem has a different nature: the function works correctly but if I enter "%7C" to the URL bar it will convert it back to "|".
We display unescaped values in the location bar on purpose - it's much more user friendly. That shouldn't have any impact on what the server sees, though. Are you suggesting it does?
No, on the server side it works correctly, my problem with it that I store my AJAX variables in the hash and separate them with the "|" character for example: "blog.php#page:3|user:2|category:5" This is only problematic when the variable itself contains the "|", in that case it should be stored as "%7C", so it doesn't cut the variable's content in half. Actually this can be considered quite rare, but since I'm working on a CMS project I would like to know if this will change in the near future or I should remove the "|" characters from the variables if the user agent is Firefox?
The difference shouldn't be visible to page scripts either. Can you provide example code that behaves differently in Firefox because of this difference?
Created attachment 367493 [details] a short script that displays different result because of the unescaped "|" I've uploaded an example script. I've tested it with Chrome, Firefox 3, IE 6/7/8beta, Opera 9 and Safari, all return the same result except Firefox.
Ah, this is because we unescape the value in GetHash explicitly: There's a relevant discussion in bug 135309, starting at bug 135309 comment 30.
(In reply to comment #9) > Ah, this is because we unescape the value in GetHash explicitly: http://hg.mozilla.org/mozilla-central/annotate/29a714ae9d11/dom/base/nsLocation.cpp#l399
Created attachment 367639 [details] testcase This testcase passes in: Opera 10.00 Alpha 6166 Safari Version 3.2.1 (5525.27.1) IE 7 It fails in Firefox trunk/branch.
Yeah, this is just a duplicate of bug 135309, no? Until necko stops escaping refs on url parse, we can't really change this.
(In reply to comment #12) > Yeah, this is just a duplicate of bug 135309, no? Except that that bug is FIXED...
That bug should have been marked "wontfix"; it got hijacked, basically.
This is also a problem with .hash on the <a> element. HTML5 doesn't prescribe unescaping here.
gsnedders says Google Wave expects the current behavior from Firefox (browser sniffing going on).
ok, 8 years later and we're still waiting for a fix for this. any estimates on when we can expect this to be fixed, guys?
Probably right after you fix it? ;) Seriously, just changing this behavior breaks other things; to change it one has to change how URI refs are handled throughout Gecko wholesale. In particular, it involves the mess in bug 436006. So once someone steps up and spends a few weeks of full-time work on it, it'll be fixed. Given the amount of work involved and the low impact, it's not a high priority for me personally at the moment.
(In reply to Boris Zbarsky (:bz) from comment #21) > Seriously, just changing this behavior breaks other things; to change it one > has to change how URI refs are handled throughout Gecko wholesale. In > particular, it involves the mess in bug 436006. So once someone steps up > and spends a few weeks of full-time work on it, it'll be fixed. Given the > amount of work involved and the low impact, it's not a high priority for me > personally at the moment. Is it not possible to fix the public interfaces of the <a> DOM element and the window/document.location object to expose the Right Thing? That sounds like it should cover the web-exposed side of this, be much easier, more contained, and take much less work. Or would that patch not get taken because it's Not the Right Thing? :-)
> Is it not possible to fix the public interfaces of the <a> DOM element and the > window/document.location object to expose the Right Thing No, because the "right thing" requires changing our internal storage of the URI. Right now we have to unescape because the URI is stored escaped. And escaping is always lossy. And if we change the internal URI storage, then you end up with a cascade of other things that need to be fixed, as I said in the comment you quoted.
What about 1. storing the URI without escaping, 2. escaping the URI on retrieving it via nsIURI or nsIURL, and 3. adding a new interface nsIUnescapedURL or something and use it from Location and others expecting unescaped representation to minimize the impact?
Created attachment 650387 [details] [diff] [review] Part 1: Add iid parameter to nsISeralizable::Read to determine a serialization format and add a type-safe utility function to nsIObjectInputStream.idl I do not want to add yet another hackaround for a new iid. Actual serialization format change will be in the following patch.
Created attachment 650388 [details] [diff] [review] Part 2: Remove bogus checks and add an intended check to ReadBoolean
Created attachment 650390 [details] [diff] [review] Part 3: Fix inconsistency between nsSimpleURI::SetSpec and nsSimpleURI::SetRef and add automated tests for non-ASCII hash
Created attachment 650391 [details] [diff] [review] Part 4: Implement nsIURI.hash
Created attachment 650392 [details] [diff] [review] Part 5: Add unit tests for nsIURI.hash
Created attachment 650393 [details] [diff] [review] Part 6: Use nsIURI.hash from DOM interfaces
I'm not going to get to these reviews at least until the week starting August 27, unfortunately.
But just so I'm clear on what's going on, is the .hash thing just working around the fact that <a name> stuff doesn't follow the spec right now? Or are there other consumers you're worried about?
I'm planning to implement the URL standard. http://dvcs.w3.org/hg/url/raw-file/tip/Overview.html nsURI.hash will be used from URL.hash.
I guess the real question is why we're adding .hash instead of changing behavior of .ref.
For compatibility with existing consumers. Note that ref (mRef) is used as a part of nsIURI comparison (by Equals method).
Right, but the question is what those consumers are. It's just a bit confusing to have two properties that basically do the same thing, and people will keep using the wrong one... I can live with it if we have to, but if we can avoid it that would be ideal. mRef can be kept as an implementation detail, of course; it just doesn't have to be exposed in the API. Though comparing on escaped representations if the unescaped ones are supposed to be used can screw things up (e.g. not scrolling on anchor navigation even though two different anchors are pointed to), no?
<a name="è"> and <a name="%C3%A8"> are different per HTML5 spec while "#è" and "#%C3%A8" are supposed to be the same per IRI spec (RFC 3987). So we need to keep unescaped representation for DOM even if they are considered identical as URI/IRI. I added mHash for this purpose.
> while "#è" and "#%C3%A8" are supposed to be the same per IRI spec (RFC 3987) Last I checked, this was one of the parts of RFC 3987 that doesn't quite match reality... In practice, we clearly can't canonicalize the former to the latter if they need to have different behavior!
IRI equivalence depends on the comparison level. https://tools.ietf.org/html/rfc3987#section-5.3.1 https://tools.ietf.org/html/rfc3987#section-126.96.36.199 Some consumers may require IRI-to-URI mapping and some others may not. So the different properties will be required. No?
Hmm. Maybe. ccing some people who're actually up on all this network stuff.
This really needs another reviewer. I don't know when I'll have a chance to dig into all this gunk, if ever, and the necko change should really get review from someon who's an active necko peer.
Comment on attachment 650393 [details] [diff] [review] Part 6: Use nsIURI.hash from DOM interfaces r=me if the previous stuff is all OK, I guess.
Comment on attachment 650391 [details] [diff] [review] Part 4: Implement nsIURI.hash OK, this attribute certainly needs more documentation on how it differs from .ref: http://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsIURI.idl#220 How does it differ from ref?
hmm so after reading some comments... .hash is the escaped version of .ref...?
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #46) > hmm so after reading some comments... .hash is the escaped version of > .ref...? More precisely, .hash will not be escaped regardless of "network.standard-url.escape-utf8" state. If the input string is already escaped, it will be kept as-is. I consider removing this pref because I doubt we can change the setting.
Comment on attachment 650390 [details] [diff] [review] Part 3: Fix inconsistency between nsSimpleURI::SetSpec and nsSimpleURI::SetRef and add automated tests for non-ASCII hash I'm hoping :biesi can take this--I've talked with him about it on IRC. Really only :bz or he are equipped to take this on without some learning curve, and I don't have time for it.
Comment on attachment 650387 [details] [diff] [review] Part 1: Add iid parameter to nsISeralizable::Read to determine a serialization format and add a type-safe utility function to nsIObjectInputStream.idl I'm going to clear this request until biesi has looked at the other parts of this bug. I pinged him about it in IRC. Please rerequest review when appropriate and I'll do it quickly ;-)
So, re. question b: I'm writing a web application with ember.js, which stores part of its state in an url fragment. It goes like this: location.hash = '#/graphs/' + encodeURIComponent(JSON.stringify(state)); This makes for bookmarkable URLs. At some later time, ember may parse the URL into routes and parameters - here the route is /graphs/:graph_json, and ember passes encoded data into my de-serialization function for deserialization. That works great in Chrome. Due to this bug, it *fails horribly* in Firefox, because there's every chance that the data I'm serializing contains slashes. I'm working around the problem by browser sniffing and double-escaping my URL fragments, which sort-of-but-not-quite works. Sometimes they end up double-unescaped, sometimes someone manages to get a double-escaped version in a chat message and send it to a Chrome user, where it'll fail, often the double-escaping means the URL is too long for the web server. Basically, when I assign some properly escaped value to location.hash, I want to get the same value back later. Please fix.
No mater what you guys thing about the correctness of encoding the fragment part of the URL, this behavior causes the `hash` to be changed after page reload, which is ridiculous and is definitely a BUG that requires a fix. I have seen this discussion started like 7 years ago, right? So I think it's about time to fix it, stop looking for excuses and move on.
In case anyone had still some doubts, please consider the following example: window.loaction.hash = "#%3F" // or "%3F", whatever The url becomes "/#%3F" which is the expected behavior. However, window.location.hash === "#?" which I could understood if it wasn't for the fact that after I hit `Ctrl-R`, the url becomes "/#?". WHAT? Really? Can anyone explain to me what is the logic behind this, and what are the reasons for not considering it as a BUG?
This above behaviour in Firefox is wrong per URL spec (and not only this): http://url.spec.whatwg.org/#fragment-state Other borwser do this correct (test Chrome37 and IE11), so right now (and for a long time) Firefox for developers is problematic. set: window.loaction.hash = "#%3F" get: window.loaction.hash Result: Firefox: "#?" Chrome and IE and Opera (Presto): "#%3F"
Hello. I also got some troubles with hash. See that: https://github.com/putaindecode/fix-all-the-things/issues/11
The unescaping is the real issue with the current implementation: window.location.hash = '%3F?'; console.log(window.location.hash); > #?? This way I have to use the window.location.href to get back the original value, since the getter cannot return the same string I set with the setter.
The testcase passes in current nightly versions. I think this bug was fixed by bug 1093611.
Sure. If I copied a url with "#%7C" from the location bar or press Enter on the location bar, it will turn into "#|", but it is a different (Firefox) bug.
The bug seems to be wrongly marked as "RESOLVED FIX", the testcase still fails in Firefox 38.0.5 as well es in Nightly 41.0a1 (2015-06-08).
(In reply to Malte Wedel from comment #60) > The bug seems to be wrongly marked as "RESOLVED FIX", the testcase still > fails in Firefox 38.0.5 as well es in Nightly 41.0a1 (2015-06-08). This was fixed by bug 1093611, which then got backed out because too much internet broke. I'm not sure it makes sense to keep both bugs open here.
guys - could this please be fixed at some point soon? "because too much internet broke" as an explanation simply isn't good enough - Firefox is just one out of many browsers and by far not the one with the biggest market share. Some poorly written code that only works in Firefox should not stop progress.
Workaround (I think it is worth repeating idea from comment #57, which is commonly known and used, but wasn't clearly stated here): For now consider `location.hash` write-only for cross-browser implementations. *Never ever READ its value*. So DO NOT USE constructs like `document.location.hash.slice(1)`. Instead, for retrieval get it from `document.location.href` like with `document.location.href.split('#')`.
a more fail-safe version would be: var location_hash = location.href.split("#"); location_hash.shift(); location_hash = "#"+location_hash.join("#"); // equals location.hash as implemented in other browsers the #-character could be in location.hash more than once. But yes, Michal's version should be sufficient for most cases.
Is this bug still happening? I can't seem to reproduce it in Fx 41.
current stable is 39 - can be reproduced in that version. As :gijs wrote in #61 this was fixed by the resolution of bug 1093611 which was included in v41 - though as he states the fix was then removed again(?)
Ah yeah, thanks, I didn't read this comment properly. Looks like Bug 1093611 was relanded! (Note it was originally backed out because it produced incorrect results -- not because "the internet broke" ;) ).