Redirects consume first # of the fragment (Fragmention uses two ##)

NEW
Unassigned

Status

()

Core
Networking: HTTP
P3
normal
3 years ago
2 months ago

People

(Reporter: beta, Unassigned)

Tracking

32 Branch
x86_64
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [necko-backlog])

(Reporter)

Description

3 years ago
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0 (Beta/Release)
Build ID: 20140703004002

Steps to reproduce:

Visiting http://kartikprabhu.com/static/demo/fragmention.html##was+still should redirect the user to https://kartikprabhu.com/static/demo/fragmention.html##was+still but instead they are redirected to https://kartikprabhu.com/static/demo/fragmention.html#was+still

Note one of the hashes has been consumed.
Firefox may be doing some validation/normalisation of the fragment, correctly or not, causing this behaviour.

Chromium 34 & Opera 12 redirect without munging the fragment.

This is listed on the entry http://indiewebcamp.com/fragmention#Double-hashes_in_the_wild yet couldn’t see a Bugzilla issue for it, so filed.

Updated

3 years ago
Component: Untriaged → Networking: HTTP
Product: Firefox → Core

Comment 1

3 years ago
This is because http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp?mark=4160-4171#4160 re-adds the ref to the redirected bits, and then the code inside nsStandardURL.cpp here: http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsStandardURL.cpp?mark=2415-2418#2387 nixes the first #.

I would actually expect that this is correct. Similar behaviour is exhibited in e.g. http://jsbin.com/detisi/1/edit . Reading http://url.spec.whatwg.org/, I am still more tempted to mark this invalid in the sense that certainly the testcase in the jsbin is expected behaviour. Whether redirects should behave differently... probably? Maybe?

http://tools.ietf.org/html/rfc7231#section-7.1.2 talks about preserving the hash across redirects and doesn't indicate what should happen with hashes that contain... hashes.

http://blogs.msdn.com/b/ieinternals/archive/2011/05/17/url-fragments-and-redirects-anchor-hash-missing.aspx notes that IE pre IE10 just forgot the hash regardless (!)

Testing locally, IE11 behaves the same as Blink, so for web compat's sake, I guess we should standardize on keeping the extra hash, even though that feels... weird.

Patrick, does this analysis seem right?
Flags: needinfo?(mcmanus)
I generally lean towards web compat - but this is really only a tangentially networking issue.. ask bz and annevk for input.
Flags: needinfo?(mcmanus)

Comment 3

3 years ago
Boris / Anne, got an opinion on comment #1, per comment #2 ? :-)
Flags: needinfo?(bzbarsky)
Flags: needinfo?(annevk)
The "set the hash on a url" behavior exposed to the web is not necessarily the same as the "preserve this thing on redirect" behavior we want.  For example, the redirect-preservation could add an extra '#' at the front if it wants to call into the web-exposed algorithm that strips out a leading '#'.

That said, given the existing behavior of setting .hash using "##" for anything might be a poor choice.  You can see this on <https://kartikprabhu.com/static/demo/fragmention.html>, where it has to jump through nasty hoops like 

  window.location.protocol + "//" + window.location.host + window.location.pathname + "##" + text.replace(/\s+/g, '+')

instead of just being able to set .hash to the thing it wants.
Flags: needinfo?(bzbarsky)

Comment 5

3 years ago
I would expect a redirect to set a URL's fragment directly and not use the JavaScript's API.
Flags: needinfo?(annevk)

Comment 6

3 years ago
I could define this clearly in Fetch if HTTP is unclear I suppose.
The point is that our internal URL representation has the same "strip the '#'" behavior the JS API has, because that is in fact the behavior API consumers typically seem to want.

Comment 8

3 years ago
Do we know why that is? It seems pretty broken behavior. Especially internally it would be much better if API usage did not rely on a character getting eaten.
> Do we know why that is?

Probably because that was what all the consumers wanted in practice; see comment 7.
Status: UNCONFIRMED → NEW
Ever confirmed: true

Comment 10

3 years ago
(In reply to Boris Zbarsky [:bz] from comment #9)
> > Do we know why that is?
> 
> Probably because that was what all the consumers wanted in practice; see
> comment 7.

The story here seems kind of interesting...

For instance, we have some protocol handlers that make this assumption:
http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/data/nsDataHandler.cpp#69
http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsHostObjectProtocolHandler.cpp#622

But I figured window.location would abuse this functionality, but instead it manually inserts a hash if not already there:

http://mxr.mozilla.org/mozilla-central/source/dom/base/nsLocation.cpp#336

... which will then be removed by nsStandardURL, unless I'm missing something...


Boris, is the right fix here:
1) removing this feature from nsStandardURL and fixing all the consumers?
2) adding a different API that doesn't behave like this to nsStandardURL?
3) just inserting an extra hash in the http redirect handler if there is already one, and doing the same for any other consumers where this would make sense? (I don't think there are any, but...)

(best-effort grep for consumers: http://mxr.mozilla.org/mozilla-central/search?string=%28+|%3E%29SetRef\%28&regexp=1&find=&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central )
Flags: needinfo?(bzbarsky)
That's a question for the necko peers as to what they'd like their APIs to look like.
Flags: needinfo?(bzbarsky)

Comment 12

3 years ago
Need input on comment 10.
Flags: needinfo?(mcmanus)
amusingly, I think it is probably more important for the consumers to tell necko what they need. I'm pretty agnostic on it from a philosophical pov. If I was doing it from the start I would do #1.

pragmatically - #1 looks like it would create more potential for bugs with churn than it would have in value. I'd take the patch though.

#2 - no thanks

#3 - that seems the pragmatic path
Flags: needinfo?(mcmanus)
Whiteboard: [necko-backlog]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
You need to log in before you can comment on or make changes to this bug.