Closed Bug 251841 Opened 19 years ago Closed 19 years ago

body id from urlbase with tilde (~) fails validation

Categories

(Bugzilla :: Bugzilla-General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 2.20

People

(Reporter: kiko, Assigned: kiko)

Details

Attachments

(1 file, 3 obsolete files)

The body id generated in header.html.tmpl from the urlbase para on my site
includes a tilde tilde (~); see:

  http://www.async.com.br/~kiko/mybugzilla/show_bug.cgi?id=3

You'll find:

   id="www-async-com-br-~kiko-mybugzilla" 

This fails HTML4 validation:
     
http://validator.w3.org/check?uri=http://www.async.com.br/~kiko/mybugzilla/show_bug.cgi?id=3

   Line 45, column 24: character "~" is not allowed in the value of attribute "ID"

From the spec:

    id = name [CS]
    This attribute assigns a name to an element. This name must be unique
    in a document.

    ID and NAME tokens must begin with a letter ([A-Za-z]) and may be 
    followed by any number of letters, digits ([0-9]), hyphens ("-"), 
    underscores ("_"), colons (":"), and periods (".").

Sounds like some filtering is in order; at least something less hackish than the
current
  Param('urlbase').replace('^https?://','').replace('/$','').replace('[@:/.]','-')

Any ideas?
There's a css_class_quote filter in Util.pm which should do the job.

Gerv
css_class_quoting produces:

   id="www-async-com-br-~kiko-mybugzilla" 

I wonder if anyone would be against just replacing the tilde for a colon or
underscore? The tilde is definitely the more common "special" character in a
URL, isn't it?
Has someone yet come up with a spec for how to convert URLs into unique CSS site
IDs? I know the CSS community was kicking around the idea a while ago...

Gerv
Maybe Hixie knows :)
Keywords: qawanted
Depends on: @-moz-document
(In reply to comment #2)
> css_class_quoting produces:
> 
>    id="www-async-com-br-~kiko-mybugzilla" 
> 
> I wonder if anyone would be against just replacing the tilde for a colon or
> underscore? The tilde is definitely the more common "special" character in a
> URL, isn't it?

Underscore isn't necessarily a good choice and colon [:] would indicate a
pseudo-class like :hover et al

Perhaps replacing / with - and anything non-alphanumeric with nothing would be
better ?
OS: Linux → All
Hardware: PC → All
Just convert anything that isn't in the list of allowed characters to a hyphen. 
That's what was done with the other illegal characters (/, ., etc).
Well, in this case we'd have

  id="www-async-com-br--kiko-mybugzilla"

is that reasonable? If so, I'll be a-patch-producin'.
How about converting double, triple, etc. hyphens in a row to a single hyphen?
Removing bug 238099 because it doesn't really block me here, but it's definitely
related. 

I've decided to go ahead and:

  - swap tilde like we do with other "special" symbols
  - compress multiple "-"s into a single one (so we don't get
www.foo.bar---bazoo for things like www.foo.bar:/~bazoo).
Status: NEW → ASSIGNED
No longer depends on: @-moz-document
Attached patch kiko_v1: fix (obsolete) — Splinter Review
No animals were harmed in the production of this patch.
Attached patch kiko_v2: the right patch (obsolete) — Splinter Review
Attachment #153773 - Attachment is obsolete: true
Attachment #153774 - Flags: review?(jouni)
Attachment #153774 - Flags: review?(gerv)
Comment on attachment 153774 [details] [diff] [review]
kiko_v2: the right patch

Looks ok.

replace('[-~@:/.]+','-') is probably a tiny bit faster than
replace('[~@:/.]','-').replace('[-]+', '-'), but not necessarily more legible.

r=burnus for either version.
Attachment #153774 - Flags: review+
Attachment #153774 - Attachment is obsolete: true
Comment on attachment 153801 [details] [diff] [review]
kiko_v2: of course, that's better

Carrying over r+
Attachment #153801 - Flags: review+
Attachment #153774 - Flags: review?(jouni)
Attachment #153774 - Flags: review?(gerv)
Attachment #153801 - Attachment is obsolete: true
Comment on attachment 153802 [details] [diff] [review]
kiko_v3: merge in hyphens too

r=burnus
Attachment #153802 - Flags: review+
Flags: approval?
Flags: approval? → approval+
Doesn't compressing hyphens increase the risk of site ID clashes? And it doesn't
massively increase readability IMO...

Gerv
Yes, it increases the change for clashes, but don't you agree that the case for
someone who has installations like "~kiko ~@kiko :/~--@@kiko" is rather contrived? 

I do appear to be the first person running into this issue..
True, rather contrived in this case. But, if we are really blazing a trail here,
why not just adopt the simple algorithm Hixie suggested? We can then present our
solution to the world, for their use too.

It would be very handy indeed if everyone used the same mapping scheme; and
hyphen compression may break other people's unique IDs in ways we can't imagine.

Gerv
Did Hixie really want us to end up with IDs that had sequences of hyphens?
That's what we're discussing at this point, isn't it?
Hello. Reality check. It doesn't matter. Just pick something that looks nice and 
go with it.
Thanks!

/cvsroot/mozilla/webtools/bugzilla/template/en/default/global/header.html.tmpl,v
 <--  header.html.tmpl
new revision: 1.27; previous revision: 1.26
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Target Milestone: --- → Bugzilla 2.20
Keywords: qawanted
QA Contact: matty_is_a_geek → default-qa
You need to log in before you can comment on or make changes to this bug.