New &hy; entity for U+2010 HYPHEN

RESOLVED WONTFIX

Status


Core
HTML: Parser
--
enhancement
RESOLVED WONTFIX
5 years ago
3 years ago

People

(Reporter: Nicholas Shanks, Unassigned)

Details

(Reporter)

Description

5 years ago
User Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.19 (KHTML, like Gecko) Chrome/25.0.1323.1 Safari/537.19

Steps to reproduce:

I would like to add a new entity to HTML, and having vendor buy-in is critical, if it is to be accepted by Ian.

The entity is to represent U+2010 HYPHEN and my suggested entity for it is &hy; (since the soft hyphen is &shy;).
This is absolutely, critically, NOT to represent U+002D HYPHEN-MINUS, which I can type with a single keypress. At the moment I have to type &#x2010; every ten seconds, which is a pain (shift-7 option-3 alpha number number number number punct).
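
For reference, the characters named in this report can be told apart by code point. The following is a minimal Python sketch using only the standard library; the helper name `describe` is made up for illustration and is not part of any proposal in this bug.

```python
import unicodedata

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Hyphen-minus (the ASCII key), soft hyphen, the true hyphen, and minus sign.
for ch in ("\u002D", "\u00AD", "\u2010", "\u2212"):
    print(describe(ch))

# The numeric reference currently needed for U+2010 is eight characters long,
# which is the typing cost the report complains about.
print(len("&#x2010;"))
```

The proposed `&hy;` would be four characters by comparison.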



Expected results:

This should have been part of HTML since the introduction of Unicode.
We do not make the HTML standards here, we only interpret them. You are in the wrong place if you want to change something in the HTML standard.
Or did I misunderstand your intention?
Status: UNCONFIRMED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → INVALID
(Reporter)

Comment 2

5 years ago
The development of HTML is a constant negotiation between theorists (spec writers), implementers (browser vendors), and authors (web developers). Many parts of the current HTML specification are in there because vendors "embraced and extended" the spec of the day. <img>, <script>, <object>, <canvas>, <ruby>, <marquee> -- and those are just tags. Our current large swathe of entities, DOCTYPE switching, many attributes, and technologies related to HTML such as media queries, CSS properties, CSS @rules and so on, have all been introduced in browsers first and reverse-engineered/spec'd later. Netscape and its founders' previous projects are probably responsible for the introduction of the majority of custom HTML that was not previously in a spec.

Modern practice is to specify the truth on the ground more than live in cloud cuckoo land. My intent is to change that ground truth, just as all vendors continue to do to this day, Mozilla included.


Also, you should not resolve bugs without understanding them. Not only is it very bad form, but it may also leave the team unaware that the bug exists.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
(In reply to Nicholas Shanks from comment #0)
> User Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.19 (KHTML, like
> Gecko) Chrome/25.0.1323.1 Safari/537.19
> 
> Steps to reproduce:
> 
> I would like to add a new entity to HTML, and having vendor buy-in is
> critical, if it is to be accepted by Ian.
> 
> The entity is to represent U+2010 HYPHEN and my suggested entity for it is
> &hy; (since the soft hyphen is &shy;)
> This is absolutely, critically, NOT to represent U+002D HYPHEN-MINUS, which
> I can type with a single keypress. At the moment I have to type &#x2010;
> every ten seconds, which is a pain (shift-7 option-3 alpha number number
> number number punct).
> 
> 
> 
> Expected results:
> 
> This should have been part of HTML since the introduction of Unicode.

It's not clear to me that HTML "should" have specific entities for many Unicode characters (this one included). Why not use the actual characters? If you need U+2010 frequently when writing, wouldn't it make more sense to type it literally as '‐' in your (UTF-8) source file, rather than using an entity of any kind, whether numeric or otherwise?
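
To illustrate the point about literal characters: once decoded, a literal U+2010 in a UTF-8 file and a numeric character reference should yield the same character. A small Python sketch, assuming the standard library's `html.unescape` is an acceptable stand-in for a browser's reference decoding:

```python
import html

literal = "\u2010"  # the character typed directly into a UTF-8 source file

# In UTF-8 the literal character occupies three bytes.
print(literal.encode("utf-8").hex(" "))

# Hexadecimal and decimal numeric references decode to the same character.
for reference in ("&#x2010;", "&#8208;"):
    print(reference, html.unescape(reference) == literal)
```

The difference between the two forms is therefore only in what the author has to type, not in what the parser produces.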
(Reporter)

Comment 4

5 years ago
Unlike 99.9% of computer users, I have created a custom personal keyboard layout which I use when on my Mac and writing prose (made with Ukelele and then hand-edited). Among many other changes, I have swapped the HYPHEN-MINUS character for HYPHEN on the main keyboard, and for MINUS on the keypad.

When I am on a Windows or Linux machine, when I am using a tty session via SSH, when I am using software that doesn't understand Unicode or a machine that doesn't have the correct glyphs installed, I pretty much have to stick to 7 bit ASCII. I also have to switch to the standard British layout when I start writing any code, naming files in the filesystem, or passing flags to command line tools in the terminal.

Included in that list is my current situation, sat on a Windows Vista machine writing HTML, with poor fonts installed, and a total inability to remember any decimal equivalents to Unicode code points, for use with Microsoft's horrendous "hold down control and type in the codepoint" way of typing Unicode.

How many times do people want to type a hyphen, versus &rang; or the many other mathematical symbols? We have &minus;, we have &shy;, we have &dagger; and &Dagger; and plenty of other ones, including many, many esoteric mathematical entities, but the entity for a common hyphen, which does not exist in ASCII and is used tens to thousands of times a day by each and every individual English speaker, is conspicuously and painfully absent.
I suspect 99.9% of computer users would be perfectly happy with U+002D, which is readily available on standard keyboards and requires no special HTML representation.
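
The named references mentioned in the previous comment can be inspected programmatically. The sketch below assumes Python's `html.entities.html5` table reflects the HTML named character reference list closely enough for this purpose; `names_for` is a hypothetical helper written for this example.

```python
from html.entities import html5  # maps names such as "minus;" to characters
from typing import List

def names_for(ch: str) -> List[str]:
    """Return every semicolon-terminated name in the table that maps to ch."""
    return sorted(name for name, value in html5.items()
                  if value == ch and name.endswith(";"))

# The references cited in the comment above, with their code points.
for name in ("minus;", "shy;", "dagger;", "Dagger;", "rang;"):
    print(f"&{name} -> U+{ord(html5[name]):04X}")

# The proposed name, and whatever names the table already has for U+2010.
print("hy; defined:", "hy;" in html5)
print("names for U+2010:", names_for("\u2010"))
```

Running the last line shows whether any existing name already maps to U+2010, which is worth checking before arguing for a new one.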

Updated

5 years ago
OS: Windows Vista → All
Hardware: x86 → All

Updated

5 years ago
Severity: normal → enhancement

Updated

3 years ago
Component: Untriaged → HTML: Parser
Product: Firefox → Core
New named characters have a bad backwards-compatibility story, and adding one would move us further away from interop. Not worth the trouble.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 5 years ago → 3 years ago
Resolution: --- → WONTFIX