Remove space from XHTML definitions of &tdot; &TripleDot; &DotDot; and &DownBreve;

Status: NEW
Product: Core
Component: HTML: Parser
Opened: 3 years ago
Updated: 2 months ago

People

(Reporter: fredw, Assigned: fredw)

Tracking

({parity-chrome, parity-safari})

Trunk
parity-chrome, parity-safari
Points:
---

Firefox Tracking Flags

(firefox45 affected)

Details

(URL)

Attachments

(4 attachments)

(Assignee)

Description

3 years ago
Created attachment 8686090 [details]
testcase

The following entities are defined as a space followed by a combining character in the XML Entity spec:

DownBreve: U+0020 U+0311
tdot, TripleDot: U+0020 U+20DB
DotDot: U+0020 U+20DC
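
For illustration only, here is a minimal Python sketch that contrasts the XML Entity spec definitions listed above with what an HTML5 named character reference table produces. It assumes that Python's standard `html.unescape` follows the HTML5 table; the `XML_ENTITY_SPEC` dictionary and the helper names are hypothetical and simply restate the list above.

```python
import html

# Replacement text as listed above for the XML Entity spec (leading U+0020).
# This dictionary is an illustration, not something read from htmlmathml-f.ent.
XML_ENTITY_SPEC = {
    "DownBreve": "\u0020\u0311",
    "tdot": "\u0020\u20DB",
    "TripleDot": "\u0020\u20DB",
    "DotDot": "\u0020\u20DC",
}


def code_points(text):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def compare():
    """Return (name, XML Entity spec expansion, HTML5 expansion) rows."""
    rows = []
    for name, xml_text in XML_ENTITY_SPEC.items():
        # html.unescape uses Python's copy of the HTML5 named references.
        html_text = html.unescape(f"&{name};")
        rows.append((name, code_points(xml_text), code_points(html_text)))
    return rows


for name, xml_cp, html_cp in compare():
    print(f"{name:10} XML: {xml_cp:15} HTML: {html_cp}")
```

If the assumption holds, each HTML column should show the same combining character as the XML column, minus the leading U+0020.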

From the spec: 

"For reasons explained further in [Charmod-norm], it is not advisable to start the replacement text of an entity with a combining character, as then potentially different results may be produced depending on the order in which entity expansion and Unicode normalisation are performed. As far as possible this specification uses non-combining characters, however, in the cases tdot, TripleDot and DotDot Unicode only has combining forms of the accents, and so the entity replacement text starts with a space, to avoid the possibility that the expansion of the entity combines with preceding text."
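
The order dependence described in the quote can be sketched in Python. This is a hedged illustration, not the behaviour of any particular parser: `expand` is a hypothetical stand-in for entity expansion, and DownBreve is used because "a" followed by U+0311 has a precomposed NFC form (U+0203), which makes the effect visible.

```python
import unicodedata


def expand(text, replacement):
    """Hypothetical entity expansion: substitute &DownBreve; with its text."""
    return text.replace("&DownBreve;", replacement)


def nfc(text):
    """Apply Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def both_orders(source, replacement):
    """Return (expand then normalise, normalise then expand)."""
    expand_then_normalize = nfc(expand(source, replacement))
    normalize_then_expand = expand(nfc(source), replacement)
    return expand_then_normalize, normalize_then_expand


SOURCE = "a&DownBreve;"

# Without the leading space, the two orders give different strings.
no_space = both_orders(SOURCE, "\u0311")
print([len(s) for s in no_space])

# With the leading space, the combining character attaches to the space,
# so both orders agree.
with_space = both_orders(SOURCE, "\u0020\u0311")
print(with_space[0] == with_space[1])
```

The leading space trades this order dependence for an extra visible character, which is the trade-off this bug is about.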

This is indeed how they are defined in htmlmathml-f.ent, but the space seems to be lost when we expand the entity. I am attaching a "visual" testcase. I noticed this today while writing a script test: http://tests.mathml-association.org/mathml/relations/html5-tree/entities.html

WebKit and Blink seem to have the same behavior as Gecko.
(Assignee)

Updated

3 years ago
Assignee: nobody → fred.wang
(Assignee)

Comment 3

3 years ago
Created attachment 8705732 [details]
testcase (xhtml)

Here is the same testcase using XHTML. As David Carlisle noted, in that case Gecko does add a space (this is because the DTD https://dxr.mozilla.org/mozilla-central/source/dom/xml/htmlmathml-f.ent is used). Apparently the WHATWG is leaning towards keeping the current entity definitions (without a space before the combining character). So we should probably just fix htmlmathml-f.ent to match HTML5 instead.
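
To illustrate why the XHTML case differs, here is a small Python sketch showing that an XML parser expands an entity exactly as the DTD declares it. The declaration is inlined in an internal subset for self-containment; its exact form is an assumption and is not copied from htmlmathml-f.ent, and `expand_in_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity is declared in an internal DTD subset
# rather than loaded from htmlmathml-f.ent.
DOC_TEMPLATE = '<!DOCTYPE p [<!ENTITY tdot "{value}">]><p>x&tdot;</p>'


def expand_in_xml(value):
    """Parse the template with the given entity value and return the text."""
    return ET.fromstring(DOC_TEMPLATE.format(value=value)).text


# Declaration with a leading space, as in the XML Entity spec.
with_space = expand_in_xml(" &#x20DB;")
# Declaration without the space, matching the HTML5 definition.
without_space = expand_in_xml("&#x20DB;")

print([f"U+{ord(ch):04X}" for ch in with_space])
print([f"U+{ord(ch):04X}" for ch in without_space])
```

Under these assumptions, removing the space from the declaration is enough to make the XML expansion match the HTML one.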

Comment 4

3 years ago
If we change this, WebKit/Blink/Gecko will agree on "no space" for both HTML and XML. I think we should go ahead and do that. The HTML standard already requires this (in its XHTML section). It is arguably an oversight that it does so, but at this point it's easier to just change the implementations that do not agree (Gecko, and maybe Internet Explorer).

Comment 5

3 years ago
IE11 seems to do no space as well. I'll try to get htmlmathml-f.ent regenerated upstream without the spaces asap (I could do it now, but there is a process to be followed...). You are of course free to edit your local copy at any time. For your copy you may as well just delete the spaces. In the version I distribute with the entity spec I may decide to make it parameterised so you can have or not have the space (or I may not: the resulting complication of documenting how to set the parameter may really not be worth it; still deciding...).
(Assignee)

Comment 6

3 years ago
Created attachment 8705740 [details] [diff] [review]
Patch (remove space from htmlmathml-f.ent)

Here is a version of the fix+test that instead removes the space from the XHTML definition.
(Assignee)

Updated

3 years ago
Summary: &tdot; &TripleDot; &DotDot; and &DownBreve; should generate a space → Remove space from XHTML definitions of &tdot; &TripleDot; &DotDot; and &DownBreve;
(Assignee)

Updated

3 years ago
Attachment #8705740 - Flags: review?(hsivonen)
Attachment #8705740 - Flags: review?(hsivonen) → review+
Mass bug change to replace various 'parity' whiteboard flags with the new canonical keywords. (See bug 1443764 comment 13.)
Keywords: parity-chrome, parity-safari
Whiteboard: [parity-webkit][parity-blink]