Open Bug 57498 Opened 25 years ago Updated 3 years ago

should treat defective combining characters as base characters

Categories

(Core :: Layout: Text and Fonts, defect, P3)

x86
Windows 98
defect

Tracking

Future

People

(Reporter: jruderman, Unassigned)

Details

(Whiteboard: [awd:tbl])

At http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata03.html, many (but not all) of the numeric entities for the "combining" Unicode characters, when used as the sole contents of a table cell, make the table cell not display at all (as if it were empty). I'm not familiar with Unicode, so I could easily be missing something. This may be related to bug 35984.
Still seeing this in Mozilla build 2001012205 Win32.
QA contact update
QA Contact: chrisd → amar
Whiteboard: [awd:tbl]
Temporarily moving to future until a milestone can be assigned.
Status: NEW → ASSIGNED
Target Milestone: --- → Future
I don't think this is actually related to tables. With a minimal HTML document like
<html><body>&#810;</body></html>
nothing is displayed. On the other hand, if I precede the combining character with something:
<html><body>A&#810;</body></html>
an A appears (with a "combining bridge below").

So, the problem (?) seems to be that if the combining character exists alone (which it never does in normal text, I suppose), it is not rendered at all. Therefore it sounds rather logical that the TD is also rendered as empty.

Whether the behavior described above is correct or not escapes me; I'm not a Unicode specialist. But if it's incorrect, then this bug shouldn't be in HTMLTables, and if it's correct, then this is probably invalid.

The preceding is written according to my test results with Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:0.9.8) Gecko/20020204 (unfortunately I can't access a newer build right now).
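For reference, the two test cases can be checked outside the browser. The following is only a rough sketch using Python's standard library; the helper name starts_with_combining_mark is made up for illustration and has nothing to do with Mozilla's code. It decodes the numeric entity and asks whether the resulting text begins with a combining mark (general category M*), which is the situation where nothing gets displayed.

```python
import html
import unicodedata


def starts_with_combining_mark(text: str) -> bool:
    """Return True if text is non-empty and its first code point is a mark.

    Hypothetical helper for illustration: "mark" here means a Unicode
    general category starting with "M" (Mn, Mc or Me).
    """
    return bool(text) and unicodedata.category(text[0]).startswith("M")


# The two document bodies from the test above, with entities decoded.
alone = html.unescape("&#810;")       # U+032A with nothing before it
with_base = html.unescape("A&#810;")  # same mark, preceded by "A"

print(unicodedata.name(alone))                # COMBINING BRIDGE BELOW
print(starts_with_combining_mark(alone))      # True
print(starts_with_combining_mark(with_base))  # False
```

This says nothing about what the layout code does internally; it only confirms that &#810; is a combining mark and that the first test case has no base character in front of it.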
Surprisingly enough, my Mozilla just got upgraded to "Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:0.9.9+) Gecko/20020324". The previous comments still apply.
AIUI, combining characters have to follow base characters. They cannot stand on their own.
mass reassign to default owner
Assignee: karnaze → table
Status: ASSIGNED → NEW
QA Contact: amar → madhur
Target Milestone: Future → ---
Target Milestone: --- → Future
Combining characters *are* allowed "on their own" in a way, by making the base character a space, which can be tricky if you're using HTML entities for the combining character. I'm curious to know if that's even possible.
A combining character on its own may be treated as a base character according to Unicode 3.0, section 3.5, D14, and it is definitely considered defective according to D17a. So such a Web page is invalid. I think we should treat it as a base character, as that is the least confusing behaviour and the only behaviour suggested in the spec.
Summary: many "combining" unicode characters make <td>s act empty → should treat defective combining characters as base characters
->Fonts and Text
Assignee: table → font
Component: Layout: Tables → Layout: Fonts and Text
QA Contact: madhur → ian
David Nesting: Yes, it is possible, but subject to bug 202285.
smontagu found further text in section 5.14, "Fallback Rendering", that says "Defective combining character sequences should be rendered as if they had a space as a base character", so in fact this is a SHOULD, not a MAY.
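To make the quoted rule concrete, here is a rough sketch in Python of what "rendered as if they had a space as a base character" could mean at the text level. It is not a proposed patch and does not reflect how Gecko handles text. The function names are hypothetical, and the test for "defective" is a simplifying assumption: a mark counts as defective if it is at the start of the text or follows a non-graphic character (anything outside categories L, M, N, P, S and Zs).

```python
import unicodedata

# Assumption for this sketch: "graphic" means general category
# L*, M*, N*, P*, S* or Zs; everything else cannot carry a mark.
GRAPHIC_MAJOR_CLASSES = {"L", "M", "N", "P", "S"}


def is_graphic(ch: str) -> bool:
    cat = unicodedata.category(ch)
    return cat[0] in GRAPHIC_MAJOR_CLASSES or cat == "Zs"


def is_mark(ch: str) -> bool:
    return unicodedata.category(ch).startswith("M")


def add_fallback_bases(text: str, base: str = " ") -> str:
    """Insert `base` before each combining sequence that lacks a base.

    Hypothetical helper: a mark is treated as starting a defective
    sequence when nothing precedes it or the preceding character is
    not graphic. Later marks in the same run are left alone because
    the preceding mark is itself graphic.
    """
    out = []
    prev = None
    for ch in text:
        if is_mark(ch) and (prev is None or not is_graphic(prev)):
            out.append(base)
        out.append(ch)
        prev = ch
    return "".join(out)


print(repr(add_fallback_bases("\u032a")))   # ' \u032a' (space inserted)
print(repr(add_fallback_bases("A\u032a")))  # 'A\u032a' (unchanged)
```

The base is a parameter because a plain space at the start of HTML content may be subject to whitespace handling; which character to use in practice is left open here.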
Assignee: layout.fonts-and-text → nobody
QA Contact: ian → layout.fonts-and-text
Severity: normal → S3