40882 - should combine characters (diacritic and base) separated by element (frame) boundaries

Reporter

Description

•

24 years ago

From Bugzilla Helper:
User-Agent: Mozilla/4.61 [en] (WinNT; I)
BuildID:    2000030708

A diacritic character should be displayed under its base, not after it.

The HTML below demonstrates the problem. A diacritic (Hebrew vowel QAMATS) is displayed 
adjacent to the base letter (Hebrew letter ALEF) instead of under it:

<HTML>
  <HEAD></HEAD>
  <BODY>
    <P><font face="Arial"><B>&#x05D0;</B>&#x05B8;</font>
    </P>
  </BODY>
</HTML>

When the vowel and the letter are in the same tag, the behaviour is 
correct.


Reproducible: Always

I expect similar results for any set of characters that form a 
ligature.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 1

•

24 years ago

Moving to Internationalization, although this could be a purely layout problem.

Assignee: asadotzler → ftang

Component: Browser-General → Internationalization

QA Contact: jelwell → teruko

Frank Tang

Comment 2

•

24 years ago

This is an issue about rendering combining marks across frames. 
Since this particular issue is related to Hebrew, reassigning to mkaply@us.ibm.com.

Assignee: ftang → mkaply

Status: UNCONFIRMED → NEW

Ever confirmed: true

Mike Kaply [:mkaply]

Comment 3

•

24 years ago

Sorry everyone is getting this again - I wanted Lina's comments in the bug.


One remark: This issue doesn't seem to be specific to Hebrew. 

Some quotes from the Unicode Standard: 
"Some scripts, such as Hebrew, Arabic, and the scripts of India and Southeast 
Asia, have combining characters indicated in the charts in relation to dotted 
circles to show their position relative to the base character."
"Diacritics are the principal class of combining characters used with European 
alphabets."

I tried the following test case:

<p>&#x0061;&#x030B;</p>
<p><b>&#x0061;</b>&#x030B;</p>

Each paragraph contains the same sequence of 2 characters (the Latin letter "a" 
and the diacritic "double acute accent", used in Hungarian), but in the 1st 
paragraph they appear in the same token, and in the 2nd in separate tokens. 
Although I don't know whether this combination is correct linguistically, it is 
reasonable to expect that it should form the same shape. However, I could see 
that these 2 paragraphs were rendered differently.
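
One way to confirm that the two paragraphs carry the same character sequence is to strip the markup and compare the text. The sketch below is only an illustration using Python's standard library; the helper name `text_content` is hypothetical and is not part of any Mozilla code.

```python
import unicodedata
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects character data and ignores all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_content(markup):
    """Hypothetical helper: return the text of `markup` with tags removed."""
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


same_token = text_content("<p>&#x0061;&#x030B;</p>")
split_tokens = text_content("<p><b>&#x0061;</b>&#x030B;</p>")

# List the code points of the first paragraph, then compare both paragraphs.
print([f"U+{ord(ch):04X}" for ch in same_token])
print(same_token == split_tokens)

# A non-zero canonical combining class marks U+030B as a combining
# character that belongs with the preceding base character.
print(unicodedata.name(same_token[1]), unicodedata.combining(same_token[1]))
```

Since the text content is identical, any difference in rendering comes from how the text is split into tokens, not from the characters themselves.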

Also, this problem is not specific to combining characters. Ligatures behave 
similarly; for example, the Arabic ligature LamAlef:

<p>&#x0644;&#x0627;</p>
<p><b>&#x0644;</b>&#x0627;</p>

(Only the 1st paragraph is displayed properly.)
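
The Lam-Alef case can be checked against the Unicode character database in the same spirit. The sketch below assumes only Python's `unicodedata` module. It shows that the compatibility character U+FEFB (the isolated Lam-Alef ligature) decomposes to the same two letters used in the test case, so the ligature is a property of the character sequence rather than of the markup around it.

```python
import unicodedata

lam_alef = "\u0644\u0627"  # the letter sequence used in both paragraphs
ligature = "\uFEFB"        # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM

print(unicodedata.name(ligature))
print(unicodedata.decomposition(ligature))

# Compatibility decomposition maps the ligature back to LAM + ALEF.
decomposed = unicodedata.normalize("NFKD", ligature)
print(decomposed == lam_alef)
```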

Mike Kaply [:mkaply]

Updated

•

24 years ago

Status: NEW → ASSIGNED

Teruko Kobayashi

Comment 4

•

24 years ago

Changed QA contact to andreasb@netscape.com.

QA Contact: teruko → andreasb

Mike Kaply [:mkaply]

Comment 5

•

23 years ago

This bug belongs in layout based on the new testcase from Lina.

Layout is not combining characters across frames.

Again, here is a non-Hebrew testcase:

<p>&#x0061;&#x030B;</p>
<p><b>&#x0061;</b>&#x030B;</p>

Assignee: mkaply → karnaze

Status: ASSIGNED → NEW

Component: Internationalization → Layout

QA Contact: andreasb → petersen

Lina Kemmel

Reporter

Comment 6

•

23 years ago

This bug is fixed for Hebrew combining marks (the fix is marked with #ifdef FIX_FOR_BUG_40882 
in layout/html/base/src/nsLineLayout.cpp and 
layout/base/src/nsBidiPresUtils.cpp).
However, it could be preferable to use OpenType tables, as Erik suggested. 
That would also fix a second problem: wrong positioning of 
combining marks (Hebrew, at least) on a non-bidi platform.

anthonyd

Comment 7

•

23 years ago

not a table bug. reassigning

Assignee: karnaze → attinasi

Kevin McCluskey (gone)

Updated

•

23 years ago

Target Milestone: --- → Future

Christopher Hoess (gone)

Comment 8

•

21 years ago

->fonts & text

Assignee: attinasi → font

Component: Layout → Layout: Fonts and Text

QA Contact: petersen → ian

Summary: Diacritic character and its base character, when contained in separate HTML tags, are positioned incorrectly → Need to combine characters across frames

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

21 years ago

Summary: Need to combine characters across frames → should combine characters (diacritic and base) separated by element (frame) boundaries

Jungshik Shin

Comment 9

•

21 years ago

we need a generic grapheme cluster breaker/iterator that works across 'frames'.
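
As a rough illustration of what such an iterator would have to do, the sketch below walks a list of text runs (standing in for frames) and groups each base character with the combining marks that follow it, even when they sit in a later run. It is a deliberate simplification: it treats only characters in the Unicode general categories Mn, Mc and Me as cluster extenders, not the full grapheme cluster rules of UAX #29, and the names `Run` and `clusters_across_runs` are hypothetical, not Gecko APIs.

```python
import unicodedata
from typing import Iterator, List, NamedTuple, Tuple


class Run(NamedTuple):
    """Hypothetical stand-in for a text frame: a style label plus its text."""
    style: str
    text: str


def is_mark(ch: str) -> bool:
    # Simplification: general categories Mn, Mc, Me extend a cluster.
    return unicodedata.category(ch).startswith("M")


def clusters_across_runs(runs: List[Run]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (cluster_text, styles_involved), ignoring run boundaries."""
    text = ""
    styles: List[str] = []
    for run in runs:
        for ch in run.text:
            if text and not is_mark(ch):
                # A new base character closes the pending cluster.
                yield text, styles
                text, styles = "", []
            text += ch
            if run.style not in styles:
                styles.append(run.style)
    if text:
        yield text, styles


# The a + U+030B test case from the earlier comments: the base is in a
# bold run and the mark is in the following plain run.
runs = [Run("bold", "a"), Run("normal", "\u030B"), Run("normal", "b")]
for cluster, styles in clusters_across_runs(runs):
    print([f"U+{ord(c):04X}" for c in cluster], styles)
```

A cluster whose style list has more than one entry is the case this bug is about: the text shaper would need to see the whole cluster even though its characters belong to different frames.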

Depends on: grapheme-breaker

Keywords: intl

fantasai

Comment 10

•

19 years ago

See also http://lists.w3.org/Archives/Public/www-style/2005Jun/0057.html
Several testcases were posted to the Unicode list:

  http://www.unics.uni-hannover.de/nhtcapri/temp/nastaliq.html
    (Arabic joining is also broken.)
  http://www.qsm.co.il/Hebrew/HebrewTest/ColorHtml.htm
  http://www.qsm.co.il/Hebrew/HebrewTest/ColorCss.htm

OS: Windows NT → All

Hardware: PC → All

Simon Montagu :smontagu

Comment 11

•

19 years ago

*** Bug 297707 has been marked as a duplicate of this bug. ***

Kevin Brosnan

Comment 12

•

17 years ago

Gecko 1.9 has some improvements. The testcases from comment 0 and the duped bug work for me. The testcases in comment 10 mostly work, apart from the black points in the Hebrew tests. The testcases in comment 5 are still broken.

Simon Montagu :smontagu

Comment 13

•

17 years ago

In Gecko 1.9 I think there are two cases that don't work:
1) when the base character and the diacritics are in different fonts (including bold and non-bold). There might be room for improvement here, but perfection is unattainable.
2) if the base character and the diacritic have different colors, the diacritic is forced to the color of the base character.

Phil Ringnalda (:philor)

Updated

•

15 years ago

Assignee: layout.fonts-and-text → nobody

QA Contact: ian → layout.fonts-and-text

BMO Automation

Updated

•

2 years ago

Severity: minor → S4


Categories

(Core :: Layout: Text and Fonts, defect, P3)

Tracking

()

People

(Reporter: lkemmel, Unassigned)

References

Details

(Keywords: intl)

Crash Data

Security

(public)
