Closed Bug 815077 Opened 12 years ago Closed 7 years ago

fails to break lines at Ethiopic word space mark

Categories

(Core :: Layout: Text and Fonts, defect)

17 Branch
x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: Stevan_White, Assigned: MatsPalmgren_bugz)

Details

Attachments

(2 files)

Attached file ethi-line-brk.html
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:17.0) Gecko/17.0 Firefox/17.0
Build ID: 20121120042814

Steps to reproduce:

View a page with text in Ethiopic script whose words are separated by U+1361 "Ethiopic wordspace", e.g. the attachment, or http://unicode.org/udhr/d/udhr_amh.html

Actual results:

Lines whose words are separated by such characters are not broken, and simply run out of the window.

Expected results:

Lines should be broken at U+1361. They should also be broken at the Ethiopic characters
U+1360 section mark
U+1362 full stop
U+1363 comma
U+1364 semicolon
U+1365 colon
U+1366 preface colon
U+1367 question mark
U+1368 paragraph separator
(It's not clear to me whether a paragraph separator should always start a new line. In the above page, they put a newline after each instance of that character, so maybe it shouldn't.)

Note that there are other Unicode script ranges that contain line-breaking punctuation: consult UnicodeData.txt.

Also note: Gedit on the same system (Ubuntu) does break lines properly.
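For anyone who wants a reproduction page without the attachment, here is a minimal sketch that generates one. The function name `build_test_page` and the sample word are made up for illustration; the only point is that the paragraph contains no ASCII spaces, so U+1361 is the sole candidate break opportunity.

```python
# Three Ethiopic syllables (sa, laa, me); any Ethiopic word would do here.
SAMPLE_WORD = "\u1230\u120b\u121d"
ETHIOPIC_WORDSPACE = "\u1361"


def build_test_page(word_count: int = 200) -> str:
    """Return an HTML page whose paragraph text is separated only by U+1361.

    Hypothetical helper for reproducing the bug; it is not the attached
    ethi-line-brk.html.
    """
    # Join the words with the Ethiopic wordspace and nothing else, so a
    # browser that ignores U+1361 has nowhere to break the line.
    text = ETHIOPIC_WORDSPACE.join([SAMPLE_WORD] * word_count)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        "<title>Ethiopic wordspace line-break test</title></head>\n"
        f"<body><p>{text}</p></body></html>\n"
    )
```

Writing the returned string to a file with `encoding="utf-8"` and opening it in a narrow window should show either a wrapped paragraph or a single line that overflows.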
Component: Untriaged → Layout: Text
Product: Firefox → Core
Attachment #685081 - Attachment mime type: text/plain → text/html
Attached patch fix
Not sure if this is the right fix, but it seems to work. (I'm also moving the check for OGHAM SPACE MARK after the checks for EM SPACE etc., since I suspect it's less common.)
Try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5149a65969edf31908541145877579f66f2f0032
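The idea behind the patch can be illustrated roughly as follows. This is a sketch in Python rather than the actual Gecko C++ change: the name `is_break_space` and the exact set of code points are assumptions made for illustration, and only the inclusion of U+1361 and the "common checks first" ordering come from the comment above.

```python
ETHIOPIC_WORDSPACE = "\u1361"
OGHAM_SPACE_MARK = "\u1680"
MEDIUM_MATHEMATICAL_SPACE = "\u205f"


def is_break_space(ch: str) -> bool:
    """Hypothetical predicate: does this single character act as a
    breakable space? Not Gecko's real function."""
    # Most common cases first: ASCII space and control whitespace.
    if ch in (" ", "\t", "\n", "\r"):
        return True
    # EN QUAD through ZERO WIDTH SPACE, skipping U+2007 FIGURE SPACE,
    # which is a non-breaking character.
    if "\u2000" <= ch <= "\u2006" or "\u2008" <= ch <= "\u200b":
        return True
    # Rarer characters are checked last, including the Ethiopic wordspace
    # that this bug is about.
    return ch in (ETHIOPIC_WORDSPACE, OGHAM_SPACE_MARK, MEDIUM_MATHEMATICAL_SPACE)
```

Ordering the comparisons by expected frequency only matters for speed; the result is the same whichever check runs first.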
Assignee: nobody → mats
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #8902492 - Flags: review?(jfkthame)
Comment on attachment 8902492
fix

Review of attachment 8902492:

LGTM.

(FWIW, it's less clear to me whether we should do anything further to handle the other characters listed in comment 0. In the Amharic UDHR document, for example, any occurrences of these seem to be followed by either a newline or an Ethiopic wordspace, which provides the desired line-break. So let's do this, and wait for a better understanding before considering any followup that may be appropriate.)
Attachment #8902492 - Flags: review?(jfkthame) → review+
FWIW, none of the other characters are a line-break opportunity in Chrome.
Pushed by mpalmgren@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/073963897752 Make unicode ETHIOPIC WORDSPACE count as a space character. r=jfkthame
(In reply to Mats Palmgren (:mats) from comment #3)
> FWIW, none of the other characters are a line-break opportunity in Chrome.

Fair enough. Thanks for checking!
I concur about the other punctuation marks I listed -- I just got carried away. Only U+1361 is listed on http://www.unicode.org/reports/tr14/tr14-39.html as a "line break opportunity". In Ge'ez text I've found, the other punctuation is always followed by a space. Could we see an image or PDF showing how the fix handles the line in the example text?
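To make the expected behaviour concrete, here is a toy greedy wrapper. It is a simplification under stated assumptions: width is counted in characters rather than measured glyphs, a break is allowed only after a character in `break_after`, and the function name `wrap` is made up. It is not how Gecko or UAX #14 implementations work internally.

```python
def wrap(text: str, width: int, break_after: str = " \u1361") -> list[str]:
    """Greedily split text into lines of at most `width` characters,
    breaking only after characters listed in `break_after`.

    A chunk longer than `width` is left to overflow, as in a browser.
    """
    lines: list[str] = []
    current = ""  # the line being built
    chunk = ""    # characters since the last break opportunity

    def flush() -> None:
        nonlocal current, chunk
        # Start a new line if the chunk does not fit on the current one.
        if current and len(current) + len(chunk) > width:
            lines.append(current)
            current = ""
        current += chunk
        chunk = ""

    for ch in text:
        chunk += ch
        if ch in break_after:
            flush()
    if chunk:
        flush()
    if current:
        lines.append(current)
    return lines
```

Calling `wrap` with `break_after=" "` on text separated only by U+1361 returns a single overlong line, which mirrors the behaviour reported in comment 0; including U+1361 in `break_after` lets the same text wrap.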
Flags: needinfo?(mats)
It's what you'd expect :-) The fix should be available in Nightly in a few days so you can verify. http://nightly.mozilla.org/
Flags: needinfo?(mats)
Pushed by archaeopteryx@coole-files.de: https://hg.mozilla.org/integration/mozilla-inbound/rev/2ab09319c214 Make unicode ETHIOPIC WORDSPACE count as a space character: remove test failure expectation. r=wpt-expectation-update
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57