fails to break lines at Ethiopic word space mark

RESOLVED FIXED in Firefox 57



Layout: Text
5 years ago
4 months ago


(Reporter: Steve White, Assigned: mats)


17 Branch

Firefox Tracking Flags

(firefox57 fixed)



(2 attachments)



5 years ago
Created attachment 685081 [details]

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:17.0) Gecko/17.0 Firefox/17.0
Build ID: 20121120042814

Steps to reproduce:

View page with text in Ethiopic script with words separated by U+1361 "Ethiopic word space".  E.g. the attachment, or

Actual results:

Lines whose words are separated by such characters are not broken; they simply run off the edge of the window.

Expected results:

It should also break lines at the following Ethiopic characters:
U+1360 section mark
U+1362 full stop
U+1363 comma
U+1364 semicolon
U+1365 colon
U+1366 preface colon
U+1367 question mark
U+1368 paragraph separator

(It's not clear to me whether the paragraph separator should always start a new line. In the above page, they put a newline after each instance of that character, so maybe it shouldn't.)

Note that other Unicode script ranges also contain line-breaking punctuation: consult UnicodeData.txt.
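
For anyone who wants to double-check the code points above without opening UnicodeData.txt by hand, here is a minimal sketch using Python's standard `unicodedata` module, which is built from the Unicode Character Database. The helper name `describe` is made up for illustration. Note that this module reports character names and general categories only; it does not expose the Line_Break property, so it cannot by itself say where a break is allowed.

```python
import unicodedata


def describe(first, last):
    """Return (code point label, character name) pairs for an inclusive range."""
    rows = []
    for cp in range(first, last + 1):
        # Fall back to a placeholder if this Python's Unicode tables
        # have no name for the code point.
        name = unicodedata.name(chr(cp), "<no name>")
        rows.append((f"U+{cp:04X}", name))
    return rows


# The Ethiopic punctuation discussed in this bug sits in U+1360..U+1368;
# U+1369 is included to show what follows the punctuation run.
for label, name in describe(0x1360, 0x1369):
    print(label, name)
```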

Also note: Gedit on the same system (Ubuntu) does break lines properly.
Component: Untriaged → Layout: Text
Product: Firefox → Core


5 years ago
Attachment #685081 - Attachment mime type: text/plain → text/html

Comment 1

4 months ago
Created attachment 8902492 [details] [diff] [review]

Not sure if this is the right fix but it seems to work.

(I'm also moving the check for OGHAM SPACE MARK later than
EM SPACE etc since I suspect it's less common.)
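
To illustrate the idea for readers who don't want to open the diff: the sketch below is NOT the attached patch (which is against Gecko's C++ code). It is a hypothetical Python model of a "counts as a space" predicate with the cheap, common checks first and the rarer ones (ETHIOPIC WORDSPACE, OGHAM SPACE MARK) last, plus a toy greedy wrapper to show the effect on line breaking. The names `is_break_space` and `wrap`, and the exact set of code points tested, are assumptions made for this example.

```python
def is_break_space(ch):
    """Hypothetical predicate: does this character allow a line break after it?"""
    cp = ord(ch)
    return (
        cp == 0x0020                 # SPACE, by far the most common case
        or cp == 0x000A              # LINE FEED
        or cp == 0x0009              # CHARACTER TABULATION
        or 0x2000 <= cp <= 0x2006    # EN QUAD .. SIX-PER-EM SPACE (includes EM SPACE)
        or 0x2008 <= cp <= 0x200B    # PUNCTUATION SPACE .. ZERO WIDTH SPACE
        or cp == 0x1361              # ETHIOPIC WORDSPACE (the case this bug is about)
        or cp == 0x1680              # OGHAM SPACE MARK, assumed rare, so checked late
    )


def wrap(text, width):
    """Toy greedy wrapper: break after the last space-like character that fits."""
    lines = []
    start = 0         # index where the current line begins
    last_break = -1   # index just past the most recent break opportunity
    for i, ch in enumerate(text):
        # If adding this character overflows the line, break at the most
        # recent opportunity on the current line (if there is one).
        if i + 1 - start > width and last_break > start:
            lines.append(text[start:last_break])
            start = last_break
        if is_break_space(ch):
            last_break = i + 1
    lines.append(text[start:])
    return lines


# Four copies of a three-letter Ethiopic word joined by U+1361.
word = "\u1230\u120B\u121D"
sample = "\u1361".join([word] * 4)
for line in wrap(sample, 8):
    print(line)
```

If the U+1361 check is removed from `is_break_space`, `wrap` finds no break opportunity in `sample` and returns it as a single overlong line, which is the behaviour described in comment 0.
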
Assignee: nobody → mats
Ever confirmed: true
Attachment #8902492 - Flags: review?(jfkthame)

Comment 2

Comment on attachment 8902492 [details] [diff] [review]

Review of attachment 8902492 [details] [diff] [review]:


(FWIW, it's less clear to me whether we should do anything further to handle the other characters listed in comment 0. In the Amharic UDHR document, for example, any occurrences of these seem to be followed by either a newline or an Ethiopic wordspace, which provides the desired line-break. So let's do this, and wait for a better understanding before considering any followup that may be appropriate.)
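
The observation above can be checked mechanically on any given text. The following is a rough sketch, with made-up names (`provides_break`, `punctuation_without_following_break`), that reports every position where one of the other Ethiopic punctuation marks is NOT immediately followed by an Ethiopic wordspace, ordinary whitespace, or the end of the text. An empty result for a document means the wordspace handling alone already supplies a break after each mark. The sample string is synthetic, not taken from the UDHR document.

```python
ETHIOPIC_WORDSPACE = "\u1361"

# U+1360 and U+1362..U+1368: the other Ethiopic punctuation marks from
# comment 0 (section mark, full stop through paragraph separator).
ETHIOPIC_PUNCTUATION = {chr(cp) for cp in range(0x1360, 0x1369)} - {ETHIOPIC_WORDSPACE}


def provides_break(ch):
    """True if a break is available after ch once U+1361 counts as a space."""
    return ch == ETHIOPIC_WORDSPACE or ch.isspace()


def punctuation_without_following_break(text):
    """Indices of Ethiopic punctuation not followed by a break-providing character."""
    positions = []
    for i, ch in enumerate(text):
        if ch not in ETHIOPIC_PUNCTUATION:
            continue
        is_last = i + 1 == len(text)
        if not is_last and not provides_break(text[i + 1]):
            positions.append(i)
    return positions


# Synthetic sample: a full stop followed by a wordspace (fine), then a
# comma followed directly by a letter (reported).
word = "\u1230\u120B\u121D"
sample = word + "\u1361" + word + "\u1362\u1361" + word + "\u1363" + word
print(punctuation_without_following_break(sample))
```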
Attachment #8902492 - Flags: review?(jfkthame) → review+

Comment 3

4 months ago
FWIW, none of the other characters are a line-break opportunity in Chrome.

Comment 4

4 months ago
Pushed by
Make unicode ETHIOPIC WORDSPACE count as a space character.  r=jfkthame

Comment 5

(In reply to Mats Palmgren (:mats) from comment #3)
> FWIW, none of the other characters are a line-break opportunity in Chrome.

Fair enough. Thanks for checking!

Comment 6

4 months ago
I concur about the other punctuation marks I listed -- I just got carried away.

Only U+1361 is listed on as a "line break opportunity".
In Ge'ez text I've found, the other punctuation is always followed by a space.

Could we see an image or PDF showing how the fix handles the line in the example text?
Flags: needinfo?(mats)

Comment 7

4 months ago
It's what you'd expect :-)  The fix should be available in Nightly
in a few days so you can verify.
Flags: needinfo?(mats)

Comment 8

4 months ago
Pushed by
Make unicode ETHIOPIC WORDSPACE count as a space character: remove test failure expecation. r=wpt-expectation-update
Last Resolved: 4 months ago
status-firefox57: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57