Lines not breaking at Tibetan Tsheg/Tsek (U+0F0B)

Status

RESOLVED INCOMPLETE
11 years ago
8 years ago

People

(Reporter: edelsteinj, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [CLOSEME 2010-11-01])

Attachments

(1 attachment)

(Reporter)

Description

11 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6

This is similar to the line-break bug reported in https://bugzilla.mozilla.org/show_bug.cgi?id=95067, but different in that it deals specifically with foreign languages (in this case Tibetan).  

The character U+0F0B, the Tibetan intersyllabic tsheg (http://www.fileformat.info/info/unicode/char/0f0b/index.htm), is designed to provide a break opportunity.  Currently FF doesn't break at it at all.  Tibetan uses the tsheg in lieu of spaces, resulting in sentences and paragraphs that appear to be long unbroken words.  

In width-constrained layouts such as fixed-width tables or CSS divs, the paragraph is either truncated at the end of the line or overflows the boundary of the div, forcing the user to scroll horizontally to read content that may by then be overlapping other content.

"white-space: -moz-pre-wrap; !important" does not work because there *IS* no white space, as the tsheg is an actual character.

I'm sure that there are many other such situations--Burmese and Khmer have similar issues but I don't know what their break character is if any.  But the tsheg is specifically a break character and is even typed with the space bar on Tibetan keyboard layouts.  It should behave in the browser as a space. The only time it wouldn't break is before a shad (U+0F0D), but in such cases theoretically one would type a final tsheg (U+0F0C).  This doesn't always happen though.
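The rule described above (break after a tsheg, but not when a shad follows) can be sketched in a few lines of JavaScript. This is only an illustration of the reporter's description, not Gecko's line breaker; the function name `tshegBreakOpportunities` is hypothetical, and the sketch ignores every other line-breaking rule.

```javascript
// Code points as given in the description above.
const TSHEG = "\u0F0B"; // intersyllabic tsheg: a break is allowed after it
const SHAD = "\u0F0D";  // shad: no break between a tsheg and a following shad

// Hypothetical helper: returns the string indices at which a line break
// would be allowed, i.e. the position right after each tsheg that is
// neither the last character nor followed by a shad.
function tshegBreakOpportunities(text) {
  const breaks = [];
  for (let i = 0; i < text.length; i++) {
    const isTsheg = text[i] === TSHEG;
    const atEnd = i + 1 >= text.length;
    if (isTsheg && !atEnd && text[i + 1] !== SHAD) {
      breaks.push(i + 1);
    }
  }
  return breaks;
}

// Usage: list the break positions in a short Tibetan sample.
console.log(tshegBreakOpportunities("བོད་ཤར་ཕྱོགས་ཨ་མཆོག་"));
```

Indexing by UTF-16 code unit is safe here because all Tibetan characters are in the Basic Multilingual Plane.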

Reproducible: Always

Steps to Reproduce:
1. Take any long section of Unicode Tibetan text, such as བོད་ཤར་ཕྱོགས་ཨ་མཆོག་ཚད་ཉིན་དགོན་པ་ཡིན་པའི་གྲྭ་བཟོད་པ་རྣམ་རྒྱལ་ལོ་གཉིས་དང་ཟླ་བ་གཉིས་རིང་ཨ་མཆོག་ས་ཁུལ་ནས་རྒྱ་གར་བར་རྐྱང་ཕྱག་འཚལ་འགྲོ་བཞིན་ཡོད་པ་ཁོང་དེང་སྐབས་བལ་ཡུལ་དུ་འབྱོར་ཡོད་པ་རེད།སྐབས་དེར་ཨེ་ཤེ་ཡ་རང་དབང་རླུང་འཕྲིན་ཁང་གི་བལ་ཡུལ་ཁུལ་གྱི་གསར་འགོད་པས་ཁོང་ལ་རྐྱང་ཕྱག་འཚལ་དགོས་པའི་དམིགས་ཡུལ་དང་དགོས་པ་བཅས་ཀྱི་སྐོར་གནས་འདྲི་ཞུས་པ་རེད༎
2. Throw it in a DIV that's styled with a width of 400px (give it a nice border so you can really see it mess up).
3. Watch and enjoy.
Actual Results:  
Depending on the z-index of the DIV, it either leaks out of the DIV, crawls over the border and extends off the page or else it hits the border and truncates completely.

Expected Results:  
It should break after the tsheg symbol closest to the right border of the DIV and wrap to the next line (Tibetan is LTR).
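For anyone who wants to reproduce this quickly, the steps above can be sketched as a small JavaScript function that generates the test page. The function name `buildTestPage`, the class name `box`, and the red border are assumptions made for illustration; only the 400px width and the border come from the steps. The sketch assumes the text contains no HTML markup characters.

```javascript
// Hypothetical helper: builds a minimal HTML page that places the given
// Tibetan text inside a bordered DIV with a fixed width of 400px.
function buildTestPage(tibetanText) {
  return [
    "<!DOCTYPE html>",
    '<html lang="bo">',
    "<head>",
    '<meta charset="utf-8">',
    "<title>Tibetan line-break test</title>",
    "<style>",
    "  .box { width: 400px; border: 2px solid red; }",
    "</style>",
    "</head>",
    "<body>",
    `<div class="box">${tibetanText}</div>`,
    "</body>",
    "</html>",
  ].join("\n");
}

// Usage: print the page, save it to a file, and open it in the browser
// under test. A longer passage makes the overflow easier to see.
console.log(buildTestPage("བོད་ཤར་ཕྱོགས་ཨ་མཆོག་ཚད་ཉིན་དགོན་པ་ཡིན་པའི་"));
```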

Comment 1

11 years ago
I can confirm this bug.

See, e.g., <http://www.library.gov.bt/index-DZ.html> and compare Tibetan script line breaking in Firefox 2.x (poor) and IE 7.x (good).

[The Jomolhari Tibetan script font for rendering the page properly can be found at: <https://savannah.nongnu.org/projects/free-tibetan/>]

The *primary* line break opportunity in Tibetan script text should be immediately after character U+0F0B. This character is the word separator in Tibetan and Dzongkha (similar to the inter-word space in English).

- Chris Fynn
National Library of Bhutan

Comment 2

11 years ago
A work around for this bug is to include <wbr> after every U+0f0b character in an HTML page. I've now done this at <http://www.library.gov.bt/index-DZ.html>. This work around is only necessary to get Dzongkha and Tibetan lines to wrap in Mozilla browsers. 

IE does not have this problem.

When you add <wbr> the (X)HTML no longer validates as <wbr> is not a standard tag.
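A minimal sketch of this workaround in JavaScript, assuming the input is HTML body text with no tsheg inside attribute values or inside elements such as the title, where a tag would be invalid. The function name `addWbrAfterTsheg` is hypothetical.

```javascript
// Hypothetical helper: inserts <wbr> after every tsheg (U+0F0B).
// The negative lookahead skips a tsheg that already has a <wbr> after it,
// so running the function twice does not double the tags.
function addWbrAfterTsheg(html) {
  return html.replace(/\u0F0B(?!<wbr)/g, "\u0F0B<wbr>");
}

// Usage: every tsheg in the sample gains a trailing <wbr>.
console.log(addWbrAfterTsheg("བོད་ཤར་ཕྱོགས་"));
```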

- Chris   
(Reporter)

Comment 3

11 years ago
(In reply to comment #2)

Chris,

Your work around looks great; I've also experimented with adding U+200B after U+0F0B.  That looks better in the code, but in IE6 it actually adds a physical space instead of a zero-width space (many of our site visitors have IE6).

A more pressing question is how are you actually inserting your <wbr> string into the code?  Are you doing a search and replace on each file before uploading, is it being handled by a content management system, or is it being forced in by a javascript like tibetanportal.com is doing?
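For the client-side option, a script could walk the text nodes of the page and add a `<wbr>` element after each tsheg. The sketch below is an assumption about how such a script might look, not the code any particular site uses; `splitAfterTsheg` and `insertWbrInDom` are hypothetical names, and the script is meant to run after the page has loaded.

```javascript
const TSHEG = "\u0F0B";

// Split text into pieces so that each tsheg ends a piece.
function splitAfterTsheg(text) {
  return text.match(/[^\u0F0B]*\u0F0B|[^\u0F0B]+/g) || [];
}

// Replace each text node under `root` that contains a tsheg with a
// fragment of text nodes, adding a <wbr> element after each tsheg.
function insertWbrInDom(root) {
  const doc = root.ownerDocument;
  const walker = doc.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const textNodes = [];
  while (walker.nextNode()) textNodes.push(walker.currentNode);

  for (const node of textNodes) {
    const parentName = node.parentNode.nodeName;
    // Leave script and style contents untouched.
    if (parentName === "SCRIPT" || parentName === "STYLE") continue;
    if (!node.nodeValue.includes(TSHEG)) continue;

    const fragment = doc.createDocumentFragment();
    for (const piece of splitAfterTsheg(node.nodeValue)) {
      fragment.appendChild(doc.createTextNode(piece));
      if (piece.endsWith(TSHEG)) fragment.appendChild(doc.createElement("wbr"));
    }
    node.parentNode.replaceChild(fragment, node);
  }
}

// Only touch the DOM when running in a browser with a loaded body.
if (typeof document !== "undefined" && document.body) {
  insertWbrInDom(document.body);
}
```

Collecting the text nodes before modifying them keeps the tree walker from visiting the newly created nodes.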

Joshua

> A work around for this bug is to include <wbr> after every U+0f0b character in
> an HTML page. I've now done this at <http://www.library.gov.bt/index-DZ.html>.
> This work around is only necessary to get Dzongkha and Tibetan lines to wrap in
> Mozilla browsers. 
> 
> IE does not have this problem.
> 
> When you add <wbr> the (X)HTML no longer validates as <wbr> is not a standard
> tag.
> 
> - Chris   
> 

Comment 4

11 years ago
Hi Joshua

The www.library.gov.bt site is deliberately pretty low tech - I'm doing a simple search and replace to insert the <wbr /> after every U+0F0B so lines of Dzongkha (Tibetan script) text wrap properly in Firefox. 

I know it looks ugly in the source, but I guess once this bug has been fixed for a while all the <wbr /> tags can be removed. I also tried using U+200B and ran into the same problem you did. 
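Removing the workaround later could be done with the reverse search and replace. The following is a sketch under the assumption that the only markers to remove are `<wbr>`, `<wbr />`, or U+200B placed directly after a tsheg; the function name `stripTshegWorkarounds` is hypothetical.

```javascript
// Hypothetical helper: removes any run of <wbr>, <wbr/>, <wbr /> tags or
// zero-width spaces (U+200B) that directly follows a tsheg (U+0F0B),
// leaving the tsheg itself in place.
function stripTshegWorkarounds(html) {
  return html.replace(/\u0F0B(?:<wbr\s*\/?>|\u200B)+/gi, "\u0F0B");
}

// Usage: both styles of workaround are removed from the sample.
console.log(stripTshegWorkarounds("བོད་<wbr />ཤར་\u200Bཕྱོགས་<wbr>"));
```

Markers that do not follow a tsheg are left alone, so hand-placed `<wbr>` tags elsewhere in the page survive.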

- Chris

Comment 5

11 years ago
Bug 394954 seems to be fixed in Firefox 3.0, though I have only tested this on Windows XP. Could someone confirm that this bug is also fixed on other platforms?

- Chris 

Comment 6

11 years ago
Created attachment 326253 [details]
Tibetan Line Break test for Bug 394954 

Test for Bug 394954

This is a mass search for bugs that are in the Firefox General component, are
UNCO, and have not been changed for 800 days and have an unspecified version. 

Reporter, can you please update to Firefox 3.6.10, create a fresh profile,
http://support.mozilla.com/en-US/kb/managing+profiles, and test again. If you
still see the bug, please update this bug. If the issue is gone, please set the
resolution to RESOLVED > WORKSFORME.
Whiteboard: [CLOSEME 2010-11-01]

No reply from reporter, so resolving as INCOMPLETE. Please retest with Firefox 3.6.12 or later and a new profile (http://support.mozilla.com/kb/Managing+profiles). If you continue to see this issue with the newest Firefox and a new profile, then please comment on this bug.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → INCOMPLETE