Bug 417899
Opened 16 years ago
Closed 16 years ago
Bonsai doesn't handle UTF-8 data
Categories
(Webtools Graveyard :: Bonsai, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 395003
People
(Reporter: zwnj, Assigned: tara)
References
(Blocks 1 open bug)
Details
As you can see on the page at the URL, the line-break algorithm doesn't count Unicode characters; it just counts raw bytes. This causes:
- Fewer non-ASCII characters on each line, and
- Broken UTF-8 sequences (like the last line break on the URL).
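To illustrate the difference described above, here is a minimal Python sketch (not Bonsai's actual code; the function names `split_bytes`, `split_chars`, and `is_valid_utf8` are made up for this example). It contrasts cutting a string every N raw bytes with cutting it every N code points, and shows that the byte-based cut can land in the middle of a multi-byte UTF-8 sequence.

```python
def split_bytes(text: str, width: int) -> list[bytes]:
    """Cut the UTF-8 encoding of text every `width` bytes (the buggy approach)."""
    data = text.encode("utf-8")
    return [data[i:i + width] for i in range(0, len(data), width)]


def split_chars(text: str, width: int) -> list[str]:
    """Cut text every `width` code points, so no UTF-8 sequence is split."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Four Persian letters; each one takes 2 bytes in UTF-8 (8 bytes total).
sample = "\u0633\u0644\u0627\u0645"

byte_chunks = split_bytes(sample, 3)
char_chunks = split_chars(sample, 3)

# The first byte chunk ends halfway through the second letter.
print([is_valid_utf8(c) for c in byte_chunks])
print(char_chunks)
```

Note that counting code points only avoids broken sequences; it says nothing about how wide the resulting line is on screen.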
Comment 1 (Reporter)•16 years ago
Also, the encoding of the page content is not set to UTF-8 either.
Before anyone is crazy enough to try to "fix" behnam's bug, please keep in mind the *goal* of this function, which is to get a fixed-length string.
1. 80 Chinese "characters" are generally twice as wide (physical width, not byte encoding) as 80 ASCII characters.
2. Some characters contribute no width (e.g. ZWNJ).
3. RTL markings (and pops) can't safely be split anyway.
I'd propose to only do splitting if all characters in a line are in the Latin-1 range of UTF-8. An alternative is to replace the line breaking with browser-requested wrapping, with each table row containing two table cells, each holding a single line.
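The proposal above could be sketched roughly as follows in Python. This is an illustration only, not Bonsai's code: `is_latin1`, `wrap_if_latin1`, and `display_width` are hypothetical names, the 80-column default is an assumption, and the width estimate is a crude approximation based on the standard `unicodedata` module (it ignores fonts, bidi controls, and many other cases).

```python
import unicodedata


def is_latin1(line: str) -> bool:
    """True if every code point in line is within Latin-1 (U+0000..U+00FF)."""
    return all(ord(ch) <= 0xFF for ch in line)


def wrap_if_latin1(line: str, width: int = 80) -> list[str]:
    """Split into fixed-length pieces only when the whole line is Latin-1.

    Any other line is returned unsplit, leaving wrapping to the browser.
    """
    if not is_latin1(line):
        return [line]
    return [line[i:i + width] for i in range(0, len(line), width)] or [""]


def display_width(line: str) -> int:
    """Rough column estimate: wide East Asian characters count as 2,
    combining marks and format characters (such as ZWNJ) count as 0."""
    total = 0
    for ch in line:
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue  # contributes no width
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total


print(wrap_if_latin1("a" * 5, 2))        # Latin-1 line: split every 2 characters
print(wrap_if_latin1("\u4e2d" * 5, 2))   # CJK line: left as a single piece
print(display_width("\u4e2d\u6587"))     # two wide characters
print(display_width("a\u200cb"))         # ZWNJ adds no width
```

The `display_width` helper is only there to show why a code-point count is not a reliable measure of line length, which is the point of items 1 and 2 in the comment.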
Updated•16 years ago
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Updated•8 years ago
Product: Webtools → Webtools Graveyard