Closed Bug 486917 Opened 16 years ago Closed 14 years ago

Uniform character width is an incorrect assumption

Categories

(Skywriter Graveyard :: Editor, defect, P5)

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: ehsan.akhgari, Assigned: ben)

Details

(Keywords: intl)

It seems like Bespin, using a fixed width font, assumes that any string with n characters is n*w pixels wide (w being the width of a single character). While this may be true for English text, it certainly isn't for scripts which include accent marks or ligatures. To give you an example (this looks exactly like Bespin's rendering because Bugzilla also uses fixed width fonts):
ل (U+0644 Arabic Letter Lam) + ا (U+0627 Arabic Letter Alef) = لا (U+FEFB Arabic Ligature Lam With Alef Isolated Form)
The width of لا is w, whereas Bespin assumes it to be 2*w. This makes editing text which frequently includes ligatures and accent characters quite impractical, because the cursor's position doesn't actually represent the point at which edits are made. Ditto for selection highlights.
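To make the n*w assumption concrete, here is a minimal sketch of the width model the report describes. The name `assumedWidth` and the 8-pixel cell width are hypothetical, chosen only for illustration; this is not Bespin's actual code.

```typescript
// Hypothetical model of the assumption described in the report; not Bespin's code.
function assumedWidth(text: string, charWidth: number): number {
  // One fixed-width cell per UTF-16 code unit in the string.
  return text.length * charWidth;
}

const lamAlef = "\u0644\u0627"; // Lam followed by Alef, stored as two characters
const w = 8; // assumed cell width in pixels
// The model yields 2 * w here, while the rendered ligature occupies a single cell.
const assumed = assumedWidth(lamAlef, w);
```

Any caret or selection offset derived from this model drifts by one cell for each ligature that precedes it on the line.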
Yay for more variable-width characters! This is the opposite of the tab case. With tabs, one character takes up more than one character space. With ligatures, more than one character takes up only one character space. This should be fun.
Whiteboard: editor
Version: unspecified → Trunk
Incidentally, simply copying and pasting that block of text above will not reproduce the problem, as the ligature is pasted as the single Unicode character. However, copying and pasting the individual letters next to each other indeed shows the problem.
(In reply to comment #2)
> Incidentally, simply copying and pasting that block of text above will not
> reproduce the problem, as the ligature is pasted as the single Unicode
> character. However, copying and pasting the individual letters next to each
> other indeed shows the problem.
Copy and paste from external sources does not work on bespin.mozilla.com, so I couldn't test this by pasting, but the behavior that you describe is actually another bug! When pasting, the original Unicode characters should be pasted, not the Unicode glyphs used to render them. This breaks the invariant that copying some text, pasting it, and then copying it again should end up with the exact same contents on the clipboard. It also makes the editor behave strangely and incorrectly (in this example, one wouldn't be able to delete the second character and replace it with another character).
(In reply to comment #3)
> Copy and paste from external sources does not work on bespin.mozilla.com, so I
> couldn't test this by pasting, but the behavior that you describe is actually
> another bug!
If you'd like to set up your own instance of Bespin, there are fairly straightforward instructions available: https://wiki.mozilla.org/Labs/Bespin/DeveloperGuide
I thought it might have been something Bugzilla did, but it appears that Bugzilla has it right. I'll have to investigate further whether it is indeed Bespin that is copying and pasting wrong. That's quite an interesting situation.
As for this bug, do you know of a way to detect if a character is involved in a ligature? With that detection, it should be an easy fix.
(In reply to comment #4)
> If you'd like to set up your own instance of Bespin, there are fairly
> straightforward instructions available:
> https://wiki.mozilla.org/Labs/Bespin/DeveloperGuide
Thanks for the info!
> I thought it might have been something Bugzilla did, but it appears that
> Bugzilla has it right. I'll have to investigate further whether it is indeed
> Bespin that is copying and pasting wrong. That's quite an interesting situation.
How is pasting being handled in Bespin?
> As for this bug, do you know of a way to detect if a character is
> involved in a ligature? With that detection, it should be an easy fix.
I'm not sure if this can be determined generally. I think it depends on the fonts used and on the engine which transforms the input characters into the glyphs which constitute the rendered text. I assume that Bespin doesn't handle that internally and delegates the task to the browser engine's facilities for rendering text in canvas, right? If that is the case, I think this information should somehow be queried from the browser's canvas text implementation. I don't know if that is possible though.
(In reply to comment #5)
> I'm not sure if this can be determined generally. I think it depends on the
> fonts used and the engine which transforms the input characters into the glyphs
> which constitute the rendered text. I assume that Bespin doesn't handle that
> internally and delegates the task to the browser engine's facilities for
> rendering text in canvas, right? If that is the case, I think this information
> should somehow be queried from the browser's canvas text implementation. I
> don't know if that is possible though.
Looking through the canvas text API documentation, it seems like measureText is what you're looking for here:
<https://developer.mozilla.org/en/Drawing_text_using_a_canvas#measureText%28%29>
(In reply to comment #6)
> Looking through the canvas text API documentation, it seems like measureText is
> what you're looking for here:
>
> <https://developer.mozilla.org/en/Drawing_text_using_a_canvas#measureText%28%29>
Hmm... That function measures the width of the text in CSS pixels. We currently calculate based on individual character width. Using that function would require more of a revamp of the code than I'd been thinking. But thanks for doing the research for me!
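For illustration, here is a rough sketch of how measured widths could replace the per-character calculation discussed above. The names `TextMeasurer`, `caretOffset`, and `rendersNarrower` are hypothetical and nothing here is taken from Bespin's code. The interface only mirrors the `measureText(text).width` part of the canvas 2D context, so a real context can be passed in. The 0.5-pixel tolerance is an arbitrary assumption.

```typescript
// Minimal shape shared with CanvasRenderingContext2D (assumption: only
// measureText(text).width is needed), so a real 2D context can be passed in.
interface TextMeasurer {
  measureText(text: string): { width: number };
}

// Offset of the caret placed before `column`, taken from the measured width
// of the preceding text instead of column * w.
function caretOffset(m: TextMeasurer, line: string, column: number): number {
  return m.measureText(line.slice(0, column)).width;
}

// True when the pair measures narrower than its two characters measured
// separately, which hints that the font shapes them into a single glyph.
function rendersNarrower(m: TextMeasurer, a: string, b: string): boolean {
  const separate = m.measureText(a).width + m.measureText(b).width;
  return m.measureText(a + b).width < separate - 0.5; // tolerance is arbitrary
}
```

This only addresses width. For right-to-left or mixed-direction lines, the measured prefix width is not necessarily the visual caret position, so a real fix would likely need more than this.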
This is a mass migration from Mozilla Labs :: Bespin to Bespin :: Editor.
Component: Bespin → Editor
Product: Mozilla Labs → Bespin
QA Contact: bespin → editor
Whiteboard: editor
Target Milestone: -- → ---
We saw this problem with the following scenario: Visit a site with multibyte characters (such as yahoo.co.jp) and copy and paste anything that is obviously going to be multibyte (e.g. any text consisting of ideographs). The Bespin editor pastes the text fine, but the cursor does not follow the text as it is painted in the editor. It seems to think the end of the line is about halfway through the rendered text.
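The "about halfway" observation is consistent with ideographs being drawn roughly two cells wide while being counted as one. The sketch below shows a cell count that allows for that. It rests on loud assumptions: the ranges are a small hand-picked subset (not the full Unicode East Asian Width data), the two-cell width depends on the font, only BMP characters are handled, and the names `cellsFor` and `lineCells` are hypothetical.

```typescript
// Rough, partial approximation of "wide" characters; NOT the full
// Unicode East Asian Width table. Works on UTF-16 code units (BMP only).
function cellsFor(codeUnit: number): number {
  const wide =
    (codeUnit >= 0x3040 && codeUnit <= 0x30ff) || // Hiragana and Katakana
    (codeUnit >= 0x4e00 && codeUnit <= 0x9fff) || // CJK Unified Ideographs
    (codeUnit >= 0xac00 && codeUnit <= 0xd7a3) || // Hangul Syllables
    (codeUnit >= 0xff01 && codeUnit <= 0xff60); // Fullwidth forms
  return wide ? 2 : 1;
}

// Number of fixed-width cells a line occupies under the assumptions above.
function lineCells(line: string): number {
  let cells = 0;
  for (let i = 0; i < line.length; i++) {
    cells += cellsFor(line.charCodeAt(i));
  }
  return cells;
}
```

With this model a line of three ideographs counts as six cells, so the end-of-line position would no longer fall in the middle of the painted text, provided the font really does draw those glyphs two cells wide.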
In general, Bespin's foreseeable future will be English, left-to-right text (sorry, have to keep scope small and manageable). However, it's conceivable that folks will use multi-byte characters in this scenario, so we need to fix this. Not sure on priority though.
Assignee: nobody → bgalbraith
Status: NEW → ASSIGNED
Severity: normal → minor
Priority: -- → P5
Target Milestone: --- → Future
Target Milestone: Future → ---
I'll just note that this bug remains an issue in the Rebooted Bespin.
ACETRANSITION The Skywriter project has merged with Ajax.org's Ace project (the full server part of which is their Cloud9 IDE project). Background on the change is here: http://mozillalabs.com/skywriter/2011/01/18/mozilla-skywriter-has-been-merged-into-ace/ The bugs in the Skywriter product are not necessarily relevant for Ace and quite a bit of code has changed. For that reason, I'm closing all of these bugs. Problems that you have with Ace should be filed in the Ace issue tracker at GitHub: https://github.com/ajaxorg/ace/issues
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → INVALID