Closed
Bug 988387
Opened 10 years ago
Closed 9 years ago
Getting whole characters example only includes code for a surrogate pair
Categories
(Developer Documentation Graveyard :: JavaScript, defect, P5)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: wowmotty, Assigned: bruant.d)
Details
:: Developer Documentation Request
Request Type: Correction
Gecko Version: unspecified
Technical Contact:
:: Details
I am by no means an expert when it comes to dealing with Unicode and the multilingual planes. I do have a keyboard plugin to which users contribute keyboard layouts. One such layout is for the Tamil language, which contains groupings of up to four Unicode characters (source: https://github.com/Mottie/Keyboard/blob/master/layouts/tamil.js#L37):
"\u0bb6\u0bcd\u0bb0\u0bc0" and "\u0b95\u0bcd\u0bb7"
I was trying to use the "getting whole characters" code, but it is only designed to examine a surrogate pair. Would it be wrong to just create a loop looking for the next space? Or are the above character groupings not typical?
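For readers following along, here is a minimal sketch of why surrogate-pair handling does not help with the two strings quoted above. The helper name `hasSurrogates` is hypothetical and is not taken from the MDN example; it only checks whether a string contains any UTF-16 surrogate code unit.

```javascript
// The two Tamil groupings quoted in the report, as JavaScript string literals.
const groupings = ["\u0bb6\u0bcd\u0bb0\u0bc0", "\u0b95\u0bcd\u0bb7"];

// Hypothetical helper (not from the MDN page): true if the string contains
// any UTF-16 surrogate code unit (0xD800-0xDFFF).
function hasSurrogates(str) {
  return /[\uD800-\uDFFF]/.test(str);
}

// For each grouping, compare the number of UTF-16 code units with the
// number of code points and check for surrogates.
const report = groupings.map((s) => ({
  codeUnits: s.length,
  codePoints: Array.from(s).length,
  hasSurrogates: hasSurrogates(s),
}));

console.log(report);
```

Both strings consist only of Basic Multilingual Plane code points, so the code unit and code point counts match and no surrogate is found. Code that only joins surrogate pairs therefore treats every code point in these groupings as its own character.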
Comment 1•10 years ago
I am by no means an expert with Unicode either. I wonder whether this is rather a question for a platform like Stack Overflow. We can always improve the examples in the documentation, but I am not sure that many Unicode experts will read this bug report. First of all, I am moving this over to the JavaScript documentation. Let's see if someone has an idea.
Assignee: eshepherd → bruant.d
Component: General → JavaScript
Whiteboard: c=General u=webdev p=0
Comment 2•10 years ago (Assignee)
I think Tom worked in this area some time ago. Maybe he knows.
Flags: needinfo?(evilpies)
Comment 3•10 years ago (Assignee)
Related: http://www.youtube.com/watch?v=XD_5xDN7KUA
Comment 4•10 years ago
See <http://mathiasbynens.be/notes/javascript-unicode#other-grapheme-clusters> for the answer to your question. TL;DR You’d need to implement UAX#29’s algorithm for determining grapheme cluster boundaries (http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries) in JavaScript to do this.
Flags: needinfo?(evilpies)
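As a rough sketch of what this could look like today without hand-writing the UAX#29 rules: runtimes that ship `Intl.Segmenter` can split a string into extended grapheme clusters directly. This assumes `Intl.Segmenter` is available (recent browsers and Node.js); the helper names `graphemes` and `graphemeAt` are hypothetical and are not part of any MDN example.

```javascript
// Sketch: split a string into extended grapheme clusters using the built-in
// Intl.Segmenter. Assumes the runtime provides Intl.Segmenter.
function graphemes(str, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  // Each item yielded by segment() has a `segment` property holding the text.
  return Array.from(segmenter.segment(str), (part) => part.segment);
}

// Hypothetical "whole character at index" lookup: the index counts grapheme
// clusters rather than UTF-16 code units. Returns "" when out of range.
function graphemeAt(str, index, locale) {
  const clusters = graphemes(str, locale);
  return index >= 0 && index < clusters.length ? clusters[index] : "";
}

// Usage: a surrogate pair and a base letter plus combining mark each count
// as one cluster.
console.log(graphemes("\uD834\uDF06e\u0301").length);
console.log(graphemes("\u0bb6\u0bcd\u0bb0\u0bc0"));
```

Note that the default grapheme cluster rules do not necessarily treat a whole Tamil conjunct such as the four-code-point sequence from the report as a single cluster; the exact split depends on the Unicode data and tailoring in the runtime, so a keyboard layout may still need its own grouping logic on top of this.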
(In reply to Mathias Bynens from comment #4)
Thanks Mathias! So, as you stated, the example I shared is not a surrogate pair, but a grapheme cluster. Thanks for clarifying. Anyway, I guess that the MDN String.prototype.charAt() page should include some quotes from your talk (very interesting!) and a link to your page instead of the code that is there now.
Comment 6•9 years ago
Added a link to Mathias' blog post for now. Feel free to edit the wiki to add more advanced information.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED