Closed Bug 233669 Opened 21 years ago Closed 21 years ago

Korean compatibility Jamo, Kanbun, Bopomofo should be treated as 'syllabic' in line breaking

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

RESOLVED FIXED

People

(Reporter: jshin1987, Assigned: jshin1987)

Details

(Keywords: intl)

Attachments

(1 file, 2 obsolete files)

Korean compatibility Jamos are not syllables, but as far as line breaking is concerned, they have to be treated as if they were. Kanbun and Bopomofo are syllabic, so they can start and end lines just as CJK ideographs and Korean Hangul syllables can. Patch coming up.
Attached patch one liner (obsolete)
a very simple patch
I forgot to include Yi radicals and Yi syllables, which are also to be treated like CJK ideographs (lines can begin and end with any of them).
Attachment #141044 - Attachment is obsolete: true
Comment on attachment 141045
update: include Yi syllables and radicals
This is just a trivial case. Simon is on the road, so I'm asking Kat for review. He'd be as good as Simon on this issue. Kat, what I'm doing is making the characters between U+3100 and U+31FF (Kanbun, Hangul Compatibility Jamos and Bopomofo), along with Yi radicals and syllables, be treated just like CJK ideographs and Hangul syllables.
Attachment #141045 - Flags: superreview?(bzbarsky)
Attachment #141045 - Flags: review?(momoi)
The range [0x3100, 0x31FF] is a bit more complex than I thought. There are 16 Katakana letters (for Ainu) in [0x31F0, 0x31FF] that are assigned 'NS' in UTR #14 [1]. Characters belonging to 'NS' cannot start a line although they can be treated like ideographs in simpler implementations. [1] http://www.unicode.org/reports/tr14
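The split described above can be sketched in C++ roughly as follows. This is only an illustration, not the attached patch: the enum `BreakClass` and the function `classify3100Block` are hypothetical names, and treating every code point in the range uniformly (including unassigned ones) is a simplifying assumption.

```cpp
// Hypothetical, reduced set of line-breaking classes (illustration only).
enum class BreakClass { Ideographic, NonStarter, Other };

// Classify a code point in [U+3100, U+31FF] along the lines discussed above.
// Assumption: unassigned code points inside the range are not special-cased.
inline BreakClass classify3100Block(char32_t u) {
    if (u < 0x3100 || u > 0x31FF) {
        return BreakClass::Other;        // outside the range this sketch covers
    }
    if (u >= 0x31F0) {
        return BreakClass::NonStarter;   // Katakana letters for Ainu: 'NS', cannot start a line
    }
    return BreakClass::Ideographic;      // Bopomofo, Hangul Compatibility Jamo, Kanbun: 'ID'
}
```

A simpler implementation could return `Ideographic` for the whole range, at the cost of allowing a line to start with one of the 'NS' characters.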
Attached patch update
Sorry for spamming. Now I'm pretty sure I got them all right.
Attachment #141045 - Attachment is obsolete: true
Attachment #141045 - Flags: superreview?(bzbarsky)
Attachment #141045 - Flags: review?(momoi)
I have one comment and a few questions:
1. + else if ( ( ( 0x3200 <= u) && ( u <= 0xA4BF) ) || // CJK and Yi
   Yi Radicals go up to 0xA4CF. This should be corrected.
2. Does any document (UTR #14, other UTRs, or Unicode 4 or its supplements) address Kanbun as a line-breaking class? I have JIS X 4051 buried somewhere in the house and can't access it easily. I used to use Kanbun characters when reading classical Chinese. At least 'from a handwriting point of view', a Kanbun mark should be considered a part of the preceding character, and I would not break before it, only after it if breaking is needed. Please provide Unicode documentation on this specific issue.
3. I have not seen UTR #14 specifically address the line-breaking class of Bopomofo. Was there discussion somewhere on the Unicode mailing list about this? Can you point me to some documentation on this? (Bopomofo seems similar to Kanbun for Japanese.)
I have no problem with adding Korean Compatibility Jamos or Yi Syllables and Radicals to the Syllable class for line breaking (per UTR #14).
Yeah, 'A4BF' was a typo. As for Kanbun, I had my own doubts, so I looked them up in the data file accompanying UTR #14. It classifies them as 'ID' (the class CJK ideographs and Hangul syllables belong to). The same is true of Bopomofo. The data file (for the Unicode 4.0 repertoire) is available at http://www.unicode.org/Public/4.0-Update/LineBreak-4.0.0.txt If you think Kanbun and Bopomofo have to be treated differently, you may write to the author of UTR #14 or post your opinion to the Unicode mailing list.
So I asked on the Unicode mailing list and got an answer from Ken Whistler:
=================================
> In Kanbun reading (classical Chinese), I always thought that these
> characters are a part of the preceding character so that a line should
> not break before it. For example, 0x3191 is an instruction to skip the
> preceding character and read the next character first and then come back
> to the preceding character.
>
> Can someone on the list tell me the rationale for classifying them in
> the "ID" class?
> The same question for Bopomofo characters.
Because taken by themselves there is no particular rationale for them to be treated (by default) any differently than ideographic characters (or kana). When used in actual Kanbun text (or Bopomofo used to annotate Chinese text with pronunciations), you are actually dealing with rich text that has an interlinear format -- more than one line of text elements kept in synch. Line breaking for interlinear text is considerably more complex than just an algorithm for pair determinations of break points in a single line. So it is basically out of scope for the specification in UAX #14.
--Ken
=================================
So even if we classify Kanbun and Bopomofo as "ID" for line breaking purposes, its applicability is limited to cases of citation as examples, e.g. "'A, B, C, D' are instances of Kanbun characters". Since real uses of Kanbun or Bopomofo are outside the scope of UAX #14, they will be governed by a different rule, which is not our concern here. This and jshin's fix for the Yi Radicals range will address all my concerns and we should be ready to go.
Comment on attachment 141047
update
Asking for r/sr. Thanks, Kat, for checking that with Ken. I suspected that.
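For plain single-line text, the "pair determination" Ken mentions can be pictured as a function of the two characters on either side of a candidate break. The following is a minimal sketch under a deliberately reduced model with only the two classes discussed in this bug; `LineClass` and `canBreakBetween` are hypothetical names, and the real UAX #14 pair table covers many more classes and rules.

```cpp
// Reduced model: only the two classes discussed in this bug (illustration only).
enum class LineClass { ID, NS };

// May a line break occur between two adjacent characters?
// Within this two-class model the answer depends only on the following
// character: a non-starter (NS) may not begin a line, while a break is
// allowed before an ideograph-like (ID) character.
inline bool canBreakBetween(LineClass /*before*/, LineClass after) {
    return after != LineClass::NS;
}
```

Under this model, classifying Kanbun and Bopomofo as 'ID' simply means a break is permitted on either side of them. Nothing here attempts the interlinear layout that real Kanbun or annotated Bopomofo text would need.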
Attachment #141047 - Flags: superreview?(bzbarsky)
Attachment #141047 - Flags: review?(momoi)
Jungshik, I'm still seeing 0xA4BF for Yi Radicals rather than 0xA4CF.
I can't edit an attachment in place once it has been uploaded. You'll have to trust me that I'll fix it when I check the patch in to the CVS repository :-) Actually, I've already fixed it in my local source tree.
Comment on attachment 141047
update
Assuming that the Yi Radicals range problem will be fixed, all my concerns have been addressed.
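The range issue raised above comes down to a single upper bound. As a hedged sketch (the constant and function names are hypothetical and not taken from the patch), the Yi Syllables and Yi Radicals blocks together span U+A000 through U+A4CF, so an upper bound of 0xA4BF would leave out the tail of the Yi Radicals block, for example U+A4C0:

```cpp
// Block boundaries per the Unicode code charts:
// Yi Syllables start at U+A000 and Yi Radicals end at U+A4CF.
constexpr char32_t kYiSyllablesFirst = 0xA000;
constexpr char32_t kYiRadicalsLast   = 0xA4CF;  // not 0xA4BF

// True if the code point falls in the Yi Syllables or Yi Radicals blocks.
inline bool isYi(char32_t u) {
    return kYiSyllablesFirst <= u && u <= kYiRadicalsLast;
}
```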
Attachment #141047 - Flags: review?(momoi) → review+
Comment on attachment 141047
update
sr=bzbarsky
Attachment #141047 - Flags: superreview?(bzbarsky) → superreview+
Fix checked in to the trunk. (I added a comment about a possible future change in the treatment of Kanbun and Bopomofo, per Kat's off-line comment.)
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED