Closed
Bug 233669
Opened 21 years ago
Closed 21 years ago
Korean compatibility Jamo, Kanbun, Bopomofo should be treated as 'syllabic' in line breaking
Categories
(Core :: Internationalization, defect)
RESOLVED
FIXED
People
(Reporter: jshin1987, Assigned: jshin1987)
Details
(Keywords: intl)
Attachments
(1 file, 2 obsolete files)
2.02 KB, patch | momoi: review+ | bzbarsky: superreview+
Korean compatibility Jamos are not syllables, but as far as line breaking is
concerned, they have to be treated as if they were.
Kanbun and Bopomofo are syllabic, so they can start and end lines just as
CJK ideographs and Korean Hangul syllables can.
Patch coming up.
Comment 1 • 21 years ago (Assignee)
a very simple patch
Comment 2 • 21 years ago (Assignee)
I forgot to include Yi radicals and Yi syllables, which are also to be treated
like CJK ideographs (lines can begin and end with any of them).
Updated • 21 years ago (Assignee)
Attachment #141044 - Attachment is obsolete: true
Comment 3 • 21 years ago (Assignee)
Comment on attachment 141045
update : include Yi syllables and radicals
This is just a trivial case. Simon is on the road, so I'm asking Kat for
review. He'd be as good as Simon on this issue.
Kat, what I'm doing is making the characters between U+3100 and U+31FF (Kanbun,
Hangul Compatibility Jamos and Bopomofo) and the Yi radicals and syllables be
treated just like CJK ideographs and Hangul syllables.
Attachment #141045 - Flags: superreview?(bzbarsky)
Attachment #141045 - Flags: review?(momoi)
Comment 4 • 21 years ago (Assignee)
The range [0x3100, 0x31FF] is a bit more complex than I thought. There are 16
Katakana letters (for Ainu) in [0x31F0, 0x31FF] that are assigned 'NS' in UTR
#14 [1]. Characters belonging to 'NS' cannot start a line although they can be
treated like ideographs in simpler implementations.
[1] http://www.unicode.org/reports/tr14
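The carve-out described in this comment can be sketched roughly as follows. This is an illustration only, not the attached patch: `BreakClass` and `classifyRange3100` are hypothetical names, and the sketch covers nothing outside U+3100..U+31FF.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified classes for illustration only.
enum class BreakClass { Ideographic, NonStarter, Other };

// Sketch: treat U+3100..U+31EF like ideographs, but keep the 16 small
// Katakana letters at U+31F0..U+31FF as non-starters ('NS' in UTR #14),
// which must not begin a line.
inline BreakClass classifyRange3100(std::uint32_t u) {
  if (0x31F0 <= u && u <= 0x31FF) return BreakClass::NonStarter;
  if (0x3100 <= u && u <= 0x31EF) return BreakClass::Ideographic;
  return BreakClass::Other;  // outside the range this sketch covers
}
```

The 'NS' test comes first so that the sub-range is never swallowed by the wider ideograph-like check.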
Comment 5 • 21 years ago (Assignee)
Sorry for the spam. Now I'm pretty sure I got them all right.
Attachment #141045 - Attachment is obsolete: true
Updated • 21 years ago (Assignee)
Attachment #141045 - Flags: superreview?(bzbarsky)
Attachment #141045 - Flags: review?(momoi)
Comment 6 • 21 years ago
I have one comment and a few questions:
1. + else if ( ( ( 0x3200 <= u) && ( u <= 0xA4BF) ) || // CJK and Yi
Yi Radicals go up to 0xA4CF. This should be corrected.
2. Does any document (UTR#14, other UTRs or Unicode 4 or supplements)
address Kanbun as a linebreaking class? I have JIS X 4051 buried
somewhere in the house and can't access it easily.
I used to use Kanbun characters when reading classical Chinese. At
least from a handwriting point of view, a Kanbun mark should be considered
part of the preceding character, so I would not break before it, only
after it if a break is needed. Please provide Unicode documentation
on this specific issue.
3. I have not seen UTR#14 specifically address the linebreaking class
issue of Bopomofo. Was there discussion somewhere on the Unicode
Mailing list about this? Can you point me to some documentation
on this? (Bopomofo seems similar to Kanbun for Japanese.)
I have no problem with adding Korean Compatibility Jamos or Yi Syllables
and Radicals to the Syllable class for line breaking. (per UTR#14.)
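To make point 1 concrete, here is a minimal sketch of the range check with the upper bound moved to the end of the Yi Radicals block. The constant and function names are hypothetical; only the bounds 0x3200, 0xA4BF and 0xA4CF come from the patch line and the comment above.

```cpp
#include <cassert>
#include <cstdint>

// Bounds of the "CJK and Yi" range from the quoted patch line, with the
// upper bound extended to the end of Yi Radicals (U+A4CF, not U+A4BF).
constexpr std::uint32_t kCJKAndYiFirst = 0x3200;
constexpr std::uint32_t kCJKAndYiLast = 0xA4CF;

// Hypothetical helper: true if u falls in the ideograph-like range.
inline bool isCJKOrYi(std::uint32_t u) {
  return kCJKAndYiFirst <= u && u <= kCJKAndYiLast;
}
```

With the original 0xA4BF bound, the characters U+A4C0..U+A4CF would fall outside the check.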
Comment 7 • 21 years ago (Assignee)
Yeah, 'A4BF' was a typo. As for Kanbun, I had my own doubts, so I looked them
up in the data file accompanying UTR #14. It classifies them as 'ID' (the
class CJK ideographs and Hangul syllables belong to). The same is true of
Bopomofo. The data file (for Unicode 4.0 repertoire) is available at
http://www.unicode.org/Public/4.0-Update/LineBreak-4.0.0.txt
If you think Kanbun and Bopomofo have to be treated differently, you may write
to the author of UTR #14 or post your opinion to the Unicode mailing list.
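For anyone who wants to check a class in that data file programmatically, the following is a minimal sketch. It assumes the usual Unicode data-file layout of `code;class # comment` or `first..last;class # comment` per line; `LineBreakEntry` and `parseLineBreakLine` are hypothetical names, and the sample lines in the checks are illustrative rather than copied from the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// One parsed data line: a code point range and its line breaking class.
struct LineBreakEntry {
  bool valid;
  std::uint32_t first;
  std::uint32_t last;
  std::string cls;
};

// Parses a line such as "3190..319F;ID # comment". Lenient sketch: it
// returns valid == false for comment-only or malformed lines and does not
// reject trailing junk after the hex digits.
inline LineBreakEntry parseLineBreakLine(const std::string& line) {
  const LineBreakEntry invalid = {false, 0, 0, ""};
  const std::string data = line.substr(0, line.find('#'));  // drop comment
  const std::size_t semi = data.find(';');
  if (semi == std::string::npos) return invalid;

  const std::string range = data.substr(0, semi);
  const std::string rest = data.substr(semi + 1);
  const std::size_t b = rest.find_first_not_of(" \t\r");
  if (b == std::string::npos) return invalid;
  const std::size_t e = rest.find_last_not_of(" \t\r");
  const std::string cls = rest.substr(b, e - b + 1);

  try {
    const std::size_t dots = range.find("..");
    const std::uint32_t first = static_cast<std::uint32_t>(
        std::stoul(range.substr(0, dots), nullptr, 16));
    const std::uint32_t last =
        dots == std::string::npos
            ? first
            : static_cast<std::uint32_t>(
                  std::stoul(range.substr(dots + 2), nullptr, 16));
    return LineBreakEntry{true, first, last, cls};
  } catch (const std::exception&) {
    return invalid;  // no hex digits, or value out of range
  }
}
```

A caller would feed each line of the file through the parser and look for the entry whose range contains the code point of interest.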
Comment 8 • 21 years ago
So I asked on the Unicode mailing list and got an answer from
Ken Whistler:
=================================
> In Kanbun reading (classical Chinese), I always thought that these
> characters are a part of the preceding character so that a line should
> not break before it. For example, 0x3191 is an instruction to skip the
> preceding character and read the next character first and then come back
> to the preceding character.
>
> Can someone on the list tell me the rationale for classifying them in
> the "ID" class?
> The same question for Bopomofo characters.
Because taken by themselves there is no particular rationale for
them to be treated (by default) any differently than ideographic
characters (or kana).
When used in actual Kanbun text (or Bopomofo used to annotate
Chinese text with pronunciations), you are actually dealing with
rich text that has an interlinear format -- more than one line
of text elements kept in synch. Line breaking for interlinear
text is considerably more complex than just an algorithm for
pair determinations of break points in a single line. So it is
basically out of scope for the specification in UAX #14.
--Ken
=================================
So even if we classify Kanbun and Bopomofo as "ID" for line breaking
purposes, that classification applies only when they are cited as examples,
e.g. "'A, B, C, D' are instances of Kanbun characters". Since real
uses of Kanbun or Bopomofo are outside the scope of UAX #14, they will
be governed by a different rule, which is not our concern here.
This and jshin's fix for the Yi Radicals range will address all my
concerns and we should be ready to go.
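As a rough illustration of the pair-based determination Ken refers to, the sketch below decides a break opportunity from only the classes on either side of it. It models just two classes, 'ID' and 'NS', with the UAX #14 behavior that a break is never allowed before 'NS'; the enum, table and function names are hypothetical and not taken from the patch.

```cpp
#include <cassert>

// Only the two classes discussed in this bug are modeled.
enum PairClass { kID = 0, kNS = 1, kPairClassCount = 2 };

// Rows: class before the candidate break. Columns: class after it.
// A break is allowed before ID and never before NS.
static const bool kBreakAllowed[kPairClassCount][kPairClassCount] = {
    /* ID before */ {true, false},
    /* NS before */ {true, false},
};

// Hypothetical helper: may a line break fall between these two classes?
inline bool canBreakBetween(PairClass before, PairClass after) {
  return kBreakAllowed[before][after];
}
```

Interlinear Kanbun or Bopomofo annotation cannot be expressed in a table like this, which is why it is out of scope for this bug.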
Comment 9 • 21 years ago (Assignee)
Comment on attachment 141047
update
Asking for r/sr.
Thanks, Kat, for checking that with Ken. I suspected as much.
Attachment #141047 - Flags: superreview?(bzbarsky)
Attachment #141047 - Flags: review?(momoi)
Comment 10 • 21 years ago
Jungshik, I'm still seeing 0xA4BF for Yi Radicals rather than
0xA4CF.
Comment 11 • 21 years ago (Assignee)
I can't edit an attachment in place once it has been uploaded. You'll have
to trust me to fix it when I check the patch in to the CVS
repository :-) Actually, I've already fixed it in my local source tree.
Comment 12 • 21 years ago
Comment on attachment 141047
update
Assuming that the Yi Radicals range problem will be fixed, all my concerns have
been addressed.
Attachment #141047 - Flags: review?(momoi) → review+
Comment 13 • 21 years ago
Comment on attachment 141047
update
sr=bzbarsky
Attachment #141047 - Flags: superreview?(bzbarsky) → superreview+
Comment 14 • 21 years ago (Assignee)
Fix checked in to the trunk (I added a comment about a possible future change in
the treatment of Kanbun and Bopomofo, per Kat's off-line comment).
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED