Closed
Bug 1495067
Opened 6 years ago
Closed 6 years ago
Drop cap effect layout weirdness (possibly due to presence of BOM)
Categories
(Core :: Internationalization, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox64 | --- | fixed |
People
(Reporter: miketaylr, Assigned: hsivonen)
References
()
Details
Attachments
(2 files)
Originally filed @ https://github.com/webcompat/web-bugs/issues/19171.
See screenshot for expected behavior on left, actual behavior on right.
Possibly a regression? But kind of unclear, especially with mozregression pointing at https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ec6bd22c4d0034a2ee2dc12781406f2d74202295&tochange=9a92be6b431a3de68dd8fca2a3a883f3a10750cd.
Forgive the crappy Summary.
Comment 1•6 years ago
|
||
I think the regression range makes sense, actually, though it may be the case that this reproduces with rtl content and is a layout bug.
Component: Layout → Layout: Text and Fonts
Flags: needinfo?(hsivonen)
Assignee | ||
Comment 2•6 years ago
|
||
Ah, the BOM is part of Arabic Presentation Forms-B, which is listed as an RTL block at https://www.unicode.org/roadmaps/bmp/ . So that's why the new code does what it does.
I'm going to investigate why the old code didn't do this.
(In reply to Emilio Cobos Álvarez (:emilio) from comment #1)
> I think the regression range makes sense, actually, though it may be the
> case that this reproduces with rtl content and is a layout bug.
This indeed seems to be the case.
Assignee | ||
Comment 3•6 years ago
|
||
So there are two differences between the new an old code:
1) The new code categorizes Hebrew presentation forms as LTR, because https://www.unicode.org/roadmaps/bmp/ fails to mark them as RTL.
2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the last of these is bidi-neutral.
I'm going to change encoding_rs, but it's unclear to me what the right thing for U+FEFD and U+FEFE is.
jfkthame, do you know or can you guess why the two unassigned code points U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The comments in bug 240943 don't discuss them.
Flags: needinfo?(hsivonen) → needinfo?(jfkthame)
Comment 4•6 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #3)
> So there are two differences between the new an old code:
> 1) The new code categorizes Hebrew presentation forms as LTR, because
> https://www.unicode.org/roadmaps/bmp/ fails to mark them as RTL.
The roadmaps/bmp table isn't a sufficient guide here; it only highlights blocks that default to RTL. The Hebrew presentation forms are individually RTL even though they're in a block that doesn't have this default.
> 2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the
> last of these is bidi-neutral.
>
> I'm going to change encoding_rs, but it's unclear to me what the right thing
> for U+FEFD and U+FEFE is.
Although these codepoints are unassigned, their bidi category should be AL, according to Unicode: see https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBidiClass.txt.
> jfkthame, do you know or can you guess why the two unassigned code points
> U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The
> comments in bug 240943 don't discuss them.
I don't know, but I strongly suspect this was simply an oversight: the range in IS_RTL_PRESENTATION_FORM was based on the characters that are actually assigned, and nobody noticed/cared that these two codepoints at the end of the range were supposed to have an explicit (non-LTR) directionality even though they're unassigned.
For consistency, we should fix them there as well (although it's unlikely to affect any real-world use case, given that they're unassigned).
Flags: needinfo?(jfkthame)
Assignee | ||
Comment 5•6 years ago
|
||
* Update encoding_rs to 0.8.8.
* Change U+FEFD and U+FEFE to RTL in IS_RTL_PRESENTATION_FORM to make the
Rust and C++ code agree on what's RTL.
MozReview-Commit-ID: CuK6fN4pojG
Assignee | ||
Comment 6•6 years ago
|
||
Moving this to another component. Spinning off bug 1495674 for layout.
(In reply to Jonathan Kew (:jfkthame) from comment #4)
> The roadmaps/bmp table isn't a sufficient guide here; it only highlights
> blocks that default to RTL. The Hebrew presentation forms are individually
> RTL even though they're in a block that doesn't have this default.
Looks like they are in a non-block range with a default...
> > 2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the
> > last of these is bidi-neutral.
> >
> > I'm going to change encoding_rs, but it's unclear to me what the right thing
> > for U+FEFD and U+FEFE is.
>
> Although these codepoints are unassigned, their bidi category should be AL,
> according to Unicode: see
> https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBidiClass.txt.
>
> > jfkthame, do you know or can you guess why the two unassigned code points
> > U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The
> > comments in bug 240943 don't discuss them.
>
> I don't know, but I strongly suspect this was simply an oversight: the range
> in IS_RTL_PRESENTATION_FORM was based on the characters that are actually
> assigned, and nobody noticed/cared that these two codepoints at the end of
> the range were supposed to have an explicit (non-LTR) directionality even
> though they're unassigned.
>
> For consistency, we should fix them there as well (although it's unlikely to
> affect any real-world use case, given that they're unassigned).
Thanks. I'll do that.
Component: Layout: Text and Fonts → Internationalization
Assignee | ||
Comment 7•6 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Assignee | ||
Comment 8•6 years ago
|
||
Comment 9•6 years ago
|
||
Comment on attachment 9013304 [details]
Bug 1495067 - Make HasRTLChars() consider Hebrew presentation forms as RTL (again) and not consider U+FEFF as RTL (again).
Jonathan Kew (:jfkthame) has approved the revision.
Attachment #9013304 -
Flags: review+
Comment 10•6 years ago
|
||
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e9fda19be060
Make HasRTLChars() consider Hebrew presentation forms as RTL (again) and not consider U+FEFF as RTL (again). r=jfkthame
Comment 11•6 years ago
|
||
Pushed by emilio@crisal.io:
https://hg.mozilla.org/integration/autoland/rev/ac55a8f97f06
Enable a test. r=me
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/13320 for changes under testing/web-platform/tests
Can't merge web-platform-tests PR due to failing upstream checks:
Github PR https://github.com/web-platform-tests/wpt/pull/13320
* Taskcluster (pull_request) (https://tools.taskcluster.net/task-group-inspector/#/eH3ywO5aQcCASKkikVrxxg)
Comment 14•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e9fda19be060
https://hg.mozilla.org/mozilla-central/rev/ac55a8f97f06
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Upstream PR merged
You need to log in
before you can comment on or make changes to this bug.
Description
•