Closed Bug 1495067 Opened Last year Closed Last year

Drop cap effect layout weirdness (possibly due to presence of BOM)

Categories

(Core :: Internationalization, defect, P3)

63 Branch
Unspecified
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox64 --- fixed

People

(Reporter: miketaylr, Assigned: hsivonen)

References

()

Details

Attachments

(2 files)

Originally filed @ https://github.com/webcompat/web-bugs/issues/19171.

See screenshot for expected behavior on left, actual behavior on right.

Possibly a regression? But kind of unclear, especially with mozregression pointing at https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ec6bd22c4d0034a2ee2dc12781406f2d74202295&tochange=9a92be6b431a3de68dd8fca2a3a883f3a10750cd.

Forgive the crappy Summary.
I think the regression range makes sense, actually, though it may be the case that this reproduces with rtl content and is a layout bug.
Component: Layout → Layout: Text and Fonts
Flags: needinfo?(hsivonen)
Ah, the BOM is part of Arabic Presentation Forms-B, which is listed as an RTL block at https://www.unicode.org/roadmaps/bmp/ . So that's why the new code does what it does.

I'm going to investigate why the old code didn't do this.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #1)
> I think the regression range makes sense, actually, though it may be the
> case that this reproduces with rtl content and is a layout bug.

This indeed seems to be the case.
So there are two differences between the new an old code:
 1) The new code categorizes Hebrew presentation forms as LTR, because https://www.unicode.org/roadmaps/bmp/ fails to mark them as RTL.
 2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the last of these is bidi-neutral.

I'm going to change encoding_rs, but it's unclear to me what the right thing for U+FEFD and U+FEFE is.

jfkthame, do you know or can you guess why the two unassigned code points U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The comments in bug 240943 don't discuss them.
Flags: needinfo?(hsivonen) → needinfo?(jfkthame)
(In reply to Henri Sivonen (:hsivonen) from comment #3)
> So there are two differences between the new an old code:
>  1) The new code categorizes Hebrew presentation forms as LTR, because
> https://www.unicode.org/roadmaps/bmp/ fails to mark them as RTL.

The roadmaps/bmp table isn't a sufficient guide here; it only highlights blocks that default to RTL. The Hebrew presentation forms are individually RTL even though they're in a block that doesn't have this default.

>  2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the
> last of these is bidi-neutral.
> 
> I'm going to change encoding_rs, but it's unclear to me what the right thing
> for U+FEFD and U+FEFE is.

Although these codepoints are unassigned, their bidi category should be AL, according to Unicode: see https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBidiClass.txt.

> jfkthame, do you know or can you guess why the two unassigned code points
> U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The
> comments in bug 240943 don't discuss them.

I don't know, but I strongly suspect this was simply an oversight: the range in IS_RTL_PRESENTATION_FORM was based on the characters that are actually assigned, and nobody noticed/cared that these two codepoints at the end of the range were supposed to have an explicit (non-LTR) directionality even though they're unassigned.

For consistency, we should fix them there as well (although it's unlikely to affect any real-world use case, given that they're unassigned).
Flags: needinfo?(jfkthame)
* Update encoding_rs to 0.8.8.
 * Change U+FEFD and U+FEFE to RTL in IS_RTL_PRESENTATION_FORM to make the
   Rust and C++ code agree on what's RTL.

MozReview-Commit-ID: CuK6fN4pojG
See Also: → 1495674
Moving this to another component. Spinning off bug 1495674 for layout.

(In reply to Jonathan Kew (:jfkthame) from comment #4)
> The roadmaps/bmp table isn't a sufficient guide here; it only highlights
> blocks that default to RTL. The Hebrew presentation forms are individually
> RTL even though they're in a block that doesn't have this default.

Looks like they are in a non-block range with a default...

> >  2) The new code categorizes U+FEFD, U+FEFE and U+FEFF as RTL although the
> > last of these is bidi-neutral.
> > 
> > I'm going to change encoding_rs, but it's unclear to me what the right thing
> > for U+FEFD and U+FEFE is.
> 
> Although these codepoints are unassigned, their bidi category should be AL,
> according to Unicode: see
> https://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedBidiClass.txt.
> 
> > jfkthame, do you know or can you guess why the two unassigned code points
> > U+FEFD and U+FEFE aren't treated as RTL by IS_RTL_PRESENTATION_FORM? The
> > comments in bug 240943 don't discuss them.
> 
> I don't know, but I strongly suspect this was simply an oversight: the range
> in IS_RTL_PRESENTATION_FORM was based on the characters that are actually
> assigned, and nobody noticed/cared that these two codepoints at the end of
> the range were supposed to have an explicit (non-LTR) directionality even
> though they're unassigned.
> 
> For consistency, we should fix them there as well (although it's unlikely to
> affect any real-world use case, given that they're unassigned).

Thanks. I'll do that.
Component: Layout: Text and Fonts → Internationalization
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Comment on attachment 9013304 [details]
Bug 1495067 - Make HasRTLChars() consider Hebrew presentation forms as RTL (again) and not consider U+FEFF as RTL (again).

Jonathan Kew (:jfkthame) has approved the revision.
Attachment #9013304 - Flags: review+
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e9fda19be060
Make HasRTLChars() consider Hebrew presentation forms as RTL (again) and not consider U+FEFF as RTL (again). r=jfkthame
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/13320 for changes under testing/web-platform/tests
Can't merge web-platform-tests PR due to failing upstream checks:
Github PR https://github.com/web-platform-tests/wpt/pull/13320
* Taskcluster (pull_request) (https://tools.taskcluster.net/task-group-inspector/#/eH3ywO5aQcCASKkikVrxxg)
https://hg.mozilla.org/mozilla-central/rev/e9fda19be060
https://hg.mozilla.org/mozilla-central/rev/ac55a8f97f06
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
You need to log in before you can comment on or make changes to this bug.