Closed Bug 924851 Opened 6 years ago Closed 3 years ago

Replace usage of nsBidi.h/cpp with ubidi.h/c from ICU

Categories

(Core :: Layout: Text and Fonts, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: smontagu, Assigned: jfkthame)

References

(Blocks 2 open bugs)

Details

(Keywords: rtl)

Attachments

(3 files, 3 obsolete files)

This should be not too complicated, since nsBidi was based on ubidi in the first place.
Blocks: 922963
Note that I wouldn't want to ship the current ICU we have without updating it to the latest release (bug 924839), because of likely security fixes since then.  If by some chance I don't get to that before the next uplift, I'd likely disable ICU entirely rather than ship an out-of-date version.  So this may want to depend on that bug.  (Also worth noting that http://bugs.icu-project.org/trac/ticket/10043 was a blocker to updating to ICU 51 awhile back -- I don't know if that's since been resolved, or if we'd have to revert anything in local patches to complete an update or not.)
Depends on: 915735
Attached patch w-i-p patchSplinter Review
w-i-p patch.

I didn't know the right build system incantations to make the ICU headers available to core code, so I hackily added them to LOCAL_INCLUDES in Makefile.in in each directory where I needed them.

The build then failed on android and b2g.
Depends on: 864843, 866301
Attachment #8361021 - Attachment description: 924851 → w-i-p patch
> I didn't know the right build system incantations to make the ICU headers
> available to core code, so I hackily added them to LOCAL_INCLUDES in
> Makefile.in in each directory where I needed them.

This is now bug 960505
Depends on: 960505
It looks like this is so close to being done! Waiting for it eagerly.

BTW, does this really depend on bug 864843 and 866301? How?
(In reply to Aharon (Vladimir) Lanin from comment #5)
> BTW, does this really depend on bug 864843 and 866301? How?

I assume the question refers to the "Enable ECMAScript Internationalization API" in the title of those bugs. This doesn't depend on ECMAScript i18n as such, but it does depend on the build system changes in those bugs to allow building with ICU on Android and Firefox OS.
Is this bug still relevant given the comment 11 on issue 922963?
Yes I think so. After all ubidi.c is already part of our source tree via ICU, so even though this bug won't make any difference to functionality after bug 922963 is fixed independently, nsBidi.cpp will still be duplicate code and if we can remove it, we will.
Depends on: 1215247
Depends on: 1215252
We can probably do this: before we eventually have ICU everywhere, we can change nsBidi to simulate the api of ubidi, and link things with ICU when it is available, and nsBidi otherwise; once we have ICU everywhere, we simply remove nsBidi, and no other change would be needed.
Depends on: 1284406
As a possible way forward here, I propose to make the change in two stages: first, we can replace the implementation of nsBidi with ICU functions, while maintaining the existing API for the rest of Gecko to use. Then, after we have verified the behavior is OK, and once we no longer need to support an ICU-free build (currently still needed for Android, but not much longer...), we can just remove the nsBidi "wrapper" and update the callsites to use the ubidi.h API directly. That update should be pretty mechanical, given how simple the nsBidi wrapper around ICU is.
I propose we do this as a first step, to avoid the need to update all the callsites. Then, when this is confirmed to be working and we're ready to enable ICU for all builds, we can remove the wrapper and call the ubidi functions directly wherever needed.
Attachment #8796518 - Flags: review?(xidorn+moz)
Attachment #8795440 - Attachment is obsolete: true
Comment on attachment 8796518 [details] [diff] [review]
When ENABLE_INTL_API is true, make nsBidi into a minimal wrapper around ICU's ubidi.h functions

Review of attachment 8796518 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsBidi_ICU.h
@@ +294,5 @@
> +  static nsresult ReorderVisual(const nsBidiLevel* aLevels, int32_t aLength,
> +                                int32_t* aIndexMap);
> +
> +  NS_DECLARE_FRAME_PROPERTY_SMALL_VALUE(BidiDataProperty,
> +                                        mozilla::FrameBidiData)

I think we should move bidi data property as well as the related static methods to nsIFrame first. Since we would eventually get rid of nsBidi so these methods eventually need to be moved, it makes no sense to duplicate them in two files. Duplicating them also makes it harder to maintain before nsBidi completely gets removed, especially given they are actually not part of the bidi engine.

To move that, you may need to duplicate the typedef of nsBidiLevel in nsIFrame. But that's small, and compilers can ensure the typedefs are identical.
Attachment #8796518 - Flags: review?(xidorn+moz)
Sounds reasonable. OK, here's a preliminary patch that moves the FrameBidiData, along with the nsBidiLevel and nsBidiDirection typedefs, over to nsIFrame in anticipation of nsBidi itself going away.
Attachment #8797229 - Flags: review?(xidorn+moz)
Assignee: smontagu → jfkthame
Status: NEW → ASSIGNED
Attachment #8796518 - Attachment is obsolete: true
Comment on attachment 8797229 [details] [diff] [review]
patch 1 - Move definition of nsBidiLevel/nsBidiDirection and frame bidi properties from nsBidi.h to nsIFrame.h, in preparation for removal of nsBidi

Review of attachment 8797229 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. r=me with the following issues addressed.

::: layout/base/nsBidi.h
@@ -117,5 @@
> -  /** All right-to-left text This is a 1 value. */
> -  NSBIDI_RTL,
> -  /** Mixed-directional text. */
> -  NSBIDI_MIXED
> -};

It doesn't seem necessary to move nsBidiDirection.

::: layout/base/nsBidiPresUtils.cpp
@@ +185,5 @@
>     */
>    nsBidiLevel GetParaLevel()
>    {
>      nsBidiLevel paraLevel = mParaLevel;
> +    if (paraLevel == NSBIDI_DEFAULT_LTR || paraLevel == NSBIDI_DEFAULT_RTL) {

This change doesn't look necessary, either?
Attachment #8797229 - Flags: review?(xidorn+moz) → review+
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #15)
> Comment on attachment 8797229 [details] [diff] [review]
> patch 1 - Move definition of nsBidiLevel/nsBidiDirection and frame bidi
> properties from nsBidi.h to nsIFrame.h, in preparation for removal of nsBidi
> 
> Review of attachment 8797229 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good. r=me with the following issues addressed.
> 
> ::: layout/base/nsBidi.h
> @@ -117,5 @@
> > -  /** All right-to-left text This is a 1 value. */
> > -  NSBIDI_RTL,
> > -  /** Mixed-directional text. */
> > -  NSBIDI_MIXED
> > -};
> 
> It doesn't seem necessary to move nsBidiDirection.

It's not necessary at this point, but if we're expecting to kill nsBidi altogether in the near-ish future it'll need to go somewhere, and it seemed logical for it to accompany nsBidiLevel; I don't see any benefit to leaving them in two separate places.

> ::: layout/base/nsBidiPresUtils.cpp
> @@ +185,5 @@
> >     */
> >    nsBidiLevel GetParaLevel()
> >    {
> >      nsBidiLevel paraLevel = mParaLevel;
> > +    if (paraLevel == NSBIDI_DEFAULT_LTR || paraLevel == NSBIDI_DEFAULT_RTL) {
> 
> This change doesn't look necessary, either?

In patch 2, I didn't include IS_DEFAULT_LEVEL in the ICU-based version of nsBidi, as it's really an internal implementation detail (the same #define exists within ICU, but it's in the internal ubidiimp.h header). This change was to avoid forcing nsBidi_ICU to expose that, when ICU doesn't consider it public API.

So unless you have strong objections, I'd be inclined to leave these two points as they are in the current patch.
(In reply to Jonathan Kew (:jfkthame) from comment #16)
> (In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #15)
> 
> It's not necessary at this point, but if we're expecting to kill nsBidi
> altogether in the near-ish future it'll need to go somewhere, and it seemed
> logical for it to accompany nsBidiLevel; I don't see any benefit to leaving
> them in two separate places.

When we kill nsBidi, it should be replaced by UBiDiDirection, shouldn't it?

> > ::: layout/base/nsBidiPresUtils.cpp
> > @@ +185,5 @@
> > >     */
> > >    nsBidiLevel GetParaLevel()
> > >    {
> > >      nsBidiLevel paraLevel = mParaLevel;
> > > +    if (paraLevel == NSBIDI_DEFAULT_LTR || paraLevel == NSBIDI_DEFAULT_RTL) {
> > 
> > This change doesn't look necessary, either?
> 
> In patch 2, I didn't include IS_DEFAULT_LEVEL in the ICU-based version of
> nsBidi, as it's really an internal implementation detail (the same #define
> exists within ICU, but it's in the internal ubidiimp.h header). This change
> was to avoid forcing nsBidi_ICU to expose that, when ICU doesn't consider it
> public API.

OK, this makes sense.
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #17)
> (In reply to Jonathan Kew (:jfkthame) from comment #16)
> > (In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #15)
> > 
> > It's not necessary at this point, but if we're expecting to kill nsBidi
> > altogether in the near-ish future it'll need to go somewhere, and it seemed
> > logical for it to accompany nsBidiLevel; I don't see any benefit to leaving
> > them in two separate places.
> 
> When we kill nsBidi, it should be replaced by UBiDiDirection, shouldn't it?

Yes, it will (just as nsBidiLevel will be replaced by UBiDiLevel). I can leave it in nsBidi.h for now if you prefer, it just seemed logical to me that we'd keep these two together.
Comment on attachment 8797230 [details] [diff] [review]
patch 2 - When ENABLE_INTL_API is true, make nsBidi into a minimal wrapper around ICU's ubidi.h functions

Review of attachment 8797230 [details] [diff] [review]:
-----------------------------------------------------------------

Canceling r? for now for the issue around nsBidi::GetCharTypeAt(). This patch looks good to me otherwise.

(We may want to follow the coding style that return type is in its own line for function definitions. But given the old nsBidi uses a different coding style and we expect to remove nsBidi_ICU in foreseeable future, I'm probably fine with leaving this style as-is.)

::: layout/base/moz.build
@@ +174,5 @@
> +    ]
> +else:
> +    EXPORTS += [
> +        'nsBidi_noICU.h',
> +    ]    

Please remove the tailing whitespaces.

::: layout/base/nsBidi_ICU.cpp
@@ +37,5 @@
> +}
> +
> +nsresult nsBidi::GetParaLevel(nsBidiLevel* aParaLevel)
> +{
> +  *aParaLevel = reinterpret_cast<nsBidiLevel>(ubidi_getParaLevel(mBiDi));

There is no need for reinterpret_cast. nsBidiLevel and UBiDiLevel are the same type (both are uint8_t).

@@ +54,5 @@
> +      aCharIndex + 1 < len &&
> +      NS_IS_LOW_SURROGATE(text[aCharIndex + 1])) {
> +    ch = SURROGATE_TO_UCS4(ch, text[aCharIndex + 1]);
> +  }
> +  *pType = (nsCharType)ubidi_getCustomizedClass(mBiDi, ch);

I'm concerned about this method. It is not an identical replacement of the current nsBidi::GetCharTypeAt().

The current nsBidi::GetCharTypeAt() fetches the value from its underlying buffer mDirProps, and according to my reading of the nsBidi code, the data in this buffer already resolves W2 which converts EN to AN when necessary, and also convert the rest of EN to its internal value ENL or ENR. [1] It also indicates that the current code path handling the same conversion in nsBidiPresUtils::CalculateCharType() [2] and the number formatting in nsBidiPresUtils::FormatUnicodeText() [3] may not be in effect at all.

Having said that, I wouldn't expect this change introduces regression given we have the handling code elsewhere. Contrarily, it may actually fix some issues. But I'm concerned about doing so in this bug. In my understanding, this bug should be doing just a straightforward conversion from the old impl to the new ICU-based impl.

I suggest we handle this in a new bug and land that separately, so that it would be easier to locate any regression (or improvement) in the future in case that happens, rather than hiding this behavior change in a refactoring patch. Probably that bug can simply remove nsBidi::GetCharTypeAt() completely and replace its callsites (actually only one callsite) with a direct call to GetBidiCat().

Could you do that? Please tell me if I missed something.

It may also be worth checking whether the code paths I mentioned above are actually in effect. If yes, how does they work, and if no, are the behavior desired? And in the latter case, if the behavior is desired, probably we should add test so that they wouldn't be regressed again, and if the behavior is not desired, probably we should update the code accordingly.


[1] https://dxr.mozilla.org/mozilla-central/rev/42c95d88aaaa7c2eca1d278399421d437441ac4d/layout/base/nsBidi.cpp#869-885
[2] https://dxr.mozilla.org/mozilla-central/rev/955840bfd3c20eb24dd5a01be27bdc55c489a285/layout/base/nsBidiPresUtils.cpp#1941-1945
[3] https://dxr.mozilla.org/mozilla-central/rev/955840bfd3c20eb24dd5a01be27bdc55c489a285/layout/base/nsBidiPresUtils.cpp#1814-1816,1830-1831,1837-1838
Attachment #8797230 - Flags: review?(xidorn+moz)
Depends on: 1307842
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #19)
> @@ +54,5 @@
> > +      aCharIndex + 1 < len &&
> > +      NS_IS_LOW_SURROGATE(text[aCharIndex + 1])) {
> > +    ch = SURROGATE_TO_UCS4(ch, text[aCharIndex + 1]);
> > +  }
> > +  *pType = (nsCharType)ubidi_getCustomizedClass(mBiDi, ch);
> 
> I'm concerned about this method. It is not an identical replacement of the
> current nsBidi::GetCharTypeAt().
> 
> The current nsBidi::GetCharTypeAt() fetches the value from its underlying
> buffer mDirProps, ...

Right, it's not identical; this was because ubidi doesn't expose the internal dirProps in the public API. AFAICT, we don't actually need it, the plain bidi category is fine (because the calling code in nsBidiPresUtils does its own special-case handling of number categories).

But you're right, it would be better to separate out a change here, so I've filed bug 1307842 to do that.
Updated patch, assuming GetCharTypeAt() is removed by bug 1307842.
Attachment #8798147 - Flags: review?(xidorn+moz)
Attachment #8797230 - Attachment is obsolete: true
Comment on attachment 8798147 [details] [diff] [review]
patch 2 - When ENABLE_INTL_API is true, make nsBidi into a minimal wrapper around ICU's ubidi.h functions

Review of attachment 8798147 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks.
Attachment #8798147 - Flags: review?(xidorn+moz) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/44a6dbfa02464bababf944fc1ee8c0889b33c82a
Bug 924851 - patch 1 - Move definition of nsBidiLevel/nsBidiDirection and frame bidi properties from nsBidi.h to nsIFrame.h, in preparation for removal of nsBidi. r=xidorn

https://hg.mozilla.org/integration/mozilla-inbound/rev/38b8cf854279011a0290a36dd1e6dabcee39406d
Bug 924851 - patch 2 - When ENABLE_INTL_API is true, make nsBidi into a minimal wrapper around ICU's ubidi.h functions. r=xidorn
No longer blocks: 1288761
No longer depends on: 1215247
https://hg.mozilla.org/mozilla-central/rev/44a6dbfa0246
https://hg.mozilla.org/mozilla-central/rev/38b8cf854279
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
You need to log in before you can comment on or make changes to this bug.