Regenerating nsUnicodeScriptCodes.h with genUnicodePropertyData.pl after the harfbuzz update in 745780 causes lots of failures, as can be seen at https://tbpl.mozilla.org/?tree=Try&rev=db3e14617a49 That push included patches for bug 738101 and other stuff, but the build failures on Linux and reftest oranges on other platforms are all reproducible without any of the changes from there, just by re-running genUnicodePropertyData.pl on a clean tree.
Created attachment 617396 [details] [diff] [review] rerun genUnicodePropertyData.pl Also, the #include of hb-common.h needs to be changed to hb.h
In the recent harfbuzz update, Behdad decided to rearrange the list that defines his HB_SCRIPT_* codes into "script age" and then alphabetical order (whereas previously it matched the script enums found in pango and glib). The rearrangement doesn't affect HB itself, as it's using tags rather than a sequentially-allocated list of codes. However, it means that the genUnicodePropertyData tool finds the codes in a different order, and hence generates different enum values. So this causes a couple of issues for us. First, gfxPangoFonts assumes that MOZ_SCRIPT_* codes are equal to PANGO_SCRIPT_* codes, and can just be cast from one to the other when we want to call into pango. The static assertions in gfxPangoFonts.cpp were included in order to alert us if this assumption were to break. And second, there are a couple of places in gfxScriptItemizer.cpp where we assume that COMMON and INHERITED are the first two codes, as we check script<=INHERITED to see whether a "real" script code has been encountered. This could easily be modified to explicitly check the codes without assuming anything about their ordering. Apart from this, I don't think our code makes any assumptions about the specific values of the MOZ_SCRIPT_* constants. As for the mismatch with pango, I think we have a couple of options. Either we ensure that our script codes match pango's, by abandoning the use of hb-common.h as the source of the ordered list (maybe just hard-code the order in genUnicodePropertyData, and make the tool warn if it finds new HB_SCRIPT constants are defined that we don't know about yet?), or else we simply accept that our codes aren't expected to match pango's, and add a table that maps between them.
Created attachment 617412 [details] [diff] [review] patch, avoid dependency on script code ordering in script-run itemizer If we don't want to maintain a known ordering of the script codes (to match pango and glib), then I believe this (untested) patch would fix the test failures on non-Linux platforms. However, I'm feeling more inclined to say that we should keep the old script codes, and abandon the dependency on how they happen to be listed in hb-common. ISTM that either way, we're going to have to bite the bullet and maintain a list somewhere - either the order of constants we want to use, or the mapping between our arbitrarily-generated ordering and pango's externally-defined one.
Created attachment 617432 [details] [diff] [review] patch, make genUnicodePropertyData tool independent of the listing of script codes in harfbuzz This should avoid the fragile dependency on the harfbuzz header. It still checks that all the script codes defined by harfbuzz are also defined for MOZ_SCRIPT_*, but the constants are no longer dependent on the order of the HB list.
(Note that if we do this, we don't need the patch to gfxScriptRunItemizer.)