Closed Bug 631479 Opened 13 years ago Closed 13 years ago

Add Graphite2 to gecko

Categories

(Core :: Layout: Text and Fonts, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla11

People

(Reporter: martin_hosken, Assigned: jfkthame)

References

(Blocks 1 open bug)

Details

(Whiteboard: [completed secreview])

Attachments

(5 files, 30 obsolete files)

10.73 KB, patch
jtd
: review+
Details | Diff | Splinter Review
2.83 MB, patch
jfkthame
: review+
Details | Diff | Splinter Review
561.30 KB, patch
jtd
: review+
Details | Diff | Splinter Review
8.50 KB, patch
khuey
: review+
Details | Diff | Splinter Review
460.97 KB, patch
jtd
: review+
Details | Diff | Splinter Review
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-GB; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13
Build Identifier: 4.0.0

Graphite is a smart font technology used particularly in fonts for scripts that do not yet have good OpenType support. The engine has recently been rewritten as graphite2 and is about to reach 1.0. The improvements in the engine include a 10x speed up to be comparable with any OpenType engine and security enhancements to stop malicious fonts from crashing or deadlocking the application. The rewrite involves no changes to existing fonts.

Reproducible: Always
Just a note that I've got some WIP here - will be posting patches shortly, after more testing.
Assignee: nobody → jfkthame
Status: UNCONFIRMED → NEW
Ever confirmed: true
Jonathan, what exactly is "Graphite2" spec'ed to be?  On the SIL Graphite site I see binary formats 2.0 and 3.0.  Or is the "2" aspect refer to a source version?
(In reply to comment #3)
> Jonathan, what exactly is "Graphite2" spec'ed to be?  On the SIL Graphite site
> I see binary formats 2.0 and 3.0.  Or is the "2" aspect refer to a source
> version?

See http://projects.palaso.org/projects/graphitedev. It's a rewrite of the actual shaping code (maintaining font-table compatibility, so existing fonts work unchanged), aiming at a lighter-weight, high-performance, robust engine, and an API that fits better with typical client applications than the original graphite implementation.
This is the graphite2 rendering engine from upstream, current hg snapshot - slightly newer than the 0.9.3 release tarball. (They'll put out a new release for us when we're ready to actually incorporate this.)
Attachment #530849 - Flags: review?(jdaggett)
Minor patch (removing a stray semicolon); expect this to be fixed in the next upstream snapshot.
Attachment #530850 - Flags: review?(jdaggett)
Attachment #530851 - Flags: review?(jdaggett)
OTS doesn't understand graphite tables, but we need it to preserve them so that downloadable fonts work properly. The responsibility for robustly handling/rejecting bad tables lies with the graphite code. Note that graphite tables are not used or examined at all by other platform shaping engines, so allowing them to remain in the font does not represent a risk there.
Attachment #530853 - Flags: review?(jdaggett)
As this stands, this patch still creates a harfbuzz shaper as well as the graphite one, so that we'll have a fallback path in case graphite shaping returns an error. This fallback shaper will often be redundant, I expect. We could defer creation of the alternate shaper, and only instantiate it if actually needed, but I'm not sure it's worth the trouble for now. Note that the actual shaper object is pretty cheap, and also that the redundancy only occurs for the small minority of fonts that actually have graphite tables present.
Attachment #530855 - Flags: review?(jdaggett)
Test that graphite is actually working, including support for language-dependent shaping and user-specified features.
Attachment #530856 - Flags: review?(jdaggett)
Argh, attached the wrong version of the patch (that one didn't handle FT2Fonts properly, so graphite didn't actually work on Android). Now fixed.
Attachment #530855 - Attachment is obsolete: true
Attachment #530859 - Flags: review?(jdaggett)
Attachment #530855 - Flags: review?(jdaggett)
What's the timeline on this bug? When would you like a new gr2 source release? I'm trying to do some gr2 project org. TIA.
Comment on attachment 530853 [details] [diff] [review]
part 3 - let OTS pass through graphite tables untouched

How much extra work would it be to do the work to sanitize these tables?  Is your thought that the graphite code is responsible for always treating these tables as tainted?
Comment on attachment 530853 [details] [diff] [review]
part 3 - let OTS pass through graphite tables untouched

+#ifdef MOZ_GRAPHITE
+#define GFX_DOWNLOADABLE_FONTS_SANITIZE_PRESERVE_GRAPHITE \
+            "gfx.downloadable_fonts.sanitize.preserve_graphite_tables"
+#endif
+

I don't see the need for an additional pref here, there's already a boolean
pref to enable/disable Graphite shaping, that's sufficient.  Adding another
pref adds complexity unnecessarily.
(In reply to comment #13)
> Comment on attachment 530853 [details] [diff] [review] [review]
> part 3 - let OTS pass through graphite tables untouched
> 
> How much extra work would it be to do the work to sanitize these tables?  Is
> your thought that the graphite code is responsible for always treating these
> tables as tainted?

That is my assumption as the graphite library maintainer. I don't expect application contexts to have to sanitize some rather complex tables. Instead graphite2 returns a null face if the graphite tables are corrupt in some way, or else handles the situation during rendering giving a garbage in garbage out result.

A quick way to decide whether it is worth passing a font to graphite for potential loading is to check if the font has a Silf table in it. I have added this to gr_make_face so that it will fail quickly for non-graphite fonts.
(In reply to comment #15)
> (In reply to comment #13)
> > Comment on attachment 530853 [details] [diff] [review] [review] [review]
> > part 3 - let OTS pass through graphite tables untouched
> > 
> > How much extra work would it be to do the work to sanitize these tables?  Is
> > your thought that the graphite code is responsible for always treating these
> > tables as tainted?
> 
> That is my assumption as the graphite library maintainer. I don't expect
> application contexts to have to sanitize some rather complex tables. Instead
> graphite2 returns a null face if the graphite tables are corrupt in some
> way, or else handles the situation during rendering giving a garbage in
> garbage out result.

Ok, so the assumption is that OTS sanitizing is unnecessary because the Graphite code is doing a sanitization process internally.
Documentation/background page for Graphite
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&cat_id=RenderingGraphite

Martin, what is the current state of the attached code compared to the documentation on the page above?  For example, does this code include full support for version 3.0 of the file format?  Are there any additional changes/differences/updates that the documentation page doesn't reflect (the most recent file on that page is dated 2007).
Attachment #530851 - Flags: review?(jdaggett) → review+
Yes Graphite2 fully supports versions 2-4 of the GTF spec (v4 adds a Glat v2 for larger numbers of glyph attributes and may be found in the source repo).
Attachment #530849 - Attachment is obsolete: true
Attachment #530850 - Attachment is obsolete: true
Attachment #542147 - Flags: review?(jdaggett)
Attachment #530849 - Flags: review?(jdaggett)
Attachment #530850 - Flags: review?(jdaggett)
Attachment #530851 - Attachment is obsolete: true
Attachment #542148 - Flags: review?(jdaggett)
Attachment #530853 - Attachment is obsolete: true
Attachment #542149 - Flags: review?(jdaggett)
Attachment #530853 - Flags: review?(jdaggett)
Attachment #542149 - Attachment description: let OTS pass through graphite tables untouched (removed option in prefs) → part 3 - let OTS pass through graphite tables untouched (removed option in prefs)
Attachment #530859 - Attachment is obsolete: true
Attachment #542150 - Flags: review?(jdaggett)
Attachment #530859 - Flags: review?(jdaggett)
(In reply to comment #14)
> +#ifdef MOZ_GRAPHITE
> +#define GFX_DOWNLOADABLE_FONTS_SANITIZE_PRESERVE_GRAPHITE \
> +            "gfx.downloadable_fonts.sanitize.preserve_graphite_tables"
> +#endif

> I don't see the need for an additional pref here, there's already a boolean
> pref to enable/disable Graphite shaping, that's sufficient.  Adding another
> pref adds complexity unnecessarily.

Fine, I've removed this in the updated copy of the patch. The actual OTS patch still handles this as an option, so that in builds without graphite at all, it's not forced to preserve useless tables, but the gfx/thebes code just passes a constant instead of checking a pref.
Our Windows build didn't like the updated makefile using graphite's files.mk, because of the form of the source paths it returns. Fixed here.
Attachment #542148 - Attachment is obsolete: true
Attachment #542424 - Flags: review?(jdaggett)
Attachment #542148 - Flags: review?(jdaggett)
Updated reftests and manifest to deal with a tiny subpixel-AA discrepancy under D2D. This ran fully green on all tryserver platforms.
Attachment #530856 - Attachment is obsolete: true
Attachment #542803 - Flags: review?(jdaggett)
Attachment #530856 - Flags: review?(jdaggett)
Oops, missed a prefs check in the GDI backend, so that turning off Graphite didn't actually work on XP; now fixed.
Attachment #542150 - Attachment is obsolete: true
Attachment #542936 - Flags: review?(jdaggett)
Attachment #542150 - Flags: review?(jdaggett)
I'm all for Graphite getting in to Gecko, but I want to make sure we look before we leap. Specifically, before we turn graphite on by default, I think we have to have a full security review and probably some fuzzing done on it. (Whatever the security review says, anyways.)

The fact that graphite has a small virtual machine in it sort of terrifies me from a security POV :)
(Sorry for interrupting this bug, but am I correct that there are only about a dozen "Graphite enabled" fonts? Is there a feature page or similar document explaining what benefits we get by taking this code?)
(In reply to comment #28)
> I'm all for Graphite getting in to Gecko, but I want to make sure we look
> before we leap. Specifically, before we turn graphite on by default, I think
> we have to have a full security review and probably some fuzzing done on it.
> (Whatever the security review says, anyways.)
> 
> The fact that graphite has a small virtual machine in it sort of terrifies
> me from a security POV :)

There is a fuzztest that comes as part of the graphite2 code and I've run it over a few fonts (it randomly changes values in the font looking to see if the engine crashes or hangs) and it runs clean for graphite tables. It's not possible to do exhaustive testing, but it does give some measure of confidence of robustness and also makes it easier for others to hammer on the engine and shake out more bugs.
(In reply to comment #29)
> (Sorry for interrupting this bug, but am I correct that there are only about
> a dozen "Graphite enabled" fonts? Is there a feature page or similar
> document explaining what benefits we get by taking this code?)

There is part of an answer here: http://scripts.sil.org/GraphiteFAQ which says why we developed Graphite. Over the years, I have wondered whether Graphite should just die. But I am noticing that there is an increased interest in Graphite from those wishing to have a consistent rendering behaviour across all platforms. With so many OT engines out there, it is hard to get a font to behave the same in all of them.

Other advantages are things like the ability to have user features, complete control of the shaping process by the font engineer, allowing for high end typographical effects and complex rendering to be done quickly. In my experience it is much quicker to develop a graphite font than an opentype one (amazingly) because the font engineer doesn't have to learn the shaping layer as well as the encoding layer. Mind you this is for more complex scripts.

So while the niche of Graphite for majority script support may be reducing, the ability for it to meet minority script needs, give consistent behaviour across all platforms and to give complete shaping control to developers means that it still has considerable value to give.

As to why there are so few fonts out there. That is something of a catch 22 question. No good application support means no incentive to add Graphite tables to a font. The fonts that do exist tend to be reference standard implementations for scripts that support all known writing systems for that script. So these fonts are key in providing underlying script support for minority language content on the net.
graphite2 v1.0.0 has been released available from http://projects.palaso.org/projects/graphitedev/files and http://sourceforge.net/projects/silgraphite/files/graphite2-1.0.0.tgz/download

Bug fixes, bidi support (that gecko won't want), bidi mirroring support (when the compiler catches up), smaller memory footprint. All round general goodness added :)
I notice that this has been whiteboarded. I have done all I can to facilitate a security review including running fuzz testing. What constitutes a completed review? How can we say that the review is done? It sounds like an open ended, never to be completable task to me.

I really really would like to see this in FF6 or in gecko as soon as possible, please.

New version at http://sourceforge.net/projects/silgraphite/files/graphite2-1.0.1.tgz/download
(In reply to martin_hosken from comment #33)
> I really really would like to see this in FF6 or in gecko as soon as
> possible, please.

Firefox 6 is basically done. Firefox 7 is only taking stabilizing fixes. The Firefox 8 development cycle is one week away from wrapping up and moving into stabilization. Assuming this is something we do end up taking, a realistic first chance for this feature is Firefox 9.
> Firefox 6 is basically done. Firefox 7 is only taking stabilizing fixes. The
> Firefox 8 development cycle is one week away from wrapping up and moving
> into stabilization. Assuming this is something we do end up taking, a
> realistic first chance for this feature is Firefox 9.

Well it's not going into any version until this security review is done. I would like to point out that for the development team, a fuzztest constitutes corrupting each byte in a font, in turn, and running the engine over some text to see if it crashes or hangs for each byte corruption. A clear test is when a complete pass of corruptions, over a font causes no crashes or hangs. The tools to do this are included in the source.

If there are some criteria for what a security review involves or means, we are happy to help. I'm just scared that graphite will never escape this review phase because there is no way of declaring something absolutely secure.
The sec-review-needed keyword is used by the security team to monitor items we think may need to be looked at in a variety of ways. This does not stop a feature from landing, shipping or otherwise being worked on at this time.

This bug was whiteboarded as the team triaged all bugs marked as sec-review-needed to determine if a bug or item needed security tasks, if so what task or if the keyword needed to be removed. In the case of this bug responsibility for driving of security tasks is being handled by bsterne with the concept of reviewing the fuzzing or preforming more fuzzing for the patches. This could mean a variety of things with fuzzers or the fuzzing@ group.

As for landing and prioritization of the feature that is outside the security team's control. And per the feature page (https://wiki.mozilla.org/Features/Platform/Graphite_font_shaping) this feature is still unprioritized by the module owner. 

As for engaging the security team for a review the onus is on the feature and patch owner to engage the security team. I would suggest you review the recent mail sent to the dev-planning list (https://groups.google.com/group/mozilla.dev.planning/browse_thread/thread/d8299dd16187909/8fdeaf8f84b7fa2a?lnk=gst&q=koenig#8fdeaf8f84b7fa2a).

As to what constitutes a completed review, that varies from bug to bug and feature to feature. Some items need more in depth analysis than others. In the case of this particular bug it we skipped a large design discussion in favor of fuzz testing and maybe an implementation review. It does have a defined end. In the case of this bug it is not security that is holding up landing or shipping of the item.
Updated the graphite2 code to release 1.0.1.
Attachment #542147 - Attachment is obsolete: true
Attachment #552350 - Flags: review?(jdaggett)
Attachment #542147 - Flags: review?(jdaggett)
Updated for the new Graphite release.
Attachment #542424 - Attachment is obsolete: true
Attachment #552351 - Flags: review?(jdaggett)
Attachment #542424 - Flags: review?(jdaggett)
Attachment #542936 - Attachment is obsolete: true
Attachment #552352 - Flags: review?(jdaggett)
Attachment #542936 - Flags: review?(jdaggett)
Christoph is going to take a shot at fuzzing this library.
Whiteboard: [sr:bsterne] → [sr:christoph fuzzing]
Whiteboard: [sr:christoph fuzzing] → [secr:christoph fuzzing]
Updated the Graphite2 code to pick up fixes for fuzzbugs 682204 and 682423.
Attachment #552350 - Attachment is obsolete: true
Attachment #556272 - Flags: review?(jdaggett)
Attachment #552350 - Flags: review?(jdaggett)
Minor un-bitrotting.
Attachment #552351 - Attachment is obsolete: true
Attachment #556273 - Flags: review?(jdaggett)
Attachment #552351 - Flags: review?(jdaggett)
Updated graphite code to pick up latest bugfix.
Attachment #556272 - Attachment is obsolete: true
Attachment #557118 - Flags: review?(jdaggett)
Attachment #556272 - Flags: review?(jdaggett)
Un-bitrotted again.
Attachment #556273 - Attachment is obsolete: true
Attachment #557119 - Flags: review?(jdaggett)
Attachment #556273 - Flags: review?(jdaggett)
Argh, previous version of part 1 was bad (failed to "hg add" a file). This one should be correct.
Attachment #557118 - Attachment is obsolete: true
Attachment #557126 - Flags: review?(jdaggett)
Attachment #557118 - Flags: review?(jdaggett)
Blocks: 682204
jdaggett, review ping? It'd be much appreciated if we could get this into the tree (initially preffed-off, I assume) prior to the next Aurora merge, so that interested parties can begin wider testing.
It would really help us if the graphite integration could be merged into trunk before the FF9 branch. All the security review bugs have been fixed and there has been nothing more raised for 2 weeks. Integration into trunk does not reduce our commitment for responsive bug fixing, should such things arise,
Have we reached a consensus that we actually want the feature? Is there any Webfonts working group consensus regarding it?
Jonathan and I agree that the Web needs this. I don't know who else needs to be part of that consensus. I guess John Daggett could chime in.

It hasn't been discussed on the web-fonts working group. I suppose we could raise it there, but we'll get more noise than signal.
But anyway, Jonathan, why don't you raise the possibility of including Graphite in browsers in www-font?

Of course, it's the people who use marginal languages who will really benefit and they won't be hanging out in www-font...
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #50)
> Jonathan and I agree that the Web needs this. I don't know who else needs to
> be part of that consensus. I guess John Daggett could chime in.

Yes, I definitely agree that this would be a useful feature to better support minority scripts that are difficult to support via commerical standards like OpenType. 

Since it's basically a "smart font" format, I doubt this is something a W3C group would work to standardize but it might eventually affect CSS if it's used widely enough.
Depends on: 691505
Updated graphite code to the 1.0.3 release.

(For actual landing, I expect we'll update again to take any additional bugfixes that occur upstream - I know there's just been a precautionary fix to prevent out-of-control growth of the glyph list.)
Attachment #557126 - Attachment is obsolete: true
Attachment #564650 - Flags: review?(jdaggett)
Attachment #557126 - Flags: review?(jdaggett)
Attachment #557119 - Attachment is obsolete: true
Attachment #564652 - Flags: review?(jdaggett)
Attachment #557119 - Flags: review?(jdaggett)
Attachment #552352 - Attachment is obsolete: true
Attachment #564653 - Flags: review?(jdaggett)
Attachment #552352 - Flags: review?(jdaggett)
Oops, forgot to "hg add" a couple of new files - now fixed.
Attachment #564650 - Attachment is obsolete: true
Attachment #564756 - Flags: review?(jdaggett)
Attachment #564650 - Flags: review?(jdaggett)
Blocks: 692270
Updated graphite code to latest tip from upstream. This includes a couple more bugfixes, and provides a hook we can use for bug 691505.
Attachment #564756 - Attachment is obsolete: true
Attachment #566479 - Flags: review?(jdaggett)
Attachment #564756 - Flags: review?(jdaggett)
jtd, review ping? We'd like to get this into the tree....
Yes, i know, i know.  I've been trying to complete a first cut of the font-variant-xxx property implementations before taking the time for other things, including this one.  Sorry for the delay.
Attachment #564652 - Flags: review?(jdaggett) → review+
Comment on attachment 542149 [details] [diff] [review]
part 3 - let OTS pass through graphite tables untouched (removed option in prefs)

> -    if (ots::Process(&output, aData, aLength)) {
> +    if (ots::Process(&output, aData, aLength,
> +#ifdef MOZ_GRAPHITE
> +        true
> +#else
> +        false
> +#endif
> +        ))

Using #ifdef like this makes the code hard to read.  Just define 
PRESERVE_GRAPHITE using the same logic and use that instead.
Attachment #542149 - Flags: review?(jdaggett) → review+
Minor rebasing to current trunk - carrying forward r=jdaggett.
Attachment #564652 - Attachment is obsolete: true
Attachment #568610 - Flags: review+
Refreshed to apply on current trunk (no functional changes, just the bool update).
Attachment #564653 - Attachment is obsolete: true
Attachment #568611 - Flags: review?(jdaggett)
Attachment #564653 - Flags: review?(jdaggett)
Among the graphite reftests, I'm seeing warnings on graphite-01-ref.html and graphite-02-ref.html:

WARNING: Ended font run in the middle of a cluster: 'end == aSource->GetLength() || aSource->IsClusterStart(end)', file /builds/mozcentral/gfx/thebes/gfxFont.cpp, line 4347
WARNING: Started font run in the middle of a cluster: 'aSource->IsClusterStart(start)', file /builds/mozcentral/gfx/thebes/gfxFont.cpp, line 4345
Comment on attachment 542803 [details] [diff] [review]
part 5 - reftests to check that graphite shaping is working

> +random-if(d2d) HTTP(..) == graphite-02.html graphite-02-ref.html # possible subpixel-AA discrepancy under DWrite

That test is simple enough, we really should figure out why there's a
difference between the two cases.  The positions should be the same,
why are they different?  And why only in the D2D case?

I'm minus'ing this because I think for a library of this size and
complexity this simply isn't enough in the way of tests.  The HarfBuzz
code doesn't have a lot of tests but it's our main text rendering path
so it's going to get a lot of testing just running other tests.  This
code will only get invoked when Graphite tables are present in the
font, which means that unless there are explicit tests this codepath
won't get called.
Attachment #542803 - Flags: review?(jdaggett) → review-
Comment on attachment 568611 [details] [diff] [review]
part 4 - implement gfxGraphiteShaper (rebased)

> +    const gfxFontStyle *style = aTextRun->GetFontGroup()->GetStyle();
> +    PRUint32 grLang = 0;
> +    if (style->languageOverride) {
> +        grLang = MakeGraphiteLangTag(style->languageOverride);
> +    } else if (mFont->GetFontEntry()->mLanguageOverride) {
> +        grLang = MakeGraphiteLangTag(mFont->GetFontEntry()->mLanguageOverride);
> +    } else {
> +        nsCAutoString langString;
> +        style->language->ToUTF8String(langString);
> +        PRInt32 hyphIndex = langString.FindChar('-');
> +        if (hyphIndex >= 0) {
> +            langString.Truncate(hyphIndex);
> +        }
> +        langString.Append("\0\0\0\0", 4 - langString.Length()); // NUL-pad
> +        grLang = (langString[0] << 24) + (langString[1] << 16) +
> +                 (langString[2] << 8) + langString[3];
> +    }
> +    gr_feature_val *grFeatures = gr_face_featureval_for_lang(mGrFace, grLang);

This is wrong, the language override is an OpenType language tag (e.g. ENG) so
you need to do a conversion of OpenType lang ==> BCP47 lang, which is the same
format as style->language.

> +    const nsTArray<gfxFontFeature> *features = &style->featureSettings;
> +    if (features->IsEmpty()) {
> +        features = &mFont->GetFontEntry()->mFeatureSettings;
> +    }

This logic is wrong but that's already a bug against the HarfBuzz shaper (bug
687778).  The patches I have for implementing the font-variant subproperties
fixes this and I can fix this up too.

> +#ifdef MOZ_GRAPHITE
> +pref("gfx.font_rendering.graphite.enabled", true);
> +#endif
> +

This should be pref'ed off by default for now.

I'm still puzzling over the hairy logic in SetGlyphsFromSegment.  The
code there at least needs some simple commenting to explain roughly
what's going on.  I think it would also be a good idea to be asserting
various array bound conditions, especially with all the value-1 and
value+1 referencing there.
Attachment #568611 - Flags: review?(jdaggett) → review-
(In reply to John Daggett (:jtd) from comment #65)

> > +    const gfxFontStyle *style = aTextRun->GetFontGroup()->GetStyle();
> > +    PRUint32 grLang = 0;
> > +    if (style->languageOverride) {
> > +        grLang = MakeGraphiteLangTag(style->languageOverride);
> > +    } else if (mFont->GetFontEntry()->mLanguageOverride) {
> > +        grLang = MakeGraphiteLangTag(mFont->GetFontEntry()->mLanguageOverride);
> > +    } else {
> > +        nsCAutoString langString;
> > +        style->language->ToUTF8String(langString);
> > +        PRInt32 hyphIndex = langString.FindChar('-');
> > +        if (hyphIndex >= 0) {
> > +            langString.Truncate(hyphIndex);
> > +        }
> > +        langString.Append("\0\0\0\0", 4 - langString.Length()); // NUL-pad
> > +        grLang = (langString[0] << 24) + (langString[1] << 16) +
> > +                 (langString[2] << 8) + langString[3];
> > +    }
> > +    gr_feature_val *grFeatures = gr_face_featureval_for_lang(mGrFace, grLang);
> 
> This is wrong, the language override is an OpenType language tag (e.g. ENG)
> so
> you need to do a conversion of OpenType lang ==> BCP47 lang, which is the
> same
> format as style->language.

I'm not sure that's appropriate. When using OpenType fonts, it makes sense for font-language-override to be an OpenType "language system" tag, but when using Graphite fonts, surely it should be the Graphite language tag. Using OT langSys tags to control Gr language makes no sense, and would prevent access to arbitrary languages in Graphite fonts that might not have a corresponding OT langSys tag defined.

> > +    const nsTArray<gfxFontFeature> *features = &style->featureSettings;
> > +    if (features->IsEmpty()) {
> > +        features = &mFont->GetFontEntry()->mFeatureSettings;
> > +    }
> 
> This logic is wrong but that's already a bug against the HarfBuzz shaper (bug
> 687778).  The patches I have for implementing the font-variant subproperties
> fixes this and I can fix this up too.

Yes, right now font-feature-settings is handled as a single string at the CSS level, so there's no "merging" of features from the @font-face rule and the style. This should change as part of implementing the individual subproperties and the newer font-feature-settings syntax. I haven't looked in detail at your font-variant patches yet, but I assume all this will get dealt with together as part of that work.

> > +#ifdef MOZ_GRAPHITE
> > +pref("gfx.font_rendering.graphite.enabled", true);
> > +#endif
> > +
> 
> This should be pref'ed off by default for now.

Right; it's on for testing purposes, but I've always assumed we'll land preffed-off at first.

We can't reftest a preffed-off feature, though, can we? So I suppose we'll need to convert tests to mochitest form.
You can force the pref on while tests are running. For example, you could set the pref in reftest.js.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #67)
> You can force the pref on while tests are running. For example, you could
> set the pref in reftest.js.

For local testing or tryserver pushes, yes; but is there a way to toggle the pref just for the specific tests that need it on m-c, etc? I wasn't aware of a way to adjust prefs per-test - or did you mean to suggest we should extend the reftest manifest format to support this?
You could just turn it on for all reftests.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #69)
> You could just turn it on for all reftests.

I'd be reluctant to do that, as it means we'd be testing ALL text rendering using a different configuration than we're shipping. The presence of the graphite code path isn't _supposed_ to have any effect on non-graphite fonts, but this would create the possibility that we break the actual shipping configuration (with graphite.enabled=false) and reftests wouldn't detect it.

Filed bug 696585 with a patch to allow setting the pref on a per-test basis. This will allow us to begin reftesting the feature even while it's preffed off by default; it will also allow us to test both "on" and "off" settings on an ongoing basis, to confirm that they have the expected difference in behavior.
(In reply to Jonathan Kew (:jfkthame) from comment #66)
> > This is wrong, the language override is an OpenType language tag
> > (e.g. ENG) so you need to do a conversion of OpenType lang ==>
> > BCP47 lang, which is the same format as style->language.
> 
> I'm not sure that's appropriate. When using OpenType fonts, it makes
> sense for font-language-override to be an OpenType "language system"
> tag, but when using Graphite fonts, surely it should be the Graphite
> language tag. Using OT langSys tags to control Gr language makes no
> sense, and would prevent access to arbitrary languages in Graphite
> fonts that might not have a corresponding OT langSys tag defined.

The spec currently defines the string to be an OpenType language tag:

  The value of the <string> is a single three-letter OpenType
  language system tag, defined in the layout tag registry of the
  OpenType specification.

Sounds like you'd prefer something like:

  The value of the <string> is a determined by the underlying
  font technology used.  For OpenType fonts, it is a single
  three-letter OpenType language system tag, defined in the
  layout tag registry of the OpenType specification.  Other font
  technologies may use a different format but the value of the
  string only specifies a single language.

There's a related problem with the way font-feature-settings is defined,
we can't access the (awful) numeric values that many Graphite fonts seem
to use.  Maybe the syntax should be expanded a bit, to include a
feature() wrapper that could take either a tag or a number?  Using a
modified version of the current spec syntax:

  font-feature-settings: "dlig" 1, "ss02", "swsh" 12;
  font-feature-settings: feature("dlig") 1, feature(1059) 3;

I think it would really help if SIL laid out clearer guidelines about feature
id's, so that feature settings would be consistent across fonts. There's
really no reason for two different fonts to use two different feature
id's for standard features like small caps glyphs!  It would also be
nice if they recommended use of the OpenType styleset and character
variant feature, since many (most?) of the features I've seen in
Graphite fonts seem to be of this nature, choosing alternate variants of
specific characters that match the definition of ssXX/cvXX.
(In reply to John Daggett (:jtd) from comment #71)
> The spec currently defines the string to be an OpenType language tag:
> 
>   The value of the <string> is a single three-letter OpenType
>   language system tag, defined in the layout tag registry of the
>   OpenType specification.
> 
> Sounds like you'd prefer something like:
> 
>   The value of the <string> is a determined by the underlying
>   font technology used.  For OpenType fonts, it is a single
>   three-letter OpenType language system tag, defined in the
>   layout tag registry of the OpenType specification.  Other font
>   technologies may use a different format but the value of the
>   string only specifies a single language.

Yes, I realize the current WD text specifies OT tags; I think this should be modified along the lines you suggest here, so that it's not restricted to one particular technology. (Otherwise, we should call it font-opentype-language-system, and not expect it to apply to anything else.)

> There's a related problem with the way font-feature-settings is defined,
> we can't access the (awful) numeric values that many Graphite fonts seem
> to use.  Maybe the syntax should be expanded a bit, to include a
> feature() wrapper that could take either a tag or a number?

I discussed this with Martin a while ago, and at that time he was OK with only supporting OT-style tags; I believe he was intending to encourage SIL to update fonts that are currently using arbitrary numeric IDs to offer tags instead (or as well?) for access to the features.

(Note that some Graphite fonts are already using OT-style tags; e.g. see http://www.numbertext.org/linux/fontfeatures.pdf.)

Martin, any further thoughts on this?
Actually, OpenType is the one that's out of sync with the rest of the industry which is using ISO639 (2 and 3 letter codes). So, given also that there is the lang tag already that should be sufficient, the font-language-override feels very like it is technology specific just like the font-feature-settings. Therefore I agree with John's proposed wording.

Yes the numeric ids are far from ideal, but there is a standard way to map a string to a numeric id for graphite. If the string consists entirely of digits, then it is a numeric id (even if there are 4 or less of them), otherwise it is a 4 char tag. If that is too much, then 4 char tags only would be OK, even though it would be a shock to the Roman Font guys.

The whole numeric id thing, while not going away in principle in the engine, will probably get a major overhaul in our Roman Fonts for the next major release over the next year, anyway. The history is that the fonts were using numeric ids (coming from the AAT concept) before we had the sensible idea to support 4 letter tags as well.

I agree with the idea of a recommended list of tags to use. Notice I don't say a registry, because the list should be open rather than closed. But where different font developers want to use a feature to control the same kind of behaviour, it would be good if they used the same id. It would really help us in that process if you could give us a list of the mappings from CSS specific settings so that we know which ones we really have to get people to standardise on. I agree that the more we can overlap with OT the better, but I doubt we will go to things like ss02, though.
Actually, OpenType is the one that's out of sync with the rest of the industry which is using ISO639 (2 and 3 letter codes). So, given also that there is the lang tag already that should be sufficient, the font-language-override feels very like it is technology specific just like the font-feature-settings. Therefore I agree with John's proposed wording.

Yes the numeric ids are far from ideal, but there is a standard way to map a string to a numeric id for graphite. If the string consists entirely of digits, then it is a numeric id (even if there are 4 or less of them), otherwise it is a 4 char tag. If that is too much, then 4 char tags only would be OK, even though it would be a shock to the Roman Font guys.

The whole numeric id thing, while not going away in principle in the engine, will probably get a major overhaul in our Roman Fonts for the next major release over the next year, anyway. The history is that the fonts were using numeric ids (coming from the AAT concept) before we had the sensible idea to support 4 letter tags as well.

I agree with the idea of a recommended list of tags to use. Notice I don't say a registry, because the list should be open rather than closed. But where different font developers want to use a feature to control the same kind of behaviour, it would be good if they used the same id. It would really help us in that process if you could give us a list of the mappings from CSS specific settings so that we know which ones we really have to get people to standardise on. I agree that the more we can overlap with OT the better, but I doubt we will go to things like ss02, though.

BTW I don't know of any fonts other than the Roman ones that are seriously using numeric ids anymore. All new fonts use tags and there are a number out there now.
Shouldn't font-language-override take string values that are consistent with the values used in HTML "lang" attributes and the CSS :lang() selector? I.e., values from BCP 47?

User-agents already have to be able to map those values into whatever font systems they support.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #75)
> Shouldn't font-language-override take string values that are consistent with
> the values used in HTML "lang" attributes and the CSS :lang() selector?
> I.e., values from BCP 47?
> 
> User-agents already have to be able to map those values into whatever font
> systems they support.

There isn't always a 1-to-1 mapping, as the documentation explains:

http://www.microsoft.com/typography/otspec/languagetags.htm

For example, you have 'Yi Classic' and 'Yi Modern'.
(In reply to John Daggett (:jtd) from comment #71)
>   The value of the <string> is a determined by the underlying
>   font technology used.  For OpenType fonts, it is a single
>   three-letter OpenType language system tag, defined in the
>   layout tag registry of the OpenType specification.  Other font
>   technologies may use a different format but the value of the
>   string only specifies a single language.

Could this work reasonably for fonts that support both OpenType and Graphite technologies?
(In reply to Karl Tomlinson (:karlt) from comment #77)
> Could this work reasonably for fonts that support both OpenType and Graphite
> technologies?

We only shape with *one* of these, with the patches attached here if Graphite tables exist then Graphite shaping is used.  Without Graphite tables, OpenType shaping is used.  So effectively, OpenType layout tables are ignored when Graphite tables are present.
(In reply to John Daggett (:jtd) from comment #78)
> We only shape with *one* of these, with the patches attached here if
> Graphite tables exist then Graphite shaping is used.

Does this mean the web author needs to guess which technology the browser will use?
(Not all browsers use Graphite.)
(In reply to Karl Tomlinson (:karlt) from comment #79)
> (In reply to John Daggett (:jtd) from comment #78)
> > We only shape with *one* of these, with the patches attached here if
> > Graphite tables exist then Graphite shaping is used.
> 
> Does this mean the web author needs to guess which technology the browser
> will use?
> (Not all browsers use Graphite.)

Yes unfortunately.  That's already the case.  An author needs to know that an AAT font for Indic is required on OSX because OpenType layout for Indic is not well supported (don't know whether this has improved in 10.7 but you get the point...).

Although Graphite allows a lot of flexibility, I don't think we should think of it as a replacement for OpenType but rather as something to cover areas that OpenType doesn't handle well.
Font-language-override, like font-feature-settings, is a low-level and inherently font-specific tool for authors who understand in detail the capabilities of their fonts. If the browser does not support the rendering technology that the author was targeting, then neither of these is likely to work as intended; most likely they'll simply not match, and be ignored.

It would be possible for a font developer concerned about this scenario to support the OpenType langsys tags in Graphite, in addition to ISO language tags, and such a font could then work with the same font-language-override setting under both technologies.

(The same issue applies to font-feature-settings: It would be possible for a dual-technology font supporting both OpenType and Graphite to be designed such that all the feature tags and resulting behavior match across both systems, so that font-feature-settings can be used regardless of the rendering system, but there's nothing constraining the font developer to implement things in such a way - in general, the greater flexibility of Graphite encourages a more "natural" expression of the behavior, while the constraints of OpenType features and the limits of various application UIs tend to make designers artificially shoehorn required behavior into a framework of a few "well-known" features. So in practice, the value of font-feature-settings must be assumed to be technology-specific; font-language-override is no different. Making these low-level controls work consistently across multiple fonts and/or rendering technologies is possible, but only if the font developer explicitly chooses to support this.)
(In reply to John Daggett (:jtd) from comment #64)
> Comment on attachment 542803 [details] [diff] [review] [diff] [details] [review]
> part 5 - reftests to check that graphite shaping is working
> 
> > +random-if(d2d) HTTP(..) == graphite-02.html graphite-02-ref.html # possible subpixel-AA discrepancy under DWrite
> 
> That test is simple enough, we really should figure out why there's a
> difference between the two cases.  The positions should be the same,
> why are they different?  And why only in the D2D case?

The difference arises because of how we handle glyph positioning information.

In the "real" graphite case, where the string "Latin" gets rendered as the glyphs "Atinlay", the reordering means that the entire word becomes a single cluster, with the glyphs associated with the first character, and the remaining characters being cluster-continuations with no glyphs of their own.

In this situation, the advance width of the entire cluster is assigned to the first glyph, and the remaining glyphs are positioned using the mXOffset field. Note that advance width is stored as an integer in appUnits, while the position offsets in the DetailedGlyph record are floating-point. Thus, the relative positions of glyphs within the cluster are stored in the textrun with floating-point accuracy.

In the "reference" case, however, the characters "Atinlay" are each handled in a separate segment, to circumvent the graphite processing in the font and allow us to spell out the expected result. But this means that each glyph in the textrun is a "simple" glyph with an advance width that is stored as an integer number of appunits. So in effect we have forcibly "snapped" each glyph to an integer-appunit position.

The result is that the testcase has more precise positioning of glyphs within the "cluster" of the reordered word than we can achieve in the reference case. In principle this could lead to tiny (subpixel) discrepancies on any platform, I think, but D2D seems to be the only one that is sufficiently sensitive to the precision of the positions for it to actually make the test fail.
(In reply to John Daggett (:jtd) from comment #63)
> Among the graphite reftests, I'm seeing warnings on graphite-01-ref.html and
> graphite-02-ref.html:
> 
> WARNING: Ended font run in the middle of a cluster: 'end ==
> aSource->GetLength() || aSource->IsClusterStart(end)', file
> /builds/mozcentral/gfx/thebes/gfxFont.cpp, line 4347
> WARNING: Started font run in the middle of a cluster:
> 'aSource->IsClusterStart(start)', file
> /builds/mozcentral/gfx/thebes/gfxFont.cpp, line 4345

These warnings are expected and harmless. They arise because the reference files use &zwnj; between adjacent characters in order to block the graphite rules from applying. Zwnj is a grapheme cluster extender, so it becomes part of the same cluster as the preceding letter, but the test font doesn't support the zwnj character, so font-matching falls back to another font for it. The result is that we have a mid-cluster font change, which is exactly what the warning says.
Updated the gfxGraphiteShaper code to include some comments in the "hairy logic" dealing with char/glyph clusters.

Also set the graphite-enabled pref to FALSE by default, in preparation for initial landing. This means that you'll need to explicitly turn it on in order to test. (It also means that we'll need bug 696585 or some other solution before we can land reftests in the tree.)

I did not modify the (-moz-)font-language-override handling, as per comments 71 and following. We should probably discuss this in the Fonts WG at some point, but for now I suggest that the simple model of passing the language override directly to the font is the most reasonable, understandable, and useful thing to do.
Attachment #568611 - Attachment is obsolete: true
Attachment #570203 - Flags: review?(jdaggett)
Depends on: 696585
how's this bug going? Does silence mean that all the shouting is done and the patches are ready to go?
One other comment about the reftests, they're using both feature and language-mapping, it would be useful to have a simple, brief comment explaining those.  Specifically, the feature you enable in graphite-03b.html, 'kdot', is a feature that's enabled by default when the language is Karen, 'ksw', in the Padauk font.  I had to dig up the GDL source to figure that out.
Blocks: 700022
Blocks: 700023
As landing unit tests is dependent on support for setting the graphite pref to "on", I've split that out into a separate bug 700022. I've also filed bug 700023 on enabling graphite by default.

Meanwhile, can we finish review here and get the basic patches for graphite support (preffed-off) into the tree, please?
Comment on attachment 542803 [details] [diff] [review]
part 5 - reftests to check that graphite shaping is working

Marking the reftest patch as obsolete - this will be updated and moved to bug 700022.
Attachment #542803 - Attachment is obsolete: true
(In reply to Jonathan Kew (:jfkthame) from comment #87)
> As landing unit tests is dependent on support for setting the graphite pref
> to "on", I've split that out into a separate bug 700022. I've also filed bug
> 700023 on enabling graphite by default.

This isn't completely true.  You can still write Graphite tests that compare the use of two Graphite fonts compiled with different GDL files that should render the same way.  Without graphite enabled, they would just trivially pass.

I don't like the idea of checking in a large piece of code like this with no tests.
(In reply to John Daggett (:jtd) from comment #86)
> Specifically, the feature you enable in graphite-03b.html, 'kdot',
> is a feature that's enabled by default when the language is Karen,
> 'ksw', in the Padauk font.  I had to dig up the GDL source to figure
> that out.

Actually, I got this wrong.  The reftest comparison is a != comparison
so it's relying on the rendering being different.  Either way, it
would be helpful to have a comment explaining what these tests are
doing since there's several steps of indirection to figure this out.

I was demoing the use of Graphite in Firefox a couple days ago to
Richard Ishida, W3C head of internationalization.  He noticed that one
of the reftests was using lang='mya' and noted that it should be 'my'
instead.  The HTML5 spec specifically defines the lang tag [1] as
referencing "a valid BCP 47 language tag".  So the tag should be one
of those listed in the IANA subtag registry [2].

Which brings up a few questions/problems.  The code currently simply
passes along the string in the language tag but if Graphite expects a
different form of langauge tag (e.g. ISO 636-3) then there should be a
translation done.  While the language override may be looser, the
treatment of the contents of the lang tag should be clearer, otherwise
there will be a mismatch between the lang tag and the specific format
used (i.e. OpenType or Graphite).  What are language tags in Graphite
fonts defined to be?  ISO 636-3?  If this is the case then we at least
need to convert BCP47 ==> ISO 636-3 and pass along unknown languages.
My main concern is that properly named language tags should match with
whatever is defined in a Graphite font (e.g 'my' should match for a
Burmese font in both the OpenType case and the Graphite case).
(In reply to John Daggett (:jtd) from comment #90)

> I was demoing the use of Graphite in Firefox a couple days ago to
> Richard Ishida, W3C head of internationalization.  He noticed that one
> of the reftests was using lang='mya' and noted that it should be 'my'
> instead.  The HTML5 spec specifically defines the lang tag [1] as
> referencing "a valid BCP 47 language tag".  So the tag should be one
> of those listed in the IANA subtag registry [2].

Sorry, my mistake - it should have been lang="my". The error is irrelevant to the test, though, as the Myanmar behavior is the (untagged) default in the Padauk font, and so neither "my" nor "mya" are explicitly recognized, and in either case we get the font's default rendering. It's only in the case where we want non-default behavior (lang="ksw") that the tag is actually significant.

> Which brings up a few questions/problems.  The code currently simply
> passes along the string in the language tag

Actually, it truncates it and passes just the primary language subtag. Graphite language tags are limited to 4 characters, so they can only be expected to represent a primary language, not an arbitrary BCP47 tag (but that's reasonable, I think).

> but if Graphite expects a
> different form of langauge tag (e.g. ISO 636-3) then there should be a
> translation done.  While the language override may be looser, the
> treatment of the contents of the lang tag should be clearer, otherwise
> there will be a mismatch between the lang tag and the specific format
> used (i.e. OpenType or Graphite).  What are language tags in Graphite
> fonts defined to be?  ISO 636-3?  If this is the case then we at least
> need to convert BCP47 ==> ISO 636-3 and pass along unknown languages.

Martin, any further comment on what tags are expected to be used in Graphite, and how we should relate HTML language tagging to this?

> My main concern is that properly named language tags should match with
> whatever is defined in a Graphite font (e.g 'my' should match for a
> Burmese font in both the OpenType case and the Graphite case).

Yes, of course. I'd expect that to work fine (though in practice it often "works" just like lang="en" will "work" and select the appropriate English behavior in a Latin-script font, by virtue of being the unmarked default rather than actually matching any tags).
> const gfxFontStyle *style = aTextRun->GetFontGroup()->GetStyle();
> PRUint32 grLang = 0;
> if (style->languageOverride) {
>     grLang = MakeGraphiteLangTag(style->languageOverride);
> } else if (mFont->GetFontEntry()->mLanguageOverride) {
>     grLang = MakeGraphiteLangTag(mFont->GetFontEntry()->mLanguageOverride);
> } else {
>     nsCAutoString langString;
>     style->language->ToUTF8String(langString);
>     PRInt32 hyphIndex = langString.FindChar('-');
>     if (hyphIndex >= 0) {
>         langString.Truncate(hyphIndex);
>     }
>     langString.Append("\0\0\0\0", 4 - langString.Length()); // NUL-pad
>     grLang = (langString[0] << 24) + (langString[1] << 16) +
>              (langString[2] << 8) + langString[3];
> }
> gr_feature_val *grFeatures = gr_face_featureval_for_lang(mGrFace, grLang);

Above is the code in gfxGraphiteShaper::InitTextRun, the language
portion is stripped out but what remains might be BCP47 language
subtag or not.  For example 'mya' (from ISO 639-3) will be passed
through to Graphite, as will 'my' (from ISO 639-1).  Only 'my' should
be recognized, 'mya' should be treated as an unknown language.  

I actually think at this point in the code we should be validating
against the IANA language subtags and treating language tags that
don't match as mapping to the default language.  Without this we won't
have consistent behavior across font technologies (i.e. if the font
also supports both OpenType and Graphite layout, the same languages
should be recognized whether a user agent supports Graphite or not). I
think we should be extremely cautious about supporting anything that
could be used as a private use tag, which is what we effectively
enable if we don't validate these to some degree.

If I understand this correctly, effectively we shouldn't support the
three-letter tags in 639-3 that correspond to two-letter tags in 639-1.
(In reply to John Daggett (:jtd) from comment #92)

> I actually think at this point in the code we should be validating
> against the IANA language subtags ...

IMO, any such validation should be done up front, when the lang attribute is parsed, rather than as part of text shaping.

I'd suggest filing a separate bug about this, and making it block the BCP47 tracking bug (356038) - we still have quite a bit to do in relation to BCP47, I believe. But I don't think it needs to block the initial implementation of graphite support.
(In reply to Jonathan Kew (:jfkthame) from comment #93)
> (In reply to John Daggett (:jtd) from comment #92)
> 
> > I actually think at this point in the code we should be validating
> > against the IANA language subtags ...
> 
> IMO, any such validation should be done up front, when the lang attribute is
> parsed, rather than as part of text shaping.

No, because per the wording in the HTML5 spec, lang="blah" can still
be used when matching selectors like :lang("blah"), even though both
are treated as "unknown languages".  

http://www.w3.org/TR/html5/elements.html#the-lang-and-xml:lang-attributes

The wording there regarding treatment of unknown language tags is somewhat
contradictory:

  If the resulting value is not a recognized language tag, then
  it must be treated as an unknown language having the given
  language tag, distinct from all other languages. For the
  purposes of round-tripping or communicating with other services
  that expect language tags, user agents should pass unknown
  language tags through unmodified.

I don't think an internal API qualifies here as a "service".  I think
we should figure out the specified format of language tags in Graphite
and match it.  The same thing is done  for languages in HarfBuzz,
effectively the language subtag matches a BCP47 language tag or
HB_OT_TAG_DEFAULT_LANGUAGE is used.

http://mxr.mozilla.org/mozilla-central/source/gfx/harfbuzz/src/hb-ot-tag.c#176

We should be consistent across API's and do the same thing as is done
in HarfBuzz and not pass through arbitrary language subtags.
Logged bug against HTML5 regarding the ambiguous wording:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=14709
(In reply to John Daggett (:jtd) from comment #94)

> The wording there regarding treatment of unknown language tags is somewhat
> contradictory:
> 
>   If the resulting value is not a recognized language tag, then
>   it must be treated as an unknown language having the given
>   language tag, distinct from all other languages. For the
>   purposes of round-tripping or communicating with other services
>   that expect language tags, user agents should pass unknown
>   language tags through unmodified.
> 
> I don't think an internal API qualifies here as a "service".

It's not an internal API that is at issue, it's the "communication" between the browser and the Graphite font, which could reasonably be regarded as "another service that expects language tags".

>  I think
> we should figure out the specified format of language tags in Graphite
> and match it.  The same thing is done  for languages in HarfBuzz,
> effectively the language subtag matches a BCP47 language tag or
> HB_OT_TAG_DEFAULT_LANGUAGE is used.
> 
> http://mxr.mozilla.org/mozilla-central/source/gfx/harfbuzz/src/hb-ot-tag.
> c#176
> 
> We should be consistent across API's and do the same thing as is done
> in HarfBuzz and not pass through arbitrary language subtags.

The OpenType and Graphite cases are not comparable. In the case of OpenType, it is clear that the "language system" tags are quite distinct from ISO language codes, and that the tag sets do not directly correspond. Passing tags from HTML lang attributes directly to OpenType langSys APIs would be a clear "type mismatch". That's not the case with graphite.

Hmm, on re-reading the graphite doc, it does specifically mention ISO-639-3, which implies an expectation that three-letter codes should always be used. (Martin, please confirm this is the intent.)

In that case, I suppose we should map two-letter primary subtags from lang to corresponding ISO-639-3 codes before passing them to graphite. Sigh.

However, I do _not_ think we should attempt to "validate" anything here - if the lang code is a 3-letter code already, or if it's something else that we don't recognize, we should simply pass it through to the font (as per the HTML5 spec's paragraph on unrecognized tags).
(In reply to Jonathan Kew (:jfkthame) from comment #96)
> It's not an internal API that is at issue, it's the "communication"
> between the browser and the Graphite font, which could reasonably be
> regarded as "another service that expects language tags".

That's the problem with this wording, it allows the interpretation that
you're advocating, it effectively allows *any* language scheme to be
used, not just BCP47.  Treating internal code interfaces as "another
service" is a recipe for poor interoperability across user agents and
font types.

> The OpenType and Graphite cases are not comparable. In the case of
> OpenType, it is clear that the "language system" tags are quite
> distinct from ISO language codes, and that the tag sets do not
> directly correspond. Passing tags from HTML lang attributes directly
> to OpenType langSys APIs would be a clear "type mismatch". That's not
> the case with graphite.

Just to be clear, I'm not talking about the language override property
but the lang attribute of elements that is passed down in the
gfxFontStyle struct.  The treatment of the lang tag should be
consistent, no matter what the underlying font technology used.  If
Graphite is using ISO-639-3 tags instead of BCP47 tags, even though
there's considerable overlap that's a type mismatch by definition.  Just
as we don't pass down unknown language codes to an OpenType font, we
shouldn't be passing down unknown language codes to Graphite fonts.  If
the concern is that we won't allow handling of not-yet-defined language
codes, I think the language override provides room for more flexibility.
But we should not conflate different id types in our handling of lang
tags.

> In that case, I suppose we should map two-letter primary subtags from
> lang to corresponding ISO-639-3 codes before passing them to graphite.
> Sigh.

Actually, we should map the list of language subtags listed at the URL
below to the appropriate ISO 639-3 codes, assuming that's what Graphite
uses.  This list includes both two-letter and three-letter codes:

http://www.iana.org/assignments/language-subtag-registry

Anything not in this list should use the default language unless the language override property is specified.

> However, I do _not_ think we should attempt to "validate" anything
> here - if the lang code is a 3-letter code already, or if it's
> something else that we don't recognize, we should simply pass it
> through to the font (as per the HTML5 spec's paragraph on unrecognized
> tags).

Again, I really don't think we should treat internal API's as "other
services". Doing so exposes users to different behavior across font
types, something we should actively seek to eliminate.
Wow! Quite the conversation. The ISO 639-3 specification for language tags in the GDL spec is out of date. It was written long before RFC4646 even let alone RFC5646. You will also notice that in the very next example, both 2 letter and 3 letter codes are supported by a font. As such, GDL places no limitation beyond the four letters on what language tags a font developer may decide to support. I will get the documentation changed to say: BCP47 language tags and that people would be advised to also support iso 639-3 3 letter tags. There are even cases where someone may want to hack via passing a 4 letter tag where a writing system variation is required, given the 4 letter limitation and BCP47's language variant mechanism.

In conclusion, I suggest that what is already being done is the right way to address this.
(In reply to martin_hosken from comment #98)
> As such, GDL places no limitation beyond the four letters on what
> language tags a font developer may decide to support. I will get the
> documentation changed to say: BCP47 language tags and that people
> would be advised to also support iso 639-3 3 letter tags. There are
> even cases where someone may want to hack via passing a 4 letter tag
> where a writing system variation is required, given the 4 letter
> limitation and BCP47's language variant mechanism.
> 
> In conclusion, I suggest that what is already being done is the right
> way to address this.

Martin, the problem with this approach is that language tags are defined
to be in BCP47 format and if a *different* format is used then some
translation is needed.

What I hear you saying is that for these tags in Graphite, anything goes
as long as the tag is four characters or less.  That's unfortunate
because it means there's no clear way to map a BCP47 language tag onto a
Graphite language tag.

The problem with the "just pass it through" approach is that it makes
language tag use across user agents inconsistent.  If a given user agent
supports Graphite and a Graphite font uses ISO-639-3 language codes,
then lang="fra" will match French language behavior in the font.  But
for user agents that don't support Graphite but do support OpenType, the
mapping won't work (i.e. there's no mapping for 'fra' in a BCP47 ==>
OpenType language code table) and language-specific features will not be
enabled.

While mechanisms like this are fine for document formats with
tightly-coupled interchange options, we shouldn't be introducing
inconsistencies like this to the web.
No, I didn't say "anything goes". I said, we will change to require BCP 47, but in keeping with safe programming practices, font developers are encouraged to be "accepting of all" and so are recommended to handle ISO 639-3 3 letter codes should they arise. So Firefox should generate a BCP47 4 character tag (i.e. just the initial language component of the name), thus French should be passed as "fr". But a font may also decide to support "fra" as a way of indicating French, for support of other applications, but Firefox should not use that when passing the code.

In the case of a font specific override, which may need to be used where a language-variant pair is trying to be mapping from a 3-5 char language tag to a single 4 character tag, then there may be cases where a 4 letter code is used. This is not specified in any standard yet, which is somewhat unfortunate, but entirely understandable. And it is for the font developer to specify the mapping they are using in their font.

To fully unpack my meaning of saying that GDL supports anything, there is no limitations placed by the graphite compiler on what someone specifies as a language tag within GDL. The interpretation is assumed to be BCP 47, but where tags are used that are not in BCP 47 (that includes 4 letter tags and 3 letter tags in ISO 639-3 for which there is a corresponding 2 letter tag in BCP47), the meaning is undefined, and if used is assumed to be in a private use specification between the document provider and the font developer. I would also recommend that here a 3 letter tag outside BCP47 is used it be interpretted as being from ISO 639-3.
(In reply to martin_hosken from comment #100)
> No, I didn't say "anything goes". I said, we will change to require
> BCP 47, but in keeping with safe programming practices, font
> developers are encouraged to be "accepting of all" and so are
> recommended to handle ISO 639-3 3 letter codes should they arise. So
> Firefox should generate a BCP47 4 character tag (i.e. just the initial
> language component of the name), thus French should be passed as "fr".
> But a font may also decide to support "fra" as a way of indicating
> French, for support of other applications, but Firefox should not use
> that when passing the code.

So you're saying user agents should accept either BCP47 -or- ISO 639-3
codes and map them to the appropriate BCP47 form?  The problem with this
is that it makes two different language codes synonymous (e.g.
lang="fra" and lang="fr" would both map to French).  It's also *not* the
way language tags are defined in HTML5:

  The lang attribute (in no namespace) specifies the primary
  language for the element's contents and for any of the
  element's attributes that contain text. Its value must be a
  valid BCP 47 language tag, or the empty string.

  http://dev.w3.org/html5/spec/Overview.html#the-lang-and-xml:lang-attributes

So ISO 639-3 codes that are not BCP47 codes (i.e. the codes for which a
two-letter version exists such as 'mya') are clearly not valid language
subtags for the lang attribute in HTML. The question is whether invalid
tags should be passed down to API's or not. I don't think a user agent
should ever be translating ISO 636-3 codes into BCP47 ones, that's
clearly not what is defined in the spec.  In the OpenType case, we
validate by looking up the OpenType language in a BCP47 ==> OpenType
language tag table.  Entries not found are translated to the OpenType
default language tag.  I'm arguing that we should do something similar
in the Graphite case.

I don't feel as strongly about the value of the font-language-override
property, since it's designed to be an escape hatch for passing a
font-specific language tag down to the font.  But for the lang attribute
I think we should be enforcing validity and not passing invalid tags
down to lower-level API's.
(In reply to John Daggett (:jtd) from comment #101)
> (In reply to martin_hosken from comment #100)
> > No, I didn't say "anything goes". I said, we will change to require
> > BCP 47, but in keeping with safe programming practices, font
> > developers are encouraged to be "accepting of all" and so are
> > recommended to handle ISO 639-3 3 letter codes should they arise. So
> > Firefox should generate a BCP47 4 character tag (i.e. just the initial
> > language component of the name), thus French should be passed as "fr".
> > But a font may also decide to support "fra" as a way of indicating
> > French, for support of other applications, but Firefox should not use
> > that when passing the code.
> 
> So you're saying user agents should accept either BCP47 -or- ISO 639-3
> codes and map them to the appropriate BCP47 form?  The problem with this
> is that it makes two different language codes synonymous (e.g.
> lang="fra" and lang="fr" would both map to French).  It's also *not* the
> way language tags are defined in HTML5:

I think he's saying Firefox shouldn't pass "fra" down to the font.

In any case, I agree with you that we shouldn't be passing non-BCP47 strings from the HTML "lang" down into Graphite.
Correct. I am saying that FF should not pass "fra" but should pass "fr".

At a semantic interpretation level of the two strings "fr" and "fra", they both represent French, all be it in different standards. FF uses BCP 47 so must pass "fr". But if an application (not FF) is ISO639-3 only, it will pass "fra", and font developers may well want them both to mean the same thing, hence supporting them both. "fra" will always be an invalid code in BCP 47 (since "fr" exists and "fra" is the same thing in ISO639-3, so BCP47 will enver have a different interpretation of "fra") so it should never arise, but it is valid in ISO639-3. The font should never receive a request for "fra" from FF.
Ok, good.  So I think this translates in the code to:

1. Look up the lang subtag in the set of BCP47 lang subtags:

http://www.iana.org/assignments/language-subtag-registry

(Not sure where this should live)

2. If the lang subtag is in the set of BCP47 lang subtags, pass it through.  Otherwise pass in a null language code.
OK, here's a version that validates the lang tag against the IANA registry before passing it on to graphite.

(Personally, I'm not convinced that passing unknown lang codes from HTML through unchanged would be incorrect or harmful, but if this is what people want, fine.)

For now, the list of valid codes is built in to gfxGraphiteShaper, and loaded into a hashset on first use. Once bug 356038 (BCP47 support) is finished, we should be able to replace this with a validation function which checks whether the tag is known as a BCP47 language in the properties files there. So the presence of gfxLanguageTagList.cpp here is an interim fix only.
Attachment #570203 - Attachment is obsolete: true
Attachment #572462 - Flags: review?(jdaggett)
Attachment #570203 - Flags: review?(jdaggett)
Comment on attachment 572462 [details] [diff] [review]
part 4 - implement gfxGraphiteShaper - with lang tag validation

Sorry if I'm being thick but I still don't quite see the need for a
two-pass process of grouping all glyphs into clusters in one pass in
gfxGraphiteShaper::SetGlyphsFromSegment and then adding in clusters one
by one into the text run.  Seems like the code should be able to take a
start slot and figure out the last slot in the current cluster, then add
that cluster to the text run without the need set up the four local
arrays the code now uses.

Otherwise, patch 4 looks fine.

How is kerning applied?  And how can it be disabled?
(In reply to John Daggett (:jtd) from comment #106)
> Comment on attachment 572462 [details] [diff] [review] [diff] [details] [review]
> part 4 - implement gfxGraphiteShaper - with lang tag validation
> 
> Sorry if I'm being thick but I still don't quite see the need for a
> two-pass process of grouping all glyphs into clusters in one pass in
> gfxGraphiteShaper::SetGlyphsFromSegment and then adding in clusters one
> by one into the text run.  Seems like the code should be able to take a
> start slot and figure out the last slot in the current cluster, then add
> that cluster to the text run without the need set up the four local
> arrays the code now uses.

It's hard to do that in the presence of arbitrary glyph expansion (one character maps to many glyph slots) and rearrangement (the slots associated with a source character may not be contiguous). Essentially, we have to walk the complete list of slots in order to know whether we've found all the glyphs associated with any given character position - and if we didn't gather the details for _every_ cluster during that process, we'd have to re-scan the slot list for each one.

> How is kerning applied?  And how can it be disabled?

There's nothing special about kerning. It would be implemented by GDL rules that adjust glyph positions, just like any other aspect of positioning. It would be entirely up to the font developer to choose which aspects of positioning are exposed as user-selectable features; obviously, if some of the positioning behavior corresponds to the concept of "kerning", it'd make sense to expose this as a 'kern' feature.
(In reply to Jonathan Kew (:jfkthame) from comment #107)

> > How is kerning applied?  And how can it be disabled?
> 
> There's nothing special about kerning. It would be implemented by GDL rules
> that adjust glyph positions, just like any other aspect of positioning. It
> would be entirely up to the font developer to choose which aspects of
> positioning are exposed as user-selectable features; obviously, if some of
> the positioning behavior corresponds to the concept of "kerning", it'd make
> sense to expose this as a 'kern' feature.

So with a Graphite font, 'kern' table data is ignored?
(In reply to John Daggett (:jtd) from comment #108)
> So with a Graphite font, 'kern' table data is ignored?

Yes. The Graphite font developer is expected to fully control positioning via GDL, not a mixture of GDL plus 'kern' tables. (Right, Martin?)
Yes that is correct. There is nothing for the application to do in this area.
Is Graphite TrueType only?  I was trying to add Graphite rules to a Postscript CFF font today and the compiler upgaffawed on me.  After conversion to .ttf everything seemed to be fine.  Is this a restriction of Graphite itself or just the compiler?
Graphite is currently TrueType only. There are no plans to extend it to support CFF. Most of the work would be in the compiler, but the engine would need to add CFF support too (if only to get hold of bbox metrics for a glyph). I'd be very open to contributions :)
These use a set of custom-built graphite-enabled fonts that also include OT substitution features such that these tests will pass trivially when Graphite is not enabled.  See the README file in layout/reftests/fonts/graphite for details.
Attachment #574420 - Flags: review?(jfkthame)
Updated Graphite code drop to pick up a few additional bugfixes from upstream.
Attachment #566479 - Attachment is obsolete: true
Attachment #579274 - Flags: review?(jdaggett)
Attachment #566479 - Flags: review?(jdaggett)
Un-bitrotted. An earlier version was already r+'d by jdaggett, but also flagging Ted for review of the build-system patch.
Attachment #568610 - Attachment is obsolete: true
Attachment #579276 - Flags: review?(ted.mielczarek)
Just minor updates for current m-c trunk.
Attachment #572462 - Attachment is obsolete: true
Attachment #579278 - Flags: review?(jdaggett)
Attachment #572462 - Flags: review?(jdaggett)
Comment on attachment 574420 [details] [diff] [review]
patch, simple set of graphite reftests

I suppose it's fine to have these tests, and the fonts used (with source) provide a handy example. The fact that they will pass even if Graphite is not used (or not present at all) seems like a weakness, though - we could completely break gfxGraphiteShaper, and the tests will merrily stay green.

In general, I think the more interesting tests will be those where there's an observable difference between the behavior with Graphite and without. But for those we need to control the Graphite preference (as per bug 696585, and tests in bug 700022), or else mark them as known-fail until we flip the pref, and then update the manifest at that time.
Attachment #574420 - Flags: review?(jfkthame) → review+
Blocks: 708181
Comment on attachment 579278 [details] [diff] [review]
part 4 - implement gfxGraphiteShaper (un-bitrotted again)

r+ based on this being pref'ed off by default.

I'm a little concerned about the performance of the graphite code since it's constructing a lot of data structures for every text run.  But running some simple perf tests it's in the ballpark of our existing code so I thing we can deal with any perf issues that arise once the code has landed.
Attachment #579278 - Flags: review?(jdaggett) → review+
Possibly ignorant question: do we need to be wrapping calls to graphite with try..catch blocks, since graphite is using STL and is clearly passing through exceptions?  There don't appear to be any try..catch blocks below the interface level within graphite so I'm assuming this means that it's up to the app to deal with exceptions.
List of headers we wrap:
http://mxr.mozilla.org/mozilla-central/source/config/stl-headers

Usage in graphite2 code that's not in the list above:

<cstddef>
<cwchar>
<iterator>
<stdexcept>

Full list of headers (grepped info, might be conditionally included):

<algorithm>
<assert.h>
<cassert>
<climits>
<cstddef>
<cstdio>
<cstdlib>
<cstring>
<cwchar>
<iterator>
<stdarg.h>
<stddef.h>
<stdexcept>
<stdio.h>
<stdlib.h>
<string.h>
<utility>
<vector>
Please add 

<cstddef>
<cwchar>
<iterator>
<stdexcept>

to stl-headers.  Otherwise looks OK.
(In reply to John Daggett (:jtd) from comment #119)
> Possibly ignorant question: do we need to be wrapping calls to graphite with
> try..catch blocks, since graphite is using STL and is clearly passing
> through exceptions?  There don't appear to be any try..catch blocks below
> the interface level within graphite so I'm assuming this means that it's up
> to the app to deal with exceptions.

graphite, while it is written in C++, is very carefully written to not use anything that would come from stdlibc++. In fact, the first test in the regression tests makes just such a check that stdlibc++ has not been linked in by anything. This means that graphite does not throw exceptions. The only way graphite could throw is if the getTable function passed to gr_make_face, threw. Other ways to make graphite throw: link it to a library that throws in new. But Graphite itself doesn't throw or use anything that we know throws.

I've just gone through and cleared out a few redundant .h files references so stdexcept is no longer referenced, although the other 3 in your list of 'to be checked' headers, are still required. But if this is a question about what might throw, graphite won't if you won't :)
Comment on attachment 579274 [details] [diff] [review]
part 1 - import graphite2 code from http://hg.palaso.org/graphitedev (rev.851)

I've gone through the graphite2 src code and overall the code looks
fine.  I don't understand completely the inner workings of the VM but
the code is simple enough that we should be able to figure out
problems when they arise via fuzzing.

I think performance and memory usage will be a concern but I think we
can evaluate that more carefully once this lands.

Couple small comments/questions:

In List.h:

> // designed to have a limited subset of the std::vector api
...
> template <typename T> 
> class Vector

In Segment.h:

> #ifdef USE_GRLIST
> #include "GrList.h"
> #else
> #include <vector>
> #endif

The STL version of vector is never used, only the Vector dupe in
List.h.  Is this duplication really necessary? The inclusion of
<vector> should probably be trimmed out since it's never used.  In
Segment.h there are several references to std::vector that should
probably be Vector instead.

In opcodes.h:

> STARTOP(next_n)
>     use_params(1);
>     NOT_IMPLEMENTED;
>     //declare_params(1);
>     //const size_t num = uint8(*param);
> ENDOP

Just curious, why is this left active in the code?  Other
unimplemented instructions have #if 0 wrapped around them.
Attachment #579274 - Flags: review?(jdaggett) → review+
Attachment #579276 - Flags: review?(ted.mielczarek) → review?(khuey)
Comment on attachment 579276 [details] [diff] [review]
part 2 - include graphite in the build (refreshed)

Review of attachment 579276 [details] [diff] [review]:
-----------------------------------------------------------------

r=me provided that the sizeof size_t stuff is dropped as we discussed on IRC.

::: config/autoconf.mk.in
@@ +369,5 @@
>  IMPORT_LIB_SUFFIX = @IMPORT_LIB_SUFFIX@
>  LIBS_DESC_SUFFIX = @LIBS_DESC_SUFFIX@
>  USE_N32		= @USE_N32@
>  HAVE_64BIT_OS	= @HAVE_64BIT_OS@
> +SIZEOF_SIZE_T	= @SIZEOF_SIZE_T@

Make sure you drop this line along with the configure check.

::: configure.in
@@ +8258,5 @@
> +dnl SIL Graphite
> +dnl ========================================================
> +MOZ_ARG_DISABLE_BOOL(graphite,
> +[  --disable-graphite      Disable SIL Graphite],
> +    MOZ_GRAPHITE= )

I think you can drop the MOZ_ARG_DISABLE_BOOL here.  If we need to kill this code we can set | MOZ_GRAPHITE= | above, and if Fennec doesn't want to ship this they can do the same in their confvars.sh.

@@ +8263,5 @@
> +if test "$MOZ_GRAPHITE"; then
> +  if test -z "$SIZEOF_SIZE_T"; then
> +    AC_MSG_ERROR([SIL Graphite requires SIZEOF_SIZE_T to be defined, but it is unknown;
> +                  fix the configure script, or build with --disable-graphite.])
> +  fi

And the sizeof size_t can be dropped here too.

::: gfx/graphite2/src/Makefile.in
@@ +64,5 @@
> +
> +EXPORTS_NAMESPACES = graphite2
> +EXPORTS_graphite2 = $(_PUBLIC_HEADERS)
> +
> +LOCAL_INCLUDES  += -I$(srcdir)

Is this line really necessary?
Attachment #579276 - Flags: review?(khuey) → review+
(In reply to Chris Jones [:cjones] [:warhammer] from comment #121)
> Please add 
> 
> <cstddef>
> <cwchar>
> <iterator>
> <stdexcept>
> 
> to stl-headers.  Otherwise looks OK.

The latest graphite code (rev.852) has eliminated <stdexcept>, so I've dropped that from the list. Adding <iterator> and <cstddef> is fine; however, when I add <cwchar> to stl-headers, it breaks the build on Linux and Windows (but builds OK on Mac and Android).

Linux failure:

/usr/bin/ccache /tools/gcc-4.5-0moz3/bin/g++ -o nsUnicharInputStream.o -c -I../../dist/stl_wrappers -I../../dist/system_wrappers -include /builds/slave/try-lnx-dbg/build/config/gcc_hidden.h -DMOZILLA_INTERNAL_API -DOSTYPE=\"Linux2.6.18-8\" -DOSARCH=Linux -DEXCLUDE_SKIA_DEPENDENCIES  -DOS_LINUX=1 -DOS_POSIX=1  -D_IMPL_NS_COM -I/builds/slave/try-lnx-dbg/build/ipc/chromium/src -I/builds/slave/try-lnx-dbg/build/ipc/glue -I../../ipc/ipdl/_ipdlheaders  -I.. -I/builds/slave/try-lnx-dbg/build/xpcom/io -I. -I../../dist/include -I../../dist/include/nsprpub  -I/builds/slave/try-lnx-dbg/build/obj-firefox/dist/include/nspr -I/builds/slave/try-lnx-dbg/build/obj-firefox/dist/include/nss      -fPIC -fno-rtti -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-variadic-macros -Werror=return-type -Wno-long-long -fno-exceptions -fno-strict-aliasing -std=gnu++0x -pthread -ffunction-sections -fdata-sections -pipe -DDEBUG -D_DEBUG -DTRACING -g -Os -freorder-blocks -finline-limit=50 -fno-strict-aliasing -fno-omit-frame-pointer   -DMOZILLA_CLIENT -include ../../mozilla-config.h -MD -MF .deps/nsUnicharInputStream.pp /builds/slave/try-lnx-dbg/build/xpcom/io/nsUnicharInputStream.cpp
In file included from ../../dist/stl_wrappers/cstddef:67:0,
                 from /tools/gcc-4.5-0moz3/lib/gcc/i686-pc-linux-gnu/4.5.2/../../../../include/c++/4.5.2/new:41,
                 from ../../dist/system_wrappers/new:3,
                 from ../../dist/stl_wrappers/string:61,
                 from ../../../ipc/chromium/src/chrome/common/ipc_message_utils.h:8,
                 from ../../dist/include/IPC/IPCMessageUtils.h:42,
                 from ../../../xpcom/io/nsMultiplexInputStream.cpp:44:
../../dist/include/mozilla/mozalloc.h:227:39: error: expected type-specifier
../../dist/include/mozilla/mozalloc.h:227:39: error: expected ')'
../../dist/include/mozilla/mozalloc.h:227:39: error: expected initializer
../../dist/include/mozilla/mozalloc.h:233:39: error: 'nothrow_t' in namespace 'std' does not name a type
../../dist/include/mozilla/mozalloc.h:239:41: error: expected type-specifier
../../dist/include/mozilla/mozalloc.h:239:41: error: expected ')'
../../dist/include/mozilla/mozalloc.h:239:41: error: expected initializer
../../dist/include/mozilla/mozalloc.h:245:41: error: 'nothrow_t' in namespace 'std' does not name a type
../../dist/include/mozilla/mozalloc.h:257:39: error: 'nothrow_t' in namespace 'std' does not name a type
../../dist/include/mozilla/mozalloc.h:269:41: error: 'nothrow_t' in namespace 'std' does not name a type

Win32 failure:

d:/mozilla-build/python25/python2.5.exe -O e:/builds/moz2_slave/try-w32-dbg/build/build/cl.py cl -FonsDependentString.obj -c -D_HAS_EXCEPTIONS=0 -I../../../dist/stl_wrappers  -DMOZILLA_INTERNAL_API -DOSTYPE=\"WINNT5.2\" -DOSARCH=WINNT -D_IMPL_NS_COM  -I/e/builds/moz2_slave/try-w32-dbg/build/xpcom/string/src -I. -I../../../dist/include -I../../../dist/include/nsprpub  -Ie:/builds/moz2_slave/try-w32-dbg/build/obj-firefox/dist/include/nspr -Ie:/builds/moz2_slave/try-w32-dbg/build/obj-firefox/dist/include/nss        -TP -nologo -W3 -Gy -Fdgenerated.pdb -wd4800 -we4553  -DDEBUG -D_DEBUG -DTRACING -Zi -O1 -Oy- -MDd           -FI ../../../dist/include/mozilla-config.h -DMOZILLA_CLIENT /e/builds/moz2_slave/try-w32-dbg/build/xpcom/string/src/nsDependentString.cpp
nsDependentString.cpp
D:\msvs8\VC\INCLUDE\new(49) : error C3646: '_THROW0' : unknown override specifier
D:\msvs8\VC\INCLUDE\new(49) : error C2091: function returns function
D:\msvs8\VC\INCLUDE\new(53) : error C3646: '_THROW0' : unknown override specifier
D:\msvs8\VC\INCLUDE\new(53) : error C2091: function returns function
D:\msvs8\VC\INCLUDE\new(53) : error C2556: 'void (__cdecl *operator delete(void *))(void)' : overloaded function differs only by return type from 'void operator delete(void *)'
        predefined C++ types (compiler internal)(24) : see declaration of 'operator delete'
D:\msvs8\VC\INCLUDE\new(53) : error C2373: 'operator delete' : redefinition; different type modifiers
        predefined C++ types (compiler internal)(24) : see declaration of 'operator delete'
D:\msvs8\VC\INCLUDE\new(54) : error C3646: '_THROW1' : unknown override specifier
D:\msvs8\VC\INCLUDE\new(54) : error C2039: 'bad_alloc' : is not a member of 'std'
D:\msvs8\VC\INCLUDE\new(54) : error C2065: 'bad_alloc' : undeclared identifier
D:\msvs8\VC\INCLUDE\new(54) : error C2072: 'operator new' : initialization of a function
D:\msvs8\VC\INCLUDE\new(58) : error C3646: '_THROW0' : unknown override specifier
D:\msvs8\VC\INCLUDE\new(59) : error C2091: function returns function
D:\msvs8\VC\INCLUDE\new(59) : error C2824: return type for 'operator new' must be 'void *'
[...a bunch of similar error messages snipped...]
e:\builds\moz2_slave\try-w32-dbg\build\obj-firefox\dist\include\mozilla/mozalloc.h(234) : error C2556: 'void *operator new(size_t,const std::nothrow_t &)' : overloaded function differs only by return type from 'void *(__cdecl *operator new(size_t,const std::nothrow_t &))(void)'
        D:\msvs8\VC\INCLUDE\new(87) : see declaration of 'operator new'
e:\builds\moz2_slave\try-w32-dbg\build\obj-firefox\dist\include\mozilla/mozalloc.h(234) : error C2824: return type for 'operator new' must be 'void *'
e:\builds\moz2_slave\try-w32-dbg\build\obj-firefox\dist\include\mozilla/mozalloc.h(246) : error C2556: 'void *operator new[](size_t,const std::nothrow_t &)' : overloaded function differs only by return type from 'void *(__cdecl *operator new[](size_t,const std::nothrow_t &))(void)'
        D:\msvs8\VC\INCLUDE\new(90) : see declaration of 'operator new[]'
[...a bunch of similar error messages snipped...]

Advice, please...?
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #124)
> Comment on attachment 579276 [details] [diff] [review]
> part 2 - include graphite in the build (refreshed)

> ::: gfx/graphite2/src/Makefile.in
> @@ +64,5 @@
> > +
> > +EXPORTS_NAMESPACES = graphite2
> > +EXPORTS_graphite2 = $(_PUBLIC_HEADERS)
> > +
> > +LOCAL_INCLUDES  += -I$(srcdir)
> 
> Is this line really necessary?

Apparently not - it now builds without it. Must be residue from an earlier version. Removed.
(In reply to Jonathan Kew (:jfkthame) from comment #125)
> The latest graphite code (rev.852) has eliminated <stdexcept>, so I've
> dropped that from the list. Adding <iterator> and <cstddef> is fine;
> however, when I add <cwchar> to stl-headers, it breaks the build on Linux
> and Windows (but builds OK on Mac and Android).

Argh, looking at the wrong patches & logs! It's actually <cstddef> that causes the breakage, which makes more sense in view of the errors that are showing up.
That sounds vaguely familiar.  Removing cstddef from stl-headers should be fine.
Unfortunately Graphite has had to be preffed off by default for now, due to bug 709193:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0a1f7f98c27b

Sucks, but we don't really have any other choice here :-(
Depends on: 709193
https://hg.mozilla.org/mozilla-central/rev/0973ce3ecd5d (part 1)
https://hg.mozilla.org/mozilla-central/rev/c25e81047a9e (part 2)
https://hg.mozilla.org/mozilla-central/rev/f51fc55403ba (part 3)
https://hg.mozilla.org/mozilla-central/rev/afb24aa8ed2e (part 4)
https://hg.mozilla.org/mozilla-central/rev/4cef2af9f1da (reftests)

https://hg.mozilla.org/mozilla-central/rev/0a1f7f98c27b (disabling Graphite by default for now)

Leaving open until graphite is re-enabled, pending bug 709193 / product-team's assessment come Monday.

Sorry your hard work ended up being switched off :-(
Flags: in-testsuite+
Unrelated to the PGO issues, but thought I'd mention it before I forgot...
Building from inbound (which has this preffed back on) locally (using MSVC2010), results in quite a few warnings whilst linking:
http://pastebin.mozilla.org/1405100
During the investigations on inbound (which has now merged to mozilla-central), this was preffed back on (https://hg.mozilla.org/mozilla-central/rev/f1abb2b731e0), but had to be turned back off again (https://hg.mozilla.org/mozilla-central/rev/472dad831258) to allow us sufficient headroom for at least the rest of this cycle. Please speak to khuey / see bug 709193 if you have any questions about this - thanks :-)
Fwiw, I think we *probably* could get away with turning the back on, but if we want this on for 11 I think we should turn it on at the very end of the cycle (if we still can, at that point), because it does add a fair amount of code.
Pushed https://hg.mozilla.org/integration/mozilla-inbound/rev/1c542f9a2e10 to re-enable this, as we believe that enough has been trimmed from libxul for this to be safe.
https://hg.mozilla.org/mozilla-central/rev/1c542f9a2e10
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Blocks: 712120
<cough> Didn't anyone notice that graphite is LGPLed?

Bug 713006 opened.

Gerv
The license headers on the graphite code indicate an LGPL/GPL/MPL tri-license, both in our source tree and in the source downloaded from http://sourceforge.net/projects/silgraphite/files/graphite2/graphite2-1.0.3.tgz/download .
I think this bug is responsible for breaking compilation on case-folding file systems such as NTFS (like I use from within a Linux VM). A fix (renaming) like in bug 686255 is required. I'll leave it to you guys whether it needs an upstream fix + backport, new bug or whatever.

Files including Endian.h:
src/Face.cpp
src/FeatureMap.cpp
src/GlyphFace.cpp
src/GlyphFaceCache.cpp
src/NameTable.cpp
src/Pass.cpp
src/Silf.cpp
src/TtfUtil.cpp
I've asked Martin to consider changing this in the upstream repo, as I think that would be the best way to address the issue. If that happens, we can just take a fresh code drop and that should resolve the problem.
No longer depends on: 709193
This sounds like a bug in your vm setup. You are trying to make a caseless filesystem behave like a cased filesystem and the linux gcc .h tree presumes a cased filesystem. Is this really a requirement that mozilla be buildable in such a way? Where do we draw the line? Font.h? Main.h? These are internal source header files.
If this is the only library that breaks Firefox building on such a setup, it seems worth fixing.
Checked in a change to move the .h files from src into src/inc. Builders should not include src/inc in their -I paths. This should deal with clashing header files in obscure cases.
Thanks, Martin - that looks like it should resolve the reported problem, and also prevent potential similar issues with other header names.
(In reply to martin_hosken from comment #143)
> Checked in a change to move the .h files from src into src/inc. Builders
> should not include src/inc in their -I paths. This should deal with clashing
> header files in obscure cases.

FTR, this made it to m-c as part of bug 721068, so it's fixed for mozilla12.
Whiteboard: [secr:christoph fuzzing] → [sec-assigned:christoph fuzzing]
Flags: sec-review?(cdiehl)
Flags: sec-review?(cdiehl) → sec-review+
Whiteboard: [sec-assigned:christoph fuzzing] → [completed secreview]
(In reply to Christoph Diehl [:cdiehl] from comment #150)
> [completed secreview]
Thnaks for completing the review. Do you feel we need additional security tests added to the tree for this? That is the one remaining question to turning this feature on by default in the prefs.
We haven't found any serious issues lately. We will continue to fuzz Graphite2 on a regular basis and report bugs to the developers.
Depends on: 831292
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: