stylo: Bug 1358634 added a hazard to the build

RESOLVED FIXED

Status

()

Core
CSS Parsing and Computation
P1
normal
RESOLVED FIXED
3 months ago
2 months ago

People

(Reporter: KWierso, Assigned: manishearth)

Tracking

({leave-open})

Firefox Tracking Flags

(Not tracked)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

3 months ago
https://hg.mozilla.org/integration/autoland/rev/de6c23251b7585559dda4c12eca980bb58be3e15 added a hazard to the build.
Flags: needinfo?(manishearth)

Comment 1

3 months ago
Pushed by kwierso@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/2ce843dc8642
Bump number of allowed hazards until it can be sorted out a=bustage
(Reporter)

Updated

3 months ago
Keywords: leave-open
Assignee: nobody → manishearth
Priority: -- → P1
Summary: Bug 1358634 add a hazard to the build → stylo: Bug 1358634 add a hazard to the build
Summary: stylo: Bug 1358634 add a hazard to the build → stylo: Bug 1358634 added a hazard to the build
(Assignee)

Comment 2

3 months ago
Have a fix, however it introduces more locking. The locking will happen whenever font-family is explicitly set, which isn't great.

I can improve this a bit by making it happen only when the generic is none (or fantasy/cursive on !usedocumentfonts) but that is still some guaranteed locking when the family is set in most cases.

The best solution here is to cache the default variable font (cached on ThreadLocalStyleContext, calculated each time language changes) and reuse that. Not sure if it's overkill. There will still be some locking here.
Flags: needinfo?(manishearth)
Comment hidden (mozreview-request)
For the moment, let's get this fix in.  But:

(In reply to Manish Goregaokar [:manishearth] from comment #2)
> The best solution here is to cache the default variable font (cached on
> ThreadLocalStyleContext, calculated each time language changes) and reuse
> that. Not sure if it's overkill. There will still be some locking here.

Doing something like this sounds like a good idea.  Most documents won't have many unique lang=""-specified languages in them, although Wikipedia pages do, in the languages sidebar!

Since the work of updating the ProeprtyDeclarationBlocks for presentation attributes (including lang="") is done on the main thread before traversal, could we eagerly look up the default variable font for every lang="" we map into -x-lang?

Comment 5

3 months ago
mozreview-review
Comment on attachment 8865059 [details]
Bug 1362599 - Use lock in Gecko_nsStyleFont_FixupNoneGeneric ;

https://reviewboard.mozilla.org/r/136730/#review139788
Attachment #8865059 - Flags: review?(cam) → review+
Comment hidden (mozreview-request)
(Assignee)

Comment 7

3 months ago
Good idea, I made us call GetDefaultFont each time we handle an -x-lang property in the mapped attrs.

However, the cache is cleared every time the charset changes or the prefs change. I guess we should then recalculate all the mapped attrs? Seems like overkill but I guess it's a rare occurrence.

(Alternatively we could just look for those that affect lang but it's trickier to handle that way)
(Assignee)

Comment 8

3 months ago
Yep, I can get it to panic by changing the font prefs :)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 11

3 months ago
mozreview-review
Comment on attachment 8865176 [details]
Bug 1362599 - Remember which languages were used and force-cache when reset;

https://reviewboard.mozilla.org/r/136838/#review139854

Looks good, unless I'm missing something with the math stuff.

::: layout/base/nsPresContext.h:376
(Diff revision 1)
>      nsIAtom* lang = aLanguage ? aLanguage : mLanguage.get();
>      return StaticPresData::Get()->GetDefaultFontHelper(aFontID, lang,
>                                                         GetFontPrefsForLang(lang));
>    }
>  
> +  void ForceCacheLang(nsIAtom *aLanguage);

Nit: "*" next to type.

::: layout/base/nsPresContext.h:1406
(Diff revision 1)
>    // typically be a very short (usually of length 1) linked list. There are 31
>    // language groups, so in the worst case scenario we'll need to traverse 31
>    // link items.
>    LangGroupFontPrefs    mLangGroupFontPrefs;
>  
> +  bool mHasUncachedLangGroups;

Personal preference perhaps, but a name like "mCachedLangGroupsDirty" sounds better to me.

::: layout/base/nsPresContext.cpp:1981
(Diff revision 1)
>      MediaFeatureValuesChanged(nsRestyleHint(0), nsChangeHint(0));
>    }
>  }
>  
>  void
> +nsPresContext::ForceCacheLang(nsIAtom *aLanguage)

Nit: "*" next to type.

::: layout/base/nsPresContext.cpp:1986
(Diff revision 1)
> +    nsCOMPtr<nsIAtom>* slot = mLanguagesUsed.AppendElement();
> +    *slot = aLanguage;

mLanguagesUsed.AppendElement(aLanguage);

::: layout/base/nsPresContext.cpp:1997
(Diff revision 1)
> +    // todo: this is wrong
> +    nsCOMPtr<nsIAtom> maaath(NS_Atomize("x-math"));
> +    GetDefaultFont(kPresContext_DefaultVariableFont_ID, maaath.get());

What's wrong with it?

I guess you're doing this here because  we set |-x-lang: x-math| from synthesize_presentational_hints_for_legacy_attributes rather than Gecko's presentation attribute stuff?  (Also, why not just nsGkAtoms::x_math?)
Attachment #8865176 - Flags: review?(cam) → review+
(Assignee)

Comment 12

3 months ago
So the current solution is to store all found langs in a vector, and re-force-cache them every time mLangGroupFontPrefs is reset. This is probably better than invalidating all stored preshint ServoDeclarationBlocks each time, however mLangGroupFontPrefs.Reset() should be rare (only on pref changes), so perhaps it's better to avoid the extra search and allocation. *shrug*.

I have yet to figure out a good solution for the x-math and xml:lang things. Right now I just force-cache x-math unconditionally (otherwise all the mathml tests crash), but I'd like to avoid that. xml:lang will still be able to crash this. IIRC synthesize_presentational_hints is off main thread, so this might be hard to deal with, ideas welcome.


------



There's one tricky bit which I need font advice on from Jonathan: This patch causes editor/reftests/338427-1.html (and only that test) to crash. I believe this to be a bug in the existing code, because the end result is that GetLangGroup returns a different result depending on when you call it. The reason that test crashes is because we try to access an uncached lang group off main thread, which this patch makes crash.

Basically, that test has a `lang="testing-XX"`, and that maps to a different lang group depending on when it's called, which is probably wrong.

The *first time* "testing-XX" is passed to GetLangGroup[1], it goes down to GetLanguageGroup[2]. It's not cached, so GetUncachedLanguageGroup() is called. That function first attempts to look for it in a table. It doesn't find it. So then it chops off the part after the hyphen, and searches for "testing". At this point, having not found it, it sets the language group name to "x-unicode", and exits the loop. `res` is an error code at this stage. In this pass, we will return the language group x-unicode, along with an error.

GetLanguageGroup at this stage will cache the language group in mLangToGroup, but *also* return an error. GetLangGroup, seeing the error, concludes that the language group is x-western and returns that.

In the second pass, GetLanguageGroup notes the existing cached value of x-unicode for the language testing-XX, and returns that with no error. GetLangGroup sees the "successful" result and happily returns x-unicode.

Something's wrong here, GetLangGroup should be an observably constant function. But I'm not sure what the right behavior is. I suppose returning x-unicode with a success state is one solution, but then that error return value is pretty useless (I'm not sure if it's supposed to have a purpose). We could also note the error code and not cache it in mLangToGroup but then what's the point of returning x-unicode if it's never going to be looked at in the first place? (Also, it seems inefficient to do a search each time)


Basically, we have two fallback error behaviors -- GetUncachedLanguageGroup considers the error fallback to be x-unicode, and GetLangGroup has a fallback of x-western. Due to the way the caching works, you may get one or the other fallback depending on when the function is called. I'm not sure which is the correct fallback here. Jonathan, do you have any idea what we should do here?






 [1]: https://dxr.mozilla.org/mozilla-central/source/layout/base/StaticPresData.cpp#239
 [2]: https://dxr.mozilla.org/mozilla-central/source/intl/locale/nsLanguageAtomService.cpp#65

Comment 13

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/2ce843dc8642
(Assignee)

Updated

3 months ago
Depends on: 1363172
Comment hidden (mozreview-request)
(Assignee)

Updated

3 months ago
Attachment #8865059 - Attachment is obsolete: true
Comment hidden (mozreview-request)
(Assignee)

Comment 16

3 months ago
So I discussed xml:lang a bit with bz and we didn't really want to insert that check on every bindtotree.

So the idea was that we could have the code in this bug lock if off main thread and the font is not cached.

However, that's still unsafe, since while locked there's no guarantee a different thread won't read this. What we really need here is a read-write lock. Filed bug 1363172 to add one.
(Assignee)

Comment 17

3 months ago
Also, I realized that in that case there's no point of having all this additional machinery. An RW lock possibly with some basic preemptive caching is enough.

Until bug 1363172 lands I'm just going to leave the hazard there.

Comment 18

3 months ago
mozreview-review
Comment on attachment 8865610 [details]
Bug 1362599 - Guard against threadsafety issues in lang group stuff ;

https://reviewboard.mozilla.org/r/137234/#review140396

::: commit-message-621f4:1
(Diff revision 1)
> +Bug 1362599 - Use lock in Gecko_nsStyleFont_FixupNoneGeneric ; r?heycam

The commit message doesn't really match what the patch is doing.
Attachment #8865610 - Flags: review?(cam) → review+
:manishearth - this is the one that was generating the bogus "Error: Dereference write __temp_8", isn't it? I'm having trouble with that -- when I push this to try, I get the __temp_8 hazard. But when I run the analysis locally (using the same revision, so it should be the same code!) I get a different one:

[70.85s] #181 Analyzing Gecko_nsStyleFont_FixupNoneGeneric ...
Error: Field write mozilla::FontFamilyList.mDefaultFontType
Location: _ZN7mozilla14FontFamilyList18SetDefaultFontTypeENS_14FontFamilyTypeE$void mozilla::FontFamilyList::SetDefaultFontType(uint32) @ obj-analyzed/dist/include/gfxFontFamilyList.h#335 ### SafeArguments: aFont
Stack Trace:
_ZN10nsRuleNode16FixupNoneGenericEP6nsFontPK13nsPresContexthPKS0_$void nsRuleNode::FixupNoneGeneric(nsFont*, nsPresContext*, uint8, nsFont*) @ layout/style/nsRuleNode.cpp#442
Gecko_nsStyleFont_FixupNoneGeneric @ layout/style/ServoBindings.cpp#1791

Is the above expected at this point?

Btw, there is already code that should have handled the DebugOnly<T> that is triggering the __temp_8 hazard, which is why I was trying to run it locally to see what's going wrong.

I am very disturbed by a local run giving different results. I am using identical inputs as far as I can tell.
Flags: needinfo?(manishearth)
(Assignee)

Comment 20

2 months ago
Yeah, that write is expected -- that's why we need the rwlock.
Flags: needinfo?(manishearth)
(Assignee)

Comment 21

2 months ago
Ugh, looks like I forgot to ni? 

Jonathan, could you look at comment 12?
Flags: needinfo?(jfkthame)
Hmm... yes, I agree it's wrong for GetLangGroup to be returning two different fallback answers, as described there.

As for which is "right", I doubt there's any particularly strong argument to be made one way or the other. (The whole langGroup concept and its connection to font selection is badly outdated, really, and I'd love to replace it, but that's another matter. For now, this is what we have.) I think my vote would be for x-unicode as fallback, as it essentially means "we don't know anything more specific to help us choose".

Given that fixing this will be a change to existing (though potentially broken) Gecko behavior, I would suggest fixing it separately in its own bug; that will make tracking easier in the event that it leads to any surprises. (I played around a little with trying to demonstrate the error in Firefox, by getting inconsistent behavior of font prefs, but didn't succeed in exposing it. So I'm not currently sure whether this is possible, or if the error is papered over by other aspects of how Gecko uses the service.)

I think a minimal fix for this would be to make nsLanguageAtomService::GetUncachedLanguageGroup always return NS_OK, even when it does its fallback to x-unicode. I've pushed a try job with this change at https://treeherder.mozilla.org/#/jobs?repo=try&revision=8cd551965f86664633d4074b96155baed690821d to see if anything complains. More generally, we should give the lang-atom service a general cleanup and deCOMtamination, as per bug 734008; it looks like all its nsresult error params could be eliminated, for example.
Flags: needinfo?(jfkthame)
(Assignee)

Comment 23

2 months ago
Okay, thanks.

For the scope of this bug I'll just force-cache the x-unicode atom, since we need this to land for bug 1366347. In the final version it will use an rwlock since it's currently panicky in the presence of xml:lang on stylo.

I'll work on fixing GetLangGroup (and making this non panicky).
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 26

2 months ago
Pushed by bholley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/57944331d429
Guard against threadsafety issues in lang group stuff. r=heycam
https://hg.mozilla.org/integration/autoland/rev/551b26ca92b3
Remember which languages were used and force-cache when reset; r=heycam
Duplicate of this bug: 1366468
https://hg.mozilla.org/mozilla-central/rev/57944331d429
https://hg.mozilla.org/mozilla-central/rev/551b26ca92b3
(Assignee)

Updated

2 months ago
Depends on: 1366886
(Assignee)

Updated

2 months ago
Status: NEW → RESOLVED
Last Resolved: 2 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.