Closed Bug 1309867 Opened 3 years ago Closed 3 years ago

Painting during JS interrupt can trigger layout-related JS via FontFaceSet API

Categories

(Core :: Layout, defect, P1)

51 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox50 --- unaffected
firefox51 + fixed
firefox52 + fixed

People

(Reporter: jdm, Assigned: ehsan)

References

Details

(4 keywords)

Crash Data

Attachments

(2 files, 1 obsolete file)

https://crash-stats.mozilla.com/report/index/74069f22-a970-4091-82bf-dfa012161013 shows a smoking gun. We interrupt JS, start painting, then as part of computing font metrics we end up calling FlushUserFontSet. This creates a Promise object via the FontFaceSet API, and the JS engine is displeased.
[Tracking Requested - why for this release]: Regression from 1279086
Duplicate of this bug: 1309608
Crash Signature: [@ mozilla::dom::AutoJSAPI::InitInternal ] → [@ mozilla::dom::AutoJSAPI::InitInternal ] [@ mozilla::dom::AutoJSAPI::Init]
Flags: needinfo?(wmccloskey)
It looks like the signature [@ JSObject2JSObjectMap::Find ] is another variation on this issue.

Examples:
bp-4dc1d354-cc8b-4ec4-ae13-0a8992161012
bp-44e2beee-1d82-4a10-b6e9-64f772161012
Crash Signature: [@ mozilla::dom::AutoJSAPI::InitInternal ] [@ mozilla::dom::AutoJSAPI::Init] → [@ mozilla::dom::AutoJSAPI::InitInternal ] [@ mozilla::dom::AutoJSAPI::Init] [@ JSObject2JSObjectMap::Find ]
[@ nsXPCWrappedJS::GetJSObject ] looks like another variation on this, though I only see 2 crashes.
bp-28fdd70c-f42b-48e2-a81b-fd41f2161012
bp-99292f11-c96f-4190-a5b6-de5542161012

(I'm not going to add it to the signature list, as it doesn't seem so common, and the signature is very generic.)
Also this lone instance of [@ nsWrapperCache::GetWrapper const ]:
bp-c677e39c-487c-469a-ab1d-e97882161013
[Tracking Requested - why for this release]: If you add together the various signatures, I think this is the top Nightly Windows crash on 10-12.
This should help avoid calling into the JS engine in the middle of
painting.
Attachment #8800805 - Flags: review?(dholbert)
My patch should eliminate the codepath that causes us to call into the JS engine when painting...
Flags: needinfo?(wmccloskey)
Thanks Ehsan! I spent most of today trying to reproduce this and I couldn't. My worry is that we'll still need to fix more stuff after this. For example, I think that FontFaceSet::UpdateRules ends up creating FontFace objects, which can create more promises. I wonder if there's someplace higher on the callstack where we could put off doing this work?
Comment on attachment 8800805 [details] [diff] [review]
Create FontFaceSet's ready promise lazily

Punting to jfkthame, as I'm on PTO for the next few days (and jfkthame is more likely to be familiar with FontFaceSet than I am).

(assuming jfkthame is OK reviewing this, do remember to adjust the "r=dholbert" in the patch's commit message when landing, too.)
Attachment #8800805 - Flags: review?(dholbert) → review?(jfkthame)
Comment on attachment 8800805 [details] [diff] [review]
Create FontFaceSet's ready promise lazily

Review of attachment 8800805 [details] [diff] [review]:
-----------------------------------------------------------------

I'd prefer to punt this to :heycam, as I believe he knows the FontFaceSet code, whereas I don't.

::: layout/style/FontFaceSet.cpp
@@ +384,5 @@
>    if (!mReady) {
> +    nsCOMPtr<nsIGlobalObject> global = GetParentObject();
> +    mReady = Promise::Create(global, aRv);
> +    if (!mReady) {
> +      return nullptr;

The old code would Throw(NS_ERROR_FAILURE) in this case (where promise creation has failed); do we no longer want that?
Attachment #8800805 - Flags: review?(jfkthame) → review?(cam)
(In reply to Bill McCloskey (:billm) from comment #9)
> Thanks Ehsan! I spent most of today trying to reproduce this and I couldn't.
> My worry is that we'll still need to fix more stuff after this. For example,
> I think that FontFaceSet::UpdateRules ends up creating FontFace objects,
> which can create more promises.

That's certainly possible.  Unfortunately the easiest way I can think of for sure is to land this and see what happens.  :(

> I wonder if there's someplace higher on the
> callstack where we could put off doing this work?

Well, because of the way FontFaceSet has been implemented, it cannot defer the creation of the promise after GetReady() is called.  If we have internal code that uses the promise machinery to be notified of something, we'd need to replace that with some other mechanism.  Otherwise I think the current patch is sufficient.  Not sure what other solution you were thinking of?

Another good question to ask is, how can we prevent this kind of thing from occurring again.  Can we add an experimental mode to Gecko where to do _any_ painting, we'd call into SpiderMonkey briefly just to land in PresShell::Paint or something, so that we can have a mode where there's always JS on the stack when painting?  That should help make it trivial to push a patch to try to evaluate what happens when we do that, or help reproduce issues like this...  We can even consider maintaining such a configuration green on TreeHerder.  Thoughts?
Flags: needinfo?(wmccloskey)
(In reply to Jonathan Kew (:jfkthame) from comment #11)
> ::: layout/style/FontFaceSet.cpp
> @@ +384,5 @@
> >    if (!mReady) {
> > +    nsCOMPtr<nsIGlobalObject> global = GetParentObject();
> > +    mReady = Promise::Create(global, aRv);
> > +    if (!mReady) {
> > +      return nullptr;
> 
> The old code would Throw(NS_ERROR_FAILURE) in this case (where promise
> creation has failed); do we no longer want that?

Exposing internal nsresult values to the Web is a terrible thing to do.  The current patch just propagates the exception that Promise::Create would put into aRv, which is arguably the best thing we can do here.
(In reply to :Ehsan Akhgari from comment #12)
> Another good question to ask is, how can we prevent this kind of thing from
> occurring again.  Can we add an experimental mode to Gecko where to do _any_
> painting, we'd call into SpiderMonkey briefly just to land in
> PresShell::Paint or something, so that we can have a mode where there's
> always JS on the stack when painting?  That should help make it trivial to
> push a patch to try to evaluate what happens when we do that, or help
> reproduce issues like this...  We can even consider maintaining such a
> configuration green on TreeHerder.  Thoughts?

I guess I could just add JS::AutoAssertOnGC to the stack every time we paint. That's kind of a no-brainer now that I think of it. Pushing that to try should expose a lot more issues like this.
Flags: needinfo?(wmccloskey)
(In reply to Bill McCloskey (:billm) from comment #14)
> (In reply to :Ehsan Akhgari from comment #12)
> > Another good question to ask is, how can we prevent this kind of thing from
> > occurring again.  Can we add an experimental mode to Gecko where to do _any_
> > painting, we'd call into SpiderMonkey briefly just to land in
> > PresShell::Paint or something, so that we can have a mode where there's
> > always JS on the stack when painting?  That should help make it trivial to
> > push a patch to try to evaluate what happens when we do that, or help
> > reproduce issues like this...  We can even consider maintaining such a
> > configuration green on TreeHerder.  Thoughts?
> 
> I guess I could just add JS::AutoAssertOnGC to the stack every time we
> paint. That's kind of a no-brainer now that I think of it. Pushing that to
> try should expose a lot more issues like this.

Yes, seems like exactly what we want.
This patches FontFace::mLoaded in the same way. I was able to reproduce this eventually, and with both these patches the assertion no longer fires.
Attachment #8800902 - Flags: review?(cam)
Oh, of course!  Sorry I missed that one.  :-)
Tracking 52+ for this identified regression in JS that causes a crash.
Comment on attachment 8800805 [details] [diff] [review]
Create FontFaceSet's ready promise lazily

Review of attachment 8800805 [details] [diff] [review]:
-----------------------------------------------------------------

The general approach is fine.  A couple of changes needed.

::: layout/style/FontFaceSet.cpp
@@ +384,5 @@
>    if (!mReady) {
> +    nsCOMPtr<nsIGlobalObject> global = GetParentObject();
> +    mReady = Promise::Create(global, aRv);
> +    if (!mReady) {
> +      return nullptr;

Since the IDL doesn't say we return a nullable Promise object, returning null here is incorrect (and the bindings will assert if we do).  I think we should continue throwing the exception.

@@ +1518,5 @@
> +      if (ready) {
> +        mReady.swap(ready);
> +      }
> +    } else {
> +      mReadyResolution = nullptr;

I am not sure what causes Promise creation to fail, so I don't know whether we could first fail to create a Promise and later succeed.  In case we can, then if the Promise creation just above here fails, we should set mResolveLazilyCreatedReadyPromise to false, so that it accurately records what to do when that later Promise creation succeeds.  So let's take this out of the else clause and have:

  if (!mReady) {
    mResolveLazilyCreatedReadyPromise = false;
  }

::: layout/style/FontFaceSet.h
@@ +318,3 @@
>    RefPtr<mozilla::dom::Promise> mReady;
> +  // A pointer to what mReady needs to be resolved to when created.
> +  FontFaceSet* MOZ_NON_OWNING_REF mReadyResolution;

We always resolve the promise with the same value (the FontFaceSet itself), so I think this should just be a bool.  How about:

  bool mResolveLazilyCreatedReadyPromise;

for a more descriptive name.
Attachment #8800805 - Flags: review?(cam) → review-
Comment on attachment 8800902 [details] [diff] [review]
Lazily create FontFace::mLoaded

Review of attachment 8800902 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/style/FontFace.cpp
@@ -202,5 @@
> -      if (mLoaded) {
> -        // The SetStatus call we are about to do assumes that for
> -        // FontFace objects with sources other than ArrayBuffer(View)s, that the
> -        // mLoaded Promise is rejected with a network error.  We get
> -        // in here beforehand to set it to the required syntax error.

Please retain this comment.
Attachment #8800902 - Flags: review?(cam) → review+
Priority: -- → P1
(In reply to Cameron McCormack (:heycam) from comment #19)
> @@ +1518,5 @@
> > +      if (ready) {
> > +        mReady.swap(ready);
> > +      }
> > +    } else {
> > +      mReadyResolution = nullptr;
> 
> I am not sure what causes Promise creation to fail, so I don't know whether
> we could first fail to create a Promise and later succeed.

With an OOM the first time, that's possible.
This should help avoid calling into the JS engine in the middle of
painting.
Attachment #8801780 - Flags: review?(cam)
Attachment #8800805 - Attachment is obsolete: true
Comment on attachment 8801780 [details] [diff] [review]
Part 1: Create FontFaceSet's ready promise lazily

Review of attachment 8801780 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with this.

::: layout/style/FontFaceSet.cpp
@@ +1518,3 @@
>  
> +      if (ready) {
> +        mReady.swap(ready);

Looking at this again, I think we should unconditionally swap.  (Or, just assign directly to mReady in the first place.)  My earlier comment about setting mResolveLazilyCreatedReadyPromise really only makes sense if we can set mReady to null after it being non-null.

In some ways it probably doesn't matter whether we end up throwing an exception or returning an out-of-date Promise object, since Promise creation shouldn't fail in practice, but allowing mReady to go back to null when the new Promise creation fails seems more correct to me.
Attachment #8801780 - Flags: review?(cam) → review+
Thanks for the review!  I'll address review comments and will land both patches together.
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/34c77f26702b
Part 1: Create FontFaceSet's ready promise lazily; r=heycam
https://hg.mozilla.org/integration/mozilla-inbound/rev/21c49d14a9d7
Part 2: Lazily create FontFace::mLoaded; r=heycam
Flags: needinfo?(wmccloskey)
Sorry about that. I made a stupid mistake. I fixed the problem locally and pushed to try. I'll push once that's done.
Flags: needinfo?(wmccloskey)
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a5afb8186e9
Part 1: Create FontFaceSet's ready promise lazily; r=heycam
https://hg.mozilla.org/integration/mozilla-inbound/rev/7302f5fb2b19
Part 2: Lazily create FontFace::mLoaded; r=heycam
https://hg.mozilla.org/mozilla-central/rev/9a5afb8186e9
https://hg.mozilla.org/mozilla-central/rev/7302f5fb2b19
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Track 51+ as regression.
Since this also affects 51, do you think the patch is worth uplifting to 51?
Flags: needinfo?(wmccloskey)
Yeah, let's backport this. I don't actually think 51 is affected, but it might be. The patches are pretty safe, so we might as well.
Flags: needinfo?(wmccloskey)
Comment on attachment 8800902 [details] [diff] [review]
Lazily create FontFace::mLoaded

Approval Request Comment
[Feature/regressing bug #]: I suspect this was regressed by bug 1308039, which landed in 52, but it may have been regressed by bug 1279086, which landed in 51.
[User impact if declined]: probably none, but maybe some sort of crash
[Describe test coverage new/current, TreeHerder]: on m-c for a while
[Risks and why]: Pretty simple patches. Low risk.
[String/UUID change made/needed]: None.
Attachment #8800902 - Flags: approval-mozilla-aurora?
Hi :jdm,
From the crash stat, I don't see any crashes in 51, do you think 51 is affected?
Flags: needinfo?(josh)
It doesn't crash in 51. The crash is cause by an assertion that was landed in 52. But, as I said, the same underlying problem may affect 51.
Flags: needinfo?(josh)
Comment on attachment 8800902 [details] [diff] [review]
Lazily create FontFace::mLoaded

Fix a crash which might affect 51. Take it in 51 aurora.
Attachment #8800902 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8801780 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
See Also: → 1316683
Assignee: nobody → ehsan
Version: unspecified → 51 Branch
You need to log in before you can comment on or make changes to this bug.