Closed Bug 1183786 Opened 9 years ago Closed 9 years ago

4,500 instances of "Shouldn't call SchedulePaint in a detached pres context" emitted from layout/generic/nsFrame.cpp during linux64 debug testing

Categories

(Core :: Layout, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla42
Tracking Status
firefox42 --- fixed

People

(Reporter: erahm, Assigned: dholbert)

References

(Blocks 2 open bugs)

Details

Attachments

(5 files, 2 obsolete files)

> 4532 [NNNNN] WARNING: Shouldn't call SchedulePaint in a detached pres context: file layout/generic/nsFrame.cpp, line 5195

Bug 974905 added this warning [1], it is currently the 5th most verbose warning during linux64 debug testing.

This is isolated to e10s reftests:

> mozilla-central_ubuntu64_vm-debug_test-reftest-e10s-1-bm68-tests1-linux64-build0.txt:3239
> mozilla-central_ubuntu64_vm-debug_test-reftest-e10s-2-bm123-tests1-linux64-build2.txt:1293

It shows up in 305 tests. A few of the most prevalent:

> 45 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/css-ruby/inflated-ruby-1.html
> 26 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/font-inflation/threshold-scope-float-overflow-2.html
> 25 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/font-inflation/threshold-scope-float-overflow-1.html
> 25 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/font-inflation/threshold-scope-float-2.html
> 25 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/font-inflation/threshold-scope-cell-1.html
> 24 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/font-inflation/select-listbox-1.html
> 24 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/font-inflation/intrinsic-fit-2c.html
> 24 - file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/font-inflation/intrinsic-fit-2b.html

[1] https://hg.mozilla.org/mozilla-central/annotate/e7e69cc8c07b/layout/generic/nsFrame.cpp#l5194
This is probably something that we want to understand why it's happening.
I'm able to repro locally with:
> ./mach reftest --e10s layout/reftests/font-inflation/

Looks like it's coming from the child process.
Attached file Sampling of stacks
Timothy, I'm not sure how helpful this is but I've a attached a few stacks that I sampled at every 100 instances of the warning runnibg the layout/reftests/font-inflation/ tests.

Let me know if there's more info that would be useful.
Flags: needinfo?(tnikkel)
Thanks.

Looks like the pref change callbacks only get unregistered when the prescontext is destroyed. So maybe we need to be more careful when handling pref changes because we could be in a detached prescontext.
Timothy do you plan on working on this?
This is now the #2 most verbose warning.
So I think what we should do here is check if we are detached (null mContainer) in nsPresContext::UpdateAfterPreferencesChanged. If so then set a flag that we need a pref update if we get get re-attached, and just return. Then in nsPresContext::SetContainer check this flag if we are going from null to non-null container and start a timer to call UpdateAfterPreferencesChanged.
Flags: needinfo?(tnikkel)
Timothy is this along the lines of what you were thinking?
Attachment #8641229 - Flags: feedback?(tnikkel)
Assignee: nobody → erahm
Status: NEW → ASSIGNED
Comment on attachment 8641229 [details] [diff] [review]
Delay updating after prefs changed if detached

Looks good thanks!

> nsPresContext::SetContainer(nsIDocShell* aDocShell)
> {
>   if (aDocShell) {
>     mContainer = static_cast<nsDocShell*>(aDocShell);
>+    if (mNeedsPrefUpdate) {
>+      mPrefChangedTimer = do_CreateInstance("@mozilla.org/timer;1");
>+      if (mPrefChangedTimer) {
>+        mPrefChangedTimer->InitWithFuncCallback(
>+            nsPresContext::PrefChangedUpdateTimerCallback,
>+            (void*)this, 0, nsITimer::TYPE_ONE_SHOT);
>+        mNeedsPrefUpdate = false;
>+      }
>+    }

We should probably check if there is an existing mPrefChangedTimer already before created a new one.

Can we also assert that if mNeedsPrefUpdate is true then mContainer is null (before we assign aDocShell to it)?
Attachment #8641229 - Flags: review?(dholbert)
Attachment #8641229 - Flags: feedback?(tnikkel)
Attachment #8641229 - Flags: feedback+
Comment on attachment 8641229 [details] [diff] [review]
Delay updating after prefs changed if detached

(In reply to Timothy Nikkel (:tn) from comment #11)
> We should probably check if there is an existing mPrefChangedTimer already
> before created a new one.

Also, we should probably share the existing do_CreateInstance/InitWithFuncCallback timer-setup-code by refactoring it into a helper-function, instead of duplicating those calls and their various args in a new  spot.

>+++ b/layout/base/nsPresContext.h
>@@ -1368,16 +1368,19 @@ protected:
>+  // Is the a pref update pending having a container?
>+  unsigned              mNeedsPrefUpdate : 1;

s/the/there/

Also, "update pending having a container" is a bit hard to visually parse. Maybe replace with:
  // Is there a pref update to process, once we have a container?
(RE the helper function -- maybe call it "InitPrefChangedUpdateTimer"?)
Comment on attachment 8641470 [details] [diff] [review]
Part 2: Delay updating after prefs changed if detached

Looks good to me. I'd like Daniel to give the final okay to make sure the approach I came up with is reasonable.
Attachment #8641470 - Flags: review?(tnikkel)
Attachment #8641470 - Flags: review?(dholbert)
Attachment #8641470 - Flags: review+
Comment on attachment 8641469 [details] [diff] [review]
Part 1: Refactor timer setup code into a helper function

Review of attachment 8641469 [details] [diff] [review]:
-----------------------------------------------------------------

Ah, I didn't realize we had multiple timers in play here. Nice code-sharing with the helper function.

One nit about the way the function returns its value / success-indication, noted below; this is probably r+ with that addressed, but I'll set feedback+ since it's worth sanity-checking the final result. (since refcounted-pointer-handling is tricky)

::: layout/base/nsPresContext.cpp
@@ +2530,5 @@
>    return mShell && mShell->StyleSet()->HasCachedStyleData();
>  }
>  
> +bool
> +nsPresContext::InitTimer(nsCOMPtr<nsITimer>& aTimer,

Instead of taking this parameter & also returning a bool, it'd be a bit cleaner to:
 - Change the return type to already_AddRefed<nsITimer>.
 - Use a local "nsCOMPtr<nsITimer> timer" variable inside this function.
 - Return timer.forget() on success, or nullptr on failure.

This avoids the possibility of getting into a weird state where we might confusingly indicate "success" by initializing the timer outparam, but indicate failure by returning false.  (e.g. if do_CreateInstance succeeds but InitWithFuncCallback fails)

So then callers would explicitly set "mWhateverTimer = InitTimer(...)" and then null-check mWhateverTimer if they care about failure.
Attachment #8641469 - Flags: review?(dholbert) → feedback+
Attachment #8641229 - Attachment is obsolete: true
Attachment #8641229 - Flags: review?(dholbert)
Comment on attachment 8641470 [details] [diff] [review]
Part 2: Delay updating after prefs changed if detached

Review of attachment 8641470 [details] [diff] [review]:
-----------------------------------------------------------------

I need to study mContainer a bit more before I'm comfortable signing off on part 2 (since I'm not sure I really know what it represents right now); I'll take a closer look in the morning.

For now, though, one nit:

::: layout/base/nsPresContext.cpp
@@ +1575,5 @@
>  nsPresContext::SetContainer(nsIDocShell* aDocShell)
>  {
>    if (aDocShell) {
> +    NS_ASSERTION(!(!!mContainer && mNeedsPrefUpdate),
> +                 "Should only need pref update if mContainer is null.");

The "!(!!" here is hard to read, IMO.

Let's just drop the "!!" boolification and let the boolification just happen automagically.  This is what the coding style recommends, anyway:
  #  When testing a pointer, use (!myPtr) or (myPtr)
  https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#CC_practices

It doesn't explicitly mention !! as an antipattern; but we might as well just use the more concise (and less "!(!!"-heavy) explicitly-recommended technique of just testing the pointer directly.
(In reply to Daniel Holbert [:dholbert] from comment #19)
> Comment on attachment 8641470 [details] [diff] [review]
> Part 2: Delay updating after prefs changed if detached
> 
> Review of attachment 8641470 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I need to study mContainer a bit more before I'm comfortable signing off on
> part 2 (since I'm not sure I really know what it represents right now); I'll
> take a closer look in the morning.
> 
> For now, though, one nit:
> 
> ::: layout/base/nsPresContext.cpp
> @@ +1575,5 @@
> >  nsPresContext::SetContainer(nsIDocShell* aDocShell)
> >  {
> >    if (aDocShell) {
> > +    NS_ASSERTION(!(!!mContainer && mNeedsPrefUpdate),
> > +                 "Should only need pref update if mContainer is null.");
> 
> The "!(!!" here is hard to read, IMO.
> 
> Let's just drop the "!!" boolification and let the boolification just happen
> automagically.  This is what the coding style recommends, anyway:
>   #  When testing a pointer, use (!myPtr) or (myPtr)
>  
> https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/
> Coding_Style#CC_practices
> 
> It doesn't explicitly mention !! as an antipattern; but we might as well
> just use the more concise (and less "!(!!"-heavy) explicitly-recommended
> technique of just testing the pointer directly.

Daniel, thanks for the thorough review. It sounds like you have some specific requirements, would you mind finishing this off while I'm out?
Assignee: erahm → dholbert
Sure - I can address my review-nits in a followup patch here. Thanks.
Comment on attachment 8641470 [details] [diff] [review]
Part 2: Delay updating after prefs changed if detached

Following up here -- the approach seems good to me. r=me on part 2, with my nits addressed (which I'll post a patch to do, & have tn sign off on it)
Attachment #8641470 - Flags: review?(dholbert) → review+
This addresses my review feedback in comment 18.

(I renamed the function, too, because "Init" wasn't as correct anymore now that we're not passing in something to be initialized).
Attachment #8642584 - Flags: review?(tnikkel)
...and this addresses my feedback in comment 19 (just dropping "!!" boolification).
Attachment #8642586 - Flags: review?(tnikkel)
Attachment #8642584 - Flags: review?(tnikkel) → review+
Comment on attachment 8642586 [details] [diff] [review]
part 4: drop unnecessary "!!"

Thanks.
Attachment #8642586 - Flags: review?(tnikkel) → review+
Good work everyone, it's nice to see this finally fixed properly.

Perhaps we can now also change back the warning to a fatal assertion
to make sure we catch any remaining issues / regressions?
(it was degraded to a warning in bug 974905)
Yes, let's do that.
(In reply to Timothy Nikkel (:tn) from comment #29)
> try push with assertion (and these patches)

Instead of MOZ_ASSERT(false, ...) as you've got in your try run, we should just directly back out mozilla-central changeset c2f752b69b3a, which is where we demoted the assert to a check-and-return.

Here's a patch to do that.

(The patch in your Try run is close enough to this that it'll still serve as proof that this is OK, I think.)
Attachment #8642609 - Flags: review?(tnikkel)
Comment on attachment 8642609 [details] [diff] [review]
part 5: upgrade warning to assertion

(In reply to Daniel Holbert [:dholbert] (out 8/5-8/10) from comment #30)
> Instead of MOZ_ASSERT(false, ...) as you've got in your try run, we should
> just directly back out mozilla-central changeset c2f752b69b3a, which is
> where we demoted the assert to a check-and-return.

Yeah, I didn't intend to land that, just wanted to get the try results quickly.
Attachment #8642609 - Flags: review?(tnikkel) → review+
Thanks!

Yeah, looks like this warning isn't quite ready to be a fatal assert again; your Try run shows that we hit it during some printing & SVG-as-an-image reftests, in reftest-e10s. e.g.:

15:35:46     INFO -  REFTEST TEST-LOAD | file:///builds/slave/test/build/tests/reftest/tests/layout/reftests/backgrounds/table-background-print.html | 1890 / 6315 (29%)
15:35:46     INFO -  ++DOMWINDOW == 38 (0x7f26413f7800) [pid = 1906] [serial = 7525] [outer = 0x7f2645d7c800]
15:35:47     INFO -  Assertion failure: false (Shouldn't call SchedulePaint in a detached pres context), at /builds/slave/try-l64-d-00000000000000000000/build/src/layout/generic/nsFrame.cpp:5195

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tnikkel@gmail.com-3ec616a39f2a/try-linux64-debug/try_ubuntu64_vm-debug_test-reftest-e10s-1-bm52-tests1-linux64-build292.txt.gz
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tnikkel@gmail.com-3ec616a39f2a/try-linux64-debug/try_ubuntu64_vm-debug_test-reftest-e10s-2-bm53-tests1-linux64-build182.txt.gz

I'll land parts 1-4 here and file a new bug for part 5.
My Try run in comment 25 looks good (modulo some Linux32 M-e10s bc1 orange -- but I'm pretty sure that's intermittent because it didn't fire on Linux64, and it's been reported as an intermittent before e.g. bug 947634 comment 5.)

So, pushed parts 1-4. (csets in previous comment)
Blocks: 1190645
Comment on attachment 8642609 [details] [diff] [review]
part 5: upgrade warning to assertion

Spun off bug 1190645 for fixing the remaining instances & landing part 5 (the upgrade-to-assertion).
Attachment #8642609 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: