Closed Bug 1243267 Opened 4 years ago Closed 4 years ago

Assertion failure: mScrollEvent, at nsGfxScrollFrame.cpp:4274, while scrolling twitter

Categories

(Core :: Layout, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox46 --- fixed
firefox47 --- fixed

People

(Reporter: dbaron, Assigned: kats, NeedInfo)

References

(Blocks 1 open bug, )

Details

(Keywords: assertion, regression)

Attachments

(2 files, 3 obsolete files)

Twice in the past few days I've seen this fatal assertion:

Assertion failure: mScrollEvent, at mozilla/layout/generic/nsGfxScrollFrame.cpp:4274

while scrolling down on https://twitter.com/
Maybe a regression from bug 1209970?
Flags: needinfo?(bugmail.mozilla)
Yeah, probably. Looking at the code I'm not entirely sure why it would happen - although I guess it might if two calls to WillRefresh are scheduled before the first one is dispatched. Then the second call would run into a null mScrollEvent and hit this assertion. I can put up a patch that guards against this.
Assignee: nobody → bugmail.mozilla
Flags: needinfo?(bugmail.mozilla)
Attached patch Patch (obsolete) — Splinter Review
This should fix the assertion (by getting rid of it) and handling that scenario. If it is the scenario I described above then not sending the second scroll event should be fine because the scroll position would not have changed between the two calls (otherwise mScrollEvent would have been re-created).
Attachment #8712742 - Flags: review?(mats)
Comment on attachment 8712742 [details] [diff] [review]
Patch

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> Yeah, probably. Looking at the code I'm not entirely sure why it would
> happen - although I guess it might if two calls to WillRefresh are scheduled
> before the first one is dispatched.

What do you mean by "scheduled"?  WillRefresh is called synchronously from
the RefreshDriver here:
http://mxr.mozilla.org/mozilla-central/source/layout/base/nsRefreshDriver.cpp?rev=3b5f43556647#1659
and PostScrollEvent() only creates one at a time:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsGfxScrollFrame.cpp?rev=7a7def04840d#4286
so I don't see how we can get a FireScrollEvent() call (i.e. a WillRefresh
callback) with mScrollEvent being null.  Maybe I'm missing something,
can you elaborate on the series of steps that would lead up to this
assertion?
Comment on attachment 8712742 [details] [diff] [review]
Patch

That's a good point, I guess that scenario can't actually happen. I'll think about this a bit more. :dbaron, if you see it again and can grab a stack trace that would be helpful. I'm wondering if it's some sort of reentrant behavior.
Attachment #8712742 - Flags: review?(mats)
http://www.bonprix.it/categoria/per-lei-moda-donna-dalla-a-alla-z-pullover-e-felpe/?page=2

Assertion failure: mScrollEvent, at /mozilla/builds/nightly/mozilla/layout/generic/nsGfxScrollFrame.cpp:4401

Aurora/46, Nightly/47 on Linux and Windows. OSX not tested.
http://hg.mozilla.org/mozilla-central/rev/a87d6d52c1fcfffbd6f44537a9671973802cca13

Once you reproduce you need to clear your history and reload to see it again. Let me know if you need a regression range.
Thanks! I was able to repro on OS X as well. What appears to be happening is that there's a scroll event listener on the page which does a (sync?) XHR request, which spins a nested event loop. That nested event loop fires the refresh driver which then triggers the same scroll event dispatch. So basically yeah, the ScrollEvent::WillRefresh function is getting called reentrantly, hence the problem.
Attached patch Patch (obsolete) — Splinter Review
Patch is the same as before, commit message is updated. Given the scenario this happens in I feel like this solution is probably appropriate, unless we want to avoid this reentrancy problem entirely.
Attachment #8712742 - Attachment is obsolete: true
Attachment #8721738 - Flags: review?(mats)
Attached patch Patch (obsolete) — Splinter Review
Oops, wrong file
Attachment #8721738 - Attachment is obsolete: true
Attachment #8721738 - Flags: review?(mats)
Attachment #8721739 - Flags: review?(mats)
Comment on attachment 8721739 [details] [diff] [review]
Patch

I think the root of the problem is that the refresh driver takes
a kungFuDeathGrip on the observer before calling WillRefresh():
http://mxr.mozilla.org/mozilla-central/source/layout/base/nsRefreshDriver.cpp?rev=75dfe10ec44a#1688
so nulling out mScrollEvent doesn't delete it.
It seems like the intention was that it should've been the last ref which
would delete it, causing the dtor to remove it from the refresh driver:
http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsGfxScrollFrame.cpp?rev=4189a6261ca9#4382

I seems to me that ScrollEvent is meant to be "one shot" so it
feels wrong that we may get more than one call to FireScrollEvent
on the same event.

I suggest we add a RemoveRefreshObserver call at the start of
ScrollEvent::WillRefresh instead, and keep the assertion as is.
The RemoveRefreshObserver in the dtor is then redundant in case
WillRefresh was called, but it's still needed if not.
I don't think it hurts to call it twice though.

What do you think?
Flags: needinfo?(bugmail.mozilla)
I'm surprised our fuzz testing framework didn't catch this.  It seems like we should try to
enhance it to include the "spinning the event loop from event handlers using sync XHR (or some
other means) and then doing nasty stuff" case.  Or maybe we have that and we were just unlucky
to not hit this assertion yet?
Flags: needinfo?(jruderman)
Comment 10 sounds reasonable, I'll update the patch accordingly.
Flags: needinfo?(bugmail.mozilla)
Comment on attachment 8721739 [details] [diff] [review]
Patch

Dropping review for now.
Attachment #8721739 - Flags: review?(mats)
Attached patch Patch v2Splinter Review
Attachment #8721739 - Attachment is obsolete: true
Attachment #8722044 - Flags: review?(mats)
Attachment #8722044 - Flags: review?(mats) → review+
https://hg.mozilla.org/mozilla-central/rev/2bf13f9218ad
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Comment on attachment 8722044 [details] [diff] [review]
Patch v2

Approval Request Comment
[Feature/regressing bug #]: bug 1209970
[User impact if declined]: probably not much observable, it's an assertion failure which crashes debug builds. In release I think the only problem is that we sometimes fire two scroll events instead of one, in very specific scenarios. However it is a regression in 46 and the patch is low-risk so I figured it's probably ok to uplift.
[Describe test coverage new/current, TreeHerder]: we have tests that exercise the scroll event code but apparently none that exercise this specific re-entrant code path. jruderman is needinfo'd to see if we can fuzz this scenario.
[Risks and why]: pretty low risk patch
[String/UUID change made/needed]: none
Attachment #8722044 - Flags: approval-mozilla-aurora?
Comment on attachment 8722044 [details] [diff] [review]
Patch v2

Fix for recent regression, while not really user-facing, still good to fix before we ship it.
Attachment #8722044 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.