Closed Bug 1781821 Opened 2 years ago Closed 2 years ago

"Attempt to invoke virtual method 'boolean android.widget.EdgeEffect.isFinished()'" Crash in [@ java.lang.NullPointerException: at org.mozilla.geckoview.OverscrollEdgeEffect.draw(OverscrollEdgeEffect.java)]

Categories

(GeckoView :: General, defect, P1)

Unspecified
Android
defect

Tracking

(firefox103 wontfix, firefox104 fixed, firefox105 unaffected, firefox106 unaffected)

RESOLVED FIXED
Tracking Status
firefox103 --- wontfix
firefox104 --- fixed
firefox105 --- unaffected
firefox106 --- unaffected

People

(Reporter: cpeterson, Unassigned)

References

Details

(Keywords: crash, regression, Whiteboard: [geckoview:m106])

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/48182c33-6ab3-4013-bf7b-910c60220727

This crash looks like a regression in 103. I see crash reports in 103 (Beta and Release) and 104 (Nightly), but none from Nightly 105 yet.

Java stack trace:

java.lang.NullPointerException
	at org.mozilla.geckoview.OverscrollEdgeEffect.draw(OverscrollEdgeEffect.java:4)
	at org.mozilla.geckoview.GeckoView.dispatchDraw(GeckoView.java:3)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.draw(View.java:17240)
	at android.view.View.updateDisplayListIfDirty(View.java:16219)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at androidx.coordinatorlayout.widget.CoordinatorLayout.drawChild(CoordinatorLayout.java:4)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at androidx.constraintlayout.widget.ConstraintLayout.dispatchDraw(ConstraintLayout.java:5)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at androidx.fragment.app.FragmentContainerView.drawChild(FragmentContainerView.java:4)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at androidx.fragment.app.FragmentContainerView.dispatchDraw(FragmentContainerView.java:4)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at androidx.fragment.app.FragmentContainerView.drawChild(FragmentContainerView.java:4)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at androidx.fragment.app.FragmentContainerView.dispatchDraw(FragmentContainerView.java:4)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at org.mozilla.fenix.perf.HomeActivityRootLinearLayout.dispatchDraw(HomeActivityRootLinearLayout.kt:2)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.updateDisplayListIfDirty(View.java:16214)
	at android.view.View.draw(View.java:17003)
	at android.view.ViewGroup.drawChild(ViewGroup.java:3727)
	at android.view.ViewGroup.dispatchDraw(ViewGroup.java:3513)
	at android.view.View.draw(View.java:17240)
	at com.android.internal.policy.DecorView.draw(DecorView.java:757)
	at android.view.View.updateDisplayListIfDirty(View.java:16219)
	at android.view.ThreadedRenderer.updateViewTreeDisplayList(ThreadedRenderer.java:648)
	at android.view.ThreadedRenderer.updateRootDisplayList(ThreadedRenderer.java:654)
	at android.view.ThreadedRenderer.draw(ThreadedRenderer.java:762)
	at android.view.ViewRootImpl.draw(ViewRootImpl.java:2800)
	at android.view.ViewRootImpl.performDraw(ViewRootImpl.java:2608)
	at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:2215)
	at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1254)
	at android.view.ViewRootImpl$TraversalRunnable.run(ViewRootImpl.java:6343)
	at android.view.Choreographer$CallbackRecord.run(Choreographer.java:874)
	at android.view.Choreographer.doCallbacks(Choreographer.java:686)
	at android.view.Choreographer.doFrame(Choreographer.java:621)
	at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:860)
	at android.os.Handler.handleCallback(Handler.java:751)
	at android.os.Handler.dispatchMessage(Handler.java:95)
	at android.os.Looper.loop(Looper.java:241)
	at android.app.ActivityThread.main(ActivityThread.java:6274)
	at java.lang.reflect.Method.invoke(Native Method)
	at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:886)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:776)

java.lang.NullPointerException: Attempt to invoke virtual method 'boolean android.widget.EdgeEffect.isFinished()' on a null object reference, which suggests mEdges contains a null edge at one of these calls:

https://searchfox.org/mozilla-central/rev/2bbb0c0a90df20702df8c8011a8996536a83cb75/mobile/android/geckoview/src/main/java/org/mozilla/geckoview/OverscrollEdgeEffect.java#162,166,170,174

Summary: Crash in [@ java.lang.NullPointerException: at org.mozilla.geckoview.OverscrollEdgeEffect.draw(OverscrollEdgeEffect.java)] → "Attempt to invoke virtual method 'boolean android.widget.EdgeEffect.isFinished()'" Crash in [@ java.lang.NullPointerException: at org.mozilla.geckoview.OverscrollEdgeEffect.draw(OverscrollEdgeEffect.java)]

I will ask the APZ/Layout team to take a look. Did they land any changes related to scrolling or Android in v103?

There are two suspicious relevant changes landed in 103, bug 1760368 and bug 1312165. That's said, both landed at the very beginning of v103 branch, so it's weird that there's no crash reports on 103 nightly (given that there are reports on 104 nightly). So I guess it landed on a day close to the end of v103 cycle. There's one, that is bug 1773865, but it unlikely caused this crash in Java layer.

Another point I noticed is, if the crash happens because one of mEdges is null, it's managed in the Java class, so APZ is unlikely related related to the crash.

The line numbers for the top two frames (the only ones in our code) are bogus (line 3 and line 4 land in comments or declarations). Is that normal in java stacks? Could the problem be higher up?

Also, although I can see the overscroll pref is enabled on android I am unable to generate any overscroll effect on an android device (I tried on two different ones). Is this expected?

Given that the crash started since 103-beta5, bug 1772839 is somewhat related? I found the bug number while I was looking at GeckoView.java history.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #5)

Given that the crash started since 103-beta5, bug 1772839 is somewhat related? I found the bug number while I was looking at GeckoView.java history.

Jamie, do you think your FallbackFromAcceleration crash fix for bug 1772839 might have caused these NullPointerExceptions in org.mozilla.geckoview.OverscrollEdgeEffect.draw()?

Hiro found that these NullPointerExceptions started in Fenix 103-beta5, which was the first beta build with your uplifted fix.

Flags: needinfo?(jnicol)
See Also: → 1772839

It seems unlikely given this crash is affecting a completely different set of devices, so the workaround won't kick in as the original issue won't have occurred. None of the crash reports contain the logging which would be present had the original issue occurred, although it's possible that logging doesn't get added to java exception reports. Looking at the code I also can't see how it would affect it.

Flags: needinfo?(jnicol)

Hiro, what do you recommend we do next for this bug? Jamie doesn't think his fix for FallbackFromAcceleration bug 1772839 would have caused this NullPointerException in the OverscrollEdgeEffect Java code.

I still suspect that this crash was a regression in a beta uplift because the first crash report is from 103.0.0-beta.5. We saw no crash reports from Nightly 103.0a1 even though we see reports from Nightly 104 and 105.

(In reply to Jamie Nicol [:jnicol] from comment #7)

None of the crash reports contain the logging which would be present had the original issue occurred, although it's possible that logging doesn't get added to java exception reports.

Jamie, what is the name of the logging function or macro you use? I can ask the crash reporter team if those log messages are included in Java exception reports or not.

Flags: needinfo?(jnicol)
Flags: needinfo?(hikezoe.birchill)

I'd suggest trying to see common URLs in the crash reports and trying to reproduce the crash. That's said, I am now suspecting the crash happens on the Fenix home?

FWIW; here is the list of changes landed in 103-beta8 (in terms of desktop version numbering, I am not sure about Fenix version numbering).

https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=70aafe271fa773d1a43307de3dcd1eb74fc84a5d&tochange=6ed2603f8caef9cf2e8c2b87395de4c4e6210f7d

And this list is in 103-beta9, if it's worth.
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=843dccfaf51adbd1a5e139c8366833b8107d0acd&tochange=849278a96553266685e054706b730e2f34d9acec

Also note that, given that GeckoView.dispatchDraw doesn't have ThreadUtils.assertOnUiThread(), it gets called on a different thread? Then, there's a race condition where GeckoView.mSession is valid, but OverscrollEdgeEffect.mEdges isn't initialized here in between mSession = session and setTheme(context). Though even if the race is true, I am not sure it's related to this crash, I am not sure why the crash started recently.

Flags: needinfo?(hikezoe.birchill)

We don't have URLs for Android crashes.

Setting 103 to Won't Fix, entering RC week for 104
In the GV changes added before Fenix 103.0.0-beta5, Bug 1778718 was uplifted - the commit mentions changes in scroll frame building code?

The logging function is graphicsCriticalNote. I don't believe they will be included in java exception crash reports.

Looking at the changeling I agree bug 1772839 does look suspicious as a GeckoView.java change, although I don't understand how it could have caused it.

Flags: needinfo?(jnicol)

(In reply to Donal Meehan [:dmeehan] from comment #11)

Setting 103 to Won't Fix, entering RC week for 104
In the GV changes added before Fenix 103.0.0-beta5, Bug 1778718 was uplifted - the commit mentions changes in scroll frame building code?

Well, I don't think changes in Gecko are related to this crash.

I happened to find bug 1710708 that overscroll effect doesn't work on Android 10+. (this is why I don't see any overscroll effects on my Pixel3). Given that the code in question here isn't working on Android 10+ devices and there are a bunch of crash reports on such devices (Android API version 29+). Can we disable the code entirely on Android 10+ until we fix bug 1710708 to reduce the crash number?

Makoto, did you make any recent changes to Android scrolling? This crash looks like a regression in Fenix 103.

Do you know where to disable the overscroll code for Android 10+?

Flags: needinfo?(m_kato)
Priority: -- → P1
See Also: → 1710708
Whiteboard: [geckoview:m106]

(In reply to Chris Peterson [:cpeterson] from comment #14)

Makoto, did you make any recent changes to Android scrolling? This crash looks like a regression in Fenix 103.

I don't know. buildid: 20220715095545 (bp-fde7a8ae-ca46-4165-a65d-fda160220717) is first crash for this, but we don't change OverscrollEdgeEffect.java. (Bug 1756530 is later change after fist crash signature).

Do you know where to disable the overscroll code for Android 10+?

Even if Android 9 with Firefox 102 on Galaxy S7, overscroll isn't displayed. We may need to check regression range.

Flags: needinfo?(m_kato)

Even if Android 9 with Firefox 102 on Galaxy S7, overscroll isn't displayed. We may need to check regression range.

This is removed by bug 1724480 for Android 9 support. And :agi says that overscroll support is broken according to https://phabricator.services.mozilla.com/D115031.

I filed GPU process issue as bug 1785786

(In reply to Makoto Kato [:m_kato] from comment #17)

(In reply to Makoto Kato [:m_kato] from comment #18)

I filed GPU process issue as bug 1785786

Is this bug's NullPointerException caused by that GPU process bug 1785786? Or is that an unrelated bug you found while looking at this code?

Flags: needinfo?(m_kato)
See Also: → 1785786

(In reply to Chris Peterson [:cpeterson] from comment #19)

Is this bug's NullPointerException caused by that GPU process bug 1785786? Or is that an unrelated bug you found while looking at this code?

No. I guess that this crash is mEdges may be null. But OverscrollEdgeEffect.setTheme will be called by GeckoView.setSession. So if mSession isn't null, mEdges won't be null... But updateOverscrollVelocity and updateOverscrollOffset won't be called with enabling GPU process, so overscroll animation doesn't run until bug 1785786 is fixed.

But we might have to check mEdges in OverscrollEdgeEffect.draw to avoid this crash...

Flags: needinfo?(m_kato)

This crash doesn't occurs on 104.0b6 or later... I should watch whether this issue occurs on stable release of 104.

The last recent Nightly crashes I see are from build ID 20220808092108, about two weeks. Did some code change on August 8 or 9?

Is it just a coincident https://hg.mozilla.org/integration/autoland/rev/fb6f6155866e which is tied to bug 1772839 landed on Aug 8th?

(In reply to Hiroyuki Ikezoe (:hiro) from comment #23)

Is it just a coincident https://hg.mozilla.org/integration/autoland/rev/fb6f6155866e which is tied to bug 1772839 landed on Aug 8th?

Bug 1783244 is uplifted to 104.0b8, but there is no crash from 104.0b6.

Note that Desktop betas and Fenix betas are not aligned in version number. Fenix 104.0b6 would have corresponded to the landing in bug 1783244.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #26)

Confirmed that Fenix 104.0b6 includes bug 1783244.

Do we think bug 1783244 fixed this crash? That bug fix landed in mozilla-central on 2022-08-08 and the last crash was from Nightly build ID 20220808092108, which might suggest that the bug fixed this crash.

I'd say, yes. And I'd suggest not adding any null checks, I mean leaving OverscrollEdgeEffect code as it is, so that once after we see the same crash signature again, we can start with looking around SurfaceViewWrapper code first. Given that bug 1783244 and bug 1772839 have different symptoms, I suspect this crash will imply a bug somewhere else.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #28)

I'd say, yes. And I'd suggest not adding any null checks, I mean leaving OverscrollEdgeEffect code as it is, so that once after we see the same crash signature again, we can start with looking around SurfaceViewWrapper code first. Given that bug 1783244 and bug 1772839 have different symptoms, I suspect this crash will imply a bug somewhere else.

If bug 1783244 and bug bug 1772839 fixed this issue, GeckoView.setSession (Display.aquire) would throw an exception before calling setTheme. Then, if any caller had try-catch block, this issue might occur since setTheme isn't called.

If so, bug 1786119 (landed last week) helps this situation. OverscrollEdgeEffect reuses first GeckoSession unfortunately, so it sets current GeckoSession via GeckoView.setSession and OverscrollEdgeEffect.draw checks whether GeckoSession is available.

should we close this bug given that there are no crashes in 104/105 or is there further investigation to happen?

(In reply to Dianna Smith [:diannaS] from comment #30)

should we close this bug given that there are no crashes in 104/105 or is there further investigation to happen?

I don't think we need to investigate any further, so I'll resolve this bug as fixed. We don't see any crash reports from build IDs > Nightly 20220804094607 or Beta > 104.0b5.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

Removing regressionwindow-wanted keyword because this bug has been resolved.

Removing regressionwindow-wanted keyword because this bug has been resolved.

Removing regressionwindow-wanted keyword because this bug has been resolved.

You need to log in before you can comment on or make changes to this bug.