Closed Bug 1586770 Opened 1 year ago Closed 1 year ago

"Welcome to Firefox" Fennec Crash in [@ java.lang.IndexOutOfBoundsException: at android.support.v7.widget.RecyclerView$Recycler.tryGetViewHolderForPositionByDeadline(RecyclerView.java)]

Categories

(Firefox for Android :: General, defect, P1)

Unspecified
Android
defect

Tracking

()

VERIFIED FIXED
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 70+ verified
firefox67 --- unaffected
firefox68 --- unaffected
firefox69 --- unaffected
firefox70 --- unaffected
firefox71 --- unaffected

People

(Reporter: marcia, Assigned: vlad.baicu)

References

(Regression)

Details

(Keywords: crash, regression, Whiteboard: [fennec68.2])

Crash Data

Attachments

(1 file)

This bug is for crash report bp-0503366f-42e2-4ce0-94e9-a75d20191007.

Seen while looking at Fennec nightly crash stats - crashes started in 20191002040044: https://bit.ly/2VjHmW8

Comment: "Trying to close the "welcome to nightly" banner at the bookmarks. No response"

Java stack trace:

java.lang.IndexOutOfBoundsException
	at android.support.v7.widget.RecyclerView$Recycler.tryGetViewHolderForPositionByDeadline(RecyclerView.java:5923)
	at android.support.v7.widget.RecyclerView$Recycler.getViewForPosition(RecyclerView.java:5858)
	at android.support.v7.widget.RecyclerView$Recycler.getViewForPosition(RecyclerView.java:5854)
	at android.support.v7.widget.LinearLayoutManager$LayoutState.next(LinearLayoutManager.java:2230)
	at android.support.v7.widget.LinearLayoutManager.layoutChunk(LinearLayoutManager.java:1557)
	at android.support.v7.widget.LinearLayoutManager.fill(LinearLayoutManager.java:1517)
	at android.support.v7.widget.LinearLayoutManager.onLayoutChildren(LinearLayoutManager.java:612)
	at android.support.v7.widget.RecyclerView.dispatchLayoutStep1(RecyclerView.java:3875)
	at android.support.v7.widget.RecyclerView.dispatchLayout(RecyclerView.java:3639)
	at android.support.v7.widget.RecyclerView.consumePendingUpdateOperations(RecyclerView.java:1877)
	at android.support.v7.widget.RecyclerView$1.run(RecyclerView.java:407)
	at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1012)
	at android.view.Choreographer.doCallbacks(Choreographer.java:823)
	at android.view.Choreographer.doFrame(Choreographer.java:755)
	at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:998)
	at android.os.Handler.handleCallback(Handler.java:873)
	at android.os.Handler.dispatchMessage(Handler.java:99)
	at android.os.Looper.loop(Looper.java:193)
	at android.app.ActivityThread.main(ActivityThread.java:6746)
	at java.lang.reflect.Method.invoke(Native Method)
	at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:858)

https://hg.mozilla.org/releases/mozilla-esr68/pushloghtml?changeset=749f9c937ab9e821a4dd89d9cde2da20f96a199a is the set of items that landed in that timeframe. Adding ni on :vlad to take a look.

Flags: needinfo?(vlad.baicu)

Adding cpeterson based on conversation with telin at the Nightly Deep Dive.

(In reply to Marcia Knous [:marcia - needinfo? me] from comment #1)

https://hg.mozilla.org/releases/mozilla-esr68/pushloghtml?changeset=749f9c937ab9e821a4dd89d9cde2da20f96a199a is the set of items that landed in that timeframe. Adding ni on :vlad to take a look.

This is probably a regression from the Awesomescreen's new "Welcome to Firefox" card (bug 1570880).

Assigning to Vlad and setting priority P1 because we should fix this crash before shipping Fennec ESR 68. 2 (October 22).

Assignee: nobody → vlad.baicu
Priority: -- → P1
Regressed by: fennec-new-tab-signin
Whiteboard: [fennec68.2]
Summary: Crash in [@ java.lang.IndexOutOfBoundsException: at android.support.v7.widget.RecyclerView$Recycler.tryGetViewHolderForPositionByDeadline(RecyclerView.java)] → "Welcome to Firefox" Fennec Crash in [@ java.lang.IndexOutOfBoundsException: at android.support.v7.widget.RecyclerView$Recycler.tryGetViewHolderForPositionByDeadline(RecyclerView.java)]

We'd need a fix as soon as possible, preferably by Sunday as I think the ESR 68.2 build will kick off on Monday. I don't have a way to mark this as a release blocker but we should consider it one.

I cannot repro this (in a few minutes of fiddling) on an x86 emulator (Android 24), nor an x86_64 emulator (Android 28), nor an aarch64 Pixel 2 (Android 28). I don't have meaningful Fennec profile data, nor am I upgrading from a previous version.

I was hoping to use the Android xref to read the RecyclerView source, but I'm not finding exactly the right version. It's possible we're using a particular version via Gradle, which might let us see what's actually happening here.

A cursory bit of reading suggests two things to start investigating:

  1. It appears that notify add/remove is subtle. See, for example this SO answer and this related Android bug report. Our code swaps a cursor and does notifications (https://hg.mozilla.org/releases/mozilla-esr68/rev/4396e184de5d#l13.227) as well as recreates the adapter entirely. Perhaps we need to notify more in these cases?

  2. It appears that threading can be subtle here. It's clear that the shared preference listener that updates the top sites adapter is run on the main/UI thread (see https://developer.android.com/reference/android/content/SharedPreferences.OnSharedPreferenceChangeListener.html#onSharedPreferenceChanged(android.content.SharedPreferences,%20java.lang.String)), so nothing has changed there. I don't see any other obvious thread hopping that might happen: the update model stays essentially the same.

In general I'd prefer to not keep a long-lived Context reference -- down that road lies trouble eventually -- but I also can't see an immediate problem with the reference in the offending patch.

Hello, I tried to reproduce this issue using different scenarios but I wasn't able to trigger the crash.

I tried the following:

  • Closing the 'Welcome to Nightly' banner by tapping on the 'X' button;
  • Closing the 'Welcome to Nightly' banner by tapping on the Sign-up/Sign-in button;
  • Using the Welcome screen to trigger instantly the sign-in flow to close the 'Welcome to Nightly';
  • Log-in/Sign-up and doing some bookmarks;
  • Sending some bookmarks to other devices after sign-in from the 'Welcome to Nightly' banner;

Devices used:

  • Samsung Galaxy S8 (Android 9);
  • Samsung Galaxy Note 8 (Android 9);
  • Google Pixel 2 (Android 9);
  • OnePlus 5T (Android 9);
  • LG G7 fit (Android 8.1);

Build:

  • Nightly 68.2a1 (2019-10-07);

If I can help with other tests, just let me know.

I have managed to reproduce the issue on Firefox Beta 68.2b6 on a Samsung Galaxy S8+ (Android 8.0.0), Google Pixel 2 (Android 9), OnePlus 5T (Android 9), LG G7 fit (Android 8.1) and Google Pixel 3a XL (Android 9) using the following steps.

  1. Go to Setting -> General -> Home.
  2. Hide the Bookmarks and History Panels.
  3. In the Top sites uncheck Recommended by Pocket, Recent Bookmarks and Visited.
  4. Open a new tab and tap the close button from the Welcome to Firefox card and Firefox will crash.

(In reply to Laurentiu Apahidean from comment #7)

I have managed to reproduce the issue on Firefox Beta 68.2b6 on a Samsung Galaxy S8+ (Android 8.0.0), Google Pixel 2 (Android 9), OnePlus 5T (Android 9), LG G7 fit (Android 8.1) and Google Pixel 3a XL (Android 9) using the following steps.

  1. Go to Setting -> General -> Home.
  2. Hide the Bookmarks and History Panels.
  3. In the Top sites uncheck Recommended by Pocket, Recent Bookmarks and Visited.
  4. Open a new tab and tap the close button from the Welcome to Firefox card and Firefox will crash.

Laurentiu: great sleuthing! I wondered if it was possible for this index (1) to be incorrect at https://hg.mozilla.org/releases/mozilla-esr68/rev/4396e184de5d#l13.234... and by removing all the various panels, I think you could achieve that. Maybe we could iterate all of the different views, and notify changes for the ones that have the right type? I don't know if one can re-order the views, but if we can, this would handle that as well.

Disable predictive animations. There is a bug in RecyclerView which causes views that
are being reloaded to pull invalid ViewHolders from the internal recycler stack if the
adapter size has decreased since the ViewHolder was recycled.

Thank you everybody for the help, I've managed to track the issue down indeed to the swap cursor call https://hg.mozilla.org/releases/mozilla-esr68/rev/4396e184de5d#l13.234. When tested with a notify on the entire data set, the crash stopped occuring, which led me to think that the stack of the rv was corrupted. Seems to be a long standing issue on RVs - https://stackoverflow.com/a/33985508/7027339 (kudos to this man).

Flags: needinfo?(vlad.baicu)

Is this ready for an ESR68 approval request?

Flags: needinfo?(vlad.baicu)
Flags: needinfo?(vlad.baicu)
Keywords: checkin-needed

Comment on attachment 9100363 [details]
Bug 1586770 - Fix activity stream invalid viewholders crash.r=petru

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Fixes a crash introduced by the new sign in row
  • User impact if declined: Users will experience crashes
  • Fix Landed on Version:
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This patch only disables the preventive animations of the RV
  • String or UUID changes made by this patch:
Attachment #9100363 - Flags: approval-mozilla-esr68?

Comment on attachment 9100363 [details]
Bug 1586770 - Fix activity stream invalid viewholders crash.r=petru

Fixes a top crash from the New Tab Page work. Approved for 68.2b7.

Attachment #9100363 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED

Hello, I can confirm that this issue is not reproducible on Firefox 68.2b7 using the steps from comment 7.
Devices:

  • Nokia 6 (Android 7.1.1);
  • OnePlus 5T (Android 9);
  • Samsung Galaxy Note 8 (Android 9);
  • Google Pixel 2 (Android 9).

Due to my findings, I'll close this issue as verified, thanks.

Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.