Closed
Bug 1014603
Opened 10 years ago
Closed 5 years ago
crash in java.lang.NullPointerException: at org.mozilla.gecko.db.TopSitesCursorWrapper.moveToPosition(TopSitesCursorWrapper.java)
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(firefox32 wontfix, firefox33 wontfix, firefox34 wontfix, firefox35 affected, firefox36 affected, firefox37 affected, firefox38 affected)
People
(Reporter: aaronmt, Unassigned)
References
Details
(Keywords: crash)
Crash Data
Attachments
(2 files, 1 obsolete file)
6.35 KB,
patch
|
mcomella
:
review+
|
Details | Diff | Splinter Review |
6.49 KB,
text/plain
|
Details |
This bug was filed from the Socorro interface and is report bp-42ec7b20-4b03-4b77-837a-ece282140515. ============================================================= java.lang.NullPointerException at org.mozilla.gecko.db.TopSitesCursorWrapper.moveToPosition(TopSitesCursorWrapper.java:313) at android.support.v4.widget.CursorAdapter.getView(CursorAdapter.java:247) at android.widget.HeaderViewListAdapter.getView(HeaderViewListAdapter.java:220) at android.widget.AbsListView.obtainView(AbsListView.java:2263) at android.widget.ListView.makeAndAddView(ListView.java:1790) at android.widget.ListView.fillUp(ListView.java:725) at android.widget.ListView.fillGap(ListView.java:664) at android.widget.AbsListView.trackMotionScroll(AbsListView.java:5136) at android.widget.AbsListView$FlingRunnable.run(AbsListView.java:4247) at android.os.Handler.handleCallback(Handler.java:733) at android.os.Handler.dispatchMessage(Handler.java:95) at android.os.Looper.loop(Looper.java:136) at android.app.ActivityThread.main(ActivityThread.java:5017) at java.lang.reflect.Method.invokeNative(Native Method) at java.lang.reflect.Method.invoke(Method.java:515) at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:779) at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:595) at de.robv.android.xposed.XposedBridge.main(XposedBridge.java:132) at dalvik.system.NativeStart.main(Native Method)
Reporter | ||
Comment 1•10 years ago
|
||
Single crash reported on a Tab 10.1 (GT-P7510)
status-firefox32:
--- → affected
Comment 2•9 years ago
|
||
Crash on a Kindle Fire running Nightly: https://crash-stats.mozilla.com/report/index/00d33d74-0b26-4ba2-98b0-db2c12140627
status-firefox33:
--- → affected
Comment 3•9 years ago
|
||
This crash could occur if: * topCursor is null. I added a sanity check for this which will make the exception more obvious. This could only occur if something odd happened in the SQLite layer. * pinnedPositions is null: e.g., if moveToPosition was called after close (or if my reading of the constructor is wrong). This patch does the following: * Fixes all warnings, typos, and style errors. * Adds a few sanity checks. * Makes pinnedPositions final and inits it during construction, rather than as a side-effect of something that happens during construction. We never null it out, only clear it.
Attachment #8447523 -
Flags: review?(michael.l.comella)
Updated•9 years ago
|
Assignee: nobody → rnewman
Status: NEW → ASSIGNED
Comment 4•9 years ago
|
||
Comment on attachment 8447523 [details] [diff] [review] Proposed patch. v1 Review of attachment 8447523 [details] [diff] [review]: ----------------------------------------------------------------- This patch prevents NullPointerExceptions, yes, but does it actually solve the logic error? When is pinnedPositions getting used such that it throws a NullPointerException? After `moveToPosition` was called after `close()`, as you mentioned in comment 3? Then shouldn't we ensure close was not called as the first step in `moveToPosition`? It sounds like there is more going on here. If you think the issue's solved, please elaborate. ::: mobile/android/base/db/TopSitesCursorWrapper.java @@ +251,5 @@ > map = suggestedIndexes; > break; > + > + default: > + return -1; You can remove the null check and `return -1` below this.
Comment 5•9 years ago
|
||
(In reply to Michael Comella (:mcomella) from comment #4) > Comment on attachment 8447523 [details] [diff] [review] > Proposed patch. v1 > > Review of attachment 8447523 [details] [diff] [review]: > ----------------------------------------------------------------- > > This patch prevents NullPointerExceptions, yes, but does it actually solve > the logic error? > When is pinnedPositions getting used such that it throws a > NullPointerException? This bug is somewhat complicated because evidently code is being inlined (there's nothing in that section of the method that would directly crash with an NPE), and this is evidently not a common flow -- four crashes between Nightly and Aurora. The commonality between lines 311 and 313 is that pinnedPositions is touched by both called methods, and is thus the null culprit. So I deduce that either: 1. The cursor is being closed then used, or 2. updatePinnedPositions is never called. > After `moveToPosition` was called after `close()`, as > you mentioned in comment 3? Then shouldn't we ensure close was not called as > the first step in `moveToPosition`? uPP is called by requery and in the constructor, so the only way pinnedPositions can be null is if the cursor was closed then used. My guess, then, is that SimpleCursorLoader -- literally the only consumer of TopSitesCursorWrapper -- is buggy, or Android is buggy. SCL does a lot of cursor closing and close checking, and the checks are worrisome.
Comment 6•9 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #5) > (In reply to Michael Comella (:mcomella) from comment #4) > > Comment on attachment 8447523 [details] [diff] [review] > > Proposed patch. v1 > > > > Review of attachment 8447523 [details] [diff] [review]: > > ----------------------------------------------------------------- > > > > This patch prevents NullPointerExceptions, yes, but does it actually solve > > the logic error? > > When is pinnedPositions getting used such that it throws a > > NullPointerException? > > > This bug is somewhat complicated because evidently code is being inlined > (there's nothing in that section of the method that would directly crash > with an NPE), and this is evidently not a common flow -- four crashes > between Nightly and Aurora. > > The commonality between lines 311 and 313 is that pinnedPositions is touched > by both called methods, and is thus the null culprit. > > So I deduce that either: > > 1. The cursor is being closed then used, or > 2. updatePinnedPositions is never called. > > > After `moveToPosition` was called after `close()`, as > > you mentioned in comment 3? Then shouldn't we ensure close was not called as > > the first step in `moveToPosition`? > > uPP is called by requery and in the constructor, so the only way > pinnedPositions can be null is if the cursor was closed then used. My guess, > then, is that SimpleCursorLoader -- literally the only consumer of > TopSitesCursorWrapper -- is buggy, or Android is buggy. SCL does a lot of > cursor closing and close checking, and the checks are worrisome. FWIW, the stack trace tells us that this crash happens during just after a fling gesture. The fling implementation in ListView involves posting runnables in the main Looper handler. These runnables are cancelled in several situations, including when the view is detached from the window, for example. My informed guess here is that there's a very short window of time between the cursor being closed and the next fling runnable being executed that might cause the adapter's getView() to be called after the cursor has been closed and before the view is actually detached.
Comment 7•9 years ago
|
||
Looking at the implementation of CursorAdapter#getView (which we don't override), it looks kinda like there's no sane action we can take here: public View getView(int position, View convertView, ViewGroup parent) { if (!mDataValid) { throw new IllegalStateException("this should only be called when the cursor is valid"); } if (!mCursor.moveToPosition(position)) { throw new IllegalStateException("couldn't move cursor to position " + position); } ... That is, by the time we get to having getView be called, our only options are to throw or return a bound view. So the only real fix for this bug is to make sure that we're not trying to make views from closed cursors at all -- higher up the stack. Lucas, can you suggest the right fix? And are you happy with the small fixes I describe in Comment 3, which are in this patch, as an additional layer?
Updated•9 years ago
|
Flags: needinfo?(lucasr.at.mozilla)
Comment 9•9 years ago
|
||
Comment on attachment 8447523 [details] [diff] [review] Proposed patch. v1 Clearing r?: waiting for more info.
Attachment #8447523 -
Flags: review?(michael.l.comella)
Comment 10•9 years ago
|
||
Got into this crash on Firefox 34 Beta 10. Device: Samsung Galaxy R (Android 2.3.4).
status-firefox34:
--- → affected
Comment 11•9 years ago
|
||
Michael, do you have an updated opinion on this?
Flags: needinfo?(lucasr.at.mozilla) → needinfo?(michael.l.comella)
Comment 12•9 years ago
|
||
I think this is important enough that someone should work on it -- 8 crashes in the past month on Nightly, 1200+ crashes in the last month on Release, making this our #32 crasher and risihg -- but I don't have the time to dig into it. Michael, consider taking this over?
Assignee: rnewman → nobody
Status: ASSIGNED → NEW
Updated•9 years ago
|
status-firefox35:
--- → affected
status-firefox36:
--- → affected
status-firefox37:
--- → affected
status-firefox38:
--- → affected
Updated•9 years ago
|
Assignee: nobody → michael.l.comella
Flags: needinfo?(michael.l.comella)
Comment 13•9 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #7) > That is, by the time we get to having getView be called, our only options > are to throw or return a bound view. Assuming the problem is actually that we're closing cursors when we shouldn't, an escape hatch with questionable side effects could be overriding the newView and bindView methods in our CursorAdapter implementations that use TopSitesCursorWrapper to ignore a closed cursor and return a dummy view. Context (CursorAdapter.getView): 241 public View getView(int position, View convertView, ViewGroup parent) { 242 if (!mDataValid) { 243 throw new IllegalStateException("this should only be called when the cursor is valid"); 244 } 245 if (!mCursor.moveToPosition(position)) { 246 throw new IllegalStateException("couldn't move cursor to position " + position); 247 } 248 View v; 249 if (convertView == null) { 250 v = newView(mContext, mCursor, parent); 251 } else { 252 v = convertView; 253 } 254 bindView(v, mContext, mCursor); 255 return v; But let's try not to close those Cursors.
Comment 14•9 years ago
|
||
* This method is called *twice* in some circumstances. [1] ... protected void onLoadFinishedAfterTransitions(Loader<Cursor> loader, Cursor c) { I wonder if the first call closes the Cursor and the second throws the Exception in a race condition. [1]: https://mxr.mozilla.org/mozilla-central/source/mobile/android/base/home/TopSitesPanel.java#701
Comment 15•9 years ago
|
||
This is difficult to diagnose so my plan is as follows: we land Richard's patch to get more information. If we start getting closed Cursor errors instead, then Lucas' analysis in comment 6 is likely. A workaround would be my suggestion in comment 13 (except override getView instead of newView or bindView), though it might be worthwhile to dig into further. It's difficult because we have to determine where the LoaderManager code releases the Cursors (i.e. the Loader's deliverResult, onReset, and onCancelled) runs in relation to where the FlingRunnable is removed from the runnable queue, i.e. when the window is detached. If I analyzed this correctly and the stack trace is not modified further up the stack (e.g. catch and throw new), after Richard's patch we should start getting closed Cursor errors from: SQLiteClosable.acquireReference [1] SQLiteQuery.fillWindow [2] SQLiteCursor.fillWindow [3] SQLiteCursor.getCount [4] / SQLiteCursor.onMove [5] AbstractCursor.moveToPosition [6] Since mQuery gets closed when a SQLiteCursor is closed [7] which releases the final reference in the SQLiteQuery extends SQLiteProgram extends SQLiteClosable [8]. Note that the stack trace from comment 0 indicates this bug only occurs on the mListAdapter (note: I have not looked at the other traces to see if it affects the Grid but I don't see why it shouldn't). An alternative solution is to simplify the code by replacing the TopSitesPanel with the speed dial, as we previously discussed. [1]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/sqlite/SQLiteClosable.java#52 [2]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/sqlite/SQLiteQuery.java#57 [3]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/sqlite/SQLiteCursor.java#138 [4]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/sqlite/SQLiteCursor.java#131 [5]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/sqlite/SQLiteCursor.java#120 [6]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/AbstractCursor.java#195 [7]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/sqlite/SQLiteCursor.java#205 [8]: http://androidxref.com/4.4.4_r1/xref/frameworks/base/core/java/android/database/sqlite/SQLiteClosable.java#105
Comment 16•9 years ago
|
||
Comment on attachment 8447523 [details] [diff] [review] Proposed patch. v1 r+, noting that it does not fix the bug.
Attachment #8447523 -
Flags: review+
Comment 17•9 years ago
|
||
Updated the description; author is still Richard.
Updated•9 years ago
|
Attachment #8447523 -
Attachment is obsolete: true
Comment 18•9 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/03487443c9e5 Aaron, this bug is not fixed but I expect the crash signature will change to something like: IllegalStateException("attempt to re-open an already-closed object: " + this); Do you mind letting me know if you see this signature, or anything similar (regarding closed Cursors or TopSitesCursorWrapper)? Also, how would you recommend I watch for these signatures myself?
Flags: needinfo?(aaron.train)
Keywords: leave-open
Updated•9 years ago
|
Attachment #8558799 -
Flags: review+
Reporter | ||
Comment 19•9 years ago
|
||
(In reply to Michael Comella (:mcomella) from comment #18) > https://hg.mozilla.org/integration/fx-team/rev/03487443c9e5 > > Aaron, this bug is not fixed but I expect the crash signature will change to > something like: > > IllegalStateException("attempt to re-open an already-closed object: " + > this); > > Do you mind letting me know if you see this signature, or anything similar > (regarding closed Cursors or TopSitesCursorWrapper)? > > Also, how would you recommend I watch for these signatures myself? Sure, I can watch this. I visit https://crash-stats.mozilla.com, Firefox for Android, 38.0a1, Top Crashers, [Type All, Days: 1] and then double click on first appearance sort.
Flags: needinfo?(aaron.train)
Comment 21•8 years ago
|
||
Unassigning because I'm not actively working on this. I can't find the signature mentioned in comment 18 and the steps in comment 19 didn't help me to find it. I also searched for the signature directly: https://crash-stats.mozilla.com/search/?signature=~attempt+to+re-open&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
Assignee: michael.l.comella → nobody
Updated•8 years ago
|
Crash Signature: [@ java.lang.NullPointerException: at org.mozilla.gecko.db.TopSitesCursorWrapper.moveToPosition(TopSitesCursorWrapper.java)] → [@ java.lang.NullPointerException: at org.mozilla.gecko.db.TopSitesCursorWrapper.moveToPosition(TopSitesCursorWrapper.java)]
[@ java.lang.NullPointerException: at org.mozilla.gecko.db.TopSitesCursorWrapper.moveToPosition]
Comment 22•8 years ago
|
||
I opened the doorhanger and maybe did some scrolling on Google Maps when it crashed. NI self to investigate.
Flags: needinfo?(michael.l.comella)
Comment 23•8 years ago
|
||
I think the work in bug 760956 may fix this.
Flags: needinfo?(michael.l.comella)
Comment 24•5 years ago
|
||
No recent crashes in crash stats.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE
Updated•5 years ago
|
Keywords: leave-open
Assignee | ||
Updated•3 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•