Closed Bug 1492706 Opened 11 months ago Closed 8 months ago

OOM during GeckoApp.restoreSessionTabs

Categories

(Firefox for Android :: Session Restore, defect, P3, critical)

All
Android
defect

Tracking

()

VERIFIED FIXED
Firefox 66
Tracking Status
firefox64 --- wontfix
firefox65 --- fixed
firefox66 --- fixed

People

(Reporter: jchen, Assigned: JanH)

References

Details

(Keywords: crash)

Crash Data

Attachments

(2 files)

This bug was filed from the Socorro interface and is
report bp-46f9bda2-3a74-4915-ad86-6f2fb0180919.
=============================================================

Maybe we should handle OOM situations more gracefully during session restore.

Java stack trace:

java.lang.OutOfMemoryError
	at java.lang.AbstractStringBuilder.enlargeBuffer(AbstractStringBuilder.java:94)
	at java.lang.AbstractStringBuilder.append0(AbstractStringBuilder.java:132)
	at java.lang.StringBuilder.append(StringBuilder.java:124)
	at org.json.JSONStringer.string(JSONStringer.java:344)
	at org.json.JSONStringer.value(JSONStringer.java:252)
	at org.json.JSONObject.writeTo(JSONObject.java:672)
	at org.json.JSONStringer.value(JSONStringer.java:237)
	at org.json.JSONArray.writeTo(JSONArray.java:602)
	at org.json.JSONStringer.value(JSONStringer.java:233)
	at org.json.JSONObject.writeTo(JSONObject.java:672)
	at org.json.JSONStringer.value(JSONStringer.java:237)
	at org.json.JSONArray.writeTo(JSONArray.java:602)
	at org.json.JSONStringer.value(JSONStringer.java:233)
	at org.json.JSONObject.writeTo(JSONObject.java:672)
	at org.json.JSONStringer.value(JSONStringer.java:237)
	at org.json.JSONArray.writeTo(JSONArray.java:602)
	at org.json.JSONStringer.value(JSONStringer.java:233)
	at org.json.JSONObject.writeTo(JSONObject.java:672)
	at org.json.JSONStringer.value(JSONStringer.java:237)
	at org.json.JSONArray.writeTo(JSONArray.java:602)
	at org.json.JSONStringer.value(JSONStringer.java:233)
	at org.json.JSONObject.writeTo(JSONObject.java:672)
	at org.json.JSONStringer.value(JSONStringer.java:237)
	at org.json.JSONArray.writeTo(JSONArray.java:602)
	at org.json.JSONStringer.value(JSONStringer.java:233)
	at org.json.JSONObject.writeTo(JSONObject.java:672)
	at org.json.JSONObject.toString(JSONObject.java:641)
	at org.mozilla.gecko.GeckoApp.restoreSessionTabs(GeckoApp.java:1637)
	at org.mozilla.gecko.GeckoApp.access$200(GeckoApp.java:113)
	at org.mozilla.gecko.GeckoApp$9.run(GeckoApp.java:1153)
	at android.os.Handler.handleCallback(Handler.java:733)
	at android.os.Handler.dispatchMessage(Handler.java:95)
	at android.os.Looper.loop(Looper.java:136)
	at org.mozilla.gecko.util.GeckoBackgroundThread.run(GeckoBackgroundThread.java:43)
It looks as if in 64 a single install is generating the crashes.
Hmm, we've also had stack overflows when parsing the session string into JSONObjects in the first place (bug 1352149) and now apparently also crashes for the reverse procedure, i.e. turning the JSONObjects back into a String.
Although some of those crashes apparently happen before even that stage, when we simply try reading the file into a String [1] and don't do any JSON-specific processing yet.

Other than just catching those errors and treating it as a session restore failure, I wonder whether it might be possible to use a streaming JSON parser here (in fact that was suggested by kats way back in bug 836948 comment 2, but unfortunately never got anywhere)?
The limiting factor is that for the session restore on startup, we still need to turn the whole session data back into a string for sending to Gecko, so in the worst case a streaming parser might avoid OOMs while reading and parsing the file, but still run into trouble when turning the processed session data back into a single, large JSON string. Unfortunately I've got no idea how likely that scenario would be, i.e. how much a streaming JSON parser would really buy us.

[1] https://dxr.mozilla.org/mozilla-central/rev/90853a59691e0c77014c38049bfb38b2b6ca1e16/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#762
Priority: -- → P3
Duplicate of this bug: 1509818
Assignee: nobody → jh+bugzilla
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
We just treat this like a defective session store file and first fall back to
the backup (although if the OOM is caused by a too-large file, it is likely that
the backup will be too large as well) and then turn of session restoring
completely.

We don't plug those failures into the session restore telemetry, though, because
that is supposed to only cover truly defective files.
Looking at Crash Stats, the most common causes of OOMs involving the RecentTabs-
Adapter happen while reading the previous session store file into memory for
parsing, respectively while stringifying the parsed data back into a flat String
for further storage.

In the former case, we give up completely, because there's nothing we can do
short of switching to a streaming JSON parser (which is out of scope for this
bug), while in the latter case, we only skip the affected tab in the hope that
at least some tabs might be small enough to not cause an OOM.
Duplicate of this bug: 1512035
Crash Signature: [@ java.lang.OutOfMemoryError: at java.lang.AbstractStringBuilder.enlargeBuffer(AbstractStringBuilder.java)] [@ java.lang.OutOfMemoryError: at java.lang.String.<init>(String.java)] → [@ java.lang.OutOfMemoryError: at java.lang.AbstractStringBuilder.enlargeBuffer(AbstractStringBuilder.java)] [@ java.lang.OutOfMemoryError: at java.lang.StringBuilder.toString(StringBuilder.java)] [@ java.lang.OutOfMemoryError: at java.lang.String.<init…
Flags: needinfo?(nalexander)
Flags: needinfo?(nalexander)
Pushed by mozilla@buttercookie.de:
https://hg.mozilla.org/integration/autoland/rev/bfc4d495548f
Part 1: Catch OOM during startup session restore. r=nalexander
https://hg.mozilla.org/integration/autoland/rev/acefb73b4d60
Part 2: Cover common OOM causes in the Recent tabs panel. r=nalexander
https://hg.mozilla.org/mozilla-central/rev/bfc4d495548f
https://hg.mozilla.org/mozilla-central/rev/acefb73b4d60
Status: REOPENED → RESOLVED
Closed: 9 months ago8 months ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 66
Please nominate this for Beta approval when you get a chance.
Flags: needinfo?(jh+bugzilla)
Comment on attachment 9027638 [details]
Bug 1492706 - Part 1: Catch OOM during startup session restore. r?nalexander

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Session restore

User impact if declined: If memory is tight, large amounts of serialised session store data can cause an OOM crash.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: No

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: Part 2

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Just adding try-catch statements for OOMErrors for the relevant bits so we no longer crash and instead just skip the too-large data.

String changes made/needed: none
Flags: needinfo?(jh+bugzilla)
Attachment #9027638 - Flags: approval-mozilla-beta?
Comment on attachment 9027638 [details]
Bug 1492706 - Part 1: Catch OOM during startup session restore. r?nalexander

Avoiding an OOM sounds good. Let's uplift now and maybe we will see results with next week's Fennec beta 7 build.
Attachment #9027638 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Based on the comments above as no manual QE is needed marking it as Verified.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.