Crash in java.lang.IllegalStateException: Trying to start session but it is already started at org.mozilla.gecko.telemetry.measurements.SessionMeasurements.recordSessionStart(SessionMeasurements.java)

VERIFIED FIXED in Firefox 60

Status

()

defect
P1
critical
VERIFIED FIXED
2 years ago
Last year

People

(Reporter: ting, Assigned: vlad.baicu)

Tracking

({crash})

Trunk
Firefox 61
Unspecified
Android
Points:
---

Firefox Tracking Flags

(firefox58 wontfix, firefox59 wontfix, firefox60 fixed, firefox61+ verified)

Details

(Whiteboard: [FNC][SPT59.5][MVP], crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

This bug was filed from the Socorro interface and is 
report bp-e9a9e7f0-7e4c-4fe3-bd57-8179a0171002.
=============================================================
#4 top crash of Nightly 20171001100334 on Android. The signature has been there for a while, there are 12226 reports in the past 6 months.
Any ideas?
Flags: needinfo?(michael.l.comella)
Nevin, could the core fennec team take a look at this? I have my hands full with AS right now. Feel free to NI me if you get stuck!
Flags: needinfo?(michael.l.comella) → needinfo?(cnevinchen)
Please help prioritize this.
Assignee: nobody → cnevinchen
Flags: needinfo?(wehuang)
Flags: needinfo?(jcheng)
Flags: needinfo?(cnevinchen)
#12 in 57.0.4 while still seen in 59.0a1 (#40 top crasher) today so worth checking, adding to 59.5 sprint planning.
Flags: needinfo?(wehuang)
Whiteboard: [FNC][SPT59.5][MVP]
The fix will be similar to bug 1317191. I'll update the patch later.
This is the #1 Fennec topcrash in Nightly 20180202103243.
[triage] Top 25 top crasher - will revisit from top crashers list.
Assignee: cnevinchen → nobody
Priority: -- → P1
Updating status flags - this crash is #4 top browser crash on 61.

One interesting comment: "It breaks when I'm in webmail google on android and request permission to record audio and record video (access to microphone and camera) and I deny.... "
Nevin, can you please take another look at this?
Vlad - this is top crasher that QA would like fixed sooner rather than later - can one of you three please pick it up ASAP.
Flags: needinfo?(vlad.baicu)
I see that Nevin already had submitted a patch for this issue. I have been unable to reproduce but it appears that these crashes occur whenever onResume is triggered twice without an onPause call in between. I have remade the patch with a slight addition and it should resolve the issue.
Flags: needinfo?(vlad.baicu)
Assignee: nobody → vlad.baicu
Comment on attachment 8970919 [details]
Bug 1405192 - Return if onResume has already been called and onPause hasn't. Add telemetry ping for future investigations.

https://reviewboard.mozilla.org/r/239676/#review245970

> After investigating and analyzing the crash reports this type of scenario can occur due to multi-window or some popups.

This is good info! I'd actually recommend putting this directly into the comments since it's less likely to be discovered in extended commit summaries!

I don't think it's worth the turn-around time to add it but if you wanted to file a follow-up to add it, I could give it a quick r+.

---

Thinking about this problem, we have an assumption in the session measurements that onResume -> onPause will always be called in order. That assumption is clearly wrong so I wonder if instead of band-aiding it in TelemetryDelegate, we should just remove that assumption. That being said, this seems like it'd work to fix the crash and will give us reliable data for the (presumed) majority who don't have this misorder issue so it wfm.
Attachment #8970919 - Flags: review?(michael.l.comella) → review+
fwiw, we do have telemetry docs for UI telemetry but I don't think anyone updates them: https://firefox-source-docs.mozilla.org/mobile/android/fennec/uitelemetry.html
Pushed by michael.l.comella@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/a64ffd8dd132
Return if onResume has already been called and onPause hasn't. Add telemetry ping for future investigations. r=mcomella
https://hg.mozilla.org/mozilla-central/rev/a64ffd8dd132
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → Firefox 61
No crashes on 61 since this landed, so marking that as verified. Vlad or Susheel, can you please nominate this patch for Beta approval so we can get this into 60 still? Thanks!
Status: RESOLVED → VERIFIED
Flags: needinfo?(vlad.baicu)
Flags: needinfo?(sdaswani)
Flags: needinfo?(cnevinchen)
Attachment #8941757 - Attachment is obsolete: true
Comment on attachment 8970919 [details]
Bug 1405192 - Return if onResume has already been called and onPause hasn't. Add telemetry ping for future investigations.

Approval Request Comment
[Feature/Bug causing the regression]:
[User impact if declined]:
[Is this code covered by automated tests?]:
[Has the fix been verified in Nightly?]:
[Needs manual test from QE? If yes, steps to reproduce]: 
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]:
[Why is the change risky/not risky?]:
[String changes made/needed]:
Flags: needinfo?(sdaswani)
Attachment #8970919 - Flags: approval-mozilla-beta?
Comment on attachment 8970919 [details]
Bug 1405192 - Return if onResume has already been called and onPause hasn't. Add telemetry ping for future investigations.

I'll let Vlad do it so he can fill out the questions appropriately.
Comment on attachment 8970919 [details]
Bug 1405192 - Return if onResume has already been called and onPause hasn't. Add telemetry ping for future investigations.

In the interest of time I'll approve for 60.0b17 and 60.0rc now.  We should still have the form filled out for the record though, so leaving the ni on Vlad.
Attachment #8970919 - Flags: approval-mozilla-release+
Attachment #8970919 - Flags: approval-mozilla-beta?
Attachment #8970919 - Flags: approval-mozilla-beta+
(In reply to Julien Cristau [:jcristau] from comment #21)
> Comment on attachment 8970919 [details]
> Bug 1405192 - Return if onResume has already been called and onPause hasn't.
> Add telemetry ping for future investigations.
> 
> In the interest of time I'll approve for 60.0b17 and 60.0rc now.  We should
> still have the form filled out for the record though, so leaving the ni on
> Vlad.

Thanks I forgot he is on PTO but he will fill the form when he is back.
Approval Request Comment
[Feature/Bug causing the regression]: onResume called more than once with no onPause call in between. Examples of such occurences include usage of multi-window or some popups.
[User impact if declined]: The bug will continue to be a top crasher.
[Is this code covered by automated tests?]: Not that I am aware of.
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]:  No
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: We are simply adding a check in case this event happens.
[String changes made/needed]: None
Flags: needinfo?(vlad.baicu)
You need to log in before you can comment on or make changes to this bug.