Closed Bug 1256049 Opened 4 years ago Closed 4 years ago

crash in mozilla::dom::indexedDB::`anonymous namespace''::DatabaseConnection::Close

Categories

(Core :: DOM: IndexedDB, defect, critical)

46 Branch
All
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox46 + fixed
firefox47 + fixed
firefox48 + fixed

People

(Reporter: mats, Assigned: khuey)

References

Details

(Keywords: crash, regression, Whiteboard: btpp-active)

Crash Data

Attachments

(1 file, 4 obsolete files)

[Tracking Requested - why for this release]:

This bug was filed from the Socorro interface and is 
report bp-9fa004dd-a325-4820-ad19-3f1832160312.
=============================================================

About 200 reported crashes in the past 28 days, all on Windows.
Looks like a regression in v46.

Stack:
mozilla::dom::indexedDB::`anonymous namespace'::DatabaseConnection::Close()
mozilla::dom::indexedDB::`anonymous namespace'::ConnectionPool::CloseConnectionRunnable::Run()
nsThread::ProcessNextEvent(bool, bool*)
NS_ProcessNextEvent(nsIThread*, bool)
mozilla::dom::indexedDB::`anonymous namespace'::ConnectionPool::ThreadRunnable::Run()
nsThread::ProcessNextEvent(bool, bool*)
NS_ProcessNextEvent(nsIThread*, bool)
mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)
MessageLoop::RunHandler()
MessageLoop::Run()
nsThread::ThreadFunc(void*)
PR_NativeRunThread
This is about #40 on the beta 46 topcrash list. 47 and 48 are affected too.  
I'd like to find an owner to look into this bug but I'm not sure we need to track it for 46. We could still  uplift it if there is a good fix though.
Sure does look like a regression in 46 though. I added the keyword.
Keywords: regression
Ugh I feel too guilty not tracking it. Maybe we can find a regression range.
I made this assertion fatal in bug 1195149 so we could see it.
Attached patch Diagnostics (obsolete) — Splinter Review
This is evil.
Attachment #8731494 - Flags: review?(jvarga)
Comment on attachment 8731494 [details] [diff] [review]
Diagnostics

Review of attachment 8731494 [details] [diff] [review]:
-----------------------------------------------------------------

Good old debugging methods.
Attachment #8731494 - Flags: review?(jvarga) → review+
3 crashes have happened so far with that diagnostic.  All of them crashed at 0x5, pointing to ConnectionPool::CloseDatabaseWhenIdleInternal.
Attachment #8732973 - Flags: review?(jvarga) → review+
Assignee: nobody → khuey
Ok, our premature close is getting triggered from Database::CloseInternal, which calls ConnectionPool::CloseDatabaseWhenIdle, which seems to believe we are already idle.  Either we're not, or we get a late transaction constructor.  Either way, we have assertions, which we should enable.
Attached patch Part 1: Revert previous patches (obsolete) — Splinter Review
Attachment #8731494 - Attachment is obsolete: true
Attachment #8732973 - Attachment is obsolete: true
Attachment #8734604 - Flags: review?(jvarga)
Attachment #8734604 - Flags: review?(jvarga) → review+
Attachment #8734605 - Flags: review?(jvarga) → review+
Whiteboard: btpp-active
I don't see any crashes on Nightly after this landed, though maybe it is too early to say for sure. Is this ready to be backported?
(In reply to Andrew McCreight [:mccr8] from comment #20)
> I don't see any crashes on Nightly after this landed, though maybe it is too
> early to say for sure. Is this ready to be backported?

These patches are not real fixes, just helping to identify the issue.
Yeah, all I did was chase this signature elsewhere.
Backed out because the assertion occurs so often that many mochitests fail.
https://hg.mozilla.org/integration/mozilla-inbound/rev/76134d6ad636
Flags: needinfo?(khuey)
Well, that answers my question.
Flags: needinfo?(khuey)
Attached patch PatchSplinter Review
This should do it, I think.
Attachment #8734604 - Attachment is obsolete: true
Attachment #8734605 - Attachment is obsolete: true
Attachment #8737985 - Flags: review?(jvarga)
Comment on attachment 8737985 [details] [diff] [review]
Patch

Review of attachment 8737985 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me. I tested it locally on my mac, no failures.
Attachment #8737985 - Flags: review?(jvarga) → review+
Blocks: 1172822
ni? myself to look at crash stats next week.
Flags: needinfo?(khuey)
Comment on attachment 8737985 [details] [diff] [review]
Patch

This is looking good so far. I'd like to take it to Aurora.

Approval Request Comment
[Feature/regressing bug #]: Possibly Bug 858680
[User impact if declined]: More crashes
[Describe test coverage new/current, TreeHerder]: IDB is pretty well tested in automation.
[Risks and why]: Low risk. This is a pretty simple patch and is unlikely to mess anything up.
[String/UUID change made/needed]: N/A
Attachment #8737985 - Flags: approval-mozilla-aurora?
Flags: needinfo?(khuey)
Keywords: leave-open
Crash Signature: [@ mozilla::dom::indexedDB::`anonymous namespace''::DatabaseConnection::Close] → [@ mozilla::dom::indexedDB::`anonymous namespace''::DatabaseConnection::Close] [@ PLDHashTable::Search | nsInterfaceHashtable<T>::Get]
https://hg.mozilla.org/mozilla-central/rev/03afa06c1272
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Hi Kyle, I was trying to check whether the crash signatures have appeared since this landed in Nightly before taking this to Aurora 47. If I am not wrong, this fix should have been in 4/13 Nightly build. Right? I was not able to find the first signature but the 2nd crash sig is still occurring on 4/13 Nightly build: https://crash-stats.mozilla.com/report/index/89a9a6c1-7cc7-43c3-85ee-941cf2160413. Am I reading the data wrong? Please let me know.
Flags: needinfo?(khuey)
(In reply to Ritu Kothari (:ritu) from comment #39)
> Hi Kyle, I was trying to check whether the crash signatures have appeared
> since this landed in Nightly before taking this to Aurora 47. If I am not
> wrong, this fix should have been in 4/13 Nightly build. Right? I was not
> able to find the first signature but the 2nd crash sig is still occurring on
> 4/13 Nightly build:
> https://crash-stats.mozilla.com/report/index/89a9a6c1-7cc7-43c3-85ee-
> 941cf2160413. Am I reading the data wrong? Please let me know.

The "[@ PLDHashTable::Search | nsInterfaceHashtable<T>::Get ]" signature is a very generic one that can match lots of crashes. I've filed bug 1263781 to get an extra frame in that signature.

The crash report you cited appears to have a different cause. It doesn't involve DB connection closing, AFAICT.
Flags: needinfo?(khuey)
That crash looks related.  We did fix an actual problem here, and I'd like to get that into Aurora, even if there might be more things that still need to be fixed.
Kyle what do you think about taking this for beta 11? I would like to bring down our crash rate and this looks promising. If you think it's too risky then please wontfix for 46 though.
Flags: needinfo?(khuey)
I think the downside risk here is limited to "doesn't fix all the crashes", so I'd be comfortable taking it for 46 if desired.
Flags: needinfo?(khuey)
For posterity, this is number 35 and 0.32% of browser process crashes on 46.b.
Fixing some of the crashes sounds great to me.  Kyle can you request beta uplift?
Flags: needinfo?(khuey)
Flags: needinfo?(khuey)
Attachment #8737985 - Flags: approval-mozilla-beta?
Comment on attachment 8737985 [details] [diff] [review]
Patch

Crash fixes, looks good to me, and ritu says OK for aurora. Please uplift to aurora and beta, hoping to make it into beta 11.
Attachment #8737985 - Flags: approval-mozilla-beta?
Attachment #8737985 - Flags: approval-mozilla-beta+
Attachment #8737985 - Flags: approval-mozilla-aurora?
Attachment #8737985 - Flags: approval-mozilla-aurora+
And to be precise, the changeset to be uplifted is in comment 30.  None of the other things that landed should be uplifted.
You need to log in before you can comment on or make changes to this bug.