Open Bug 1700402 Opened 4 years ago Updated 9 months ago

Assertion failure: !mPendingRequestCount, at src/dom/indexedDB/IDBTransaction.cpp:134

Tracking

()

Status:

ASSIGNED

Tracking Flags:

Tracking

Status

firefox-esr102

---

affected

firefox88

---

wontfix

firefox89

---

wontfix

firefox103

---

wontfix

firefox104

---

affected

firefox105

---

affected

People

(Reporter: tsmith, Assigned: jstutte)

References

(Blocks 2 open bugs)

Details

(Keywords: assertion)

Attachments

(1 file, 1 obsolete file)

Bug 1700402 - Ensure we sweep through all shutdown phases on nsAppStartup::Quit r?dthayer 3 years ago Jens Stutte [:jstutte] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1700402 - Immediately fake transaction abort on request actor shutdown kill. r?#dom-storage-reviewers 3 years ago Jens Stutte [:jstutte] 48 bytes, text/x-phabricator-request		Details \| Review

Tyson Smith [:tsmith]

Reporter

Description

•

4 years ago

Found while fuzzing m-c 20210320-71b483d7ca07 (--enable-debug --enable-fuzzing)

At this time a reduced test case is not available.

Assertion failure: !mPendingRequestCount, at src/dom/indexedDB/IDBTransaction.cpp:134

#0 0x7fb791a6a1fe in mozilla::dom::IDBTransaction::~IDBTransaction() src/dom/indexedDB/IDBTransaction.cpp:134:3
#1 0x7fb791a6a450 in mozilla::dom::IDBTransaction::~IDBTransaction() src/dom/indexedDB/IDBTransaction.cpp:132:35
#2 0x7fb78d887cc1 in SnowWhiteKiller::MaybeKillObject(SnowWhiteKiller::SnowWhiteObject&) src/xpcom/base/nsCycleCollector.cpp:2434:29
#3 0x7fb78d87cc34 in SnowWhiteKiller::~SnowWhiteKiller() src/xpcom/base/nsCycleCollector.cpp:2421:7
#4 0x7fb78d87c335 in nsCycleCollector::FreeSnowWhite(bool) src/xpcom/base/nsCycleCollector.cpp:2611:3
#5 0x7fb78d8806d3 in nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*) src/xpcom/base/nsCycleCollector.cpp:3592:3
#6 0x7fb78d88028c in nsCycleCollector::Collect(ccType, js::SliceBudget&, nsICycleCollectorListener*, bool) src/xpcom/base/nsCycleCollector.cpp:3415:9
#7 0x7fb78d880037 in nsCycleCollector::ShutdownCollect() src/xpcom/base/nsCycleCollector.cpp:3358:20
#8 0x7fb78d881286 in nsCycleCollector::Shutdown(bool) src/xpcom/base/nsCycleCollector.cpp:3653:5
#9 0x7fb78d882cf2 in nsCycleCollector_shutdown(bool) src/xpcom/base/nsCycleCollector.cpp:3968:18
#10 0x7fb78d9a4824 in mozilla::ShutdownXPCOM(nsIServiceManager*) src/xpcom/build/XPCOMInit.cpp:706:3
#11 0x7fb793877c7c in XRE_TermEmbedding() src/toolkit/xre/nsEmbedFunctions.cpp:212:3
#12 0x7fb78e29d02e in mozilla::ipc::ScopedXREEmbed::Stop() src/ipc/glue/ScopedXREEmbed.cpp:90:5
#13 0x7fb793878372 in XRE_InitChildProcess(int, char**, XREChildData const*) src/toolkit/xre/nsEmbedFunctions.cpp:737:16
#14 0x55edade44fb6 in content_process_main src/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
#15 0x55edade44fb6 in main src/browser/app/nsBrowserApp.cpp:309:18
#16 0x7fb7a321f0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
#17 0x55edade22d5c in _start (/home/worker/builds/m-c-20210321093903-fuzzing-debug/firefox-bin+0x14d5c)

Tyson Smith [:tsmith]

Reporter

Comment 1

•

4 years ago

A Pernosco session is available here: https://pernos.co/debug/STsVEF1XXd-aCd43eDPsXA/index.html

Jens Stutte [:jstutte]

Assignee

Comment 2

•

4 years ago

•

Edited

It seems we have 2 pending requests on the transaction when we get into ActorDestroy. But we call mTransaction->OnRequestFinished just once. We should call it until mPendingRequestCount is 0 (or provide a different OnRequestAbort function that does this) to force the commit?

Jens Stutte [:jstutte]

Assignee

Updated

•

4 years ago

Severity: -- → S3

Priority: -- → P2

Jens Stutte [:jstutte]

Assignee

Comment 3

•

4 years ago

(In reply to Jens Stutte [:jstutte] from comment #2)

It seems we have 2 pending requests on the transaction when we get into ActorDestroy. But we call mTransaction->OnRequestFinished just once. We should call it until mPendingRequestCount is 0 (or provide a different OnRequestAbort function that does this) to force the commit?

Taking another look at the code it seems to me, that we already have an explicit shutdown abort handling based on aRequestCompletedSuccessfully (which is false in the pernosco run due to aWhy == NormalShutdown, triggering thus the mSentCommitOrAbort.Flip() and marking the transaction as aborted). I assume, that we should do something with mPendingRequestCount in that case, too?

I also wonder, if we should check the abort condition before we decrement mPendingRequestCount? After all we are not calling OnRequestFinishedfrom a foreseen code path here but to handle the abort on shutdown?

Jens Stutte [:jstutte]

Assignee

Updated

•

4 years ago

Flags: needinfo?(bugmail)

Pascal Chevrel:pascalc

Updated

•

4 years ago

status-firefox88: affected → wontfix

status-firefox89: affected → wontfix

Tyson Smith [:tsmith]

Reporter

Updated

•

3 years ago

status-firefox103: --- → wontfix

status-firefox104: --- → affected

status-firefox105: --- → affected

status-firefox-esr102: --- → affected

Jens Stutte [:jstutte]

Assignee

Comment 4

•

3 years ago

Hi Tyson, assuming that this still happens - did you have any luck in further reducing the test case?

Flags: needinfo?(bugmail) → needinfo?(twsmith)

Jens Stutte [:jstutte]

Assignee

Comment 5

•

3 years ago

OK, I was looking again at the old pernosco trace, and it seems we never pass through the normal shutdown sequence here. QuotaManager::Shutdown is never called, as we exit the process through a "xul-window-destroyed" notification which ends up calling nsAppStartup::Quit which does not sweep through all our shutdown phases, apparently.

This could explain why we end up in XPCOMShutdown with IDB transactions that have not been canceled.

Jens Stutte [:jstutte]

Assignee

Comment 6

•

3 years ago

Attached file Bug 1700402 - Ensure we sweep through all shutdown phases on nsAppStartup::Quit r?dthayer (obsolete) — Details

Phabricator Automation

Updated

•

3 years ago

Assignee: nobody → jstutte

Status: NEW → ASSIGNED

Jens Stutte [:jstutte]

Assignee

Comment 7

•

3 years ago

•

Edited

(In reply to Jens Stutte [:jstutte] from comment #5)

OK, I was looking again at the old pernosco trace, and it seems we never pass through the normal shutdown sequence here. QuotaManager::Shutdown is never called, as we exit the process through a "xul-window-destroyed" notification which ends up calling nsAppStartup::Quit which does not sweep through all our shutdown phases, apparently.

Scratch that, it is just the order of things. The rest of the shutdown phases would have been kicked off on ~ScopedXPCOMStartup but we fail earlier inside a content process (thanks to Nika for pointing me there).

This could explain why we end up in XPCOMShutdown with IDB transactions that have not been canceled.

The general scenario seems to be that we have two open requests inside an IDB Transaction in a content process when the main app is starting shutdown and asks thus all content processes to shut down (which happens before QuotaManager::Shutdown runs in the parent). So I'll probably return to the hypothesis of comment 2 and comment 3 that we have some flaw in aborting with the aWhy == NormalShutdown case when there is more than one request active in the transaction.

Phabricator Automation

Updated

•

3 years ago

Attachment #9297448 - Attachment is obsolete: true

Tyson Smith [:tsmith]

Reporter

Comment 8

•

3 years ago

An updated Pernosco session is available here: https://pernos.co/debug/g3U9cd7CwyvFbEQjmIrnJQ/index.html

Flags: needinfo?(twsmith)

Jens Stutte [:jstutte]

Assignee

Comment 9

•

3 years ago

Thanks! So we rely on request's ActorDestroy and/or Recv__delete__ in order to adjust our mPendingRequestCount and make it go back down to zero. That looks very fragile, in particular BackgroundCursorChild<CursorType>::SendContinueInternal increases mPendingRequestCount potentially more than once for each actor instance. I was not able to follow through entirely in the pernosco session to find the single ++ without --. But we definitely never check if messages are dispatched successfully, which during shutdown might become a problem. It seems to me that we should fake-abort our transaction at the first shutdown-killed request's ActorDestroy we encounter, regardless of being there remaining requests or not.

However, I cannot 100% exclude to be there some error in our book-keeping even in the non-shutdown case. It all looks safe under the assumption that IPC channels do not go away unexpectedly (if not in shutdown), and probably we need to rely on more fuzzing here.

Jens Stutte [:jstutte]

Assignee

Comment 10

•

3 years ago

Attached file Bug 1700402 - Immediately fake transaction abort on request actor shutdown kill. r?#dom-storage-reviewers — Details

Phabricator Automation

Updated

•

3 years ago

Attachment #9297542 - Attachment description: WIP: Bug 1700402 - Immediately fake transaction abort on request actor shutdown kill. → Bug 1700402 - Immediately fake transaction abort on request actor shutdown kill. r?#dom-storage-reviewers

Jens Stutte [:jstutte]

Assignee

Comment 11

•

3 years ago

•

Edited

Hi Tyson, would it be possible to run the fuzzer testcase with the build from https://treeherder.mozilla.org/jobs?revision=baf2f6fc477895a9c7d8a3dbd0abaa032a161410&repo=try to see if it helps ? Thank you!

Flags: needinfo?(twsmith)

Tyson Smith [:tsmith]

Reporter

Comment 12

•

3 years ago

I was unable to reproduce the issue with the requested build.

Flags: needinfo?(twsmith)

Jens Stutte [:jstutte]

Assignee

Comment 13

•

2 years ago

Hi Tyson, sorry for bothering you again, but after bug 1777914 landed - did you see this anymore ?

Flags: needinfo?(twsmith)

Tyson Smith [:tsmith]

Reporter

Comment 14

•

2 years ago

(In reply to Jens Stutte [:jstutte] from comment #13)

Hi Tyson, sorry for bothering you again

I am always happy to help :)

after bug 1777914 landed - did you see this anymore ?

This is still being reported by the fuzzers (m-c 20221120-d2c628738330). I am able to repro (m-c 20221207-8e09abeeb445), so I will try to reduce the test case and get a new Pernosco session.

Jens Stutte [:jstutte]

Assignee

Comment 15

•

2 years ago

Not sure if we need pernosco, I just wanted to know if the patch here is still of value, and it seems so.

Tyson Smith [:tsmith]

Reporter

Comment 16

•

2 years ago

Ok, just let me know if anything changes.

Flags: needinfo?(twsmith)

Jens Stutte [:jstutte]

Assignee

Comment 17

•

2 years ago

I wonder if bug 1785080 is related and if this comment gives a hint how to fix the overly complicated state machinery here?

Flags: needinfo?(jvarga)

Comment 18

•

1 year ago

(In reply to Jens Stutte [:jstutte] from comment #17)

I wonder if bug 1785080 is related and if this comment gives a hint how to fix the overly complicated state machinery here?

I get the error/assertion locally from time to time when I'm running tests for dom/indexedDB. I could try to narrow it down and even try to create a test for it if you think it's a priority.

Flags: needinfo?(jvarga)

Jens Stutte [:jstutte]

Assignee

Updated

•

1 year ago

Flags: needinfo?(jstutte)

Jens Stutte [:jstutte]

Assignee

Comment 19

•

1 year ago

(added a reminder)

Flags: needinfo?(jstutte)

Jason Kratzer [:jkratzer]

Updated

•

9 months ago

Blocks: fuzzing-indexeddb

You need to log in before you can comment on or make changes to this bug.