Closed Bug 1152026 Opened 10 years ago Closed 9 years ago

IndexedDB cycle-collection crash, probably

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla43

Project Flags:

blocking-b2g

2.5?

Tracking Flags:

Tracking

Status

firefox40

---

wontfix

firefox41

fixed

firefox42

fixed

firefox43

---

fixed

firefox-esr31

---

unaffected

firefox-esr38

41+

fixed

b2g-v2.0

---

unaffected

b2g-v2.0M

---

unaffected

b2g-v2.1

---

unaffected

b2g-v2.1S

---

unaffected

b2g-v2.2

---

fixed

b2g-v2.2r

---

fixed

b2g-master

---

fixed

Tracking

Status

firefox40

firefox41

firefox42

firefox43

firefox-esr31

firefox-esr38

b2g-v2.0

b2g-v2.0M

b2g-v2.1

b2g-v2.1S

b2g-v2.2

b2g-v2.2r

relnote-firefox

b2g-master

thunderbird_esr115

thunderbird_esr128

firefox-esr115

firefox-esr128

firefox131

firefox132

firefox133

People

(Reporter: asuth, Assigned: khuey)

References

Details

(Keywords: crash, sec-moderate, Whiteboard: [fixed in 43 by bug 1179909][post-critsmash-triage][adv-main41+][adv-esr38.3+])

Crash Data

Attachments

(5 files)

backtrace and limited DumpJSStack for MOZ_ASSERT_IF(!mPreemptingRunnableInfos.IsEmpty() from debug build 10 years ago Andrew Sutherland [:asuth] (he/him) 18.42 KB, text/plain		Details
backtrace of assertion still firing even without IndexedDB handler on stack. (But a promise is firing.) 10 years ago Andrew Sutherland [:asuth] (he/him) 6.63 KB, text/plain		Details
details on the pre-empted runnable 10 years ago Andrew Sutherland [:asuth] (he/him) 6.69 KB, text/plain		Details
Repro script 9 years ago Gareth Aye [:gaye] (back from PTO) 531 bytes, application/x-shellscript		Details
Branch patch 9 years ago Kyle Huey (Exited; not receiving bugmail, old account, do not use) 670 bytes, patch	baku : review+ ritu : approval-mozilla-aurora+ ritu : approval-mozilla-beta+ ritu : approval-mozilla-esr38+ ritu : approval-mozilla-b2g37+	Details \| Diff \| Splinter Review

Andrew Sutherland [:asuth] (he/him)

Reporter

Description

•

10 years ago

I'm using IndexedDB on workers, which rocks. But my (stock, mozilla.org-built linux x86_64) nightly builds are crashing a lot in the cycle collector it seems. Here's one: bp-d90ca7bb-15bc-4ab7-8ef3-a26d12150407 And all of these but the windows one seem to be me: https://crash-stats.mozilla.com/report/list?product=Firefox&signature=ChildFinder%3A%3ANoteJSObject%28JSObject*%29#tab-reports Here's the summary of sorts from the stack provided by the crash reporter infrastructure: ChildFinder::NoteJSObject(JSObject*) TraceCallbackFunc::Trace(JS::Heap<JSObject*>*, char const*, void*) const mozilla::DOMEventTargetHelper::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&) mozilla::dom::indexedDB::IDBWrapperCache::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&) mozilla::dom::indexedDB::IDBTransaction::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&) void nsPurpleBuffer::Block::VisitEntries<RemoveSkippableVisitor>(nsPurpleBuffer&, RemoveSkippableVisitor&) nsPurpleBuffer::RemoveSkippable(nsCycleCollector*, bool, bool, void (*)()) nsCycleCollector_forgetSkippable(bool, bool) FireForgetSkippable CCTimerFired Note that I did initially suspect the devtools indexeddb inspector was at play, and commented to that effect, but I've stopped using that. I'm going to spin a local debug build and switch to that and see if there are any helpful assertion explosions that show up there.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 1

•

10 years ago

Hm, that crash is on the main thread...

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 2

•

10 years ago

So, things seem to be a bit more stable now for me (still using the same nightly production build that likes to crash) now that my code isn't throwing as many random errors. === Mental state dump for my benefit, probably don't read this: I believe that while I was seeing these crashes, I was triggering an exception in a request "onsuccess" handler of an on-worker function that would be caught and do a console.error('blah', ex, ex.stack), possibly following an app schema upgrade. It was also possible for code to throw an exception and not be caught. The arrow-function closure where these calls happen definitely closed over the IDBRequest whose onsuccess they were handling and may have incidentally closed-over the associated IDBTransaction by virtue of being declared in the same scope. The transaction seems like something scary could be happening with it because even now I'm seeing: "JavaScript error: http://glodastrophe/deps/gelam/js/maildb.js, line 336: InvalidStateError: An attempt was made to use an object that is not, or is no longer, usable" The code in question is: this._db.transaction([TBL_CONFIG, TBL_FOLDER_INFO], 'readonly') And this._db is only populated in the "onsuccess" handler of an IndexedDB.open() call which also has an appropriate "onupgradeneeded" handler. Which seems surprising, but it wouldn't be surprising if I'm being insufficiently diligent about IndexedDB database lifecycle stuff.

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 3

•

10 years ago

Attached file backtrace and limited DumpJSStack for MOZ_ASSERT_IF(!mPreemptingRunnableInfos.IsEmpty() from debug build — Details

Not sure how related this is to my crashes (I'll try and upgrade variant too), but this appears to be a case of my code creating a new transaction inside an onsuccess event and this causing RunBeforeNextEvent to assert. My investigation got cut short because when I did "call DumpJSStack()" my X session locked up until I killed gdb. (I was expecting something bad to happen, just not that.) Context: this is from "./mach run --debug -P aprofile", which I think means that gdb should generally be happy, configuration wise. And the path is in my auto-load-safe-path. Gonna try again now forcing an upgrade and maybe try and use devtools if I manage to avoid the asserts.

Andrew Sutherland [:asuth] (he/him)

Reporter

Updated

•

10 years ago

Summary: IndexedDB-on-workers cycle-collection crash, probably → IndexedDB cycle-collection crash, probably

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 4

•

10 years ago

Attached file backtrace of assertion still firing even without IndexedDB handler on stack. (But a promise is firing.) — Details

I deferred the transaction's creation to a future turn of the event loop/micro-task queue with Promise.resolve().then(...) but the assertion still happens. Just with a shorter stack. And call DumpJSStack() was fine this time.

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 5

•

10 years ago

Attached file details on the pre-empted runnable — Details

My optimization was turned up too high so ptarray didn't work and I did it manually. The pre-empted runnable is an outstanding, non-overlapping read-only transaction. (Pre-empted one is readonly over ['config', folderInfo], exploding one is readonly over ['tasks'].)

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 6

•

10 years ago

I think that pre-empted transaction may be doing something weird and suspicious, because if I defer the creation of my new transaction with with setTimeout(blah, 0) or event 5000ms, things still explode. I'm going to stop poking at things now, but let me know if there's something useful I can do if this isn't a known problem.

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 7

•

10 years ago

Ugh, promises are broken. Bug 1132436.

Wayne Mery (:wsmwk)

Updated

•

10 years ago

Severity: normal → critical

Crash Signature: ChildFinder::NoteJSObject(JSObject*) → [@ ChildFinder::NoteJSObject(JSObject*)]

Keywords: crash

Gareth Aye [:gaye] (back from PTO)

Comment 8

•

9 years ago

[Blocking Requested - why for this release]: We can't move calendar's database interactions into the worker if this doesn't get fixed in v3.

blocking-b2g: --- → 3.0?

Miller Medeiros [:millermedeiros]

Updated

•

9 years ago

Blocks: 1162586

Gareth Aye [:gaye] (back from PTO)

Comment 9

•

9 years ago

Attached file Repro script — Details

I wrote a little script to run a test-agent (gaia unit) test that reproduces the issue here. You do the following after downloading the script to $GAIA/reproduce_bug_1152026.sh: ```bash cd $GAIA git remote add gaye https://github.com/gaye/bug-1169790 git fetch gaye git checkout gaye/bug-1169790 FIREFOX=/path/to/nightly/firefox ./reproduce_bug_1152026.sh ``` The script will (1) start the server that receives test results from the browser (2) build gaia (3) start firefox with the gaia profile and navigate to the test app (4) run the test that reproduces the crash Let me know if this helps! It'd be a bit of a chore for me to produce a standalone webpage that reproduces, but I could do that if you all needed.

Attachment #8622776 - Flags: feedback?(bent.mozilla)

Gareth Aye [:gaye] (back from PTO)

Comment 10

•

9 years ago

Just FYI you need to wait for a tiny bit (~10-20s on my machine) for gecko to crash after the test runs.

Noemí Freire (:noemi)

Updated

•

9 years ago

Blocks: ServiceWorkers-B2G

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 12

•

9 years ago

(In reply to Gareth Aye [:gaye] from comment #9) > Created attachment 8622776 [details] > Repro script > > I wrote a little script to run a test-agent (gaia unit) test that reproduces > the issue here. You do the following after downloading the script to > $GAIA/reproduce_bug_1152026.sh: > > ```bash > cd $GAIA > git remote add gaye https://github.com/gaye/bug-1169790 > git fetch gaye > git checkout gaye/bug-1169790 > FIREFOX=/path/to/nightly/firefox ./reproduce_bug_1152026.sh > ``` > > The script will > > (1) start the server that receives test results from the browser > (2) build gaia > (3) start firefox with the gaia profile and navigate to the test app > (4) run the test that reproduces the crash > > Let me know if this helps! It'd be a bit of a chore for me to produce a > standalone webpage that reproduces, but I could do that if you all needed. That repo doesn't exist :/

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 13

•

9 years ago

The site in bug 1176529 reproduces though, so I can debug that.

Gareth Aye [:gaye] (back from PTO)

Comment 14

•

9 years ago

Ah crap. I was trying to point you to a branch of my fork. You should really `git remote add gaye https://github.com/gaye/gaia` and then the rest should work out... see https://github.com/gaye/gaia/tree/bug-1169790 ?

Andrew McCreight [:mccr8]

Updated

•

9 years ago

Group: core-security

Andrew McCreight [:mccr8]

Comment 16

•

9 years ago

Hiding because one of the dupes indicates a thread safety issue. Is this a regression from bug 1137245 as indicated in bug 1176529? This bug was filed two months before that.

Andrew McCreight [:mccr8]

Updated

•

9 years ago

Flags: needinfo?(bent.mozilla)

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 17

•

9 years ago

No, I don't think bug 1137245 has anything to do with this.

Flags: needinfo?(bent.mozilla)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Updated

•

9 years ago

Depends on: 1179909

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 18

•

9 years ago

The patch in bug 1179909 fixes the site in bug 1176529. I suspect it will fix the Gaia problems too.

Daniel Veditz [:dveditz]

Updated

•

9 years ago

Keywords: sec-high

Daniel Veditz [:dveditz]

Updated

•

9 years ago

Keywords: sec-high → sec-moderate

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 19

•

9 years ago

Comment on attachment 8622776 [details] Repro script khuey is on this (thanks!)

Attachment #8622776 - Flags: feedback?(bent.mozilla)

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 20

•

9 years ago

I've been running with a local build using the fix on bug 1179909 (try push at https://treeherder.mozilla.org/#/jobs?repo=try&revision=6f94afe72869 for builds too), and it indeed seems to have fixed the problem for me. Woo!

(Away)

Comment 22

•

9 years ago

This crash (or one very similar to it) shot up dramatically in the 20150728030209 build, possibly due to 1184574. What's the status of the investigation here?

Flags: needinfo?(nsm.nikhil)

Flags: needinfo?(khuey)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 23

•

9 years ago

The status is that we need to finish the patch from bug 1179909.

Flags: needinfo?(nsm.nikhil)

Flags: needinfo?(khuey)

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Comment 24

•

9 years ago

It is quite possible that Nightly users are trying out push notifications since Bug 1184574 and operating on IDB leading to crashes.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 25

•

9 years ago

It shouldn't lead to crashes unless there are promises involved.

(Away)

Comment 26

•

9 years ago

Apparently nightly and aurora both spiked on 0728. I guess bug 1184574 is off the hook, but this is still worrying. [Tracking Requested - why for this release]: Existing low-volume crash spiking in 41 and 42.

status-firefox41: --- → affected

status-firefox42: --- → affected

tracking-firefox41: --- → ?

tracking-firefox42: --- → ?

Nikhil Marathe [:nsm] (No longer reading bugmail, please needinfo?)

Comment 27

•

9 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #25) > It shouldn't lead to crashes unless there are promises involved. There are Promises involved in all the APIs that ServiceWorkers expose. In addition, if users are using some library like localforage, that would involve Promises too.

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 28

•

9 years ago

Tracked.

tracking-firefox41: ? → +

tracking-firefox42: ? → +

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 29

•

9 years ago

With the landing of bug 1179909 which fixes this, I think we can call this fixed. Or we can call it something else (worksforme?, dupe to bug 1179909?). Many thanks to :khuey and :bzbarsky and :smaug and all involved.

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 30

•

9 years ago

fixed wfm

Assignee: nobody → khuey

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 31

•

9 years ago

(In reply to Andrew Sutherland [:asuth] from comment #29) > With the landing of bug 1179909 which fixes this, I think we can call this > fixed. Or we can call it something else (worksforme?, dupe to bug > 1179909?). Many thanks to :khuey and :bzbarsky and :smaug and all involved. Given that the fix for bug 1179909 works, can we request an uplift to Aurora as it affects 42? The patch associated with bug 1179909 is pretty huge, can somebody comment on whether it is safe to uplift to Beta?

Flags: needinfo?(bugmail)

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 32

•

9 years ago

(In reply to Ritu Kothari (:ritu) from comment #31) > Given that the fix for bug 1179909 works, can we request an uplift to Aurora > as it affects 42? The patch associated with bug 1179909 is pretty huge, can > somebody comment on whether it is safe to uplift to Beta? The authoritative answer would seem to be "no", from bug 1179909: (Quoting Nikhil Marathe [:nsm] (please needinfo?) from bug 1179909 comment 18) > Should this be uplifted at least to aurora? There are some push+IDB+Promises > combinations that crash currently. (Quoting Benjamin Smedberg [:bsmedberg] from bug 1179909 comment 20) > I oppose trying to uplift anything non-critical related to event loops. > Event loops are notoriously regression-prone and we need the full release > cycle to find bugs. I personally do think we should disable IndexedDB-on-workers on all branches where bug 1179909 did not land because the crashing is not fun. (And anyone cool enough to use IndexedDB on workers is probably also cool enough to use Promises on workers.)

Flags: needinfo?(bugmail)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 33

•

9 years ago

IndexedDB on workers has been shipping since 37, so disabling it is not an option IMO.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 35

•

9 years ago

Given that people are discovering it, we should probably do something though.

Andrew McCreight [:mccr8]

Updated

•

9 years ago

status-firefox43: --- → fixed

status-firefox-esr31: --- → unaffected

status-firefox-esr38: --- → affected

Whiteboard: [fixed in 43 by bug 1179909]

Chris Peterson [:cpeterson]

Updated

•

9 years ago

Depends on: 1197547

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 36

•

9 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #35) > Given that people are discovering it, we should probably do something though. I don't suppose there's any chance the pre-empted runnable logic was the entire cause of the cycle collector problem and can be harmlessly disabled?

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 37

•

9 years ago

Attached patch Branch patch — Details — Splinter Review

For branches, we can at least explicitly clear the array on the right thread. This should prevent any security issues, although I don't think IndexedDB will actually work with Promises with this patch.

Attachment #8651883 - Flags: review?(amarchesini)

Andrew Sutherland [:asuth] (he/him)

Reporter

Comment 38

•

9 years ago

Unless there's some side-effect to when the mPreemptingRunnableInfos gets cleared out, the only correctness problem I saw with IndexedDB-with-promises prior to the bug 1179909 fix was bug 1161690 stalling no-op transactions forever.

Andrea Marchesini [:baku]

Updated

•

9 years ago

Attachment #8651883 - Flags: review?(amarchesini) → review+

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 39

•

9 years ago

(In reply to Andrew Sutherland [:asuth] from comment #38) > Unless there's some side-effect to when the mPreemptingRunnableInfos gets > cleared out, the only correctness problem I saw with IndexedDB-with-promises > prior to the bug 1179909 fix was bug 1161690 stalling no-op transactions > forever. Right. I guess having requests running "primes the pump" so to speak. I didn't account for that behavior, so correctness fallout should be minimal.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Updated

•

9 years ago

status-firefox40: affected → wontfix

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Assignee

Comment 40

•

9 years ago

Comment on attachment 8651883 [details] [diff] [review] Branch patch [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: sec-moderate that affects a new feature introduced in ESR 38 User impact if declined: Possible security issues, crashes when IndexedDB is used in certain ways. Fix Landed on Version: 43 Risk to taking this patch (and alternatives if risky): Low risk, alternative is disabling a feature which has been in release for at least 3 cycles. String or UUID changes made by this patch: N/A

Attachment #8651883 - Flags: approval-mozilla-esr38?

Attachment #8651883 - Flags: approval-mozilla-beta?

Attachment #8651883 - Flags: approval-mozilla-aurora?

Ryan VanderMeulen [:RyanVM]

Updated

•

9 years ago

status-b2g-v2.0: --- → unaffected

status-b2g-v2.0M: --- → unaffected

status-b2g-v2.1: --- → unaffected

status-b2g-v2.1S: --- → unaffected

status-b2g-v2.2: --- → affected

status-b2g-v2.2r: --- → affected

status-b2g-master: --- → fixed

Target Milestone: --- → mozilla43

Ryan VanderMeulen [:RyanVM]

Updated

•

9 years ago

Attachment #8651883 - Flags: approval-mozilla-b2g37?

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 41

•

9 years ago

Comment on attachment 8651883 [details] [diff] [review] Branch patch Seems safe to uplift to all nom'd branches.

Attachment #8651883 - Flags: approval-mozilla-esr38?

Attachment #8651883 - Flags: approval-mozilla-esr38+

Attachment #8651883 - Flags: approval-mozilla-beta?

Attachment #8651883 - Flags: approval-mozilla-beta+

Attachment #8651883 - Flags: approval-mozilla-b2g37?

Attachment #8651883 - Flags: approval-mozilla-b2g37+

Attachment #8651883 - Flags: approval-mozilla-aurora?

Attachment #8651883 - Flags: approval-mozilla-aurora+

Ryan VanderMeulen [:RyanVM]

Comment 42

•

9 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/5573079c9118

status-firefox42: affected → fixed

Flags: in-testsuite?

Ryan VanderMeulen [:RyanVM]

Comment 43

•

9 years ago

https://hg.mozilla.org/releases/mozilla-beta/rev/800bc1acaffb

status-firefox41: affected → fixed

Ryan VanderMeulen [:RyanVM]

Comment 44

•

9 years ago

https://hg.mozilla.org/releases/mozilla-esr38/rev/6ee56d787755

status-firefox-esr38: affected → fixed

Ryan VanderMeulen [:RyanVM]

Comment 45

•

9 years ago

https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/34db06429228 https://hg.mozilla.org/releases/mozilla-b2g37_v2_2r/rev/34db06429228

status-b2g-v2.2: affected → fixed

status-b2g-v2.2r: affected → fixed

Matt Wobensmith [:mwobensmith][:matt:]

Updated

•

9 years ago

Whiteboard: [fixed in 43 by bug 1179909] → [fixed in 43 by bug 1179909][post-critsmash-triage]

BMO Automation

Updated

•

9 years ago

Group: core-security → core-security-release

Al Billings [:abillings - ex-MoCo]

Updated

•

9 years ago

tracking-firefox-esr38: --- → 41+

Whiteboard: [fixed in 43 by bug 1179909][post-critsmash-triage] → [fixed in 43 by bug 1179909][post-critsmash-triage][adv-main41+][adv-esr38.3+]

Daniel Veditz [:dveditz]

Updated

•

8 years ago

Group: core-security-release

You need to log in before you can comment on or make changes to this bug.