Closed
Bug 507199
Opened 15 years ago
Closed 15 years ago
Fix race condition in finalizing asynchronous storage statements
Categories
(Core :: SQLite and Embedded Database Bindings, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.3a1
Tracking | Status | |
---|---|---|
status1.9.2 | --- | beta1-fixed |
People
(Reporter: sdwilsh, Assigned: sdwilsh)
References
Details
(Keywords: crash, intermittent-failure, verified1.9.2)
Attachments
(2 files)
14.38 KB,
text/plain
|
Details | |
1.88 KB,
patch
|
asuth
:
review+
beltzner
:
approval1.9.2+
|
Details | Diff | Splinter Review |
This has been happening on the places branch for a little while. mak has not been able to reproduce it locally, and it appears to be windows only.
Most recent failure:
WINNT 5.2 places unit test on 2009/07/29 09:23:07
http://tinderbox.mozilla.org/showlog.cgi?log=Places/1248884587.1248892210.567.gz&fulltext=1
Assignee | ||
Updated•15 years ago
|
Whiteboard: [orange]
Assignee | ||
Comment 1•15 years ago
|
||
WINNT 5.2 places unit test on 2009/07/29 09:55:51
http://tinderbox.mozilla.org/showlog.cgi?log=Places/1248886551.1248894720.30438.gz&fulltext=1
Assignee | ||
Comment 2•15 years ago
|
||
WINNT 5.2 mozilla-central test everythingelse on 2009/07/29 14:09:00
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1248901740.1248903720.8789.gz&fulltext=1
Assignee | ||
Comment 3•15 years ago
|
||
According to robarnold, the return code means we are doing a null pointer dereference.
Assignee | ||
Comment 4•15 years ago
|
||
Correction - it's an access violation. May or may not be null.
Assignee | ||
Comment 5•15 years ago
|
||
Running these tests under valgrind doesn't come up with anything.
Comment 6•15 years ago
|
||
i'm now running those tests in an infinite loop... in more than 200 runs i've not seen any crash.
Assignee | ||
Comment 7•15 years ago
|
||
Maybe we should disable these tests until bug 506042 gets resolved? This is happening a lot more than I'd like suddenly, and it's lack of reproducibility makes it next to impossible to debug.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1248924517.1248932490.32720.gz#err6
This is happening pretty much every run now
Assignee | ||
Comment 10•15 years ago
|
||
I'm just going to disable these tests. Once bug 506042 gets resolved, we can re-enable them and debug.
Assignee | ||
Comment 11•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1249143659.1249148848.16320.gz
WINNT 5.2 mozilla-central unit test on 2009/08/01 09:20:59
looks like the same thing except in test_browser_places\unit\test_browserGlue_shutdown.js ?
Comment 13•15 years ago
|
||
at this point this appear to be JS or cycle collector crash, disabling all tests is not an option and will simply move on the crash to another test.
Comment 14•15 years ago
|
||
on Windows xpcshell tests i still crash on bug 507424, could be worth reenabling tests after that has been fixed, to check if it's somehow related.
Comment 15•15 years ago
|
||
in comment 14 i meant i crash in other xpcshell tests, not during the above ones.
Comment 17•15 years ago
|
||
i'd like to try reenabling these tests, to see if the crashes were depending on some of the JS crashes recently fixed. I'll try to do that at a quiet time when i'l have enough time to follow the tree.
Comment 18•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1250863787.1250865533.11322.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/08/21 07:09:47
Updated•15 years ago
|
Summary: test_browserGlue_corrupt_nobackup_default.js and test_browserGlue_corrupt_nobackup.js randomly crash → test_browserGlue_corrupt_nobackup_default.js, test_browserGlue_corrupt_nobackup.js, test_browserGlue_shutdown.js randomly crash
Comment 19•15 years ago
|
||
i'd say there is nothing we can do at this time, till we get a valid stack trace, somehow.
Comment 20•15 years ago
|
||
Tracemonkey took the patch from bug 506042 ahead of mozilla-central, this is the stack from the xpcshell run where it crashed running test_browser_places\unit\test_browserGlue_shutdown.js.
Comment 21•15 years ago
|
||
sayrer was puzzling over this, but I guess this is the same crash as m-c, just something landed on tracemonkey to make it more frequent.
Comment 22•15 years ago
|
||
yeah, this is happening every run on TM windows. I think there's a race on shutdown in the storage code.
Comment 23•15 years ago
|
||
the same thing on Places branch, see bug 511965.
Assignee | ||
Comment 24•15 years ago
|
||
This is, indeed, a race condition on shutdown. Here's what is happening:
1) We call Close on the Connection object. This causes us to spin the background thread's event loop and process any pending events.
2) The events are processed, and then in AsyncExecuteStatements::notifyComplete we call finalize on the StatementData object. This resets the statement, and then releases the reference to it.
3) Our reference count on our Statement goes to zero, so the destructor is called, which calls Statement::Finalize.
4) In Statement::Finalize, we try to dispatch an event to the background thread to finalize the async statement (this is needed to serialize access to the async statement). However, at this time, we can no longer get the background thread.
The fix here is simple - we just synchronously finalize if we cannot get the background thread (because it's shutting down or has been shutdown).
Component: Places → Storage
Flags: blocking1.9.2?
QA Contact: places → storage
Target Milestone: --- → mozilla1.9.2b1
Assignee | ||
Comment 25•15 years ago
|
||
Assignee | ||
Comment 26•15 years ago
|
||
This was happening in the wild too: http://➡.ws/蚯
Whiteboard: [orange] → [orange][needs review asuth]
Comment 27•15 years ago
|
||
So, I think the tracemonkey merge turned this into constant Windows orange.
Updated•15 years ago
|
Severity: normal → blocker
Updated•15 years ago
|
Attachment #396480 -
Flags: review?(bugmail) → review+
Assignee | ||
Updated•15 years ago
|
Whiteboard: [orange][needs review asuth] → [orange][can land]
Comment 29•15 years ago
|
||
please when landing this also re-enable disabled tests (see comment 11)
Comment 30•15 years ago
|
||
pushed by Jesse
http://hg.mozilla.org/mozilla-central/rev/2ee676e74eac
I'm going to re-enable the tests and mark this fixed.
Comment 31•15 years ago
|
||
re-enaled by Jesse :)
http://hg.mozilla.org/mozilla-central/rev/cf74f6f746c3
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 32•15 years ago
|
||
This impacts 1.9.2 as well.
Summary: test_browserGlue_corrupt_nobackup_default.js, test_browserGlue_corrupt_nobackup.js, test_browserGlue_shutdown.js randomly crash → Fix race condition in finalizing asynchronous storage statements
Whiteboard: [orange][can land] → [orange]
Target Milestone: mozilla1.9.2b1 → mozilla1.9.3a1
Assignee | ||
Updated•15 years ago
|
Attachment #396480 -
Flags: approval1.9.2?
Updated•15 years ago
|
Attachment #396480 -
Flags: approval1.9.2? → approval1.9.2+
Comment 33•15 years ago
|
||
Comment on attachment 396480 [details] [diff] [review]
v1.0
a192=beltzner
Assignee | ||
Comment 34•15 years ago
|
||
status1.9.2:
--- → beta1-fixed
Updated•15 years ago
|
Flags: blocking1.9.2? → blocking1.9.2+
Comment 35•15 years ago
|
||
assuming this is fixed with tests re-enabled and passing.
Keywords: verified1.9.2
Updated•12 years ago
|
Keywords: intermittent-failure
Updated•12 years ago
|
Whiteboard: [orange]
Updated•3 months ago
|
Product: Toolkit → Core
You need to log in
before you can comment on or make changes to this bug.
Description
•