Closed
Bug 992512
Opened 10 years ago
Closed 10 years ago
DOMStorageCache crashes (sDatabase access violation read)
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
VERIFIED
FIXED
mozilla31
People
(Reporter: johnp, Assigned: mayhemer)
Details
Attachments
(2 files, 3 obsolete files)
1.73 KB,
patch
|
smaug
:
review+
lsblakk
:
approval-mozilla-aurora+
lsblakk
:
approval-mozilla-beta-
mayhemer
:
checkin+
|
Details | Diff | Splinter Review |
5.00 KB,
patch
|
smaug
:
review+
mayhemer
:
checkin+
|
Details | Diff | Splinter Review |
There are multiple crash signatures crashing on sDatabase calls, mostly by ::Clear and ::SetItem being called: https://crash-stats.mozilla.com/query/?query_type=simple&query=DOMStorageCache One of those crashes even goes back to 24 ESR. Could someone pls have a look into it? I don't know DOMStorage code, but should this even be possible to happen? (e.g. calling ::Clear after a ::StopDatabase (or how else does sDatabase get cleared?))
Comment 1•10 years ago
|
||
Just a missing null check?
Assignee | ||
Comment 2•10 years ago
|
||
Apparently [1][2] there is a reentrancy of the main thread loop. This probably allows content script run from during profile-before-change notifications. How is this possible? Who is the called of ProcessNextEvent here? Seems like some javascript does it? Bad! The option to fix the crash is not a null check, but shutdown of the DOMStorage database from something else then the profile-before-change notification, probably profile-before-change2. Apparently, content scripts may run past that point. I can do it even only from xpcom-shutdown, see [3]. Olli? [1] https://crash-stats.mozilla.com/report/index/f64b34a2-f3f6-47f6-adc1-f7b6b2140406 [2] https://crash-stats.mozilla.com/report/index/80dc739d-4c74-436e-a28b-fca072140406 [3] http://hg.mozilla.org/mozilla-central/annotate/5fa70bd90a8b/dom/src/storage/DOMStorageObserver.cpp#l296
Assignee | ||
Updated•10 years ago
|
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
QA Contact: honzab.moz
Assignee | ||
Comment 3•10 years ago
|
||
- changing profile-before-change to profile-before-change2 in hope all nested loops and content scripts are done by that time - other option: only shutdown at xpcom-shutdown, we must do it no longer after that point (synchronously) since the database connections have to be all closed after xpcom-shutdown
Assignee: nobody → honzab.moz
Attachment #8402360 -
Flags: review?(bugs)
Comment 4•10 years ago
|
||
Note: database connections need to be shutdown by the time xpcom-shutdown-threads runs, not xpcom-shutdown.
Assignee | ||
Comment 5•10 years ago
|
||
Josh, I indeed tried that before I submitted the patch, assertions raised in my face:
> xul.dll!mozilla::storage::Service::Observe(0x00000000, 0x06a9e790, 0x00000000) Line 940 C++
xul.dll!nsObserverList::NotifyObservers(0x00000000, 0x06a9e790, 0x00000000) Line 97 C++
xul.dll!nsObserverService::NotifyObservers(0x00000000, 0x06a9e790, 0x00000000) Line 305 C++
xul.dll!mozilla::ShutdownXPCOM(0x00481b2c) Line 886 C++
xul.dll!NS_ShutdownXPCOM(0x00481b2c) Line 819 C++
xul.dll!ScopedXPCOMStartup::~ScopedXPCOMStartup() Line 1202 C++
0x06a9e790 = NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID
Other option is to post an event from either profile2 or xpcom-shutdown that will shut the database down between xpcom-shutdown and xpcom-shutdown-threads. There is a root loop spin between those too. I think it could be be even the best solution here.
Assignee | ||
Comment 6•10 years ago
|
||
Tried to post an event only from xpcom-shutdown. Apparently, not the true solution either. I can see that the seer thread is being shutdown exactly from [1] and in the seer's thread shutdown nested loop my shutdown event for the local storage database is executed. [1] is the place I want my shutdown event to be fired too. So, if this main thread event loop happens a bit sooner from some reason we are still not safe from execution of a content script here. Adding just a null check will lead to actual (and unnecessary) data lose! Hmm.. thinking further, we could just let consumers of the database keep a reference to it. When it is finally released, do the shutdown thing. No events, just a flag set in profile2 and xpcom-shutdown that the database can go away when consumers are gone. Question here tho is, if all consumers are gone before the NS_XPCOM_SHUTDOWN_THREADS_OBSERVER_ID check for open sqlite connections... If not, then it's a problem beyond scope of this bug (beyond localStorage back end) [1] http://hg.mozilla.org/mozilla-central/annotate/5fa70bd90a8b/xpcom/build/nsXPComInit.cpp#l879
Assignee | ||
Comment 7•10 years ago
|
||
- added a null-check with an assertion (no more crashes) - shutdown delayed as much as possible and is implemented as a post from xpcom-shutdown, see comment 5 for rational - added a flag to prevent reinitialization of the database after it has ones been shut down I don't want to implement what's mentioned in comment 6, since it is first quite complicated and second may introduce shutdown of the connection after the check point mentioned in comment 5 or still lead to data lose. So, eventually not much better then this fix.
Attachment #8402360 -
Attachment is obsolete: true
Attachment #8402360 -
Flags: review?(bugs)
Attachment #8402375 -
Flags: review?(bugs)
Comment 8•10 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #2) > Apparently [1][2] there is a reentrancy of the main thread loop. This > probably allows content script run from during profile-before-change > notifications. How is this possible? Who is the called of ProcessNextEvent > here? Seems like some javascript does it? Bad! Shutting down some thread would also spin the event loop
Comment 9•10 years ago
|
||
Looks like Seer thing shuts down some thread.
Comment 10•10 years ago
|
||
...after some js spins the event loop first.
Assignee | ||
Comment 11•10 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #10) > ...after some js spins the event loop first. Exactly! Seer shutdown is sent as an event to prevent loops so it's innocent here. Something in JS invoked (maybe just indirectly) the main thread looping during profile-before-change notifications looping was on the stack. I think it will be hard to find that piece of code and enforce not to do looping.
Comment 12•10 years ago
|
||
Trying to understand why DOMStorage couldn't just handle the case when something spins the event loop.
Assignee | ||
Comment 13•10 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #12) > Trying to understand why DOMStorage couldn't just handle the case when > something spins the event loop. Explanation: - in profile-before-change we shut the database thread (which also blocks until all pending writes are committed to the database) down - later, a script is executed after that point that still wants to use localStorage, this is something I really would not expect The patch tries to delay shutdown of the database to a later point, ideally between xpcom-shutdown and xpcom-shutdown-threads. However, when someone 1) loops the main thread event queue and 2) executes a script from xpcom-shutdown after we did that, we may get to the point we lose data. If it is OK that web content scripts may execute even during xpcom-shutdown (which I personally would not expect..) then we may need to impl the "Hmm..." note from comment 6.
Comment 14•10 years ago
|
||
There is also http://mxr.mozilla.org/mozilla-central/source/widget/windows/nsWindow.cpp#4524 case.
Comment 15•10 years ago
|
||
Comment on attachment 8402375 [details] [diff] [review] v1 We could do this, but for branches a null check would be nicer, IMO.
Attachment #8402375 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 16•10 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #14) > There is also > http://mxr.mozilla.org/mozilla-central/source/widget/windows/nsWindow. > cpp#4524 case. Hmm.. this ruins it a bit... there is no xpcom-shutdown... And in all cases [1] we invoke profile(2) data == 'shutdown-persist', so no way to recognize between exit(0) and normal shutdown. The patch is then wrong... [1] http://mxr.mozilla.org/mozilla-central/search?string=Notify.*change2®exp=1&find=&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central
Assignee | ||
Comment 17•10 years ago
|
||
Comment on attachment 8402375 [details] [diff] [review] v1 comment 15/16
Attachment #8402375 -
Flags: review+ → review-
Comment 18•10 years ago
|
||
(In reply to Josh Matthews [:jdm] from comment #4) > Note: database connections need to be shutdown by the time > xpcom-shutdown-threads runs, not xpcom-shutdown. fwiw, anything using the profile (and dbs are in the profile) should start its shutdown at profile-before-change (now there is p-b-c2, but the point sticks) and likely be done by xpcom-shutdown. There's no guarantee that the openedDBs check that today happens at xpcom-shutdown-threads may have to be move earlier in future, there's still that plan to exit(0) at xpcom-shutdown around, even if there's no definitive plan for it. So we may have to move the barrier at xpcom-shutdown at a certain point.
Assignee | ||
Comment 19•10 years ago
|
||
(In reply to Marco Bonardo [:mak] from comment #18) > (In reply to Josh Matthews [:jdm] from comment #4) > > Note: database connections need to be shutdown by the time > > xpcom-shutdown-threads runs, not xpcom-shutdown. > > fwiw, anything using the profile (and dbs are in the profile) should start > its shutdown at profile-before-change (now there is p-b-c2, but the point > sticks) That's exactly what localstorage does! And hence the bug, because someone is breaking this contract and tries to access localstorage past this shutdown point. > and likely be done by xpcom-shutdown. We do that too. > There's no guarantee that the > openedDBs check that today happens at xpcom-shutdown-threads may have to be > move earlier in future, there's still that plan to exit(0) at xpcom-shutdown > around, even if there's no definitive plan for it. So we may have to move > the barrier at xpcom-shutdown at a certain point. I don't see a relevance here, the current bits of exit(0) doesn't trigger xpcom-shutdown (-threads) at all as Olli mentions in comment 14: http://hg.mozilla.org/mozilla-central/annotate/e7d3a8320289/widget/windows/nsWindow.cpp#l4528 Hence r- on the patch. I also have another plan how to fix this in the DOM storage code (but I'm quit busy these days with other stuff): after p-b-c tell the database to shutdown after all references to it go away. In xpcom-shutdown block (or rather force) shutdown with some (small as possible) data lose but no crash.
Assignee | ||
Comment 20•10 years ago
|
||
And by the way - this bug is indirectly caused by nsThread::Shutdown event queue spinning...
Assignee | ||
Comment 21•10 years ago
|
||
Olli, this is the same as v1 (you actually has already r+'ed) except the observer notification changes. It's actually a null check + check against reinitialization of the database after shutdown.
Attachment #8413213 -
Flags: review?(bugs)
Assignee | ||
Comment 22•10 years ago
|
||
Attachment #8413215 -
Flags: review?(bugs)
Updated•10 years ago
|
Attachment #8413213 -
Flags: review?(bugs) → review+
Updated•10 years ago
|
Attachment #8413215 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 23•10 years ago
|
||
Thanks for quick r+'s. https://hg.mozilla.org/integration/mozilla-inbound/rev/9e4afd48a727
Assignee | ||
Updated•10 years ago
|
Attachment #8413213 -
Flags: checkin+
Assignee | ||
Updated•10 years ago
|
Attachment #8402375 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
status-firefox28:
--- → affected
status-firefox29:
--- → affected
status-firefox30:
--- → affected
status-firefox31:
--- → affected
tracking-firefox29:
--- → ?
tracking-firefox30:
--- → ?
Assignee | ||
Comment 24•10 years ago
|
||
Comment on attachment 8413215 [details] [diff] [review] v2 - null check only for branches [Approval Request Comment] Bug caused by (feature/regressing bug #): 600307 User impact if declined: potential shutdown crash (null dereference) Testing completed (on m-c, etc.): similar patch just landed on m-i Risk to taking this patch (and alternatives if risky): near to zero, just a null check String or IDL/UUID changes made by this patch: none
Attachment #8413215 -
Flags: approval-mozilla-beta?
Attachment #8413215 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 25•10 years ago
|
||
Didn't catch you on IRC, Olli, maybe for the branch patch would be better to just return NS_OK when the database is not up. This will be in parity with the regular patch and also have less impact, do you agree to update the branch patch this way?
Flags: needinfo?(bugs)
Comment 26•10 years ago
|
||
Does it really matter? It is some unusual case. exception at least tells something went wrong.
Flags: needinfo?(bugs)
Assignee | ||
Comment 27•10 years ago
|
||
The patch is based on a plain m-c code, so to land this one, I will back the v2 out first. - as discussed on IRC, returns NS_ERROR_NOT_INITIALIZED - MOZ_ASSERT changed to NS_ERROR (imo better)
Attachment #8413267 -
Flags: review?(bugs)
Updated•10 years ago
|
Attachment #8413267 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 28•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/85e868943b61 https://hg.mozilla.org/integration/mozilla-inbound/rev/565704a273f5
Assignee | ||
Updated•10 years ago
|
Attachment #8413213 -
Attachment is obsolete: true
Attachment #8413213 -
Flags: checkin+
Assignee | ||
Updated•10 years ago
|
Attachment #8413267 -
Flags: checkin+
Comment 29•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/9e4afd48a727 https://hg.mozilla.org/mozilla-central/rev/85e868943b61 https://hg.mozilla.org/mozilla-central/rev/565704a273f5
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Comment 30•10 years ago
|
||
Tracking for 30 (and will approve uplift) but this is not a viable ride-along in a 29 chemspill so we're not going to track there, it can ride for the next 6 weeks.
Updated•10 years ago
|
Attachment #8413215 -
Flags: approval-mozilla-beta?
Attachment #8413215 -
Flags: approval-mozilla-beta-
Attachment #8413215 -
Flags: approval-mozilla-aurora?
Attachment #8413215 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 31•10 years ago
|
||
Comment on attachment 8413215 [details] [diff] [review] v2 - null check only for branches https://hg.mozilla.org/releases/mozilla-aurora/rev/041d2a2ee631
Attachment #8413215 -
Flags: checkin+
Assignee | ||
Updated•10 years ago
|
Assignee | ||
Updated•10 years ago
|
QA Contact: honzab.moz
Updated•10 years ago
|
Comment 32•10 years ago
|
||
After this fix was pushed, there are no crashes in Socorro on Fx 30 and 31 for the following signatures: [@ mozilla::dom::DOMStorageCache::Clear(mozilla::dom::DOMStorage const*) ] [@ mozilla::dom::DOMStorageCache::SetItem(mozilla::dom::DOMStorage const*, nsAString_internal const&, nsString const&, nsString&)] [@ mozilla::dom::DOMStorageCache::Clear] [@ mozilla::dom::DOMStorageCache::~DOMStorageCache()] Marking as verified based on the above results. Although, with [@ mozilla::dom::DOMStorageCache::~DOMStorageCache()] signature, there are a couple of crashes (~16) on Nightly 32.0a1: https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Adom%3A%3ADOMStorageCache%3A%3A~DOMStorageCache%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&version=Firefox%3A32.0a1&hang_type=any&date=2014-05-20+08%3A00%3A00&range_value=3#tab-reports Could this be a concern?
Reporter | ||
Comment 33•10 years ago
|
||
This crashes seem to come from a single user. Also only three reports show a full stacktrace (probably because this is a shutdown crash). This is the first one (the others are equal): https://crash-stats.mozilla.com/report/index/48a8b7dd-e190-4bce-9600-4b2bb2140518 Some kind of use after free introduced by delaying the db shutdown?
Assignee | ||
Comment 34•10 years ago
|
||
(In reply to Johannes Pfrang [:johnp] from comment #33) > This crashes seem to come from a single user. Also only three reports show a > full stacktrace (probably because this is a shutdown crash). This is the > first one (the others are equal): > > https://crash-stats.mozilla.com/report/index/48a8b7dd-e190-4bce-9600- > 4b2bb2140518 > > Some kind of use after free introduced by delaying the db shutdown? Thanks, filed bug 1013329 on that.
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 35•10 years ago
|
||
(In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #32) > After this fix was pushed, there are no crashes in Socorro on Fx 30 and 31 > for the following signatures: > [@ mozilla::dom::DOMStorageCache::Clear(mozilla::dom::DOMStorage const*) ] > [@ mozilla::dom::DOMStorageCache::SetItem(mozilla::dom::DOMStorage const*, > nsAString_internal const&, nsString const&, nsString&)] > [@ mozilla::dom::DOMStorageCache::Clear] > [@ mozilla::dom::DOMStorageCache::~DOMStorageCache()] > > Marking as verified based on the above results. > > Although, with [@ mozilla::dom::DOMStorageCache::~DOMStorageCache()] > signature, there are a couple of crashes (~16) on Nightly 32.0a1: > https://crash-stats.mozilla.com/report/ > list?signature=mozilla%3A%3Adom%3A%3ADOMStorageCache%3A%3A~DOMStorageCache%28 > %29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&ver > sion=Firefox%3A32.0a1&hang_type=any&date=2014-05- > 20+08%3A00%3A00&range_value=3#tab-reports > Could this be a concern? To answer, this crash (in the dtor) is there since 28 and maybe even earlier. So it's probably not related to this patch.
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•