Closed
Bug 1376874
Opened 7 years ago
Closed 7 years ago
Crash in AsyncShutdownTimeout | profile-before-change | Sqlite.jsm shutdown blocker
Categories
(Toolkit :: Blocklist Implementation, defect, P1)
Toolkit
Blocklist Implementation
Tracking
()
RESOLVED
FIXED
mozilla56
People
(Reporter: philipp, Assigned: leplatrem)
References
Details
(Keywords: crash, regression, topcrash)
Crash Data
Attachments
(1 file)
59 bytes,
text/x-review-board-request
|
mak
:
review+
jcristau
:
approval-mozilla-beta-
|
Details |
This bug was filed from the Socorro interface and is report bp-30da927f-631a-4a75-89a1-4fb690170628. ============================================================= these async shutdownhangs are spiking cross-platform on 56.0a1 after build 20170627030209. Correlations for Firefox Nightly (100.0% in signature vs 00.03% overall) abort_message = ###!!! ABORT: file resource://gre/modules/Sqlite.jsm, line 156 (100.0% in signature vs 03.62% overall) moz_crash_reason = MOZ_CRASH() (100.0% in signature vs 05.54% overall) Module "twinapi.appcore.dll" = true (100.0% in signature vs 05.63% overall) Module "twinapi.dll" = true (83.33% in signature vs 07.64% overall) process_type = null [100.0% vs 06.75% if startup_crash = 0] (100.0% in signature vs 08.29% overall) Module "explorerframe.dll" = true (100.0% in signature vs 09.27% overall) Module "winsta.dll" = true (100.0% in signature vs 09.40% overall) Module "FWPUCLNT.DLL" = true (100.0% in signature vs 09.55% overall) Module "rasadhlp.dll" = true (100.0% in signature vs 11.11% overall) shutdown_progress = profile-before-change (83.33% in signature vs 08.50% overall) Module "ondemandconnroutehelper.dll" = true graph from nightly: https://crash-stats.mozilla.com/signature/?release_channel=nightly&signature=AsyncShutdownTimeout%20%7C%20profile-before-change%20%7C%20Sqlite.jsm%20shutdown%20blocker&date=%3E%3D2017-03-01T00%3A00%3A00.000Z&date=%3C2017-06-28T18%3A25%3A45.000Z#graphs
Comment 1•7 years ago
|
||
This signature is ranked #5 in nightly top-crashers. ::mak, could you investigate please ?
Flags: needinfo?(mak77)
Keywords: topcrash
Comment 3•7 years ago
|
||
It looks like it suddenly started having 50 crashes a day (from maybe 0-1 per day) in 56 nightly on June 28.
Comment 4•7 years ago
|
||
That's interesting, but I was expecting to see a bump on 21st if it was due to bug 1371945 :( I wonder if some extension using Sqlite released a new version on that day. Correlations don't seem to be correct for add-ons (for example I see activity-stream in a lot of crashes, but I don't see it in correlations) and have too much noise to make any use of them.
Flags: needinfo?(mak77)
Updated•7 years ago
|
Flags: needinfo?(mak77)
Reporter | ||
Comment 5•7 years ago
|
||
i don't see any obvious non-mozilla addons in those reports: https://crash-stats.mozilla.com/search/?signature=%3DAsyncShutdownTimeout%20%7C%20profile-before-change%20%7C%20Sqlite.jsm%20shutdown%20blocker&version=56.0a1&date=%3E%3D2017-06-23T18%3A15%3A32.000Z&_facets=addons#facet-addons
Comment 6•7 years ago
|
||
it doesn't necessarily need to be non-mozilla add-ons, our add-ons use (and thus could misuse) Sqlite. I'm not sure which between pocket, activity stream and formautofill are using Sqlite.jsm, that could be checked. If any of these add-ons got an update around the date where the bump happened, it could likely be related.
Comment 7•7 years ago
|
||
Looks like on 28 it has been release v1.14 of Activity Stream. I'm pretty sure it uses Sqlite.jsm. If it'd register in a transaction a promise that is never resolved, or run a very long (hanging) recursive query, it may cause this kind of async shutdown crash. It may be a coincidence, but the date and correlation seems to point there. I've also excluded FormAutofill.
Comment 8•7 years ago
|
||
This was the #3 topcrash in Window Nightly 20170630030203 build. But it doesn't show up at all in the Windows Nightly 20170630150342 build, which was the second Nightly build for that same day. I'm not sure what to make of that change...
Comment 9•7 years ago
|
||
Nan, any idea if AS 1.14 did some kind of data migration or uses Sqlite executeTransaction with some promise that may never resolve in some circumstances?
Flags: needinfo?(mak77) → needinfo?(najiang)
Comment 10•7 years ago
|
||
So, I did an aggregation on async shudown timeout and I see that most crashes are caused by {"name":"metadata.sqlite#0: waiting for shutdown","state":{"identifier":"metadata.sqlite#0","isCloseRequested":false,"hasDbConn":true,"hasInProgressTransaction":false,"pendingStatements":0,"statementCounter":11} {"name":"kinto.sqlite#3: waiting for shutdown","state":{"identifier":"kinto.sqlite#3","isCloseRequested":false,"hasDbConn":true,"hasInProgressTransaction":false,"pendingStatements":0,"statementCounter":10} These 2 dbs are covering 99% of the cases... This seems to cope with the fact part of this started with the Activity Stream add-on update. metadata.sqlite is surely from it. I'm not sure about kinto.sqlite, who is using it? blocklists? This is not a bug in Storage or Sqlite.jsm, it's a consumers bug. I'm moving to AS, but I'm not sure who should take care of the kinto.sqlite problem though, since I don't know who uses it.
Component: Storage → Activity Streams: General
Product: Toolkit → Firefox
Comment 11•7 years ago
|
||
Mark, do you know anything about kinto.sqlite usage and who owns that?
Flags: needinfo?(markh)
Updated•7 years ago
|
Flags: needinfo?(mathieu)
Comment 12•7 years ago
|
||
I suspect the problem here is that we never actually invoke close() on the connection, rather than a stuck transaction. Note that consumers may use Sqlite.shutdown.addBlocker to ensure the connection is closed properly, it will be invoked when Sqlite.jsm is shutting down. Something like: let conn = await Sqlite.openConnection(...); try { Sqlite.shutdown.addBlocker("My connection closing", async function() { await conn.close(); }); } catch (ex) { // It's too late to block shutdown, just close the connection. await conn.close(); throw ex; }
Comment 13•7 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #11) > Mark, do you know anything about kinto.sqlite usage and who owns that? No, but I suspect Ethan might.
Flags: needinfo?(markh) → needinfo?(eglassercamp)
Comment 14•7 years ago
|
||
I believe kinto.sqlite itself is only used by blocklists.
Assignee | ||
Comment 15•7 years ago
|
||
kinto.sqlite is the local database for blocklists. We are transitioning blocklists from a huge XML file downloaded daily to a diff-based system. I can take care of fixing the blocklists clients with your snippet. I'm surprised there are so few places where `Sqlite.shutdown.addBlocker()` is used though.
Flags: needinfo?(mathieu)
Comment 16•7 years ago
|
||
(In reply to Mathieu Leplatre (:leplatrem) from comment #15) > I can take care of fixing the blocklists clients with your snippet. I'm > surprised there are so few places where `Sqlite.shutdown.addBlocker()` is > used though. You are right, I suspect it's mostly a documentation problem, we should better document the close() semantics in Sqlite.jsm Afair, close() can be invoked even multiple times, so if you can close the connection earlier, the shutdown blocker will just be a safety fallback.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → mathieu
Assignee | ||
Comment 18•7 years ago
|
||
> You are right, I suspect it's mostly a documentation problem, we should better > document the close() semantics in Sqlite.jsm I added a small mention on MDN docs about Sqlite.jsm https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/Sqlite.jsm#close()
Comment 19•7 years ago
|
||
I'm cloning the bug and creating 2 instances, one in blocklisting and one in Activity Stream, so we can handle the 2 consumers apart.
Updated•7 years ago
|
Updated•7 years ago
|
Updated•7 years ago
|
Component: Activity Streams: General → Blocklisting
Product: Firefox → Toolkit
Updated•7 years ago
|
Priority: -- → P1
Comment 20•7 years ago
|
||
mozreview-review |
Comment on attachment 8883257 [details] Bug 1376874 - Close blocklist Sqlite connection on shutdown https://reviewboard.mozilla.org/r/154196/#review159338 LGTM, thank you for looking into it. ::: services/common/tests/unit/test_storage_adapter_shutdown.js:25 (Diff revision 1) > + do_check_eq("Should not succeed, connection should be closed.", false); > + } catch (e) { > + do_check_eq(e.message, "Connection is not open."); > + } > + > + run_next_test(); if you use add_task instead of add_test, you can avoid run_next_next().
Attachment #8883257 -
Flags: review?(mak77) → review+
Comment 21•7 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #4) > That's interesting, but I was expecting to see a bump on 21st if it was due > to bug 1371945 :( > > I wonder if some extension using Sqlite released a new version on that day. > > Correlations don't seem to be correct for add-ons (for example I see > activity-stream in a lot of crashes, but I don't see it in correlations) and > have too much noise to make any use of them. I think activity-stream here refers to the system addon, which, unlike the one on Test Pilot, doesn't use a separate sqlite database called "metadata store". Anyway, we have added the sqlite shutdown blocker to the Test Pilot version. Let's see what happens.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Keywords: checkin-needed
Comment 23•7 years ago
|
||
Pushed by cbook@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/92b787c8d2a3 Close blocklist Sqlite connection on shutdown r=mak
Keywords: checkin-needed
Comment 24•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/92b787c8d2a3
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Comment 25•7 years ago
|
||
If my blame digging is correct, this issue goes back as far as Fx53? Should we consider backporting this to Beta?
Flags: needinfo?(mathieu)
Comment 26•7 years ago
|
||
If I read crash-stats correctly, most entries are on 56 (I selected the ones with "async shutdown timeout" containing kinto, for the last month): 1 56.0a1 1419 82.36 % 2 54.0 42 2.44 % 3 53.0.3 29 1.68 % 4 55.0b4 27 1.57 % 5 54.0b99 24 1.39 % 6 54.0.1 23 1.33 % 7 55.0b6 19 1.10 %
Assignee | ||
Comment 27•7 years ago
|
||
Technically the kinto.sqlite code was introduced for Fx45 (Oct 2015). I can't really say if the shutdown problem exists from the beginning or not. I just guess it does. So yes, the patch should be relatively easy to backport. But to be honest, I don't know the process nor the amount of efforts it represents on the product integrity side!
Flags: needinfo?(mathieu)
Comment 28•7 years ago
|
||
I think this is probably worth a backport to Beta. I've already confirmed that it grafts cleanly, so we just need a Beta approval request on the attachment. For ESR52, I agree that the crash rate is low enough that we should probably just leave well enough alone.
status-firefox54:
--- → wontfix
status-firefox55:
--- → affected
status-firefox-esr52:
--- → wontfix
Flags: needinfo?(mathieu)
Comment 29•7 years ago
|
||
Mathieu, we're running low on time this cycle. Can you please do the Beta approval request ASAP?
Comment 30•7 years ago
|
||
Comment on attachment 8883257 [details] Bug 1376874 - Close blocklist Sqlite connection on shutdown Approval Request Comment [Feature/Bug causing the regression]: Kinto blocklists [User impact if declined]: Possible crash on shutdown [Is this code covered by automated tests?]: yes [Has the fix been verified in Nightly?]: yes [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: not particularly [Why is the change risky/not risky?]: the code is just adding a shutdown blocker to ensure closing the database connection, it won't affect other than shutdown. [String changes made/needed]: none
Attachment #8883257 -
Flags: approval-mozilla-beta?
Comment 31•7 years ago
|
||
Comment on attachment 8883257 [details] Bug 1376874 - Close blocklist Sqlite connection on shutdown this isn't super high volume, not a new issue, and we're out of betas. beta55-
Attachment #8883257 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
Assignee | ||
Comment 32•7 years ago
|
||
I'm sorry for not making known my time off on Bugzilla :( Thanks for moving it forward!
Flags: needinfo?(mathieu)
Updated•7 years ago
|
Updated•5 years ago
|
Component: Blocklist Policy Requests → Blocklist Implementation
You need to log in
before you can comment on or make changes to this bug.
Description
•