Crash in [@ mozilla::dom::cache::Connection::Close]
Categories
(Core :: Storage: Cache API, defect, P1)
Tracking
()
People
(Reporter: sg, Assigned: sg)
References
(Blocks 1 open bug)
Details
(Keywords: crash)
Crash Data
Attachments
(3 files)
This bug is for crash report bp-91a52b77-0202-47cf-8cdb-f6fbf0200610.
Top 10 frames of crashing thread:
0 xul.dll mozilla::dom::cache::Connection::Close dom/cache/Connection.cpp:42
1 xul.dll mozilla::dom::cache::Connection::~Connection dom/cache/Connection.cpp:28
2 xul.dll mozilla::dom::cache::Connection::Release dom/cache/Connection.cpp:18
3 xul.dll mozilla::dom::cache::Context::Data::Release dom/cache/Context.cpp:82
4 xul.dll mozilla::dom::cache::Context::ActionRunnable::Run dom/cache/Context.cpp:624
5 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1236
6 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:501
7 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:332
8 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
9 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290
This is happening with low frequency since Nightly build 20200409095500. The crash reason is MOZ_DIAGNOSTIC_ASSERT(false) (NS_SUCCEEDED(db::IncrementalVacuum(this)))
.
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 1•4 years ago
|
||
Since Nightly build 20200409095500 is the build where Bug 1619965 landed, it might be that this happens in the situations that previously crashed with the signature from Bug 1619305. Not sure what happens in non-diagnostic builds.
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 2•4 years ago
|
||
Asserting that a db operation always succeeds looks too strong to me. It's also not critical if vacuuming fails, but Andrew or Tom should look at this too.
Comment 3•4 years ago
|
||
But it would be interesting to know if there's still a transaction open or it fails for some other reason.
Assignee | ||
Comment 4•4 years ago
|
||
Would adding a crash annotation with the actual nsresult
value help to get insight into this?
Comment 5•4 years ago
|
||
Yes, that might help.
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 6•4 years ago
|
||
Updated•4 years ago
|
Assignee | ||
Comment 7•4 years ago
|
||
I provided a patch. We need data review for that before landing.
Assignee | ||
Comment 8•4 years ago
|
||
Comment on attachment 9155879 [details]
Bug 1644780 - Include nsresult value in crash report when db::IncrementalVacuum fails. r=janv
What questions will you answer with this data?
Gain insight in the reasons for failure to perform an IncrementalVacuum
.
Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:
Previously, it was assumed the call never fails using MOZ_ALWAYS_SUCCEEDS
. Until recently, this was probably masked by another defect in indexedDB code.
What alternative methods did you consider to answer these questions? Why were they not sufficient?
None.
Can current instrumentation answer these questions?
No.
List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.
Note that the data steward reviewing your request will characterize your data collection based on the highest (and most sensitive) category.
Measurement Description | Data Collection Category | Tracking Bug #
nsresult return value of db::IncrementalVacuum
. | Category 1 | Bug 1644780
How long will this data be collected? Choose one of the following:
- I want this data to be collected for 6 months initially (potentially renewable).
What populations will you measure?
Nightly and beta release channels (those where a MOZ_DIAGNOSTIC_ASSERT
would be active)
If this data collection is default on, what is the opt-out mechanism for users?
None.
Please provide a general description of how you will analyze this data.
Determine which nsresult
value(s) cause the failure. Depending on the result, further data collection might become necessary.
Where do you intend to share the results of your analysis?
This will probably result in a comment and/or new bug on Bugzilla.
Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection? If so: Are you using that on the Mozilla backend? Or going directly to the third-party?
Not a third party tool, but not telemetry either, but a crash annotation.
Comment 9•4 years ago
|
||
Comment on attachment 9155879 [details]
Bug 1644780 - Include nsresult value in crash report when db::IncrementalVacuum fails. r=janv
Data reviews must be completed in a place accessible by the people subject to the collection. If this bug needs to remain private (given it's a secbug, I'm guessing that's likely), please file the data review request in its own separate bug. (and attach it as a plaintext attachment to the bug so it's easier for us Stewards to find)
Updated•4 years ago
|
Assignee | ||
Comment 10•4 years ago
|
||
Comment on attachment 9155879 [details]
Bug 1644780 - Include nsresult value in crash report when db::IncrementalVacuum fails. r=janv
The bug is now public. Sorry for the troubles.
Comment 11•4 years ago
|
||
Comment on attachment 9155879 [details]
Bug 1644780 - Include nsresult value in crash report when db::IncrementalVacuum fails. r=janv
As I mentioned earlier, data reviews should be attached to bugs. I'll take care of that in a moment.
Comment 12•4 years ago
|
||
Comment 13•4 years ago
|
||
Assignee | ||
Comment 14•4 years ago
|
||
(In reply to Chris H-C :chutten from comment #11)
Comment on attachment 9155879 [details]
Bug 1644780 - Include nsresult value in crash report when db::IncrementalVacuum fails. r=janvAs I mentioned earlier, data reviews should be attached to bugs. I'll take care of that in a moment.
Sorry, I did not quite understand that part of your earlier comment as it seems. Now it's clear what you meant :) (I am still not sure why there is a data-review flag then on the attachment)
Comment 15•4 years ago
|
||
(In reply to Simon Giesecke [:sg] [he/him] from comment #14)
(In reply to Chris H-C :chutten from comment #11)
Comment on attachment 9155879 [details]
Bug 1644780 - Include nsresult value in crash report when db::IncrementalVacuum fails. r=janvAs I mentioned earlier, data reviews should be attached to bugs. I'll take care of that in a moment.
Sorry, I did not quite understand that part of your earlier comment as it seems. Now it's clear what you meant :) (I am still not sure why there is a data-review flag then on the attachment)
Oh, as why there's an option for a data-review
on the phab "attachment"? When :emceeaich added the flag for us in the first place I don't know that there was an option to limit the flag to only certain mimetypes. I wonder if that's a possibility...
Comment 16•4 years ago
|
||
Comment 17•4 years ago
|
||
bugherder |
Assignee | ||
Comment 18•4 years ago
|
||
We have one report (https://crash-stats.mozilla.org/report/index/54855192-162d-4f47-bd71-316b60200705) now with a nsresult
value of 0x8052000b
, which is NS_ERROR_FILE_CORRUPTED
. Jan, does this make this more actionable?
Comment 19•4 years ago
|
||
Interesting, the error is currently only handled here:
https://searchfox.org/mozilla-central/rev/c86c19bd64f8f19590a4190c282781d3a9631422/dom/cache/DBAction.cpp#230
But we already found out in LSNG that it's not sufficient to check it only after opening the database.
We need to solve this in general somehow, but in this concrete case we should just throw away entire cache ?
Comment 20•4 years ago
•
|
||
(In reply to Jan Varga [:janv] from comment #19)
We need to solve this in general somehow, but in this concrete case we should just throw away entire cache ?
I think the general plan is:
- Enhance mozStorage in bug 1125157 to provide corruption notifications.
- On corruption, wipe the storage bucket, which currently means the entire origin. (Currently this is specified by https://storage.spec.whatwg.org/#storage-endpoints to include LocalStorage but we can do a carve-out.) I think we've discussed having the DirectoryLock class expose a method for this which could then notify the nsIClearDataService.
Doing a localized removal of the cache (database) without doing things at the bucket/origin level is probably an okay stop-gap since the ServiceWorkerManager/ScriptLoader will notice the storage has gone away the next time it goes to try and wake up the ServiceWorker and will remove the registration.
Comment 21•4 years ago
|
||
In our team meeting we just discussed immediate removal of the assertion with a follow-up for doing the right thing.
Assignee | ||
Comment 22•4 years ago
|
||
Comment 23•4 years ago
|
||
Comment 24•4 years ago
|
||
bugherder |
Assignee | ||
Comment 25•4 years ago
|
||
Andrew, the bug as such is now fixed, and I am not sure what exactly to file as a follow-up based on your comment https://bugzilla.mozilla.org/show_bug.cgi?id=1644780#c20. For the first bullet point, we already have a bug on file.
Comment 26•4 years ago
|
||
I filed bug 1652461 for point 2 of comment 20.
Updated•4 years ago
|
Description
•