Closed Bug 1597211 Opened 2 months ago Closed 2 months ago

Crash in [@ mozilla::ipc::ProcessLink::SendMessage | IPC_Message_Name=PBackgroundIDBCursor::Msg_Response]

Categories

(Core :: Storage: IndexedDB, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla72
Tracking Status
firefox-esr68 --- unaffected
firefox71 --- unaffected
firefox72 --- fixed
firefox73 --- fixed

People

(Reporter: gsvelto, Assigned: sg, NeedInfo)

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(7 files)

This bug is for crash report bp-39225571-b387-4f9c-8222-5d9190191117.

Top 10 frames of crashing thread:

0 xul.dll void mozilla::ipc::ProcessLink::SendMessage ipc/glue/MessageLink.cpp:151
1 xul.dll mozilla::ipc::MessageChannel::Send ipc/glue/MessageChannel.cpp:1020
2 xul.dll mozilla::ipc::IProtocol::ChannelSend ipc/glue/ProtocolUtils.cpp:477
3 xul.dll mozilla::dom::indexedDB::PBackgroundIDBCursorParent::SendResponse ipc/ipdl/PBackgroundIDBCursorParent.cpp:100
4 xul.dll void mozilla::dom::indexedDB::`anonymous namespace'::Cursor::SendResponseInternal dom/indexedDB/ActorsParent.cpp:15323
5 xul.dll nsresult mozilla::dom::indexedDB::`anonymous namespace'::Cursor::ContinueOp::SendSuccessResult dom/indexedDB/ActorsParent.cpp:26565
6 xul.dll void mozilla::dom::indexedDB::`anonymous namespace'::TransactionDatabaseOperationBase::SendPreprocessInfoOrResults dom/indexedDB/ActorsParent.cpp:22171
7 xul.dll nsresult mozilla::dom::indexedDB::`anonymous namespace'::TransactionDatabaseOperationBase::Run dom/indexedDB/ActorsParent.cpp
8 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1250
9 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486

Crashes across all three platforms on nightly only (apparently). The raw crash reason is:

MOZ_CRASH(IPC message size is too large)

Assignee: nobody → sgiesecke
Component: IPC → Storage: IndexedDB
Priority: -- → P1

How can I query the size of the serialized message before sending? Or what are best practices to get an upper bound for the serialized size to avoid running into this assertion?

Flags: needinfo?(jld)

Yes, there's test_maximal_serialized_object_size.js

Status: NEW → ASSIGNED

(In reply to Jan Varga [:janv] from comment #4)

Yes, there's test_maximal_serialized_object_size.js

Unfortunately, this doesn't really test if the size calculation matches the size calculation done by the IPC code, since it sets an own pref with some value (to 20 MB, which is much smaller than the actual IPC limit of 256 MB), and just tests the size calculation against that.

Yeah, the test isn't perfect.

Pushed by sgiesecke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/283b309498e0
Do not cast IPC message size to signed int. r=jld
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #9110229 - Attachment description: Bug 1597211 - Introduce InitializedOnce class template and use to reduce statefulness of Cursor class. r=#dom-workers-and-storage → Bug 1597211 - Use InitializedOnce to reduce statefulness of Cursor class. r=#dom-workers-and-storage
Pushed by sgiesecke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/12a323ce4403
Introduce InitializedOnce class template. r=dom-workers-and-storage-reviewers,asuth

(In reply to Jan Varga [:janv] from comment #7)

Yeah, the test isn't perfect.

Should we take action to improve it?

The test uses only 20 MB limit (instead of 256MB) to avoid slowing down testing infrastructure. It's not exactly the same situation as with normal 256MB limit, but I think it's pretty good approximation for now.

Status: REOPENED → RESOLVED
Closed: 2 months ago2 months ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---

This is #2 at https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=72.0b

Can we do something in the short term here?

Flags: needinfo?(sgiesecke)
Keywords: topcrash

(In reply to Julien Cristau [:jcristau] from comment #21)

This is #2 at https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=72.0b

Can we do something in the short term here?

I already have patches to fix the issue. I pinged my team to review them ASAP.

Flags: needinfo?(sgiesecke)
Pushed by sgiesecke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7cffeb6316b5
Adjust parameters and return types to allow passing response size information. r=dom-workers-and-storage-reviewers,ytausky
https://hg.mozilla.org/integration/autoland/rev/a590b97b81b5
Stop processing entries when maximum message size has been exceeded. r=dom-workers-and-storage-reviewers,ytausky
https://hg.mozilla.org/integration/autoland/rev/131474ea2464
Calculate size of individual cursor response entries. r=dom-workers-and-storage-reviewers,ytausky
https://hg.mozilla.org/integration/autoland/rev/a545df406ed4
Use EmplaceBack instead of AppendResult, and use return value rather than output parameter. r=dom-workers-and-storage-reviewers,ytausky
https://hg.mozilla.org/integration/autoland/rev/c4dde82d7eba
Use InitializedOnce to reduce statefulness of Cursor class. r=dom-workers-and-storage-reviewers,asuth
Keywords: leave-open

Andrew, how plausible is it this is related to consumers and/or contents of the DB (whether that's individual record size or size of a full response to a particular query)? And/or would those consumers see different behaviour as a result of the patches here?

For context, I'm just trying to establish if there's likely to be / have been any correlation with turning on the indexed-db-based remote settings blocklist, or if that's unlikely... At least timing-wise, bug 1594521 landed slightly after the reports here started so it seems unlikely - but also worth checking?

Flags: needinfo?(bugmail)

(In reply to :Gijs (he/him) from comment #24)

Andrew, how plausible is it this is related to consumers and/or contents of the DB (whether that's individual record size or size of a full response to a particular query)? And/or would those consumers see different behaviour as a result of the patches here?

For context, I'm just trying to establish if there's likely to be / have been any correlation with turning on the indexed-db-based remote settings blocklist, or if that's unlikely... At least timing-wise, bug 1594521 landed slightly after the reports here started so it seems unlikely - but also worth checking?

I am not sure if I parse your comment correctly. The issue happens with rather large records. Before preloading landed, during cursor iteration only a single record was transfered via IPC, now the default is 64 records. So if 64 records are larger than the IPC message size limit (which is 256 MB), the crash occurs. This is fixed by the patches that are currently landing.

New internal uses such as the remote settings might have increased the number of crashes, if the record size is large. I don't know if this might be the case intentionally or accidentally for the remote settings blocksize.

(As a side note, the remote-settings usage of the IndexedDB has been making it harder to debug IndexedDB tests. It would be great to be able to turn it off when running tests (wpt/mochitest). Is there a way to achieve that, or could one be added?)

(In reply to Simon Giesecke [:sg] [he/him] from comment #25)

(In reply to :Gijs (he/him) from comment #24)

Andrew, how plausible is it this is related to consumers and/or contents of the DB (whether that's individual record size or size of a full response to a particular query)? And/or would those consumers see different behaviour as a result of the patches here?

For context, I'm just trying to establish if there's likely to be / have been any correlation with turning on the indexed-db-based remote settings blocklist, or if that's unlikely... At least timing-wise, bug 1594521 landed slightly after the reports here started so it seems unlikely - but also worth checking?

I am not sure if I parse your comment correctly. The issue happens with rather large records. Before preloading landed, during cursor iteration only a single record was transfered via IPC, now the default is 64 records. So if 64 records are larger than the IPC message size limit (which is 256 MB), the crash occurs. This is fixed by the patches that are currently landing.

New internal uses such as the remote settings might have increased the number of crashes, if the record size is large. I don't know if this might be the case intentionally or accidentally for the remote settings blocksize.

Seems unlikely - 8mb is a lot, and these are all pretty small json objects. Mathieu, can you doublecheck that I'm making sense?

(As a side note, the remote-settings usage of the IndexedDB has been making it harder to debug IndexedDB tests. It would be great to be able to turn it off when running tests (wpt/mochitest). Is there a way to achieve that, or could one be added?)

Yes, set extensions.blocklist.enabled to false in the relevant test directory's manifest or with --setpref on the commandline when running locally.

Flags: needinfo?(bugmail) → needinfo?(mathieu)

(In reply to :Gijs (he/him) from comment #26)

(In reply to Simon Giesecke [:sg] [he/him] from comment #25)

(In reply to :Gijs (he/him) from comment #24)

Yes, set extensions.blocklist.enabled to false in the relevant test directory's manifest or with --setpref on the commandline when running locally.

Great, thanks!

Keywords: regression

Please request beta uplift when you're comfortable doing so.

Flags: needinfo?(sgiesecke)

Comment on attachment 9110195 [details]
Bug 1597211 - Adjust parameters and return types to allow passing response size information. r=#dom-workers-and-storage

Beta/Release Uplift Approval Request

  • User impact if declined: User experiences crashes when sites use IndexedDB cursor on very large records, such that 64 (by default) records + overhead exceed the maximum IPC message size of 256 MB. The user may also have changed the preloading cursor preference to a number other than 64, the larger the number being, the larger the chance of hitting the issue fixed by these patches.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce: This might be reproduced by creating an IndexedDB object store with very large records, and then opening a cursor on that object store.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch limits the number of records transferred between the parent and child processes and does not change behaviour otherwise. In the extreme case, preloading is effectively disabled.

An alternative would be to disable the preloading pref entirely.

  • String changes made/needed:
Flags: needinfo?(sgiesecke)
Attachment #9110195 - Flags: approval-mozilla-beta?
Attachment #9110196 - Flags: approval-mozilla-beta?
Attachment #9110197 - Flags: approval-mozilla-beta?

Comment on attachment 9110195 [details]
Bug 1597211 - Adjust parameters and return types to allow passing response size information. r=#dom-workers-and-storage

fixing a crash regression from indexeddb; approved for 72.0b5

Attachment #9110195 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9110196 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9110197 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Seems unlikely - 8mb is a lot, and these are all pretty small json objects. Mathieu, can you doublecheck that I'm making sense?

I can confirm that there is no such big record. But if there is a known software limit on the IDB side, that could be a good reason to setup quotas (Bug 1525407)

Flags: needinfo?(mathieu)
You need to log in before you can comment on or make changes to this bug.