Reduce overhead of extension event dispatch and API calls
Categories
(WebExtensions :: General, enhancement, P2)
Tracking
(Not tracked)
People
(Reporter: robwu, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [addons-jira])
Bug 1906755 reports huge memory spikes in the extension process and parent process.
A recording with the Gecko profiler and the analysis thereof is posted at https://bugzilla.mozilla.org/show_bug.cgi?id=1906755#c7 ; for the full analysis see that other bug, but here I'd like to zoom in on the number of copies made as part of an event dispatch:
- "WebExtensions", "Call Tree", click around 27.3 s in the graph (memcpy) with caller at
_doSend
, which is called ultimately viaset
(storage.session.set
)- "Parent Process", "Marker Table", 28.366s:
admin@fastaddons.com_GroupSpeedDial, api_call: storage.session.set
- the graphs before show significant memcpy (first 258ms receiving via IPC, secondly 175ms from crossing IPC to internal message handler)- "Parent Process", "Marker Table", 28.366s:
admin@fastaddons.com_GroupSpeedDial, api_event: storage.session.onChanged
- this is when the parent wants to dispatch the event. The graphs show 940ms in memcpy in passing the event listener data (for onChanged) (Firefox source:get args()
), 1182ms in memcpy for cloning the data in preparation for transfer over IPC (Firefox source:_doSend()
, handled bynsFrameMessageManager::GetParamsForMessage
), 1526ms in memcpy for the actual copying of the data for IPC (still in_doSend
, handled bymozilla::dom::JSWindowActorParent::SendRawMessage
).
The last point shows that there are 3 copies made when the parent wants to dispatch an event to the child (something similar happens when the result is returned for an API call to the child). Here is the last point above, split in the three separate parts:
- The graphs show 940ms in memcpy in passing the event listener data (for onChanged) (Firefox source:
get args()
), - 1182ms in memcpy for cloning the data in preparation for transfer over IPC (Firefox source:
_doSend()
, handled bynsFrameMessageManager::GetParamsForMessage
), - 1526ms in memcpy for the actual copying of the data for IPC (still in
_doSend
, handled bymozilla::dom::JSWindowActorParent::SendRawMessage
).
While step 1 and 3 are difficult to avoid, the overhead of step 2 could feasibly be eliminated by changing the implementation such that when it encounters a StructuredCloneHolder
, that it takes ownership of the data within instead of making a copy. For this to work, the sendAsyncMessage
and sendQuery
APIs used by _doSend
of ConduitsChild.sys.mjs need the ability to take ownership of data without cloning it. Like std::move
in C++, or probably more relevant, semantics of transferable objects (bug 1579536).
Reporter | ||
Updated•1 year ago
|
Updated•1 year ago
|
Comment 1•1 year ago
|
||
In general a transferable object when serialized to be sent over IPC with structured clone ends up being copied to be sent inline in the structured clone payload (https://searchfox.org/mozilla-central/rev/891d104826fb0cfd5cbdd6128e2372ce62810028/js/src/vm/StructuredClone.cpp#2310-2321). This does mean that naively using transferable semantics won't necessarily get you what you're looking for. We've also removed some of the BufferList
grafting stuff since bug 1783242.
One thing which is worth noting is that there are some planned changes to how we serialize IPC messages for JS actors already coming down the pipe in bug 1885221. The plan there right now is to use a different data structure to serialize JS IPC messages than StructuredCloneData
to allow for us to do more validation and type-checking on the messages. As part of that, the nested StructuredCloneBlob
objects will probably be added as an explicit data type, and not serialized by copying everything into another StructuredClone data like they are right now.
Specifically, we could potentially change StructuredCloneBlob
objects to use threadsafe refcounting, and make sure that the underlying storage is immutable after construction (I believe it already is?), allowing the data structures to be directly held within the message structure, and serialized only when required.
In terms of options for before those changes, another option as may be to pre-serialize them into BigBuffer
attachments on a cross-process StructuredCloneHolder
, reading them back out when deserializing, though that may be more invasive.
Overall, I don't think we want to increase our dependence on transferable objects, in part because they are hard to express in the query resolve case (bug 1579536), and in part because they make less sense in the new IPC data model being implemented in bug 1885221.
Description
•