Closed Bug 1380974 Opened 7 years ago Closed 7 years ago

BufferList default size for IPC messages maybe too large

Categories

(Core :: IPC, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla56
Performance Impact high
Tracking Status
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- fixed

People

(Reporter: kanru, Assigned: hchang)

References

Details

(Whiteboard: [MemShrink])

Attachments

(1 file, 3 obsolete files)

This was noticed in bug 1349991 comment 54 and 58. The initial size for IPC messages minus the header was only 36 bytes. This is clearly not enough for some smallish messages but the next allocation will always be 4K.

I think we may want to enlarge the default size or use exact size for fixed size messages.
Whiteboard: [qf] → [qf][MemShrink]
As a followup bug for bug 1348591, we might need a mid-term (or long-term) 
IPC message size monitoring to come out the most customized IPC message sizes.
The first step is probably to key IPC_MESSAGE_SIZE2 [1] and log all sizes [1]
so that we can have complete data for analysis.


[1] http://searchfox.org/mozilla-central/rev/01d27fdd3946f7210da91b18fcccca01d7324fe2/ipc/glue/MessageChannel.cpp#875
Assignee: nobody → hchang
Whiteboard: [qf][MemShrink] → [qf:p1][MemShrink]
For this particular bug, I wonder if the telemetry data would help since 
the distribution doesn't seem to show as fine-grained as desired.
I don't know if the raw data has been processed or just not shown on
the dashboard.
(In reply to Henry Chang [:hchang] from comment #2)
> For this particular bug, I wonder if the telemetry data would help since 
> the distribution doesn't seem to show as fine-grained as desired.
> I don't know if the raw data has been processed or just not shown on
> the dashboard.

The original telemetry collect message size > 4K so it's not useful for this case. We need to create a new telemetry to collect more fine grained usage.
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #3)
> (In reply to Henry Chang [:hchang] from comment #2)
> > For this particular bug, I wonder if the telemetry data would help since 
> > the distribution doesn't seem to show as fine-grained as desired.
> > I don't know if the raw data has been processed or just not shown on
> > the dashboard.
> 
> The original telemetry collect message size > 4K so it's not useful for this
> case. We need to create a new telemetry to collect more fine grained usage.

Even if we have new probe, I don't know if we will have helpful distribution.
Take [1] as example, the bucket size is around 2k bytes. Not sure if the bucket
size is tunable. 

[1] https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-07-16&keys=__none__!__none__!__none__&max_channel_version=nightly%252F56&measure=IPC_MESSAGE_SIZE2&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-06-12&table=0&trim=1&use_submission_date=0
I have done a very initial analysis as shown in [1].
There are around 100 messages which size is always less than 64 bytes.

[1] https://docs.google.com/spreadsheets/d/1p7PoXA7K-sGZWb7AkfBwMyebNHBQCrVFS5JesmuB7cQ/edit#gid=2063339041
(In reply to Henry Chang [:hchang] from comment #5)
> I have done a very initial analysis as shown in [1].
> There are around 100 messages which size is always less than 64 bytes.
> 
> [1]
> https://docs.google.com/spreadsheets/d/1p7PoXA7K-
> sGZWb7AkfBwMyebNHBQCrVFS5JesmuB7cQ/edit#gid=2063339041

The document is not viewable. Could you paste it to somewhere like https://pastebin.mozilla.org/ or upload as a attachment?
Sorry I forgot to change the permission. Now it's accessible.

This initial analysis is based on around 5-minute trace so it's not very
representative. I'll have a new one today.
https://docs.google.com/spreadsheets/d/1CpuO7BnSW9p7PjvRSTBkNUegRVt3-2GZhWw1YuptPnQ

Based on the trace including watching video, reading facebook feeds,
email, sites with massive images.

Frequent (> 1%) small-size messages:

1.  PCompositorBridge::Msg_DidComposite                (~45%),  all < 128 bytes
2.  PVsync::Msg_Notify,                                (~7%),   all < 64  bytes
3.  PBrowser::Msg_RealMouseMoveEvent                   (~3%),   all < 192 bytes
6.  PCompositorBridge::Msg_PTextureConstructor         (~2%),   all < 192 bytes
7.  PTexture::Msg___delete__                           (~2%),   all < 64  bytes
10. PLayerTransaction::Msg_InitReadLocks               (~1%),   98% < 256 bytes
12. PLayerTransaction::Msg_ReleaseLayer                (~1%),   all < 64  bytes
15. PBackground::Msg_PHttpBackgroundChannelConstructor (~1%),   all < 64  bytes
16. PHttpBackgroundChannel::Msg___delete__             (~1%),   all < 64  bytes
18. PHttpChannel::Msg_DeleteSelf                       (~1%),   all < 64  bytes
19. PHttpChannel::Msg___delete__                       (~1%),   all < 64  bytes
20. PHttpBackgroundChannel::Msg_OnStartRequestSent     (~1%),   all < 64  bytes
22. PHttpBackgroundChannel::Msg_OnStopRequest          (~1%),   all < 192 bytes
23. PHttpChannel::Msg_DeletingChannel                  (~1%),   all < 64  bytes
(In reply to Henry Chang [:hchang] from comment #8)
> https://docs.google.com/spreadsheets/d/1CpuO7BnSW9p7PjvRSTBkNUegRVt3-
> 2GZhWw1YuptPnQ
> 
> Based on the trace including watching video, reading facebook feeds,
> email, sites with massive images.
> 
> Frequent (> 1%) small-size messages:

The size is grabbed from 

http://searchfox.org/mozilla-central/rev/a83a4b68974aecaaacdf25145420e0fe97b7aa22/ipc/glue/MessageChannel.cpp#876,1359

so it already includes the header and the payload.
Attached patch Bug1380974.patch (obsolete) — Splinter Review
Attached patch Bug1380974.patch (obsolete) — Splinter Review
Enlarging the default header capacity may be a better approach.
Attachment #8888154 - Attachment is obsolete: true
Attachment #8888158 - Flags: review?(kchen)
Comment on attachment 8888158 [details] [diff] [review]
Bug1380974.patch

Review of attachment 8888158 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/chromium/src/base/pickle.cc
@@ +31,5 @@
>  #else
>  // TaskTracer would add extra fields to the header to carry task ID and
>  // other information.
>  // \see class Message::HeaderTaskTracer
>  static const uint32_t kHeaderSegmentCapacity = 128;

If we make the default 256 bytes, we should also adjust the case for TaskTracer. Or we can remove the special case for TaskTracer altogether.

I would like :billm to review this.
Attachment #8888158 - Flags: review?(kchen) → review?(wmccloskey)
Attached patch Bug1380974.patch (obsolete) — Splinter Review
Attachment #8888158 - Attachment is obsolete: true
Attachment #8888158 - Flags: review?(wmccloskey)
Attachment #8888229 - Flags: review?(wmccloskey)
I guess I'm not convinced here. It seems like a better solution would be to increase the size for DidComposite and a few other messages in message-metadata.ini, but leave the others alone. Since there are still a lot of messages that are <= 64 bytes, we would be wasting a lot of space by bumping the initial segment to 256 bytes.

If you want to look at this further, I think it would be cool to have a script that would output the number of bytes wasted and the number of allocations needed for a given set of message allocations that you've logged. Then you could adjust the default size and the size for individual messages and try to minimize allocations and wasted bytes.
(In reply to Bill McCloskey (:billm) from comment #14)
> I guess I'm not convinced here. It seems like a better solution would be to
> increase the size for DidComposite and a few other messages in
> message-metadata.ini, but leave the others alone. Since there are still a
> lot of messages that are <= 64 bytes, we would be wasting a lot of space by
> bumping the initial segment to 256 bytes.
> 

I guess you mean the first patch I attached attachment 8888154 [details] [diff] [review],
where a few messages are with custom segment size.

> If you want to look at this further, I think it would be cool to have a
> script that would output the number of bytes wasted and the number of
> allocations needed for a given set of message allocations that you've
> logged. Then you could adjust the default size and the size for individual
> messages and try to minimize allocations and wasted bytes.

I was also thinking of the similar thing: to minimize wasted memory and allocations.
However, I haven't figured out a good penalty function to "optimize". 
The reason why I want a penalty function is then we can automatically
generate message-metadata.ini.

Besides, we can further

1) analyze fixed-size messages and variable-size messages respectively.
2) use different "first segment capacity" and "standard capacity" 
   (might only helpful for variable-size messages)

I'll try what I can dig from the trace. Before that, would you mind
reviewing attachment 8888154 [details] [diff] [review] first? Thanks :)
Attachment #8888154 - Attachment is obsolete: false
Attachment #8888154 - Flags: review?(wmccloskey)
Attachment #8888229 - Attachment is obsolete: true
Attachment #8888229 - Flags: review?(wmccloskey)
Comment on attachment 8888154 [details] [diff] [review]
Bug1380974.patch

Review of attachment 8888154 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8888154 - Flags: review?(wmccloskey) → review+
(In reply to Henry Chang [:hchang] from comment #15)
> (In reply to Bill McCloskey (:billm) from comment #14)
> > If you want to look at this further, I think it would be cool to have a
> > script that would output the number of bytes wasted and the number of
> > allocations needed for a given set of message allocations that you've
> > logged. Then you could adjust the default size and the size for individual
> > messages and try to minimize allocations and wasted bytes.
> 
> I was also thinking of the similar thing: to minimize wasted memory and
> allocations.
> However, I haven't figured out a good penalty function to "optimize". 
> The reason why I want a penalty function is then we can automatically
> generate message-metadata.ini.
> 
> Besides, we can further
> 
> 1) analyze fixed-size messages and variable-size messages respectively.
> 2) use different "first segment capacity" and "standard capacity" 
>    (might only helpful for variable-size messages)

Yes, we should aim for generating the size automatically, maybe in another bug. I'm worried that message-metadata.ini may be out of sync with reality because we don't know when message size changes.
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #17)
> Yes, we should aim for generating the size automatically, maybe in another
> bug. I'm worried that message-metadata.ini may be out of sync with reality
> because we don't know when message size changes.

Maybe we can get this done with the help of Telemetry alert system:
We probe the "wasted memory" or "allocation count" and telemetry backend
would notify us if there is a spike.
Attachment #8888154 - Attachment is obsolete: true
Attachment #8889254 - Flags: review+
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3c35ec94f3ed
Use custom size for small frequent messages. r=billm
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/3c35ec94f3ed
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Could we uplift this to 55? It seems rather safe to me.
Flags: needinfo?(hchang)
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #23)
> Could we uplift this to 55? It seems rather safe to me.

Sure but we have to uplift bug 1348591 as well. Let me see if there's a
conflict to 55.
Flags: needinfo?(hchang)
I've verified bug 1348591 and this one can be directly applied to 55
and run very well. So, Kanru, if you think we should uplift to 55 I
can make a request then.
(In reply to Henry Chang [:hchang] from comment #25)
> I've verified bug 1348591 and this one can be directly applied to 55
> and run very well. So, Kanru, if you think we should uplift to 55 I
> can make a request then.

I think uplifting bug 1348591 at this point is a bit scary considering it's only a week before merge. Let's just ride the train then.
Performance Impact: --- → P1
Whiteboard: [qf:p1][MemShrink] → [MemShrink]
You need to log in before you can comment on or make changes to this bug.