Closed Bug 1338555 Opened 7 years ago Closed 7 years ago

Add an accumulation limit to Scalars IPC messages

Categories

(Toolkit :: Telemetry, defect, P3)

defect
Points:
1

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox54 --- wontfix
firefox55 --- fixed

People

(Reporter: Dexter, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client])

Attachments

(1 file)

We do have a watermark limit for Histogram IPC accumulation [1].

We should have the same for scalars too.

[1] - http://searchfox.org/mozilla-central/rev/d3307f19d5dac31d7d36fc206b00b686de82eee4/toolkit/components/telemetry/TelemetryIPCAccumulator.cpp#111
Blocks: 1275517
Priority: -- → P1
Whiteboard: [measurement:client]
Points: --- → 1
The watermark was introduced to try and account for odd thought-to-be-capacity-related issues (which turned out to be threading issues).

Is there evidence that we require a size-based watermark in addition to the time-based one? Or is it just a desired practice?
Good question, there was no specific reason besides hardening & having sanity limits.

Gabor, do you know if we have any documented limits on the IPDL/IPC message sizes?
I assume these are not unbounded?
Flags: needinfo?(gkrizsanits)
(In reply to Georg Fritzsche [:gfritzsche] from comment #2)
> Good question, there was no specific reason besides hardening & having
> sanity limits.
> 
> Gabor, do you know if we have any documented limits on the IPDL/IPC message
> sizes?
> I assume these are not unbounded?

I think it's 256MB.

http://searchfox.org/mozilla-central/source/ipc/chromium/src/chrome/common/ipc_channel.h#53
Flags: needinfo?(gkrizsanits)
Alright, i wonder if anyone has ever done performance tests for different message sizes.
Lets just pick some "reasonable" number here now, presumably we don't need to max out to 256MB.
(In reply to Georg Fritzsche [:gfritzsche] from comment #4)
> Alright, i wonder if anyone has ever done performance tests for different
> message sizes.
> Lets just pick some "reasonable" number here now, presumably we don't need
> to max out to 256MB.

Mccr8 is probably the best person to ask, he invested quite some time in keeping message sizes under control and monitor the distribution of the size of various message types. I'm pretty sure there is still a lot of work to be done in the area. But unfortunately we don't have enough resources on memshrink currently... There was an attempt to release assert for >128MB messages, but didn't turn out great: bug 1271102.
Ah, thanks!
Andrew, do you have more context here?
Are there any guidelines or rules of thumb for IPC message sizes?
Flags: needinfo?(continuation)
(In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> Are there any guidelines or rules of thumb for IPC message sizes?

I assume here the issue is the data is being sent, more than how much is being sent? Large messages are not quite as bad now that we don't use contiguous allocation for them at the lowest level, but the serialization/deserialization process can still use extra memory. Nobody has looked in detail at performance beyond that, as far as I know.
Flags: needinfo?(continuation)
After discussing with Georg, we're aiming for a 10Mb upper limit.

Before diving into the computations, here's the size in bytes of the individual actions that we're storing in the IPC messages.

Accumulation: 8
KeyedAccumulation: 24 (excluding the key data)
ScalarAction 40 
KeyedScalarAction 56 (excluding the key data and the data for sting scalars)

If we consider an average key size of 32 characters, the new size for each KeyedAccumulation
and KeyedScalarAction becomes:

KeyedAccumulation: 56 bytes
KeyedScalarAction: 88 bytes

Histograms
==========
When the current watermark is hit [1] (5 * 1024 = 5120), we will be sending: 
 
gHistogramAccumulations storage = 5120 * 8 bytes = 40960bytes = ~40kb
gKeyedHistogramAccumulations storage = 5120 * 56= 286720 bytes = ~280kb

Scalars
=======
With a tentative 10Mb limit, we could use an high cap for scalars (50 * 1024 = 51200).
With this watermark, we will be sending:

gChildScalarsActions storage = 51200* 40 bytes = 2048000 = ~2Mb
gChildKeyedScalarsActions storage = 51200* 88 bytes = 4505600=  ~4.4Mb

We could probably lower the scalar watermark limit to account for events as well. Does this make any sense?

[1] - http://searchfox.org/mozilla-central/rev/39e4b25a076c59d2e7820297d62319f167871449/toolkit/components/telemetry/ipc/TelemetryIPCAccumulator.cpp#42
@Chris, @Georg do the high cap of 51200 elements suggested in comment 8 for Scalars sound reasonable to you?
Flags: needinfo?(gfritzsche)
Flags: needinfo?(chutten)
Seems high to me (the guy who spitballed something far lower for hgrams). If we're going to set it this high, are we expecting this to only be hit in anomalous cases? Should we log/accumulate on those anomalous cases?
Flags: needinfo?(chutten)
Priority: P1 → P2
Is there any specific downside we are expecting from this?
This probably should only protect from notorious case.

Lets add some limit soon, that's better than none at all right now.
Flags: needinfo?(gfritzsche)
Can you set this up for mentoring Alessio?
Flags: needinfo?(alessio.placitelli)
Priority: P2 → P3
Chris, events are currently using 10k events as a watermark limit [1]. Would it be ok for you to use this limit for the scalars as well? We could always tweak it as we go.

[1] - http://searchfox.org/mozilla-central/rev/c48398abd9f0f074c69f2223260939e30e8f99a8/toolkit/components/telemetry/ipc/TelemetryIPCAccumulator.cpp
Flags: needinfo?(chutten)
10k's fine by me, especially with this evidence in bugzilla supporting someone's initiative to change it later if it's problematic. :)
Flags: needinfo?(chutten)
Would you be interested in working on this?

This is about adding a limit to the number of scalar accumulations we do before sending an IPC message, here [1]. The limit should be 10k accumulations. You can see how |kHistogramAccumulationsArrayHighWaterMark| is being used and do something similar for the scalars!

[1] - http://searchfox.org/mozilla-central/rev/c48398abd9f0f074c69f2223260939e30e8f99a8/toolkit/components/telemetry/ipc/TelemetryIPCAccumulator.cpp#160,176
Flags: needinfo?(alessio.placitelli) → needinfo?(federico_padua)
(In reply to Alessio Placitelli [:Dexter] from comment #15)
> Would you be interested in working on this?
> 
> This is about adding a limit to the number of scalar accumulations we do
> before sending an IPC message, here [1]. The limit should be 10k
> accumulations. You can see how |kHistogramAccumulationsArrayHighWaterMark|
> is being used and do something similar for the scalars!
> 
> [1] -
> http://searchfox.org/mozilla-central/rev/
> c48398abd9f0f074c69f2223260939e30e8f99a8/toolkit/components/telemetry/ipc/
> TelemetryIPCAccumulator.cpp#160,176

Yes, I can try it ;)
Probably I will need some help along the way, I'll let you know.
One thing: a test should be written too?
Flags: needinfo?(federico_padua)
(In reply to Federico Padua (fedepad) from comment #16)
> (In reply to Alessio Placitelli [:Dexter] from comment #15)
> > Would you be interested in working on this?
> > 
> > This is about adding a limit to the number of scalar accumulations we do
> > before sending an IPC message, here [1]. The limit should be 10k
> > accumulations. You can see how |kHistogramAccumulationsArrayHighWaterMark|
> > is being used and do something similar for the scalars!
> > 
> > [1] -
> > http://searchfox.org/mozilla-central/rev/
> > c48398abd9f0f074c69f2223260939e30e8f99a8/toolkit/components/telemetry/ipc/
> > TelemetryIPCAccumulator.cpp#160,176
> 
> Yes, I can try it ;)
> Probably I will need some help along the way, I'll let you know.

Feel free to get in touch if you get stuck or want to chat about the implementation, no problem!

> One thing: a test should be written too?

If you can think of an useful and reliable way to test that, why not!

I don't think we have test coverage for that for events and histograms though.
Who should I put as reviewer of the patch?
I can review.
(In reply to Chris H-C :chutten from comment #19)
> I can review.

Ok, perfect. I'm doing a build now and then will push for review. 
Thanks.
Comment on attachment 8851133 [details]
Bug 1338555 - Add an accumulation limit to Scalars IPC messages;

https://reviewboard.mozilla.org/r/123526/#review126304

Well done, this appears to all be in order. I'll send this to autoland.
Attachment #8851133 - Flags: review?(chutten) → review+
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1e71acdb85ad
Add an accumulation limit to Scalars IPC messages; r=chutten
https://hg.mozilla.org/mozilla-central/rev/1e71acdb85ad
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: