Closed Bug 837271 Opened 7 years ago Closed 7 years ago

Histogram::SampleSet::AccumulateWithExponentialStats is slow

Categories

(Toolkit :: Telemetry, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla21
Tracking Status
firefox20 --- fixed
firefox21 --- fixed

People

(Reporter: vladan, Assigned: froydnj)

Details

Attachments

(4 files, 3 obsolete files)

In bug 816166, we added log-sum and log-sum-squares stats to exponential histograms. We anticipated perf issues so we switched to the logf call in bug 826439, but _log_pentium4 still shows up in profiles:

http://people.mozilla.com/~bgirard/cleopatra/#report=4cb37c063a0e94c5dc02cad6a9485f1dbd2c0721&search=log_pentium

We shouldn't be calling logf on hot paths, so let's make it so that histograms have to opt into the detailed log-stats in Histograms.json. We should also change Histograms.json definitions to opt-in existing histograms that are outside hot paths. 

We'll need to uplift this fix or back out the original patch so it doesn't make it to release.
A simple flag on the histogram is all we need; fortunately, we already have
a flags field available to us.
Attachment #713475 - Flags: review?(vdjeric)
This patch adds the necessary bits to Telemetry only; the histogram definition
changes are coming in a later patch.
Attachment #713476 - Flags: review?(vdjeric)
Here's the bikeshed.  I like the color blue.

(Roughly speaking, I avoided expensive statistics for gradients, tab
switching, CC, GC, cache locking, and word cache-related histograms.
Those are the ones that stood out to me in my own browsers and briefly
browsing through the dashboards.)
Attachment #713479 - Flags: review?(vdjeric)
Attachment #713475 - Flags: review?(vdjeric) → review+
Comment on attachment 713476 [details] [diff] [review]
part 2 - add expensive_statistics_ok mechanism to histogram machinery

Would it be hard to add a regression test?
Attachment #713476 - Flags: review?(vdjeric) → review+
Comment on attachment 713479 [details] [diff] [review]
part 3 - flag histograms as OK for expensive statistics computation in Histograms.json

- Please update the wiki page after this lands: https://developer.mozilla.org/en-US/docs/Performance/Adding_a_new_Telemetry_probe
- Consider making the new field's name a bit shorter by naming it "expensive_log_stats" or something
- Nit: Can you leave the "description" field as the last field in the histogram definitions? I only ever look at Histograms.json when I'm adding a new probe or when I'm looking for a description of an existing histogram, and this makes it a bit easier to find
- How much delay does the logf call add according to your mini-benchmark? Looking at my own CACHE_MEMORY_SEARCH_2 histogram, I have 3000 entries from a 3 hour session, and all of them are less than 1ms. If the logf call adds 1ms to the operation, this might be unwanted overhead. Same with URLCLASSIFIER_LOOKUP_TIME
Attachment #713479 - Flags: review?(vdjeric) → review+
Irving suggested "slow_log_stats" on #perf
(In reply to Vladan Djeric (:vladan) from comment #4)
> Would it be hard to add a regression test?

That's a good idea, I'll check that e.g. GRADIENT_DURATION doesn't accumulate in the log_sum* fields.

(In reply to Vladan Djeric (:vladan) from comment #5)
> - Consider making the new field's name a bit shorter by naming it
> "expensive_log_stats" or something

I'm open to suggestions.  I wanted to avoid "log" in the name in an effort to make it more general; log stats are likely (only?) good for exponential histograms.  We might add other histogram kinds in the future and those may require expensive stats.

How about "extended_statistics"?  No connotations of slowness or expensiveness, and expansive enough to encompass future additions.

> - Nit: Can you leave the "description" field as the last field in the
> histogram definitions? I only ever look at Histograms.json when I'm adding a
> new probe or when I'm looking for a description of an existing histogram,
> and this makes it a bit easier to find.

Sure.

> - How much delay does the logf call add according to your mini-benchmark?
> Looking at my own CACHE_MEMORY_SEARCH_2 histogram, I have 3000 entries from
> a 3 hour session, and all of them are less than 1ms. If the logf call adds
> 1ms to the operation, this might be unwanted overhead. Same with
> URLCLASSIFIER_LOOKUP_TIME

A logf call on my development machine is ~20ns.  So I think we're within the noise factor here.
(In reply to Nathan Froyd (:froydnj) from comment #7)
> I'm open to suggestions.  I wanted to avoid "log" in the name in an effort
> to make it more general; log stats are likely (only?) good for exponential
> histograms.  We might add other histogram kinds in the future and those may
> require expensive stats.
> 
> How about "extended_statistics"?  No connotations of slowness or
> expensiveness, and expansive enough to encompass future additions.

I guess that name is ok, I can't think of anything better and i already asked on #perf. I guess we'll just have to make it clear in the wiki + maybe Histograms.json header what it does and what the tradeoffs are

> > - How much delay does the logf call add according to your mini-benchmark?
> > Looking at my own CACHE_MEMORY_SEARCH_2 histogram, I have 3000 entries from
> > a 3 hour session, and all of them are less than 1ms. If the logf call adds
> > 1ms to the operation, this might be unwanted overhead. Same with
> > URLCLASSIFIER_LOOKUP_TIME
> 
> A logf call on my development machine is ~20ns.  So I think we're within the
> noise factor here.

That's suprising, I wonder why it showed up in 2 consecutive 1ms samples in the profile in Comment 0

We should nominate the final patch for aurora approval asap
Quick-and-dirty test.
Attachment #714379 - Flags: review?(taras.mozilla)
(In reply to Vladan Djeric (:vladan) from comment #8)
> > A logf call on my development machine is ~20ns.  So I think we're within the
> > noise factor here.
> 
> That's suprising, I wonder why it showed up in 2 consecutive 1ms samples in
> the profile in Comment 0

Samples are just that: samples.  We only had one profile with this result, right?
Attachment #714379 - Flags: review?(taras.mozilla) → review+
Updated patch.  I made an executive decision to call these "extended" statistics.
If somebody feels strongly about that name or length of identifiers, or whatever,
they can file a bug and I will gladly change it.
Attachment #713475 - Attachment is obsolete: true
Attachment #714583 - Flags: review+
Comment on attachment 714583 [details] [diff] [review]
part 1 - add means to control computation of extended statistics to histogram code

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 816166/bug 826439
User impact if declined: Decreased performance.
Testing completed (on m-c, etc.): On inbound, will be merged later this evening.
Risk to taking this patch (and alternatives if risky): Low risk.  This is reverting us to a previous state of affairs.
String or UUID changes made by this patch: None.
Attachment #714583 - Flags: approval-mozilla-aurora?
Comment on attachment 714584 [details] [diff] [review]
part 2 - add extended_statistics_ok mechanism to histogram machinery

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 816166/bug 826439
User impact if declined: Decreased performance.
Testing completed (on m-c, etc.): On inbound, will be merged later this evening.
Risk to taking this patch (and alternatives if risky): Low risk.  This is reverting us to a previous state of affairs.
String or UUID changes made by this patch: None.
Attachment #714584 - Flags: approval-mozilla-aurora?
Comment on attachment 714585 [details] [diff] [review]
part 3 - flag histograms as OK for extended statistics computation in Histograms.json

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 816166/bug 826439
User impact if declined: Decreased performance.
Testing completed (on m-c, etc.): On inbound, will be merged later this evening.
Risk to taking this patch (and alternatives if risky): Low risk.  This is reverting us to a previous state of affairs.
String or UUID changes made by this patch: None.
Attachment #714585 - Flags: approval-mozilla-aurora?
Comment on attachment 714379 [details] [diff] [review]
part 4 - add test for extended statistics not being recorded

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 816166/bug 826439
User impact if declined: Decreased performance.
Testing completed (on m-c, etc.): On inbound, will be merged later this evening.
Risk to taking this patch (and alternatives if risky): Low risk.  This is reverting us to a previous state of affairs.
String or UUID changes made by this patch: None.
Attachment #714379 - Flags: approval-mozilla-aurora?
Comment on attachment 714379 [details] [diff] [review]
part 4 - add test for extended statistics not being recorded

Approving for aurora uplift even though this is only on inbound so far with the caveat that this pass testing and get merge to m-c before uplift to Aurora.  With the long weekend in the US it would be great if this can land before Tues 6am Pacific when we do the merge to beta.
Attachment #714379 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #714583 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #714584 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #714585 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
FWIW, this doesn't apply cleanly enough to Aurora for me to comfortably uplift. Nathan, I'm going to have to defer to you on it.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #21)
> FWIW, this doesn't apply cleanly enough to Aurora for me to comfortably
> uplift. Nathan, I'm going to have to defer to you on it.

Thanks for poking at it; I ran into the same problem last night. :(  I'm planning on doing this today, earlier rather than later if I can manage it.
You need to log in before you can comment on or make changes to this bug.