Add a way to clone histograms

RESOLVED FIXED in mozilla12

Status

()

Toolkit
Telemetry
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: (dormant account), Assigned: froydnj)

Tracking

unspecified
mozilla12
x86
Windows 7
Points:
---
Bug Flags:
in-testsuite ?

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 17 obsolete attachments)

2.60 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
3.11 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
2.74 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
5.95 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
(Reporter)

Description

7 years ago
bug 701583 landed a hack to enumerate histograms on startup and then send them in. The proper way to do this is to add a clone(or initFrom?) method to nsITelemetry.

This way one could do Telemetry.histogramFrom("STARTUP_MOZ_SQLITE_PLACES_MAINTHREAD_SYNC_MS", "MOZ_SQLITE_PLACES_MAINTHREAD_SYNC_MS") and have a histogram appear in about:telemetry.

This could also use a testcase
(Assignee)

Comment 1

7 years ago
I think I understand how to do this after poking at it this morning.
Assignee: nobody → nfroyd
Status: NEW → ASSIGNED
(Assignee)

Comment 2

7 years ago
Created attachment 574361 [details] [diff] [review]
separate out C++-vs-JS API bits

For creating various bits of histograms, we need a way to get at the C++ bits, rather than the JS bits.

This doesn't look like a straight win, but functions in the next patch will use these.
Attachment #574361 - Flags: review?(tglek)
(Assignee)

Comment 3

7 years ago
Created attachment 574362 [details] [diff] [review]
add nsITelemetry::HistogramFrom

This patch adds the function we actually want.

I daresay we could write the histogramSnapshots bits all in JS now (well, maybe after adding a registeredHistograms list to nsITelemetry), but that's a separate bug.
Attachment #574362 - Flags: review?(tglek)
(Reporter)

Comment 4

7 years ago
This needs a third patch to create the STARTUP_* histograms using the new functionality and to have the testcase updated to check for it. You can do that by sending "sessionstore-windows-restored" from testcase.
(Reporter)

Comment 5

7 years ago
There is a tricky bit with "static" histograms. Histograms created by newHistogram are considered dynamic and we do not report those to the server. This is a silly precaution to not send histograms created by addons.
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryPing.js#193
(Reporter)

Comment 6

7 years ago
Comment on attachment 574361 [details] [diff] [review]
separate out C++-vs-JS API bits

>+nsresult
>+NewUserHistogram(const nsACString &name, PRUint32 min, PRUint32 max,
>+                 PRUint32 bucketCount, PRUint32 histogramType,
>+                 Histogram **ret)
>+{
>+  nsresult rv = HistogramGet(PromiseFlatCString(name).get(), min, max,
>+                             bucketCount, histogramType, ret);
>+  if (NS_FAILED(rv))
>+    return rv;
>+  (*ret)->ClearFlags(Histogram::kUmaTargetedHistogramFlag);
>+  return NS_OK;
>+}

this should be return rv; see below

>+
> NS_IMETHODIMP
> TelemetryImpl::NewHistogram(const nsACString &name, PRUint32 min, PRUint32 max, PRUint32 bucketCount, PRUint32 histogramType, JSContext *cx, jsval *ret)
> {
>   Histogram *h;
>-  nsresult rv = HistogramGet(PromiseFlatCString(name).get(), min, max, bucketCount, histogramType, &h);
>+  nsresult rv = NewUserHistogram(name, min, max, bucketCount, histogramType, &h);
>   if (NS_FAILED(rv))
>     return rv;
>-  h->ClearFlags(Histogram::kUmaTargetedHistogramFlag);

My code is usually perfect and self-explanatory. In this case it needed a comment. ClearFlags should stay in NewHistogram, this prevents "dynamic" histograms from being reported. You'd see a testfailure if you modified test_TelemetryPing.js to check for these being sent in the ping.

Rest is ok.
Attachment #574361 - Flags: review?(tglek) → review-
(Reporter)

Comment 7

7 years ago
Comment on attachment 574362 [details] [diff] [review]
add nsITelemetry::HistogramFrom

># HG changeset patch
># User Nathan Froyd <froydnj@mozilla.com>
># Date 1321298898 18000
>Bug 701863 - Add a way to clone histograms; r=taras
>
>Add nsITelemetry::histogramFrom.
>
>diff --git a/toolkit/components/telemetry/Telemetry.cpp b/toolkit/components/telemetry/Telemetry.cpp
>index 6cea5d8..8de27b4 100644
>--- a/toolkit/components/telemetry/Telemetry.cpp
>+++ b/toolkit/components/telemetry/Telemetry.cpp
>@@ -106,6 +106,25 @@ const TelemetryHistogram gHistograms[] = {
> };
> 
> nsresult
>+TelemetryHistogramType(Histogram *h, PRUint32 *result)
>+{
>+  switch (h->histogram_type()) {
>+  case Histogram::HISTOGRAM:
>+    *result = nsITelemetry::HISTOGRAM_EXPONENTIAL;
>+    break;
>+  case Histogram::LINEAR_HISTOGRAM:
>+    *result = nsITelemetry::HISTOGRAM_LINEAR;
>+    break;
>+  case Histogram::BOOLEAN_HISTOGRAM:
>+    *result = nsITelemetry::HISTOGRAM_BOOLEAN;
>+    break;
>+  default:
>+    return NS_ERROR_INVALID_ARG;
>+  }
>+  return NS_OK;
>+}

Please don't use nsresult for binary success/failure returns, bool works better for this.
Rest looks ok.
Attachment #574362 - Flags: review?(tglek) → review+
(Assignee)

Comment 8

7 years ago
(In reply to Taras Glek (:taras) from comment #6)
> My code is usually perfect and self-explanatory. In this case it needed a
> comment. ClearFlags should stay in NewHistogram, this prevents "dynamic"
> histograms from being reported. You'd see a testfailure if you modified
> test_TelemetryPing.js to check for these being sent in the ping.

OK, I can see the reasoning here, but this is a weird interface.

Also, adding this cloning ability opens up a way to have "dynamic" histograms reported: say some malicious/clueless addon author cloned one of the base histograms that has more-or-less the properties (min/max/type/etc.) desired, but doesn't have a whole lot of data.  You might even just need the type; the data can be scaled to fit into the buckets.  Voila.
(Assignee)

Comment 9

7 years ago
Actually, maybe you don't even care about recording data.  A malicious addon can do something like:

var base_histogram = ...
for (i = 0; i < 1000000; i++)
   Telemetry.clone("USELESS_DOS" + i + base_histogram, base_histogram)

and we suddenly have a million extra histograms flooding our metrics server.
(Reporter)

Comment 10

7 years ago
(In reply to Nathan Froyd (:froydnj) from comment #9)
> Actually, maybe you don't even care about recording data.  A malicious addon
> can do something like:
> 
> var base_histogram = ...
> for (i = 0; i < 1000000; i++)
>    Telemetry.clone("USELESS_DOS" + i + base_histogram, base_histogram)
> 
> and we suddenly have a million extra histograms flooding our metrics server.

yes. If you want, you can solve this properly. in telemetry_ping, send dynamic histograms IFF they are named STARTUP_{foo} and the {foo} is also present in the histogram list.

I didn't add .static to guard against evil addon authors. This is mostly to show that we do not support gathering telemetry for addons yet.
(Assignee)

Comment 11

7 years ago
(In reply to Taras Glek (:taras) from comment #4)
> This needs a third patch to create the STARTUP_* histograms using the new
> functionality and to have the testcase updated to check for it. You can do
> that by sending "sessionstore-windows-restored" from testcase.

You mean implementing something like nsITelemetry::registeredHistograms, which returns a list of the names of the "base" histograms we support, then using that and histogramFrom in TelemetryPing.gatherStartupSqlite?

And which testcase?  tests/unit/test_TelemetryPing.js?
(Assignee)

Comment 12

7 years ago
(In reply to Taras Glek (:taras) from comment #10)
> yes. If you want, you can solve this properly. in telemetry_ping, send
> dynamic histograms IFF they are named STARTUP_{foo} and the {foo} is also
> present in the histogram list.

Apparently I fail at reading your perfect explanations, like your perfect code. :)  You mean sticking this check in the logic for nsITelemetry::histogramSnapshots, yes?

That method makes more sense to me than putting it in TelemetryPing and also closes out addon authors from modifying our checking by swapping in their JavaScript.
(Reporter)

Comment 13

7 years ago
(In reply to Nathan Froyd (:froydnj) from comment #12)
> (In reply to Taras Glek (:taras) from comment #10)
> > yes. If you want, you can solve this properly. in telemetry_ping, send
> > dynamic histograms IFF they are named STARTUP_{foo} and the {foo} is also
> > present in the histogram list.
> 
> Apparently I fail at reading your perfect explanations, like your perfect
> code. :)  You mean sticking this check in the logic for
> nsITelemetry::histogramSnapshots, yes?
> 
> That method makes more sense to me than putting it in TelemetryPing and also
> closes out addon authors from modifying our checking by swapping in their
> JavaScript.

I mean doing it near http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryPing.js#187

It makes sense to me because telemetry ping would be cloning the histograms to create STARTUP_* ones, so it should also ensure to report those. I'm not worried about evil addons replacing telemetryping. They can also include a native component in their code and modify staticness of histograms themselves.
(Assignee)

Comment 14

7 years ago
Created attachment 574937 [details] [diff] [review]
separate out C++-vs-JS API bits, v2

updated to keep flag clearing in NewHistogram
Attachment #574361 - Attachment is obsolete: true
Attachment #574937 - Flags: review?(tglek)
(Assignee)

Comment 15

7 years ago
Created attachment 574939 [details] [diff] [review]
2. add nsITelemetry::HistogramFrom, v2

Don't use nsresult for boolean-valued things.  Carrying over r+.
Attachment #574362 - Attachment is obsolete: true
Attachment #574939 - Flags: review+
(Reporter)

Comment 16

7 years ago
Comment on attachment 574937 [details] [diff] [review]
separate out C++-vs-JS API bits, v2

NewUserHistogram means nothing to me
Attachment #574937 - Flags: review?(tglek) → review-
(Assignee)

Comment 17

7 years ago
Created attachment 574968 [details] [diff] [review]
separate out C++-vs-JS API bits, v3

nuking NewUserHistogram, as requested
Attachment #574937 - Attachment is obsolete: true
Attachment #574968 - Flags: review?(tglek)
(Assignee)

Comment 18

7 years ago
Created attachment 574970 [details] [diff] [review]
use histogramFrom to clone STARTUP_* histograms

OK, so here's what I think you're getting at doing.

I ditched .static because it's not useful anymore; it was used to query "is this histogram that was defined at compile-time?" but it really meant "a histogram we are going to upload".  Since .static now gets set on cloned histograms, .static is overloaded and not especially useful.

I could have defined .static properly (i.e. this is something in TelemetryHistograms.h), but then using that to figure out what to upload just seemed messy.  I think registeredHistograms will be useful in the future, too.
Attachment #574970 - Flags: review?(tglek)
(Assignee)

Comment 19

7 years ago
Created attachment 574987 [details] [diff] [review]
1. separate out C++-vs-JS API bits, v4

Apparently git decided to be dumb about merging.  Merge markers removed.
Attachment #574968 - Attachment is obsolete: true
Attachment #574968 - Flags: review?(tglek)
Attachment #574987 - Flags: review?(tglek)
(Reporter)

Comment 20

7 years ago
Comment on attachment 574970 [details] [diff] [review]
use histogramFrom to clone STARTUP_* histograms

Please make registeredHistograms an object with {name:comment, ...}
Attachment #574970 - Flags: review?(tglek) → review+
(Reporter)

Updated

7 years ago
Attachment #574987 - Flags: review?(tglek) → review+
(Assignee)

Comment 21

7 years ago
Created attachment 575527 [details] [diff] [review]
3. use histogramFrom to clone STARTUP_* histograms

Changed registeredHistograms to return an object, not an array.  Made corresponding changes in TelemetryPing.js + a few fixes noticed by actually testing.

Carrying over r+.
Attachment #574970 - Attachment is obsolete: true
Attachment #575527 - Flags: review+
(Assignee)

Comment 22

7 years ago
Created attachment 575528 [details] [diff] [review]
4. add TelemetryPing tests for cloned histograms

Make sure that we pass along a cloned STARTUP_* histogram, but not some other cloned histogram with a random name.
Attachment #575528 - Flags: review?(tglek)
(Assignee)

Updated

7 years ago
Attachment #574939 - Attachment description: add nsITelemetry::HistogramFrom, v2 → 2. add nsITelemetry::HistogramFrom, v2
(Assignee)

Updated

7 years ago
Attachment #574987 - Attachment description: separate out C++-vs-JS API bits, v4 → 1. separate out C++-vs-JS API bits, v4
(Reporter)

Comment 23

7 years ago
Comment on attachment 575528 [details] [diff] [review]
4. add TelemetryPing tests for cloned histograms

can you also create a MOZ_SQLITE_whatever histogram and make sure that get copied into STARTUP_MOZ_SQLITE_whatever?
Attachment #575528 - Flags: review?(tglek) → review+
(Assignee)

Comment 24

7 years ago
(In reply to Taras Glek (:taras) from comment #23)
> can you also create a MOZ_SQLITE_whatever histogram and make sure that get
> copied into STARTUP_MOZ_SQLITE_whatever?

There are already MOZ_SQLITE histograms present when we run the xpcshell tests; you're asking me to clone one of those instead of PAGE_FAULTS_HARD?
(Reporter)

Comment 25

7 years ago
if they are already present, you can just check for their existance. Up to you if you want to drop page_faults stuff you added.
(Assignee)

Comment 26

7 years ago
You're saying it's sufficient to check for the existence of MOZ_SQLITE_*?  That's very different from what you were asking for before.

I don't see why it matters which histogram we use for the test.  I'm not checking for STARTUP_MOZ_SQLITE_* here because the TelemetryPing bits to create those cloned histograms don't fire in xpcshell tests.  I figured it was sufficient to clone a histogram, stick STARTUP_ on the front of its name, and make sure that gets sent, since that's essentially what we're doing with the SQLITE probes.
(Reporter)

Comment 27

7 years ago
Sorry I skipped some non-obvious details. I mean that if we already have MOZ_SQLITE*, we should check for STARTUP_MOZ_SQLITE... In order for that to happen we need to do something like:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js#29 eg
 TelemetryPing.observe(null, "sessionstore-windows-restored");
(Assignee)

Comment 28

7 years ago
Created attachment 578583 [details] [diff] [review]
4. add TelemetryPing tests for cloned histograms

Added checks for STARTUP_MOZ_SQLITE* to TelemetryPing tests.  I don't see how this makes much difference than what was already there, but...

Carrying over r+.
Attachment #575528 - Attachment is obsolete: true
Attachment #578583 - Flags: review+
(Assignee)

Comment 29

7 years ago
Created attachment 578584 [details] [diff] [review]
3. use histogramFrom to clone STARTUP_* histograms

Minor formatting tweaks.  Carrying over r+.
Attachment #575527 - Attachment is obsolete: true
Attachment #578584 - Flags: review+
(Assignee)

Comment 30

7 years ago
Marking for checkin-needed.  Please apply the patches in numbered order, thanks!
Keywords: checkin-needed
Backed out of inbound for build failures on all platforms
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=c7d671e9cf58
{
../../../../toolkit/components/telemetry/Telemetry.cpp: In member function 'virtual nsresult<unnamed>::TelemetryImpl::HistogramFrom(const nsACString_internal&, const nsACString_internal&, JSContext*, jsval*)':
../../../../toolkit/components/telemetry/Telemetry.cpp:356:46: error: 'NewUserHistogram' was not declared in this scope
make[7]: *** [Telemetry.o] Error 1
}

https://hg.mozilla.org/integration/mozilla-inbound/rev/4f9c27745913
Target Milestone: mozilla11 → ---
(Assignee)

Comment 33

7 years ago
Created attachment 579021 [details] [diff] [review]
2. add nsITelemetry::HistogramFrom, v3

Fixup build-breaking borkage.
Attachment #574939 - Attachment is obsolete: true
Attachment #579021 - Flags: review+
(Assignee)

Comment 34

7 years ago
Marking for checkin-needed, v2.  Please apply the patches in numbered order, thanks!
Keywords: checkin-needed
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/6c59889f16b0 for timing out in test_TelemetryPing.js on Windows (opt, so far, https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=346e090f5a4d will eventually tell you whether it also times out in Windows debug).
Target Milestone: mozilla11 → ---
(Assignee)

Comment 37

7 years ago
Are there any "these are common reasons why tests on Windows timeout" guides, or is this going to involve some painful debugging with VS?
(Assignee)

Comment 38

7 years ago
Created attachment 580437 [details] [diff] [review]
4. add TelemetryPing tests for cloned histograms, v2

PAGE_FAULTS_HARD doesn't exist on Windows, so the test was erroring trying to access the histogram and thereby timing out.  Since we're already checking for STARTUP_* histograms, just ditch cloning PAGE_FAULTS_HARD.  Fixes timeouts on my Windows PC.  Carrying over r+.
Attachment #578583 - Attachment is obsolete: true
Attachment #580437 - Flags: review+
(Assignee)

Comment 39

7 years ago
Third time's the charm!  Please apply the patches in numbered order, thanks!
Keywords: checkin-needed
(Assignee)

Comment 40

7 years ago
Created attachment 580502 [details] [diff] [review]
1. separate out C++-vs-JS API bits, v5

Rebase against master to pick up changes in this area.
Attachment #574987 - Attachment is obsolete: true
Attachment #580502 - Flags: review+
(Assignee)

Comment 41

7 years ago
Created attachment 580507 [details] [diff] [review]
2. add nsITelemetry::HistogramFrom, v4

Rebase against master to pick up changes in this area.
Attachment #579021 - Attachment is obsolete: true
(Assignee)

Comment 42

7 years ago
Created attachment 580509 [details] [diff] [review]
2. add nsITelemetry::HistogramFrom, v5

Rebase against master to pick up changes in this area.
Attachment #580507 - Attachment is obsolete: true
Attachment #580509 - Flags: review+
(Assignee)

Comment 43

7 years ago
Created attachment 580510 [details] [diff] [review]
3. use histogramFrom to clone STARTUP_* histograms, v2

Rebase against master to pick up changes in this area.
Attachment #578584 - Attachment is obsolete: true
Attachment #580510 - Flags: review+
(Assignee)

Comment 44

7 years ago
Created attachment 580515 [details] [diff] [review]
2. add nsITelemetry::HistogramFrom, v6

Grrr...git rebase swapped some hunks around.
Attachment #580509 - Attachment is obsolete: true
Attachment #580515 - Flags: review+
(Assignee)

Comment 45

7 years ago
Created attachment 580516 [details] [diff] [review]
3. use histogramFrom to clone STARTUP_* histograms, v3

Fix rebase bustage.
Attachment #580510 - Attachment is obsolete: true
Attachment #580516 - Flags: review+
(Marking whiteboard just to make it easier to keep track of which bugs in the checkin-needed search cannot be landed due to bug 709193, to save me opening them every time :-) )
Whiteboard: [waiting on bug 709193]
mozilla-central is now open again (subject to sheriff metering until the backlog clears), adjusting whiteboard accordingly.
Whiteboard: [waiting on bug 709193]
Has this passed try?
(Assignee)

Comment 49

7 years ago
I haven't pushed it to try; checkin-needed #2 passed tests on my Linux box; checkin-needed #3 passed tests on my Windows box.  Would it help if I pushed this to try myself?  (I see some people pushing stuff to try with automagic bz postings, but a lot of my checkin-needed patches thus far haven't required anything like that, so I'm not sure what the correct protocol is.)
I'm hoping to get a moment to push a load of checkin-neededs to try soon (I've seen enough people bitten by seemingly innocuous checkin-neededs that I'll always send them to try now), but if you have a chance in the meantime, it may expedite this landing :-)

You don't necessarily need to use the "post to bug" feature of trychooser, just pushing to try with say "try: -b do -n -p all -u all -t none" and posting in the bug the tbpl url you are emailed, is enough for me to check out the results before pushing. Up to you :-)
(Assignee)

Comment 51

7 years ago
OK, thanks.  Here's the tbpl url: https://tbpl.mozilla.org/?tree=Try&rev=85ffcb142ed7

Still figuring out how to use mercurial correctly, hopefully all those extraneous commits don't screw things up...

Comment 52

7 years ago
Try run for 85ffcb142ed7 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=85ffcb142ed7
Results (out of 12 total builds):
    success: 12
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/nfroyd@mozilla.com-85ffcb142ed7
(Reporter)

Comment 53

7 years ago
Attachment #1 [details] [diff] doesn't apply anymore
Keywords: checkin-needed
(Assignee)

Comment 54

7 years ago
Created attachment 586419 [details] [diff] [review]
1. separate out C++-vs-JS API bits, v6

Rebased patch.
Attachment #580502 - Attachment is obsolete: true
Attachment #586419 - Flags: review+
(Assignee)

Comment 55

7 years ago
Created attachment 586420 [details] [diff] [review]
3. use histogramFrom to clone STARTUP_* histograms, v3

Rebased patch and added r= note.
Attachment #580516 - Attachment is obsolete: true
Attachment #586420 - Flags: review+
(Assignee)

Updated

7 years ago
Keywords: checkin-needed
(Sorry seems that I missed marking this bug post-merge)

https://hg.mozilla.org/mozilla-central/rev/b9601344d6d7
https://hg.mozilla.org/mozilla-central/rev/4c6086b7e2fa
https://hg.mozilla.org/mozilla-central/rev/488c9feac9bd
https://hg.mozilla.org/mozilla-central/rev/555ca2958f6f
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla12
You need to log in before you can comment on or make changes to this bug.