Enforce 100-bucket limit on Telemetry histograms

RESOLVED FIXED in Firefox 45

Status

()

P4
normal
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: vladan, Assigned: chutten)

Tracking

(Blocks: 1 bug)

Trunk
mozilla45
Points:
---

Firefox Tracking Flags

(firefox44 affected, firefox45 fixed)

Details

(Whiteboard: [measurement:client])

Attachments

(2 attachments, 5 obsolete attachments)

(Reporter)

Description

3 years ago
Created attachment 8669851 [details]
problemHistograms.txt

As discussed on fhr-dev, we should standardize on a maximum of 100 buckets in a histogram and enforce it in the histogram generation script.

Currently, there are 176 histograms with n_buckets > 100, so we may have to whitelist them.

Setting the maximum threshold higher would be of little help:

# of histograms with 100-199 buckets: 12
# of histograms with 200-299 buckets: 24
# of histograms with 400 buckets:  1
# of histograms with 500 or 512 buckets:  5
# of histograms with 700 buckets:  1
# of histograms with 1,000 buckets:  131
# of histograms with 10,000 buckets:  2

The 1,000 bucket histograms are the biggest challenge.
Bug 1167292 created the biggest histograms, with 10,000 buckets in 2 histograms
(Reporter)

Updated

3 years ago
Blocks: 1201022
Priority: -- → P4
Whiteboard: [measurement:client]
(Reporter)

Comment 1

3 years ago
Chris: can you add a 100 bucket maximum in toolkit/components/telemetry/histogram_tools.py? Whitelist the existing histograms with more than 100 buckets.. you can put their names in some kind of whitelist file in the same directory

Georg: any objections?
Assignee: nobody → chutten
(Assignee)

Comment 2

3 years ago
Created attachment 8678277 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Just a simple JSON file with the 185+ histograms who have numeric buckets > 100, and the code to throw an exception when one's seen.

Also, since this works in other places that might not have the whitelist present, default to allowing everything in the event the whitelist isn't found. Complain, but allow.
Attachment #8678277 - Flags: review?(nfroyd)
(Reporter)

Comment 3

3 years ago
Comment on attachment 8678277 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Review of attachment 8678277 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/bucket-whitelist.json
@@ +1,4 @@
> +{
> +  "MEMORY_RESIDENT": true,
> +  "MEMORY_JS_MAIN_RUNTIME_TEMPORARY_PEAK": true,
> +  "MEMORY_JS_GC_HEAP": true,

Add a JS comment to the header warning people not to edit the whitelist without Perf team's approval (include my IRC nick or email address)

::: toolkit/components/telemetry/histogram_tools.py
@@ +244,5 @@
>          self._high = try_to_coerce_to_number(high)
>          self._n_buckets = try_to_coerce_to_number(n_buckets)
> +        if self._n_buckets > 100 and type(self._n_buckets) is int:
> +            try:
> +                with open(os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json'), 'r') as f:

this line is pretty ugly, can you figure out the path on a separate line and put it in a temporary variable

@@ +250,5 @@
> +                        whitelist = json.load(f)
> +                    except ValueError, e:
> +                        raise BaseException, 'error parsing bucket whitelist'
> +                if self._name not in whitelist:
> +                    raise KeyError, 'Non-whitelisted histogram %s cannot have > 100 buckets' % self._name

I wouldn't want people to just whitelist their large histogram after getting this message :) How about "New histograms are not allowed to have more than 100 buckets. Histograms with hundreds of buckets increase mobile memory use, slow down histogram aggregation, etc. Contact :vladan or Perf team if you think an exception needs to be made"
(Assignee)

Comment 4

3 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #3)
> Comment on attachment 8678277 [details] [diff] [review]
> 0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
> 
> Review of attachment 8678277 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/telemetry/bucket-whitelist.json
> @@ +1,4 @@
> > +{
> > +  "MEMORY_RESIDENT": true,
> > +  "MEMORY_JS_MAIN_RUNTIME_TEMPORARY_PEAK": true,
> > +  "MEMORY_JS_GC_HEAP": true,
> 
> Add a JS comment to the header warning people not to edit the whitelist
> without Perf team's approval (include my IRC nick or email address)

JSON files aren't permitted to have comments and still be JSON files. I double-checked and python's json.load unfortunately doesn't seem to like comments (ie, it's a strict JSON parser, not a JS object parser).

> 
> ::: toolkit/components/telemetry/histogram_tools.py
> @@ +244,5 @@
> >          self._high = try_to_coerce_to_number(high)
> >          self._n_buckets = try_to_coerce_to_number(n_buckets)
> > +        if self._n_buckets > 100 and type(self._n_buckets) is int:
> > +            try:
> > +                with open(os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json'), 'r') as f:
> 
> this line is pretty ugly, can you figure out the path on a separate line and
> put it in a temporary variable

Sure.

> @@ +250,5 @@
> > +                        whitelist = json.load(f)
> > +                    except ValueError, e:
> > +                        raise BaseException, 'error parsing bucket whitelist'
> > +                if self._name not in whitelist:
> > +                    raise KeyError, 'Non-whitelisted histogram %s cannot have > 100 buckets' % self._name
> 
> I wouldn't want people to just whitelist their large histogram after getting
> this message :) How about "New histograms are not allowed to have more than
> 100 buckets. Histograms with hundreds of buckets increase mobile memory use,
> slow down histogram aggregation, etc. Contact :vladan or Perf team if you
> think an exception needs to be made"

Agreed. I'll change it up.
(Assignee)

Comment 5

3 years ago
Created attachment 8678305 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Updated error message, split path calculation to separate line for clarity.
Attachment #8678277 - Attachment is obsolete: true
Attachment #8678277 - Flags: review?(nfroyd)
Attachment #8678305 - Flags: review?(vladan.bugzilla)
Attachment #8678305 - Flags: review?(nfroyd)
(Reporter)

Comment 6

3 years ago
Comment on attachment 8678305 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Review of attachment 8678305 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/histogram_tools.py
@@ +244,5 @@
>          self._high = try_to_coerce_to_number(high)
>          self._n_buckets = try_to_coerce_to_number(n_buckets)
> +        if self._n_buckets > 100 and type(self._n_buckets) is int:
> +            try:
> +                whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json');

semicolons in python?

@@ +247,5 @@
> +            try:
> +                whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json');
> +                with open(whitelist_path, 'r') as f:
> +                    try:
> +                        whitelist = json.load(f)

are you loading the whitelist file for every histogram?
Attachment #8678305 - Flags: review?(vladan.bugzilla) → review-
(Assignee)

Comment 7

3 years ago
Created attachment 8678333 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Now we're only parsing it the once, and I've removed a spurious semi.
Attachment #8678305 - Attachment is obsolete: true
Attachment #8678305 - Flags: review?(nfroyd)
Attachment #8678333 - Flags: review?(vladan.bugzilla)
Attachment #8678333 - Flags: review?(nfroyd)
(Reporter)

Comment 8

3 years ago
Comment on attachment 8678333 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Review of attachment 8678333 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/bucket-whitelist.json
@@ +1,2 @@
> +{
> +  "MEMORY_RESIDENT": true,

why not make this an array? the "true" values seems redundant

::: toolkit/components/telemetry/histogram_tools.py
@@ +77,5 @@
>  always_allowed_keys = ['kind', 'description', 'cpp_guard', 'expires_in_version',
>                         'alert_emails', 'keyed', 'releaseChannelCollection']
>  
> +try:
> +    whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json')

are realpath & abspath needed?

@@ +82,5 @@
> +    with open(whitelist_path, 'r') as f:
> +        try:
> +            whitelist = json.load(f)
> +        except ValueError, e:
> +            raise BaseException, 'error parsing bucket whitelist'

include the path & filename in the error message to make it easier to debug

@@ +84,5 @@
> +            whitelist = json.load(f)
> +        except ValueError, e:
> +            raise BaseException, 'error parsing bucket whitelist'
> +except IOError:
> +    whitelist = None

declare this at the top of the file, and use a more descriptive name like n_buckets_whitelist

@@ +85,5 @@
> +        except ValueError, e:
> +            raise BaseException, 'error parsing bucket whitelist'
> +except IOError:
> +    whitelist = None
> +    print 'Unable to load whitelist. Assume all histograms are acceptable.'

ditto on including path/filename where script looked for the whitelist

@@ +255,5 @@
>          self._high = try_to_coerce_to_number(high)
>          self._n_buckets = try_to_coerce_to_number(n_buckets)
> +        if self._n_buckets > 100 and type(self._n_buckets) is int:
> +            if whitelist is None:
> +                print 'Histogram %s allowed to have > 100 buckets because the whitelist file could not be read' % self._name

this is going to print ~190 error messages, just put up one big scary message in the IOError handler instead
Attachment #8678333 - Flags: review?(vladan.bugzilla)
(Assignee)

Comment 9

3 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #8)
> Comment on attachment 8678333 [details] [diff] [review]
> 0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
> 
> Review of attachment 8678333 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/telemetry/bucket-whitelist.json
> @@ +1,2 @@
> > +{
> > +  "MEMORY_RESIDENT": true,
> 
> why not make this an array? the "true" values seems redundant

This way I can test for membership instead of having to indexOf each time. I could convert from list to set inside Python to accomplish a similar thing, but this way it is easier to extend in the future if we want this whitelist to mean more than just "presence in this list means we're good for > 100 buckets". Something like "MEMORY_RESIDENT": {n_buckets_limit: 400, can_access_awesome_thing1: 'read-only'}, for instance.
 
> ::: toolkit/components/telemetry/histogram_tools.py
> @@ +77,5 @@
> >  always_allowed_keys = ['kind', 'description', 'cpp_guard', 'expires_in_version',
> >                         'alert_emails', 'keyed', 'releaseChannelCollection']
> >  
> > +try:
> > +    whitelist_path = os.path.join(os.path.abspath(os.path.realpath(os.path.dirname(__file__))), 'bucket-whitelist.json')
> 
> are realpath & abspath needed?

For traversing symlinks it would be. I'm not sure if it's relevant, I copypasta'd from the many SCRIPT_DIR examples in other python files: https://dxr.mozilla.org/mozilla-central/search?q=SCRIPT_+realpath&redirect=true&case=true 

> @@ +82,5 @@
> > +    with open(whitelist_path, 'r') as f:
> > +        try:
> > +            whitelist = json.load(f)
> > +        except ValueError, e:
> > +            raise BaseException, 'error parsing bucket whitelist'
> 
> include the path & filename in the error message to make it easier to debug

Okily dokily.

> @@ +84,5 @@
> > +            whitelist = json.load(f)
> > +        except ValueError, e:
> > +            raise BaseException, 'error parsing bucket whitelist'
> > +except IOError:
> > +    whitelist = None
> 
> declare this at the top of the file, and use a more descriptive name like
> n_buckets_whitelist

Alrighty.

> @@ +85,5 @@
> > +        except ValueError, e:
> > +            raise BaseException, 'error parsing bucket whitelist'
> > +except IOError:
> > +    whitelist = None
> > +    print 'Unable to load whitelist. Assume all histograms are acceptable.'
> 
> ditto on including path/filename where script looked for the whitelist

Makes sense.

> @@ +255,5 @@
> >          self._high = try_to_coerce_to_number(high)
> >          self._n_buckets = try_to_coerce_to_number(n_buckets)
> > +        if self._n_buckets > 100 and type(self._n_buckets) is int:
> > +            if whitelist is None:
> > +                print 'Histogram %s allowed to have > 100 buckets because the whitelist file could not be read' % self._name
> 
> this is going to print ~190 error messages, just put up one big scary
> message in the IOError handler instead

I figured if it was in automation no one would be looking anyway, and if it wasn't in automation we'd want the user to be scared. I'll try to be just as scary with one line. 'Tis the season, after all.
Comment on attachment 8678333 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Review of attachment 8678333 [details] [diff] [review]:
-----------------------------------------------------------------

Vladan's review seems good enough to me, and it sounds like there are some revisions that are going to be made anyway.

::: toolkit/components/telemetry/histogram_tools.py
@@ +85,5 @@
> +        except ValueError, e:
> +            raise BaseException, 'error parsing bucket whitelist'
> +except IOError:
> +    whitelist = None
> +    print 'Unable to load whitelist. Assume all histograms are acceptable.'

Nit: I think the error reads slightly better as "Assuming all histograms..."

@@ +258,5 @@
> +            if whitelist is None:
> +                print 'Histogram %s allowed to have > 100 buckets because the whitelist file could not be read' % self._name
> +            elif self._name not in whitelist:
> +                raise KeyError, ('New histograms are not permitted to have more than 100 buckets. '
> +                                'Histograms with large numbers of buckets use disproportionately-high amounts of resources. '

Nit: no hyphen here.
Attachment #8678333 - Flags: review?(nfroyd)
(Assignee)

Comment 11

3 years ago
Created attachment 8678859 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch
Attachment #8678859 - Flags: review?(vladan.bugzilla)
Attachment #8678859 - Flags: review?(nfroyd)
(Reporter)

Updated

3 years ago
Attachment #8678333 - Attachment is obsolete: true
(Reporter)

Comment 12

3 years ago
Comment on attachment 8678859 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Review of attachment 8678859 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Chris H-C :chutten from comment #9)
> (In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #8)
> > why not make this an array? the "true" values seems redundant
> 
> This way I can test for membership instead of having to indexOf each time. I
> could convert from list to set inside Python to accomplish a similar thing,
> but this way it is easier to extend in the future if we want this whitelist
> to mean more than just "presence in this list means we're good for > 100
> buckets". Something like "MEMORY_RESIDENT": {n_buckets_limit: 400,
> can_access_awesome_thing1: 'read-only'}, for instance.

You have a point about future extensibility, but then the values in the dict should be {} or {n_buckets: true}. I suggest for this patch we just keep a list in the file and convert it to a set as you said.

> > are realpath & abspath needed?
> 
> For traversing symlinks it would be. I'm not sure if it's relevant, I
> copypasta'd from the many SCRIPT_DIR examples in other python files:
> https://dxr.mozilla.org/mozilla-central/
> search?q=SCRIPT_+realpath&redirect=true&case=true 

Ok, let's keep it
Attachment #8678859 - Flags: review?(vladan.bugzilla) → review+
(Assignee)

Comment 13

3 years ago
Created attachment 8678896 [details]
MozReview Request: Bug 1211599 - Only allow whitelisted histograms to have > 100 buckets.

Bug 1211599 - Only allow whitelisted histograms to have > 100 buckets.
(Assignee)

Comment 14

3 years ago
Comment on attachment 8678896 [details]
MozReview Request: Bug 1211599 - Only allow whitelisted histograms to have > 100 buckets.

mozreview test publish. Feel free to ignore.
Attachment #8678896 - Attachment is obsolete: true
Attachment #8678859 - Flags: review?(nfroyd) → review+
(Assignee)

Comment 15

3 years ago
Created attachment 8680018 [details] [diff] [review]
0001-Bug-1211599-Only-allow-whitelisted-histograms-to-hav.patch

Patch updated as asked. Now to figure out try.
Attachment #8678859 - Attachment is obsolete: true
Attachment #8680018 - Flags: review+
(Assignee)

Comment 16

3 years ago
Clean trybuild of win32 and xpcshell tests on Win7: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3d86e92d4cd2
Status: NEW → ASSIGNED
Keywords: checkin-needed

Comment 18

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/c0a60c2c5a83
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
status-firefox45: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
removing the b2g 2.5 flag since this commit has been reverted due to an incorrect merge, sorry for the confusion
status-b2g-v2.5: fixed → ---
You need to log in before you can comment on or make changes to this bug.