Closed Bug 920169 Opened 6 years ago Closed 4 years ago

Remove bucket length expressions from Histograms.json

Categories

(Toolkit :: Telemetry, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: taras.mozilla, Assigned: hu.eric, Mentored)

References

Details

(Whiteboard: [measurement:client] [lang=c++])

Attachments

(1 file, 4 obsolete files)

GC_REASON_2 and STARTUP_MEASUREMENT_ERRORS are wrong(tm). We will be doing validation as part of https://wiki.mozilla.org/Telemetry/Reboot. There are 2 problems with computed bucket lengths:
1. The server has no idea what those constants are without us hardcoding them in parallel with Histograms.json
2. We will require histograms with the same name to have the same parameters...So if this computed value changes, we'll start throwing those histograms away
Note: we can retain the nice functionality of having telemetry histograms match enum sizes with asserts
Blocks: 856263
Histogram name                Constant                                  Actual value
--------------------------    --------------------------------------    ------------
GC_REASON_2                -> JS::gcreason::NUM_TELEMETRY_REASONS    -> 101
STARTUP_MEASUREMENT_ERRORS -> mozilla::StartupTimeline::MAX_EVENT_ID -> 12
POPUP_NOTIFICATION_MAINACTION_TRIGGERED_MS -> "80 + 1" -> 81

The script that generates the .h file should fail on non-numeric values.
Blocks: 1201022, 1245514
OS: Linux → All
Priority: -- → P3
Hardware: x86_64 → All
Summary: remove computed bucket lengths → Remove bucket length expressions from Histograms.json
Whiteboard: [measurement:client]
Version: unspecified → Trunk
(In reply to Nathan Froyd [:froydnj] from bug 1245514, comment #5)
> Adding appropriate static_asserts in Telemetry.cpp:
> 
> static_assert(gHistograms[GC_REASON_2].bucketCount <= JAVASCRIPT_ENUM_COUNT,
> "message");
> 
> seems like a reasonable way to handle it, with a comment about changing the
> histogram name if the assert fires, or something like that.  That condition
> might need to be massaged a bit...
Mentor: gfritzsche
Whiteboard: [measurement:client] → [measurement:client] [lang=c++]
There are four entries in Histograms.json that are enum-based expressions:
https://dxr.mozilla.org/mozilla-central/search?q=path%3AHistograms.json+regexp%3A\%22n_values\%22%3A\W\%22&redirect=false&case=false

We want to make those into static asserts per comment 4 and change the "n_values" field to contain a number that matches the current enum value.
No longer blocks: 1201022
No longer blocks: 856263
Assignee: nobody → pineapple.rice
I've added the static_asserts into my local copy.  When I change the "n_values" field in Histograms.json, a pair of compile errors appear:

```
0:07.26 Traceback (most recent call last):
 0:07.26   File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
 0:07.26     "__main__", fname, loader, pkg_name)
 0:07.26   File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
 0:07.27     exec code in run_globals
 0:07.27   File "/Users/erichu/mozilla-central/python/mozbuild/mozbuild/action/file_generate.py", line 92, in <module>
 0:07.27     sys.exit(main(sys.argv[1:]))
 0:07.27   File "/Users/erichu/mozilla-central/python/mozbuild/mozbuild/action/file_generate.py", line 63, in main
 0:07.27     ret = module.__dict__[method](output, *args.additional_arguments)
 0:07.27   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/gen-histogram-data.py", line 189, in main
 0:07.27     histograms = list(histogram_tools.from_files(filenames))
 0:07.27   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 383, in from_files
 0:07.27     yield Histogram(name, definition)
 0:07.27   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 114, in __init__
 0:07.27     self.compute_bucket_parameters(definition)
 0:07.27   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 201, in compute_bucket_parameters
 0:07.27     lambda p: self.set_bucket_parameters(*p(definition)))
 0:07.27   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 33, in table_dispatch
 0:07.27     return body(table[kind])
 0:07.27   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 201, in <lambda>
 0:07.27     lambda p: self.set_bucket_parameters(*p(definition)))
 0:07.27   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 274, in set_bucket_parameters
 0:07.27     'Contact :vladan or the Perf team if you think an exception ought to be made.' % self._name)
 0:07.27 KeyError: u'New histogram GC_MINOR_REASON_LONG is not permitted to have more than 100 buckets. Histograms with large numbers of buckets use disproportionately high amounts of resources. Contact :vladan or the Perf team if you think an exception ought to be made.'
 0:07.28 Traceback (most recent call last):
 0:07.28   File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
 0:07.28     "__main__", fname, loader, pkg_name)
 0:07.28   File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
 0:07.28     exec code in run_globals
 0:07.28   File "/Users/erichu/mozilla-central/python/mozbuild/mozbuild/action/file_generate.py", line 92, in <module>
 0:07.28     sys.exit(main(sys.argv[1:]))
 0:07.28   File "/Users/erichu/mozilla-central/python/mozbuild/mozbuild/action/file_generate.py", line 63, in main
 0:07.28     ret = module.__dict__[method](output, *args.additional_arguments)
 0:07.28   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/gen-histogram-enum.py", line 46, in main
 0:07.28     for histogram in histograms:
 0:07.28   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 383, in from_files
 0:07.29     yield Histogram(name, definition)
 0:07.29   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 114, in __init__
 0:07.29     self.compute_bucket_parameters(definition)
 0:07.29   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 201, in compute_bucket_parameters
 0:07.29     lambda p: self.set_bucket_parameters(*p(definition)))
 0:07.29   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 33, in table_dispatch
 0:07.29     return body(table[kind])
 0:07.29   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 201, in <lambda>
 0:07.29     lambda p: self.set_bucket_parameters(*p(definition)))
 0:07.29   File "/Users/erichu/mozilla-central/toolkit/components/telemetry/histogram_tools.py", line 274, in set_bucket_parameters
 0:07.29     'Contact :vladan or the Perf team if you think an exception ought to be made.' % self._name)
 0:07.29 KeyError: u'New histogram GC_MINOR_REASON_LONG is not permitted to have more than 100 buckets. Histograms with large numbers of buckets use disproportionately high amounts of resources. Contact :vladan or the Perf team if you think an exception ought to be made.'

```

These errors correspond to GC_MINOR_REASON_LONG, GC_MINOR_REASON, and GC_REASON_2.  I only see one pair of errors per compile, until I change the value of Histograms.json variable to 99.

These 3 values were assigned the dynamic n_value `JS::gcreason::NUM_TELEMETRY_REASONS` before, which is 100 at build startup time.  The error messages above suggests that 100 should be acceptable, but it isn't.

The line in `histogram_tools.py` raising the error has the following condition check:

```
if n_buckets_whitelist is not None and self._n_buckets > 100 and type(self._n_buckets) is int:
```

`self._n_buckets` has value 101 when the n_values is 100 in Histograms.json.

If I switch the Histograms.json value back to the original expression, the python value of self._n_buckets is "JS::gcreason::NUM_TELEMETRY_REASONS+1", and its type is unicode.  The unicode type explains why the condition passes in histogram_tools.py when its value is eventually 101.
It looks like the value comes from these lines in histogram_tools.py:

https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/histogram_tools.py#286-289

At this point, I'm not sure if the error check should be updated to 101 or if there's a flaw with that "value+1" logic.
(In reply to Eric Hu from comment #6)
> These errors correspond to GC_MINOR_REASON_LONG, GC_MINOR_REASON, and
> GC_REASON_2.  I only see one pair of errors per compile, until I change the
> value of Histograms.json variable to 99.
> 
> These 3 values were assigned the dynamic n_value
> `JS::gcreason::NUM_TELEMETRY_REASONS` before, which is 100 at build startup
> time.  The error messages above suggests that 100 should be acceptable, but
> it isn't.
> 
> The line in `histogram_tools.py` raising the error has the following
> condition check:
> 
> ```
> if n_buckets_whitelist is not None and self._n_buckets > 100 and
> type(self._n_buckets) is int:
> ```
> 
> `self._n_buckets` has value 101 when the n_values is 100 in Histograms.json.
> 
> If I switch the Histograms.json value back to the original expression, the
> python value of self._n_buckets is "JS::gcreason::NUM_TELEMETRY_REASONS+1",
> and its type is unicode.  The unicode type explains why the condition passes
> in histogram_tools.py when its value is eventually 101.

Given that this just happened to work before, we should simply add those Histograms to the exceptions listed in the file bucket-whitelist.json.
The eval-logic this goes through is pretty awkward and can be cleaned up with bug 1245910.

It's admittedly surprising that using |"n_values":100| brings that error with enumerated values, but we can take that to a follow-up bug.
The n_values+1 for enumerated histograms is actually desired: we need an additional bucket for the values outside of the expected value range.
See Also: → 1247260
Attachment #8717975 - Flags: review?(gfritzsche)
Comment on attachment 8717975 [details] [diff] [review]
Remove references to C++ constants in Histograms.json

Review of attachment 8717975 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this looks good.
I have only one request below.

::: toolkit/components/telemetry/Telemetry.cpp
@@ +1959,5 @@
>    // Setup of the initial recording-enabled state (for not-recording-by-default
>    // histograms) using InitHistogramRecordingEnabled() will happen after instantiating
>    // sTelemetry since it depends on the static GetKeyedHistogramById(...) - which
>    // uses the singleton instance at sTelemetry.
> +  static_assert((JS::gcreason::NUM_TELEMETRY_REASONS == 100),

Nit: Lets add an empty line and a comment before the asserts and describe what this section is for.
Suggesting:

// Some Telemetry histograms depend on the value of C++ constants and hardcode
// their values in Histograms.json.
// We add static asserts here for those values to match so that future changes
// don't go unnoticed.
Attachment #8717975 - Flags: review?(gfritzsche) → feedback+
Attachment #8718432 - Flags: review?(gfritzsche)
Good suggestion.  Thanks for the feedback!
I see you uploaded the diff to the previous patch.
Can you upload the full diff? We usually don't upload incremental diffs for reviews on bugzilla.
Flags: needinfo?(pineapple.rice)
Flags: needinfo?(pineapple.rice)
Attachment #8718812 - Flags: review?(gfritzsche)
Updated, sorry about that.  I thought the patches stacked like in git.
Comment on attachment 8718812 [details] [diff] [review]
Remove references to C++ constants in Histograms.json

Review of attachment 8718812 [details] [diff] [review]:
-----------------------------------------------------------------

I was tempted to say lets properly compare this to the actual histogram value as suggested in comment 4, i.e. to |gHistograms[<histogram id>].bucketCount|.
However, static_assert() requires a constexpr, and gHistograms is a const array, which is sadly not a constexpr.
We could make define gHistograms as a constexpr, but until we require Visual Studio 2015, we don't have constexpr support across platforms in our code base.

However, lets add a TODO comment to that effect:
// TODO: Compare explicitly with gHistograms[<histogram id>].bucketCount here once we can make gHistograms constexpr (requires VS2015).

::: toolkit/components/telemetry/Telemetry.cpp
@@ +1966,5 @@
> +  // We add static asserts here for those values to match so that future changes
> +  // don't go unnoticed.
> +  static_assert((JS::gcreason::NUM_TELEMETRY_REASONS == 100),
> +      "NUM_TELEMETRY_REASONS is assumed to be a fixed value in Histograms.json."
> +      "If this was an intentional change, update this assert with its value and"

Nit: The messages are missing a space after or before the line breaks.
Attachment #8718812 - Flags: review?(gfritzsche) → feedback+
(In reply to Eric Hu from comment #15)
> Updated, sorry about that.  I thought the patches stacked like in git.

With MozReview / ReviewBoard (the new review approach), they do.
However, patches in Bugzilla here are treated just as plain stand-alone patches.
Attached patch Merge m-i to m-c, (obsolete) — Splinter Review
Attachment #8718432 - Attachment is obsolete: true
Attachment #8718432 - Flags: review?(gfritzsche)
Attachment #8718849 - Flags: review?(gfritzsche)
Attachment #8718846 - Attachment is obsolete: true
Comment on attachment 8718849 [details] [diff] [review]
Remove references to C++ constants in Histograms.json

Review of attachment 8718849 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8718849 - Flags: review?(gfritzsche) → review+
Attachment #8717975 - Attachment is obsolete: true
Attachment #8718812 - Attachment is obsolete: true
All green.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/c89eba9442f4
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in before you can comment on or make changes to this bug.