Closed
Bug 1188888
Opened 10 years ago
Closed 9 years ago
Implement categorical histograms
Categories
(Toolkit :: Telemetry, defect, P1)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla50
| Tracking | Status | |
|---|---|---|
| firefox50 | --- | fixed |
People
(Reporter: rvitillo, Assigned: gfritzsche)
References
Details
(Whiteboard: [measurement:client])
Attachments
(7 files, 11 obsolete files)
|
7.19 KB,
patch
|
chutten
:
review+
|
Details | Diff | Splinter Review |
|
26.78 KB,
patch
|
gfritzsche
:
review+
|
Details | Diff | Splinter Review |
|
3.56 KB,
patch
|
chutten
:
review+
|
Details | Diff | Splinter Review |
|
1.67 KB,
patch
|
chutten
:
review+
|
Details | Diff | Splinter Review |
|
2.96 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
|
23.60 KB,
patch
|
gfritzsche
:
review+
|
Details | Diff | Splinter Review |
|
13.11 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
It would be convenient to have a histogram for categorical data where each category is represented with a string. Ideally we could start with the implementation of enumerated histograms and add labels to it.
| Reporter | ||
Updated•10 years ago
|
Summary: Implement categorial histograms → Implement categorical histograms
Comment 1•9 years ago
|
||
I suggested in email to keep this as type: "enumerated" and add a enumeration field:
{
"name": "MY_TELEMETRY_ENUM",
"type": "enumerated",
"enumeration: [
"VALUE_1",
"VALUE_2",
...
]
}
The enumeration could be exposed as a real c++ enum class, if the names were all tokens:
namespace Telemetry {
enum class MY_TELEMETRY_ENUM_Values {
VALUE_1 = 0,
VALUE_2 = 2
}
And also people would be allowed to accumulate string values and have them automatically converted:
void Accumulate(ID id, const nsCString& value);
void Accumulate(const char* name, const nsCString& value);
Georg, would you be willing to mentor somebody through this? It seems like a pretty simple bug for somebody who knows c++, although the xpconnect bits might be hard.
Flags: needinfo?(gfritzsche)
| Assignee | ||
Comment 2•9 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #1)
> I suggested in email to keep this as type: "enumerated" and add a
> enumeration field:
>
> {
> "name": "MY_TELEMETRY_ENUM",
> "type": "enumerated",
> "enumeration: [
> "VALUE_1",
> "VALUE_2",
> ...
> ]
> }
The `enumeration` part looks sensible, but i think we should use a different type.
We should be able to tell for which histograms we support this (and error out for others).
> The enumeration could be exposed as a real c++ enum class, if the names were
> all tokens:
>
> namespace Telemetry {
> enum class MY_TELEMETRY_ENUM_Values {
> VALUE_1 = 0,
> VALUE_2 = 2
> }
So, we can add a type-safe C++ path for this:
* generate an `enum class X : uint32_t ...` for each enumeration list
* generate a type list `TL` of those enum classes
* provide `template<class E> Accumulate(ID id, E value)`
* have that function `static_assert()` that `E` is an enum and in `TL`
* ... accumulate normally from there
> And also people would be allowed to accumulate string values and have them
> automatically converted:
>
> void Accumulate(ID id, const nsCString& value);
> void Accumulate(const char* name, const nsCString& value);
Do you think we need that for the C++ part?
We need the IDL/JS interface to take string arguments, but we should try to force C++ to use enums where possible.
> Georg, would you be willing to mentor somebody through this? It seems like a
> pretty simple bug for somebody who knows c++, although the xpconnect bits
> might be hard.
Sure, i'll set this up when i'm back.
Flags: needinfo?(gfritzsche)
OS: Unspecified → All
Priority: -- → P3
Hardware: Unspecified → All
Whiteboard: [measurement:client]
Version: unspecified → Trunk
Comment 3•9 years ago
|
||
There have been two cases in the past week where the C++ code had a string and would have had to build a giant switch block to get back to an enumeration, which I was trying to avoid by exposing this directly to C++.
| Assignee | ||
Comment 4•9 years ago
|
||
Roberto, i'm thinking that:
* we add a new histogram type, say "labelled" or "categorical"
* for a start, we treat them like enumerated histograms in the aggregator & cerberus
* in a follow-up bug, we add showing labels on t.m.o
How does that sound?
Flags: needinfo?(rvitillo)
| Reporter | ||
Comment 5•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #4)
> Roberto, i'm thinking that:
> * we add a new histogram type, say "labelled" or "categorical"
> * for a start, we treat them like enumerated histograms in the aggregator &
> cerberus
> * in a follow-up bug, we add showing labels on t.m.o
>
> How does that sound?
Sounds good to me. I would prefer "categorical" over "labelled".
Flags: needinfo?(rvitillo)
| Assignee | ||
Updated•9 years ago
|
Points: --- → 3
Priority: P3 → P1
| Assignee | ||
Updated•9 years ago
|
Assignee: nobody → gfritzsche
| Assignee | ||
Comment 6•9 years ago
|
||
Some long overdue refactoring of test_nsITelemetry.js.
Attachment #8769664 -
Flags: review?(alessio.placitelli)
| Assignee | ||
Comment 7•9 years ago
|
||
More prep work: Make the histogram string tables human-readable. I dont see a great reason for the previous structure.
Attachment #8769665 -
Flags: review?(chutten)
| Assignee | ||
Comment 8•9 years ago
|
||
More prep work: Clean up and refactor the type checking in the histogram script.
Attachment #8769666 -
Flags: review?(chutten)
| Assignee | ||
Comment 9•9 years ago
|
||
Implementing the categorical histograms and JS API. The C++ API comes in a separate patch.
Attachment #8769667 -
Flags: review?(chutten)
Updated•9 years ago
|
Attachment #8769665 -
Flags: review?(chutten) → review+
Updated•9 years ago
|
Attachment #8769666 -
Flags: review?(chutten) → review+
Comment 10•9 years ago
|
||
Comment on attachment 8769667 [details] [diff] [review]
Part 4 - Implement categorical histograms with JS API
Review of attachment 8769667 [details] [diff] [review]:
-----------------------------------------------------------------
Basically there, but I have some questions
::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ +295,5 @@
> +nsresult
> +HistogramInfo::label_id(const char* label, uint32_t* labelId) const
> +{
> + MOZ_ASSERT(label);
> + MOZ_ASSERT(this->histogramType == nsITelemetry::HISTOGRAM_CATEGORICAL);
Do we want to early-return on these failing in opt?
@@ +298,5 @@
> + MOZ_ASSERT(label);
> + MOZ_ASSERT(this->histogramType == nsITelemetry::HISTOGRAM_CATEGORICAL);
> +
> + for (uint32_t i = 0; i < this->label_count; ++i) {
> + // gHistogramLabelTable contains the indices of the label strings in the
I like "label table". Rolls off the tongue. :)
@@ +377,5 @@
> switch (histogramType) {
> case nsITelemetry::HISTOGRAM_EXPONENTIAL:
> *result = Histogram::FactoryGet(name, min, max, bucketCount, Histogram::kUmaTargetedHistogramFlag);
> break;
> case nsITelemetry::HISTOGRAM_LINEAR:
Do we need a "// Fall Through." comment or something here to prove that we're doing this a-purpose? *checks style guide* I guess we don't, but I'd kinda like one :)
::: toolkit/components/telemetry/gen-histogram-data.py
@@ +47,5 @@
> + label_index = 0
> + if len(labels) > 0:
> + label_index = label_count
> + label_table.append((histogram.name(), string_table.stringIndexes(labels)))
> + label_count = label_count + len(labels)
Nit: label_count += len(labels)
::: toolkit/components/telemetry/tests/unit/test_nsITelemetry.js
@@ +216,5 @@
> + Assert.throws(() => h1.add(s));
> + }
> +
> + let snapshot = h1.snapshot();
> + Assert.equal(snapshot.sum, 6);
Uh... sum(snapshot.counts) is seven, no?
Attachment #8769667 -
Flags: review?(chutten) → review-
| Assignee | ||
Comment 11•9 years ago
|
||
(In reply to Chris H-C :chutten from comment #10)
> ::: toolkit/components/telemetry/tests/unit/test_nsITelemetry.js
> @@ +216,5 @@
> > + Assert.throws(() => h1.add(s));
> > + }
> > +
> > + let snapshot = h1.snapshot();
> > + Assert.equal(snapshot.sum, 6);
>
> Uh... sum(snapshot.counts) is seven, no?
`.sum` is the sum of the submitted values and we submit 3 zero-bucket values.
`.sum` is basically `sum(.ranges * .counts)`.
| Assignee | ||
Comment 12•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #11)
> (In reply to Chris H-C :chutten from comment #10)
> > ::: toolkit/components/telemetry/tests/unit/test_nsITelemetry.js
> > @@ +216,5 @@
> > > + Assert.throws(() => h1.add(s));
> > > + }
> > > +
> > > + let snapshot = h1.snapshot();
> > > + Assert.equal(snapshot.sum, 6);
> >
> > Uh... sum(snapshot.counts) is seven, no?
>
> `.sum` is the sum of the submitted values and we submit 3 zero-bucket values.
> `.sum` is basically `sum(.ranges * .counts)`.
... at least for linear/enumerated/categorical histograms where the bucket size is 1.
For differently sized-buckets this doesn't hold though.
Comment 14•9 years ago
|
||
Ack, sorry about that. (Marked as spam)
r+ with or without nits addressed.
| Assignee | ||
Comment 15•9 years ago
|
||
(In reply to Chris H-C :chutten from comment #10)
> @@ +298,5 @@
> > + MOZ_ASSERT(label);
> > + MOZ_ASSERT(this->histogramType == nsITelemetry::HISTOGRAM_CATEGORICAL);
> > +
> > + for (uint32_t i = 0; i < this->label_count; ++i) {
> > + // gHistogramLabelTable contains the indices of the label strings in the
>
> I like "label table". Rolls off the tongue. :)
"i wasn't able to disable the label table" ...
> @@ +377,5 @@
> > switch (histogramType) {
> > case nsITelemetry::HISTOGRAM_EXPONENTIAL:
> > *result = Histogram::FactoryGet(name, min, max, bucketCount, Histogram::kUmaTargetedHistogramFlag);
> > break;
> > case nsITelemetry::HISTOGRAM_LINEAR:
>
> Do we need a "// Fall Through." comment or something here to prove that
> we're doing this a-purpose? *checks style guide* I guess we don't, but I'd
> kinda like one :)
I would agree if we have any code there, but with the switch labels directly following each other it seems like an obvious pattern.
Comment 16•9 years ago
|
||
Comment on attachment 8769664 [details] [diff] [review]
Part 1 - Refactor test_nsITelemetry.js
Review of attachment 8769664 [details] [diff] [review]:
-----------------------------------------------------------------
Good think you're taking care of this file. I think we can get rid of some more stuff (e.g. |expect_fail|). Other than that, there are just a few other minor style comments.
::: toolkit/components/telemetry/tests/unit/test_nsITelemetry.js
@@ +17,3 @@
> }
>
> +function expect_fail(f) {
Since we're here, can't we use Assert.throws instead of defining this?
@@ +26,5 @@
> + }
> + do_check_true(failed);
> +}
> +
> +function expect_success(f) {
Mh, why is this needed? Can't we simply call f() within the test and let it break it when throwing?
@@ +103,5 @@
> }
>
> +// This MUST be the very first test of this file.
> +add_task({
> + skip_if: () => gIsAndroid
Trailing whitespace.
@@ +128,5 @@
> + let [min, max, bucket_count] = [1, INT_MAX - 1, 10]
> + check_histogram(histogram_type, "test::"+histogram_type, min, max, bucket_count);
> +
> + const nh = Telemetry.newHistogram;
> + expect_fail(() => nh("test::min", "never", histogram_type, 0, max, bucket_count));
I think we can use |Assert.throws| in place of |expect_fail| and get rid of the latter.
@@ +136,5 @@
>
> +add_task(function test_noSerialization() {
> + // Instantiate the storage for this histogram and make sure it doesn't
> + // get reflected into JS, as it has no interesting data in it.
> + let h = Telemetry.getHistogramById("NEWTAB_PAGE_PINNED_SITES_COUNT");
nit: You're not using |h|, I think we can get rid of it
@@ +145,1 @@
> {
nit: Let's move "{" to the previous line
@@ +167,1 @@
> {
nit: same here and in other places within this file.
@@ +460,1 @@
> {
nit: since we're here, would you move the opening "{" to the previous line?
@@ +509,1 @@
> {
nit: move "{" to the previous line
@@ +565,1 @@
> {
nit: same here
@@ +737,1 @@
> {
nit: move "{" to the previous line
@@ +781,3 @@
>
> +add_task({
> + skip_if: () => gIsAndroid
nit: trailing whitespace
@@ +867,3 @@
>
> +add_task({
> + skip_if: () => gIsAndroid
nit: trailing whitespace
Attachment #8769664 -
Flags: review?(alessio.placitelli) → feedback+
| Assignee | ||
Comment 17•9 years ago
|
||
(In reply to Alessio Placitelli [:Dexter] from comment #16)
> Good think you're taking care of this file. I think we can get rid of some
> more stuff (e.g. |expect_fail|).
I'm not planning to do that in this bug.
Comment 18•9 years ago
|
||
Comment on attachment 8769664 [details] [diff] [review]
Part 1 - Refactor test_nsITelemetry.js
(In reply to Georg Fritzsche [:gfritzsche] from comment #17)
> (In reply to Alessio Placitelli [:Dexter] from comment #16)
> > Good think you're taking care of this file. I think we can get rid of some
> > more stuff (e.g. |expect_fail|).
>
> I'm not planning to do that in this bug.
It looks good then, with the other nits addressed.
Attachment #8769664 -
Flags: feedback+ → review+
| Assignee | ||
Comment 19•9 years ago
|
||
Nits addressed, also removed unused function check_bug_numbers from histogram_tools.
| Assignee | ||
Updated•9 years ago
|
Attachment #8769667 -
Attachment is obsolete: true
| Assignee | ||
Updated•9 years ago
|
Attachment #8769697 -
Flags: review+
| Assignee | ||
Comment 20•9 years ago
|
||
| Assignee | ||
Updated•9 years ago
|
Attachment #8769664 -
Attachment is obsolete: true
| Assignee | ||
Comment 21•9 years ago
|
||
Also fix not using generators for the add_task() functions.
| Assignee | ||
Updated•9 years ago
|
Attachment #8769699 -
Attachment is obsolete: true
| Assignee | ||
Updated•9 years ago
|
Attachment #8769704 -
Flags: review+
| Assignee | ||
Comment 22•9 years ago
|
||
We should restrict the label values to sane contents from the start.
Attachment #8770256 -
Flags: review?(chutten)
| Assignee | ||
Comment 23•9 years ago
|
||
| Assignee | ||
Comment 24•9 years ago
|
||
This adds the C++ API and enum generation.
Attachment #8770258 -
Flags: review?(chutten)
| Assignee | ||
Comment 25•9 years ago
|
||
This restricts the histogram names to sane contents instead of making the C++ build fail when adding unsupported characters.
Attachment #8770259 -
Flags: review?(chutten)
| Assignee | ||
Comment 26•9 years ago
|
||
Comment on attachment 8770257 [details] [diff] [review]
Part 6 - Add mozilla::tl::Or
This adds tl::Or<> to the MFBT, which is used in part 7 to check for whether a type is in a known set of types:
> template<class T> struct IsCategoricalLabelEnum : tl::Or<
> IsSame<T, LABELS_TELEMETRY_TEST_CATEGORICAL>::value,
> IsSame<T, LABELS_TELEMETRY_TEST_CATEGORICAL_OPTOUT>::value
> >::Type {};
Not the most pretty way i think, but it works.
Attachment #8770257 -
Flags: review?(jwalden+bmo)
Comment 27•9 years ago
|
||
Comment on attachment 8770256 [details] [diff] [review]
Part 5 - Restrict label values
Review of attachment 8770256 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/histogram_tools.py
@@ +257,5 @@
> + labels = definition.get('labels')
> + if not labels:
> + return
> +
> + limit = 20
Shouldn't this be a constant someplace instead of a local?
@@ +262,5 @@
> + invalid = filter(lambda l: len(l) > limit, labels)
> + if len(invalid) > 0:
> + raise ValueError, 'Label values for %s exceed length limit of %d: %s' % (name, limit, ', '.join(invalid))
> +
> + pattern = '^[a-z][a-z0-9_]+[a-z0-9]$'
Maybe a comment that this forces labels to conform to identifiers allowed by C++ version whatever?
Attachment #8770256 -
Flags: review?(chutten)
Updated•9 years ago
|
Attachment #8770257 -
Flags: review?(jwalden+bmo) → review+
| Assignee | ||
Comment 28•9 years ago
|
||
Attachment #8770514 -
Flags: review?(chutten)
| Assignee | ||
Updated•9 years ago
|
Attachment #8770256 -
Attachment is obsolete: true
| Assignee | ||
Comment 29•9 years ago
|
||
Updated doc comments for the API and dropped the EnableIf for a static_assert.
Attachment #8770515 -
Flags: review?(chutten)
| Assignee | ||
Updated•9 years ago
|
Attachment #8770258 -
Attachment is obsolete: true
Attachment #8770258 -
Flags: review?(chutten)
| Assignee | ||
Comment 30•9 years ago
|
||
Attachment #8770516 -
Flags: review?(chutten)
| Assignee | ||
Updated•9 years ago
|
Attachment #8770259 -
Attachment is obsolete: true
Attachment #8770259 -
Flags: review?(chutten)
Updated•9 years ago
|
Attachment #8770514 -
Flags: review?(chutten) → review+
Comment 31•9 years ago
|
||
Comment on attachment 8770515 [details] [diff] [review]
Part 7 - Implement C++ API for categorical histograms
Review of attachment 8770515 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/Telemetry.h
@@ +106,5 @@
> +};
> +
> +/**
> + * Adds sample to a categorical histogram defined in TelemetryHistogramEnums.h
> + * This allows passings strings matching the labels defined in Histograms.json
"passings"
I'm not sure that this sentence is very clear. What does the caller care about the index of the label string?
Maybe something like: "This string will be matched against the labels defined in Histograms.json. If the string does not match a label defined for the histogram, nothing will be recorded"
::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ +1132,1 @@
> JS_ReportError(cx, "Unknown label for categorical histogram");
By changing the semantics here, technically the error could be with the id (line 611 returns failure status from internal_GetHistogramByEnumId) _or_ the label. I'm not too worried.
::: toolkit/components/telemetry/gen-histogram-enum.py
@@ +91,5 @@
> + print("\nenum class %s : uint32_t {" % name, file=output)
> + print(" %s" % ",\n ".join(labels), file=output)
> + print("};", file=output)
> +
> + print("\ntemplate<class T> struct IsCategoricalLabelEnum : tl::Or<", file=output)
I'm not so familiar with enum class finagling. Please ask for a second pair of eyes to look at these parts.
Attachment #8770515 -
Flags: review?(chutten) → feedback+
Updated•9 years ago
|
Attachment #8770516 -
Flags: review?(chutten) → review+
| Assignee | ||
Comment 32•9 years ago
|
||
Nathan, can you take specifically take a look over the label enum handling? This affects IsCategoricalLabelEnum, CategoricalLabelId and their uses. chutten covered the rest.
Attachment #8770582 -
Flags: review?(nfroyd)
| Assignee | ||
Updated•9 years ago
|
Attachment #8770515 -
Attachment is obsolete: true
| Assignee | ||
Updated•9 years ago
|
Attachment #8770582 -
Flags: review?(chutten)
| Assignee | ||
Comment 33•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #32)
> Created attachment 8770582 [details] [diff] [review]
> Part 7 - Implement C++ API for categorical histograms
>
> Nathan, can you take specifically take a look over the label enum handling?
> This affects IsCategoricalLabelEnum, CategoricalLabelId and their uses.
> chutten covered the rest.
For supporting "categorical histograms" with a predefined set we generate C++ enums.
Basically for Histograms.json entries like this:
> "TELEMETRY_TEST_CATEGORICAL": {
> "alert_emails": ["telemetry-client-dev@mozilla.com"],
> "bug_numbers": [1188888],
> "expires_in_version": "never",
> "kind": "categorical",
> "labels": [
> "CommonLabel",
> "Label2",
> "Label3"
> ],
> "description": "a testing histogram; not meant to be touched"
> },
> "TELEMETRY_TEST_CATEGORICAL_OPTOUT": {
> "alert_emails": ["telemetry-client-dev@mozilla.com"],
> "bug_numbers": [1188888],
> "expires_in_version": "never",
> "releaseChannelCollection": "opt-out",
> "kind": "categorical",
> "labels": [
> "CommonLabel",
> "Label4",
> "Label5",
> "Label6"
> ],
> "description": "a testing histogram; not meant to be touched"
> },
... we generate one `enum class` for each categorical histograms labels and some type-checking. Here is a shortened version of the generated TelemetryHistogramEnums.h:
https://pastebin.mozilla.org/8883568
Comment 34•9 years ago
|
||
Comment on attachment 8770582 [details] [diff] [review]
Part 7 - Implement C++ API for categorical histograms
Review of attachment 8770582 [details] [diff] [review]:
-----------------------------------------------------------------
r1/2 for the telemetry semantics
Attachment #8770582 -
Flags: review?(chutten) → review+
Comment 35•9 years ago
|
||
Comment on attachment 8769665 [details] [diff] [review]
Part 2 - Generate readable Telemetry string tables
Review of attachment 8769665 [details] [diff] [review]:
-----------------------------------------------------------------
Drive-by review. The tables are written the way they are for a reason. :)
::: toolkit/components/telemetry/shared_telemetry_utils.py
@@ +26,5 @@
> the table if it's not there.
> :param string: the input string.
> """
> if string in self.table:
> + return self.table.index(string)
You have turned a constant-time operation into a linear-time operation here...and you actually do a linear-time operation twice, because of the membership check and the indexing operation are both linear.
@@ +37,5 @@
> :param f: the output stream.
> :param name: the name of the output array.
> """
> + entries = self.table
> + f.write('const char* %s[] = {\n' % name)
You have drastically increased the space requirements for the table here, because now in addition to the string data itself, you require pointers to individual strings, and those pointers on some platforms also require extra space in the form of relocations. Please don't do that.
Attachment #8769665 -
Flags: review-
Comment 36•9 years ago
|
||
Comment on attachment 8770582 [details] [diff] [review]
Part 7 - Implement C++ API for categorical histograms
Review of attachment 8770582 [details] [diff] [review]:
-----------------------------------------------------------------
The template definitions need a little bit of massaging, and label_id needs to be reworked.
::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ +600,5 @@
> +nsresult
> +internal_HistogramAddCategorical(mozilla::Telemetry::ID id, const nsCString& label)
> +{
> + uint32_t labelId = 0;
> + if (NS_FAILED(gHistograms[id].label_id(label.get(), &labelId))) {
I'll comment on this here rather than in the patch that implements it: I think this style of API in C++ is a horrible idea, because you should almost always have constant string values, and this API means that you're always doing work at runtime that you shouldn't have to do!
That said, bsmedberg indicated in comment 3 that we need something like this, and while I'm curious about the examples, I'll take his word for it.
But the underlying implementation of label_id is terrible: it should be logarithmic time rather than linear time, because we should be able to arrange it such that labels are stored in sorted order. So can we fix that, please?
::: toolkit/components/telemetry/gen-histogram-enum.py
@@ +93,5 @@
> + print("};", file=output)
> +
> + print("\ntemplate<class T> struct IsCategoricalLabelEnum : tl::Or<", file=output)
> + print(",\n".join([" IsSame<T, %s>::value" % name for name,_,_ in enums]), file=output)
> + print(">::Type {};", file=output)
This method is not optimal. What you really want is something like:
template<typename T>
struct IsCategorialLabelEnum : public mozilla::FalseType {};
template<>
struct IsCategoricalLabelEnum<ENUM_TYPE1> : public mozilla::TrueType {};
template<>
struct IsCategoricalLabelEnum<ENUM_TYPE2> : public mozilla::TrueType {};
and so forth.
@@ +97,5 @@
> + print(">::Type {};", file=output)
> +
> + print("\ntemplate<class T> struct CategoricalLabelId {};", file=output)
> + for name,_,id in enums:
> + print("template<> struct CategoricalLabelId<%s> : IntegralConstant<uint32_t, %s>::Type {};" % (name, id), file=output)
This doesn't need ::Type, as IntegralConstant<>::Type is just IntegralConstant<>.
Attachment #8770582 -
Flags: review?(nfroyd) → feedback+
| Assignee | ||
Comment 37•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #35)
> Comment on attachment 8769665 [details] [diff] [review]
> Part 2 - Generate readable Telemetry string tables
>
> Review of attachment 8769665 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Drive-by review. The tables are written the way they are for a reason. :)
Ok, i think that shows that we need a big comment there on the reason :)
Do you think this adds unacceptable space usage across platforms with current compiler generations?
| Assignee | ||
Comment 38•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #36)
> But the underlying implementation of label_id is terrible: it should be
> logarithmic time rather than linear time, because we should be able to
> arrange it such that labels are stored in sorted order. So can we fix that,
> please?
So, from what i see we have few options here:
(1) generate that mapping/sorting at compile-time: this means additional space usage (i.e. would have to be outside of the common histogram string table)
(2) generate that kind of (label -> id) mapping on start-up: this means storage costs for all labels (potentially many), many of which might not actually be used
(3) build a lazy map on label use: we can cache the results locally, speeding up repeated use
(1) and (2) don't sound great. If lookup perf is a concern, lets do (3)?
Flags: needinfo?(nfroyd)
Comment 39•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #37)
> (In reply to Nathan Froyd [:froydnj] from comment #35)
> > Drive-by review. The tables are written the way they are for a reason. :)
>
> Ok, i think that shows that we need a big comment there on the reason :)
> Do you think this adds unacceptable space usage across platforms with
> current compiler generations?
Yes. Granted, the space increase is not *huge* (~40K on x86-64 Linux, roughly half that on 32-bit platforms), but it is certainly a regression that's completely avoidable, and it would be an even more noticeable regression with multiple content processes. It doesn't really matter what the compiler generation is, either.
If you want to make it more readable, adding a (generated) comment for each string that shows the actual value would be reasonable.
(In reply to Georg Fritzsche [:gfritzsche] from comment #38)
> (In reply to Nathan Froyd [:froydnj] from comment #36)
> > But the underlying implementation of label_id is terrible: it should be
> > logarithmic time rather than linear time, because we should be able to
> > arrange it such that labels are stored in sorted order. So can we fix that,
> > please?
>
> So, from what i see we have few options here:
In my reading, I missed that the linear search is actually per-histogram, rather than global. So the number of entries is (presumably) small, which makes a linear search much more reasonable. So I was out of line in grousing about the linear lookup: my apologies. Let's keep the linear lookup for now. (I think you can do this without any extra space and some cleverness, but given that linear search is over a small number of entries, I'm not going to spend a lot of time thinking about it.)
Flags: needinfo?(nfroyd)
| Assignee | ||
Comment 40•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #39)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #37)
> > (In reply to Nathan Froyd [:froydnj] from comment #35)
> > > Drive-by review. The tables are written the way they are for a reason. :)
> >
> > Ok, i think that shows that we need a big comment there on the reason :)
> > Do you think this adds unacceptable space usage across platforms with
> > current compiler generations?
>
> Yes. Granted, the space increase is not *huge* (~40K on x86-64 Linux,
> roughly half that on 32-bit platforms), but it is certainly a regression
> that's completely avoidable, and it would be an even more noticeable
> regression with multiple content processes. It doesn't really matter what
> the compiler generation is, either.
>
> If you want to make it more readable, adding a (generated) comment for each
> string that shows the actual value would be reasonable.
Sounds good, that was my planned fallback option here.
| Assignee | ||
Comment 41•9 years ago
|
||
Attachment #8772437 -
Flags: review?(nfroyd)
| Assignee | ||
Updated•9 years ago
|
Attachment #8769665 -
Attachment is obsolete: true
| Assignee | ||
Comment 42•9 years ago
|
||
Rebase.
| Assignee | ||
Updated•9 years ago
|
Attachment #8769697 -
Attachment is obsolete: true
| Assignee | ||
Updated•9 years ago
|
Attachment #8772444 -
Flags: review+
| Assignee | ||
Updated•9 years ago
|
Attachment #8770257 -
Attachment is obsolete: true
| Assignee | ||
Comment 43•9 years ago
|
||
Addressed the other two points.
Attachment #8772445 -
Flags: review?(nfroyd)
| Assignee | ||
Updated•9 years ago
|
Attachment #8770582 -
Attachment is obsolete: true
Comment 44•9 years ago
|
||
Comment on attachment 8772437 [details] [diff] [review]
Part 2 - Generate readable Telemetry string tables
Review of attachment 8772437 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/shared_telemetry_utils.py
@@ +62,4 @@
> f.write("const char %s[] = {\n" % name)
> + for (string, offset) in entries:
> + if "*/" in string:
> + raise ValueError, "String in string table contains unexpected sequence '*/': %s" % string
Nice catch to save somebody some future frustration.
Attachment #8772437 -
Flags: review?(nfroyd) → review+
Comment 45•9 years ago
|
||
Comment on attachment 8772445 [details] [diff] [review]
Part 6 - Implement C++ API for categorical histograms
Review of attachment 8772445 [details] [diff] [review]:
-----------------------------------------------------------------
r=me. Thanks for catching that the mozilla::Or change was no longer needed.
Attachment #8772445 -
Flags: review?(nfroyd) → review+
| Assignee | ||
Comment 46•9 years ago
|
||
Thanks Nathan for the good points here.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=406d4064acb7
Comment 47•9 years ago
|
||
Pushed by georg.fritzsche@googlemail.com:
https://hg.mozilla.org/integration/fx-team/rev/6a5f0538b3c9
Part 1 - Refactor test_nsITelemetry.js. r=dexter
https://hg.mozilla.org/integration/fx-team/rev/58344005efca
Part 2 - Generate readable Telemetry string tables. r=froydnj
https://hg.mozilla.org/integration/fx-team/rev/45b83bf04a26
Part 3 - Refactor the histogram script type & presence checking. r=chutten
https://hg.mozilla.org/integration/fx-team/rev/a662e130c9de
Part 4 - Implement categorical histograms with JS API. r=chutten
https://hg.mozilla.org/integration/fx-team/rev/9f2ef29dee01
Part 5 - Restrict label values. r=chutten
https://hg.mozilla.org/integration/fx-team/rev/13b22aa53d11
Part 6 - Implement C++ API for categorical histograms. r=chutten,r=froyndj
https://hg.mozilla.org/integration/fx-team/rev/d8dcff77656e
Part 7 - Bonus - Explicitly restrict histogram names to current patterns. r=chutten
Comment 48•9 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/6a5f0538b3c9
https://hg.mozilla.org/mozilla-central/rev/58344005efca
https://hg.mozilla.org/mozilla-central/rev/45b83bf04a26
https://hg.mozilla.org/mozilla-central/rev/a662e130c9de
https://hg.mozilla.org/mozilla-central/rev/9f2ef29dee01
https://hg.mozilla.org/mozilla-central/rev/13b22aa53d11
https://hg.mozilla.org/mozilla-central/rev/d8dcff77656e
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox50:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
You need to log in
before you can comment on or make changes to this bug.
Description
•