The default bug view has changed. See this FAQ.

generate TelemetryHistograms.h from JSON

RESOLVED FIXED in mozilla17

Status

()

Toolkit
Telemetry
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: froydnj, Assigned: froydnj)

Tracking

unspecified
mozilla17
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(8 attachments, 4 obsolete attachments)

(Assignee)

Description

5 years ago
Created attachment 650557 [details]
strawman Histograms.json file

We have a couple bugs open (bug 748417, bug 753447, bug 778213) that would be somewhat easier to pull off if our histogram data were stored in a real format, rather than something shoved through the preprocessor.

Attached is a strawman proposal for what the data format would look like.  The only thing I'm not sure about is that the cpp_guard seems to easy to leave out, leading to unintentional bustage somewhere along the way.
Attachment #650557 - Flags: feedback?(taras.mozilla)

Updated

5 years ago
Attachment #650557 - Attachment mime type: application/json → text/plain

Comment 1

5 years ago
Comment on attachment 650557 [details]
strawman Histograms.json file

Yeah I agree with the general idea
Attachment #650557 - Flags: feedback?(taras.mozilla) → feedback+
(Assignee)

Updated

5 years ago
Depends on: 782373
(Assignee)

Comment 2

5 years ago
Created attachment 652839 [details] [diff] [review]
convert TelemetryHistograms.h to JSON

The rather large patch that removes the old stuff and adds the new stuff.

Will attach conversion scripts in a moment.
Attachment #650557 - Attachment is obsolete: true
Attachment #652839 - Flags: review?(taras.mozilla)
(Assignee)

Comment 3

5 years ago
Created attachment 652840 [details]
python portion of the conversion script

This is the Python conversion script.  It requires a shell wrapper for dealing with various bits of cpp and whatnot that I will attach shortly.
(Assignee)

Comment 4

5 years ago
Created attachment 652844 [details]
shell portion of the conversion script

...and the promised shell part.  Put convert-to-json.{py,sh} in the same directory and run:

./convert-to-json.sh /path/to/TelemetryHistograms.h /path/to/output/json/file

The script should work just fine on any TelemetryHistograms.h from mozilla-central as of today; past revisions may require a bit of tweaking.
(Assignee)

Comment 5

5 years ago
Adding Harsha to point him at the scripts.

Comment 6

5 years ago
Thanks. Just to confirm if i checkout mozilla-central I don't need to apply your patch to TelemetryHistogram.h ?
(Assignee)

Comment 7

5 years ago
(In reply to schintalapani from comment #6)
> Thanks. Just to confirm if i checkout mozilla-central I don't need to apply
> your patch to TelemetryHistogram.h ?

That's correct.  Though if you checkout mozilla-central and apply the patch, you shouldn't need to use the conversion scripts, since the patch includes toolkit/components/telemetry/Histograms.json.  You can use that json file directly.

Comment 8

5 years ago
Comment on attachment 652839 [details] [diff] [review]
convert TelemetryHistograms.h to JSON

gen-histogram-data.py needs license boilerplate.
There no useful comments in the file. Having a general description near top would be handy.
Also need description various gotchas, ie distinction between 'BOOLEAN' and 'boolean'


For a followup bug:
I'm not sure a single histogram json is a great idea. I think we should break .jsons into a directory of their own and aggregate them based on various build flags.

For example,
cpp_guard stuff could just live on filename level, ala histograms.XP_WIN.json
We could still autogenerate networking histograms via the build system.
We can also better group related histograms in own files, do channel-specific histograms, etc.
Attachment #652839 - Flags: review?(taras.mozilla) → review-
(Assignee)

Comment 9

5 years ago
(In reply to Taras Glek (:taras) from comment #8)
> gen-histogram-data.py needs license boilerplate.

Ah, yes, license boilerplate.  Forgot about that.

> There no useful comments in the file. Having a general description near top
> would be handy.

I refuse to break convention with the many millions of lines of source code already in mozilla-central by addressing this concern. ;)

> Also need description various gotchas, ie distinction between 'BOOLEAN' and
> 'boolean'

That particular distinction will go away once all the appropriate histograms are converted to use HISTOGRAM_BOOLEAN.

> For a followup bug:
> I'm not sure a single histogram json is a great idea. I think we should
> break .jsons into a directory of their own and aggregate them based on
> various build flags.
> 
> For example,
> cpp_guard stuff could just live on filename level, ala histograms.XP_WIN.json
> We could still autogenerate networking histograms via the build system.
> We can also better group related histograms in own files, do
> channel-specific histograms, etc.

I'd have to think about this.  It seems like it would be useful to have as little build knowledge as necessary so that any processing of the JSON can be done without having to perform a full build.  Splitting things into individual files wouldn't necessarily go against that, but autogeneration via the build system would.  I suppose you could provide pushbutton scripts for such purposes, to be used via the build system and any other interested parties.

Comment 10

5 years ago
Comment on attachment 652839 [details] [diff] [review]
convert TelemetryHistograms.h to JSON

Please separate the mechanical .h->.json translation into own patch in future patches.

I think we should factor out
+    with open(filename, 'r') as f:
+        histograms = json.load(f, object_pairs_hook=json.OrderedDict)
...
+        for (name, definition) in histograms.iteritems():
Into own python module even if it's a trivial amount of code.

This will make it easier to evolve our json functionality faster.

+if __name__ == '__main__':
+    main(sys.argv[1:])

Why not just do main(sys.argv[1])?

Overall this looks ok.

Updated

5 years ago
Attachment #652844 - Attachment mime type: application/x-sh → text/plain

Updated

5 years ago
Attachment #652840 - Attachment mime type: text/x-python → text/plain

Comment 11

5 years ago
Created attachment 654027 [details]
reference.json

Hi,
  Please take a look at the reference.json and confirm if its the right file. I looked at the older reference.json its completely different from this one so confused.
Thanks,
Harsha
(Assignee)

Updated

5 years ago
Attachment #654027 - Attachment mime type: application/json → text/plain
(Assignee)

Comment 12

5 years ago
Created attachment 654255 [details]
Histograms.json

(In reply to schintalapani from comment #11)
>   Please take a look at the reference.json and confirm if its the right
> file. I looked at the older reference.json its completely different from
> this one so confused.

I'm not sure what you mean by "older reference.json"; the file I'm attaching is the version I've submitted in my patches and will, with minor modifications, be committed to the repository once the surrounding infrastructure is approved.

Note that it has many more histograms than the reference.json you submitted; how did you generate your reference.json?

Comment 13

5 years ago
I checked out mozilla-central repo and used TelemetryHistograms.h and convert-to-json.sh to generate the reference.json
(Assignee)

Comment 14

5 years ago
(In reply to schintalapani from comment #13)
> I checked out mozilla-central repo and used TelemetryHistograms.h and
> convert-to-json.sh to generate the reference.json

What platform are you using?

Comment 15

5 years ago
OS X 10.8
(Assignee)

Comment 16

5 years ago
Created attachment 654344 [details] [diff] [review]
TelemetryHistograms.h -> JSON patch part

The TelemetryHistograms.h to JSON bit of the patch, separated out for ease of review.
Attachment #654344 - Flags: review?(taras.mozilla)
(Assignee)

Comment 17

5 years ago
Created attachment 654345 [details] [diff] [review]
Histograms.json processing machinery

The rest of the story.  License boilerplate added: do we need a header on Histograms.json?  That's going to complicate parsing...
Attachment #654345 - Flags: review?(taras.mozilla)
(Assignee)

Comment 18

5 years ago
Created attachment 654346 [details]
shell portion of the conversion script

(In reply to schintalapani from comment #15)
> OS X 10.8

Hum, that's strange.  Maybe the sed on your machine doesn't work the same as sed on my Linux box.  Could you try the attached script, please?
Attachment #652844 - Attachment is obsolete: true

Comment 19

5 years ago
I am getting this on OS X 
sed: 1: "/^\#define HISTOGRAM_/, ...": expected context address
I ran it in archlinux it works and matches your json output.
Thanks,
Harsha
(Assignee)

Comment 20

5 years ago
Created attachment 654363 [details]
shell portion of the conversion script
Attachment #654346 - Attachment is obsolete: true

Comment 21

5 years ago
Created attachment 654388 [details]
older version of reference.json

This is the older version of json I am talking about and there is no "buckets" integer array element in this new version.
(Assignee)

Comment 22

5 years ago
(In reply to schintalapani from comment #21)
> This is the older version of json I am talking about and there is no
> "buckets" integer array element in this new version.

Ah, we are talking past each other, then.

The JSON I am attaching to this bug is merely a description of the essential bits of histograms.  The JSON you want is that generated by the scripts attached to bug 748417.  I don't believe it's particularly difficult to get from this bug's JSON description to the JSON you want, but the code to do that has not yet been written. :)

Comment 23

5 years ago
(In reply to Nathan Froyd (:froydnj) from comment #22)
> attached to bug 748417.  I don't believe it's particularly difficult to get
> from this bug's JSON description to the JSON you want, but the code to do
> that has not yet been written. :)

By written, Nathan meant cut'n'pasted :)

(In reply to Nathan Froyd (:froydnj) from comment #17)
> Created attachment 654345 [details] [diff] [review]
> Histograms.json processing machinery
> 
> The rest of the story.  License boilerplate added: do we need a header on
> Histograms.json?  That's going to complicate parsing...

It's a datafile, lets skip boilerplate because format does not allow it.

Comment 24

5 years ago
Comment on attachment 654344 [details] [diff] [review]
TelemetryHistograms.h -> JSON patch part

Thanks for factoring this out.
Attachment #654344 - Flags: review?(taras.mozilla) → review+

Comment 25

5 years ago
Comment on attachment 654345 [details] [diff] [review]
Histograms.json processing machinery

+histograms_file := $(srcdir)/Histograms.json
+histogram_enum_file := TelemetryHistogramEnums.h
+histogram_data_file := TelemetryHistogramData.inc

lowercase LHS seems to be pretty uncommon in our makefiles

+def allowed_keys(*keylist):
+    always_allowed = ['kind', 'description', 'cpp_guard']
+    return list(keylist) + always_allowed
For sanity, please rename the function to get_allowed_keys() so we aren't mixing a variable a function of the same name within the same func. I'm not even sure what the value of this function is.


+    with open(filename, 'r') as f:
+        histograms = json.load(f, object_pairs_hook=json.OrderedDict)
+        return histograms.iteritems()

This may be better with yield. but matter of taste really.


+if __name__ == '__main__':
+    main(sys.argv[1:])

Why not just do main(sys.argv[1])?(this is a holdover from last review)

Comment 26

5 years ago
(In reply to Taras Glek (:taras) from comment #24)
> Comment on attachment 654344 [details] [diff] [review]
> TelemetryHistograms.h -> JSON patch part
> 
> Thanks for factoring this out.

Note I do not want to land this with "BOOLEAN". I do not want metrics to deal with that bogus distinction
(Assignee)

Comment 27

5 years ago
(In reply to Taras Glek (:taras) from comment #25)
> +histograms_file := $(srcdir)/Histograms.json
> +histogram_enum_file := TelemetryHistogramEnums.h
> +histogram_data_file := TelemetryHistogramData.inc
> 
> lowercase LHS seems to be pretty uncommon in our makefiles

The new hotness is that uppercase names are for build-system special variables and lowercase names are for makefile-local things.

> +def allowed_keys(*keylist):
> +    always_allowed = ['kind', 'description', 'cpp_guard']
> +    return list(keylist) + always_allowed
> For sanity, please rename the function to get_allowed_keys() so we aren't
> mixing a variable a function of the same name within the same func. I'm not
> even sure what the value of this function is.

I can just remove it.

> +    with open(filename, 'r') as f:
> +        histograms = json.load(f, object_pairs_hook=json.OrderedDict)
> +        return histograms.iteritems()
> 
> This may be better with yield. but matter of taste really.

as in:

  for item in histograms.iteritems():
    yield item

?  I don't care either way (it'll be changed to yield once other refactoring goes in).
 
> +if __name__ == '__main__':
> +    main(sys.argv[1:])
> 
> Why not just do main(sys.argv[1])?(this is a holdover from last review)

It is The Python Way.  Particularly useful if you're going to use a given file as a module, but you also want it to act as a script (testing, etc.).  I don't particularly care either way.

Comment 28

5 years ago
Comment on attachment 654345 [details] [diff] [review]
Histograms.json processing machinery

Other than minor nits I mentioned, this looks good. 
I'd like glandium to review the makefile bits.
Attachment #654345 - Flags: review?(taras.mozilla)
Attachment #654345 - Flags: review?(mh+mozilla)
Attachment #654345 - Flags: review+
(Assignee)

Comment 29

5 years ago
Created attachment 655047 [details]
python portion of the conversion script

(In reply to Taras Glek (:taras) from comment #26)
> > TelemetryHistograms.h -> JSON patch part
>
> Note I do not want to land this with "BOOLEAN". I do not want metrics to
> deal with that bogus distinction

Here's a version of the script which handles boolean histograms that get defined with the HISTOGRAM macro as opposed to the HISTOGRAM_BOOLEAN macro.

It also makes attempts to coerce expressions used for the upper bounds of histograms to numbers; doing this means that we can generate bucket ranges for a larger number of histograms than we could before.  I suppose we could just stick the coercion into the reading library; I am ambivalent about this one way or the other.
Attachment #652840 - Attachment is obsolete: true
Comment on attachment 654345 [details] [diff] [review]
Histograms.json processing machinery

Review of attachment 654345 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Makefile.in
@@ +71,5 @@
> +	$(PYTHON) $(srcdir)/gen-histogram-data.py $< > $@
> +
> +Telemetry.$(OBJ_SUFFIX): $(histogram_data_file)
> +
> +GARBAGE += $(histogram_data_file)

You want to remove histogram_enum_file too.
Attachment #654345 - Flags: review?(mh+mozilla) → review+
(Assignee)

Comment 31

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/b4316e1c474d
Assignee: nobody → nfroyd
Status: NEW → ASSIGNED
https://hg.mozilla.org/mozilla-central/rev/b4316e1c474d
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla17
Depends on: 790615
You need to log in before you can comment on or make changes to this bug.