Closed
Bug 1353678
Opened 9 years ago
Closed 9 years ago
Make duplicate Histograms.json keys build errors
Categories
(Toolkit :: Telemetry, enhancement, P3)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla55
| Tracking | Status | |
|---|---|---|
| firefox55 | --- | fixed |
People
(Reporter: gfritzsche, Assigned: djmdeveloper060796, Mentored)
Details
(Whiteboard: [measurement:client] [lang=python] [good next bug])
Attachments
(1 file, 2 obsolete files)
|
1.75 KB,
patch
|
gfritzsche
:
review+
|
Details | Diff | Splinter Review |
Per bug 1353666, it is possible to have duplicate keys in Histograms.json.
We should add checking to histogram_tools.py [1] that makes this an error at build-time.
We parse the file at [2], where json.load() already returns a dict.
Either we can make json.load() be stricter or we need to add other checking methods.
1: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/histogram_tools.py
2: https://dxr.mozilla.org/mozilla-central/rev/81e37ef1360ba4505726ddf542ebdcc952a57578/toolkit/components/telemetry/histogram_tools.py#435
| Assignee | ||
Comment 1•9 years ago
|
||
@Georg Hi I would like to work on this. How should I proceed?
Flags: needinfo?(gfritzsche)
| Reporter | ||
Comment 2•9 years ago
|
||
You can test the behavior by adding duplicate keys like the ones from bug 1353666 to Histograms.json.
A quick web search finds suggestions like this to deal with duplicate keys:
http://stackoverflow.com/questions/14902299/json-loads-allows-duplicate-keys-in-a-dictionary-overwriting-the-first-value
From there raise an exception, ideally with the offending histogram names and property keys.
Assignee: nobody → djmdeveloper060796
Flags: needinfo?(gfritzsche)
| Assignee | ||
Comment 3•9 years ago
|
||
Added check for duplicate keys.
Attachment #8856343 -
Flags: review?(gfritzsche)
| Reporter | ||
Comment 4•9 years ago
|
||
Comment on attachment 8856343 [details] [diff] [review]
bug1353678.patch
Review of attachment 8856343 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks for the patch.
This looks close already, but i think we can actually do this more efficiently.
::: toolkit/components/telemetry/histogram_tools.py
@@ +429,5 @@
>
>
> +# The json being parsed should not contain duplicate keys. This function
> +# checks for duplicate keys.If duplicate key is present it raises a
> +# ValueError mentioning the duplicate key
Nits:
- "JSON"
- We can describe this a bit shorter. Together with the change below e.g.:
"This hook function loads the histograms into an OrderedDict.
It will raise a ValueError if duplicate keys are found."
@@ +434,5 @@
> +def dict_raise_on_duplicates(ordered_pairs):
> + d = {}
> + for key, value in ordered_pairs:
> + if key in d:
> + raise ValueError("duplicate key: %r" % (key,))
No need for "%r", we only expect string keys.
"Found duplicate key in Histograms file: %s"
@@ +448,5 @@
> def from_Histograms_json(filename):
> + # Checking for duplicate keys
> + with open(filename, 'r') as f:
> + try:
> + histograms = json.load(f, object_pairs_hook=dict_raise_on_duplicates)
Thinking about this, this will load the file from disk two times, which can impact build time.
Let's instead:
- change the object_pairs_hook below to use the new helper function
- return an OrderDict from the helper function
- rename the helper function to something more suitable like "load_histograms_into_dict"
Attachment #8856343 -
Flags: review?(gfritzsche) → feedback+
| Assignee | ||
Comment 5•9 years ago
|
||
I did the changes you mentioned above.
Attachment #8856343 -
Attachment is obsolete: true
Attachment #8856541 -
Flags: review?(gfritzsche)
| Reporter | ||
Comment 6•9 years ago
|
||
Comment on attachment 8856541 [details] [diff] [review]
bug1353678.patch
Review of attachment 8856541 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks, this looks good! I only have two small change requests below.
::: toolkit/components/telemetry/histogram_tools.py
@@ +434,5 @@
> +def load_histograms_into_dict(ordered_pairs):
> + d = collections.OrderedDict()
> + for key, value in ordered_pairs:
> + if key in d:
> + raise ValueError("Found duplicate key in Histograms file: %s" % (key,))
Nit: You don't need the trailing comma here. Just use e.g.:
> ("..." % key)
@@ +436,5 @@
> + for key, value in ordered_pairs:
> + if key in d:
> + raise ValueError("Found duplicate key in Histograms file: %s" % (key,))
> + else:
> + d[key] = value
We don't need an `else:` here - if the above condition is true we always throw.
Attachment #8856541 -
Flags: review?(gfritzsche) → feedback+
| Reporter | ||
Comment 7•9 years ago
|
||
I also saw Python linting errors:
> <...>/toolkit/components/telemetry/histogram_tools.py
> 432:1 error too many blank lines (3) E303 (flake8)
> 434:1 error expected 2 blank lines, found 3 E302 (flake8)
Please run this on the code:
> ./mach lint -l flake8 toolkit/components/telemetry
| Assignee | ||
Comment 8•9 years ago
|
||
@George did the changes you mentioned. Lint tests are passing.
Attachment #8856541 -
Attachment is obsolete: true
Attachment #8857086 -
Flags: review?(gfritzsche)
| Reporter | ||
Comment 9•9 years ago
|
||
Comment on attachment 8857086 [details] [diff] [review]
bug1353678.patch
Review of attachment 8857086 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks!
Attachment #8857086 -
Flags: review?(gfritzsche) → review+
| Reporter | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 10•9 years ago
|
||
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0c63f689efc7
Made duplicate Histograms.json keys build errors. r=gfritzsche
Keywords: checkin-needed
Comment 11•9 years ago
|
||
| bugherder | ||
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in
before you can comment on or make changes to this bug.
Description
•