Open Bug 1334469 Opened 3 years ago Updated 2 years ago

Make sure Keyed Histogram APIs don't allow empty keys

Categories

(Toolkit :: Telemetry, defect, P3)

defect
Points:
1

Tracking

()

REOPENED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: gfritzsche, Unassigned)

References

(Depends on 1 open bug)

Details

(Whiteboard: [measurement:client])

Attachments

(1 file, 5 obsolete files)

Per this on empty histogram keys:

(Georg Fritzsche [:gfritzsche] from bug 1333624, comment #5)
> There is an internal artifact of the Histogram storage that makes this a bit
> harder to deal with.
> More importantly, i don't think any of the upstream tooling (e.g.
> telemetry.mozilla.org dashboard) expect & handle this.
> In any place where we should show a set of keys/labels for
> selection/consumption/..., empty keys will need special treatment.
Bug 1333624 will take care of this for the AutoTimer.
We should check that the rest of the API does not allow empty keys.
Assignee: nobody → alessio.placitelli
Priority: P2 → P1
Attached patch bug1334469.patch (obsolete) — Splinter Review
This patch asserts if empty keyes are provided for any action on Keyed histograms.

I added some test coverage in test_TelemetryHistogram.js too: it's conditionally enabled on non-debug builds only.

Even if we don't assert on empty keys but simply return nullptr, we have other asserts in place down the code path.

If that patch looks ok, a followup try push will identify any current use of empty keys and fix them.
Attachment #8831617 - Flags: review?(gfritzsche)
Status: NEW → ASSIGNED
Comment on attachment 8831617 [details] [diff] [review]
bug1334469.patch

Review of attachment 8831617 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ +934,5 @@
>                               bool subsession)
>  {
> +  // We do not allow empty keys in keyed scalars.
> +  if (key.IsEmpty()) {
> +    MOZ_ASSERT(false, "Keyed scalars do not allow empty keys.");

For the C++ code path, an assertion seems ok.
For JS and non-debug builds, we'll want to log errors though.

Is this not feasible?
Attachment #8831617 - Flags: review?(gfritzsche)
Attached patch bug1334469.patch (obsolete) — Splinter Review
It was a bit tricky, but this is correct. The programmed behaviour is:

- On debug builds, MOZ_ASSERT on both the C++ and JS path. Also print a console warning on the JS path.
- On relelease, it just prints a console warning on the JS path.
Attachment #8831617 - Attachment is obsolete: true
Attachment #8832092 - Flags: review?(gfritzsche)
Comment on attachment 8832092 [details] [diff] [review]
bug1334469.patch

Review of attachment 8832092 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ +962,5 @@
>  {
> +  // We do not allow empty keys in keyed scalars.
> +  if (key.IsEmpty()) {
> +    MOZ_ASSERT(false, "Keyed scalars do not allow empty keys.");
> +    return NS_ERROR_ILLEGAL_VALUE;

Ok, triggering assertions for wrong values from JavaScript here is not what we want.
Let's take a step back to just always printing warnings to the browser console.
Attachment #8832092 - Flags: review?(gfritzsche)
Attached patch bug1334469.patch (obsolete) — Splinter Review
I've changed the patch to always print an error in the browser console: we don't assert anymore.
Attachment #8832092 - Attachment is obsolete: true
Attachment #8833249 - Flags: review?(gfritzsche)
Comment on attachment 8833249 [details] [diff] [review]
bug1334469.patch

Review of attachment 8833249 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ +961,5 @@
>                               bool subsession)
>  {
> +  // We do not allow empty keys in keyed scalars.
> +  if (key.IsEmpty()) {
> +    return NS_ERROR_ABORT;

It becomes hard to track those kind of assertions through the code.
I think we should do those kind of checks on the outer API, i.e. the TelemetryHistogram::*, JSHistogram_*, JSKeyedHistogram_* functions.
Then we can also clearly log the problems without refactoring yet.

If you are worried about code duplication you can use a helper function.

::: toolkit/components/telemetry/tests/unit/test_TelemetryHistograms.js
@@ +492,5 @@
> +  const KEYED_ID = "TELEMETRY_TEST_KEYED_COUNT";
> +  let h = Telemetry.getKeyedHistogramById(KEYED_ID);
> +
> +  // Try to add to an empty key and make sure nothing happens.
> +  h.add("");

Are we calling this out clearly in the documentation?
Attachment #8833249 - Flags: review?(gfritzsche)
Points: --- → 1
Attached patch bug1334469.patch (obsolete) — Splinter Review
Attachment #8833249 - Attachment is obsolete: true
Attachment #8839416 - Flags: review?(gfritzsche)
Comment on attachment 8839416 [details] [diff] [review]
bug1334469.patch

Review of attachment 8839416 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ +1746,5 @@
>      return true;
>    }
>  
> +  if (key.IsEmpty()) {
> +    LogToBrowserConsole(nsIScriptError::errorFlag, NS_LITERAL_STRING("Empty keys are not allowed."));

Why not use the same message in all 3 changes here?
"Empty histogram keys are not allowed."

@@ +2145,5 @@
>    }
>  
> +  if (aKey.IsEmpty()) {
> +    LogToBrowserConsole(nsIScriptError::errorFlag,
> +      NS_LITERAL_STRING(" - The histogram key must not be empty."));

I've tested this using:
> let error = Cc["@mozilla.org/scripterror;1"].createInstance(Ci.nsIScriptError);
> error.init(" - foo", "", "", 0, 0, Ci.nsIScriptError.errorFlag, "chrome javascript")
> Services.console.logMessage(error)

This ends up printing just
> " - foo"

Please test this message and fix it if needed.

::: toolkit/components/telemetry/docs/collection/histograms.rst
@@ +87,5 @@
>  
>  Keyed Histograms
>  ----------------
>  
> +Keyed histograms are collections of one of the histogram types above, indexed by a non-empty string key. This is for example useful when you want to break down certain counts by a name, like how often searches happen with which search engine.

Nit: The "non-empty" is too much detail for the first paragraph, which should give an overview.
Add an empty line and a note that keys can't be empty to the next paragraph.

@@ +202,5 @@
>  
>    let keyed = Services.telemetry.getKeyedHistogramById("TAG_SEEN_COUNTS");
>    keyed.add("blink");
>  
> +Note that ``nsITelemetry.getHistogramById()`` will throw an ``NS_ERROR_ILLEGAL_VALUE`` JavaScript exception if it is called with an invalid histogram ID. The ``add()`` function will not throw if it fails or is called with an empty key string for keyed histograms, instead it prints an error in the browser console.

I think we shouldn't add more details to this sentence, its hard to read.
Its probably time to convert this into a list (and in bug 1338148 we can move these details to the API doc part).
How about:

> Note that:
> - ``nsITelemetry.getHistogramById()`` will throw an ``NS_ERROR_ILLEGAL_VALUE`` JavaScript exception when called with an invalid histogram ID
> - the ``add()`` function will not throw on failure, but log an error to the browser console
> - for keyed histograms, calling ``add()`` with an empty key will be ignored and log an error
Attachment #8839416 - Flags: review?(gfritzsche) → feedback+
Attached patch bug1334469.patch (obsolete) — Splinter Review
Thanks for the review, we should be there now.
Attachment #8839416 - Attachment is obsolete: true
Attachment #8839447 - Flags: review?(gfritzsche)
Comment on attachment 8839447 [details] [diff] [review]
bug1334469.patch

Review of attachment 8839447 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.

::: toolkit/components/telemetry/docs/collection/histograms.rst
@@ +89,5 @@
>  ----------------
>  
>  Keyed histograms are collections of one of the histogram types above, indexed by a string key. This is for example useful when you want to break down certain counts by a name, like how often searches happen with which search engine.
> +
> +The keys can't be empty strings. Also note that when you need to record for a small set of known keys, using separate plain histograms is more efficient.

"Note that keys can't be empty strings. Also note that ..."
Or just a list like the change below?
Attachment #8839447 - Flags: review?(gfritzsche) → review+
Attached patch bug1334469.patchSplinter Review
Attachment #8839447 - Attachment is obsolete: true
Attachment #8839457 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/3bb2c974b5dc
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
As part of this bug, did we find code that made this mistake as well?

After this patch, I am seeing "Empty histogram keys are not allowed" by the thousands on the console, all with (unknown) as the location.

I'm hitting the message in internal_JSKeyedHistogram_Add
Depends on: 1341996
(In reply to Mike Kaply [:mkaply] from comment #16)
> As part of this bug, did we find code that made this mistake as well?
> 
> After this patch, I am seeing "Empty histogram keys are not allowed" by the
> thousands on the console, all with (unknown) as the location.
> 
> I'm hitting the message in internal_JSKeyedHistogram_Add

I filed bug 1341996 for this, it's due to the DevTools' DEVTOOLS_JAVASCRIPT_ERROR_DISPLAYED.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
also backed out from m-c
https://hg.mozilla.org/mozilla-central/rev/4134c04c8028
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
I ran an analysis and there are more affected histograms:
DEVTOOLS_JAVASCRIPT_ERROR_DISPLAYED 	
URLCLASSIFIER_UPDATE_REMOTE_STATUS2
CANVAS_WEBGL_ACCL_FAILURE_ID
CANVAS_WEBGL_FAILURE_ID
FX_MIGRATION_ERRORS

See:
https://gist.github.com/georgf/94ca77fe6174ec07077504b24379932a
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Copying :jaws comment from bug 1341996:

(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #6)
> While fixing this, can the 'empty key' error message start including which
> histogram this is related to? It would help for debugging purposes.

Seems like this would be important to have if this is relanded.
Depends on: 1346196
Depends on: 1346201
Depends on: 1346203
Depends on: 1346205
Assignee: alessio.placitelli → nobody
Priority: P1 → P2
See Also: → 1333624
Outcomes from bug 1346205:
- we should track histogram recording into an empty key in a separate metric (keyed histogram?)
- we should reprioritize making the errors here into test failures (bug 1324774)
I ran the gist again [1], with some small changes:

- It's now using the Dataset API (a bit faster);
- It's just retrieving the build ids that have the fixes from the bug blocking this one

Good parts:

- Looks like all but URLCLASSIFIER_UPDATE_REMOTE_STATUS2 are not happening anymore.

Bad parts:

- Looks like URLCLASSIFIER_UPDATE_REMOTE_STATUS2 wasn't fixed by bug 1346196
- URLCLASSIFIER_UPDATE_REMOTE_NETWORK_ERROR and URLCLASSIFIER_UPDATE_SERVER_RESPONSE_TIME were recently added and are accumulating to empty keys as well :(

[1] - https://gist.github.com/Dexterp37/c0dd82374b49cf17539ded0e680af585
Given the bad parts from comment 23, we're changing strategy for this. We'll land this and add a whitelist (sad) for the offending histograms.
Priority: P2 → P3
See Also: → 1467826
You need to log in before you can comment on or make changes to this bug.