Closed Bug 1121006 Opened 9 years ago Closed 9 years ago

FHR data migration: org.mozilla.translation

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set
normal

Tracking

(firefox43 fixed)

RESOLVED FIXED
Firefox 43
Tracking Status
firefox43 --- fixed

People

(Reporter: gfritzsche, Assigned: yarik.sheptykin, Mentored)

References

Details

(Whiteboard: [help][good next bug][lang=js])

Attachments

(1 file, 4 obsolete files)

No longer depends on: 1121001
Blocks: 1137160
No longer blocks: 1120356
This entirely consists of daily counts and flags.
It should be really straight-forward to port to using histograms.
The current implementation is counters and numeric fields around here:
https://hg.mozilla.org/mozilla-central/annotate/3d846527576f/browser/components/translation/Translation.jsm#l406

Most of these just map to plain histograms, the *ByLanguage fields map to keyed histograms.
We could just fill those histograms directly from Translation.jsm
Whiteboard: [help]
Mentor: felipc
Whiteboard: [help] → [help][good next bug][lang=js]
This is not urgently required on 39 and there are currently no running plans involving this.
Moving this over to the post-39 work for now.
Blocks: 1122482
No longer blocks: 1137160
Attached patch bug1121006.patch (obsolete) — Splinter Review
Hi guys!

I replaced the FHR provider with a provider that saves metrics in histograms. I updated the browser_translation_fhr.js test trying to keep all testing logic intact. I also merged the patch with the freshest master but due to frequent changes in Histograms.json there might be conflicts when you try to apply this patch.

I found a xpcshell test for (now inexistent) TranslationProvider. I left it untouched, because I don't know what is best to do there. Should I try to port the test to work with the new provider? Or should we just drop it? The test is not necessary because browser_translation_fhr.js covers much of the new telemetry provider. But it might be more convenient to test the provider in xpcshell.
Assignee: nobody → yarik.sheptykin
Status: NEW → ASSIGNED
Attachment #8610565 - Flags: feedback?(gfritzsche)
Attachment #8610565 - Flags: feedback?(felipc)
Iaroslav, sorry that i haven't gotten here yet!
I will be away until tuesday, i will try to take a proper look over it then, for now only a quick first look.
Comment on attachment 8610565 [details] [diff] [review]
bug1121006.patch

Review of attachment 8610565 [details] [diff] [review]:
-----------------------------------------------------------------

Open question: Do we actually want to strip out the FHR provider now too?
Or do we want to keep both that and the Telemetry implementation.

Felipe, if this isn't actively used right now we should be fine with just switching over to Telemetry?

::: browser/components/translation/Translation.jsm
@@ +46,5 @@
>      return this._defaultTargetLanguage;
>    },
>  
>    documentStateReceived: function(aBrowser, aData) {
> +    TranslationHealthReport.collectDailyData();

Why do we need that?
Can't we just record the measurements directly whenever the events occur?

@@ +400,5 @@
> +    SHOW_UI               : "SHOULD_TRANSLATION_UI_APPEAR",
> +    DETECT_LANG           : "SHOULD_AUTO_DETECT_LANGUAGE",
> +  };
> +  for(let key of Object.keys(this.HISTOGRAMS)) {
> +    this.HISTOGRAMS[key] = Services.telemetry.getHistogramById(this.HISTOGRAMS[key]);

I don't think we want to keep the histogram objects around all the time (for memory usage).
It's fine to just retrieve them when needed.

@@ +425,1 @@
>      return this._enqueueTelemetryStorageTask(function* recordTask() {

We don't need to wrap anything in the storage task wrapper anymore.
You can just record directly in Telemetry.
Dito the usages below.

@@ +478,5 @@
>      }.bind(this));
>    },
>  
>    _enqueueTelemetryStorageTask: function (task) {
>      if (!Services.prefs.getBoolPref("toolkit.telemetry.enabled")) {

Felipe, i see that we only submitted any of the measurements here when telemetry.enabled?
If so, we don't even need to set the probes to opt-out for release, right?
Attachment #8610565 - Flags: feedback?(gfritzsche)
(In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> Comment on attachment 8610565 [details] [diff] [review]
> bug1121006.patch
> 
> Review of attachment 8610565 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Open question: Do we actually want to strip out the FHR provider now too?
> Or do we want to keep both that and the Telemetry implementation.
> 
> Felipe, if this isn't actively used right now we should be fine with just
> switching over to Telemetry?

We aren't using this at the moment, so yeah I think it's fine to just move directly to telemetry and remove FHR.

>  
> >    _enqueueTelemetryStorageTask: function (task) {
> >      if (!Services.prefs.getBoolPref("toolkit.telemetry.enabled")) {
> 
> Felipe, i see that we only submitted any of the measurements here when
> telemetry.enabled?
> If so, we don't even need to set the probes to opt-out for release, right?

I didn't follow the logic. What you mean by opt-out for release?
Comment on attachment 8610565 [details] [diff] [review]
bug1121006.patch

Review of attachment 8610565 [details] [diff] [review]:
-----------------------------------------------------------------

Do we need to remove http://mxr.mozilla.org/mozilla-central/source/browser/components/translation/translation.manifest ?

::: browser/components/translation/Translation.jsm
@@ +272,2 @@
>   */
>  let TranslationHealthReport = {

Maybe rename this?
Attachment #8610565 - Flags: feedback?(felipc) → feedback+
(In reply to :Felipe Gomes from comment #7)
> > >    _enqueueTelemetryStorageTask: function (task) {
> > >      if (!Services.prefs.getBoolPref("toolkit.telemetry.enabled")) {
> > 
> > Felipe, i see that we only submitted any of the measurements here when
> > telemetry.enabled?
> > If so, we don't even need to set the probes to opt-out for release, right?
> 
> I didn't follow the logic. What you mean by opt-out for release?

FHR data-collection is opt-out on release, Telemetry is opt-in.
The check above makes the translation FHR provider only submit data on release if a user opted into Telemetry.
So, if that was intentional we don't need to add |"releaseChannelCollection": "opt-out"| to the histogram definitions here.
Comment on attachment 8610565 [details] [diff] [review]
bug1121006.patch

Review of attachment 8610565 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/translation/Translation.jsm
@@ +475,3 @@
>      return this._enqueueTelemetryStorageTask(function* recordTask() {
> +      this.HISTOGRAMS.SHOW_UI.add(Services.prefs.getBoolPref(TRANSLATION_PREF_SHOWUI));
> +      this.HISTOGRAMS.DETECT_LANG.add(Services.prefs.getBoolPref("browser.translation.detectLanguage"));

Flags are for only collecting a value once per browser-session, once flipped to true/1 they don't go back.
We should either only collect flag values exactly once per browser session (e.g. at startup) or use a boolean histogram.

::: toolkit/components/telemetry/Histograms.json
@@ +7990,5 @@
> +  },
> +  "SHOULD_TRANSLATION_UI_APPEAR": {
> +    "expires_in_version": "default",
> +    "kind": "flag",
> +    "description": "Tracks sutiations when the user opts for displaying translation UI"

Nit: "situations"

@@ +7995,5 @@
> +  },
> +  "SHOULD_AUTO_DETECT_LANGUAGE": {
> +    "expires_in_version": "default",
> +    "kind": "flag",
> +    "description": "Tracks sutiations when the user opts for auto-detecting the language of a page"

Dito.
Sorry it took a while for the feedback to accumulate here Iaroslav - let us know if anything is still unclear now :)
(In reply to Georg Fritzsche [:gfritzsche] from comment #11)
> Sorry it took a while for the feedback to accumulate here Iaroslav - let us
> know if anything is still unclear now :)

Thanks for the feedback, guys! I actually do have a short question. In comment #10 you wrote

> We should either only collect flag values exactly once per browser session
> (e.g. at startup) or use a boolean histogram.

As far as I understand the previous (FHR) implementation the idea was to collect this metric daily [1] recording the most recent value of the flags. Meaning that if the flags were updated during the day only the last value would matter. This usage seems closer to a boolean histogram.  
But, I am not familiar with the firefox startup process. Searching MDN, MXR did not tell me much. Could you point me to a place in code which would be appropriate for updating these metrics. Or a sample bug that shows how to handle this properly.

1. http://mxr.mozilla.org/mozilla-central/source/browser/components/translation/Translation.jsm#592
Flags: needinfo?(gfritzsche)
I'll forward to Felipe for the startup question as i'm not sure what the best approach for the Translation module is.

As a side-note, we are moving away from the "daily" collection model of FHR here, instead we prefer recording the value "live" directly into Telemetry.
I think we should be fine with only collecting the pref values once at startup as flags.
Alternatively we could record them as boolean, once at startup and once whenever they change.
Flags: needinfo?(gfritzsche) → needinfo?(felipc)
Hi guys!

Meanwhile, I updated the patch as according to your feedback:
- Corrected spelling in Histogram.json
- Replaced histogram objects with calls to find[Keyed]HistogramById
- Removed Task.spawn from the record telemetry wrapper.
- Moved updating of flag histograms into the Provider's constructor, which I believe will only be executed once per session. But I am waiting for Felipe's answer regarding this.
- removed xpcshell test for fhr provider.
Attachment #8610565 - Attachment is obsolete: true
Attachment #8614621 - Flags: review?(gfritzsche)
Attachment #8614621 - Flags: review?(felipc)
Forgot one thing: 
I also renamed browser_translation_fhr.js test into browser_translation_telemetry.js. The latter reflects better the current purpose of the file as to me.
Hi, thanks for the update!
I think with upcoming travel i'll only get to take a look here on monday.
Hi, Georg! No problem, I also have a busy weekend. Have a safe travel!
(In reply to Georg Fritzsche [:gfritzsche] from comment #9)
> (In reply to :Felipe Gomes from comment #7)
> > > >    _enqueueTelemetryStorageTask: function (task) {
> > > >      if (!Services.prefs.getBoolPref("toolkit.telemetry.enabled")) {
> > > 
> > > Felipe, i see that we only submitted any of the measurements here when
> > > telemetry.enabled?
> > > If so, we don't even need to set the probes to opt-out for release, right?
> > 
> > I didn't follow the logic. What you mean by opt-out for release?
> 
> FHR data-collection is opt-out on release, Telemetry is opt-in.
> The check above makes the translation FHR provider only submit data on
> release if a user opted into Telemetry.
> So, if that was intentional we don't need to add
> |"releaseChannelCollection": "opt-out"| to the histogram definitions here.

So just to make sure I understand: if we don't add "opt-out" to the histogram definition, it defaults to be opt-in? And that opt-in would mean "with telemetry enabled"? If so, yeah that would match directly the previous behavior


About the daily collection, I've been trying to understand the new model of collection on Telemetry, and if I get it, it's all session-based, right? There's no more the concept of daily or user like there was with FHR?  I think either way we're fine with just registering these measurements once on the constructor as these values aren't meant to be changing, although when we ran experiments we would change it on the fly from false to true and that would take effect for all new tabs, without a restart required
Flags: needinfo?(felipc)
Comment on attachment 8614621 [details] [diff] [review]
Migrating translation component from FHR to telemetry histograms

Review of attachment 8614621 [details] [diff] [review]:
-----------------------------------------------------------------

I've looked into the changes on the Translation code, and everything looks fine. But I'll defer the proper review to Georg.
Attachment #8614621 - Flags: review?(felipc) → feedback+
Comment on attachment 8614621 [details] [diff] [review]
Migrating translation component from FHR to telemetry histograms

Review of attachment 8614621 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the late review here, i'm glad we met in Vancouver and cleared this up :)

This looks good and i mostly have some nits below.
One thing i'd love to see changed to is the division between TranslationTelemetryReport and TranslationTelemetryProvider - i don't think we need that division and could just have TranslationTelemetry instead.

::: browser/components/translation/Translation.jsm
@@ +5,5 @@
>  "use strict";
>  
>  this.EXPORTED_SYMBOLS = [
>    "Translation",
> +  "TranslationTelemetryProvider",

Could maybe just be `TranslationTelemetry`?

@@ +269,5 @@
> +  /**
> +   * Record the state of translation preferences.
> +   */
> +  recordPreferences: function () {
> +    this._withProvider(provider => provider.recordPreferences());

What do we need the split between TranslationTelemetryReport & TranslationTelemetryProvider for?
Can we just have TranslationTelemetry instead?

FHR was historically a bit over-abstracted, we'd like to avoid that in the future.

@@ +366,5 @@
>     *        The function that will be passed the translation provider.
>     */
>    _withProvider: function (callback) {
>      try {
> +      if(!this._provider)

Nit: space before `(` for `for`/`if`/... in this file.

@@ +398,5 @@
> +  };
> +  for(let name of Object.keys(this.HISTOGRAMS)) {
> +    let id = this.HISTOGRAMS[name];
> +    this.HISTOGRAMS[name] = function() {
> +      return Services.telemetry.getHistogramById(id);

If we do that, why not initialize HISTOGRAMS with the functions directly?
E.g. for still keeping it less verbose:

const plain = (id) => Services.telemetry.getHistogramById(id);
const keyed = (id) => Services.telemetry.getKeyedHistogramById(id);
this.HISTOGRAMS = {
  OPPORTUNITIES : () => plain("TRANSLATION_OPPORTUNITIES"),
  OPPORTUNITIES_BY_LANG: () => keyed("TRANSLATION_OPPORTUNITIES_BY_LANGUAGE"),
  ...

@@ +435,5 @@
>      }.bind(this));
>    },
>  
> +  recordTranslation: function (langFrom, langTo, numCharacters) {
> +    return this._doIfAllowed(function recordTask() {

Couldn't we just check a condition instead of this abstraction?
if (this._canRecord) {
  ...

or:
if (!this._canRecord) {
  return;
}

::: browser/components/translation/test/browser_translation_telemetry.js
@@ +8,5 @@
> +let {Translation, TranslationTelemetryProvider} = tmp;
> +const Telemetry = Services.telemetry;
> +
> +let MetricsChecker = {
> +  HISTOGRAMS: {

Side-note, doesn't really need changing now:
Map() seems more natural for these patterns now and allows us to do |for ([key,value] of this.HISTOGRAMS) ...| etc.

@@ +48,5 @@
> +      opportunitiesCountByLang: {},
> +      pageCountByLang: {}
> +    };
> +
> +    for(let source of Translation.supportedSourceLanguages) {

Nit: space before `(`, ditto in the rest of the file for `for(` and `if(`.

@@ +49,5 @@
> +      pageCountByLang: {}
> +    };
> +
> +    for(let source of Translation.supportedSourceLanguages) {
> +      let s = this.HISTOGRAMS.OPPORTUNITIES_BY_LANG.snapshot();

We're not updating that histogram in the loop, so we can move taking the snapshot out of it.
Ditto with PAGES_BY_LANG below.

@@ +50,5 @@
> +    };
> +
> +    for(let source of Translation.supportedSourceLanguages) {
> +      let s = this.HISTOGRAMS.OPPORTUNITIES_BY_LANG.snapshot();
> +      this._metrics.opportunitiesCountByLang[source] = s[source]? s[source].sum : 0;

Nit: space before `?`, ditto in the rest of the file.

@@ +75,5 @@
> +      Assert.equal(prevMetrics[metric] + addition, metrics[metric]);
> +    }
> +  },
> +
> +  checkAdditions: Task.async(function* (additions) {

Why is this a task function?

@@ +87,5 @@
> +  Services.prefs.setBoolPref("toolkit.telemetry.enabled", true);
> +  Services.prefs.setBoolPref("browser.translation.detectLanguage", true);
> +  Services.prefs.setBoolPref("browser.translation.ui.show", true);
> +
> +  // Making sure that telemetry service is running with mochitest.

Nit: "Making sure that extended Telemetry is enabled for this test."

@@ +93,5 @@
> +  let oldCanRecord = Telemetry.canRecordExtended;
> +  Telemetry.canRecordExtended = true;
> +
> +  registerCleanupFunction(() => {
> +    MetricsChecker.reset();

I think this not needed here.
Any possible later tests could not rely on the values anyway (other tests may have changed it) and have to reset it themselves.

@@ +97,5 @@
> +    MetricsChecker.reset();
> +
> +    Services.prefs.clearUserPref("toolkit.telemetry.enabled");
> +    Services.prefs.clearUserPref("browser.translation.detectLanguage");
> +    Services.prefs.clearUserPref("browser.translation.ui.show");

This is problematic - depending on the test harness, prefs for the whole harness may be set as user-prefs.
When using clearUserPref() on them we don't reset them to the previous value.

I recommend resetting them similarly to canRecordExtended.

@@ +108,5 @@
> +  // Reset histogram metrics.
> +  MetricsChecker.reset();
> +});
> +
> +add_task(function* test_fhr() {

Nit: Rename the function.

@@ +234,5 @@
> +  }
> +
> +});
> +
> +function getInfobarElement(browser, anonid) {

Nit: I'd expect helper to be on top before the main test functions.

@@ +241,5 @@
> +  return notif._getAnonElt(anonid);
> +}
> +
> +function offerTranslatationFor(text, from) {
> +  return Task.spawn(function* task_offer_translation() {

Nit: You can get rid of indentation here via:
let offerTranslationFor = Task.async(function*(text, from) { ...

... ditto other functions below.

@@ +248,5 @@
> +      gBrowser.addTab("data:text/html;charset=utf-8," + text);
> +
> +    // Wait until that's loaded.
> +    let browser = tab.linkedBrowser;
> +    yield promiseBrowserLoaded(browser);

This can be shrunk to:
const dataUrl = ...
yield BrowserTestUtils.openNewForegroundTab(gBrowser, dataUrl);

@@ +288,5 @@
> +    });
> +  });
> +}
> +
> +function promiseBrowserLoaded(browser) {

We already have BrowserTestUtils.browserLoaded()

::: toolkit/components/telemetry/Histograms.json
@@ +8122,5 @@
> +  },
> +  "CHANGES_OF_DETECTED_LANGUAGE": {
> +    "expires_in_version": "default",
> +    "kind": "boolean",
> +    "description": "A number of times when the auto-detected language was changed by the user before of after translation"

Explain here what the boolean value means? Two "count" histograms would be more intuitive, but i don't think it matters much.
Nit: "before of after"
Attachment #8614621 - Flags: review?(gfritzsche) → feedback+
Attachment #8634628 - Flags: review?(gfritzsche)
Hi Georg!

Thanks for the detailed review of my previous patch! I updated the code accordingly. If the patch look ok, I can push it to the tryserver. For now, all translation tests pass locally.
Comment on attachment 8634628 [details] [diff] [review]
Migrating translation component from FHR to telemetry histograms

Review of attachment 8634628 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, r=me with the minor issues below addressed.

Thanks for staying on it even with the delays, this was bad timing here :) (other urgent work, whistler, sick time...)

::: browser/components/translation/Translation.jsm
@@ +261,5 @@
>  };
>  
>  /**
> + * This class uses telemetry histograms for collecting statistics on
> + * the usage of the translation component.

Nit: It's not a class.

@@ +284,5 @@
> +      TARGET_CHANGES        : () => plain("CHANGES_OF_TARGET_LANGUAGE"),
> +      DETECTION_CHANGES     : () => plain("CHANGES_OF_DETECTED_LANGUAGE"),
> +      SHOW_UI               : () => plain("SHOULD_TRANSLATION_UI_APPEAR"),
> +      DETECT_LANG           : () => plain("SHOULD_AUTO_DETECT_LANGUAGE"),
> +    }

Nit: Missing semicolon here.

@@ +401,4 @@
>    },
>  
> +  /**
> +   * A shortcur for reading the telemetry preference.

Nit: "shortcut"

::: browser/components/translation/test/browser_translation_telemetry.js
@@ +136,5 @@
> +  menulist.doCommand();
> +}
> +
> +add_task(function* setup() {
> +

Nit: Redundant empty line.
Attachment #8634628 - Flags: review?(gfritzsche) → review+
Attachment #8614621 - Attachment is obsolete: true
Hi Georg!

(In reply to Georg Fritzsche [:gfritzsche] from comment #24)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=eabc7c37506a

Thanks for pushing the patch to the tryserver. The results seem to be looking good, right? I don't see any failures. Should I still fix the nits from your last review, or is that already done?
Flags: needinfo?(gfritzsche)
Hm, i can't see any actual test results there, only build results.
Maybe try got reset in the mean-time and we have to repush?

The nits still need to be fixed, i only pushed the patch you put up for review.
Flags: needinfo?(gfritzsche)
Hi Georg!

I updated the patch. Sorry for taking this long. The tests pass on my laptop but FAIL on tryserver https://treeherder.mozilla.org/#/jobs?repo=try&revision=0b39ccf35971. And I dont know why. Do you have an idea?
Attachment #8634628 - Attachment is obsolete: true
Flags: needinfo?(gfritzsche)
Attachment #8645382 - Flags: review+
Note that the test-failures are debug-only (i.e. you probably need |--enable-debug| in your .mozconfig to reproduce).

Checking in a log [0], i see this:
 10:41:04 INFO - [2903] ###!!! ABORT: file /builds/slave/try-l64-d-00000000000000000000/build/src/ipc/chromium/src/base/histogram.cc, line 1020
10:41:04 INFO - #01: base::FlagHistogram::Accumulate(int, int, unsigned long) [ipc/chromium/src/base/histogram.cc:1020] 

Checking that histogram.cc code, this is a debug only check for accumulation on a flag histogram [1].
So, apparently you are calling .add(x) with x!=1 somewhere in your code, which should not happen.

0: https://treeherder.mozilla.org/logviewer.html#?job_id=10254931&repo=try
1: https://dxr.mozilla.org/mozilla-central/rev/d6ea652c579992daa9041cc9718bb7c6abefbc91/ipc/chromium/src/base/histogram.cc#1020
Flags: needinfo?(gfritzsche)
(In reply to Georg Fritzsche [:gfritzsche] from comment #28)
> Note that the test-failures are debug-only (i.e. you probably need
> |--enable-debug| in your .mozconfig to reproduce).

ac_add_options --enable-debug
Hi Guys! Sorry for taking this long for updating the patch. Thanks for your patience Georg! I fixed the problem with failing tests. Seems like telemetry does not like when somebody adds 0 to the flag histogram. The tests pass locally now. I pushed the changeset to the try-server: 
https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa3219588941
Hope it works this time!
Attachment #8645382 - Attachment is obsolete: true
Attachment #8651377 - Flags: review+
Yes, that looks good - lets get this landed :)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/1cbc5e18434f
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 43
Depends on: 1205968
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: