Open Bug 1861698 Opened 11 months ago Updated 8 months ago

Add Telemetry for Language Identification in Translations

Categories

(Firefox :: Translations, enhancement)

People

(Reporter: nordzilla, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

In Bug 1859081 we are going to change the way we do language detection in an effort to reduce false positives.

Following this change, we should instrument the Full-Page Translations functionality with telemetry to measure several aspects of how we offer translations.

We want to measure the following qualities:

  1. Whether visited pages have a specified language tag vs. no language tag.
  2. Among pages that have a specified language tag which is a supported language, how often our language detection process agrees with the language tag vs. disagrees.
  3. How often our language detection is confident about its predictions vs. not confident.
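
As a rough illustration of the three measurements above, here is a minimal Python sketch of how each page visit could be bucketed into labeled counts. This is not Firefox code and not a proposed Glean schema: the names `PageObservation`, `classify`, and `tally`, the `SUPPORTED_LANGUAGES` set, and the label strings are all hypothetical, and the sketch assumes the detector already reports a boolean confidence flag.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Hypothetical placeholder; the bug does not list the supported languages.
SUPPORTED_LANGUAGES = {"en", "es", "fr", "de"}


@dataclass
class PageObservation:
    declared_tag: Optional[str]  # language tag from the page, or None if absent
    detected_language: str       # language reported by the detector
    confident: bool              # assumed: the detector exposes a confidence flag


def base_language(tag: str) -> str:
    """Reduce a tag such as 'en-US' to its primary subtag 'en'."""
    return tag.strip().split("-")[0].lower()


def classify(obs: PageObservation) -> Dict[str, Optional[str]]:
    """Assign one label per measurement; None means 'not applicable'."""
    has_tag = bool(obs.declared_tag and obs.declared_tag.strip())
    labels: Dict[str, Optional[str]] = {
        "tag_presence": "specified" if has_tag else "missing",
        "tag_agreement": None,
        "confidence": "confident" if obs.confident else "not_confident",
    }
    # Agreement is only measured for pages whose tag is a supported language.
    if has_tag:
        declared = base_language(obs.declared_tag)
        if declared in SUPPORTED_LANGUAGES:
            detected = base_language(obs.detected_language)
            labels["tag_agreement"] = "agree" if declared == detected else "disagree"
    return labels


def tally(observations: Iterable[PageObservation]) -> Counter:
    """Accumulate 'metric.label' counts, skipping not-applicable labels."""
    counts: Counter = Counter()
    for obs in observations:
        for metric, label in classify(obs).items():
            if label is not None:
                counts[f"{metric}.{label}"] += 1
    return counts
```

Keeping agreement as "not applicable" for untagged or unsupported-language pages means the agree/disagree ratio is computed only over the population that item 2 describes.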

We should ensure that we have this data instrumented for our current CLD2 implementation before we switch to fastText, CLD3, or potentially a different language identification mechanism, so that we can make informed inferences about any improvements or regressions.

We might not even need Telemetry; we could use CommonCrawl. There may already be some analysis of this done by somebody else.
