Closed Bug 973292 Opened 6 years ago Closed 5 years ago

Firefox counts the number of characters its translates


(Firefox Health Report Graveyard :: Client: Desktop, defect)

Not set


(firefox32 verified, firefox33 verified)

Firefox 33
Tracking Status
firefox32 --- verified
firefox33 --- verified


(Reporter: MarcoM, Assigned: ttaubert)



(Whiteboard: [translation] p=5 s=33.1 [qa!])


(1 file, 2 obsolete files)

1.  Goal
* As a product owner I want to know how many characters Firefox translates so that I can assess the cost of the service

2.  Acceptance Criteria
* I can view a report that tells me how many characters have been translated by our users
* This report will be updated weekly
Depends on: 973437
Depends on: 974477
Does translation really cost in characters? Would it be sufficient to ask the service provider to provide us with these metrics? Or are we trying to proof their numbers against our own internal metrics?
Flags: needinfo?(cweiner)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #1)
> Does translation really cost in characters? 
Yup, for many of the leading providers :(

Would it be sufficient to ask
> the service provider to provide us with these metrics? 
Or are we trying to
> proof their numbers against our own internal metrics?

It's always safer to know ourselves, especially with the potential size of the cost involved here.
Flags: needinfo?(cweiner)
Component: Firefox Operations → Client: Desktop
Product: Tracking → Firefox Health Report
Version: --- → Trunk
No longer blocks: fxdesktopbacklog
Flags: firefox-backlog+
This isn't prioritized yet but bug 978158 introduces a "recordTranslation()" method which can be used to record this data. Fixing this patch will require actually getting a character count of what's being translated and then passing it to this method along with the languages. Should be quite simple to fix.
Hardware: x86_64 → All
Marco, can you please add this to the current iteration? p=5 because it needs a test.
Assignee: nobody → ttaubert
Flags: needinfo?(mmucci)
Whiteboard: [story] [translation] → [story] [translation] p=5
Blocks: 1019731
Added to Iteration 32.3
Flags: needinfo?(mmucci)
Whiteboard: [story] [translation] p=5 → [story] [translation] p=5 s=it-32c-31a-30b.3 [qa?]
Whiteboard: [story] [translation] p=5 s=it-32c-31a-30b.3 [qa?] → [translation] p=5 s=it-32c-31a-30b.3 [qa?]
Comment on attachment 8434190 [details] [diff] [review]

Review of attachment 8434190 [details] [diff] [review]:

::: browser/components/translation/TranslationContentHandler.jsm
@@ +79,5 @@
> +        // Get the total text length of all roots.
> +        let totalChars = translationDocument.roots.reduce((acc, root) => {
> +          return acc + translationDocument.generateTextForItem(root).length;
> +        }, 0);

I wasn't sure about this part. On the one hand we could let BingTranslation count characters and return them but I don't really like shifting that responsibility to the translation provider. OTOH generateTextForItem() might have a non-negligible cost? It didn't look too costly to me.
Yeah, I was pondering about that too. I think in the end it's better to let the provider count the chars by itself, because that would be more accurate for two reasons:

- the provider splits the translation in various requests, and if one of that request fails, we should not count those chars as translated
- the provider has its own max data limit which is independent from the TranslationDocument. So it might translate less data than what's all stored in the TranslationDocument

For those reasons, I believe the provider knows the info for the most accurate counting.

_generateNextTranslationRequest already has the value calculated, so it can store it in the object it returns, to be stored in the BingRequest.  Then on _chunkCompleted we can accumulate the count if .requestSucceeded, and pass the value back through the _onFinishDeferred promise.
Attachment #8434190 - Flags: review?(felipc)
Chad, we're not sure about the use case here. So should we:

1) Count the characters that the user *wants* to translate? And also count them if some or all of the sub-requests for a single page translation fail? After all this would allow us to estimate the costs for a future translation provider even if we should start with a bad one.

2) Count only the number of characters that were successfully translated? That would allow us keep record of our translation volume and check with what the provider wants to charge us.

What exactly is the use case here?
Flags: needinfo?(cweiner)
Hey Tim,  the use case is #2.  

Flags: needinfo?(cweiner)
Whiteboard: [translation] p=5 s=it-32c-31a-30b.3 [qa?] → [translation] p=5 s=33.1 [qa?]
Whiteboard: [translation] p=5 s=33.1 [qa?] → [translation] p=5 s=33.1 [qa+]
I ended up implementing it a little differently because it felt unnecessary to pass counts from _generateNextTranslationRequest() to _chunkCompleted() if we can just count in the request itself and use tack it onto that as a property.

The test works locally with the bing key file you gave me.
Attachment #8434190 - Attachment is obsolete: true
Attachment #8438550 - Flags: review?(felipc)
Attachment #8438550 - Attachment is obsolete: true
Attachment #8438550 - Flags: review?(felipc)
Attachment #8438554 - Flags: review?(felipc)
Comment on attachment 8438554 [details] [diff] [review]
0001-Bug-973292-Record-the-number-of-characters-that-are-.patch, v3

Review of attachment 8438554 [details] [diff] [review]:

Great, thanks! To land the test we need to wait for bug 1022725, because once we have a key built-in, we don't want our infra to be hitting Bing's server and using the key.
So I'll add a note on bug 1022725 to land the test part of this patch there once the mock server lands.

::: browser/components/translation/BingTranslator.jsm
@@ +122,2 @@
>        } else {
> +        this._onFinishedDeferred.reject();

promise.reject() requires at least one param to exist

::: browser/components/translation/TranslationContentHandler.jsm
@@ +141,3 @@
>              translationDocument.showTranslation();
>            },
> +          () => {

the named param "error" is unused but I like the way it makes it clear that this is the function handling the promise rejection

maybe the characterCount param for the success case should be called result, and be an object where you get result.characterCount
Attachment #8438554 - Flags: review?(felipc) → review+
Addressed all feedback and landed the test with skip-if=true.
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 33
Comment on attachment 8438554 [details] [diff] [review]
0001-Bug-973292-Record-the-number-of-characters-that-are-.patch, v3

[Approval Request Comment]
Bug caused by (feature/regressing bug #): This bug is part of the automatic translation feature, which we want to A/B test with a subset of Aurora 32 users.
User impact if declined: The feature won't count the number of chars to translate sent to the provider
Testing completed (on m-c, etc.): landed on m-c
Risk to taking this patch (and alternatives if risky): minimal
String or IDL/UUID changes made by this patch: none
Attachment #8438554 - Flags: approval-mozilla-aurora?
QA Contact: bogdan.maris
I tested on Windows 7 64bit, Mac OS X 10.9.3 and Ubuntu 13.04 64bit using latest Nightly and verified that "charactersTranslatedCount" works as expected. I did some exploratory around "pageTranslatedCount", "translationOpportunityCount", "pageTranslatedCountsByLanguage" and "translationOpportunityCountsByLanguage", all seem to work just fine.
Whiteboard: [translation] p=5 s=33.1 [qa+] → [translation] p=5 s=33.1 [qa!]
Attachment #8438554 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Marking as Verified on Firefox 33, based on comment 17.
Also verified on Windows 7 64bit, Mac OS X 10.9.3 and Ubuntu 13.04 64bit using latest Aurora.
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.