Closed Bug 1410387 Opened 7 years ago Closed 3 years ago

[API] Expose aggregate statistics about Suggestions

Categories

(Webtools Graveyard :: Pontoon, enhancement, P3)

Tracking

(Not tracked)

RESOLVED MOVED

People

(Reporter: stas, Unassigned)

(A follow-up to bug 1403861.)

Once bug 1377969 is fixed, we'll be able to add the following fields to the API:

    totalStringsWithPendingSuggestions
    pendingSuggestions

Names are still TBD.
Priority: -- → P2
In bug 1377969 we added pendingSuggestions (called unreviewedStrings).

Is there a need for totalStringsWithPendingSuggestions?
Flags: needinfo?(francesco.lodolo)
(In reply to Matjaz Horvat [:mathjazz] from comment #1)
> In bug 1377969 we added pendingSuggestions (called unreviewedStrings).
> 
> Is there a need for totalStringsWithPendingSuggestions?

Good point. The question is what we want to measure.

a) Total number of unreviewed suggestions, for both approved and missing translations. That's covered by unreviewedStrings (I assume), and it signals that a locale doesn't have enough reviewers.

b) "Number of missing strings that have unreviewed suggestions" and "Number of translated strings with unreviewed suggestions" (their sum is what I imagine "unreviewedStrings" being right now). Both signal a lack of reviewers, but the former means we're shipping something with fewer translations than we could, and it's an interesting data point of its own.
Flags: needinfo?(francesco.lodolo)
Your assumption in a) is correct.

Your proposal in b) might be useful, too. I just want to make it clear that the sum of the two fields you propose is not always unreviewedStrings, it's actually totalStringsWithPendingSuggestions. In the former we count suggestions; in the latter we count source strings.

To clarify, what stas proposes (please correct me if I'm wrong) with totalStringsWithPendingSuggestions can be demonstrated with an example project consisting of 3 strings:
- String A: 0 unreviewed suggestions
- String B: 1 unreviewed suggestion
- String C: 2 unreviewed suggestions

=> unreviewedStrings: 3
=> totalStringsWithPendingSuggestions: 2
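
The difference between the two counts can be sketched in a few lines of Python. This is only an illustration of the example above, not Pontoon code; the dictionary and variable names are made up for the sketch.

```python
# Hypothetical data: source string -> number of unreviewed suggestions.
unreviewed_per_string = {"A": 0, "B": 1, "C": 2}

# Counts suggestions (what unreviewedStrings is described as reporting).
unreviewed_strings = sum(unreviewed_per_string.values())

# Counts source strings that have at least one unreviewed suggestion.
total_strings_with_pending_suggestions = sum(
    1 for count in unreviewed_per_string.values() if count > 0
)

print(unreviewed_strings, total_strings_with_pending_suggestions)  # 3 2
```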

My question was: do we need totalStringsWithPendingSuggestions? And now the question could be extended to:

Do we need totalStringsWithPendingSuggestions? If yes, do we need to split it into totalMissingStringsWithPendingSuggestions and totalTranslatedStringsWithPendingSuggestions?
(In reply to Matjaz Horvat [:mathjazz] from comment #3)
> Your assumption in a) is correct.
> 
> Your proposal in b) might be useful, too. I just want to make it clear that
> the sum of the two fields you propose is not always unreviewedStrings, it's
> actually totalStringsWithPendingSuggestions. In the former we count
> suggestions; in the latter we count source strings.

You're correct, Friday afternoon brain :-\

> To clarify, what stas proposes (please correct me if I'm wrong) with
> totalStringsWithPendingSuggestions can be demonstrated with an example
> project consisting of 3 strings:
> - String A: 0 unreviewed suggestions
> - String B: 1 unreviewed suggestion
> - String C: 2 unreviewed suggestions
> 
> => unreviewedStrings: 3
> => totalStringsWithPendingSuggestions: 2
> 
> My question was: do we need totalStringsWithPendingSuggestions? And now the
> question could be extended to:
> 
> Do we need totalStringsWithPendingSuggestions? If yes, do we need to split
> it into totalMissingStringsWithPendingSuggestions and
> totalTranslatedStringsWithPendingSuggestions?

Or do we just calculate totalMissingStringsWithPendingSuggestions and totalTranslatedStringsWithPendingSuggestions, given that totalStringsWithPendingSuggestions can be derived from these two if needed?

totalTranslatedStringsWithPendingSuggestions could also be used as a measure of lack of QA, assuming those new suggestions come from people using the product, and suggesting improvements.
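
A rough sketch of that derivation, assuming each string carries a status and a count of unreviewed suggestions. The row layout, the status labels ("approved" standing in for "translated") and the helper name are all hypothetical, not taken from Pontoon.

```python
from collections import Counter

# Hypothetical rows: (string id, status, unreviewed suggestion count).
strings = [
    ("A", "approved", 0),
    ("B", "missing", 1),
    ("C", "approved", 2),
    ("D", "missing", 0),
]


def strings_with_pending_by_status(rows):
    """Count strings that have at least one unreviewed suggestion, per status."""
    counts = Counter()
    for _string_id, status, pending in rows:
        if pending > 0:
            counts[status] += 1
    return counts


counts = strings_with_pending_by_status(strings)
missing_with_pending = counts["missing"]
translated_with_pending = counts["approved"]

# The combined figure does not need its own field: it is the sum of the two.
total_with_pending = missing_with_pending + translated_with_pending
```

The sum is safe here only because each string has exactly one status, so no string is counted twice.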
Note that you need neither "totalMissing…" nor "totalTranslated…". "Total" is used for the sum of (approved + fuzzy + missing). You probably want approvedStringsWithUnreviewedSuggestions, fuzzyStringsWithUnreviewedSuggestions and missingStringsWithUnreviewedSuggestions, which might be too specific and too granular. I wonder if a better approach would be to parametrize the totalStrings, fuzzyStrings and missingStrings fields in the API:

    totalStrings(needsReview: true)

Or something more generic:

    totalStrings(where: {hasUnreviewedSuggestions: true})

That said, it might be expensive to aggregate this data dynamically when an API request is made. If that's the case, separate pre-computed fields might be the best option.
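
To make the parametrized idea concrete, here is a minimal sketch of counting dynamically with an optional filter. It is plain Python rather than the actual API layer, and `StringStats`, `count_strings` and their parameters are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StringStats:
    status: str  # "approved", "fuzzy" or "missing"
    unreviewed_suggestions: int


def count_strings(strings, status=None, needs_review=None):
    """Count strings, optionally filtered by status and by whether
    they have unreviewed suggestions (None means "don't filter")."""
    total = 0
    for s in strings:
        if status is not None and s.status != status:
            continue
        if needs_review is not None and (s.unreviewed_suggestions > 0) != needs_review:
            continue
        total += 1
    return total


sample = [
    StringStats("approved", 0),
    StringStats("missing", 1),
    StringStats("approved", 2),
    StringStats("fuzzy", 0),
]

all_strings = count_strings(sample)                          # totalStrings
needing_review = count_strings(sample, needs_review=True)    # totalStrings(needsReview: true)
missing_needing_review = count_strings(sample, status="missing", needs_review=True)
```

Every call walks the whole list, which is the cost concern raised above: on a real database this would be an aggregate query per request unless the values are stored ahead of time.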
If you plan to add some parametrization, I would suggest being consistent and serving all string groups via such queries. Then you can also drop the totalStrings keyword and replace it simply with strings/countStrings/stringsCount.

That means:
- totalStrings -> stringsCount
- approvedStrings -> stringsCount(where: {isApproved: true})
- fuzzyStrings -> stringsCount(where: {isFuzzy: true})
- missingStrings -> stringsCount(where: {isApproved: false, isFuzzy: false})
- unreviewedStrings -> stringsCount(where: {hasUnreviewedSuggestion: true}) or stringsCount(where: {isUnreviewed: true}) or stringsCount(where: {isReviewed: false})

As for on-the-fly computation, I don't think it's necessary to serve the most used queries as specific fields if you can identify a specific query and serve its pre-computed value.
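
One way to read that suggestion is a lookup keyed on the normalized filter, with dynamic counting as the fallback. The sketch below assumes strings are represented as dictionaries of boolean flags; `where_key`, `StringsCounter` and the `dynamic_calls` counter are hypothetical and exist only to illustrate the idea.

```python
def where_key(where):
    """Normalize a filter dict into a hashable, order-independent key."""
    return frozenset((where or {}).items())


class StringsCounter:
    def __init__(self, strings, precomputed=None):
        self.strings = strings                       # list of flag dicts
        self.precomputed = dict(precomputed or {})   # where_key -> stored count
        self.dynamic_calls = 0                       # how often we had to scan

    def strings_count(self, where=None):
        key = where_key(where)
        if key in self.precomputed:
            # Known query: serve the stored value without scanning.
            return self.precomputed[key]
        # Unknown query: fall back to counting on the fly.
        self.dynamic_calls += 1
        where = where or {}
        return sum(
            1 for s in self.strings
            if all(s.get(flag) == value for flag, value in where.items())
        )


strings = [
    {"isApproved": True, "isFuzzy": False, "hasUnreviewedSuggestion": False},
    {"isApproved": False, "isFuzzy": False, "hasUnreviewedSuggestion": True},
    {"isApproved": True, "isFuzzy": False, "hasUnreviewedSuggestion": True},
    {"isApproved": False, "isFuzzy": True, "hasUnreviewedSuggestion": False},
]

counter = StringsCounter(strings, {where_key({"isApproved": True}): 2})
approved = counter.strings_count({"isApproved": True})                    # pre-computed
missing = counter.strings_count({"isApproved": False, "isFuzzy": False})  # scanned
```

The stored values would still have to be refreshed whenever translations change, which this sketch leaves out.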
Priority: P2 → P3
*This bug has been moved to GitHub.*

*Please check it out on https://github.com/mozilla/pontoon/issues.*
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → MOVED
Product: Webtools → Webtools Graveyard