[research] figure out list of top locales where feedback is in English



3 years ago
2 years ago


Using the Input API and a language guesser, we should go through 1,000 or so feedback responses and generate a list of the locales represented in that feedback, along with the percentage of responses for each locale where the feedback is actually in English.

I've used guess-language before:


We probably want to require a minimum number of words in a feedback response before we include it in the stats, so as to reduce the number of "bad guesses" and spam.
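A rough sketch of the counting step described above, not a finished script. It assumes the responses have already been pulled from the Input API as (locale, text) pairs, and it takes the language guesser as a plain callable that maps text to a language code, so any guesser could be plugged in. The function name, the MIN_WORDS value of 5, and the "en" return code are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical threshold; the right value would need some experimenting.
MIN_WORDS = 5


def english_share_by_locale(responses, guess_language, min_words=MIN_WORDS):
    """Return {locale: percent of counted responses guessed as English}.

    responses      -- iterable of (locale, text) pairs (assumed shape)
    guess_language -- callable taking text and returning a language code;
                      assumed to return "en" for English
    min_words      -- responses with fewer words are skipped entirely
    """
    totals = defaultdict(int)
    english = defaultdict(int)

    for locale, text in responses:
        # Skip very short responses to cut down on bad guesses and spam.
        if len(text.split()) < min_words:
            continue
        totals[locale] += 1
        if guess_language(text) == "en":
            english[locale] += 1

    # Only locales with at least one counted response appear in the result.
    return {
        locale: 100.0 * english[locale] / totals[locale]
        for locale in totals
    }
```

Sorting the resulting dict by percentage (and probably by the number of counted responses per locale) would give the "top locales" list. Locales with only a handful of counted responses should be read with caution.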

This bug covers doing that work and producing the list.

The reason we want to do this is that we have a number of locales where the strings aren't translated and I want to see how that affects the incoming feedback.
Tentatively tossing this in 2015q2, since I think generating the numbers is relatively straightforward. It would probably also guide us in possibly eliminating some locales from the site altogether, which would be good.
I never did this research work. This work would satisfy some curiosities and possibly unearth some issues, but I don't think any of that has much impact on the grand scheme of things.

Given that, I'm just going to WONTFIX it.
