Closed Bug 1219431 Opened 9 years ago Closed 5 years ago

Add word count to Stats

Categories

(Webtools Graveyard :: Pontoon, defect, P3)

Tracking

(firefox44 affected)

RESOLVED FIXED

People

(Reporter: mathjazz, Assigned: karskaja)

In addition to string count, we should also add word count to stats in the dashboard and in the translate view. Word count is usually more precise and is also used in calculations by agencies.
Matjaz, so I believe the changes have to be made in the /profile and /contributors views? Are there any other pages in the dashboard where the string count has to be changed to a word count?
We could also add word count to /profile and /contributors, but that's not necessary.

We need word count in the translate view (in the progress chart):
http://localhost:8000/

And on dashboards (in the tooltip):
http://localhost:8000/projects/
http://localhost:8000/teams/
http://localhost:8000/sl/
http://localhost:8000/projects/pontoon-intro/
http://localhost:8000/sl/pontoon-intro/

So you'll first have to store word_count in the Resource model, just like we store entity_count. Then figure out a good place in the UI to display total, translated, suggested, fuzzy, and missing in both strings and words.
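
For illustration, here is a minimal sketch of stats kept in both strings and words. It assumes a word is any whitespace-delimited token, which this bug does not actually specify. `count_words`, `Stats`, and the sample strings are hypothetical names, not Pontoon code.

```python
from dataclasses import dataclass


def count_words(string: str) -> int:
    """Count whitespace-delimited tokens (an assumed definition of a word)."""
    return len(string.split())


@dataclass
class Stats:
    """One stats category, tracked in both strings and words."""
    strings: int = 0
    words: int = 0

    def add(self, source: str) -> None:
        self.strings += 1
        self.words += count_words(source)


# Hypothetical source strings paired with their translation status.
entities = [
    ("Save changes", "translated"),
    ("Are you sure you want to quit?", "missing"),
    ("Cancel", "suggested"),
]

categories = ("total", "translated", "suggested", "fuzzy", "missing")
stats = {name: Stats() for name in categories}
for source, status in entities:
    stats["total"].add(source)
    stats[status].add(source)
```

Word totals weight the long "missing" string more heavily than the string count does, which is the extra precision the bug description asks for.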
Priority: P2 → P3
Is this bug still valid to work on?
Flags: needinfo?(m)
The bug is still valid. We're planning to make some changes to stats in the near future (bug 1377969), which don't directly block this bug, but it's probably not the best time to take it right now. If you're looking for good first bugs, I suggest starting here: https://wiki.mozilla.org/Webdev/GetInvolved/pontoon.mozilla.org
Flags: needinfo?(m)

Would the maintainers be open to a contribution implementing this feature -- or, at least, laying the groundwork for it by adding total_words, translated_words, etc. to the db models?

(In reply to Anand from comment #7)

> Would the maintainers be open to a contribution implementing this feature -- or, at least, laying the groundwork for it by adding total_words, translated_words, etc. to the db models?

Yes and yes. :)

I started poking around at the word count thing today, and I've got a strategy question: it looks to me like it would be worthwhile to store a word_count field on the Entity model (even though it's derivable from the string data) because this would allow for updating an analogous, aggregated field on Resource without fetching all the strings from the db.
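
A rough sketch of why a stored per-entity count helps, using the standard-library sqlite3 module rather than Pontoon's Django models. The table layout and column names (`resource.total_words`, `entity.word_count`) are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE resource (id INTEGER PRIMARY KEY, total_words INTEGER DEFAULT 0);
    CREATE TABLE entity (
        id INTEGER PRIMARY KEY,
        resource_id INTEGER REFERENCES resource(id),
        string TEXT,
        word_count INTEGER
    );
    """
)
conn.execute("INSERT INTO resource (id) VALUES (1)")

rows = [(1, "Save changes"), (1, "Are you sure you want to quit?"), (1, "Cancel")]
conn.executemany(
    "INSERT INTO entity (resource_id, string, word_count) VALUES (?, ?, ?)",
    # word_count is derived once, when the entity is written.
    [(resource_id, s, len(s.split())) for resource_id, s in rows],
)

# Aggregate inside the database: only integers are read, never the strings.
conn.execute(
    """
    UPDATE resource
    SET total_words = (
        SELECT COALESCE(SUM(word_count), 0)
        FROM entity
        WHERE entity.resource_id = resource.id
    )
    """
)
total_words = conn.execute(
    "SELECT total_words FROM resource WHERE id = 1"
).fetchone()[0]
```

Without the stored column, the same total would require loading every string for the resource and counting words in application code.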

One way to do this would be to overload the Entity class's constructor so that word_count is always set. But the Django docs seem to suggest that overloading model constructors isn't idiomatic.

Another way would be to add a static factory method for creating Entity objects, but I worry then that it would be easy for future contributors to introduce a bug by constructing Entity directly.

Matjaz: what do you think?

I like the idea of adding the Entity.word_count field for the reasons you mentioned.

You can override the Entity.save() method, which will be called when we create newly added entities during sync:
https://github.com/mozilla/pontoon/blob/master/pontoon/sync/changeset.py#L218

However, the method will not be called when we bulk_create or bulk_update Entities:
https://github.com/mozilla/pontoon/blob/master/pontoon/administration/views.py#L387
https://github.com/mozilla/pontoon/blob/master/pontoon/sync/changeset.py#L382

I'm afraid we'll have to manually maintain word_count in such cases.

I also don't think we have other cases in the codebase currently that change the Entity table beyond the ones I'm linking to.
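
To make the difference concrete, here is a plain-Python stand-in (no Django involved). `Entity`, `TABLE`, and the module-level `bulk_create` are hypothetical and only mimic the behaviour described above: a per-object save() hook runs for single writes but is skipped by a bulk write.

```python
def count_words(string):
    """Assumed definition: whitespace-delimited tokens."""
    return len(string.split())


TABLE = []  # in-memory stand-in for the Entity table


class Entity:
    def __init__(self, string, word_count=None):
        self.string = string
        self.word_count = word_count

    def save(self):
        # Overridden save(): word_count is always derived before writing.
        self.word_count = count_words(self.string)
        TABLE.append(self)


def bulk_create(entities):
    # Mimics the bulk path: objects are written without calling save().
    TABLE.extend(entities)


single = Entity("Save changes")
single.save()  # word_count is filled in by save()

batch = [Entity("Cancel"), Entity("Open the file")]
for entity in batch:
    # Manual maintenance: set word_count before the bulk write.
    entity.word_count = count_words(entity.string)
bulk_create(batch)

forgotten = Entity("No word count here")
bulk_create([forgotten])  # word_count silently stays None
```

The last call shows the failure mode: nothing errors, the row is simply stored without a count.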

Assignee: nobody → github
Status: NEW → ASSIGNED

Hi everyone!
:Anand
I like the idea of a separate field for the number of words in the Entity model. Unfortunately

Sorry everyone, I posted a draft comment by mistake. I wrote about this on IRC: The idea of an additional field is good, but you can't use the constructor of the Entity model because it will not work in places where bulk_* functions change the state of entities.
I'll follow this bug and I'll ping you if something better comes to mind.

:jotes
What if we also update bulk_create and bulk_update in EntityQuerySet?
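
Roughly what I have in mind, written as a plain-Python stand-in so it runs without Django. The `QuerySet` base class here is a hypothetical stub, not Django's, and the method signatures are assumptions for illustration.

```python
class QuerySet:
    """Stub base class: stores objects in a list instead of a database."""

    def __init__(self):
        self.rows = []

    def bulk_create(self, objs):
        self.rows.extend(objs)
        return objs

    def bulk_update(self, objs, fields):
        # Rows are held by reference, so there is nothing to write back.
        return len(objs)


class Entity:
    def __init__(self, string):
        self.string = string
        self.word_count = None


class EntityQuerySet(QuerySet):
    @staticmethod
    def _set_word_counts(objs):
        for obj in objs:
            obj.word_count = len(obj.string.split())

    def bulk_create(self, objs):
        # Fill the derived field, then delegate to the base implementation.
        self._set_word_counts(objs)
        return super().bulk_create(objs)

    def bulk_update(self, objs, fields):
        self._set_word_counts(objs)
        # Write word_count whenever the string itself is being updated.
        if "string" in fields and "word_count" not in fields:
            fields = [*fields, "word_count"]
        return super().bulk_update(objs, fields)


entities = EntityQuerySet()
created = entities.bulk_create([Entity("Save changes"), Entity("Cancel")])
created[1].string = "Cancel the upload"
entities.bulk_update(created, ["string"])
```

Callers keep using bulk_create and bulk_update as before, so nobody has to remember to set word_count by hand.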

:Anand
I like this concept. But there's a small implementation detail :-) Pontoon doesn't use bulk_create to make new entities during the sync process.
Look at: https://github.com/mozilla/pontoon/blob/f9c1aa4b0d319c0247dbf721ba99c485a2df3591/pontoon/sync/changeset.py#L218
The sync process calls the get_or_create method. It's because the id of an entity is required to add translations of the entity into the database. I think you can override it.

bulk_update is not available in the version of Django that's on the master branch of Pontoon. As a temporary solution, you can add a method with the same name to the Entity query set class and just call the bulk_update function from the django-bulk-update library. It will be removed when Django 2.0 lands in Pontoon.

However, I would suggest splitting the overall change into two smaller pull requests. The first would introduce the new methods and do the refactoring; the second would contain all changes related to the number of words in an Entity.
This approach may help with the review process (the smaller the PR, the better) :)

:mathjazz what do you think about these ideas?

Flags: needinfo?(m)

Thanks for chiming in, :jotes!

I like Anand's idea and your proposal a lot. Let's go for it!

Flags: needinfo?(m)

Anand, this bug has been stuck for a while, so I'm reassigning it to Karskaya.

Assignee: github → karskaja
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
See Also: → 1621290
Product: Webtools → Webtools Graveyard