Closed Bug 1405256 Opened 7 years ago Closed 6 years ago

[FTL] Some identical translations got turned into unreviewed suggestions

Categories

(Webtools Graveyard :: Pontoon, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: theo, Assigned: mathjazz)

Attachments

(2 files)

Not sure how this happened, but the suggestions for the strings here are completely identical (click the diff button), and they are pending suggestions even though they probably shouldn’t be:

https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?extra=has-suggestions&string=168151
Did those translations turn into unreviewed suggestions since yesterday, when we deployed the new python-fluent?
https://github.com/mozilla/pontoon/commit/b65629ee52244241587ab4e86dd240109eecf9cd

Looking at the timestamps makes me think that the issue occurred earlier and you just noticed it recently?
(In reply to Matjaz Horvat [:mathjazz] from comment #1)
> Did those translations turn into unreviewed suggestions since yesterday,
> when we deployed the new python-fluent?
> https://github.com/mozilla/pontoon/commit/b65629ee52244241587ab4e86dd240109eecf9cd
> 
> Looking at the timestamps makes me think that the issue occurred earlier and
> you just noticed it recently?

I just noticed it today, but it’s likely it’s been there for a while. I haven’t checked this filter in the last few weeks.
Thanks. That means it's not related to the latest python-fluent changes.

I'll look into this.
Assignee: nobody → m
Priority: -- → P2
Summary: Some identical translations got turned into unreviewed suggestions → [FTL] Some identical translations got turned into unreviewed suggestions
Since the original example from Comment 0 is no longer available, I'm adding a few more:

https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167712
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167754
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167753
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167755
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=168155
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=159268
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=166144
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=166145
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=160418
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167822
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167821
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167820
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167818
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167827
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167825
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167826
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167823
https://pontoon.mozilla.org/fr/test-pilot-website/all-resources/?string=167819

Those are all old translations, duplicated due to an old bug that was fixed without a migration to clean up the mess.

We should write such a migration.
Here's a naive algorithm for a migration that deletes duplicates. Checking condition 3 is pretty slow, so we should at least try to optimize it.

===

from django.db.models import Count

from pontoon.base.models import Translation

non_active_pks = []

# Find all translations that:

# 1. Are in FTL file format
ftl_translations = Translation.objects.filter(entity__resource__format="ftl")

# 2. Have a duplicate translation (same entity, locale, string)
duplicates = (
    ftl_translations
    .values("entity", "locale", "string")
    .annotate(Count("id"))
    .filter(id__count__gt=1)
)

# 3. Are non-active: within each group of duplicates, skip the first
#    translation (approved, then non-rejected, then newest) and collect
#    the rest. This runs one query per group, which is why it is slow.
for d in duplicates:
    non_active_pks.extend(
        Translation.objects
        .filter(entity=d["entity"], locale=d["locale"], string=d["string"])
        .order_by("-approved", "rejected", "-date")
        .values_list("pk", flat=True)[1:]
    )

# Then, delete them
Translation.objects.filter(pk__in=non_active_pks).delete()
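
For reference, the selection rule in condition 3 can be sketched in plain Python without the ORM. This is only an illustration under stated assumptions, not the migration that landed: `Row` and `duplicate_pks` are hypothetical names, and the rows are assumed to be already loaded into memory, so a single pass over them replaces the per-group queries.

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple


class Row(NamedTuple):
    # Hypothetical flat stand-in for a Translation record.
    pk: int
    entity: int
    locale: int
    string: str
    approved: bool
    rejected: bool
    date: datetime


def duplicate_pks(rows):
    """Return the pks of every row except the preferred one in each duplicate group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row.entity, row.locale, row.string)].append(row)

    to_delete = []
    for group in groups.values():
        if len(group) < 2:
            continue  # not a duplicate
        # Mirror order_by('-approved', 'rejected', '-date'):
        # approved first, then non-rejected, then newest.
        group.sort(key=lambda r: (not r.approved, r.rejected, -r.date.timestamp()))
        to_delete.extend(r.pk for r in group[1:])
    return to_delete
```

Grouping in memory trades database round trips for RAM, so it only makes sense if the set of FTL translations fits comfortably in memory.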
It's hard to write a performant algorithm for this using the Django ORM. I'll reuse jotes's proposal:
https://gist.github.com/jotes/d2c277a80913dc76c4f0b3005acf16ab
Commit pushed to master at https://github.com/mozilla/pontoon

https://github.com/mozilla/pontoon/commit/904fdc8903cddf21ee25755808c4e293ba615805
Fix bug 1405256: Delete duplicate FTL translations (#894)

And corresponding Translation Memory entries.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Commit pushed to master at https://github.com/mozilla/pontoon

https://github.com/mozilla/pontoon/commit/4269ce68cae6c0eb01770132f37e126b6463884a
Fix bug 1405256: Fix migration (#895)

Delete TranslationMemoryEntry instances first, because the tms QuerySet
gets empty after the Translation instances are deleted.
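
The ordering problem described in this commit message can be illustrated with a small in-memory simulation. This is not the Pontoon migration code: the dictionaries, `run_migration`, and `stale_entries` are hypothetical stand-ins, and the sketch assumes the dependent query is evaluated lazily against the current state of the translations table, as an unevaluated Django QuerySet would be.

```python
def run_migration(entries_first):
    """Simulate deleting duplicate translations and their memory entries."""
    translations = {1: "Bonjour", 2: "Bonjour", 3: "Salut"}  # pk -> string
    entries = {100: 1, 101: 2, 102: 3}  # memory entry pk -> translation pk
    doomed = {2}  # translation pks selected for deletion

    def stale_entries():
        # Lazy "query": evaluated at call time, so it only matches
        # entries whose translation still exists.
        return [
            pk for pk, t_pk in entries.items()
            if t_pk in doomed and t_pk in translations
        ]

    def delete_translations():
        for pk in doomed:
            translations.pop(pk, None)

    def delete_entries():
        for pk in stale_entries():
            del entries[pk]

    if entries_first:
        delete_entries()
        delete_translations()
    else:
        delete_translations()
        delete_entries()  # matches nothing: the translations are already gone
    return sorted(entries)
```

With `entries_first=True` the entry pointing at translation 2 is removed; with `entries_first=False` it survives as an orphan, which is the failure mode the follow-up commit describes.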
Product: Webtools → Webtools Graveyard