Closed
Bug 464104
Opened 16 years ago
Closed 16 years ago
[silme] L10nObjectDiff produces wrong diffs
Categories
(Mozilla Localizations :: Infrastructure, defect)
Mozilla Localizations
Infrastructure
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: adriank, Unassigned)
References
Details
Attachments
(1 file, 4 obsolete files)
46.31 KB,
application/zip
|
Details |
When I'm trying to use L10nObjects and not EntityLists in compare-locales I'm experiencing strange behavior:
missingEntities: 337
missingEntitiesInMissingFiles: 9
missingFiles: 1
obsoleteEntities: 312
but it should be:
missingEntities: 102
missingEntitiesInMissingFiles: 9
missingFiles: 1
obsoleteEntities: 67
Somehow sometimes it adds the same entities as added AND removed at the same time, even if they were not changed.
After some testing I've found out, that the problem seems to be caused by the SequenceMatcher used in lines 9: http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/file/de21c3a3ded4/lib/silme/diff/l10nobject.py
I did a test case for this problem...
Reporter | ||
Comment 1•16 years ago
|
||
Reporter | ||
Comment 2•16 years ago
|
||
Reporter | ||
Comment 3•16 years ago
|
||
after looking at google it looks like we are not the first to experience strange behavior with the SequenceMatcher...
Comment 4•16 years ago
|
||
Ok, it was a tricky game, because it was made of several bugs which coincidentally were exposed by this testcase.
I started fixing them.
1) Fix http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/rev/837d264c5d7a
We did not collect 'replaced' by default. (also, fixed applyDiff for modified)
2) http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/rev/a07f6d11b65d
We collected all evidences od entity from l10nObject into entity list which made it possible that getEntities() will return the same entity twice (removed from one place, added to another).
In the future we should recognize it and mark as 'moved' flag (we already had it in silme stage1).
3) I'm working on, l10nObjectDiff does catch unmodified entities while entityListDiff does not. This should be consistent.
After the third patch this bug should be fixed. I'll try to have it by today. Two previous ones are already checked in and should give you no regressions.
Comment 5•16 years ago
|
||
Ok. This bug should be fixed. It's a complex fix, but will be cleaned up with bug 458445.
Patches:
1) http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/rev/837d264c5d7a
2) http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/rev/a07f6d11b65d
3) http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/rev/3487b2cf2110
The last one required a lot of debugging bug gave me a nice insight in how complex the whole idea of morphing L10nObject into EntityList is. That's good - it proves that it was a good idea to split those two objects, they have very different logics and its hard to switch from one to another from inside. I want to make sure its easy from outside of API.
Adrian, please confirm that it didn't break anything for you and I'll start working on PEP08/documentation/cleanups for the release :)
Comment 6•16 years ago
|
||
Also. It would be great to get an idea on performance hit you register from this.
As I said, it's to be expected (its because in the bottle neck in each parser I register one other element of the source dict), I just hope its not huge.
Reporter | ||
Comment 7•16 years ago
|
||
The bug seems to be partially fixed. For the remaining problems I have written a new, longer test along with a few examples.
While testing the last file (messenger.properties), Silme crashes -> maybe a new bug?
Attachment #347366 -
Attachment is obsolete: true
Attachment #347367 -
Attachment is obsolete: true
Attachment #347368 -
Attachment is obsolete: true
Comment 8•16 years ago
|
||
http://hg.mozilla.org/users/zbraniecki_mozilla.com/silme/rev/af429423f16b
This solves the latest testcase.
1) I double check for a complex cases with multiply entities with the same ID in one L10nObject. It is legal, but during dump to Entities it caused problems
2) I correctly mark entities as 'removed' and 'added' instead of 'replaced' while dumping to EntityList.
Comment 9•16 years ago
|
||
marking as fixed.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 10•16 years ago
|
||
Reporter | ||
Updated•16 years ago
|
Attachment #350121 -
Attachment description: test case v2 → test case (v3) for the remaining problems
Reporter | ||
Updated•16 years ago
|
Attachment #349106 -
Attachment is obsolete: true
Reporter | ||
Comment 11•16 years ago
|
||
unfortunately, problem is not fully solved. See the v3 test case...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 12•16 years ago
|
||
sorry, but the new testcase is showing 8 identical examples of duplicated entities.
In the future, when you're preparing a testcase please, ensure that you are indeed showing unique, distinguishable errors with it.
I don't find any bug with this, closing back. Please, make double check that you indeed faces a bug before reopening.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•