Closed Bug 1026131 Opened 10 years ago Closed 10 years ago

build gengo translation bookkeeping infrastructure

Categories

(Input Graveyard :: Submission, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Details

(Whiteboard: u=dev c=translations p=2 s=input.2014q2)

The Gengo human translation system will create translation jobs once per hour. When feedback comes in that needs to be translated, we'll queue it up in a database table until the next translation job creation run. This bug covers going through all the requirements, designing the table, and implementing it along with the required migrations.
Assignee: nobody → willkg
Priority: -- → P1
The rough requirements can be extrapolated from this list:
1. A GengoJob is about translating a single field from a single model instance and putting the translated text into another field of that model instance. Right now we are only translating the Response.description field, so this is a bit more general than we need, but it'll help us a ton when Input expands and we have other models to translate and potentially multiple fields in a model to translate.
2. We send translation jobs to Gengo in batches. Each batch is a single GengoOrder and has a unique order id.
3. The GengoJob has a status field. When the Response is saved to the database, we call a method on that object to give us a list of things that need to be translated. Those get sent to a celery task. The celery task calls a method in the GengoHumanTranslationSystem class (which hasn't been written yet, but you can see other translation system classes), which creates a GengoJob instance for each thing that needs to be translated. These will have a "created" status.
4. A cron job will kick off once an hour and (hand-waving general explanation here) will look for all GengoJob items in the "created" status. Then it'll bucket them by src_language. Then it'll use the Gengo API to create orders, create GengoOrder instances in our db, and update the status of all GengoJob instances for a given order to "in-progress".
5. The cron job will also use the Gengo API to pull back any completed translations and update the GengoJob and GengoOrder accordingly.
6. Over the course of translating something, we make a bunch of Gengo API calls. I want to record the responses we get back and be able to tie them into a conversation about a specific GengoJob or GengoOrder. This will make it a lot easier to find bugs in the system or edge cases we're not handling correctly. Plus it'll let us do metrics later so we can see how everything is performing.
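The hourly batching step in item 4 could be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from the PR: the dataclass fields (other than src_language and status, which the requirements name), the "translated_description" destination field in the usage note, and the submit_order callable standing in for the Gengo API call are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List

STATUS_CREATED = "created"
STATUS_IN_PROGRESS = "in-progress"


@dataclass
class GengoJob:
    # One field of one model instance to translate (hypothetical shape).
    instance_id: int
    src_field: str
    dst_field: str
    src_language: str
    status: str = STATUS_CREATED
    order_id: str = ""


@dataclass
class GengoOrder:
    # One batch of jobs sent to Gengo under a single order id.
    order_id: str
    src_language: str
    jobs: List[GengoJob] = field(default_factory=list)


def bucket_by_language(jobs: List[GengoJob]) -> Dict[str, List[GengoJob]]:
    """Group jobs still in the "created" status by source language."""
    buckets = defaultdict(list)
    for job in jobs:
        if job.status == STATUS_CREATED:
            buckets[job.src_language].append(job)
    return dict(buckets)


def create_orders(
    jobs: List[GengoJob],
    submit_order: Callable[[str, List[GengoJob]], str],
) -> List[GengoOrder]:
    """Create one order per source language and mark its jobs in-progress.

    submit_order is a stand-in for the real Gengo API call; it is assumed
    to return the order id for the submitted batch.
    """
    orders = []
    for lang, batch in sorted(bucket_by_language(jobs).items()):
        order_id = submit_order(lang, batch)
        for job in batch:
            job.status = STATUS_IN_PROGRESS
            job.order_id = order_id
        orders.append(GengoOrder(order_id, lang, batch))
    return orders
```

For example, calling create_orders(jobs, submit) with two "es" jobs and one "de" job in the "created" status would produce two orders, one per language, and leave any job that was already "in-progress" untouched. In the real system these would presumably be database models rather than dataclasses.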
First pass is in a PR: https://github.com/mozilla/fjord/pull/310
It's likely that further development on this project will require changes to those tables, but that's ok. This is a good first pass.
Landed in:
* https://github.com/mozilla/fjord/commit/38d8584
* https://github.com/mozilla/fjord/commit/98d30fb
Waiting to push this to production until I have other bits done.
Summary: build gengo translation queue → build gengo translation bookkeeping infrastructure
Pushed to prod already.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Fixed this in 2014q2. Fixing the whiteboard data.
Whiteboard: u=dev c=translations p=2 s=input.2014q3 → u=dev c=translations p=2 s=input.2014q2
Product: Input → Input Graveyard