Closed Bug 1035333 Opened 10 years ago Closed 10 years ago

Build Telemetry Experiment for Vi/Pl/Tr translation trial on beta

Categories

(Firefox :: Translations, defect)

defect
Not set
normal
Points:
5

Tracking

()

RESOLVED FIXED
Iteration:
33.3

People

(Reporter: Felipe, Assigned: gps)

References

Details

Attachments

(13 files, 1 obsolete file)

5.30 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
495 bytes, patch
benjamin
: review+
Details | Diff | Splinter Review
3.93 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
4.98 KB, patch
Details | Diff | Splinter Review
4.02 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
2.19 KB, patch
benjamin
: review-
Details | Diff | Splinter Review
2.09 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
1.73 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
3.99 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
2.36 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
2.41 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
2.48 KB, patch
benjamin
: review+
Details | Diff | Splinter Review
1.32 KB, patch
Felipe
: review+
Details | Diff | Splinter Review
Similar to bug 1022411, this is about building the experiment for our beta trial. Besides what was done in bug 1022411, we need to think about:

- Redefining the experiment parameters
- The "Translations by" string is not localized on beta, so the experiment will need to set up the correct string depending on the locale (see bug 1032139). Or maybe Florian is planning to do this directly in the Translations.jsm code instead of through the experiment.. I don't know
- The telemetry-experiments-server has an experiment running limit of 42 days. I think this is arbitrary and we can just raise it, but we need to check with Benjamin about it. I believe the running time for this experiment will be somewhere between 12 - 18 weeks, so more than 42 days.
Flags: firefox-backlog+
Points: --- → 5
(In reply to :Felipe Gomes from comment #0)
> - The "Translations by" string is not localized on beta, so the experiment
> will need to set up the correct string depending on the locale (see bug
> 1032139). Or maybe Florian is planning to do this directly in the
> Translations.jsm code instead of through the experiment.. I don't know

The answer for this part is: the experiment doesn't need to do anything special. Florian handled all this in-tree in bug 1032139
I'm taking this [from the desktop iteration cycle].

I have little clue what's going on. I'll track down someone in the know (Felipe?) once I'm finished clearing my review queue of other Firefox bugs.
Assignee: nobody → gps
Status: NEW → ASSIGNED
Iteration: --- → 33.3
QA Whiteboard: [qa?]
From bug 1022411 comment 24, this is the title and description that goes into the localized section of install.rdf, in English:

Name: Automatic Translation
Description: Read more of the web by automatically translating sites written in different languages.
Translation experts: could we please get the 2 strings in comment #3 translated to vi, po, and tu?
Flags: needinfo?(splewako)
Flags: needinfo?(selim)
Flags: needinfo?(hainp2604)
Previously, experiments were static .xpi files checked into source
control. We enable support for running a Python script to produce the
final .xpi.

This functionality will be leveraged to produce multiple
experiments/XPIs from an input template.
Attachment #8454170 - Flags: review?(benjamin)
Attachment #8454171 - Flags: review?(benjamin)
The upcoming localization experiments will have their source committed
to the tree and the XPI will be produced during building. Instead of
invoking `zip` manually, we roll our own minimal zip file creation code.

This code is pretty basic. It was compared against existing Python code
in the add-on SDK for sanity. A subsequent patch will demonstrate this.
Attachment #8454172 - Flags: review?(benjamin)
Blocks: 1037160
To support the Vi, Po, and Tu variations of the translation experiment,
we're going to dynamically generate an XPI based on templates of their
files.

This patch adds templates for the translation experiment. The files were
obtained from the existing translation-de-aurora experiment. Every de
specific item has been stripped and replaced with %%ITEM%% template
markers. A subsequent patch will introduce code for building XPIs from
these templates.
Attachment #8454180 - Flags: review?(benjamin)
Now with moar manifest.json template.
Attachment #8454184 - Flags: review?(benjamin)
Attachment #8454180 - Attachment is obsolete: true
Attachment #8454180 - Flags: review?(benjamin)
Comment on attachment 8454184 [details] [diff] [review]
Add templates for Aurora translation experiment

I just realized we have a Python templating library already hooked up. I was planning to roll my own using str.replace(). But I guess I'll look into using genshi.template. Pull review for now. I may hold off publishing more patches until the entire patch set is done.
Attachment #8454184 - Flags: review?(benjamin)
(In reply to Gregory Szorc [:gps] from comment #4)
> Translation experts: could we please get the 2 strings in comment #3
> translated to vi, po, and tu?

Turkish (tr):
Name: Otomatik Çeviri
Description: Farklı dillerde yazılmış siteleri otomatik olarak çevirerek web'de daha fazla içeriği okuyun.
Flags: needinfo?(selim)
(In reply to Gregory Szorc [:gps] from comment #4)
> Translation experts: could we please get the 2 strings in comment #3
> translated to vi, po, and tu?

Polish (pl):
Name: Automatyczne tłumaczenia
Description: Korzystaj w większym stopniu z zasobów sieci, dzięki automatycznym tłumaczeniom stron w innych językach.
Flags: needinfo?(splewako)
(In reply to Gregory Szorc [:gps] from comment #4)
> Translation experts: could we please get the 2 strings in comment #3
> translated to vi, po, and tu?

Vietnamese (vi):
Name: Dịch tự động
Description: Đọc nội dung dễ dàng hơn bằng việc tự động dịch các trang sang nhiều ngôn ngữ khác nhau.
Flags: needinfo?(hainp2604)
Comment on attachment 8454170 [details] [diff] [review]
Support per-experiment build rules

I very slightly feel like this might be better as a separate method either in build.py or within the Experiment class.
Attachment #8454170 - Flags: review?(benjamin) → review+
Comment on attachment 8454171 [details] [diff] [review]
Add virtualenv and package paths to .hgignore

Have I mentioned my dislike for virtualenv when it's not needed...
Attachment #8454171 - Flags: review?(benjamin) → review+
Comment on attachment 8454172 [details] [diff] [review]
Python API for creating XPI files

I don't know whether the webops build machines are running python 2.7. Are the unicode_literals critical to this?
Attachment #8454172 - Flags: review?(benjamin) → review+
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #16)
> Comment on attachment 8454172 [details] [diff] [review]
> Python API for creating XPI files
> 
> I don't know whether the webops build machines are running python 2.7. Are
> the unicode_literals critical to this?

Pretty sure our infra is running 2.6 (or at least large parts of it are). unicode_literals work in 2.6.

The unicode_literals aren't critical. I just like to make new Python code I write forward compatible so I'm not creating technical debt.
If unicode_literals works in 2.6, then we're all good. Certainly don't care about anything prior to that.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> Have I mentioned my dislike for virtualenv when it's not needed...

I humbly challenge that this statement stems from a misunderstanding of the importance of virtualenvs to maintain sanity.

A virtualenv is a mechanism to avoid modifying the system/global Python install. "Make your changes in this one-off thing here [a virtualenv] instead of globally." As you work on more and more Python projects, these cumulative hacks [to the system Python] pile up and start to conflict. This leads to bugs, weird behavior, difficult-to-reproduce run-time environments, etc. Virtualenvs operate as Python environment overlays that isolate all these changes to the system Python install. They protect you from shooting yourself with a bullet fired 2 months ago. This is why I always use virtualenvs and recommend others do too.

As a casual Python developer, Benjamin, you may not see a benefit of virtualenvs. But to someone like me who hacks on a few dozen Python projects, virtualenvs are necessary. I use my knowledge of footgun existence to protect those who don't, which is why I recommend virtualenvs in build instructions.

I hope you found this enlightening. I'd be happy to blog in more detail.
Question: is the sha1 hash of the experiment.xpi file identical across different builds, or can some timestamp from buildtime affect the contents of the file (say, metadata about the inner files) produce a different hash each time?

While you're here, if you're willing to do so, it would be nice if the per-experiment build file would contain an optional parameter that would remove the <em:type> from the install.rdf file, or set its value to 1. In the absence of this optional parameter, type=128 should be preserved. This would be useful to use the build script during development to test the add-on as a normal add-on before producing a type=128 version of it. (128 = experiment)

If you don't want to do it or don't care enough, I'll probably add something like this myself the next time I create an experiment.
(In reply to :Felipe Gomes from comment #20)
> Question: is the sha1 hash of the experiment.xpi file identical across
> different builds, or can some timestamp from buildtime affect the contents
> of the file (say, metadata about the inner files) produce a different hash
> each time?

I made the custom zip packaging logic explicitly omit mtimes to reduce variance during builds. I haven't verified that zips are actually idempotent.

> While you're here, if you're willing to do so, it would be nice if the
> per-experiment build file would contain an optional parameter that would
> remove the <em:type> from the install.rdf file, or set its value to 1. In
> the absence of this optional parameter, type=128 should be preserved. This
> would be useful to use the build script during development to test the
> add-on as a normal add-on before producing a type=128 version of it. (128 =
> experiment)

Ooh, that's a good idea! I'll probably brush into this when I test this extension. I got side-tracked today by prior Firefox bugs :/
(In reply to Gregory Szorc [:gps] from comment #21)
> (In reply to :Felipe Gomes from comment #20)
> > Question: is the sha1 hash of the experiment.xpi file identical across
> > different builds, or can some timestamp from buildtime affect the contents
> > of the file (say, metadata about the inner files) produce a different hash
> > each time?
> 
> I made the custom zip packaging logic explicitly omit mtimes to reduce
> variance during builds. I haven't verified that zips are actually idempotent.

This is important, because a hash change is what triggers an update of the experiment, and we don't want to unnecessarily send users through the update path of all experiments whenever a new one is added and the server files are rebuilt.

If it's not possible to do that with certainty, maybe you can change your restriction of "either experiment.xpi or build.py" to "if there's an experiment.xpi file, use it; otherwise build it with the build script". This way we can use the build script to manually build the final xpi, but all the src and build instructions are still checked in.
There's a new requirement on the behavior of the experiment that differs a bit from the previous one. On installation, every user will get the "detectLanguage" pref activated. But the test/control groups should only be split after 10 days of the experiment running. This is to better measure the baseline data for each group instead of just believing they are similar.
(In other words, the pref showTranslationUI should only be toggled to true for the test group after 10 wall days of running the experiment).
The 10-day requirement should be filed and fixed separately. Let's avoid scopecreep!
Depends on: 1038493
I filed bug 1038493 for this. GPS: note that a new template for bootstrap.js might be coming from that bug.

Also, in your manifest.json/install.rdf there are mentions of "aurora", but this experiment is actually going to beta 32.  (e.g. the add-on id should be fx-translation-%%LOCALE%%-beta32@mozilla.org)
This appears to warrant QA. However, it's not clear what would be required in testing/verification.  Please provide insight, Thank you.
QA Whiteboard: [qa?] → [qa+]
I was testing dynamic xpi generation yesterday and confirmed the process can be idempotent. i.e. file mtimes aren't leaking into the zip.
To support the Vi, Po, and Tu variations of the translation experiment,
we're going to dynamically generate an XPI based on templates of their
files.

This patch adds templates for the translation experiment. The files were
obtained from the existing translation-de-aurora experiment. Every de
specific item has been stripped and replaced with template variables.
A subsequent patch will introduce code for building XPIs from
these templates.

Unfortunately, we didn't go 100% and make manifest.json a template. This
change would require significant refactorings to how build.py discovers
experiments. (It would need to support dynamically generated
manifest.json files.)
Attachment #8456355 - Flags: review?(benjamin)
The API key needs work. I'm talking with jakem about establishing a
file with secrets only available to the server/build process. I figure
"foo" and "bar" can be stand-ins for the moment.
Attachment #8456357 - Flags: review?(benjamin)
Summary: Build Telemetry Experiment for Vi/Po/Tu translation trial on beta → Build Telemetry Experiment for Vi/Pl/Tr translation trial on beta
Attachment #8456444 - Flags: review?(benjamin)
Attachment #8456355 - Flags: review?(benjamin) → review+
Comment on attachment 8456356 [details] [diff] [review]
Shared code for producing a translation experiment from templates

I think it's a little strange to have experiment-specific in the telexserver/ directory. Can we either move this into a translation-specific directory, or make this generic code not specific to the translation experiment but valid for any experiment?

If it's going to be generic, perhaps build_translation_xpi should be more generic, such as

def build_translation_xpi(template_dir, dest_xpi, templateparams)

or **templateparams
Attachment #8456356 - Flags: review?(benjamin) → review-
Comment on attachment 8456357 [details] [diff] [review]
Vietnamese translation experiment

rs=me for this once the templating patch is done
Attachment #8456357 - Flags: review?(benjamin) → review+
Attachment #8456444 - Flags: review?(benjamin) → review+
(In reply to Benjamin Smedberg  [:bsmedberg] Away 19-July through 3-Aug from comment #34)
> Comment on attachment 8456356 [details] [diff] [review]
> Shared code for producing a translation experiment from templates
> 
> I think it's a little strange to have experiment-specific in the
> telexserver/ directory. Can we either move this into a translation-specific
> directory, or make this generic code not specific to the translation
> experiment but valid for any experiment?
> 
> If it's going to be generic, perhaps build_translation_xpi should be more
> generic, such as
> 
> def build_translation_xpi(template_dir, dest_xpi, templateparams)
> 
> or **templateparams

I'm tempted to call YAGNI until proved otherwise. We can certainly refactor later.

The reason the code exists in telexserver is because it's a reusable module. If this were other languages, I'd consider a #include equivalent. But Python *really* frowns on this pattern. I would be inviting later maintainers to curse my name. I'd prefer to save face.

Stay tuned for a new patch that hopefully addresses things.
Some experiments contain secret values such as API keys. We don't want
these checked into source control. However, we don't want to make
developers' lives harder either.

This patch introduces a "secrets" file. It is simply a Python file whose
executed content gets converted into a dict. It is similar to how Django
config files and moz.build files work.

There exists an in-tree secrets file with dummy values or values we
don't care about leaking to the outside world. In production, the actual
secrets file is installed via some other mechanism and the official
build process sets an environment variable with the location of this
file.

While we'll eventually want the server to provide a secrets file, we can
avoid this extra work for the moment by checking in an .xpi file
produced with the secrets. When the server is updated, we can remove the
.xpi and use the new build mechanism.
Attachment #8456492 - Flags: review?(benjamin)
Attachment #8456493 - Flags: review?(benjamin)
Attachment #8456494 - Flags: review?(benjamin)
Attachment #8456495 - Flags: review?(benjamin)
I just remembered I never updated the placeholder time values in the experiment manifests. What is the run-time of this experiment? The initial comment gives a range. I need hard dates.
Flags: needinfo?(benjamin)
note: I pushed the updated template for bootstrap.js in bug 1038493.
Comment on attachment 8456492 [details] [diff] [review]
Framework for obtaining secret values

This feels like overkill to me, but ok.
Attachment #8456492 - Flags: review?(benjamin) → review+
Flags: needinfo?(benjamin)
The experiment should start immediately (whenever the 32beta upgrade happens next week), although I think that maxversion is probably incorrect: I believe we want the experiment to continue into 34 beta at least. Either Chad or Felipe needs to answer the question about the end date and total runtime.

Note: do not use maxStartTime because it's still broken on the client.
Flags: needinfo?(felipc)
Flags: needinfo?(cweiner)
Comment on attachment 8456493 [details] [diff] [review]
Vietnamese translation experiment

r=me except for figuring out the final dates/runtime. I'm pretty sure you'll need to update endTime/maxActiveSeconds/maxVersion for all of these.
Attachment #8456493 - Flags: review?(benjamin) → review+
Attachment #8456494 - Flags: review?(benjamin) → review+
Attachment #8456495 - Flags: review?(benjamin) → review+
Start time: Jul 16, 2014   =>  1405479600
End time:   Nov 25, 2014   =>  1416880800
maxActiveSeconds: 126 days =>  10886400
minVersion: 32.0 (no need for "a" suffix for beta)

I think maxVersion is unnecessary, but if we want to include one: 34.0

Start time will actually be July 22 when the beta channel becomes 32, but by setting it to a time in the past it makes this a bit easier to QA with no downsides..
Flags: needinfo?(felipc)
Blocks: 1041598
Translation experiments need to run for 126 days. We increase the arbitrary
42 day max to another arbitrary value, 180 days.
Attachment #8459663 - Flags: review?(felipc)
Attachment #8459663 - Flags: review?(felipc) → review+
Everything is checked in and this should be ready for QA and final deployment.
Flags: needinfo?(cweiner)
QA and final deploy will happen on separate bugs, so marking this one as fixed.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
(In reply to :Felipe Gomes (away Jul 23 - Aug 13) from comment #52)
> QA and final deploy will happen on separate bugs, so marking this one as
> fixed.

Hi Felipe, based on your comment should this bug be marked as [qa-]?
Yeah, let's call this QA- and leave that to bug 1041598.
QA Whiteboard: [qa+] → [qa-]
Blocks: 1042184
Blocks: 1042728
Blocks: 1072741
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: