Closed Bug 460791 Opened 16 years ago Closed 15 years ago

Create l10n nightly builds for mozilla-central

Categories

(Release Engineering :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: coop)

Attachments

(12 files, 15 obsolete files)

7.36 KB, patch (coop: review+)
11.08 KB, patch (coop: review+)
1.10 KB, patch (coop: review+)
2.39 KB, patch (coop: review+)
2.15 KB, patch
5.41 KB, patch (coop: review+)
10.78 KB, patch (coop: review+)
7.70 KB, patch
7.40 KB, patch (coop: review+)
5.97 KB, patch (armenzg: review+)
2.63 KB, patch (coop: review+, bhearsum: checked-in+)
691 bytes, patch (armenzg: review+)

We need to generate the installers for all locales every night
Priority: -- → P2
There are a few TODO lines, and this is not the right approach: it makes more sense to trigger the l10n repackages with a Dependent scheduler after this:

    c['schedulers'].append(Nightly(
        name='%s nightly' % name,
        branch=name,
        hour=[2],
        builderNames=nightlyBuilders
    ))

The steps in l10n_master.py worked all the way up to the upload step (my machine cannot really upload).

We might want to use one of the Mercurial classes for what I do to get the source code for en-US and/or the l10n locales.
Attachment #344180 - Flags: review?(ccooper)
Comment on attachment 344180 [details] [diff] [review]
first attempt to have l10n nightly builds on fx3.1 in parallel

I strongly suggest to not use the buildbot mercurial classes, they're brittle shit. Pulls for l10n builds are brittle, too. I hope that http://hg.mozilla.org/users/axel_mozilla.com/tooling/rev/28048340c588 will ease that pain, I suggest you update your code for that, too.

Removing dist sounds too harsh, as you'll kill the en-US downloads in each build. In particular now that the wget target only updates on new builds. I'd just kill upload instead, if you want to.
(In reply to comment #2)
> (From update of attachment 344180 [details] [diff] [review])
> Removing dist sounds too harsh, as you'll kill the en-US downloads in each
> build. In particular now that the wget target only updates on new builds. I'd
> just kill upload instead, if you want to.
True - makes sense to me

> I hope that
> http://hg.mozilla.org/users/axel_mozilla.com/tooling/rev/28048340c588 will ease
> that pain, I suggest you update your code for that, too.
>
Is this "hg -R %(locale)s pull -r tip" to get rid of any previous pull from a specific revision (now that you can set the en-US revision of a build rather than the tip)?
(In reply to comment #3)
> Is this "hg -R %(locale)s pull -r tip" to get rid of any previous pull from a
> specific revision (now that you can set the en-US revision of a build rather
> than the tip)?

No, it's a workaround for that particular race condition in pull. The race doesn't affect clone, according to djc, that's why it's only there for pull.

I never pull to the nightly revision, btw, I merely update to it. The repo that I pull stays unchanged.
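
A minimal sketch of the pull-then-update sequence described above, written as plain argv lists so it can be dropped into a shell step. The function name, the example locale and the example revision are hypothetical; only the `hg -R <locale> pull -r tip` form comes from this bug, and falling back to `tip` when no revision is given is an assumption.

```python
def locale_sync_commands(locale, revision=None):
    """Return hg argv lists that refresh a locale clone, then pin its working copy."""
    commands = [
        # Pull the tip explicitly; this thread describes that as the
        # workaround for the race condition in pull.
        ["hg", "-R", locale, "pull", "-r", "tip"],
    ]
    # Only the working directory moves to the requested revision;
    # the pulled repository itself stays unchanged.
    target = revision if revision is not None else "tip"
    commands.append(["hg", "-R", locale, "update", "-r", target])
    return commands


if __name__ == "__main__":
    # "de" and the revision hash are placeholders for illustration only.
    for argv in locale_sync_commands("de", "abcdef123456"):
        print(" ".join(argv))
```

Keeping the pull and the update as separate commands mirrors the point above: the repository is always pulled the same way, and only the update step depends on the nightly revision.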
(In reply to comment #2)
> (From update of attachment 344180 [details] [diff] [review])
> I strongly suggest to not use the buildbot mercurial classes, they're brittle
> shit. Pulls for l10n builds are brittle, too.

Can you expand on this? We use them fine for en-US builds...are you talking about bugs in them (one of which we have a patch for), the fact that they don't work well if you use more than one per Builder...?
Right, now I remember. The mercurial sources clobber if they're not happy with the revision they get, so they get happily slow when you have more than one source step in the builder.
(In reply to comment #6)
> Right, now I remember. The mercurial sources clobber if they're not happy with
> the revision they get, so they get happily slow when you have more than one
> source step in the builder.

The bug I know of is "Mercurial sources clobber whenever they have a revision handed to them" (http://buildbot.net/trac/ticket/277). We have a patch for this landed in our repository. I'm unsure how they will work when used multiple times in a builder, but the aforementioned problem is not a problem for us.
This is for "mozilla-central" branch only. Let's leave l10n repackages for "tracemonkey" for another moment

I have been able to use this code to repackage the locales that are uncommented in the declaration of the DependentL10n scheduler.

I have to test this to be able to check for the latest list from all-locales but the patch is 90% ready
Attachment #344180 - Attachment is obsolete: true
Attachment #344305 - Flags: review?(ccooper)
Attachment #344180 - Flags: review?(ccooper)
The branch and reason parameters are passed to NoMergeStamp, which does not make sense for a Dependent scheduler (it only makes sense for a Scheduler that reacts to commits).

The self argument was also missing from __init__.
Attachment #344306 - Flags: review?(ccooper)
Comment on attachment 344305 [details] [diff] [review]
moz2-staging: generate L10n repackages in parallel for mozilla-central only after an en-US nightly happens (2AM)


>+L10nNightlyFactory.addStep(ShellCommand,
>+    command=["make", WithProperties("installers-%(locale)s")],
>+    #It seems that I need to set it because of /packager.mk#144
>+    env={'PKG_DMG_SOURCE':'firefox'},
>+    haltOnFailure = True,
>+    workdir="mozilla-central/browser/locales"
>+)

I didn't need that. If we do, it's a bug that should be fixed.
Comment on attachment 344306 [details] [diff] [review]
class DependentL10n had a missing argument and added branch and reason

Do we want to actually set a specific l10n-related reason for clarity?
Attachment #344306 - Flags: review?(ccooper) → review+
(In reply to comment #11)
> (From update of attachment 344306 [details] [diff] [review])
> Do we want to actually set a specific l10n-related reason for clarity?
For the dependent scheduler? We could, but I do not know what to set it to. Does that value get displayed somewhere in the waterfall?
(In reply to comment #8)
> This is for "mozilla-central" branch only. Let's leave l10n repackages for
> "tracemonkey" for another moment

There's no need to worry about l10n builds for project branches like tracemonkey.
Comment on attachment 344305 [details] [diff] [review]
moz2-staging: generate L10n repackages in parallel for mozilla-central only after an en-US nightly happens (2AM)

> +# custom classes used
> +from buildbotcustom.l10n.l10n import BuildL10n

Ugh, do we really have an l10n class nested in an l10n directory? That seems poor, naming-wise. Not critical to fix, I guess.

> +# This will remove the folder from which we upload the packages that are generated

For clarity's sake, can we move the comments out from inside the step declaration and put them before?

> +    env={'PKG_DMG_SOURCE':'firefox'},

Can we make this configurable so that the other Mozilla projects can re-use this if they want? Defaulting to 'firefox' is fine though.

>+# This will upload everything in dist/upload, I assume that the first 
> +# step run in this build is "rm -rf mozilla-central/dist/upload" 

Again, move the general comment to outside the step.

> +    # for now let's only add notifiers for the main branch

(3 occurrences) As Nick mentioned, we will only ever care about mozilla-central, although making this configurable for other projects might be nice.
Attachment #344305 - Flags: review?(ccooper) → review-
In 1.9.0 we use a Nightly scheduler, which is why I left a TODO that can be tracked in another bug. Even if the 1.9.0 master's buildbotcustom were updated it would still work, but at some point I would like to remove repoType = None so that the TODO line can go away.

I was not sure what to do with the errback for d.addCallback(_getLocalesList). Axel, what is the proper way?
Attachment #344306 - Attachment is obsolete: true
Attachment #344519 - Flags: review?(ccooper)
I'd leave the errback unhandled, that should be more or less equivalent to what you do with the errors that subprocess might give you.

You should give a timeout to getPage(). As it's hard to recover, I'd use a long one, like 5 minutes or so. But still.
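
A rough sketch of the same idea outside Twisted: fetch the locale list with an explicit timeout and split it on whitespace. This swaps in the standard library's `urllib.request.urlopen` for `getPage()`, so it is blocking and is not what the patch does. The function names and the constant are made up for illustration, and the 5 minutes only mirror the suggestion above.

```python
from urllib.request import urlopen

# "A long one, like 5 minutes or so", expressed in seconds.
FETCH_TIMEOUT_SECONDS = 5 * 60


def parse_locales(data):
    """Split the text of an all-locales file into a list of locale codes."""
    # str.split() with no arguments already drops blank lines and extra spaces.
    return data.split()


def fetch_locales(url, timeout=FETCH_TIMEOUT_SECONDS):
    """Fetch and parse an all-locales file; raises on network errors or timeout."""
    with urlopen(url, timeout=timeout) as response:
        return parse_locales(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Parsing is shown on an inline sample so the example needs no network.
    print(parse_locales("de\nfr\n\nja  ja-JP-mac\n"))
```

Errors are deliberately left unhandled here, in line with the advice above to leave the errback unhandled.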
Fixed the mentioned comments and added the ability to add BRANCH and PROJECT

I need to see this running on staging to get confident that it will work at the same time with everything else.
In my local setup I only added builder for linux with:
 for platform in branch['platforms'].keys():
+   if platform.find('linux') >= 0 and len(platform) == 5:
       builders.append('%s build' % branch['platforms'][platform]['base_name'])

Let me know when you feel I can run this in staging
Attachment #344305 - Attachment is obsolete: true
Attachment #344526 - Flags: review?(ccooper)
Comment on attachment 344526 [details] [diff] [review]
 moz2-staging: generate L10n repackages in parallel for mozilla-central only after an en-US nightly happens (2AM)

+allLocalesURL  = 'http://hg.mozilla.org/mozilla-central/index.cgi/raw-file/tip/browser/locales/all-locales'

Can we remove index.cgi from the path? I don't think it's necessary.

The rest looks good. r+ with that small change.
Attachment #344526 - Flags: review?(ccooper) → review+
Attachment #344519 - Flags: review?(ccooper) → review+
Comment on attachment 344519 [details] [diff] [review]
DependentL10n changes plus adding the ability to deal with hg

>? tools/buildbotcustom/__init__.pyc
>? tools/buildbotcustom/env.pyc
>? tools/buildbotcustom/misc.pyc
>? tools/buildbotcustom/changes/__init__.pyc
>? tools/buildbotcustom/changes/hgpoller.pyc
>? tools/buildbotcustom/l10n/.scheduler.py.swp
>? tools/buildbotcustom/l10n/__init__.pyc
>? tools/buildbotcustom/l10n/l10n.pyc
>? tools/buildbotcustom/l10n/scheduler.pyc
>? tools/buildbotcustom/process/__init__.pyc
>? tools/buildbotcustom/process/factory.pyc
>? tools/buildbotcustom/steps/__init__.pyc
>? tools/buildbotcustom/steps/misc.pyc
>? tools/buildbotcustom/steps/test.pyc
>? tools/buildbotcustom/steps/transfer.pyc
>? tools/buildbotcustom/steps/updates.pyc
>? tools/buildbotcustom/unittest/__init__.pyc
>? tools/buildbotcustom/unittest/steps.pyc

Can I get you to add a top-level .cvsignore file for buildbotcustom that will ignore the .pyc files please?

>Index: tools/buildbotcustom/l10n/scheduler.py
>===================================================================
>@@ -19,32 +19,41 @@
>+      self.scheduler         = scheduler
>+      self.repoType          = repoType

Why the extra space here before the "=" ?

>@@ -87,30 +96,38 @@ class L10nMixin(object):
>+        else: #the repoType is 'hg'
>+           #getPage returns a defered that will return a string
>+           d = getPage(self.localesURL)
>+           def _getLocalesList(data):
>+               return filter(None, data.split()) 
>+           d.addCallback(_getLocalesList)
>+           return d

Did you see Axel's comment about setting a timeout for this step? It's important that we not hang indefinitely here.

>@@ -157,19 +180,26 @@ class DependentL10n(Dependent):
>+      # I wonder if we should make each scheduler to set its
>+      # own submit function to be passed to the L10nMixin

What logic/info would you include in the subclassed submit function that would make it necessary?

Some little nits, but overall it's good. The timeout is the big issue. It could be we're willing to let that slide until it bites us, just so long as we're aware. With or without the timeout, let's get this running on stage to test it out and then go from there.
This is what I have done, and the problems I found, in chronological order, when setting up the parallel l10n repackages on staging-master:

1) put the diffs of previous work in:
/home/cltbld/armenzg/buildbot-configs.25Oct.diff
/home/cltbld/armenzg/cvs-buildbot-config.25Oct.diff
2) hg update -C & cvs update -C
3) cvs -d :ext:stgbld@cvs.mozilla.org:/cvsroot co -d buildbotcustom mozilla/tools/buildbotcustom (since l10n module was missing)
4) apply the patches from this bug
5) symlink l10n_config.py and l10n_master.py

PROBLEMS:
#########

1)
File "/tools/buildbotcustom/buildbotcustom/process/factory.py", line 755, in __init__
	    self.addStep(SetProperty,
	exceptions.NameError: global name 'SetProperty' is not defined

SOLUTION:
 - commented out import of release_master.py

2) 
|--- <exception caught here> ---
|	  File "/tools/twisted-8.0.1/lib/python2.5/site-packages/twisted/internet/defer.py", line 323, in _runCallbacks
|	    self.result = callback(self.result, *args, **kw)
|	  File "/tools/buildbotcustom/buildbotcustom/l10n/scheduler.py", line 99, in _cbLoadedLocales
|	    self.scheduler.submit(
|	exceptions.AttributeError: DependentL10n instance has no attribute |'submit'
-------------------------
|	--- <exception caught here> ---
|	  File "/tools/buildbot/lib/python2.5/site-packages/buildbot/process/base.py", line 336, in startBuild
|	    self.setupBuild(expectations) # create .steps
|	  File "/tools/buildbotcustom/buildbotcustom/l10n/l10n.py", line 44, in setupBuild
|	    self.setProperty('locale', bd.locale)
|	exceptions.TypeError: setProperty() takes exactly 4 arguments (3 given)
-------------------------

SOLUTION:
 - in buildbot 0.7.9 the method has been renamed to submitBuildSet, and setProperty needs an extra argument
--> used self.scheduler.submitBuildSet() instead of self.scheduler.submit()
--> changed line to:
self.setProperty('locale', bd.locale, "Build")

3)- folder to upload repackages to is missing
SOLUTION:
 - I had to create the folder /home/ftp/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n in staging-stage.build.mozilla.org

4) Permission denied when uploading the packages
drwxr-xr-x  2 ffxbld ffxbld  4096 Oct 25 17:54 latest-mozilla-central-l10n

SOLUTION:
chmod 775 latest-mozilla-central-l10n/

5) a folder named "latest" is created in latest-mozilla-central-l10n
SOLUTION:
Add the missing trailing slash to uploadPath in l10n_config.py

6) I still get permission denied even though I can run the command scp locally in moz2-linux-slave04
SOLUTION:
 - added -i key and ffxbld@
command=['sh','-c','scp -i ~/.ssh/ffxbld_dsa -r * ffxbld@'+ftpserver+":"+uploadPath],

7) the l10n builders do not start because the mac en-US nightly has not finished
SOLUTION:
 - Connected to moz2-darwin9-slave01 to kick the mac slave
the buildbot.tac had incorrect umask, password and port.
Fixed and backed up as /builds/slave/buildbot.tac.armen

8) the en-US builders start but not the l10n ones
The problem is that there is no linux-64 and therefore the Nightly scheduler that triggers the Dependent one does not start

SOLUTION:
-> I have added the line:

before this:
|if platform in ("linux", "macosx", "win32"):
|   nightlyBuilders.append('%s nightly' % \
|                                    branch['platforms'][platform]['base_name'])
|                l10nNightlyBuilders.append('%s l10n nightly' % \
|                                            branch['platforms'][platform]['base_name'])

|if platform in ("linux", "macosx", "win32"):
|   mozilla2_l10n_nightly_builder = {
|                   'name': '%s l10n nightly' % pf['base_name'],
|   etc...

################

1) I will write the patches later
2) The l10n custom classes need those 2 slight changes depending on whether they run on 0.7.7 or 0.7.9. How do I approach this?
3) How do we deal with linux-64? Is it supposed to run an en-US nightly? Can we dedicate a Nightly scheduler just for it? Shall we separate the platforms that have to generate l10n repackages after an en-US build by adding parameters to config.py?
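
One possible approach to question 2, sketched under assumptions: detect at runtime which method the scheduler offers and how many arguments setProperty accepts, instead of branching on a version number. The stand-in classes below are hypothetical and only imitate the two behaviours reported in this comment. They are not buildbot classes, and `inspect.signature` needs a modern Python rather than the 2.5 used on the masters.

```python
import inspect


def submit_buildset(scheduler, buildset):
    """Call submitBuildSet if the scheduler has it, else fall back to submit."""
    submit = getattr(scheduler, "submitBuildSet", None)
    if submit is None:
        submit = scheduler.submit
    return submit(buildset)


def set_locale_property(build, locale, source="Build"):
    """Set 'locale', passing the source argument only if setProperty accepts it."""
    # For a bound method, the signature does not include self.
    params = inspect.signature(build.setProperty).parameters
    if len(params) >= 3:
        return build.setProperty("locale", locale, source)
    return build.setProperty("locale", locale)


# Hypothetical stand-ins that imitate the two API shapes described above.
class OldScheduler:
    def submit(self, buildset):
        return ("submit", buildset)


class NewScheduler:
    def submitBuildSet(self, buildset):
        return ("submitBuildSet", buildset)


class OldBuild:
    def setProperty(self, name, value):
        return (name, value)


class NewBuild:
    def setProperty(self, name, value, source):
        return (name, value, source)


if __name__ == "__main__":
    print(submit_buildset(OldScheduler(), "bs"), submit_buildset(NewScheduler(), "bs"))
    print(set_locale_property(OldBuild(), "de"), set_locale_property(NewBuild(), "de"))
```

Feature detection keeps a single copy of the l10n classes working against both masters, at the cost of a small indirection at each call site.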
I need help with this:
http://staging-master.build.mozilla.org:8010/builders/OS%20X%2010.5.2%20mozilla-central%20nightly/builds/12/steps/hg/logs/stdio

ERROR:
abort: error: Temporary failure in name resolution

I tried to run the same command on the local machine and it succeeded, but I noticed that SSH_CLIENT and SSH_CONNECTION were slightly different.
Until I get this fixed we won't be able to see l10n builders being triggered.

I have no clue what is going on

-- output --
/tools/python/bin/hg clone http://hg.mozilla.org/mozilla-central /builds/slave/mozilla-central-macosx-nightly/build
 in dir /builds/slave/mozilla-central-macosx-nightly (timeout 1800 secs)
 watching logfiles {}
 argv: ['/tools/python/bin/hg', 'clone', 'http://hg.mozilla.org/mozilla-central', '/builds/slave/mozilla-central-macosx-nightly/build']
 environment:
  CVS_RSH=ssh
  HOME=/Users/cltbld
  LOGNAME=cltbld
  MAIL=/var/mail/cltbld
  MANPATH=/usr/share/man:/usr/local/share/man:/usr/X11/man
  OLDPWD=/Users/cltbld
  PATH=/tools/buildbot/bin:/tools/python/bin:/opt/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
  PWD=/builds/slave
  PYTHONHOME=/tools/python
  PYTHONPATH=/tools/buildbot/lib/python2.5/site-packages:/tools/twisted/lib/python2.5/site-packages/:/tools/twisted-core/lib/python2.5/site-packages:/tools/zope-interface/lib/python2.5/site-packages:
  SHELL=/bin/bash
  SHLVL=1
  SSH_CLIENT=10.2.72.11 42895 22
  SSH_CONNECTION=10.2.72.11 42895 10.2.71.165 22
  SSH_TTY=/dev/ttys001
  TERM=xterm
  USER=cltbld
  _=/tools/buildbot/bin/buildbot
 closing stdin
 using PTY: True
abort: error: Temporary failure in name resolution
SetProperty is a step in 0.7.9.

If you're using 0.7.9, is it a good idea to still use the old Build-class based code?
(In reply to comment #22)
> SetProperty is a step in 0.7.9.
> 
> If you're using 0.7.9, is it a good idea to still use the old Build-class based
> code?
>
The comment regarding SetProperty does not affect the l10n stuff since I do not use SetProperty.
The old Build class works fine with 0.7.9, aside from the new submitBuildSet and the fact that setProperty now takes 4 parameters instead of 3
(In reply to comment #21)
> I need help with this:
> http://staging-master.build.mozilla.org:8010/builders/OS%20X%2010.5.2%20mozilla-central%20nightly/builds/12/steps/hg/logs/stdio
> 
> ERROR:
> abort: error: Temporary failure in name resolution
> 
It seems the problem was that the buildbot version was 0.7.7

I have upgraded and it seems to work now
The green after upgrading was an illusion.
I hit the same problem

-----------------------------
starting mercurial operation
rm -rf /builds/slave/mozilla-central-macosx-nightly/build
 in dir /builds/slave/mozilla-central-macosx-nightly (timeout 1800 secs)
 watching logfiles {}
 argv: ['rm', '-rf', '/builds/slave/mozilla-central-macosx-nightly/build']
 environment:
  CVS_RSH=ssh
  HOME=/Users/cltbld
  LOGNAME=cltbld
  MAIL=/var/mail/cltbld
  MANPATH=/usr/share/man:/usr/local/share/man:/usr/X11/man
  OLDPWD=/tools
  PATH=/tools/buildbot/bin:/tools/python/bin:/opt/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
  PWD=/tools/buildbot-0.7.9
  PYTHONHOME=/tools/python
  PYTHONPATH=/tools/buildbot/lib/python2.5/site-packages:/tools/twisted/lib/python2.5/site-packages/:/tools/twisted-core/lib/python2.5/site-packages:/tools/zope-interface/lib/python2.5/site-packages:
  SHELL=/bin/bash
  SHLVL=1
  SSH_CLIENT=10.2.72.11 44411 22
  SSH_CONNECTION=10.2.72.11 44411 10.2.71.165 22
  SSH_TTY=/dev/ttys001
  TERM=xterm
  USER=cltbld
  _=/tools/buildbot/bin/buildbot
 closing stdin
 using PTY: True
elapsedTime=0.013018
/tools/python/bin/hg clone http://hg.mozilla.org/mozilla-central /builds/slave/mozilla-central-macosx-nightly/build
 in dir /builds/slave/mozilla-central-macosx-nightly (timeout 1800 secs)
 watching logfiles {}
 argv: ['/tools/python/bin/hg', 'clone', 'http://hg.mozilla.org/mozilla-central', '/builds/slave/mozilla-central-macosx-nightly/build']
 environment:
  CVS_RSH=ssh
  HOME=/Users/cltbld
  LOGNAME=cltbld
  MAIL=/var/mail/cltbld
  MANPATH=/usr/share/man:/usr/local/share/man:/usr/X11/man
  OLDPWD=/tools
  PATH=/tools/buildbot/bin:/tools/python/bin:/opt/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
  PWD=/tools/buildbot-0.7.9
  PYTHONHOME=/tools/python
  PYTHONPATH=/tools/buildbot/lib/python2.5/site-packages:/tools/twisted/lib/python2.5/site-packages/:/tools/twisted-core/lib/python2.5/site-packages:/tools/zope-interface/lib/python2.5/site-packages:
  SHELL=/bin/bash
  SHLVL=1
  SSH_CLIENT=10.2.72.11 44411 22
  SSH_CONNECTION=10.2.72.11 44411 10.2.71.165 22
  SSH_TTY=/dev/ttys001
  TERM=xterm
  USER=cltbld
  _=/tools/buildbot/bin/buildbot
 closing stdin
 using PTY: True
abort: error: Temporary failure in name resolution
elapsedTime=0.464705
program finished with exit code 255
(In reply to comment #25)
> /tools/python/bin/hg clone http://hg.mozilla.org/mozilla-central
> /builds/slave/mozilla-central-macosx-nightly/build
>  in dir /builds/slave/mozilla-central-macosx-nightly (timeout 1800 secs)
>  watching logfiles {}
>  argv: ['/tools/python/bin/hg', 'clone',
> 'http://hg.mozilla.org/mozilla-central',
> '/builds/slave/mozilla-central-macosx-nightly/build']


Have you tried logging into the slave and running a clone by hand? Are you using the right ssh key?
(In reply to comment #26)
> (In reply to comment #25)
> > /tools/python/bin/hg clone http://hg.mozilla.org/mozilla-central
> > /builds/slave/mozilla-central-macosx-nightly/build
> >  in dir /builds/slave/mozilla-central-macosx-nightly (timeout 1800 secs)
> >  watching logfiles {}
> >  argv: ['/tools/python/bin/hg', 'clone',
> > 'http://hg.mozilla.org/mozilla-central',
> > '/builds/slave/mozilla-central-macosx-nightly/build']
> 
> 
> Have you tried logging into the slave and running a clone by hand? Are you
> using the right ssh key?

I tried, and the clone worked, but as I have found out, the slave was for unittest and not for doing builds

Our configuration only has a windows slave and a linux slave
As the description says, I have added:
- timeout
- 0.7.9 changes
Attachment #344519 - Attachment is obsolete: true
Attachment #344894 - Flags: review?(ccooper)
I have made slight changes to l10n_config.py and l10n_master.py
- mkdir step
- autoconf step
They were both bugging me in windows
Attachment #344526 - Attachment is obsolete: true
Attachment #344897 - Flags: review?(ccooper)
Quick status update:
- the l10n repackages in parallel logic is happening correctly
- the windows machine is not able to repackage (more investigation required http://tinderbox.mozilla.org/showlog.cgi?log=MozillaStaging/1225118136.1225118177.10904.gz&fulltext=1)
- there is no mac slave so I won't be able to test (I had moz-darwin9-slave1 but it was for unittest)

URLs:
- the repackages are being uploaded here: http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n/
- the logs are going in here: http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaStaging
Windows repackages are going to face problems due to bug 461513, but that doesn't fix the configure problem I see in the logs.
Depends on: 461513
Added dependency to mentioned bug.
I have tested the patch on the mentioned windows slave, and repackaging works with it applied

Axel, which configure problem?
The linux builds are unstable like yuc, no idea.

The windows builds report that they can't execute configure, see http://tinderbox.mozilla.org/showlog.cgi?log=MozillaStaging/1225114002.1225114017.27915.gz&fulltext=1
No more patches from me

I have added two lines "if platform in ('linux','win32', 'macosx'):"
As per convo with joduinn, we won't be generating linux-64 builds for now and we will soon have a macosx to do the nightly builds
Attachment #344897 - Attachment is obsolete: true
Attachment #344955 - Flags: review?(ccooper)
Attachment #344897 - Flags: review?(ccooper)
As per convo with joduinn, more slaves will be added to staging-master (win32, linux, macosx) so that it looks more like the production environment

The wget step is working again for the win32 machine in staging

We should get some green and win32 repackages during the next two hours in:
* http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n/
* http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaStaging
NOTE:
I have put the master to trigger repackages every 3 hours to get a lot of output and information on how everything is running in staging-master
Attachment #344894 - Flags: review?(ccooper) → review+
Comment on attachment 344894 [details] [diff] [review]
[checked in] DependentL10n changes - getPage of all-locales with timeout - 0.7.9 changes

Looks good.
(In reply to comment #34)
> I have added two lines "if platform in ('linux','win32', 'macosx'):"
> As per convo with joduinn, we won't be generating linux-64 builds for now and
> we will soon have a macosx to do the nightly builds

There's really no need to have linux 64-bit l10n builds. It's not a platform we ship on.
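
For illustration, the platform guard quoted above can be pulled into one helper so the nightly and l10n builder lists stay in step. This is a sketch under assumptions: the shape of the platforms dict, the `linux64` key and the helper name are hypothetical, and the base names are borrowed from builder names mentioned elsewhere in this bug.

```python
# Platforms that get nightlies and l10n repackages, per the guard quoted above.
L10N_PLATFORMS = ("linux", "macosx", "win32")


def builder_names(platforms):
    """Return (nightly builders, l10n nightly builders) for the allowed platforms."""
    nightly, l10n = [], []
    for platform in sorted(platforms):
        if platform not in L10N_PLATFORMS:
            # e.g. a 64-bit linux entry: no nightly and no l10n builder.
            continue
        base_name = platforms[platform]["base_name"]
        nightly.append("%s nightly" % base_name)
        l10n.append("%s l10n nightly" % base_name)
    return nightly, l10n


if __name__ == "__main__":
    # Hypothetical config; only the general shape follows the snippets in this bug.
    sample = {
        "linux": {"base_name": "Linux mozilla-central"},
        "linux64": {"base_name": "Linux x86-64 mozilla-central"},
        "macosx": {"base_name": "OS X 10.5.2 mozilla-central"},
        "win32": {"base_name": "WINNT 5.2 mozilla-central"},
    }
    print(builder_names(sample))
```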
I do not know much about "create symbols" and "upload symbols".
What happens in those steps? Where are the symbols being uploaded (ftp?)?

I have the moz2-linux-slave04 slave not having space left. If it can't finish the nightly build, the scheduler does not finish and we cannot have l10n builds triggered.
###############
/builds/moz2_debug_slave/mozilla-central-linux-nightly/build/toolkit/crashreporter/tools/upload_symbols.sh /builds/moz2_debug_slave/mozilla-central-linux-nightly/build/../20081028060633/crashreporter-symbols-firefox-3.1b2pre-Linux-20081028060633.zip
Transferring symbols... /builds/moz2_debug_slave/mozilla-central-linux-nightly/build/../20081028060633/crashreporter-symbols-firefox-3.1b2pre-Linux-20081028060633.zip

crashreporter-symbols-firefox-3.1b2pre-Linux-   0%    0     0.0KB/s   --:-- ETA
crashreporter-symbols-firefox-3.1b2pre-Linux- 100% 6903KB   6.7MB/s   00:01    
scp: /data/symbols///crashreporter-symbols-firefox-3.1b2pre-Linux-20081028060633.zip: No space left on device
make: *** [uploadsymbols] Error 1
program finished with exit code 2
Right, any failure will cause the l10n builds to fail. Don't worry about the specifics of the symbol steps. This failure highlights the fact that we probably need more disk space on our slaves before we can deploy this -- things are going to get tight after we start building a maintenance branch in addition to mozilla-central.
(In reply to comment #40)
> I have the moz2-linux-slave04 slave not having space left. If it can't finish
> the nightly build, the scheduler does not finish and we cannot have l10n builds
> triggered.

I don't think we want/need separate symbols for l10n builds. They'd be the same as the base en-US build anyway. 

Ted: maybe we do need some way to link a crashing l10n build to the original en-US build (for which would have symbols) in crash-stats. Does the build ID get us that for free?
I totally agree with you Coop. I think Armen was talking about this failure in the en-US nightly: http://staging-master.build.mozilla.org:8010/builders/Linux%20mozilla-central%20nightly/builds/1413. AFAICT l10n builds are running buildsymbols
A few general comments:

We should change the builder names to conform with what we discussed in .builds, http://groups.google.com/group/mozilla.dev.l10n/browse_frm/thread/756b9d4987bc2e8/f4d24b980a4e7445?lnk=gst#f4d24b980a4e7445

I prefer variable arguments to make to be on the commandline rather than in the environment.

Is there a bug on PKG_DMG_SOURCE?

The nightly builders should probably not be platform-protected like the l10n builds.

# foo vs. #foo, I'd suggest to use a space after the # consistently.

Any idea why the actual commands to execute don't show up in the tinderbox logs?
(In reply to comment #44)
> A few general comments:
> 
> We should change the builder names to conform with what we discussed in
> .builds,
> http://groups.google.com/group/mozilla.dev.l10n/browse_frm/thread/756b9d4987bc2e8/f4d24b980a4e7445?lnk=gst#f4d24b980a4e7445
> 
The names that you see in tboxStaging are the ones of the machines for 1.9.0.
If you look at:
http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaStaging&maxdate=1225162483&legend=0&norules=1
You will see the first two columns to be named:
- Linux mozilla-central l10n nightly %
- WINNT 5.2 mozilla-central l10n nightly
We can file another bug for the namings

(In reply to comment #43)
> I totally agree with you Coop. I think Armen was talking about this failure in
> the en-US nightly:
>
I was talking about en-US nightly
Comment on attachment 344955 [details] [diff] [review]
[checked in] moz2-staging: l10n_config.py, l10n_master.py and changes to master.cfg to enable l10n repackages after the 2AM en-US nightly - only do nightly builds for "linux", "win32" and "macosx"

Let's make sure the builder naming bug and the bug for PKG_DMG_SOURCE get filed so they don't get lost.

I'll fix the comment spacing inconsistency myself prior to check-in.

Armen: can you mark the patch descriptions with [checkin needed] when you're ready to have them landed please?
Attachment #344955 - Flags: review?(ccooper) → review+
(In reply to comment #42)
> I don't think we want/need separate symbols for l10n builds. They'd be the same
> as the base en-US build anyway. 
> 
> Ted: maybe we do need some way to link a crashing l10n build to the original
> en-US build (for which would have symbols) in crash-stats. Does the build ID
> get us that for free?

The l10n builds are using the same binaries as the en-US builds, right? If so, you do not want to attempt to upload symbols, as a) you won't have any (you're using stripped en-US builds), and b) the en-US symbols will work just fine for you anyway. It's not based on BuildID, but rather on UUIDs embedded in the modules (for Windows), or MD5sums of the modules.
Keywords: checkin-needed
Regarding symbols - just ignore it because the whole problem was that an en-US nightly build was not able to upload them to staging-stage because /builds is 100% full.
(In reply to comment #46)
> (From update of attachment 344955 [details] [diff] [review])
> Let's make sure the builder naming bug and the bug for PKG_DMG_SOURCE get filed
> so they don't get lost.
> 
Filed bug 462177 and 462179

The l10n builds are happening correctly on staging-master, BUT so many builds are being triggered that win32, being so slow and having so much work, never gets to the l10n repackages.
Linux had a 2-day gap in which it never got to the l10n repackages either.

I have filed bug 462180 to track the slaves being added to staging-master
Depends on: 462180
That problem is a clear indication that a regular scheduler is a weak tool, and push-on-build is the right tool. Apart from tinderbox dropping builders on a complete lack of regular builders.
Comment on attachment 344955 [details] [diff] [review]
[checked in] moz2-staging: l10n_config.py, l10n_master.py and changes to master.cfg to enable l10n repackages after the 2AM en-US nightly - only do nightly builds for "linux", "win32" and "macosx"

Armen asked me to check this in so I had a quick look over it....

I don't understand why there's l10n logic in both l10n.py and master.cfg. Is there a reason you didn't follow the same format that unittest, release, and mobile stuff did?

I'm going to go ahead and check this in but I'd really like it cleaned up and made consistent with the other tasks...
Comment on attachment 344894 [details] [diff] [review]
[checked in] DependentL10n changes - getPage of all-locales with timeout - 0.7.9 changes

Checking in l10n/l10n.py;
/cvsroot/mozilla/tools/buildbotcustom/l10n/l10n.py,v  <--  l10n.py
new revision: 1.2; previous revision: 1.1
done
Checking in l10n/scheduler.py;
/cvsroot/mozilla/tools/buildbotcustom/l10n/scheduler.py,v  <--  scheduler.py
new revision: 1.2; previous revision: 1.1
done
Attachment #344894 - Attachment description: DependentL10n changes - getPage of all-locales with timeout - 0.7.9 changes → [checked in] DependentL10n changes - getPage of all-locales with timeout - 0.7.9 changes
Comment on attachment 344955 [details] [diff] [review]
[checked in] moz2-staging: l10n_config.py, l10n_master.py and changes to master.cfg to enable l10n repackages after the 2AM en-US nightly - only do nightly builds for "linux", "win32" and "macosx"

changeset:   484:cc4560c100da
Attachment #344955 - Attachment description: moz2-staging: l10n_config.py, l10n_master.py and changes to master.cfg to enable l10n repackages after the 2AM en-US nightly - only do nightly builds for "linux", "win32" and "macosx" → [checked in] moz2-staging: l10n_config.py, l10n_master.py and changes to master.cfg to enable l10n repackages after the 2AM en-US nightly - only do nightly builds for "linux", "win32" and "macosx"
Reaching the point where we can see l10n repackages in parallel on staging-master is getting easier now that we have one more linux slave and 3 macs, but we still have only one windows machine. (see bug 462180)

The problem is that I still cannot see any l10n repackages happening, for these reasons:
1) until last night we only had 2 platforms, so the Nightly scheduler that triggers nightly builds on linux, win32 and macosx would never trigger the DependentL10n, since there was no mac to do any builds
2) another bug that bhearsum found is that if you do a buildbot reconfig while an upstream scheduler is running, the Dependent will never trigger (see bug 459213)
3) last night was the first time we had all 3 platforms, but win32 and macosx failed their builds, so there were no l10n repackages. Maybe tonight we will have some luck?

Said that, I would like to propose (or think out loud) if it would make sense to have:
1) a Nightly scheduler per platform (e.g. linux nightly)
2) a DependentL10n scheduler per platform (e.g. linux l10n nightly)
This means:
1) When a specific builder finishes its nightly build, the l10n repackages for that platform are generated automatically
2) Each platform that builds its nightly also builds its own l10n nightlies
3) Each platform is independent of the others
4) linux and macosx won't have to wait for windows (which takes 2 hr 40 min) to finish before the l10n repackages can be triggered
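To make the proposal concrete, here is a rough sketch of the per-platform layout. Plain dicts stand in for the real scheduler objects, and the function name plus the scheduler and builder naming patterns are made up for illustration; this is not the actual master.cfg.

```python
# Hypothetical sketch: dicts stand in for buildbot scheduler objects.
PLATFORMS = ["linux", "win32", "macosx"]


def per_platform_schedulers(branch, platforms=PLATFORMS, hour=2):
    """Return one Nightly and one DependentL10n description per platform."""
    schedulers = []
    for platform in platforms:
        nightly_name = "%s %s nightly" % (branch, platform)
        schedulers.append({
            "type": "Nightly",
            "name": nightly_name,
            "branch": branch,
            "hour": [hour],
            "builderNames": ["%s %s nightly" % (platform, branch)],
        })
        # The l10n scheduler depends only on this platform's nightly,
        # so a slow or failed win32 build cannot hold back linux or macosx.
        schedulers.append({
            "type": "DependentL10n",
            "name": "%s %s l10n nightly" % (branch, platform),
            "upstream": nightly_name,
            "builderNames": ["%s %s l10n nightly" % (platform, branch)],
        })
    return schedulers
```

With three platforms this yields six scheduler descriptions, and each l10n entry names exactly one upstream nightly.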
This attachment shows the differences between mozilla2-staging/master.cfg and mozilla2/master.cfg as a sanity check
Comment on attachment 346595 [details] [diff] [review]
moz2-production: l10n_config.py, l10n_master.py and changes to master.cfg to enable l10n repackages after the 2AM en-US nightly

I forgot to mention that the *only* change I have made relative to the staging patches is the ftpserver variable in l10n_config.py, which is now 'stage.mozilla.org'
This patch will allow us to push the "build on change" repackages to "tinderbox-builds" instead of "nightly"

Shall we change the TinderboxMailNotifier to another tree? or shall we show the "build on change" repackages with the columns that they currently have?

In my opinion we should keep reporting where we currently report (Mozilla-l10n) but change the headers to note that these are "build on change" builds. I am not sure what the proper naming should be, since I do not see a formal convention for headers on the Firefox or Firefox3.0 tinderbox pages
Attachment #346604 - Flags: review?(ccooper)
(In reply to comment #54)
> Said that, I would like to propose (or think out loud) if it would make sense
> to have:
> 1) a Nightly scheduler per platform (e.g. linux nightly)
> 2) a DependentL10n scheduler per platform (e.g. linux l10n nightly)

Yeah, we totally have to do this.
If we'd use Triggerable (or a variation thereof), we might get off easier than with dependent schedulers.
I guess we could pass in the name of a downstream scheduler to MercurialBuildFactory - but tbh that feels kind of icky. I'll leave this up to Armen though.
The win32 machine had an exception for its nightly: http://staging-master.build.mozilla.org:8010/builders/WINNT%205.2%20mozilla-central%20nightly/builds/148
|OBJDIR=obj-firefox python obj-firefox/_profile/pgo/profileserver.py
|SSL tunnel pid: 5112
|Application pid: 6688
|Shutting down...
|failed to bind socket
|
|command timed out: 3600 seconds without output
|SIGKILL failed to kill process
|using fake rc=-1
|program finished with exit code -1
|
|remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
|Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to kill process
|]

The macosx has not yet finished its nightly build

Therefore, no l10n repackages in staging-master for today
Attachment #346604 - Flags: review?(ccooper) → review+
Comment on attachment 346595 [details] [diff] [review]
moz2-production: l10n_config.py, l10n_master.py and changes to master.cfg to enable l10n repackages after the 2AM en-US nightly

The diff you posted reveals some minor discrepancies between the two files:

* blank lines/whitespace present in only one file
* commenting/capitalization differences in the release automation section

r+ with those nits fixed. Let's keep the files as close to identical as possible. If the patch is ready for check-in now, I can make those changes myself.
Attachment #346595 - Flags: review?(ccooper) → review+
(In reply to comment #63)
> (From update of attachment 346595 [details] [diff] [review])
> r+ with those nits fixed. Let's keep the files as close to identical as
> possible. If the patch is ready for check-in now, I can make those changes
> myself.
>
It is ready
Attachment #346793 - Flags: review?(ccooper)
Attachment #346793 - Attachment is obsolete: true
Attachment #346794 - Flags: review?(ccooper)
Attachment #346793 - Flags: review?(ccooper)
Attachment #346595 - Attachment is obsolete: true
Attachment #346797 - Flags: review?(ccooper)
coop, these are the differences between staging and production with the last changes.

All of these patches are ready to be reviewed and committed
Attachment #346597 - Attachment is obsolete: true
I was trying to deal with the exception on moz2-win32-slave04 but I am not sure what to do.

I think it has to do with bug 420216 but not sure.
The good thing is that we will soon have another win32 slave in staging according to bug 460884 and we can hope this slave doesn't have the same problem.

What do I do?

=== beginning of output ===
make[1]: Leaving directory `/e/builds/moz2_debug_slave/mozilla-central-win32-nightly/build'
OBJDIR=obj-firefox python obj-firefox/_profile/pgo/profileserver.py
SSL tunnel pid: 5144
Application pid: 3640
Shutting down...
failed to bind socket

command timed out: 3600 seconds without output
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1

remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Failure: buildbot.slave.commands.TimeoutError: SIGKILL failed to kill process
]
=== end of output ===
Slaves moz2-darwin9-slave2 and 3 did not finish their nightly builds because of "ar: libthebes.a: No space left on device" (see output)

According to df there was still 9.3GB free, so I don't know what is going on

df -hi (BEFORE removing anything)
Filesystem      Size   Used  Avail Capacity  iused   ifree %iused  Mounte
/dev/disk0s2    74Gi   65Gi  9.3Gi    88% 17008358 2445348   87%   /
devfs          105Ki  105Ki    0Bi   100%      600       0  100%   /dev
fdesc          1.0Ki  1.0Ki    0Bi   100%        4     253    2%   /dev
map -hosts       0Bi    0Bi    0Bi   100%        0       0  100%   /net
map auto_home    0Bi    0Bi    0Bi   100%        0       0  100%   /home

df -hi (AFTER rm -rf /builds/moz2_slave/mozilla-central-macosx*)
/dev/disk0s2    74Gi   49Gi   25Gi    66% 12806869 6646837   66%   /

df -hi (AFTER rm -rf /builds/moz2_slave/tracemonkey-macosx*)
/dev/disk0s2    74Gi   32Gi   42Gi    44% 8569276 10884430   44%   /

df -hi (AFTER rm -rf /builds/moz2_slave/macosx_*)
/dev/disk0s2    74Gi   19Gi   55Gi    27% 5130221 14323485   26%   /

and I believe that is all I can remove (I only did this in moz2-darwin9-slave2)
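A minimal sketch of how a slave could record its headroom before starting a build, so that a later "No space left on device" can be compared against the starting state. It reports both free bytes and free inodes, the same two things `df -hi` shows. The function names and thresholds are hypothetical, and this does not explain the failure above; it only makes the numbers available at build start.

```python
import os


def disk_headroom(path="/"):
    """Return free bytes and free inodes for the filesystem holding path."""
    st = os.statvfs(path)
    return {
        # Space available to unprivileged users, in bytes.
        "free_bytes": st.f_bavail * st.f_frsize,
        # Inodes available to unprivileged users.
        "free_inodes": st.f_favail,
    }


def has_room(path, min_bytes, min_inodes=0):
    """True if the filesystem has at least the requested headroom."""
    room = disk_headroom(path)
    return (room["free_bytes"] >= min_bytes
            and room["free_inodes"] >= min_inodes)
```

A build step could call `has_room("/builds", 15 * 1024 ** 3)` (path and size are placeholders) and fail early instead of dying two hours into the compile.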

=== beginning of output ===
rm -f libthebes.a
ar cr libthebes.a gfxASurface.o gfxAlphaRecovery.o gfxBlur.o gfxContext.o gfxImageSurface.o gfxFont.o gfxFontMissingGlyphs.o gfxFontTest.o gfxFontUtils.o gfxMatrix.o gfxPath.o gfxPattern.o gfxPlatform.o gfxRect.o gfxSkipChars.o gfxTextRunCache.o gfxTextRunWordCache.o gfxUserFontSet.o gfxQuartzSurface.o gfxQuartzImageSurface.o gfxQuartzPDFSurface.o gfxPlatformMac.o gfxAtsuiFonts.o nsUnicodeRange.o gfxQuartzNativeDrawing.o gfxQuartzFontCache.o  
ar: libthebes.a: No space left on device
make[7]: *** [libthebes.a] Error 1
make[7]: *** Deleting file `libthebes.a'
make[6]: *** [libs] Error 2
make[5]: *** [libs] Error 2
make[4]: *** [libs_tier_gecko] Error 2
make[3]: *** [tier_gecko] Error 2
make[2]: *** [default] Error 2
make[1]: *** [build] Error 2
make: *** [build] Error 2
program finished with exit code 2
elapsedTime=6135.925789
=== end of output ===
I triggered another couple of nightly builds but windows failed - sigh
slave4 still gets the exception mentioned previously but
slave15 was unable to finish the aus step:

=== beginning of output ===
C:\WINDOWS\system32\cmd.exe /c ssh -l cltbld staging-stage.build.mozilla.org mkdir -p /opt/aus2/build/0/Firefox/mozilla-central/WINNT_x86-msvc/20081108092418/en-US
 in dir e:\builds\moz2_slave\mozilla-central-win32-nightly\build (timeout 1200 secs)
 watching logfiles {}
 argv: ['C:\\WINDOWS\\system32\\cmd.exe', '/c', 'ssh', '-l', 'cltbld', 'staging-stage.build.mozilla.org', 'mkdir -p /opt/aus2/build/0/Firefox/mozilla-central/WINNT_x86-msvc/20081108092418/en-US']
=== end of output ===
I have been able to do some l10n repackages after a lot of cleaning up and twisting the arms of the machines that were doing the nightly builds, only to find out that there are more problems *sigh*

- I am again hitting the problem uploading packages to staging-stage (Permission denied)
| sh -c "scp -i ~/.ssh/ffxbld_dsa -r * staging-stage.build.mozilla.org:/home/ftp/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n/"
- The darwin slaves do not have autoconf-2.13 or the symlink to it

TODO/FIX:
- Modify the mkdir step to be like this: ['sh','-c','mkdir -p l10n-central']
(In reply to comment #73)
> - I am hitting again the problem to upload packages to staging-stage
> (Permission denied)
> | sh -c "scp -i ~/.ssh/ffxbld_dsa -r *
> staging-stage.build.mozilla.org:/home/ftp/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n/"
>
FIXED - Looking at my own comments fixes more things than I thought - I was missing ffxbld@
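For the record, a small sketch of building the upload command so that the remote user cannot be forgotten again. The helper name and its parameters are hypothetical; the user, host, key and path in the usage note are the ones quoted above.

```python
def scp_upload_command(user, host, remote_dir, key_path, sources="*"):
    """Build the sh -c argv for uploading sources to user@host:remote_dir."""
    if not user:
        # Without an explicit user, ssh falls back to the local account
        # name, which is how the "Permission denied" above can happen.
        raise ValueError("remote user is required")
    dest = "%s@%s:%s" % (user, host, remote_dir)
    # The shell wrapper is kept so that the * glob gets expanded.
    return ["sh", "-c", "scp -i %s -r %s %s" % (key_path, sources, dest)]
```

Called with `"ffxbld"`, `"staging-stage.build.mozilla.org"`, the latest-mozilla-central-l10n directory and `"~/.ssh/ffxbld_dsa"`, it produces the command from the comment with `ffxbld@` in front of the host.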
Attachment #346792 - Flags: review?(ccooper) → review+
Comment on attachment 346792 [details] [diff] [review]
[checked in] buildbotcustom/l10n/scheduler.py - it uses properties to submit a buildset as soon as the scheduler is triggered - it breaks support for 0.7.7

> +        log.msg('Submited '+locale+' locale')

Small typo (submitted), but I'll fix that on check-in.
Attachment #346794 - Flags: review?(ccooper) → review+
Comment on attachment 346797 [details] [diff] [review]
moz2-production: removes the usage of BuildL10n

> +L10nNightlyFactory.addStep(ShellCommand,
> +    command = ['sh','-c','autoconf-2.13'],
> +    haltOnFailure=True,
> +    descriptionDone="autoconf",
> +    workdir = BRANCH

We should be using autoconf-2.13 automatically on the slave, via symlink or ENV setting, etc. Let's just call "autoconf" here.
Attachment #346797 - Flags: review?(ccooper) → review-
Hrm? `autoconf` is typically the latest version and you have to explicitly select an older version. client.mk honors AUTOCONF set in the environment and does heuristics if that's not set, but I don't see how that helps an explicit shell command.
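A rough sketch of the kind of lookup described here: honor AUTOCONF from the environment, otherwise probe for a 2.13 binary under a few names. The candidate names are an assumption about common packagings and are not copied from client.mk, and the function name is hypothetical.

```python
import os
import shutil

# Assumed binary names for autoconf 2.13; adjust per platform.
CANDIDATES = ("autoconf-2.13", "autoconf2.13", "autoconf213")


def find_autoconf(env=None, which=shutil.which, candidates=CANDIDATES):
    """Return the autoconf 2.13 command to run, or None if none is found."""
    env = os.environ if env is None else env
    explicit = env.get("AUTOCONF")
    if explicit:
        # An explicit setting always wins over the heuristics.
        return explicit
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

The point for the factory is that an explicit shell step gets none of this for free: either the step does the lookup itself, or the slave provides a fixed name it can rely on.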
Fixed in the last two patches:
- autoconf
- mkdir
- scp

/me regrets the day he did not decide to write a factory *sigh*
Attachment #346797 - Attachment is obsolete: true
Attachment #347287 - Flags: review?(ccooper)
Updates regarding the runs in staging-master:
- last night was the first time that the l10n repackages happened *naturally*
- the mac repackages failed ("abort: repository default not found!"); I will have to research this more

- check the repackages in:
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n/
Attachment #347286 - Flags: review?(ccooper) → review+
Attachment #347287 - Flags: review?(ccooper) → review+
Comment on attachment 346792 [details] [diff] [review]
[checked in] buildbotcustom/l10n/scheduler.py - it uses properties to submit a buildset as soon as the scheduler is triggered - it breaks support for 0.7.7

Checking in l10n/scheduler.py;
/cvsroot/mozilla/tools/buildbotcustom/l10n/scheduler.py,v  <--  scheduler.py
new revision: 1.3; previous revision: 1.2
done
Attachment #346792 - Attachment description: buildbotcustom/l10n/scheduler.py - it uses properties to submit a buildset as soon as the scheduler is triggered - it breaks support for 0.7.7 → buildbotcustom/l10n/scheduler.py - it uses properties to submit a [checked in] buildset as soon as the scheduler is triggered - it breaks support for 0.7.7
Attachment #346792 - Attachment description: buildbotcustom/l10n/scheduler.py - it uses properties to submit a [checked in] buildset as soon as the scheduler is triggered - it breaks support for 0.7.7 → [checked in] buildbotcustom/l10n/scheduler.py - it uses properties to submit a buildset as soon as the scheduler is triggered - it breaks support for 0.7.7
Attachment #347286 - Flags: review+ → review-
Comment on attachment 347287 [details] [diff] [review]
moz2-production: removes the usage of BuildL10n

The autoconf step isn't actually working, so we're waiting on a working command here.
Attachment #347287 - Flags: review+ → review-
Depends on: 464093
Keywords: checkin-needed
Depends on: 464101
In this and the previous patch I have only modified the autoconf line in l10n_master.py

I have tested the command on the darwin machines successfully

I will now have to run this in staging-master and hope that it also works for all other platforms
Attachment #347287 - Attachment is obsolete: true
Attachment #347376 - Flags: review?(ccooper)
No longer depends on: 464101
fixed typo
Attachment #347373 - Attachment is obsolete: true
Attachment #347383 - Flags: review?(ccooper)
Attachment #347373 - Flags: review?(ccooper)
fixed typo

This code is already running on staging.
We will see results tomorrow
Attachment #347376 - Attachment is obsolete: true
Attachment #347385 - Flags: review?(ccooper)
Attachment #347376 - Flags: review?(ccooper)
Blocks: 464165
Attachment #347383 - Flags: review?(ccooper) → review+
Depends on: 464245
Attachment #347385 - Flags: review?(ccooper) → review+
Attachment #347383 - Attachment description: moz2-staging → [checked in] moz2-staging
Attachment #347385 - Attachment description: moz2-production → [checked in] moz2-production
Configs landed.

changeset:   499:4f6acb1dc805
Depends on: 464249
I have applied the changes on staging-master and reconfigured the master

We will have to wait a few hours to see more results
No longer depends on: 464249
Depends on: 464316
The nightly build for win32 failed. Filed bug 464316
I also had to disable "trace-monkey" since the reconfig was not working properly. Filed bug 464269
Depends on: 464380
I declare victory over l10n repackages but defeat over the staging machines :(

Anyways, good news:
1) the l10n repackages are in:
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n/
2) some of the mac repackages are missing because the 2 slaves filled up and could not check out more code. Halfway through I made room, and the repackages are there from the "ru" locale onwards. This mac issue should be solved in bug 464093
3) the windows "af" and "bg" repackages happened correctly because they were done by slaves 15 and 16. The rest of them were not done because slave3 was being funny - bug 464380 to follow
No longer depends on: 464380
Depends on: 464380
No longer depends on: 464316
We had a smooth run for l10n nightly builds today for linux and mac.

You can have a look at:
http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaStaging

The only problem found was that one of the windows slaves misbehaved, but I was able to stop it before it dropped more win32 l10n repackages. Filed bug 464454

I am confident about the code being used, and I would like to know when the right time would be to move this to production
Depends on: 464316
Depends on: 464269
A complete run for l10n repackages happened today in staging-master without any slave issues
The l10n repackages did not run today because the nightly builds on staging-master could not complete: the staging-stage server had filled up.

This has been cleaned and we should have l10n repackages in staging-master for tomorrow morning
*sigh*
There have been no l10n repackages in staging on the 16th and 17th because the macs never actually did a nightly, due to the workload
# (Nov 15 06:34) rev=[85fbe5bd09c8] success #54: build successful
# (Nov 15 01:44) rev=[85fbe5bd09c8] success #53: build successful

Nevertheless, I am happy that bhearsum has been able to fix bug 465352, since it makes each platform independent of the others: a platform's dependent l10n nightly is triggered regardless of whether the other platforms have finished their nightlies. You can see l10n repackages for linux and win32, since their Nightly schedulers had finished (http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaStaging&maxdate=1226955004&legend=0&norules=1)

I have removed the dependencies on bug 464316 and bug 464093 since we are no longer hitting these slave issues

NOTE: we have had no more slave issues since the previous comment (Friday 14th), and that one was fixed with a more precise cronjob change
Depends on: 465352
No longer depends on: 464093
No longer depends on: 464316
I OK'ed this with coop. We need to comment these out so we can do maintenance on the production master without this l10n stuff getting in the way.
No slave issues since last comment.
The code has now proven reliable: today's waterfall shows up to 11 reconfigs, and the win32-l10n nightly builder still has 47 pending builds, none of which were dropped

When can we move this to production?
Summary: Create l10n nightly builds for firefox 3.1 → Create l10n nightly builds for "mozilla-central"
No longer depends on: 465352
Attachment #349577 - Flags: review?(ccooper)
Attachment #349577 - Flags: review?(ccooper) → review+
Attachment #349577 - Flags: checked‑in+
Attachment #349748 - Flags: review?(armenzg) → review+
Comment on attachment 349748 [details] [diff] [review]
Use RepackFactory to create nightly l10n repackages

changeset:   535:18eb30ebe6da
Attachment #349748 - Flags: checked‑in+
Armen's busy with school, so I'll be driving this now.
Assignee: armenzg → ccooper
Status: NEW → ASSIGNED
Summary: Create l10n nightly builds for "mozilla-central" → Create l10n nightly builds for mozilla-central
Blocks: 467833
I'm trying to work on a couple of release related l10n patches and noticed that l10n_config.py and l10n_master.py don't seem to be used anymore - can we remove them?
(In reply to comment #102)
> I'm trying to work on a couple of release related l10n patches and noticed that
> l10n_config.py and l10n_master.py don't seem to be used anymore - can we remove
> them?
>
Yes, it was going to be part of my clean up work that I was going to do later. Go ahead
Attachment #351380 - Flags: review?(ccooper) → review+
Attachment #351380 - Flags: checked‑in+
I've scheduled some downtime for tomorrow morning PST to try re-enabling l10n for m-c now that I've cleaned up after bug 464151.
Some builds failed to upload due to permissions conflicts on stage.m.o, but the process is working after the downtime. Waiting for a proper round of nightlies tomorrow before I'll claim success.
The l10n repackages have been uploaded to:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central-l10n/?C=M;O=D

I can see these columns in the tinderbox page for Mozilla-l10n:
* OS X 10.5.2 mozilla-central l10n nightly
* WINNT 5.2 mozilla-central l10n nightly
but I can't see the Linux one, the last build for that builder was during the down time
(Dec 09 07:30) rev=[??] failure #67: failed upload locale
(In reply to comment #107)
> The l10n repackages have been uploaded to:
> I can't see the Linux one, the last build for that builder was during the
> down time
> (Dec 09 07:30) rev=[??] failure #67: failed upload locale

Axel reported as much on IRC.

I'm going through the logs right now to try to figure out why the dependent builder didn't fire.
If a reconfig happened during the Linux en-US nightly this would stop it from firing.
The linux en-US nightly failed last night due to a slave disconnect, and someone (nthomas, I'm guessing) had to trigger a new one. Sadly, this doesn't trigger the dependent l10n scheduler, but that's fodder for bug 466498.
That sounds to me like we shouldn't use a dependent scheduler for l10n builds, but a triggered one.

I don't see the mozilla-central l10n builds going off at all today.
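A deliberately simplified toy model of the dependent-versus-triggered difference discussed in the last few comments. This is not buildbot code, and the class and function names are hypothetical. It assumes, as described above, that a dependent scheduler only hears about nightlies its upstream scheduler started, while a trigger is a step inside the build itself.

```python
class L10nScheduler:
    """Counts how often the l10n repackages would be kicked off."""

    def __init__(self):
        self.fired = 0

    def fire(self):
        self.fired += 1


def finish_nightly(succeeded, started_by_scheduler,
                   dependent=None, triggered=None):
    """Simulate the end of one en-US nightly build."""
    if not succeeded:
        # A failed nightly produces nothing to repackage in either model.
        return
    if dependent is not None and started_by_scheduler:
        # Dependent model: only scheduler-started builds notify downstream.
        dependent.fire()
    if triggered is not None:
        # Triggered model: the build's own last step fires the l10n work,
        # no matter who started the build.
        triggered.fire()
```

Replaying last night (a failed scheduled nightly followed by a manually started, successful one) leaves the dependent scheduler at zero and fires the triggered one once.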
I doubt this is what's tanking the dependent scheduler in production because staging was missing it too and staging continues to work. It was causing l10n staging builds to fail to hg update.
Attachment #354416 - Flags: review?(armenzg)
Attachment #354416 - Flags: review?(armenzg) → review+
Depends on: 472333
(carrying over blocking nomination from duplicate bug 434289, and minusing all at the same time!)
Flags: blocking1.9.1-
We have nightly builds now. Will file a new bug for enabling repack-on-change (bug 464163) on this branch.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering