Closed Bug 784848 Opened 12 years ago Closed 11 years ago

Do Windows l10n repacks on win64

Categories

(Release Engineering :: General, defect, P2)

x86_64
Windows Server 2008
defect

Tracking

(firefox21 affected)

RESOLVED FIXED
Tracking Status
firefox21 --- affected

People

(Reporter: coop, Assigned: coop)

References

Details

(Whiteboard: [l10n][capacity])

Attachments

(4 files, 3 obsolete files)

We're trying to get rid of all the Win32 buildslaves, but the main thing preventing us from doing so are l10n repacks still run there. We need to port the current process to the win64 slaves, optionally making it more sane in so doing.

Here's a full example log with all the steps:
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-l10n-en-GB/1345581085.1345581978.853.gz&fulltext=1

Here's an example log that with the various buildbot step broken out (requires build VPN access):
http://buildbot-master30.srv.releng.scl3.mozilla.com:8001/builders/Firefox%20mozilla-aurora%20win32%20l10n%20nightly/builds/2720

I'm not sure whether bug 774947 is a blocker for this or not, but I'm marking it as such for now. It may make sense to finally use an objdir for l10n at this point too.

It would also be nice to have a make target that would build just the mar/bsdiff tools and pre-reqs. We spread them out across the current steps, but building it all in one go makes sense to me. I don't think these tools (mar/bsdiff) change very often (if at all), so I would actually prefer to build them once (say, off a release branch), and then deploy them via other means. That would improve our turnaround time on repacks a bit too.

Dare I suggest we include bug 776440 with this work too?
Taking, per meeting with Joey and John.
Assignee: joey → aki
This needs to get addressed soon. We hit serious problems on release day -- as soon as reserved slaves was set for the releases, there were no slaves available to do esr10 dep builds, which was blocking to the go-to-build for both ESRs. This will get MUCH worse in 6 weeks, because we'll have 4 ESR releases and 2 17.0 releases. I don't think we can do all of those releases without them.
Severity: normal → critical
Assignee: aki → coop
Moving these jobs off the slaves helps us create more win64 slaves, so blocks bug 784891.
Blocks: 784891
Depends on: 785392
I got this working in staging today, including using pre-built mar/mbsdiff for all platforms.

There's a fair bit of unbitrotting to do (yay pep8), but I'll start posting patches to the dependent bugs shortly.
Status: NEW → ASSIGNED
Priority: P3 → P2
misc.py: 
* set objdir for repacks
* use enable_pymake to mirror bhearsum's changes to the other factories

process/factory.py:
* stop running makePartialtools(). Nightlies will already have the tools built, repacks now download them
* set absMozillaObjDir for BaseRepackFactory. Actually use this var (and absMozillaSrcDir) in workdir args and commands.
* strip out the steps to build the mar tools for repacks
* add downloadMarTools(). Nightly repacks can use the same checkout if the same slave is being used again, so check to see whether we already have the tools first.

This patch can't land until the patches from bug 774947 and bug 785392 are uplifted everywhere. That shouldn't be a problem since we're not actually using l10n-mozconfigs anywhere yet, and I can dodge around the packager.mk uplift by copying working version of the mar tools into the branch-specific locations on stage if necessary.
Attachment #713518 - Flags: review?(aki)
Attached patch Use win64 slaves for repacks (obsolete) — Splinter Review
Attachment #713519 - Flags: review?(aki)
Attachment #713519 - Attachment is patch: true
Comment on attachment 713518 [details] [diff] [review]
Use pre-built mar and mbsdiff for l10n repacks

>+    def downloadMarTools(self):
>+        '''The mar and bsdiff tools are created by default when
>+           --enable-update-packaging is specified. The latest version should
>+           always be available latest-${branch}/mar-tools/${platform}
>+           on the ftp server.
>+        '''
>+        pass

Is this needed, since it's defined/called below?
Attachment #713518 - Flags: review?(aki) → review+
Comment on attachment 713519 [details] [diff] [review]
Use win64 slaves for repacks

\o/
Attachment #713519 - Flags: review?(aki) → review+
Landing this will probably fix bug 841740.
What's the ETA on this? We need to unblock the SDK bustage of the windows nightlies.
This will land later this week, after all the merge day gymnastics.
But before we try to do aurora builds, I assume? As they'll be broken after merge day, too, I guess.
(In reply to Chris Cooper [:coop] from comment #12)
> This will land later this week, after all the merge day gymnastics.

Can we resolve this before Wednesday night's (Thursday morning's) build? That'd really help QA be able to sign-off on 21.0a2. CC bajaj who's the rel-mgmt POC for 21.
(In reply to Alex Keybl [:akeybl] from comment #14)
> (In reply to Chris Cooper [:coop] from comment #12)
> > This will land later this week, after all the merge day gymnastics.
> 
> Can we resolve this before Wednesday night's (Thursday morning's) build?
> That'd really help QA be able to sign-off on 21.0a2. CC bajaj who's the
> rel-mgmt POC for 21.

Yes, I'm planning to land the changes tonight *after* the releng merge day patches have all landed.
(In reply to Chris Cooper [:coop] from comment #15)
> (In reply to Alex Keybl [:akeybl] from comment #14)
> > (In reply to Chris Cooper [:coop] from comment #12)
> > > This will land later this week, after all the merge day gymnastics.
> > 
> > Can we resolve this before Wednesday night's (Thursday morning's) build?
> > That'd really help QA be able to sign-off on 21.0a2. CC bajaj who's the
> > rel-mgmt POC for 21.
> 
> Yes, I'm planning to land the changes tonight *after* the releng merge day
> patches have all landed.

Thanks for the clarification. I wasn't sure where we were in terms of timelines after reading comment 12.
(In reply to Chris Cooper [:coop] from comment #15) 
> Yes, I'm planning to land the changes tonight *after* the releng merge day
> patches have all landed.

Slight change of plans. All the releng merge day automation changes didn't land tonight, so I'm still blocked behind a few of them. This will land tomorrow instead.
Comment on attachment 713518 [details] [diff] [review]
Use pre-built mar and mbsdiff for l10n repacks

https://hg.mozilla.org/build/buildbotcustom/rev/c04f537af374
Attachment #713518 - Flags: checked-in+
In production.
Comment on attachment 713518 [details] [diff] [review]
Use pre-built mar and mbsdiff for l10n repacks

Backed out for 20.0b1 bustage.
http://hg.mozilla.org/build/buildbotcustom/rev/5ea0122626d2
Attachment #713518 - Flags: checked-in+ → checked-in-
Comment on attachment 713519 [details] [diff] [review]
Use win64 slaves for repacks

Backed out due to 20.0b1 bustage.
http://hg.mozilla.org/build/buildbot-configs/rev/d7a7d80dcf9d
Attachment #713519 - Flags: checked-in+ → checked-in-
checking compiler version... Microsoft (R) C/C++ Optimizing Compiler Version 15.00.30729.01 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

cl : Command line warning D9002 : ignoring unknown option '--version'
cl : Command line error D8003 : missing source filename

checking for make... /usr/local/bin/make
checking for X... no
checking that static assertion macros used in autoconf tests work... yes
checking for 64-bit OS... yes
checking for custom <stdint.h> implementation... none specified
*** Fix above errors and then restart with               "make -f client.mk build"
configure: error: You are targeting i386 but using the 64-bit compiler.
(In reply to Ben Hearsum [:bhearsum] from comment #23)
> checking compiler version... Microsoft (R) C/C++ Optimizing Compiler Version
> 15.00.30729.01 for x64
> Copyright (C) Microsoft Corporation.  All rights reserved.
> 
> cl : Command line warning D9002 : ignoring unknown option '--version'
> cl : Command line error D8003 : missing source filename
> 
> checking for make... /usr/local/bin/make
> checking for X... no
> checking that static assertion macros used in autoconf tests work... yes
> checking for 64-bit OS... yes
> checking for custom <stdint.h> implementation... none specified
> *** Fix above errors and then restart with               "make -f client.mk
> build"
> configure: error: You are targeting i386 but using the 64-bit compiler.

This is expected, since we're trying to build on the wrong slaves. Overriding 'l10n_slaves' for mozilla-beta and mozilla-release will let us continue to use the win32 slaves until we update that process as well.

Currently working on repacking Thunderbird using the new process, but it's being petulant, especially on Mac.
With Callek's help, I was able to get Thunderbird repacks working on Win64 yesterday. There were Makefile changes required in the comm-central tree to mirror what was done to allow mozconfig+objdir repacks for Firefox. Linux also works using the new process. 

Mac is still busted though. There's a problem with how the objdir is created (we end up with objdir-tb/i386/i386 unless I munge the MOZ_OBJDIR for the |make configure| step), and possibly more Makefile changes required to allow repack_installers to work.

There is also a problem whereby we can't re-use the same srcdir/objdir for subsequent repacks, even with the same locale. A CLOBBER file is created during |make configure| and is propagated to the objdir, causing us to fail on the next pass. This makes it hard/impossible to re-use the same slave for (e.g.) multiple nightly repacks in a row as we currently do.

I am working around this right now by removing the CLOBBER files immediately after |make configure| runs. I could be more aggressive and remove the entire objdir before each build, but we're trying to reduce as much (re)compilation time as possible with these repacks. If there's a true fix, or a better way, I'm all ears.
These changes have allowed me to repack Thunderbird on all platforms in staging.

Thanks to Callek for the help with the configure/Makefile changes. 

AFAICT, the l10n-mozconfigs that I've modified weren't actually used before, just like the Firefox ones.

Mark: I'd also like to request approval-comm-aurora? for this patch. I'm not sure if that will require uplifting more of the recent build system changes as well.
Attachment #720046 - Flags: review?(mbanner)
Same as before (see comment #26), but includes the config.mk changes and also the new file (config/makefiles/autotargets.mk) that Callek added that I missed in my first diff.

Mark: still requesting approval-comm-aurora for this too.
Attachment #720046 - Attachment is obsolete: true
Attachment #720046 - Flags: review?(mbanner)
Attachment #720094 - Flags: review?(mbanner)
Delta from previous patch:
* includes Thunderbird
* still use win32 slaves for beta, release, and esr17 branches
Attachment #713519 - Attachment is obsolete: true
Attachment #720143 - Flags: review?(aki)
Delta from previous patch:
* create absSrcDir and absObjDir, as distinct from absMozillaSrcDir and absMozillaObjDir. Both are required under different circumstances now that we use objdir+mozconfig for Thunderbird repacks, while they amount to the same thing for Firefox.
* remove any partials mar files that already exist before creating a new partial mar. This avoids later problems.
* create a link to the existing tools dir. This is required for Thunderbird because we meld two repos and the effective dir depth is different.
* remove the CLOBBER files for Thunderbird so we can effectively re-use an existing srcdir/objdir checkout for repeated repacks. Again, this is because we're melding two repos.
* test the downloaded mar tools to make sure they're valid. Probably won't matter in production, but bit me in staging.
Attachment #713518 - Attachment is obsolete: true
Attachment #720145 - Flags: review?(aki)
Comment on attachment 720145 [details] [diff] [review]
Use pre-built mar and mbsdiff for l10n repacks, v2

>diff --git a/misc.py b/misc.py
>--- a/misc.py
>+++ b/misc.py
>@@ -1294,16 +1294,17 @@ def generateBranchObjects(config, name, 
>         if do_nightly:
>             nightly_builder = '%s nightly' % pf['base_name']
> 
>             platform_env = pf['env'].copy()
>             if 'update_channel' in config and config.get('create_snippet'):
>                 platform_env['MOZ_UPDATE_CHANNEL'] = config['update_channel']
> 
>             triggeredSchedulers = None
>+            l10n_objdir = pf['platform_objdir']

We're defining l10n_objdir inside an |if do_nightly| block, then exiting that block, and re-using l10n_objdir further down in the file without re-defining.
Essentially, if we have dep l10n builds on a branch without nightlies (mozilla-beta? mozilla-release?) then l10n_objdir will be undefined in those branches.

I'm thinking we should put this line above the |if do_nightly|; does that work?  Or are you depending on this behavior to avoid using objdirs on beta+release, which may bite us in 2 merge days' time?

>+        if self.productName == "thunderbird":
>+            self.addStep(MockCommand(**self.processCommand(
>+                name='link_tools',

I definitely don't like the |if self.productName| checks here, but I'm not going to block on it since this is a stopgap before porting desktop l10n anyway.
Could you base this off of |if self.mozillaDir| instead, since that's what makes this link required?
If that makes more sense, and if that's not difficult, let's go with that; otherwise I think I'll just leave this be.

We could make that change here as well as the CLOBBER workaround and here:

>+        if self.productName == "thunderbird":
>+            workdir = '%s/%s/locales' % (self.absObjDir,
>+                                         self.appName)
>+        else:
>+            workdir = '%s/%s/locales' % (self.absMozillaObjDir,
>+                                         self.appName)

This looks like it took a lot of work to debug and fix; thank you!
Attachment #720145 - Flags: review?(aki) → review+
Comment on attachment 720143 [details] [diff] [review]
Use win64 slaves for repacks, v2

>-BRANCHES['comm-central']['repo_path'] = 'comm-central'
>+BRANCHES['comm-central']['repo_path'] = 'users/coop_mozilla.com/comm-central'

Staging cruft?
Attachment #720143 - Flags: review?(aki) → review+
Attachment #720094 - Flags: review?(mbanner) → review?(bugspam.Callek)
(In reply to Aki Sasaki [:aki] from comment #30)
> We're defining l10n_objdir inside an |if do_nightly| block, then exiting
> that block, and re-using l10n_objdir further down in the file without
> re-defining.
> Essentially, if we have dep l10n builds on a branch without nightlies
> (mozilla-beta? mozilla-release?) then l10n_objdir will be undefined in those
> branches.
> 
> I'm thinking we should put this line above the |if do_nightly|; does that
> work?  Or are you depending on this behavior to avoid using objdirs on
> beta+release, which may bite us in 2 merge days' time?

Good catch. This was working by accident.

> Could you base this off of |if self.mozillaDir| instead, since that's what
> makes this link required?
> If that makes more sense, and if that's not difficult, let's go with that;
> otherwise I think I'll just leave this be.

That make sense, although I need to check that |self.mozillaDir != ''| because it gets set to en empty string when not specified to make path-building more straightforward (HA!).

Switched review of the comm-central changes to Callek. Will still need Standard8's approval for uplift to comm-aurora.
Comment on attachment 720094 [details] [diff] [review]
comm-central changes required for repacking l10n on win64, using mozconfigs, objdirs, and pre-built mar tools, v2

Review of attachment 720094 [details] [diff] [review]:
-----------------------------------------------------------------

I wrote most of this code, but coop tested it and it played out fine. I don't forsee it breaking SeaMonkey either, but if it does I can commit to fixing that.

r+ to get releng unblocked
Attachment #720094 - Flags: review?(bugspam.Callek) → review+
(In reply to Chris Cooper [:coop] from comment #32)
> (In reply to Aki Sasaki [:aki] from comment #30)
> > We're defining l10n_objdir inside an |if do_nightly| block, then exiting
> > that block, and re-using l10n_objdir further down in the file without
> > re-defining.
> > Essentially, if we have dep l10n builds on a branch without nightlies
> > (mozilla-beta? mozilla-release?) then l10n_objdir will be undefined in those
> > branches.
> > 
> > I'm thinking we should put this line above the |if do_nightly|; does that
> > work?  Or are you depending on this behavior to avoid using objdirs on
> > beta+release, which may bite us in 2 merge days' time?
> 
> Good catch. This was working by accident.

This might require a buildbot-configs merge-day flag or something, then, depending.

> > Could you base this off of |if self.mozillaDir| instead, since that's what
> > makes this link required?
> > If that makes more sense, and if that's not difficult, let's go with that;
> > otherwise I think I'll just leave this be.
> 
> That make sense, although I need to check that |self.mozillaDir != ''|
> because it gets set to en empty string when not specified to make
> path-building more straightforward (HA!).

>>> mozillaDir = ''
>>> if mozillaDir:
...     print "yes"
... else:
...     print "no"
... 
no
(In reply to Aki Sasaki [:aki] from comment #34) 
> > > I'm thinking we should put this line above the |if do_nightly|; does that
> > > work?  Or are you depending on this behavior to avoid using objdirs on
> > > beta+release, which may bite us in 2 merge days' time?
> > 
> > Good catch. This was working by accident.
> 
> This might require a buildbot-configs merge-day flag or something, then,
> depending.

No merge day flag required AFAICT. beta and release use a completely different script-based process, but do pull in things like the slave list from the configs, as bit us on the initial landing.

> >>> mozillaDir = ''
> >>> if mozillaDir:
> ...     print "yes"
> ... else:
> ...     print "no"
> ... 
> no

Well that's easier.
Comment on attachment 720094 [details] [diff] [review]
comm-central changes required for repacking l10n on win64, using mozconfigs, objdirs, and pre-built mar tools, v2

Flagging Mark for approval to uplift this patch to comm-aurora.
Attachment #720094 - Flags: review?(mbanner)
Comment on attachment 720094 [details] [diff] [review]
comm-central changes required for repacking l10n on win64, using mozconfigs, objdirs, and pre-built mar tools, v2

https://hg.mozilla.org/comm-central/rev/b9245b4d66d1
Attachment #720094 - Flags: checked-in+
Comment on attachment 720094 [details] [diff] [review]
comm-central changes required for repacking l10n on win64, using mozconfigs, objdirs, and pre-built mar tools, v2

>     case "$host" in
>     *-mingw*)
>-	MOZ_BUILD_ROOT=`cd $MOZ_BUILD_ROOT && pwd -W`
>-	;;
>+        MOZ_BUILD_ROOT=`cd $MOZ_BUILD_ROOT && pwd -W`
>+        L10NBASEDIR=`cd $L10NBASEDIR && pwd -W`
>+        ;;
>     esac
This fails horribly when L10NBASEDIR is unset, as then it becomes c:/Documents and Settings/$USER and as we all know spaces are bad...
(In reply to neil@parkwaycc.co.uk from comment #38)
> This fails horribly when L10NBASEDIR is unset, as then it becomes
> c:/Documents and Settings/$USER and as we all know spaces are bad...

I'll talk to Callek today about a proper fix.
(In reply to Chris Cooper [:coop] from comment #39)
> (In reply to neil@parkwaycc.co.uk from comment #38)
> > This fails horribly when L10NBASEDIR is unset, as then it becomes
> > c:/Documents and Settings/$USER and as we all know spaces are bad...
> 
> I'll talk to Callek today about a proper fix.

Does "fails horribly" mean we cause a build failure, or just that we end up with a potentially-dangerous PATH? I've been monitoring Thunderbird-Trunk status on tbpl and haven't seen any bustage.

This stanza appears to be copied directly from m-c, so the bustage would be in both places. However, if L10NBASEDIR isn't set, that means we're not doing repacks, so having a bad PATH is that variable won't hurt us because we won't use it anyway AFAICT. We could do some quoting there to be safe.
Blocks: 849019
Blocks: 849022
(In reply to Chris Cooper [:coop] from comment #40)
> Does "fails horribly" mean we cause a build failure, or just that we end up
> with a potentially-dangerous PATH? I've been monitoring Thunderbird-Trunk
> status on tbpl and haven't seen any bustage.

It looks like the SeaMonkey Windows builders don't like the spaces, the build fails. Thunderbird builds work fine because they use Windows 7 and Win7 uses another path (from a TB comm-central Windows 7 build log: "--with-l10n-base=c:/Users/cltbld"). See http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1362607698.1362607764.26725.gz&fulltext=1 for an example SeaMonkey comm-central build log:
---
running /bin/sh e:/builds/slave/c-cen-t-w32-dbg/build/mozilla/configure  --enable-application=suite [...] --with-l10n-base=c:/Documents and Settings/seabld --cache-file=.././config.cache [...]
[...]
*** Fix above errors and then restart with               "D:/mozilla-build/python27/python2.7.exe e:/builds/slave/c-cen-t-w32-dbg/build/mozilla/build/pymake/pymake/../make.py -f client.mk build"
[...]
configure: warning: Settings/seabld: invalid host type
configure: error: can only configure for one host and one target at a time
configure: error: e:/builds/slave/c-cen-t-w32-dbg/build/mozilla/configure failed for mozilla
---
Blocks: 849609
(In reply to Frank Wein [:mcsmurf] from comment #41)
> running /bin/sh e:/builds/slave/c-cen-t-w32-dbg/build/mozilla/configure 
> --enable-application=suite [...] --with-l10n-base=c:/Documents and
> Settings/seabld --cache-file=.././config.cache [...]

Callek: do we need to add some quotes here -

https://hg.mozilla.org/comm-central/file/98928de881ed/configure.in#l6436 ?
Depends on: 851117
Comment on attachment 720094 [details] [diff] [review]
comm-central changes required for repacking l10n on win64, using mozconfigs, objdirs, and pre-built mar tools, v2

Uplift request moving to bug 851117.
Attachment #720094 - Flags: review?(mbanner)
Depends on: 851720
Depends on: 852106
This patch fixes the Windows build failure by only setting L10NBASEDIR on mingw platforms when it's already set via mozconfig (or directly passed to configure). Maybe figuring out what exactly needs to be quotes would be a better way. Simply quoting L10NBASEDIR like this:
ac_configure_args="$ac_configure_args --with-l10n-base=\"$L10NBASEDIR\""
is not enough.
Comment on attachment 726429 [details] [diff] [review]
Patch to fix Windows build failure

r? for an m-c port of this exact patch?
Attachment #726429 - Flags: review?(gps)
Comment on attachment 726429 [details] [diff] [review]
Patch to fix Windows build failure

Pushed this patch to comm-central with rs=Callek:
https://hg.mozilla.org/comm-central/rev/920a87339795
Attachment #726429 - Flags: review+
Comment on attachment 726429 [details] [diff] [review]
Patch to fix Windows build failure

Justin: I need comm-aurora approval for this patch, because the original patch got landed in Bug 851720. I can't request comm-aurora in this product here, so doing it this way.
Attachment #726429 - Flags: feedback?(bugspam.Callek)
Actually I meant Bug 851117 in Comment 47.
Hello everyone,

Ended up here from bug 852106, and wonder if fallout from this bug is why this Nightly is stuck at Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:22.0) Gecko/20130318 Firefox/22.0 ID:20130318030947 CSet: b03bb3ce8cee regardless of how many times I make it check for updates.

Is this bug affecting it or shall a new bug be filed?

Thanks!
Comment on attachment 726429 [details] [diff] [review]
Patch to fix Windows build failure

Review of attachment 726429 [details] [diff] [review]:
-----------------------------------------------------------------

a-c-aurora=me
Attachment #726429 - Flags: feedback?(bugspam.Callek) → feedback+
Comment on attachment 726429 [details] [diff] [review]
Patch to fix Windows build failure

Review of attachment 726429 [details] [diff] [review]:
-----------------------------------------------------------------

L10NBASEDIR is so hacky.
Attachment #726429 - Flags: review?(gps) → review+
Comment on attachment 726429 [details] [diff] [review]
Patch to fix Windows build failure

this is for m-c (or m-i) next and likely will have fuzz since it was written against c-c. However it should apply just fine.
Attachment #726429 - Flags: checked-in?
Keywords: checkin-needed
Attachment #726429 - Flags: checked-in? → checked-in+
Comment on attachment 726429 [details] [diff] [review]
Patch to fix Windows build failure

Pushed to comm-aurora: https://hg.mozilla.org/releases/comm-aurora/rev/56a4332c907e
https://hg.mozilla.org/mozilla-central/rev/197b63bd6307
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.