Closed Bug 517428 Opened 11 years ago Closed 9 years ago

fix race condition when uploading repackages to tinderbox-builds

Categories

(Release Engineering :: General, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: coop)

References

()

Details

(Whiteboard: [l10n][automation][oldbugs])

Attachments

(2 files, 3 obsolete files)

Not sure what happened here, but the build is red for things that don't seem to have something with the locale here.

http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-l10n-zh-TW/1253238259.1253238409.2898.gz&fulltext=1 claims that the lang pack isn't there.
The mentioned error:
> OSError: [Errno 2] No such file or directory: '/home/ftp/pub/firefox/tinderbox-builds/mozilla-1.9.2-l10n/firefox-3.6a2pre.zh-TW.langpack.xpi'
> make: *** [l10n-upload-zh-TW] Error 2

This is where it supposedly fails (http://mxr.mozilla.org/build/source/tools/stage/post_upload.py#36):
> if os.path.exists(new_file):
>    os.unlink(new_file)

If I look at FTP I can see the file which says it has not been uploaded:
> firefox-3.6a2pre.zh-TW.mac.dmg	17-Sep-2009 18:46 	22M

Not sure how to check what went wrong.
Could there be some corruption on the filesystem?
It seems that this was a race condition since Linux and Mac for that locale ended the last step at the same time.

Shall we not upload langpacks and just keep the installers for tinderbox-builds?
The race condition seems to be there independent of tinderbox build or not.

The other way would be to only upload the langpack for one platform. We should do that cleverly and by coordination with the automation to not run into the same issue that we have with desktop fennecs, where the platform doesn't really make sense. Probably should be some define to upload that says UPLOAD_LANGPACK, and we set that in the buildbot configs to just do that for one platform on the firefox builds.
(In reply to comment #3)
> The race condition seems to be there independent of tinderbox build or not.
> 
> The other way would be to only upload the langpack for one platform. We should
> do that cleverly and by coordination with the automation to not run into the
> same issue that we have with desktop fennecs, where the platform doesn't really
> make sense. Probably should be some define to upload that says UPLOAD_LANGPACK,
> and we set that in the buildbot configs to just do that for one platform on the
> firefox builds.

Or we could toss the XPIs in platform dirs. Eg, win32/zh-TW.xpi, linux/zh-TW.xpi.
Tweaked the summary and futuring it for now
Component: Release Engineering → Release Engineering: Future
Priority: -- → P4
Summary: mac zh-TW build failure in post-upload.py → fix race condition when uploading repackages to tinderbox-builds
Mass move of bugs from Release Engineering:Future -> Release Engineering. See
http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Priority: P4 → P3
Whiteboard: [l10n][automation]
Assignee: nobody → armenzg
Status: NEW → ASSIGNED
Priority: P3 → P5
We can probably follow the same directory format as the other builds:
mozilla-central-linux-l10n
mozilla-central-linux64-l10n
mozilla-central-win32-l10n
mozilla-central-macosx64-l10n
Whiteboard: [l10n][automation] → [l10n][automation][oldbugs]
My vote is for platform-specific dirs. That seems like the easiest path forward.
@triagefollowup

I won't be able to get this.

We are trying to upload xpi files of each platform on different locations so they won't race condition when uploading through post_upload.py.

IIUC the most wanted option is to upload the langpack to:
* {$branch}-l10n/{$platform}
instead of:
* {$branch}-l10n/{$platform}
Whiteboard: [l10n][automation][oldbugs] → [l10n][automation][oldbugs][triagefollowup]
Priority: P5 → P4
I am going to keep 2 of my oldbugs.
Whiteboard: [l10n][automation][oldbugs][triagefollowup] → [l10n][automation][oldbugs]
Axel how terrible or wonderful is this patch in your opinion?
Attachment #517856 - Flags: feedback?(l10n)
This patch would upload the xpi file under "platform/xpi" like we do for releases.

What do you say?
Attachment #517874 - Flags: feedback?(l10n)
Attachment #517856 - Attachment is patch: true
Attachment #517856 - Attachment mime type: application/octet-stream → text/plain
Is this an question about alternatives? Then I prefer the paths to filenames, the filenames seem to have a stronger implication of being bound to a platform than the path. And langpacks should really be platform indepenent. (If they're dependent, I'd consider that to be a bug.)
Attachment #517856 - Flags: feedback?(l10n)
Comment on attachment 517874 [details] [diff] [review]
PKG_LANGPACK_PATH to upload to platform/xpi

I should have been more clear.
It was to choose one above the other.

To be honest I don't know if they are dependent or not.
Attachment #517874 - Flags: feedback?(l10n) → review?(l10n)
Priority: P4 → P5
Axel gentle poke for review.
Comment on attachment 517874 [details] [diff] [review]
PKG_LANGPACK_PATH to upload to platform/xpi

This would break http://mxr.mozilla.org/mozilla-central/source/browser/locales/Makefile.in#301, in whichever sense of break. Aka, I have no clue if we use that target for anything, but we should either make things consistent or rip it out.
Attachment #517874 - Flags: review?(l10n) → review-
(In reply to comment #2)
> It seems that this was a race condition since Linux and Mac for that locale
> ended the last step at the same time.
> 
> Shall we not upload langpacks and just keep the installers for
> tinderbox-builds?

Could we just teach post_upload.py to not overwrite the existing xpi if it's <1 hour old?

In the rare event that we need to make sure a new xpi gets posted, we could remove it by hand.
Priority: P5 → P4
The problem is that os.unlink() does not get happy when we try a to remove a file that has been deleted already.

This patch allows us to recover from it by ignoring the file we were just about to delete.

Even though I am writing this patch I believe we should be taking attachment 517874 [details] [diff] [review].

Axel could you point out what is the line you were refering to?
Are you referring to http://hg.mozilla.org/mozilla-central/annotate/a74b05ad7801/browser/locales/Makefile.in#l301 ??
This was removed by Callek in http://hg.mozilla.org/mozilla-central/rev/87c47b419223 (Feb. 23rd, I think)
If this is true I don't see any more obstacles to land attachment 517874 [details] [diff] [review].

##########################

This is a recap of what the context is:

* This is where we have the race condition:
> http://hg.mozilla.org/build/tools/file/40f1463d4ad6/stage/post_upload.py#l59
* This is where the list of UPLOAD_FILES is generated:
> http://hg.mozilla.org/mozilla-central/annotate/da816b446ffa/toolkit/mozapps/installer/packager.mk#l784
* This is the output of a normal repack:
> /tools/python/bin/python2.5 /builds/slave/cen-lnx/build/build/upload.py --base-path ../../dist \
>   "../../dist/firefox-6.0a1.en-US.linux-i686.tar.bz2"   "../../dist/install/firefox-6.0a1.en-US.langpack.xpi"  "../../dist/firefox-6.0a1.en-US.linux-i686.tests.zip" "../../dist/firefox-6.0a1.en-US.linux-i686.crashreporter-symbols.zip"  "../../dist//firefox-6.0a1.en-US.linux-i686.txt"  \
>   "../../dist//firefox-6.0a1.en-US.linux-i686.checksums"
* This is the output of a pretty repack:
> /tools/python/bin/python2.5 /builds/slave/rel-2.0-lnx-rpk-1/mozilla-2.0/build/upload.py --base-path ../../dist \
>  "../../dist/linux-i686/af/firefox-4.0.1.tar.bz2"  "../../dist/update/linux-i686/af/firefox-4.0.1.complete.mar" "../../dist/linux-i686/xpi/af.xpi"       \
>  "../../dist/linux-i686/af//firefox-4.0.1.checksums"
Attachment #528689 - Flags: review?(coop)
Priority: P4 → P2
Seems so, so my comment seems to have been obsoleted by reality.
Attachment #517856 - Attachment is obsolete: true
Comment on attachment 517874 [details] [diff] [review]
PKG_LANGPACK_PATH to upload to platform/xpi

re-requesting then :)
Attachment #517874 - Flags: review- → review?(community)
Attachment #517874 - Flags: review?(community) → review?(l10n)
Comment on attachment 528689 [details] [diff] [review]
handle better an OSError for os.unlink()

>+        except OSError, e:

If you're not actually going to use e (to check the errno or whatever), you don't have to include it here.
Attachment #528689 - Flags: review?(coop) → review+
Pike gentle poke for review.
Priority: P2 → P3
Pike what was that other case you were thinking of that you mentioned on IRC a couple of weeks ago?
I hit enter too soon.

Do you think it should stop us from landing this patch or it is orthogonal to it?
I've not yet found the cycles to dive in to the question of what the

$(NSINSTALL) -D $(DIST)/install

at the beginning of libs-% has to do with this. Is that still needed, do we need an equivalent?
Comment on attachment 517874 [details] [diff] [review]
PKG_LANGPACK_PATH to upload to platform/xpi

Sorry for the lag, these patches are a pain to review, as all those variables show up in various combinations to point to the same thing. Plus every now and then you gotta figure out if your nsinstall -D is there. But that looks all good.
Attachment #517874 - Flags: review?(l10n) → review+
That wasn't enough to do the trick. post_upload.py should be fixed as well.

> Running post-upload command: post_upload.py --tinderbox-builds-dir mozilla-central-l10n -b mozilla-central-l10n -p firefox --release-to-tinderbox-builds
Priority: P3 → P2
Trading Armen for the tp5 bug.
Assignee: armenzg → coop
OS: Mac OS X → All
Hardware: x86 → All
We can see that our first change did the job [1] of putting the xpi inside of a platform dir but what it was needed to put it in the right place on ftp.

We have to modify post_upload.py so --release-to-tinderbox-builds [2] does a similar job as --release-to-candidates-dir [3].
> http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-08-03-mozilla-central-l10n/firefox-7.0a1.nl.langpack.xpi (note that there is no $platform/$xpi directory)
VS
> http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/4.0.1-candidates/build1/linux-x86_64/xpi/af.xpi


[1] > /tools/python/bin/python2.5 ../../build/upload.py ......... "../../dist/linux-x86_64/xpi/firefox-7.0a1.nl.langpack.xpi" 
[2] > Running post-upload command: post_upload.py --tinderbox-builds-dir mozilla-central-l10n -b mozilla-central-l10n -p firefox --release-to-tinderbox-builds
[3] > Running post-upload command: post_upload.py -p firefox -n 1 -v 4.0.1 --release-to-candidates-dir
(In reply to comment #31)
> We can see that our first change did the job [1] of putting the xpi inside
> of a platform dir but what it was needed to put it in the right place on ftp.
> 
> We have to modify post_upload.py so --release-to-tinderbox-builds [2] does a
> similar job as --release-to-candidates-dir [3].
> > http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-08-03-mozilla-central-l10n/firefox-7.0a1.nl.langpack.xpi (note that there is no $platform/$xpi directory)
> VS
> > http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/4.0.1-candidates/build1/linux-x86_64/xpi/af.xpi

So the problem here is that post_upload.py has no concept of platform right now.

Is it better to add an explicit flag to post_upload.py for platform (and set it everywhere), or to try to parse it out of the filenames that come in?
I think the difference is that preserveDir is not used:
   157  CopyFileToDir(f, upload_dir, tinderboxBuildsPath)
VS
   193  CopyFileToDir(f, upload_dir, realCandidatesPath, preserve_dirs=True)
(In reply to comment #33)
> I think the difference is that preserveDir is not used:
>    157  CopyFileToDir(f, upload_dir, tinderboxBuildsPath)
> VS
>    193  CopyFileToDir(f, upload_dir, realCandidatesPath, preserve_dirs=True)

Ok, I have applied this change to post_upload.py on staging-stage. We'll soon see whether it's sufficient.
(In reply to comment #34) 
> Ok, I have applied this change to post_upload.py on staging-stage. We'll
> soon see whether it's sufficient.

That doesn't appear to be enough. I haven't verified this yet, but I think the upload.py call in the non-candidate case flattens the directory structure before it even gets to post_upload.py.
(In reply to comment #35) 
> That doesn't appear to be enough. I haven't verified this yet, but I think
> the upload.py call in the non-candidate case flattens the directory
> structure before it even gets to post_upload.py.

I verified that the xpi dir *is* being created properly on upload, but is getting lost somewhere in post_upload.py. Debugging now.
(In reply to comment #36) 
> I verified that the xpi dir *is* being created properly on upload, but is
> getting lost somewhere in post_upload.py. Debugging now.

Figured out my disconnect. One set of logs I was looking at was from a nightly where we don't actually touch tinderbox-builds. This did show me that we need preserve_dirs for the dated dirs as well.
Also rolls in the try block for the OSError from the earlier patch.
Attachment #528689 - Attachment is obsolete: true
Attachment #540668 - Flags: review?(armenzg)
Comment on attachment 540668 [details] [diff] [review]
Set preserve_dirs=True when working with dated or (tinderbox) build dirs

I was afraid this would affect every on-change builder (opt & debug) and binaries would be uploaded to odd subdirectories like "install" or "dist" but I have looked at the output of a make upload log and it doesn't seem we will have changes besides langpacks.

Do you have the output from any staging-stage build? (or where it got uploaded)
Perhaps you want to deploy this during the downtime to be safe.

92a72f7c220e4f28f2047bb34d730b998d8f259b083d4318764cf2ac6301a6a4be37bb606a86454c6bbbf7b891a99889b88a59d58a2fe40dca186edf7804015d sha512 15234978 firefox-7.0a1.en-US.linux-i686.tar.bz2
3ef1da65acb09122661a790bb4752f7432dedb54466ddfa26be65b911182e62f7ce97a6e73bf4a6c02dc4d75fd9d94e7c8e625e35a094b24c4207ea6a7a2d746 sha512 216565 linux-i686/xpi/firefox-7.0a1.en-US.langpack.xpi
e1c18666196d4c223371e0f402c27a93373782d5b6e3e7fb2691c5a607a3e4cdf4e40f1bc12e67eefc776da45539961ac375ef1ea581a87b97966da42dff43ad sha512 49756280 firefox-7.0a1.en-US.linux-i686.tests.zip
b3400d781832ef9e9aaa50da4823d5e1807f49399ce9eaa37aef0d62f7efb4c402acdd060b5d28bb44c3b7598a0ab1ce6da311f8e09f68dad39d05270665ab94 sha512 21123115 firefox-7.0a1.en-US.linux-i686.crashreporter-symbols.zip
5ce3405168e484c5b91e5e26f0904d9245c9da174ea59d92639911a9f25a3dc1c9e69339cd559f74e1035577ae82ea0082c1b3e056029051b36bd98681fabfc9 sha512 70 firefox-7.0a1.en-US.linux-i686.txt
Attachment #540668 - Flags: review?(armenzg) → review+
Largely a finessing of the previous patch:

* only use preserve_dirs for l10n xpis, i.e. en-US xpis still end up in their current spot
* add preserve_dirs for l10n builds going into latest. This cleans up the latest-*-l10n root dir a bit.
* fixes stderr output when preserve_dirs is on for l10n xpis

Here's some output from staging. Notice that en-US is unaffected:

(en-US dep)
Process stderr:
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308953321/firefox-7.0a1.en-US.linux-i686.tar.bz2
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308953321/firefox-7.0a1.en-US.langpack.xpi
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308953321/firefox-7.0a1.en-US.linux-i686.tests.zip
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308953321/firefox-7.0a1.en-US.linux-i686.crashreporter-symbols.zip
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308953321/firefox-7.0a1.en-US.linux-i686.txt
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308953321/firefox-7.0a1.en-US.linux-i686.checksums

(en-US nightly)
Process stderr:
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central/firefox-7.0a1.en-US.linux-i686.tar.bz2
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central/firefox-7.0a1.en-US.linux-i686.complete.mar
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central/firefox-7.0a1.en-US.langpack.xpi
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central/firefox-7.0a1.en-US.linux-i686.partial.20110622075450-20110624063503.mar
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central/firefox-7.0a1.en-US.linux-i686.tests.zip
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central/firefox-7.0a1.en-US.linux-i686.txt
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central/firefox-7.0a1.en-US.linux-i686.checksums
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308922503/firefox-7.0a1.en-US.linux-i686.tar.bz2
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308922503/firefox-7.0a1.en-US.langpack.xpi
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308922503/firefox-7.0a1.en-US.linux-i686.tests.zip
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308922503/firefox-7.0a1.en-US.linux-i686.crashreporter-symbols.zip
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308922503/firefox-7.0a1.en-US.linux-i686.txt
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux/1308922503/firefox-7.0a1.en-US.linux-i686.checksums

(l10n dep build)
Process stderr:
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-l10n/firefox-7.0a1.fr.linux-i686.tar.bz2
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-l10n/linux-i686/xpi/firefox-7.0a1.fr.langpack.xpi
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-l10n/firefox-7.0a1.fr.linux-i686.checksums

(l10n nightly build)
Process stderr:
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central-l10n/firefox-7.0a1.fr.linux-i686.tar.bz2
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central-l10n/firefox-7.0a1.fr.linux-i686.complete.mar
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central-l10n/linux-i686/xpi/firefox-7.0a1.fr.langpack.xpi
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central-l10n/firefox-7.0a1.fr.linux-i686.partial.20110624063503-20110624063503.mar
http://staging-stage.build.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-24-06-35-03-mozilla-central-l10n/firefox-7.0a1.fr.linux-i686.checksums
Attachment #540668 - Attachment is obsolete: true
Attachment #541946 - Flags: review?(armenzg)
Comment on attachment 541946 [details] [diff] [review]
Set preserve_dirs=True when working with dated or (tinderbox) build dirs, v2

Very nice :) one more old bug being done!
Attachment #541946 - Flags: review?(armenzg) → review+
Comment on attachment 541946 [details] [diff] [review]
Set preserve_dirs=True when working with dated or (tinderbox) build dirs, v2

http://hg.mozilla.org/build/tools/rev/6f4b1a4665e2

I've updated the checkout on stage, so any new l10n repacks should have their xpis sorted this way.
Attachment #541946 - Flags: checked-in+
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.