Closed Bug 1097752 Opened 11 years ago Closed 8 years ago

The 'latest' link should be set/moved only once the builds are finished

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: julienw, Unassigned)

References

Details

Attachments

(3 files)

Currently it looks like the "latest" link is set quite early, and points to a directory that is half empty until the builds are finished. For example, http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-central-linux64_gecko/latest/ was pointing to a directory without builds for several minutes, which made the gaia try hook fail.
Ben *or* Nick (whoever gets here first!), how do those latest links get generated? The only places I could see possible ln commands running were: * tools/stage/post_upload.py * buildbotcustom/process/factory.py Is it one of the above? I couldn't see ln running in the logs (e.g. in http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-central-linux64_gecko/latest/mozilla-central_ubuntu64_vm-b2gdt_test-gaia-build-unit-bm118-tests1-linux64-build8.txt.gz), so I wonder if this is run on a cron somewhere outside of the build, or in a post-upload step, rather than as a buildbot step of the build job itself. Julien, which files were missing at the time the gaia try hook ran? I can imagine that if we have a parallel task updating these latest links, then maybe it links as soon as it notices a new directory appear, even if the build is still in progress. However, if we changed that to only cover completed builds, maybe that would interfere with other people's workflows that like to "watch" the latest build being published as it runs?
Flags: needinfo?(nthomas)
Flags: needinfo?(bhearsum)
The builds to look at are the ones at https://tbpl.mozilla.org/?jobname=gecko. After all the compiling and packaging is done there is a 'make upload' step, which has a call to build/upload.py. That uploads the files to a temp directory, then runs post_upload.py (on the ftp server). That call has a --release-to-latest-tinderbox-builds argument, so we end up at http://hg.mozilla.org/build/tools/file/default/stage/post_upload.py#l241 There's no locking around that, but since it's the last step I can't see how that would be an issue. Similarly, we shouldn't get that far if the build didn't have all the artifacts to upload. Could really do with some more details, like a directory listing when it's not right.
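For illustration only, here is a minimal sketch of what "move the link only once the build is finished" could look like on a POSIX filesystem. It is not the post_upload.py code: the function name publish_latest, the required_files argument and the ".tmp" suffix are all assumptions made up for this example. The idea is to check that every expected artifact is present, create the new symlink under a temporary name, and rename it over the old one so that readers never see a missing or half-updated link.

```python
import os


def publish_latest(build_dir, latest_link, required_files):
    """Point latest_link at build_dir, but only if build_dir is complete.

    Hypothetical helper for illustration; not the actual post_upload.py logic.
    Returns True if the link was moved, False if artifacts were missing.
    """
    missing = [name for name in required_files
               if not os.path.isfile(os.path.join(build_dir, name))]
    if missing:
        return False  # leave the existing link untouched

    tmp_link = latest_link + ".tmp"  # assumed temporary name
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clear a stale temporary link from an earlier run
    os.symlink(os.path.abspath(build_dir), tmp_link)
    # Renaming over the old symlink swaps it in a single step on POSIX.
    os.replace(tmp_link, latest_link)
    return True
```

With something like this, a caller would pass the list of files a finished build is expected to contain; deciding what that list is remains the hard part and is not answered here.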
Flags: needinfo?(nthomas)
> Julien, which files were missing at the time the gaia try hook ran? I don't want to say anything wrong so I'll update this bug next time I see the issue. I initially thought it was a failed build, but then I saw the same directory (the one "latest" was pointing to) correctly had the builds. Anyway I'll update with more information as soon as I can.
Maybe related but maybe not, so tell me if I should file a separate bug: currently, the "latest" link does not point to the latest tbpl build, but the latest build that finished. To be more factual: http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-central-linux64_gecko/latest/ points to [1], however [2] is built from a more recent mozilla-central. The reason is that [1] finished later than [2] (for a reason that I don't know). https://tbpl.mozilla.org/?rev=ab137ddd3746 resulted in build directory [2] https://tbpl.mozilla.org/?rev=64f1fb1e2f38 resulted in build directory [1] [1] http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-central-linux64_gecko/1415840627/ [2] http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-central-linux64_gecko/1415841589/
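As a rough sketch of the alternative, a consumer could choose the build directory by name rather than follow "latest". This assumes the numeric directory names (1415840627, 1415841589) are timestamps that increase with newer builds, which matches the two examples above but is not confirmed anywhere in this bug. The function name newest_build_dir is hypothetical.

```python
def newest_build_dir(names):
    """Return the entry with the largest numeric name, or None.

    Assumes numeric directory names are timestamps; entries such as
    "latest" are ignored. Hypothetical helper for illustration.
    """
    numeric = [name for name in names if name.isdigit()]
    if not numeric:
        return None
    return max(numeric, key=int)
```

For the two directories above this would pick 1415841589 regardless of which build finished uploading last.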
Here is a screenshot of the "latest" directory that does not contain any build.
Honestly, my recommendation is not to use "latest" directories for any sort of automation. Using them gives you non-repeatable builds/tests, and as you've seen - the directories are not well maintained. I don't know about these ones specifically, but "latest" directories we've had in the past have been purely for human convenience. These builds have a "packageUrl" property in json metadata we publish, which points at stable locations like http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-central-macosx64_gecko/1415881846/b2g-36.0a1.multi.mac64.dmg. I don't know if your system already consumes the metadata, but if it does, that might be an easy fix.
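A minimal sketch of consuming the metadata instead of "latest" might look like the following. The layout of the published JSON is not described in this bug, so the sketch assumes, purely for illustration, a JSON object with a top-level "packageUrl" key; the function name package_url is also hypothetical.

```python
import json


def package_url(metadata_text):
    """Extract the stable package URL from a build's JSON metadata.

    Assumes a top-level "packageUrl" key; the real metadata layout
    may differ. Hypothetical helper for illustration.
    """
    data = json.loads(metadata_text)
    url = data.get("packageUrl")
    if not url:
        raise KeyError("packageUrl not found in metadata")
    return url
```

The returned URL names a specific build directory, so a test run that records it can be repeated later against the same package.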
Flags: needinfo?(bhearsum)
Is it possible that using the symlink (or something else about how gaia-try downloads) results in going outside releng's private VIP view of ftp.m.o? That screenshot shows a completed test log, so at that time, it had already been possible for at least three minutes to download the build even though it isn't seen in that public view of one particular webhead at one particular time.
John, could you clarify how gaia try fetches a build, and where it is executed?
Flags: needinfo?(jhford)
(In reply to Julien Wajsberg [:julienw] from comment #5) > Here is a screenshot of the "latest" directory that does not contain any > build. Very helpful! So the problem is that logs from tests should not be uploaded using --release-to-latest-tinderbox-builds, only the builds.
Assignee: nobody → nthomas
Priority: -- → P2
I agree that using latest/ is fragile, given it could change at any time, but this at least makes it work better.
Attachment #8523588 - Flags: review?(bhearsum)
Attachment #8523588 - Flags: review?(bhearsum) → review+
Comment on attachment 8523588 [details] [diff] [review] [buildbotcustom] Treat logs differently https://hg.mozilla.org/build/buildbotcustom/rev/5bf45e9cc8c2 This will get deployed in the usual way with a reconfig.
Attachment #8523588 - Flags: checked-in+
Attached image another screenshot
Got another similar issue.
Hey Nick, any idea about this?
Flags: needinfo?(nthomas)
I'm not sure tbh. The earlier fix here was deployed correctly. How long did that situation persist for? The gecko builds do the en-US build and upload to the en-US/ dir, then do the multi-locale and upload it. There's a 5 minute delay between those, and the latest symlink is updated both times.
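Given that two-stage upload, a consumer that needs both packages could check the directory listing before using it. The sketch below is only an illustration: the function name build_is_complete is hypothetical, and the filename patterns in the usage note are guesses (only the ".multi." naming appears in a URL earlier in this bug).

```python
from fnmatch import fnmatch


def build_is_complete(listing, required_patterns):
    """Return True if every pattern matches at least one name in listing.

    Hypothetical helper for illustration; the patterns to require are
    an assumption the caller has to supply.
    """
    return all(
        any(fnmatch(name, pattern) for name in listing)
        for pattern in required_patterns
    )
```

A caller might pass patterns such as "*.multi.*" and "*.en-US.*" and retry after a short wait while the function returns False.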
Flags: needinfo?(nthomas)
OK, so if I understand correctly, it means that en-US will always be populated before the multilocale build. That means we'll always have a 5-minute window where we have the en-US build but not the multilocale build. John, what do you think of this?
We aren't using the /latest/ link in the gaia-try hook at all, so if that's the concern we don't need to worry about it. As a general feature/improvement, I think this is a good idea. Not having multilocale until after en-US is not a huge concern for things that I've worked on.
Flags: needinfo?(jhford)
I'm not actively working on this, and bug 1132123 will change the upload logic to rsync instead of scp+post_upload.py anyway. I don't know if that will upload everything in one go or not.
Assignee: nthomas → nobody
Priority: P2 → --
See Also: → 1132123
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Component: General Automation → General