Closed Bug 573828 Opened 14 years ago Closed 13 years ago

"major update" failed to download file for 3.5.10 -> 3.6.4 MU ("Failed to remove patcher2.pid: No such file or directory")

Categories

(Release Engineering :: General, defect, P5)

x86
macOS

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: armenzg, Assigned: armenzg)

References

Details

[104/447] firefox-3.5.10.gu-IN.mac.complete.mar... IN SHUTDOWN...
Failed to remove patcher2.pid: No such file or directory
IN SHUTDOWN...
Failed to remove patcher2.pid: No such file or directory
IN SHUTDOWN...
Failed to remove patcher2.pid: No such file or directory
DownloadFile(): FAILED: 0, output: --13:59:30--  http://stage-old.mozilla.org/pub/mozilla.org/firefox/nightly/3.5.10-candidates/build1/update/mac/gu-IN/firefox-3.5.10.complete.mar
Resolving stage-old.mozilla.org... 10.2.74.116
Connecting to stage-old.mozilla.org|10.2.74.116|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 17866723 (17M) [text/plain]
Saving to: `firefox-3.5.10.gu-IN.mac.complete.mar'


I have clobbered and re-triggered to continue.
Summary: "major update" failed with "Failed to remove patcher2.pid" for 3.5.10's MU → "major update" failed to download file for 3.5.10 -> 3.6.4 MU
The actual problem seems to be that the downloads went through stage-old.m.o, which is not 100% reliable.

It worked on the third attempt, so we never had to modify the patcher-configs to use ftp.m.o instead of stage-old.m.o.

What should be the right solution so we don't hit this again?
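One possible mitigation, independent of which server is used, is to retry a failed download a few times before giving up, which is effectively what the manual re-triggers did here. The following is only a minimal sketch under stated assumptions: download_with_retries, its parameters, and the injected fetch callable are hypothetical names for illustration and are not part of patcher or of any existing release automation.

```python
import time


def download_with_retries(fetch, url, attempts=3, delay=5.0, sleep=time.sleep):
    """Call fetch(url) until it succeeds or the attempts run out.

    fetch is any callable that returns the downloaded data and raises
    OSError on a network failure (hypothetical interface, for illustration).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except OSError as error:
            last_error = error
            # Wait a little longer after each failure, but not after the last.
            if attempt < attempts:
                sleep(delay * attempt)
    raise last_error
```

Three attempts matches what it took to get this run through by hand; the right count and delay for real release jobs would need to be chosen from actual failure data.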
It surprises me that this is a problem specific to stage-old (aka surf). I've always assumed these sorts of issues are a problem with the link overall. Armen, if you get a chance, could you try running MajorUpdateFactory in staging, on a VM, and see if you can reproduce? If we can't reproduce on a VM, that points to the link being the problem.

If it is indeed stage-old that is the problem we could probably pass stageServer=ftpServer to MajorUpdateFactory, since it doesn't need to do any uploading. We couldn't do that for ReleaseUpdatesFactory, though. We might be able to modify the patcher bump script to use ftpServer for the completemarurl rather than stagingServer.

If we ever move to a two-tier stage/ftp setup (bug 394069) we'd have to go back to using the staging server.
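To illustrate the completemarurl idea above, here is a rough sketch of swapping the staging host for the ftp host in a download URL. It is not the patcher bump script (which is not shown in this bug); use_ftp_server is a hypothetical helper, the default host names are assumptions, and it assumes the file lives at the same path on both servers.

```python
from urllib.parse import urlsplit, urlunsplit


def use_ftp_server(url, staging_server="stage-old.mozilla.org",
                   ftp_server="ftp.mozilla.org"):
    """Return url with the staging host replaced by the ftp host.

    URLs that point at any other host are returned unchanged.
    Assumes both servers expose the file under the same path.
    """
    parts = urlsplit(url)
    if parts.hostname != staging_server:
        return url
    # Keep an explicit port if the original URL had one.
    netloc = ftp_server if parts.port is None else f"{ftp_server}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

Leaving other hosts untouched keeps the change limited to download URLs; anything that uploads would still need the staging server, as noted above for ReleaseUpdatesFactory.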
Ben, my (hand-waving) theory is that there is a fairly low rate of failures in requests to surf going on all the time. Sometimes that shows up as bogus data in the Castro proxy, but release jobs that do lots of downloads in one log make it more obvious to us. I don't have any hard data to back that up, so I haven't filed it.
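One way to get hard data for a theory like this would be to count failed downloads in existing release logs. The sketch below assumes the log format seen in the excerpt in this bug (a "[104/447] " style progress marker at the start of a line for each file, and "DownloadFile(): FAILED" for each failure); download_failure_counts is a hypothetical helper, not an existing tool.

```python
import re

# Progress markers such as "[104/447] " at the start of a line.
PROGRESS = re.compile(r"^\[\d+/\d+\] ", re.MULTILINE)
# Failure marker as it appears in the log excerpt in this bug.
FAILURE = "DownloadFile(): FAILED"


def download_failure_counts(log_text):
    """Return (failures, files_started) for one log, given as a string."""
    failures = log_text.count(FAILURE)
    files_started = len(PROGRESS.findall(log_text))
    return failures, files_started
```

Summing these pairs over many logs would give a rough failure rate per file, which could then be compared between jobs that pull from stage-old and jobs that pull from ftp.m.o.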
Assignee: nobody → armenzg
Status: NEW → ASSIGNED
Priority: -- → P3
Priority: P3 → P4
Priority: P4 → P5
This may not be an issue any more, as stage-old has the files mounted directly via NFS now. It could have been an issue with the way we were (transparently) proxying those requests back to ftp.m.o.
That is good to know.

If we hit this issue again, we can come back and reopen this bug.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME
Summary: "major update" failed to download file for 3.5.10 -> 3.6.4 MU → "major update" failed to download file for 3.5.10 -> 3.6.4 MU ("Failed to remove patcher2.pid: No such file or directory")
Product: mozilla.org → Release Engineering