Closed Bug 631573 Opened 9 years ago Closed 7 years ago

Omit directory times when pushing to mirrors

Categories

(Release Engineering :: General, defect, P3, critical)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jhford, Unassigned)

Details

(Whiteboard: [releases][automation][simple])

Attachments

(1 file)

During the Firefox 4.0b11 build3 automation run, it was discovered that the en-US installer files weren't showing up on ftp.mozilla.org.  These files are showing as present on stage.mozilla.org.

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/4.0b11-candidates/build3/win32/en-US/

vs.

http://stage.mozilla.org/pub/mozilla.org/firefox/nightly/4.0b11-candidates/build3/win32/en-US/

The files in question are "Firefox Setup 4.0 Beta 11.exe" and "Firefox Setup 4.0 Beta 11.exe.asc"

I get the same results for FTP and HTTP on ftp.mozilla.org.  I have not checked for other instances of this issue yet.
I've only gone through a few more directories, but I've found at least one more missing in Win32: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/4.0b11-candidates/build3/win32/zh-CN/
Verified that both servers have the same NFS share mounted on that mount point.

10.253.0.11:/vol/stage

I changed the mount options to enforce a 30 second cache timeout, and the files still aren't showing up.  Is this something server-side on the netapp?
Assignee: server-ops → aravind
Aravind's out today.  Who else knows netapp stuff?
Assignee: aravind → server-ops
Sounds like cshields managed to at least knock it loose...

I added actimeo=30 to the mount options during my initial fooling around. That may have solved it, but only for newly added files.  When Corey touched a new file in a directory with missing files, everything in the directory showed up.  New files now seem to show up immediately.
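A minimal sketch of the "touch a new file" nudge described above, assuming it is run on a host with write access to the share. The function name and the scratch-file prefix are hypothetical; nothing in this bug says how the file was actually touched.

```python
import os
import tempfile


def nudge_directory(path):
    """Create and remove a scratch file so the directory's mtime changes.

    Hypothetical helper: the ".nudge-" prefix is an arbitrary choice.
    """
    fd, scratch = tempfile.mkstemp(prefix=".nudge-", dir=path)
    os.close(fd)
    os.unlink(scratch)
    # Return the directory's modification time after the nudge.
    return os.stat(path).st_mtime
```

Creating and deleting an entry both modify the directory, so its mtime moves forward without leaving anything behind. Whether that is enough to make a given NFS client drop its cached listing depends on the client.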
Assignee: server-ops → cshields
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
We've had this sort of scenario before, but I can't recall exactly which mounting setup it was for. Basically, someone loads a new directory before builds are published into it, and the empty state gets cached and doesn't get expired properly. Hopefully we're good now; do let us know if you see it, juanb.
I added lookupcache=pos and did a remount, and that got the directory listing working.  I am not sure that the change actually improved the situation.  I will leave this bug open for now.
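For reference, a small sketch that checks whether an fstab-style line carries the two options mentioned so far. Per nfs(5), actimeo=30 sets all four attribute-cache timeouts (acregmin, acregmax, acdirmin, acdirmax) to 30 seconds, and lookupcache=pos makes the client revalidate negative lookups instead of trusting a cached "no such entry". The mount point /mnt/stage and the sample line are assumptions; only the export 10.253.0.11:/vol/stage appears in this bug.

```python
def nfs_options(fstab_text, mount_point):
    """Return the options of an NFS entry in fstab-style text as a dict."""
    for line in fstab_text.splitlines():
        fields = line.split()
        # Expect: device, mount point, filesystem type, options, ...
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        if not fields[2].startswith("nfs"):
            continue
        options = {}
        for item in fields[3].split(","):
            key, _, value = item.partition("=")
            # Flags without a value (such as "ro") are stored as True.
            options[key] = value or True
        return options
    return None


# Hypothetical sample entry; the mount point is an assumption.
SAMPLE = "10.253.0.11:/vol/stage /mnt/stage nfs ro,actimeo=30,lookupcache=pos 0 0\n"

opts = nfs_options(SAMPLE, "/mnt/stage")
print(opts.get("actimeo"), opts.get("lookupcache"))
```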
reopen if this becomes a problem again
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Duplicate of this bug: 636160
OK, so... there's a big long thread on the linux-nfs mailing list about an almost identical situation at http://www.spinics.net/lists/linux-nfs/msg08897.html

The executive summary of that thread is at http://www.spinics.net/lists/linux-nfs/msg09052.html in case you don't feel like reading the whole thing.

The short version is it's a bug in the Linux kernel, apparently fixed in 2.6.30 (although nobody in that thread seems to have tested it to ensure that, they were only basing that on the commit message that accompanied it).  RHEL5 of course uses 2.6.18.  RHEL6 is on 2.6.32.
How are these files being copied onto surf/stage?  I am guessing it's some sort of rsync or unzip style thing.  If so, can you please point us to the exact command line used and the steps that create/update these directory trees?
(In reply to comment #11)
Great to have an idea for a fix. Could we mount the partition on a RHEL6 box to test it ? And could we do a remount (or similar) on ftp.m.o for now ?

(In reply to comment #12)
> How are these files being copied onto surf/stage?  I am guessing it's some sort
> of rsync or unzip style thing.  If so, can you please point us to the exact
> command line used and the steps that create/update these directory trees?

We use an rsync once all the windows installers are signed, 
 http://hg.mozilla.org/build/tools/file/tip/release/signing/Makefile#l351
STAGE_HOST expands to stage.m.o. 

As for the files in the source directory, win32/en-US/*.exe is created first, then update/win32/en-US/*.mar, then the same is repeated for all the locales, roughly alphabetically; then the .asc files (gpg signatures) are created in alphabetical order.
(In reply to comment #13)
> And could we do a remount (or similar) on ftp.m.o for now ?

n/m, this must have happened already.
Can we add a -O or --omit-dir-times flag to rsync there?  That skips directory time syncing, which I think will solve this problem.
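A rough sketch of what such an invocation could look like, built in Python so the argument list is explicit. This is not the command from the signing Makefile; the source directory, the destination, and the helper name are placeholders. The only flags used are rsync's -a, -v and --omit-dir-times (short form -O).

```python
import shlex


def build_rsync_command(source_dir, dest, omit_dir_times=True):
    """Build an rsync argument list that can skip directory mtime syncing."""
    cmd = ["rsync", "-a", "-v"]
    if omit_dir_times:
        # -a implies --times; this flag exempts directories from it.
        cmd.append("--omit-dir-times")
    # Trailing slash: copy the directory's contents, not the directory itself.
    cmd += [source_dir.rstrip("/") + "/", dest]
    return cmd


# Placeholder paths, purely illustrative.
cmd = build_rsync_command("signed-build", "user@stage.example.org:/path/to/candidates/")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

With the flag set, rsync leaves directory mtimes at whatever the writes on the destination produce, instead of resetting them to the source's values after the files are in place.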
(In reply to comment #11)
AIUI, in our case the ls on the r/o mount is http/ftp requests on the win32/en-US directory.

(In reply to comment #15)
> Can we add a -O or --omit-dir-times flag to rsync there?  That skips directory
> time syncing, which I think will solve this problem.

I think so. bhearsum raises a good point on IRC though, which is that there are other processes which upload lots of data and they might tickle this too. Which I'm taking as 'can we fix the underlying issue ?'. What's the strategy for RHEL upgrades on IT boxes ? Is there a general program or is it being done as needed ?
(In reply to comment #16) 
> I think so. bhearsum raises a good point on IRC though, which is that there are
> other processes which upload lots of data and they might tickle this too. Which
> I'm taking as 'can we fix the underlying issue ?'. What's the strategy for RHEL
> upgrades on IT boxes ? Is there a general program or is it being done as needed
> ?

The entire ftp/stage system needs a bit of re-engineering but the reason it hasn't happened yet is really a lack of time - which would be the biggest problem in upgrading right now as well.

There is no upgrade path for major release of RHEL, you have to start fresh with RHEL6. This would be a big project, but it is on the radar.
FWIW ftp.mozilla.org is already behind a Zeus load balancer on an alternate IP, but we haven't switched DNS to point at it yet.  Someday, when we've got time and available hardware, we're planning to spin up an additional node anyway.  Spinning up the additional node as RHEL6 when we do so is a no-brainer; then we can just switch to that one entirely and reload the original one.
(In reply to comment #15)
> Can we add a -O or --omit-dir-times flag to rsync there?  That skips directory
> time syncing, which I think will solve this problem.

Has this been done?
This effectively sets all the timestamps on the directories to the upload time, rather than the time we created the detached signatures. As we discussed, this shouldn't cause us any problems.
Attachment #515762 - Flags: review?(bhearsum)
Attachment #515762 - Flags: review?(bhearsum) → review+
Comment on attachment 515762 [details] [diff] [review]
[checked in] Omit directory times when uploading signed builds

http://hg.mozilla.org/build/tools/rev/2b5ad8671c22
Attachment #515762 - Attachment description: Omit directory times when uploading signed builds → [checked in] Omit directory times when uploading signed builds
Closing out.  Please reopen if this workaround does not work.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
Happened again when we pushed to mirrors for rc1.

ftp://ftp.mozilla.org/pub/firefox/releases/4.0rc1/win32/pl/ is empty, but should have files (like http://releases.mozilla.org/pub/mozilla.org/firefox/releases/4.0rc1/win32/pl/)
Assignee: cshields → server-ops
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Actually, pulling this into RelEng so we can adjust the push to mirrors rsync, too.
Assignee: server-ops → nobody
Component: Server Operations → Release Engineering
QA Contact: mrz → release
Long term fix: bug 446401.  Get those machines into production and let them take over. :)  (but not till after ff4)
Priority: -- → P3
Summary: dm-ftp01 not in sync with stage.m.o → Omit directory times when pushing to mirrors
Whiteboard: [releases][automation][simple]
What's left for releng to do here?
I'm pretty sure we don't care about this anymore.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 7 years ago
Resolution: --- → WONTFIX
Product: mozilla.org → Release Engineering