Closed Bug 493755 Opened 16 years ago Closed 16 years ago

installdmg.ex doesn't reliably unpack a .dmg file

Categories

(Release Engineering :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Unassigned)

References

Details

Attachments

(2 files, 3 obsolete files)

The mac slaves sometimes hit the error below when trying to unpack a build. The script then exits without unmounting 'mnt', which causes future builds in the same directory to fail as well.

A few ideas we could use to improve this:
- Before starting, detect if 'mnt' already exists and is a mount point. If it is, then try to unmount it first.
- Make sure that we umount 'mnt' regardless of whether rsync passes or not.
- Figure out why rsync is failing.

spawn hdiutil attach -verbose -noautoopen -mountpoint ./mnt firefox-3.6a1pre.en-US.mac.dmg
Initializing…
DIBackingStoreInstantiatorProbe: interface 0, score 100, CBSDBackingStore
DIBackingStoreInstantiatorProbe: interface 1, score -1000, CBundleBackingStore
DIBackingStoreInstantiatorProbe: interface 2, score -1000, CRAMBackingStore
DIBackingStoreInstantiatorProbe: interface 3, score 100, CCarbonBackingStore
DIBackingStoreInstantiatorProbe: interface 4, score -1000, CDevBackingStore
DIBackingStoreInstantiatorProbe: interface 5, score -1000, CCURLBackingStore
DIBackingStoreInstantiatorProbe: interface 6, score -1000, CVectoredBackingStore
DIBackingStoreInstantiatorProbe: selecting CBSDBackingStore
DIBackingStoreInstantiatorProbe: interface 0, score 100, CBSDBackingStore
DIBackingStoreInstantiatorProbe: interface 1, score -1000, CBundleBackingStore
DIBackingStoreInstantiatorProbe: interface 2, score -1000, CRAMBackingStore
DIBackingStoreInstantiatorProbe: interface 3, score 100, CCarbonBackingStore
DIBackingStoreInstantiatorProbe: interface 4, score -1000, CDevBackingStore
DIBackingStoreInstantiatorProbe: interface 5, score -1000, CCURLBackingStore
DIBackingStoreInstantiatorProbe: interface 6, score -1000, CVectoredBackingStore
DIBackingStoreInstantiatorProbe: selecting CBSDBackingStore
DIFileEncodingInstantiatorProbe: interface 0, score -1000, CMacBinaryEncoding
DIFileEncodingInstantiatorProbe: interface 1, score -1000, CAppleSingleEncoding
DIFileEncodingInstantiatorProbe: interface 2, score -1000, CEncryptedEncoding
DIFileEncodingInstantiatorProbe: nothing to select.
DIFileEncodingInstantiatorProbe: interface 0, score 900, CUDIFEncoding
DIFileEncodingInstantiatorProbe: selecting CUDIFEncoding
DIFileEncodingNewWithBackingStore: CUDIFEncoding
DIFileEncodingNewWithBackingStore: instantiator returned 0
DIFileEncodingInstantiatorProbe: interface 0, score -1000, CSegmentedNDIFEncoding
DIFileEncodingInstantiatorProbe: interface 1, score -1000, CSegmentedUDIFEncoding
DIFileEncodingInstantiatorProbe: interface 2, score -1000, CSegmentedUDIFRawEncoding
DIFileEncodingInstantiatorProbe: nothing to select.
DIDiskImageInstantiatorProbe: interface 0, score 0, CDARTDiskImage
DIDiskImageInstantiatorProbe: interface 1, score 0, CDiskCopy42DiskImage
DIDiskImageInstantiatorProbe: interface 2, score -1000, CNDIFDiskImage
DIDiskImageInstantiatorProbe: interface 3, score 1000, CUDIFDiskImage
CRawDiskImage: data fork length 0x0000000000A759DA (10967514) not a multiple of 512.
DIDiskImageInstantiatorProbe: interface 5, score -100, CRawDiskImage
DIDiskImageInstantiatorProbe: interface 6, score -100, CShadowedDiskImage
DIDiskImageInstantiatorProbe: interface 7, score 0, CSparseDiskImage
DIDiskImageInstantiatorProbe: interface 8, score 0, CSparseBundleDiskImage
DIDiskImageInstantiatorProbe: interface 9, score -1000, CCFPlugInDiskImage
DIDiskImageInstantiatorProbe: interface 10, score -100, CWrappedDiskImage
DIDiskImageInstantiatorProbe: selecting CUDIFDiskImage
DIDiskImageNewWithBackingStore: CUDIFDiskImage
DIDiskImageNewWithBackingStore: instantiator returned 0
Verifying…
Checksumming Driver Descriptor Map (DDM : 0)…
Driver Descriptor Map (DDM : 0): verified CRC32 $6BD21C49
Checksumming Apple (Apple_partition_map : 1)…
Apple (Apple_partition_map : 1): verified CRC32 $1B187942
Checksumming DiscRecording 4.0.1d4 (Apple_HFS : 2)…
DiscRecording 4.0.1d4 (Apple_HFS : 2: verified CRC32 $1A0482B7
Verification completed…
Error 0 (Unknown error: 0).
verified CRC32 $2566E01C
Attaching…
DI_kextWaitQuiet: about to call IOServiceWaitQuiet...
DI_kextWaitQuiet: IOServiceWaitQuiet took 0.040619 seconds
2009-05-16 14:52:17.788 diskimages-helper[21955:2827] -serveImage: attaching drive { autodiskmount = 1; "hdiagent-drive-identifier" = "2D907160-0D9C-49B4-A64F-904D5D805D5B"; "unmount-timeout" = 0; }
2009-05-16 14:52:17.847 diskimages-helper[21955:2827] -serveImage: connecting to myDrive 0x00004707
2009-05-16 14:52:17.848 diskimages-helper[21955:2827] -serveImage: register _readBuffer 0x0x497000 with myDrive 0x0x0
2009-05-16 14:52:17.849 diskimages-helper[21955:2827] -serveImage: activating drive port 0x0x4807
2009-05-16 14:52:17.851 diskimages-helper[21955:2827] _serveImage: set cache enabled=TRUE returned SUCCESS.
2009-05-16 14:52:17.853 diskimages-helper[21955:2827] _serveImage: set on IO thread=TRUE returned SUCCESS.
2009-05-16 14:52:17.857 diskimages-helper[21955:2827] -serveImage: starting server loop - myPort is 0x0x4807
Checking volumes…
Volume check completed…
Mounting…
rsync: link_stat "/builds/moz2_slave/mozilla-central-macosx-unittest-mochitests/build/./mnt/*" failed: No such file or directory (2)
rsync error: some files could not be transferred (code 23) at /SourceCache/rsync/rsync-30/rsync/main.c(717)
child process exited abnormally
    while executing
"system rsync -a ./mnt/* ."
    (file "../tools/buildfarm/utils/installdmg.ex" line 47)
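The first two ideas listed in the report (clear a stale mount before starting, and unmount whatever happens to rsync) could be sketched in shell roughly as follows. This is an illustration only, not the patch attached to this bug: is_mount_point, cleanup, MOUNTPOINT and build.dmg are hypothetical names, and the df-based check assumes the mount path contains no spaces.

```shell
#!/bin/sh
# Hypothetical mount directory name, matching the 'mnt' used in this bug.
MOUNTPOINT=${MOUNTPOINT:-mnt}

# Return 0 if $1 is a directory that is itself a mount point.
# Assumption: compares the directory's physical path with the
# "Mounted on" column of `df -P`; this breaks on paths containing spaces.
is_mount_point() {
    [ -d "$1" ] || return 1
    mp_abs=$(cd "$1" && pwd -P) || return 1
    mp_mounted=$(df -P "$1" | awk 'NR == 2 { print $6 }')
    [ "$mp_abs" = "$mp_mounted" ]
}

# Unmount MOUNTPOINT if (and only if) something is mounted there.
cleanup() {
    if is_mount_point "$MOUNTPOINT"; then
        umount "$MOUNTPOINT"
    fi
}

# Intended use (commented out so the sketch has no side effects):
#   cleanup              # clear a stale mount left by an earlier failed run
#   trap cleanup EXIT    # unmount on exit, whether or not rsync succeeded
#   hdiutil attach -noautoopen -mountpoint "$MOUNTPOINT" build.dmg
#   rsync -a "$MOUNTPOINT"/* .
```

Registering the cleanup with trap, rather than calling umount after rsync, is what makes the unmount independent of rsync's exit status.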
Blocks: 383136
So far it looks like it's only moz2-darwin9-slave07 that's been having problems.
Also moz2-darwin9-slave05
It looks like hdiutil has some asynchronous behaviour. I haven't been able to reproduce the exact error above, but running the following eventually fails with a resource busy error:

(set -e; while true; do hdiutil attach -verbose -noautoopen -mountpoint mnt test.dmg; umount mnt; done;)

This makes me believe there's a race condition between hdiutil and the commands that follow it.
This patch adds some more error handling to installdmg.ex; it tries harder to clean up after itself when it's done, and also tries to do some cleanup before starting. It also waits for files to show up in mnt/ after calling hdiutil attach, and before calling rsync.
It turns out that the EULA on the .dmg files was disabled late last year, so we no longer need to interact with hdiutil via stdin. I saw one instance of the updated installdmg.ex hanging while waiting for hdiutil to finish, but have yet to see this behaviour with installdmg.sh. In the interests of simplicity, I'd rather switch over to using installdmg.sh where possible (talos and unittests on packaged builds are the two places that come to mind).
Assignee: nobody → catlee
Attachment #379589 - Attachment is obsolete: true
Attachment #379958 - Flags: review?(nthomas)
Blocks: 448047
Comment on attachment 379958: Update installdmg.sh to be more resilient as well
Looks alright to me, but asking coop as well; he's the one with the experience here, IIRC. If there's a possibility to kill one of these scripts, then let's.
Attachment #379958 - Flags: review?(nthomas)
Attachment #379958 - Flags: review?(ccooper)
Attachment #379958 - Flags: review+
Comment on attachment 379958: Update installdmg.sh to be more resilient as well
(In reply to comment #8)
> If there's a possibility to kill one of these scripts then lets.
Agreed, no reason to keep using the expect script any more. One small suggestion: make the 90 sec timeout configurable at the top of the script.
Attachment #379958 - Flags: review?(ccooper) → review+
Same as before, except:
- add a TIMEOUT variable to the top of installdmg.sh for how long to wait for files to show up
- add a sleep at the end of installdmg.sh to catch final messages from diskimages-helper
- make the rsync command in both installdmg.sh and installdmg.ex verbose
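As a rough illustration of the "wait for files to show up, with a configurable timeout" behaviour discussed here, a minimal shell sketch might look like the following. The function name wait_for_files and its argument order are hypothetical and are not taken from the attached patch; only the TIMEOUT variable and the 90-second value come from this thread.

```shell
#!/bin/sh
# Seconds to wait for the mounted volume's files to appear (configurable).
TIMEOUT=90

# Hypothetical helper: poll directory $1 once per second until it contains
# at least one entry. Returns 0 as soon as something is there, or 1 once
# $2 seconds (default: $TIMEOUT) have elapsed with the directory still empty.
wait_for_files() {
    wf_dir=$1
    wf_timeout=${2:-$TIMEOUT}
    wf_elapsed=0
    while :; do
        # `ls -A` prints every entry except . and .. ; non-empty output
        # means the volume's contents are visible.
        if [ -n "$(ls -A "$wf_dir" 2>/dev/null)" ]; then
            return 0
        fi
        if [ "$wf_elapsed" -ge "$wf_timeout" ]; then
            return 1
        fi
        sleep 1
        wf_elapsed=$((wf_elapsed + 1))
    done
}

# Intended use (commented out; build.dmg is a placeholder name):
#   hdiutil attach -noautoopen -mountpoint ./mnt build.dmg
#   wait_for_files ./mnt "$TIMEOUT" || { echo "mnt still empty" >&2; exit 1; }
#   rsync -av ./mnt/* .
```

Polling the directory instead of sleeping for a fixed interval keeps the common case fast while still bounding how long a hung mount can stall the build.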
Attachment #379958 - Attachment is obsolete: true
Attachment #380816 - Flags: review?
Attachment #380816 - Flags: review? → review?(ccooper)
Attachment #380816 - Flags: review?(bhearsum)
Comment on attachment 380816: Update installdmg.sh and installdmg.ex to handle error conditions better
Looks good, but could still use a 'set timeout 90' in the expect script (if we're not just killing it completely). Do we need a configurable timer for the final sleep too, or is 5s a worst-case scenario?
Attachment #380816 - Flags: review?(ccooper) → review+
Attachment #380816 - Flags: review?(bhearsum) → review+
Comment on attachment 380816: Update installdmg.sh and installdmg.ex to handle error conditions better
I don't have anything to add after coop's comment.
Attachment #382484 - Flags: review?(ccooper)
Attachment #382484 - Flags: review?(ccooper) → review+
Comment on attachment 380816: Update installdmg.sh and installdmg.ex to handle error conditions better
changeset: 291:cabd30707a43
Attachment #380816 - Flags: checked‑in+
Comment on attachment 382484: Use installdmg.sh instead of installdmg.ex
changeset: 327:20c624f2110f
Attachment #382484 - Flags: checked‑in+
Isn't that FIXED now? BTW, I'm using installdmg.sh as a base for a buildsystem patch on bug 498500
Blocks: 498500
(In reply to comment #16)
> Isn't that FIXED now?
> BTW, I'm using installdmg.sh as a base for a buildsystem patch on bug 498500
Certainly looks that way. I haven't seen any unpack errors since this was deployed, and all our OSX slaves have at most one disk image mounted right now.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
This still happens on try talos.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The scripts/ directory on try talos (and the existing dedicated talos setup?) needs to be updated with the new installdmg.ex.
Assignee: catlee → nobody
Futured until post-pooling because this may not be an issue once the pool is in effect.
Component: Release Engineering → Release Engineering: Future
For reference, this is a problem independent of how you schedule slaves; talos will need to take this.
Try talos is now using a setup similar to the pooled talos slaves, so is grabbing the new installdmg.sh. I think the only things left using the old scripts are the fast talos machines, and those are going away soon, so I'm not going to bother with them.
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Moving closed Future bugs into Release Engineering in preparation for removing the Future component.
Component: Release Engineering: Future → Release Engineering
Product: mozilla.org → Release Engineering