Closed
Bug 799532
Opened 12 years ago
Closed 12 years ago
Try to find the source of IOError when removing profile directory
Categories
(Testing :: XPCShell Harness, defect)
Testing
XPCShell Harness
Tracking
(Not tracked)
RESOLVED
FIXED
mozilla19
People
(Reporter: gps, Assigned: gps)
References
Details
(Whiteboard: [sheriff-want])
Attachments
(1 file)
3.20 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #793855 +++
I'm getting sick of these IOError's and want to try to find the problem. This patch dumps an |ls -a| like output for the profile directory if we get an IOError when deleting it. It will probably need some refining once we uncover what's actually happening, but it's a start.
Yes, I known the redundant lookups to getpwuid and friends are wasteful. This isn't critical path code: I'm not worried about perf.
Sample output (with mach prepending times):
14.13 TEST-INFO | (xpcshell/head.js) | No (+ 0) checks actually run
14.13 <<<<<<<
14.13 >>>>>>>
14.13 Traceback (most recent call last):
14.13 File "/Users/gps/src/mozilla-central-hg/testing/xpcshell/runxpcshelltests.py", line 896, in runTests
14.14 self.removeDir(self.profileDir)
14.14 File "/Users/gps/src/mozilla-central-hg/testing/xpcshell/runxpcshelltests.py", line 339, in removeDir
14.14 shutil.rmtree(dirname)
14.14 File "/usr/local/Cellar/python/2.7.3/lib/python2.7/shutil.py", line 245, in rmtree
14.14 rmtree(fullname, ignore_errors, onerror)
14.14 File "/usr/local/Cellar/python/2.7.3/lib/python2.7/shutil.py", line 250, in rmtree
14.14 onerror(os.remove, fullname, sys.exc_info())
14.14 File "/usr/local/Cellar/python/2.7.3/lib/python2.7/shutil.py", line 248, in rmtree
14.14 os.remove(fullname)
14.14 OSError: [Errno 13] Permission denied: '/var/folders/sz/9xx9hkm53_q6sgt3k50nwffc0000gn/T/tmpBwWvtS/startupCache/startupCache.8.little'
14.14 <<<<<<<
14.14 >>>>>>>
14.14 Files in profile directory:
14.14 <<<<<<<
14.14 40700 gps staff /var/folders/sz/9xx9hkm53_q6sgt3k50nwffc0000gn/T/tmpBwWvtS/
14.14 40755 root wheel /var/folders/sz/9xx9hkm53_q6sgt3k50nwffc0000gn/T/tmpBwWvtS/startupCache/
14.14 100644 gps staff /var/folders/sz/9xx9hkm53_q6sgt3k50nwffc0000gn/T/tmpBwWvtS/startupCache/startupCache.8.little
Attachment #669577 -
Flags: review?(ted.mielczarek)
Assignee | ||
Comment 1•12 years ago
|
||
Assignee | ||
Comment 2•12 years ago
|
||
Finally got a Try build to repro!
From https://tbpl.mozilla.org/php/getParsedLog.php?id=16184085&tree=Try&full=1:
Traceback (most recent call last):
File "xpcshell/runxpcshelltests.py", line 895, in runTests
self.removeDir(self.profileDir)
File "xpcshell/runxpcshelltests.py", line 339, in removeDir
shutil.rmtree(dirname)
File "/home/cltbld/lib/python2.5/shutil.py", line 175, in rmtree
rmtree(fullname, ignore_errors, onerror)
File "/home/cltbld/lib/python2.5/shutil.py", line 167, in rmtree
onerror(os.listdir, path, sys.exc_info())
File "/home/cltbld/lib/python2.5/shutil.py", line 165, in rmtree
names = os.listdir(path)
OSError: [Errno 13] Permission denied: '/tmp/tmpktTCyH/Cache.Trash949609424'
<<<<<<<
Files in profile directory:
40700 cltbld cltbld /tmp/tmpktTCyH/
100664 cltbld cltbld /tmp/tmpktTCyH/runxpcshelltests_leaks.log
100600 cltbld cltbld /tmp/tmpktTCyH/secmod.db
100600 cltbld cltbld /tmp/tmpktTCyH/key3.db
100600 cltbld cltbld /tmp/tmpktTCyH/cert8.db
100600 cltbld cltbld /tmp/tmpktTCyH/_CACHE_CLEAN_
OS Error while performing os.walk!
Traceback (most recent call last):
File "/home/cltbld/lib/python2.5/os.py", line 286, in walk
names = listdir(top)
OSError: [Errno 13] Permission denied: '/tmp/tmpktTCyH/Cache.Trash949609424'
40775 cltbld cltbld /tmp/tmpktTCyH/Cache/
100600 cltbld cltbld /tmp/tmpktTCyH/Cache/_CACHE_001_
100600 cltbld cltbld /tmp/tmpktTCyH/Cache/_CACHE_MAP_
100600 cltbld cltbld /tmp/tmpktTCyH/Cache/_CACHE_002_
100600 cltbld cltbld /tmp/tmpktTCyH/Cache/_CACHE_003_
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/D/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/3/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/1/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/7/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/0/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/5/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/B/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/2/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/A/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/6/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/E/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/C/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/F/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/4/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/9/
40700 cltbld cltbld /tmp/tmpktTCyH/Cache/8/
40755 cltbld cltbld /tmp/tmpktTCyH/webapps/
100644 cltbld cltbld /tmp/tmpktTCyH/webapps/webapps.json
So, what is "Cache.Trash949609424"?
Assignee | ||
Comment 3•12 years ago
|
||
Perhaps some Necko folk can explain where "Cache.Trash949609424" comes from.
Comment 4•12 years ago
|
||
Comment on attachment 669577 [details] [diff] [review]
Dump directory listing on profile removal failure, v1
Review of attachment 669577 [details] [diff] [review]:
-----------------------------------------------------------------
r=me since this is just diagnostic code and we'll remove it eventually.
Attachment #669577 -
Flags: review?(ted.mielczarek) → review+
Assignee | ||
Comment 5•12 years ago
|
||
Digging through source code, it appears the Necko cache has a "trash compartment." It appears that old cache directories get renamed to Cache.Trash<random> and are then deleted on a background thread after some delay. Makes sense.
I suspect that permissions are getting "corrupted" as part of the move. From nsDeleteDir.cpp:
// Important: must rename directory w/o changing parent directory: else on
// NTFS we'll wait (with cache lock) while nsIFile's ACL reset walks file
// tree: was hanging GUI for *minutes* on large cache dirs.
rv = dir->MoveToNative(nullptr, leaf);
dir is just an nsIFile. I suspect we are running into some kind of race condition where the OS or filesystem is attempting to "reparent" directory entries in the directory as part of the move and we try to delete the directory before this has finished. But, that's just speculation.
Assignee | ||
Comment 6•12 years ago
|
||
Target Milestone: --- → mozilla19
Comment 7•12 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Comment 8•12 years ago
|
||
> I suspect we are running into some kind of race condition where the OS or
> filesystem is attempting to "reparent" directory entries in the directory as
> part of the move and we try to delete the directory before this has finished
I'm not much of an expert on NTFS, but AFAIK that shouldn't be happening: when nsDeleteDir renames the file we pass null for the parent directory, which winds up resolving to "skipNtfsAclReset" here:
http://mxr.mozilla.org/mozilla-central/source/xpcom/io/nsLocalFileWin.cpp#1862
But perhaps there's some race condition even without doing the ACL resets. The diagnostic output will hopefully give us a clue.
Assignee | ||
Comment 9•12 years ago
|
||
So, what's the easiest way for me to enable diagnostic output on try builds?
You need to log in
before you can comment on or make changes to this bug.
Description
•