Closed
Bug 681407
Opened 12 years ago
Closed 12 years ago
crash nsLocalFile::CopyMove
Categories
(Core :: Networking: File, defect)
Tracking
()
VERIFIED
FIXED
mozilla10
People
(Reporter: MatsPalmgren_bugz, Assigned: michal)
References
Details
(5 keywords, Whiteboard: [inbound][qa!])
Crash Data
Attachments
(2 files)
1.71 KB,
patch
|
benjamin
:
review+
briansmith
:
review+
jst
:
approval-mozilla-aurora+
jst
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
2.51 KB,
text/html
|
Details |
This bug was filed from the Socorro interface and is report bp-2ba15c7b-9b36-4a0a-a1f4-840c72110822 . ============================================================= Currently at #9 on the "Top Crashers for Firefox 8.0a2" list with 52 crashes in the past 7 days (all on Windows).
Jason - any ideas here? I think you modified this code somewhat recently.
Comment 2•12 years ago
|
||
Yes, I changed the code with this patch in bug 670911, and the crashes seem to only happen after then http://hg.mozilla.org/mozilla-central/rev/f5272524806b But I've got no clue as to how this code could be causing a segfault. The patch is quite simple, and doesn't cause any new data to be read, just tests against '!aParentDir', which is allowed to be null but is never read from if so. My patch also changes the call stack here so that aParentDir is now null (used to be non-null), but that should be fine too (and was on the Windows box I tested with). Been staring at it for a while now and am at a loss. Any chance this is some malware issue? In the past that's generally been the issue for necko crashes that don't make any sense at all. I'm also puzzled by the one user comment we have "watching a streaming video": this code happens when the cache is first opened, so it seems like it would hard to have it happen while watching a video.
Comment 3•12 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #2) > I'm also puzzled by the one user comment we have "watching a streaming > video": this code happens when the cache is first opened, so it seems like > it would hard to have it happen while watching a video. If you look at that stack, thread 0 (UI thread) was explicitly clearing the cache from script. The other crash reports show the cache being cleared on a background thread.
Comment 4•12 years ago
|
||
None of the linked stacks were able to identify the source line for the top frame, nor do they hand out the actual EIP for any of the stack frames :-(
Comment 5•12 years ago
|
||
I'm going to nominate this one for tracking. It looks like a regression to me. Can we can some further action on this?
tracking-firefox8:
--- → ?
Keywords: regression
Comment 6•12 years ago
|
||
From what I can see in crash stats, this crash signature has been around in low volume since the 3.0 days. In the last week there were 252 crashes across all versions, Windows only.
Comment 7•12 years ago
|
||
Breaking this down a bit further: 127 crashes in Aurora in the last week on Windows. First instance of Aurora crashes showed up in crash stats using the 2011082600 build. 16 crashes in on the trunk in the last week on Windows. First instance of trunk crashes showed up in crash stats using the 2011090800 build. The bug noted in Comment 2 never landed on Aurora and is only on mozilla Central, so there may possibly be different causes for this crash?
Keywords: qawanted,
regressionwindow-wanted
Comment 8•12 years ago
|
||
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=09935ede3c77&tochange=b7d269a291b6 is the regression range on the trunk according to crash stats - forgot to add it.
Comment 9•12 years ago
|
||
It's currently #17 top crasher in 8.0b1 and happens mainly close to startup.
Comment 10•12 years ago
|
||
(In reply to Marcia Knous [:marcia] from comment #7) > First instance of Aurora crashes showed up in crash stats using the 2011082600 > build. I see crashes before: https://crash-stats.mozilla.com/report/list?range_value=30&range_unit=days&signature=nsLocalFile%3A%3ACopyMove%28nsIFile*%2C%20nsAString_internal%20const%26%2C%20int%2C%20int%29&version=Firefox:8.0a2 > First instance of trunk crashes showed up in crash stats using the 2011090800 > build. I see crashes before: https://crash-stats.mozilla.com/report/list?range_value=30&range_unit=days&signature=nsLocalFile%3A%3ACopyMove%28nsIFile*%2C%20nsAString_internal%20const%26%2C%20int%2C%20int%29&version=Firefox:9.0a1 The regression range is probably in 8.0a1.
Version: unspecified → 8 Branch
Comment 11•12 years ago
|
||
(In reply to Marcia Knous [:marcia] from comment #7) > The bug noted in Comment 2 never landed on Aurora and is only on mozilla > Central, so there may possibly be different causes for this crash? The patch landed in 8.0a1, so it could be the culprit because this crash was already existing in 8.0a2 (see comment 10). One comment says it doesn't happen in Private Browsing.
Updated•12 years ago
|
Component: Networking: File → Networking: Cache
QA Contact: networking.file → networking.cache
Comment 12•12 years ago
|
||
It first appeared in 8.0a1/20110729: https://crash-stats.mozilla.com/report/list?range_value=30&range_unit=days&date=2011-08-17&signature=nsLocalFile%3A%3ACopyMove%28nsIFile*%2C%20nsAString_internal%20const%26%2C%20int%2C%20int%29&version=Firefox:8.0a1 The regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a627b24e684e&tochange=cef1817c3b13 The likely culprit is bug 673808.
Blocks: 673808
Keywords: regressionwindow-wanted
Updated•12 years ago
|
Assignee: nobody → michal.novotny
Comment 13•12 years ago
|
||
It rised to #13 top crasher in 8.0b2.
Assignee | ||
Comment 14•12 years ago
|
||
This crash occurs when the directory is deleted after the call to Exists() at http://hg.mozilla.org/mozilla-central/annotate/ca73f057dab7/xpcom/io/nsLocalFileWin.cpp#l1633 and before call to GetDirectoryEntries() at http://hg.mozilla.org/mozilla-central/annotate/ca73f057dab7/xpcom/io/nsLocalFileWin.cpp#l1651. The patch fixes just the crash and doesn't fix the race condition in the cache deleting logic. IMO it needs to be completely rewritten and I'll file a follow-up bug for this issue.
Attachment #567105 -
Flags: review?(benjamin)
Assignee | ||
Updated•12 years ago
|
Assignee: michal.novotny → nobody
Component: Networking: Cache → Networking: File
QA Contact: networking.cache → networking.file
![]() |
||
Updated•12 years ago
|
Assignee: nobody → michal.novotny
![]() |
||
Comment 15•12 years ago
|
||
This is now the #8 topcrash in 8.*, I hope we can get that reviewed and landed really soon!
Comment 16•12 years ago
|
||
Comment on attachment 567105 [details] [diff] [review] fix Review of attachment 567105 [details] [diff] [review]: ----------------------------------------------------------------- Michal, were you able to reproduce the crash on Windows? And, were you able to verify experimentally that the fix avoids the crash? ::: xpcom/io/nsLocalFileWin.cpp @@ +1649,5 @@ > > nsCOMPtr<nsISimpleEnumerator> targetIterator; > rv = target->GetDirectoryEntries(getter_AddRefs(targetIterator)); > + if (NS_FAILED(rv) > + return rv; There is a missing closing parenthesis here.
Assignee | ||
Comment 17•12 years ago
|
||
(In reply to Brian Smith (:bsmith) from comment #16) > Michal, were you able to reproduce the crash on Windows? And, were you able > to verify experimentally that the fix avoids the crash? I was able to reproduce the crash on Windows exactly at this place, so this patch should fix it. I'll try to run my test with this patch, but as I said it doesn't fix the race condition while deleting the trash. I expect that with this patch the disk cache will stop working when MoveToNative() fails, because we won't initialize the disk device in nsDiskCacheDevice::ClearDiskCache(). BTW the follow-up bug is #695003.
Assignee | ||
Comment 18•12 years ago
|
||
I've verified that and it works as I described. The patch fixes the crash but the disk device stays in an uninitialized state.
Comment 19•12 years ago
|
||
Comment on attachment 567105 [details] [diff] [review] fix Review of attachment 567105 [details] [diff] [review]: ----------------------------------------------------------------- r=bsmith for preventing the crash. Bug 695003 is the follow-up bug for actually fixing the race condition. nsLocalFileUnix.cpp already contains this error check.
Attachment #567105 -
Flags: review+
Comment 20•12 years ago
|
||
Michal, make sure you fix the syntax error I mentioned in comment 16 before you check this in.
Assignee | ||
Comment 21•12 years ago
|
||
http://hg.mozilla.org/integration/mozilla-inbound/rev/1eb208e16555
Whiteboard: [inbound]
Target Milestone: --- → mozilla10
![]() |
||
Comment 22•12 years ago
|
||
Comment on attachment 567105 [details] [diff] [review] fix This is one of the top 10 crashers in Firefox 8, so requesting approval for Aurora and Beta. Even if that only fixes the crash and not the underlying problem, I think we're better off with some problems deleting the cache than with such an action crashing the whole application.
Attachment #567105 -
Flags: approval-mozilla-beta?
Attachment #567105 -
Flags: approval-mozilla-aurora?
Comment 23•12 years ago
|
||
> I expect that with this patch the disk cache will stop working when
> MoveToNative() fails
Note to drivers: the above sounds scary, but is actually much better than a crash (the browser won't use the disk cache for their current session only: the RAM cache still works, so it's not a huge perf hit, and then things return to normal at the next restart). IMO worth landing to aurora/beta to fix the crashes.
Comment 24•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/1eb208e16555
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
![]() |
||
Updated•12 years ago
|
Crash Signature: [@ nsLocalFile::CopyMove(nsIFile*, nsAString_internal const&, int, int)] → [@ nsLocalFile::CopyMove(nsIFile*, nsAString_internal const&, int, int)]
[@ nsLocalFile::CopyMove(nsIFile*, nsAString_internal const&, bool, bool) ]
Assignee | ||
Comment 25•12 years ago
|
||
Comment 26•12 years ago
|
||
Comment on attachment 567105 [details] [diff] [review] fix Approved per todays triage meeting.
Attachment #567105 -
Flags: approval-mozilla-beta?
Attachment #567105 -
Flags: approval-mozilla-beta+
Attachment #567105 -
Flags: approval-mozilla-aurora?
Attachment #567105 -
Flags: approval-mozilla-aurora+
Comment 27•12 years ago
|
||
---------------------------------[ Triage Comment ]--------------------------------- Tracking as this is a top 10 crasher for Firefox 8 while in beta.
tracking-firefox9:
--- → +
Updated•12 years ago
|
Attachment #567105 -
Flags: review?(benjamin) → review+
Comment 28•12 years ago
|
||
http://hg.mozilla.org/releases/mozilla-aurora/rev/4da434beb537 http://hg.mozilla.org/releases/mozilla-beta/rev/a3491d0f5e7b
status-firefox8:
--- → fixed
status-firefox9:
--- → fixed
Comment 29•12 years ago
|
||
Other than checking Socorro, is there a testcase QA can use to verify this fix?
Whiteboard: [inbound] → [inbound][qa?]
Comment 30•12 years ago
|
||
There is a testcase in the attachments. I cannot crash using Mozilla/5.0 (Windows NT 5.1; rv:11.0a1) Gecko/20111122 Firefox/11.0a1 - my Win 7 VM is not working ATM. (In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #29) > Other than checking Socorro, is there a testcase QA can use to verify this > fix?
Comment 31•12 years ago
|
||
I could not reproduce the crash on Firefox 9.0b2, the latest Nighlty and Aurora using the test case from Comment 25, but checking in Socoro I found some crash reports on builds released after the fixes. https://crash-stats.mozilla.com/report/index/0062b77f-8dfd-4732-80a9-6d73e2111122 https://crash-stats.mozilla.com/report/index/ab255e67-254c-44c8-8c1a-4009f2111122 https://crash-stats.mozilla.com/report/index/db84a35e-373b-4d1b-b72b-fa8da2111121 https://crash-stats.mozilla.com/report/index/fc2ff216-170a-467f-9064-a29002111121 https://crash-stats.mozilla.com/report/index/86384a39-1b67-47b1-a387-13f362111121 https://crash-stats.mozilla.com/report/index/243b4c87-3f96-490d-b540-f5eee2111121 https://crash-stats.mozilla.com/report/index/1cc4c319-78fa-4237-89f2-19af52111121 https://crash-stats.mozilla.com/report/index/696d050f-13a2-492d-b757-f2c882111120 https://crash-stats.mozilla.com/report/index/696d050f-13a2-492d-b757-f2c882111120 https://crash-stats.mozilla.com/report/index/86961e9b-a06e-4cae-a6ed-adc832111120 https://crash-stats.mozilla.com/report/index/789b4d8a-408d-4a71-9ba1-643de2111120 https://crash-stats.mozilla.com/report/index/5c7798ea-bb49-40f3-a70e-6fa762111119 https://crash-stats.mozilla.com/report/index/530ba01f-66fc-46f9-8063-83bfd2111119 Can this bug be reopened? Thank you!
Comment 32•12 years ago
|
||
(In reply to Simona B [QA] from comment #31) > Can this bug be reopened? File a new one.
![]() |
||
Comment 33•12 years ago
|
||
(In reply to Scoobidiver from comment #32) > (In reply to Simona B [QA] from comment #31) > > Can this bug be reopened? > File a new one. Would be better to explain why here to not sound rude. :) If it's the same issue and not (completely) fixed, reopening is in order. If the original case was fixed but there's a different one with similar symptoms (which may be the case here), a new bug is the best thing to do, as the solution might need to be completely different.
Assignee | ||
Comment 34•12 years ago
|
||
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #33) > Would be better to explain why here to not sound rude. :) These crashes have a completely different backtrace. It could be that there is still some problem in nsLocalFile::CopyMove() but it is definitely on a different line.
![]() |
||
Comment 35•12 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #34) > (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #33) > > Would be better to explain why here to not sound rude. :) > > These crashes have a completely different backtrace. It could be that there > is still some problem in nsLocalFile::CopyMove() but it is definitely on a > different line. Yes, sure. I was just explaining why a new bug is useful here, as I found comment #32 to sound rude without such an explanation.
Comment 36•12 years ago
|
||
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #35) > Yes, sure. I was just explaining why a new bug is useful here, as I found > comment #32 to sound rude without such an explanation. Sorry for that. I filed bug 704817 instead.
Comment 37•12 years ago
|
||
Verified issue using the test case attached in Comment 25. I cannot crash using: Mozilla/5.0 (Windows NT 6.1; rv:9.0) Gecko/20100101 Firefox/9.0 Mozilla/5.0 (Windows NT 6.1; rv:10.0a2) Gecko/20111128 Firefox/10.0a2 Mozilla/5.0 (Windows NT 6.1; rv:11.0a1) Gecko/20111127 Firefox/11.0a1 Based on Comment 34 setting resolution to VERIFIED FIXED.
Status: RESOLVED → VERIFIED
Keywords: verified-aurora,
verified-beta
Whiteboard: [inbound][qa+] → [inbound][qa!]
You need to log in
before you can comment on or make changes to this bug.
Description
•