User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007100705 Minefield/3.0a9pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007100705 Minefield/3.0a9pre

FF profile is stored on a network drive. If FF is running with that network drive offline and the network drive reconnects a warning dialog pops up. The warning dialog title is "Alert" and the text is "There was an error writing data to the disk. This error is sometimes caused by a full disk. Please restart this application."

After clicking okay on the dialog and then doing something with firefox, such as clicking a link, firefox will hang with 100% CPU usage. The hang has lasted for minutes and only ends when firefox is terminated from the task manager. On re-start firefox runs normally, until the next network drive reconnect.

Reproducible: Always

Steps to Reproduce:
1. Wait for network drive holding profile to be offline.
2. If not already running, start firefox.
3. Reconnect the network drive when offered the opportunity.
4. Click okay on error-writing-to-disk dialog.
5. Click on a link in FF.

Actual Results:
FF unresponsive with 100% CPU usage.

Expected Results:
FF should follow the link.
Perhaps this is related: FF never restarts after an application update. I get the update progress bar, which shows what looks like normal progress, then the updater exits but FF does not restart. When launched manually FF starts okay (and properly updated). I've had these update problems for about two years; the error-writing-to-disk hangs started roughly a month ago.
During the hang FF is sometimes running in sqlite3.dll. Addresses encountered are: 0x6104adc2, 0x60143297. Other times it is running in ntdll.dll at 0x7c90101a. I'm new to debugging with MSDE, but I'll try to get stack traces with symbols if someone is listening.
I got a call stack during the hang. Here it is, along with the build ID:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007100804 Minefield/3.0a9pre ID:2007100804

> sqlite3.dll!_sqlite3_expired() - 0x3f4b bytes C
  js3250.dll!JS_DHashTableOperate(JSDHashTable * table=0x00000000, const void * key=0x00000000, JSDHashOperator op=1610717551) Line 591 + 0x9 bytes C
  nspr4.dll!_MD_CURRENT_THREAD() Line 300 C
  sqlite3.dll!_sqlite3_expired() - 0x1c72 bytes C
  sqlite3.dll!_sqlite3_snprintf() + 0x132a bytes C
  sqlite3.dll!_sqlite3_clear_bindings() + 0xfdf bytes C
  sqlite3.dll!_sqlite3_declare_vtab() + 0x217a bytes C
  sqlite3.dll!_sqlite3_declare_vtab() + 0x4b51 bytes C
  sqlite3.dll!_sqlite3_step() + 0x16 bytes C
  xul.dll!mozStorageStatement::ExecuteStep(int * _retval=0x0012f0ec) Line 483 C++
  xul.dll!nsNavBookmarks::GetBookmarkIdsForURITArray(nsIURI * aURI=0x0012f100, nsTArray<__int64> * aResult=0x6082208d) Line 2185 + 0x13 bytes C++
  nspr4.dll!_PR_MD_UNLOCK(_MDLock * lock=0x027fa1f0) Line 342 + 0xa bytes C
  xul.dll!NS_InvokeByIndex_P(nsISupports * that=0x603f6f15, unsigned int methodIndex=41918960, unsigned int paramCount=40, nsXPTCVariant * params=0x00000003) Line 102 C++
  xul.dll!AutoJSSuspendRequest::SuspendRequest() Line 3318 + 0x9 bytes C++
  js3250.dll!js_Interpret(JSContext * cx=, unsigned char * pc=, long * result=) Line 6348 + 0x12 bytes C
I'm going to try changing the product to toolkit and the component to storage because the hang occurs within sqlite3.
Not really sure if this should actually block - it seems like a bit of an edge case, but I'll request it.
actually, not until I can get someone to confirm it...
Created attachment 284375 [details]
Stack at time of error dialog and at time of hang.

I've collected a stack trace during the display of the "error writing to disk" message. I'm attaching that, along with a trace at the time of the subsequent hang. Here's an excerpt:

xul.dll!nsAsyncWriteErrorDisplayer::Run() Line 1619 + 0x1a bytes C++
xul.dll!nsThread::ProcessNextEvent(int mayWait=1, int * result=0x0012fc88) Line 491 C++
xul.dll!NS_ProcessNextEvent_P(nsIThread * thread=0x00000001, int mayWait=1) Line 227 + 0xd bytes C++

It looks like the error dialog is originating from Firefox.
the storage service is throwing that up. So, I think this is coming about because we run our disk writes on another thread for storage...
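To illustrate the general pattern being described (writes performed on a background thread, with failures reported back to the main thread, which then shows the dialog), here is a minimal Python sketch. It is not the Mozilla storage code; the names `writer_loop` and `run_writes` and the queue-based hand-off are assumptions made purely for illustration.

```python
import queue
import threading


def writer_loop(jobs, errors):
    """Drain write jobs on a background thread.

    Failures are not raised here; they are queued so that the main
    thread can decide how to surface them (e.g. an alert dialog).
    """
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work, stop the thread
            break
        path, data = job
        try:
            with open(path, "ab") as f:
                f.write(data)
        except OSError as exc:
            errors.put((path, exc))


def run_writes(job_list):
    """Run all (path, bytes) jobs on a writer thread; return the failures."""
    jobs = queue.Queue()
    errors = queue.Queue()
    worker = threading.Thread(target=writer_loop, args=(jobs, errors))
    worker.start()
    for job in job_list:
        jobs.put(job)
    jobs.put(None)
    worker.join()

    failed = []
    while not errors.empty():
        failed.append(errors.get())
    return failed
```

The point of the sketch is that the code which issued the write has already moved on by the time the failure is noticed, so the error surfaces later and from a different place than the original call.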
Perhaps this is relevant: When the profile is offline and I follow links the size of places.sqlite changes but the modification time does not (at least up to the minute). When I reconnect it looks like places.sqlite is being clobbered by the pre-disconnect version: the size of places.sqlite returns to what I guess is the pre-disconnect size and maybe modification time.

A bit more detail:

Network volume holding profile is off line. Visit some links (maybe at around 17:00).
=> places.sqlite: Mod Time: 15:21, Size 360448
Size has changed (trust me) but mod time does not.

Have Windows re-connect the network volume and re-sync.
=> places.sqlite: Mod Time: 15:22, Size 355328
Size reverted.

Visit a link. FF hangs, kill and restart.
=> places.sqlite: Mod Time 18:36, Size 355328
Things seem normal again.

If I visit sites while the drive is on line the modification time for places.sqlite changes as expected.
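For anyone who wants to record the same observations without reading them off Explorer, a small Python sketch along these lines could be used. The helper names `snapshot` and `changed` are hypothetical, and the path to places.sqlite has to be supplied by the caller.

```python
import os


def snapshot(path):
    """Return (size_in_bytes, modification_time_in_seconds) for path."""
    st = os.stat(path)
    return st.st_size, st.st_mtime


def changed(before, after):
    """Report which of size and mtime differ between two snapshots."""
    return {
        "size": before[0] != after[0],
        "mtime": before[1] != after[1],
    }
```

Taking one snapshot before visiting links and another afterwards, then calling `changed(before, after)`, shows whether the size moved while the modification time stayed put, which is the pattern described above for the offline case.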
still able to reproduce this?
(In reply to comment #10)
> still able to reproduce this?

I tried once, today, and did not get the crash (it used to crash reliably). I'll post back if I succeed in crashing it.

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b3pre) Gecko/2008013104 Firefox/188.8.131.52 ID:2008013104
Alright - I'm going to resolve this as WORKSFORME since I think removing async IO would have made this crash go away. If you see it again, please reopen it.
(In reply to comment #12)
> Alright - I'm going to resolve this as WORKSFORME since I think removing async
> IO would have made this crash go away. If you see it again, please reopen
> it.

Shouldn't the resolution be 'fixed'?
Only if we know what fixed it.