Hang soon after offline network drive storing profile is reconnected.

RESOLVED WORKSFORME

Status

Product: Toolkit
Component: Storage
Priority: --
Severity: major
Status: RESOLVED WORKSFORME
Reported: 10 years ago
Modified: 10 years ago

People

Reporter: David Koppelman
Assignee: Unassigned

Tracking

Version: Trunk
Platform: x86, Windows XP
Keywords: hang, qawanted

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

10 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007100705 Minefield/3.0a9pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007100705 Minefield/3.0a9pre

The FF profile is stored on a network drive. If FF is running while that network drive is offline and the drive then reconnects, a warning dialog pops up. The dialog title is "Alert" and the text is "There was an error writing data to the disk. This error is sometimes caused by a full disk. Please restart this application." After clicking OK on the dialog and then doing something with Firefox, such as clicking a link, Firefox hangs with 100% CPU usage. The hang has lasted for minutes and only ends when Firefox is terminated from the Task Manager. On restart Firefox runs normally, until the next network drive reconnect.

Reproducible: Always

Steps to Reproduce:
1. Wait for network drive holding profile to be offline.
2. If not already running, start firefox.
3. Reconnect the network drive when offered the opportunity.
4. Click okay on error-writing-to-disk dialog.
5. Click on a link in FF.

Actual Results:
FF unresponsive with 100% CPU usage.

Expected Results:  
FF should follow the link.
(Reporter)

Comment 1

10 years ago
Perhaps this is related: FF never restarts after an application update. I get the update progress bar, which shows what looks like normal progress, then the updater exits but FF does not restart. When launched manually FF starts okay (and properly updated). I've had these update problems for about two years; the error-writing-to-disk hangs started roughly a month ago.
(Reporter)

Comment 2

10 years ago
During the hang FF is sometimes running in sqlite3.dll. Addresses encountered are: 0x6104adc2, 0x60143297. Other times it is running in ntdll.dll at 0x7c90101a. I'm new to debugging with MSDE, but I'll try to get stack traces with symbols if someone is listening.

Updated

10 years ago
Keywords: hang
Version: unspecified → Trunk
(Reporter)

Comment 3

10 years ago
I got a call stack during the hang. Here it is, along with the build ID:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a9pre) Gecko/2007100804 Minefield/3.0a9pre ID:2007100804

>	sqlite3.dll!_sqlite3_expired()  - 0x3f4b bytes	C
 	js3250.dll!JS_DHashTableOperate(JSDHashTable * table=0x00000000, const void * key=0x00000000, JSDHashOperator op=1610717551)  Line 591 + 0x9 bytes	C
 	nspr4.dll!_MD_CURRENT_THREAD()  Line 300	C
 	sqlite3.dll!_sqlite3_expired()  - 0x1c72 bytes	C
 	sqlite3.dll!_sqlite3_snprintf()  + 0x132a bytes	C
 	sqlite3.dll!_sqlite3_clear_bindings()  + 0xfdf bytes	C
 	sqlite3.dll!_sqlite3_declare_vtab()  + 0x217a bytes	C
 	sqlite3.dll!_sqlite3_declare_vtab()  + 0x4b51 bytes	C
 	sqlite3.dll!_sqlite3_step()  + 0x16 bytes	C
 	xul.dll!mozStorageStatement::ExecuteStep(int * _retval=0x0012f0ec)  Line 483	C++
 	xul.dll!nsNavBookmarks::GetBookmarkIdsForURITArray(nsIURI * aURI=0x0012f100, nsTArray<__int64> * aResult=0x6082208d)  Line 2185 + 0x13 bytes	C++
 	nspr4.dll!_PR_MD_UNLOCK(_MDLock * lock=0x027fa1f0)  Line 342 + 0xa bytes	C
 	xul.dll!NS_InvokeByIndex_P(nsISupports * that=0x603f6f15, unsigned int methodIndex=41918960, unsigned int paramCount=40, nsXPTCVariant * params=0x00000003)  Line 102	C++
 	xul.dll!AutoJSSuspendRequest::SuspendRequest()  Line 3318 + 0x9 bytes	C++
 	js3250.dll!js_Interpret(JSContext * cx=, unsigned char * pc=, long * result=)  Line 6348 + 0x12 bytes	C
(Reporter)

Comment 4

10 years ago
I'm going to try changing the product to Toolkit and the component to Storage, because the hang occurs within sqlite3.
Component: Startup and Profile System → Storage
Product: Firefox → Toolkit
QA Contact: startup → storage
Comment 5

Not really sure if this should actually block - it seems like a bit of an edge case, but I'll request it.
Flags: blocking1.9?
Comment 6

Actually, not until I can get someone to confirm it...
Flags: blocking1.9?
Keywords: qawanted
Severity: normal → major
(Reporter)

Comment 7

10 years ago
Created attachment 284375
Stack at time of error dialog and at time of hang.

I've collected a stack trace during the display of the "error writing
to disk" message. I'm attaching that, along with a trace at the time
of the subsequent hang. Here's an excerpt:

        xul.dll!nsAsyncWriteErrorDisplayer::Run()  Line 1619 + 0x1a bytes       C++
        xul.dll!nsThread::ProcessNextEvent(int mayWait=1, int * result=0x0012fc88)  Line 491    C++
        xul.dll!NS_ProcessNextEvent_P(nsIThread * thread=0x00000001, int mayWait=1)  Line 227 + 0xd bytes       C++

It looks like the error dialog is originating from Firefox.
Comment 8

The storage service is throwing that up. So, I think this is coming about because we run our disk writes on another thread for storage...
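
For anyone unfamiliar with the pattern mentioned above, here is a minimal sketch in Python of a write that runs on a worker thread and hands any failure back to the caller. It is not the Mozilla storage code and makes no claim about how that code is structured; the names `writer` and `run_write` are hypothetical. The point it illustrates is only that a failed background write is reported separately from the action that triggered it, which would be consistent with the error dialog appearing when the drive reconnects rather than when a link is clicked.

```python
import queue
import threading


def writer(path, chunks, errors):
    """Write chunks to path on a worker thread; report failures via a queue."""
    try:
        with open(path, "ab") as f:
            for chunk in chunks:
                f.write(chunk)
                f.flush()
    except OSError as exc:
        # The worker does not show any UI itself; it hands the error
        # to whoever started it (hypothetical design, for illustration).
        errors.put(exc)


def run_write(path, chunks):
    """Run the write in the background and return the reported error, if any."""
    errors = queue.Queue()
    worker = threading.Thread(target=writer, args=(path, chunks, errors))
    worker.start()
    worker.join()
    return None if errors.empty() else errors.get()
```

Calling `run_write` with a path on an unreachable volume returns the `OSError` raised inside the worker, and the caller decides what to do with it, for example show an alert.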
(Reporter)

Comment 9

10 years ago
Perhaps this is relevant: When the profile is offline and I follow
links, the size of places.sqlite changes but the modification time does
not (at least up to the minute).

When I reconnect it looks like places.sqlite is being clobbered by the
pre-disconnect version: the size of places.sqlite returns to what I
guess is the pre-disconnect size and maybe modification time.

A bit more detail:

  Network volume holding profile is offline.

  Visit some links (maybe at around 17:00).

  =>  places.sqlite: Mod Time: 15:21, Size  360448
      Size has changed (trust me) but mod time has not.

  Have Windows re-connect the network volume and re-sync.

  => places.sqlite: Mod Time: 15:22, Size  355328
     Size reverted.

  Visit a link.
  FF hangs, kill and restart.

  => places.sqlite: Mod Time  18:36, Size  355328
     Things seem normal again.

  If I visit sites while the drive is online, the modification
  time for places.sqlite changes as expected.
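
The size and modification-time observations above were made by hand. A small polling script can record them automatically. The following is a minimal sketch in Python and is not part of the original report; the function names `snapshot` and `watch`, the polling interval, and the number of rounds are assumptions chosen for illustration.

```python
import os
import time
from datetime import datetime


def snapshot(path):
    """Return (size_in_bytes, modification_time) for the file at path."""
    st = os.stat(path)
    return st.st_size, datetime.fromtimestamp(st.st_mtime)


def watch(path, interval=5.0, rounds=3):
    """Poll the file and print a line whenever size or mod time changes."""
    last = None
    for _ in range(rounds):
        current = snapshot(path)
        if current != last:
            size, mtime = current
            now = datetime.now()
            print(f"{now:%H:%M:%S}  size={size}  mtime={mtime:%H:%M:%S}")
            last = current
        time.sleep(interval)
```

Running `watch` on the profile's places.sqlite while disconnecting and reconnecting the network volume should produce a log comparable to the one above, with one line per change in size or modification time.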

Updated

10 years ago
Depends on: 402615
Comment 10

Still able to reproduce this?
(Reporter)

Comment 11

10 years ago
(In reply to comment #10)
> still able to reproduce this?

I tried once, today, and did not get the crash (it used to crash reliably).  I'll post back if I succeed in crashing it.

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b3pre) Gecko/2008013104 Firefox/2.0.0.11 ID:2008013104

Comment 12

Alright - I'm going to resolve this as WORKSFORME since I think removing async IO would have made this crash go away.  If you see it again, please reopen it.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → WORKSFORME
(Reporter)

Comment 13

10 years ago
(In reply to comment #12)
> Alright - I'm going to resolve this as WORKSFORME since I think removing async
> IO would have made this crash go away.  If you see it again, please reopen
> it.

Shouldn't the resolution be 'fixed'?

Comment 14

Only if we know what fixed it.