Closed
Bug 206642
Opened 22 years ago
Closed 19 years ago
Mozilla hangs on startup - strace shows it's spinning trying to read the XUL.mfasl file
Categories
(Core :: XPCOM, defect)
RESOLVED
WORKSFORME
People
(Reporter: mohit_aron, Unassigned)
Attachments (1 file): 13.88 KB, application/octet-stream
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312
Very frequently, Mozilla seems to hang when I attempt to start it - that is, no
browser window comes up. An strace of mozilla in this condition shows it's spinning
on the XUL.mfasl file. The problem goes away if I kill the spinning mozilla
process, delete the XUL.mfasl file, and then restart mozilla.
The problem happens especially if Mozilla crashes for some reason. But even after a
clean shutdown, I've often seen this problem when I attempt to start Mozilla
the next time.
This problem has also been observed by many others in my company. In fact,
the solution (deleting the XUL.mfasl file) is now documented in a "helplist"
that we maintain over here for our engineers.
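The workaround described above can be sketched as a small shell function. This is only a minimal sketch, not an official Mozilla tool: the function name remove_fastload is hypothetical, and the default profile root ($HOME/.mozilla) is an assumption about where the profile lives. Any hung mozilla process should be killed first, as described in the report.

```shell
#!/bin/sh
# Hypothetical helper: delete every XUL.mfasl found under a profile root.
# The default root ($HOME/.mozilla) is an assumption; pass another path if needed.
remove_fastload() {
    profile_root="${1:-$HOME/.mozilla}"
    # Do nothing if the directory does not exist.
    [ -d "$profile_root" ] || return 0
    # -print lists each file before it is removed.
    find "$profile_root" -type f -name 'XUL.mfasl' -print -exec rm -f {} \;
}
```

Usage would be something like: remove_fastload "$HOME/.mozilla". Other profile files are left untouched, and Mozilla regenerates the fastload file on a later start.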
Reproducible: Sometimes
Steps to Reproduce:
Don't know how to reproduce this problem deterministically. However, if Mozilla
were made to crash a couple of times and then restarted, perhaps this problem
might show up.
Actual Results:
I cannot reproduce the problem deterministically - so don't know what to say here.
Expected Results:
Mozilla should start up and show the browser window rather than not showing
anything.
Comment 1•22 years ago
Can you be extremely specific about which version of mozilla you are
experiencing this problem with?
I see that the build id for the browser that you used to file this bug is
"...; rv:1.3) Gecko/20030312", but have you actually experienced this problem
with that browser (for sure)?
This was a very common problem in builds prior to about 01/31/2003, but since
that time we have had no reports at all of people experiencing this problem.
(See bug 169777, and bug 189832).
If you can, in fact, reproduce this with a build ID after 20030131,
could you please (g)zip up the XUL.mfasl and email it to me so that I can have
a look at that file? Thanks. (Note: there is no personal information contained
in that file, aside from the path where you installed mozilla.)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 2 (Reporter)•22 years ago
XUL.mfasl file that causes mozilla to hang upon startup is attached.
Comment 3 (Reporter)•22 years ago
> I see that the build id for the browser that you used to file this bug is
> "...; rv:1.3) Gecko/20030312", but have you actually experienced this problem
> with that browser (for sure).
I'm absolutely experiencing problems with this browser. So are others in the
company. I just ran into the problem again 2 minutes back. I've attached the
corresponding XUL.mfasl file - as soon as I removed this file, mozilla was back in
business.
Comment 4•22 years ago
This fastload file is not from a 1.2 or earlier build, is it?
(Those are known to be broken.)
Comment 5•22 years ago
Thanks for providing the XUL.mfasl file. That helps a lot.
I hacked out some validity checks in my build so that it would use this
fastload file, and I can reproduce this hang (or at least I can see that I am
doomed and am wrongly seeking past the EOF).
I dumped out the format, and noticed the following things:
1) MFL_FILE_VERSION is 4, which means the build that generated this file is
from after brendan landed the final touches to fix the previous problem.
2) There is no nsSystemPrincipal singleton serialized ID map into the fastload
file.
3) There are no ".js" documents in the document map in the file.
I spoke with brendan about this and, given that the chrome path begins with
"/auto/...", he suspects that this may be related to a problem with NFS.
Can you provide further details about what OS types and versions are running on
the client and the (NFS) server, which version of NFS each is running, and any
other data about the mountd and nfsd binaries you have installed (e.g.,
on redhat, /usr/sbin/rpc.mountd --version)?
Assignee: dougt → jrgm
Comment 6•22 years ago
If on Redhat Linux, maybe something like 'rpm -qa |egrep -i "(nfs|mount)"'
and then 'rpm -qi' on each of the rpms named from the output of the first
command.
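The two steps suggested above can be combined into one loop. This is a rough sketch rather than a required procedure: the function name describe_nfs_packages is hypothetical, and grep -E -i is used here as the equivalent of egrep -i.

```shell
#!/bin/sh
# Print full package details (rpm -qi) for every installed RPM whose
# name matches "nfs" or "mount", case-insensitively.
describe_nfs_packages() {
    rpm -qa | grep -E -i '(nfs|mount)' | while read -r pkg; do
        rpm -qi "$pkg"
    done
}
```

Calling describe_nfs_packages on a Red Hat system should print one rpm -qi record per matching package.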
Comment 7 (Reporter)•22 years ago
% rpm -qa | egrep -i "(nfs|mount)"
mount-2.10r-0.6.x
nfs-utils-0.3.1-0.6.x.1
% rpm -qi mount-2.10r-0.6.x
Name : mount Relocations: (not relocateable)
Version : 2.10r Vendor: Red Hat, Inc.
Release : 0.6.x Build Date: Tue 10 Apr 2001 02:16:11 PM PDT
Install date: Thu 30 Jan 2003 07:45:45 AM PST Build Host: porky.devel.redhat.com
Group : System Environment/Base Source RPM: mount-2.10r-0.6.x.src.rpm
Size : 115615 License: GPL
Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Summary : Programs for mounting and unmounting filesystems.
Description :
The mount package contains the mount, umount, swapon and swapoff
programs. Accessible files on your system are arranged in one big
tree or hierarchy. These files can be spread out over several
devices. The mount command attaches a filesystem on some device to
your system's file tree. The umount command detaches a filesystem
from the tree. Swapon and swapoff, respectively, specify and disable
devices and files for paging and swapping.
% rpm -qi nfs-utils-0.3.1-0.6.x.1
Name : nfs-utils Relocations: (not relocateable)
Version : 0.3.1 Vendor: Red Hat, Inc.
Release : 0.6.x.1 Build Date: Tue 17 Apr 2001 09:56:16 AM PDT
Install date: Thu 30 Jan 2003 07:50:48 AM PST Build Host: porky.devel.redhat.com
Group : System Environment/Daemons Source RPM: nfs-utils-0.3.1-0.6.x.1.src.rpm
Size : 524367 License: GPL
Packager : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Summary : NFS utlilities and supporting daemons for the kernel NFS server.
Description :
The nfs-utils package provides a daemon for the kernel NFS server and
related tools, which provides a much higher level of performance than the
traditional Linux NFS server used by most users.
This package also contains the showmount program. Showmount queries the
mount daemon on a remote host for information about the NFS (Network File
System) server on the remote host. For example, showmount can display the
clients which are mounted on that host.
Comment 8 (Reporter)•22 years ago
% uname -rs
Linux 2.2.19-6.2.1
The NFS server my company uses is one from Netapp - their 820 filer.
Comment 9•22 years ago
Here is some information about problems between redhat clients and NFS servers
(including Netapp), although the reports seem to be more about 2.4.x kernels and
7.x redhat. It suggests a possible improvement from setting nfs version 2, or
from limiting the rsize/wsize to 8KB.
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=65069
(Yet, I don't quite grok why it appears to selectively drop the ".js" documents
from the serialization (although some ".xul" documents are missing too)).
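The mount settings suggested in that report (NFS version 2, rsize/wsize limited to 8KB) might look like the command built below. This is a sketch under assumptions: the option names nfsvers, rsize and wsize are the ones documented for the Linux NFS client, but whether the mount 2.10r on this system accepts all of them has not been checked here, and the function name, export, and mount point are hypothetical. The function only prints the command so it can be reviewed before anything is run. Where the home directory comes from an automounter, the same options would presumably go in the automount map instead.

```shell
#!/bin/sh
# Build (but do not run) a mount command that forces NFS version 2 and
# caps the read/write transfer sizes at 8 KB (8192 bytes).
nfs_mount_command() {
    export_path="$1"   # e.g. filer:/vol/home (hypothetical export)
    mount_point="$2"   # e.g. /mnt/home (hypothetical mount point)
    printf 'mount -t nfs -o nfsvers=2,rsize=8192,wsize=8192 %s %s\n' \
        "$export_path" "$mount_point"
}
```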
Comment 10•22 years ago
By the way, what does 'cat /etc/redhat-release' say? And are there any suspicious
errors in /var/log/messages?
(Sorry to ask so many questions, but I just don't see how, under typical
conditions, fastload would serialize only some of the streams).
Comment 11 (Reporter)•22 years ago
% cat /etc/redhat-release
Red Hat Linux release 6.2 (Zoot)
No, there's nothing in /var/log/messages that I can connect to this bug.
Can you please explain how this bug is affected by NFS? Just out of curiosity.
Also, I routinely have to delete my XUL.mfasl file to start up mozilla. I don't
see a speed benefit anywhere. So why is this file even kept around? Given the
number of problems that surround it, it seems it's better to get rid of this file
altogether.
Comment 12•22 years ago
Mohit Aron, the symptoms you are reporting have not been reported by others.
They are not known bugs in FastLoad. From my years in the mid-late-80s hacking
on NFS for SGI, they seem like client NFS buffered-write-vs.-seek bugs.
You won't see a FastLoad speedup without a local filesystem. Even then, on very
fast hardware, you won't see as big a speedup with FastLoad as you would on
older hardware. But FastLoad does improve Ts, which is a performance number we
measure and keep ever-improving (mostly), via tinderbox. Some of our
tinderboxes are older machines. None uses NFS for the filesystem containing the
profile.
/be
Comment 13 (Reporter)•22 years ago
> Mohit Aron, the symptoms you are reporting have not been reported by others.
> They are not known bugs in FastLoad. From my years in the mid-late-80s
> hacking on NFS for SGI, they seem like client NFS buffered-write-vs.-seek
> bugs.
>
> You won't see a FastLoad speedup without a local filesystem. Even then, on
> very fast hardware, you won't see as big a speedup with FastLoad as you would
> on older hardware. But FastLoad does improve Ts, which is a performance
> number we measure and keep ever-improving (mostly), via tinderbox. Some of
> our tinderboxes are older machines. None uses NFS for the filesystem
> containing the profile.
I don't even remember the last time I had my home directory on a local
filesystem rather than an NFS one - and I've been using computers for more
than 10 years now (Bachelor's, PhD, and 3 years in industry). If bugs in
fastload are triggered by commercial implementations of NFS, then it'd be
better to get rid of the fastload mechanism. I can't imagine anyone doing
serious work putting his/her directory on a local filesystem.
You say other users haven't reported this bug - I don't see what I can do
about that. I work at Google - and people routinely see this problem here.
And they've been seeing it for a long time - it's just that nobody cared to file
a bug. I just took the effort of filing one. Please don't tell me that this is
a non-problem because others haven't reported it.
Comment 14•22 years ago
Build-Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1)
Gecko/20030212 Debian/1.2.1-9woody1
Reading the comments, the developers seem to be focussing on possible NFS
problems as the cause?
I've also had this problem on this version (but not earlier versions, I don't
think). As you can see I'm on 20030212 on Debian Woody.
Only problem is, I have no NFS support on the system whatsoever. Not even in the
kernel (custom compiled, both settings set to No). I do have Samba support
(server and client), but purely for documents - nothing related to either the
Mozilla system files or the home directory.
Apart from that, the symptoms are exactly as described - in my case on a Celeron
466 w/192M RAM, it effectively locks the system up if I don't kill it fairly
straight away (ie. when it starts using more than about 30M). It's not directly
reproducible, but my Mozilla rarely crashes so more often than not it's after a
clean shutdown.
HTH
Andrew
Comment 15•22 years ago
Yes, this was a common problem with "rv:1.2.1) Gecko/20030212". But since 1.3
final, aside from the situation noted in this bug, this has not been a known issue.
Comment 16•22 years ago
It's hard to fix a bug that people don't report. Thanks to Mohit for filing
this one, but we're still not going to get far without a way to reproduce the
problem here where jrgm and I sit.
One thing that would help: if you can use tcpdump or ethereal or something
similar to snoop NFS packets when you first start the browser *without* a
XUL.mfasl file in your profile directory. Once the browser is up, if you then
quit and start again, and find the browser hanging as described here, I would
love to see the packet trace (voluminous though it would be).
/be
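A capture along the lines requested above could be started with a command like the one built below. This is a hedged sketch: the function name, the interface name (eth0), and the output file name are assumptions, "port 2049" assumes NFS traffic uses the standard NFS port, and -s 0 is taken to mean "capture whole packets", which may not hold for very old tcpdump versions. The function only prints the command; running it normally requires root.

```shell
#!/bin/sh
# Build (but do not run) a tcpdump command that writes NFS packets
# to a capture file for later inspection.
nfs_capture_command() {
    iface="${1:-eth0}"              # network interface (assumption)
    outfile="${2:-nfs-trace.pcap}"  # capture file name (hypothetical)
    # -i selects the interface, -s 0 asks for full packets,
    # -w writes raw packets to the file, "port 2049" filters NFS traffic.
    printf 'tcpdump -i %s -s 0 -w %s port 2049\n' "$iface" "$outfile"
}
```

The capture would be started before launching the browser without a XUL.mfasl file, left running across the quit and restart, and stopped once the hang appears.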
Comment 17 (Reporter)•22 years ago
I'm trying to understand why this bug is being blamed on NFS. If there really
was such a blatant problem with NFS, I should be seeing the problem with a host
of other files on my NFS directory. Why only Mozilla then?
Also, please notice that the problem happens randomly - so I can't
deterministically be running a tcpdump when the problem happens.
Updated•19 years ago
Assignee: jrgmorrison → nobody
QA Contact: scc → xpcom
Comment 18•19 years ago
Mohit, can you reproduce this in builds dated 20060804 or newer (i.e., with
bug 341595 fixed)?
Comment 19•19 years ago
Please REOPEN the bug if it can be reproduced in Firefox 2.0b2 or newer.
-> WORKSFORME
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → WORKSFORME