Bug 500926
Opened 16 years ago
Closed 14 years ago
Mozilla corrupts NFS link to server if server goes through high resource use cycle
Categories
(Firefox :: General, defect)
RESOLVED
INCOMPLETE
People
(Reporter: corporate, Unassigned)
Details
(Keywords: regression, Whiteboard: [CLOSEME 2010-11-01])
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.10) Gecko/2009051611 Gentoo Firefox/3.0.10
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.10) Gecko/2009051611 Gentoo Firefox/3.0.10
a) Home directory NFS mounted from central server
b) server has nightly backups with gzip which takes the server to high resource usage
c) mozilla clients on client machines stop responding and the NFS mount from the server gets borked.
I can understand why the clients would become unusable during high-stress periods on the server, but this shouldn't be able to corrupt the underlying NFS link, and it should recover after the stress period is concluded.
Note that nobody is actually using the clients at this time (2AM), but in the morning we come in and all of our clients are NFS-dead. Shutting down all the firefox processes before the backup is a workaround - the NFS links are fine unless mozilla is running.
Reproducible: Always
Steps to Reproduce:
1. Leave a client computer running with Firefox windows open, with the home directory mounted over NFS, through the server's backup routine.
Actual Results:
NFS link becomes corrupted and unusable. This only happens if firefox windows are open when the server is stressed.
I'm not even sure how anyone would be able to easily reproduce this issue. Is there any way you can provide in-depth information on what is causing it: not just that Firefox is involved, but what exactly Firefox is doing that triggers this?
Yep, this is a difficult one since there are several packages involved.
We have several clients in one office which have their home directories mounted NFS from a main server. The main server has regular backups with gzip, which ties up the server for close to an hour every night.
After the backup starts, the NFS mounted firefox clients fail to respond, which in itself is understandable. But after the backups are completed, the entire NFS mount is unresponsive. It takes a restart of nfs both on the server and the client to get home directories mountable again.
This is a recent problem - possibly with our last upgrade to 3.0.10, or possibly in conjunction with an upgrade to our nfs package, I'm not sure. We do regular weekly upgrades from Gentoo.
I originally thought this was an nfs build issue, but backing out nfs-utils to several previous revisions hasn't eliminated this problem. Finding the mozilla link to this was trial-and-error - working overnight last week I found out when the NFS link failed, which corresponded to the start of the backup routine. The next night I killed off all the mozilla clients before the backup and everything was fine. Now we kill them off as a routine every evening.
I suspect that 3.0.10 (or possibly an earlier version if we skipped one) introduced more disk activity, and if the client can't talk to the server quickly enough then firefox chokes. And when it chokes, it takes the NFS link with it.
I've just upgraded to 3.0.11 and I'll let you know if the same issue occurs after tonight's run. Looking at the changelog in 3.0.11 it didn't look like any NFS related issues were addressed though.
The workaround for us is to close out Mozilla every evening, and restart in the morning. This works fine. I am reluctant to move people's home directories to the clients just because of the extra backup overhead, but that might be necessary long-term if this is still a problem going forward.
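For anyone wanting to automate that evening shutdown, here is a minimal sketch of what a pre-backup job could look like. It is not the reporter's actual script. It assumes the browser process is named `firefox` (on some builds it may be `firefox-bin`) and that `pkill` from procps is available on the client; the function names are made up for illustration.

```python
import subprocess


def build_pkill_command(process_name="firefox", signal_name="TERM"):
    """Return the argv that asks every exactly-matching process to exit.

    The process name is an assumption; adjust it to whatever the local
    build calls the browser binary.
    """
    return ["pkill", f"-{signal_name}", "-x", process_name]


def stop_browsers(process_name="firefox", runner=subprocess.run):
    """Send the signal and report whether pkill itself ran cleanly."""
    result = runner(build_pkill_command(process_name), check=False)
    # pkill exits 0 when at least one process matched and 1 when none did;
    # both are acceptable before a backup. Exit codes 2 and 3 mean a usage
    # error or a fatal error.
    return result.returncode in (0, 1)
```

Calling `stop_browsers()` from a scheduled job shortly before the backup window would mirror the manual routine described above. SIGTERM is used rather than SIGKILL so the browser gets a chance to shut down normally.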
dilbot, could you try updating one of the computers to Firefox 3.5 RC3 and see if the issue still occurs?
Also, I can't confirm the bug due to the setup and configuration needed, but I added the "qawanted" keyword to this bug to see if QA could possibly test this.
Will do. I'm out on business for the next week but I'll load up FF-3.5RC3 on our client base when I get back and I'll report the results.
(In reply to comment #5)
> Will do. I'm out on business for the next week but I'll load up FF-3.5RC3 on
> our client base when I get back and I'll report the results.
Ok, thanks. By the way, the final version of Firefox 3.5 should be out by then. So check for an update first to make sure you have the latest Firefox.
Comment 7•16 years ago
dilbot, before you upgrade to Firefox 3.5.x you should first upgrade one version to 3.0.13. A newer sqlite3 version has been included. No idea if this could be a reason why it happens but please run a test.
The backup you are running is only from the profile which is located on the server? You transfer nothing from the clients?
Hi Henrik,
I did upgrade to 3.0.13 since 3.5 still isn't stable in Gentoo. Still the same problem. But then I gave up and localized the home directories, since otherwise I had to kill mozilla clients every night before the backups.
Basically, when mozilla clients are active from an NFS mounted home directory and the server gets stressed (backups or other processes), the NFS mount fails. If there are no mozilla clients running for the same processes, the NFS mount is fine.
I ran with NFS mounted home directories with mozilla running on them for several years up until this started happening. I'm not sure what changed. I thought it was on the NFS build side, but rolling back twice didn't help. So I think it's on the Mozilla side.
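One way to put timestamps on "the NFS mount fails" would be a small probe run from a client during the backup window, once with Firefox open and once without. The following is only a sketch under assumptions: the helper name `mount_responds` and the timeout are hypothetical, and nothing in this report confirms that a stat call is what hangs.

```python
import subprocess
import sys


def mount_responds(path, timeout_s=10.0):
    """Return True if a stat() of `path` completes within `timeout_s` seconds.

    False means either the call did not return in time or the path could
    not be stat'ed at all (for example, it does not exist).
    """
    # The stat runs in a child process so that a hung mount blocks the
    # child, not this script.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Deliberately no kill-and-wait here: a child blocked on an
        # unresponsive NFS mount may not exit until the mount recovers.
        return False
```

Logging the result of `mount_responds("/home/someuser")` (path hypothetical) every minute overnight would show when the mount stops answering and whether it ever comes back on its own.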
Comment 9•16 years ago
Dilbot, can you please do us a favor? If this is a regression, it would be really helpful for us to know between which versions the problem started. Can you please download and run the earlier 3.0.x builds and check when it broke? Given that information, we can check all the checkins for that particular version, which will guide us to the responsible patch. Thanks!
Keywords: regression, regressionwindow-wanted
Version: unspecified → 3.0 Branch
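Finding a regression window like the one requested above is usually done by bisection rather than by testing every build in order. A minimal sketch follows, assuming the builds are listed oldest to newest, the newest one shows the problem, and once a build is bad every later build is bad too. The function name and the `is_bad` callback are hypothetical; in this bug, `is_bad` would mean "run that build overnight through a backup and see whether the NFS mount dies".

```python
def first_bad_version(versions, is_bad):
    """Binary-search an oldest-to-newest list for the first bad version.

    Assumes versions[-1] is bad and that badness, once introduced,
    persists in every later version.
    """
    lo, hi = 0, len(versions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            # The regression is at mid or earlier.
            hi = mid
        else:
            # mid is still good, so the regression came later.
            lo = mid + 1
    return versions[lo]
```

With this approach, a list of eight builds needs three overnight tests rather than up to eight, which matters when each test takes a full backup cycle.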
Reporter
Comment 10•16 years ago
Hi Henrik,
I'd like to, but we've moved to localized home directories and backups due to this problem, and unfortunately I can't really go backwards without affecting a large number of clients.
Hopefully someone else will see this problem and have more input. I'll try migrating back to NFS mounted in a year or so and see if the problem still exists.
Comment 11•16 years ago
That's sad to hear. Such a setup is uncommon, so it will be hard to make further progress on this.
With the localized home directories, you are not able to run even a single client the old way. Is that correct?
Comment 12•14 years ago
This is a mass search for Firefox General bugs filed against version 3.0 that are UNCO and have not been changed for 200 days.
Reporter, please update to Firefox 3.6.10 or later. Firefox 3.0 is no longer supported and is no longer receiving updates. After you update, please create a fresh profile, http://support.mozilla.com/kb/managing+profiles, and test to see if your bug still exists. If you still see the bug, then please post a comment with the version you tested against, and the problem. If the issue is no longer there, please set the RESOLUTION to RESOLVED, WORKSFORME.
Whiteboard: qawanted → [CLOSEME 2010-11-01]
Comment 13•14 years ago
No reply from reporter, INCOMPLETE. Please retest with Firefox 3.6.12 or later and a new profile (http://support.mozilla.com/kb/Managing+profiles). If you continue to see this issue with the newest firefox and a new profile, then please comment on this bug.
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago
Resolution: --- → INCOMPLETE
Updated•9 years ago
Keywords: regressionwindow-wanted