Open Bug 433129 Opened 16 years ago Updated 6 months ago

nfs hosted profiles/browsers result in broken search and history because NFS does not follow POSIX filesystem semantics and does not correctly implement posix advisory locking. Add storage.nfs_filesystem preference

Categories

(Toolkit :: Storage, defect, P5)

x86
Linux
defect

Tracking

()

Tracking Status
firefox-esr17 19+ fixed

People

(Reporter: flin.gs, Assigned: stransky)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [leave open])

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.13) Gecko/20080311 Iceweasel/2.0.0.13 (Debian-2.0.0.13-0etch1)
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.13) Gecko/20080311 Iceweasel/2.0.0.13 (Debian-2.0.0.13-0etch1)

Summary:

After about 30 minutes, or near abouts, the search engine feature on the top right hand side of Firefox breaks. When I type, it appears with the letters I'm typing, however, when I push enter, nothing happens. When the search field isn't broken, I can push enter to search my keywords. I have included the error message from Firebug below. Hope this helps, I'm on Debian Etch, using Iceweasel 2.0.0.13.

Error from Firebug:

[Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [mozIStorageStatementWrapper.step]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: file:///usr/lib/iceweasel/components/nsSearchService.js :: epsGetAttr :: line 2841" data: no]
[Break on this error] var value = null;
nsSearchService.j... (line 2841)
[Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [mozIStorageStatementWrapper.step]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: file:///usr/lib/iceweasel/components/nsSearchService.js :: epsGetAttr :: line 2841" data: no]
[Break on this error] var value = null;
nsSearchService.j... (line 2841)
[Exception... "Component is not available" nsresult: "0x80040111 (NS_ERROR_NOT_AVAILABLE)" location: "JS frame :: file:///usr/lib/iceweasel/components/nsSessionStore.js :: sss_saveState :: line 1753" data: no]
[Break on this error] oState.session = { state: ((this._loadState == STATE_RUNNING) ? STATE_RUNNIN...
nsSessionStore.js (line 1753)

Thanks for your time.

Reproducible: Always

Steps to Reproduce:
1. Browse random websites (may take a while)
2. Use the search engine from time to time (I use it about 10 - 15 times per session)
3. Search feature will break after a while, and will not search when you push enter, or click the search icon.
Actual Results:  
Search feature breaks, forcing you to type the URL of your search engine, and search that way, instead of using the quick search feature.


My addons include:

Adblock: 0.5.3.043
Copy Selected Links: 2.1
DownThemAll!: 1.0.1
Firebug: 1.05
FireFTP: 0.97.1
Greasemonkey: 0.7.20080121.0
Live HTTP Headers: 0.13.1
Reporter, does this still happen with a recent build of Firefox, e.g. Firefox 3.0.10 or the Firefox 3.5 beta 4? You should use a build downloaded directly from the Mozilla server to check for this issue. Also, please run Firefox in Safe Mode (http://support.mozilla.com/kb/Safe+Mode) to check whether an add-on could have caused this problem. Thanks.
Version: unspecified → 2.0 Branch
This bug was reported using a version of Firefox that security and stability updates are no longer provided for.  All users are strongly encouraged to upgrade to Firefox 3 by selecting 'Check for Updates' in the Help menu or by going to http://www.mozilla.com/en-US/firefox/firefox.html

If you can no longer reproduce this bug using the latest Firefox 3.0.x version, please change the status of this bug to 'RESOLVED' 'WORKSFORME'.

If you can still reproduce this bug, please provide additional details to help resolve this issue.
Just redirected here from bug 376084.

This is still an issue for me without any of the named extensions installed.

Whenever this happens to me in the search bar, I get the following error in the
Error console (Ubuntu Jaunty, Firefox 3.0.11):

Error: [Exception... "Component returned failure code: 0x80004005
(NS_ERROR_FAILURE) [mozIStorageStatementWrapper.step]" nsresult: "0x80004005
(NS_ERROR_FAILURE)" location: "JS frame ::
file:///usr/lib/firefox-3.0.11/components/nsSearchService.js :: epsGetAttr ::
line 2921" data: no]
Source File: file:///usr/lib/firefox-3.0.11/components/nsSearchService.js
Line: 2921

This happens after the following error appears:

Error: Permission denied to call method Location.toString

The address bar does not work either, but no error is returned when I try to
use it. The only extension installed is the Ubuntu Firefox Enhancements.
Adam, can you please install Firefox 3.5.1 and check if this problem is fixed for you? Please make a copy of your profile first, just to make sure we don't repair any of the profile files when Firefox 3.5.1 starts up. Thanks.
Same issue, using the firefox-3.5 package from Jaunty repositories.

Error: [Exception... "Component returned failure code: 0x80630002 [mozIStorageStatementWrapper.step]"  nsresult: "0x80630002 (<unknown>)"  location: "JS frame :: file:///usr/lib/xulrunner-1.9.1.1/components/nsSearchService.js :: epsGetAttr :: line 3263"  data: no]
Source File: file:///usr/lib/xulrunner-1.9.1.1/components/nsSearchService.js
Line: 3263

firefox-3.5:
  Installed: 3.5.1+build1+nobinonly-0ubuntu0.9.04.1
  Candidate: 3.5.1+build1+nobinonly-0ubuntu0.9.04.1
  Version table:
 *** 3.5.1+build1+nobinonly-0ubuntu0.9.04.1 0
        500 http://fireball-mirror.phys.wvu.edu jaunty-security/universe Packages
        500 http://fireball-mirror.phys.wvu.edu jaunty-updates/universe Packages
        100 /var/lib/dpkg/status
     3.5~b4~hg20090330r24021+nobinonly-0ubuntu1 0
        500 http://fireball-mirror.phys.wvu.edu jaunty/universe Packages
Please don't use the Ubuntu version of Firefox. Instead download Firefox from http://www.mozilla.com/.
Downloaded Firefox 3.5.1 has the same issue.

Error: uncaught exception: [Exception... "Component returned failure code: 0x80630002 [mozIStorageStatementWrapper.step]"  nsresult: "0x80630002 (<unknown>)"  location: "JS frame :: file:///home/users/dorsey/Desktop/firefox/components/nsSearchService.js :: epsGetAttr :: line 3263"  data: no]
Adam have you other extensions installed? If yes please try in Safe Mode (see link in comment 1) and/or create a fresh profile (http://support.mozilla.com/en-US/kb/Managing+profiles)
Adblock Plus is installed.  I will remove it and use a fresh profile for a while and see if the problem continues.
The issue is still occurring with a fresh profile and no addons.  

Error: uncaught exception: [Exception... "Component returned failure code: 0x80630002 [mozIStorageStatementWrapper.step]"  nsresult: "0x80630002 (<unknown>)"  location: "JS frame :: file:///home/users/dorsey/Desktop/firefox/components/nsSearchService.js :: epsGetAttr :: line 3263"  data: no]

Also got this one when asked to save a password, don't know if it's related or not:

Error: uncaught exception: [Exception... "'Couldn't write to database, login not added.' when calling method: [nsILoginManagerStorage::addLogin]"  nsresult: "0x8057001e (NS_ERROR_XPC_JS_THREW_STRING)"  location: "JS frame :: file:///home/users/dorsey/Desktop/firefox/components/nsLoginManager.js :: anonymous :: line 451"  data: no]
Where are your profiles located? Under ~/.mozilla/firefox? Does it also happen when you create a new profile on another folder?
Yes, my profiles are located in the default directory.  I will try creating a profile in a different directory.
I created a profile in a different directory outside of my home directory (my home directory is NFS mounted, so I wanted to be sure that wasn't an issue) and had the same issue again.

Error: [Exception... "Component returned failure code: 0x80630002 [mozIStorageStatementWrapper.step]"  nsresult: "0x80630002 (<unknown>)"  location: "JS frame :: file:///home/users/dorsey/Desktop/firefox/components/nsSearchService.js :: epsGetAttr :: line 3263"  data: no]
Source File: file:///home/users/dorsey/Desktop/firefox/components/nsSearchService.js
Line: 3263
Shawn, is there something wrong with the creation of the statement? Because it also happens with a fresh profile could it be application related?
Mmh I wonder if it's related that the profiles.ini is stored on the nfs mounted folder. Adam, can you please create a fresh user account on your system and try the same again? This account should not be stored on nfs.
0x80630002 is NS_ERROR_STORAGE_IOERR.  The file IO operations are failing and SQLite is telling us that they failed.
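The raw code can be decoded by hand. The following is a minimal sketch, assuming the usual XPCOM nsresult layout (failure bit, then module number plus an offset of 0x45, then a 16-bit module-specific code). The struct and function names are made up for illustration and are not Mozilla APIs.

```cpp
#include <cassert>
#include <cstdint>

// Assumed nsresult layout:
//   bit 31      severity (1 = failure)
//   bits 16-30  module number + 0x45
//   bits 0-15   module-specific error code
struct DecodedResult {
  bool failed;
  uint32_t module;  // 0 for legacy generic codes such as NS_ERROR_FAILURE
  uint32_t code;
};

constexpr uint32_t kModuleBaseOffset = 0x45;

// Hypothetical helper: split an nsresult into its three fields.
DecodedResult decodeNsresult(uint32_t rv) {
  const uint32_t field = (rv >> 16) & 0x7fff;
  DecodedResult d;
  d.failed = (rv >> 31) != 0;
  // Generic codes like 0x80004005 do not carry the module offset.
  d.module = field >= kModuleBaseOffset ? field - kModuleBaseOffset : 0;
  d.code = rv & 0xffff;
  return d;
}
```

Under that assumption, 0x80630002 decodes to module 30 with code 2, and the 0x8057001e seen in the login manager error decodes to module 18 with code 0x1e, which is consistent with the two errors coming from different subsystems (storage and XPConnect respectively).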
The last test account was not created on NFS.  It was created outside my home directory on the local hard disk, and I had the same error.
I just had a thought.  Firefox is running out of my home directory.  I'm going to move it to a local directory as well, and make everything run off of the local hard disk with no NFS involved.
Adam, any update?
It hasn't broken again yet, but sometimes it took a while to happen.  I'll try to use firefox heavily today and see if I can get it to break.
I've been using firefox since August 17 with no issue.  As soon as I moved everything (firefox + profile) off of NFS, the problem went away.  It appears that my NFS setup is doing something to Firefox that it doesn't like.
Adam, is it only the search feature which breaks for you when running the profile via an NFS mounted folder or are other components like bookmarks and passwords also affected?
Both the search bar and the address bar are affected.
We have the same problem here, home on NFSv4.
Ubuntu Jaunty, both firefox 3.0.13 and 3.5.2 show this behavior.

Is SQLite being too aggressive on NFS file locks or something?
I found this one on SQLite on NFS:
http://www.nabble.com/SQLite-on-NFS-cache-coherency-td15655701.html

Is it finally an sqlite bug ?
Anyway, firefox should handle this error correctly, display a warning, and close/reload the database or so..
Shawn shall we forward it to the sqlite team?
Oh, and see bug 500926 too where massive problems with NFS were reported.
(In reply to comment #24)
> Is SQLite being too aggressive on NFS file locks or something?
> I found this one on SQLite on NFS:
> http://www.nabble.com/SQLite-on-NFS-cache-coherency-td15655701.html
That's interesting.  We'll see what the SQLite folks have to say.
The default "unix" VFS for SQLite assumes that the underlying filesystem follows POSIX file semantics and obeys all the POSIX requirements related fcntl locks.  The article points out that NFS does not follow POSIX filesystem semantics and does not correctly implement posix advisory locking.  The article suggests that SQLite be modified so as to follow NFS standards rather than posix standards.  What can I say: the nice thing about standards is that there are so many to choose from.

Your easiest and quickest work-around is probably to use "dot-file" locking instead of fcntl() locking on non-Mac unix hosts.  Just open with sqlite3_open_v2(...,"unix-dotfile");  The problem with dot-file locking is that if a system crashes in the middle of a transaction, you have stale locks left on the filesystem that have to be manually deleted.
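To make the stale-lock problem concrete, here is a generic sketch of dot-file locking using plain POSIX calls. It is not SQLite's "unix-dotfile" implementation; the ".lock" suffix and the function names are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Try to take the lock by atomically creating "<dbPath>.lock".
// Returns true if this caller now owns the lock.
bool acquireDotLock(const std::string& dbPath) {
  assert(!dbPath.empty());
  const std::string lockPath = dbPath + ".lock";
  // O_CREAT | O_EXCL fails if the file already exists.
  int fd = open(lockPath.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0600);
  if (fd < 0) {
    // Typically EEXIST: held by another process, or left by a crashed one.
    return false;
  }
  close(fd);
  return true;
}

// Release the lock by deleting the lock file.
void releaseDotLock(const std::string& dbPath) {
  unlink((dbPath + ".lock").c_str());
}
```

Nothing ties the lock file to the lifetime of the process that created it. If that process dies before calling releaseDotLock(), every later acquireDotLock() fails until someone removes the file by hand, which is the drawback described above.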

We will take the action to see if we can't come up with yet another VFS for unix that works better on NFS.  Note that we will NOT be making incompatible changes to SQLite.  We have a huge installed base to support and we can't go around making arbitrary changes to the locking protocol.  But we can make additional VFSes available to folks who want to use them.  Of course, the danger there is that two separate clients might decide to use different VFSes with locking protocols to access the same database files, and not see each other's locks, and step on one another.  But, what else can we do....
for our purposes, since our users are evil, we really need an on disk object which is recognized as a lock no matter which OS our user runs (users do manage to try to share profiles on network volumes [afs, nfs, cifs] across platforms)
I have a modified version of the unix VFS for SQLite that puts fcntl locks (posix advisory locks) over the entire database file in order to clue NFS in on cache coherency, as described in http://www.nabble.com/SQLite-on-NFS-cache-coherency-td15655701.html.  The changes are backwards compatible so it is acceptable to merge these changes into the official SQLite source tree.  However, the changes make SQLite run about 1.5% slower.  

I can disable the extra locking in order to get back to the original performance on non-NFS filesystems (the usual case).  But in order to do so, I need a way to reliably detect when a file is on an NFS filesystem and when it is not.  I can use fstatfs() on linux.  But fstatfs() is not posix (it is a BSDism, iirc) and so is not universally available.  Furthermore, the "struct statfs" that fstatfs() returns varies dramatically from one OS to the next.

So, question:  Does anybody know of a reliable cross-platform (or cross-unix) way of determining whether or not an open file is sitting on an NFS mount?
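For reference, the Linux-only fstatfs() check mentioned above could look roughly like the sketch below. It assumes a Linux host where struct statfs exposes f_type and where NFS mounts report the magic number 0x6969 (NFS_SUPER_MAGIC in <linux/magic.h>). The function name isOnNfs is hypothetical, and the sketch does nothing to solve the cross-platform question.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/vfs.h>  // Linux-specific: fstatfs() and struct statfs
#include <unistd.h>

// Same value as NFS_SUPER_MAGIC in <linux/magic.h>; repeated here so the
// sketch does not depend on kernel headers being installed.
constexpr long kNfsSuperMagic = 0x6969;

// Returns 1 if the file at `path` lives on an NFS mount, 0 if it does not,
// and -1 if the file cannot be opened or queried.
int isOnNfs(const char* path) {
  assert(path != nullptr);
  int fd = open(path, O_RDONLY);
  if (fd < 0) {
    return -1;
  }
  struct statfs sfs;
  int rc = fstatfs(fd, &sfs);
  close(fd);
  if (rc != 0) {
    return -1;
  }
  return static_cast<long>(sfs.f_type) == kNfsSuperMagic ? 1 : 0;
}
```

On other unixes the header, the struct layout, and the way the filesystem type is reported all differ, which is why this approach does not generalize.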
We (the SQLite developers) have looked around but have been unable to find a reliable way to determine whether or not a file is on an NFS mount.  Hence, we are punting the problem up to you (the FF developers.)  

The SQLite check-in at http://www.sqlite.org/src/vinfo/2aeab80e5b adds a new operating system interface on unix builds. The new vfs is named "unix-wfl" where "wfl" stands for "whole-file locking".  The new VFS locks the entire file as recommended by http://www.nabble.com/SQLite-on-NFS-cache-coherency-td15655701.html, not just a small region of bytes.  Presumably the "unix-wfl" VFS will fix the cache coherency problem on NFS.  (I say "presumably" because we have yet to come up with a good way to test this.)  The downside is that unix-wfl is about 1% slower than the standard "unix" VFS.

To make use of the new VFS, simply put the string "unix-wfl" as the 4th parameter to sqlite3_open_v2() when initially opening the connection to the database file.
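At the system-call level, "whole-file locking" means a POSIX advisory lock whose byte range covers the entire file. The following is a minimal sketch of that primitive only, not the "unix-wfl" VFS itself; the function name setWholeFileLock is made up for illustration.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Place or clear a POSIX advisory lock covering the entire file.
// `type` is F_RDLCK, F_WRLCK, or F_UNLCK. The call is non-blocking and
// returns false if another process holds a conflicting lock.
bool setWholeFileLock(int fd, short type) {
  assert(fd >= 0);
  struct flock fl {};
  fl.l_type = type;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;  // 0 means "through end of file", however large it grows
  return fcntl(fd, F_SETLK, &fl) == 0;
}
```

The standard "unix" VFS locks only a small region of bytes, as noted above. Setting l_start and l_len to 0 is what widens the lock to the whole file.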
I could test this if I could get patched version ...
(In reply to comment #31)
> We (the SQLite developers) have looked around but have been unable to find a
> reliable way to determine whether or not a file is on an NFS mount.  Hence, we
> are punting the problem up to you (the FF developers.)
That fix wouldn't help on windows though, would it?
The fix is unix only.  Windows uses mandatory locks, not advisory locks (since when SQLite was first developed, win95/98/ME was still in widespread use and win95/98/ME only has mandatory locks) and so locking the entire file just won't work.  

Is it common to find a windows deployment that is able to talk to NFSv4?
(In reply to comment #34)
> Is it common to find a windows deployment that is able to talk to NFSv4?
This I don't know, but I'm fine with the solution that will be in the next version of SQLite.  I'm not really sure how we can detect an NFS mount for the databases or not yet though, which is saddening.
Given that it is a real problem lets confirm this bug finally.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Maybe we should change the Component from "Search" to "General", since all the sqlite related operations are affected. This makes Firefox unusable; it has to be restarted at least every 30 minutes.
I would be pleased to help with this problem, please share your insights on how firefox should detect being on NFS and initialize sqlite correctly.
Component: Search → General
QA Contact: search → general
Summary: Search field breaks → nfs hosted profiles/browsers result in broken search and history
The suggestion in comment 31 has been tested by our customer and works as expected (unix-wfl has been removed, we have tested unix-excl) so I'm going to work on that. 

I think the easiest way is to use a pref to indicate that the profile is located on an NFS volume and that the sqlite files should be opened in this special mode. The tricky part is to choose an appropriate API level where the options will be inserted.
Attached patch: a patch
This patch adds storage.nfs_filesystem preference and when it's set the underlying file system is changed to some nfs friendly one.
Component: General → Storage
Product: Firefox → Toolkit
Version: 2.0 Branch → unspecified
Comment on attachment 649641 [details] [diff] [review]
a patch

The patch has been successfully tested by our customers. Shawn, can you please check it? Thanks!
Attachment #649641 - Flags: review?(sdwilsh)
Assignee: nobody → stransky
Comment on attachment 649641 [details] [diff] [review]
a patch

Review of attachment 649641 [details] [diff] [review]:
-----------------------------------------------------------------

r=asuth with:
- the crasher fix noted below

- an expanded comment preceding the preference name definition that provides some additional context as to why we have a preference rather than detecting things.  For example, "This preference is a workaround to allow users/sysadmins to identify that the profile exists on an NFS share whose implementation is incompatible with SQLite's default locking implementation. Bug 433129 attempted to automatically identify such file-systems, but a reliable way was not found and it was determined that the fallback locking is slower than POSIX locking, so we do not want to do it by default."


Do you have any plans to publish/document this otherwise hidden preference?  Specifically, is there a Red Hat document somewhere that we could link to from somewhere in the https://wiki.mozilla.org/Enterprise hierarchy or the enterprise mailing list to make this more broadly known/available?

I think there still exists the problem of automatically detecting the NFS case and handling it better, so we should either leave this bug open after landing or clone it into a new bug.  If a new bug is filed, the comment block I noted above should include it.

::: storage/src/TelemetryVFS.cpp
@@ +450,5 @@
> +  bool expected_vfs;
> +  sqlite3_vfs *vfs;
> +  if (Preferences::GetBool(PREF_NFS_FILESYSTEM)) {
> +    vfs = sqlite3_vfs_find(EXPECTED_VFS_NFS);
> +    expected_vfs = vfs->zName && !strcmp(vfs->zName, EXPECTED_VFS_NFS);

This check is not doing what you want.  sqlite3_vfs_find is either going to return null if the VFS does not exist (for example, on OS2), in which case you will crash when you try to look up the name, OR it's going to do a redundant check.  So I think the expected_vfs check should just be "expected_vfs = !!vfs;" or maybe "expected_vfs = (vfs != nullptr)".
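The review point can be illustrated without linking SQLite. In the sketch below, FakeVfs and findVfs are stand-ins for sqlite3_vfs and sqlite3_vfs_find (assumptions for illustration only, not the real API or the patch code). A lookup by name either returns an entry whose name already matches or returns null, so a null test is the only check needed.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for sqlite3_vfs; only the field used in the reviewed code.
struct FakeVfs {
  const char* zName;
};

// Stand-in for sqlite3_vfs_find(): returns nullptr when no VFS with the
// requested name is registered.
FakeVfs* findVfs(FakeVfs* registry, int count, const char* name) {
  assert(name != nullptr);
  for (int i = 0; i < count; ++i) {
    if (std::strcmp(registry[i].zName, name) == 0) {
      return &registry[i];
    }
  }
  return nullptr;
}

// The suggested form of the check: a non-null result already implies the
// name matched, so there is nothing to dereference and nothing to compare.
bool isExpectedVfs(const FakeVfs* vfs) {
  return vfs != nullptr;
}
```

Writing the check as vfs->zName && !strcmp(vfs->zName, name) dereferences vfs before testing it, which is the crash the review describes on platforms where the VFS is not registered.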
Attachment #649641 - Flags: review?(sdwilsh) → review+
Attached patch: check-in patch
Thanks! There's an updated patch for check-in.

Yes, I'm going to promote the workaround to the Enterprise group when the patch is in tree.
Attachment #658850 - Flags: checkin?
Keywords: checkin-needed
Whiteboard: [leave open for more patches]
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d4ae78b85de
Keywords: checkin-needed
Whiteboard: [leave open for more patches] → [leave open]
Attachment #658850 - Flags: checkin? → checkin+
Depends on: 798366
Blocks: 719952
(In reply to Martin Stránský from comment #42)
> 
> Yes, I'm going to promote the workaround to the Enterprise group when the
> patch is in tree.

Is this patch likely to be added to ESR 17?
If there are valid reasons (and looks like there may be, or at least I think so) just nominate it for ESR17 approval expressing the reasons.
I can't say this patch helps or not in our use of firefox - as we don't run a version of firefox with this patch - we are currently using the latest ESR 10 release, although we plan to move to ESR 17 in the near future.

However, we do run firefox on Linux with NFS hosted home directories (where the firefox profiles live) and have seen the problems described in this bug - so any extra possible help in this regard would be useful

I believe this patch is in v18 and will apply easily to the ESR 17 source - and as it only adds a preference that is disabled by default, this won't affect the default behaviour of firefox

I also believe Redhat already have a similar patch in their RHEL build of firefox ESR 10 

I hope that these are enough valid reasons for inclusion in ESR 17
Yes, we (Red Hat) ship this patch in Firefox ESR 10 line without any problems and we're going to carry on with it in the FF ESR 17 line too.
Comment on attachment 658850 [details] [diff] [review]
check-in patch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: this is a longstanding problem for businesses using firefox with NFS mounted profiles; in such cases the browser may often have issues accessing any profile data stored in databases.
User impact if declined: Firefox doesn't work correctly on NFS mounted profiles and there's no workaround for that
Fix Landed on Version: 18
Risk to taking this patch (and alternatives if risky): limited, its behavior must be enabled by flipping a pref
String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #658850 - Flags: approval-mozilla-esr17?
Comment on attachment 658850 [details] [diff] [review]
check-in patch

This doesn't meet our general ESR criteria, but is being requested by an ESR deployment and is low risk enough that we will take it in the next ESR 17 release.
Attachment #658850 - Flags: approval-mozilla-esr17? → approval-mozilla-esr17+
https://hg.mozilla.org/releases/mozilla-esr17/rev/40af46720da2

Going to call this fixed for esr17 since presumably any future work on this bug isn't going to be uplifted.
User Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130212 Firefox/17.0

I couldn't reproduce this issue on Firefox ESR 17 (Build ID: 20130212034503) and on Firefox 3.5.1. I've tried with a profile saved on NFS mounted directory and also locally (under ~/.mozilla/firefox) with/without addons.

Can someone who has encountered this issue please verify the fix?
Adam, Gareth, James

Can you verify the fix works for you?  (ref comment 52)
Flags: needinfo?(james-p)
Flags: needinfo?(flin.gs)
Flags: needinfo?(adam.dorsey)
Summary: nfs hosted profiles/browsers result in broken search and history → nfs hosted profiles/browsers result in broken search and history because NFS does not follow POSIX filesystem semantics and does not correctly implement posix advisory locking. Add storage.nfs_filesystem preference
We're now using Firefox ESR 24 with NFS hosted home directories using storage.nfs_filesystem=true  - and haven't had any of these issues
Flags: needinfo?(james-p)
Priority: -- → P5
Flags: needinfo?(flin.gs)
Flags: needinfo?(adam.dorsey)

I still get this error regularly. Websites fail to load (web.whatsapp.com, outlook.office.com, ...) and all report the same error in the console "... NS_ERROR_STORAGE_IOERR ...". Sometimes all I need to do is restart firefox. Sometimes I need to close firefox, manually navigate to the profile directory, and run "mv webappsstore.sqlite{,_}" and "mv places.sqlite{,_}", followed by "sqlite3 webappsstore.sqlite_" -> ".clone webappsstore.sqlite" and "sqlite3 places.sqlite_" -> ".clone places.sqlite". Afterwards, all sites work again, but on a few I get logged out randomly.

I think it has something to do with the incognito mode. I believe it mostly (or only?) appears when I am using the incognito mode for a long-ish duration, for example to be able to log in to google twice with different accounts. It's quite tedious to restart firefox every few hours just to be able to continue working.

This issue has existed for me since at least fall of last year, or even longer (I can't remember precisely when it first appeared).
I am and was running an up to date Linux Mint Desktop (currently 19.3) with the home-directory being mounted on a NFS 4.0+krb5i share, hosted by an up to date debian buster server.

https://glitch.com/~firefox-storage-test

Specific Subsystem Statuses:

LocalStorage
Bad: Our test logic is broken, please copy and paste the contents of 'Debug Info' below and anything in the devtools console and send to :asuth. (unexpectedBreakage)
QuotaManager
Good: Totally Working. (fullyOperational)
IndexedDB
Good: Totally Working. (fullyOperational)
Cache API
Good: Totally Working. (fullyOperational)

{
"v": 1,
"curVersion": 76,
"prevVersion": 0,
"ls": {},
"qm": {
"lastWorkedIn": 76
},
"idb": {
"persistentCreatedIn": 0,
"persistentLastOpenedIn": 76,
"clearDetectedIn": 0
},
"cache": {
"firstCacheCreatedIn": 0,
"unpaddedOpaqueCreatedIn": 0,
"paddedOpaqueCreatedIn": 76
}
}

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: major → --

The severity field is not set for this bug.
:mak, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(mak)
Severity: -- → S3
Flags: needinfo?(mak)