Closed Bug 478189 Opened 16 years ago Closed 12 years ago

corrupt profile on unexpected computer restarts (over nfs)

Categories

(Firefox :: Bookmarks & History, defect)

x86_64
Linux
defect
Not set
major

Tracking

()

RESOLVED DUPLICATE of bug 433129
Tracking Status
blocking2.0 --- -

People

(Reporter: stephen.stegall, Unassigned)

References

(Blocks 2 open bugs)

Details

User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.0.6) Gecko/2009011912 Firefox/3.0.6
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.6) Gecko/2009020900 Gentoo Firefox/3.0.6

Firefox becomes unusable after a hard restart of the computer.

Reproducible: Sometimes

Steps to Reproduce:
1. Hard Power Off/Reset

Actual Results:
I am using the latest Firefox. This has happened more than five times now (that is how many ~/.mozilla/firefox/old_* folders I have, so it has happened more than this). I never had this problem until Firefox 3. After a hard reset of my computer, I open Firefox and various components no longer work. The components are:
- Search bar does not search
- Tabs never stop loading, even when they appear to have loaded
- Bookmarks are no longer there
- Back, stop, reload buttons do not function
- URL bar does not update
- Restore bookmarks does not work
- History is blank

Expected Results:
I expected it to work as normal. Sometimes it does...

This happens regardless of add-ons, but the ones that might be of interest are:
- Tab Mix Plus
- Session Manager
- Tree Style Tab

Tab Mix Plus and Session Manager were tried independently in an attempt to resolve the issue. Usually either of these helps after a crash... I usually used SM, but after the first few times, I switched to TM+. Once or twice would be acceptable, but it has happened too frequently to be ignored.

My ~ is on an NFS share.

If any additional information is required, I am more than willing to help resolve this issue.
Looks like a broken places.sqlite.
Component: General → Bookmarks & History
QA Contact: general → bookmarks
Seems so. I just opened up places.sqlite:

$ sqlite3 places.sqlite
> .dump
> .quit

Opened FF, and it was resolved. Thank you.
Shouldn't .dump just export the tables? Why does it fix the database?
Alright, most bizarre... something different this time...

$ lsof ~/.mozilla/firefox/scui465z.default/places.sqlite
returns nothing
$ ssh nfs-server
$ lsof ~/.mozilla/firefox/scui465z.default/places.sqlite
returns nothing
$ exit # back to the original machine
$ cd ~/.mozilla/firefox/scui465z.default
$ sqlite3 places.sqlite .dump
BEGIN TRANSACTION;
COMMIT;
$ cp places.sqlite{,_backup}
$ sqlite3 places.sqlite_backup .dump
...... works
$ sqlite3 places.sqlite .tables
Error: database is locked
$ sqlite3 places.sqlite_backup .tables
moz_anno_attributes  moz_favicons         moz_keywords
moz_annos            moz_historyvisits    moz_places
moz_bookmarks        moz_inputhistory
moz_bookmarks_roots  moz_items_annos
$ mv places.sqlite{,_orig}
$ mv places.sqlite{_backup,}
$ sqlite3 places.sqlite .tables
moz_anno_attributes  moz_favicons         moz_keywords
moz_annos            moz_historyvisits    moz_places
moz_bookmarks        moz_inputhistory
moz_bookmarks_roots  moz_items_annos

And FF works.

Before, I just copied the profile, such as:

$ cp -a broken scui465z.default

I figured before that just opening places.sqlite in sqlite3 repaired the database. Apparently that is not the case...

Sure enough,

$ sqlite3 places.sqlite_orig .tables
Error: database is locked
$ mv places.sqlite{_orig,}
$ sqlite3 places.sqlite .tables
Error: database is locked

FF returns to its original broken behavior.

Restarting the computer did not resolve this issue. I restarted the NFS server (the entire computer, though restarting just the nfs-server service probably would have been enough) and the issue was resolved.

Must be something in the locking mechanism. Not sure if Firefox can resolve this issue. Perhaps an sqlite3 issue?
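
The manual check in this comment (ask the original file for its tables, then ask a copy) can be sketched in Python with the standard sqlite3 module. This is only an illustration under assumptions: the helper names `probe_tables` and `probe_original_and_copy` are hypothetical and are not part of Firefox or SQLite. A plain file copy does not carry the lock with it, which would explain why the copy is readable while the original reports "database is locked".

```python
import shutil
import sqlite3


def probe_tables(path):
    """Return the sorted table names in `path`, or the string "locked".

    Hypothetical helper. timeout=0 makes SQLite report a busy database
    immediately instead of waiting for the lock to clear.
    """
    conn = sqlite3.connect(path, timeout=0)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc):
            return "locked"
        raise
    finally:
        conn.close()


def probe_original_and_copy(path, copy_path):
    """Probe `path`, then copy it to `copy_path` and probe the copy."""
    original = probe_tables(path)
    shutil.copyfile(path, copy_path)
    return original, probe_tables(copy_path)
```

Copying a database that another process is actively writing can produce an inconsistent copy, so this is only reasonable when nothing has the file open (as the empty lsof output above suggests).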
Oh... NFS... As it turns out, NFS likes to lie to SQLite about things from time to time, which ends up causing problems. Sadly, neither we nor SQLite can do anything about it. Maybe you are seeing bug 436737 or a variant thereof?
I am using the kernel mode NFS driver. But perhaps something similar.

For documentation purposes...

$ uname -a
Linux fchs 2.6.27-gentoo-r7 #3 SMP Sun Jan 11 15:31:10 EST 2009 x86_64 AMD Athlon(tm) 64 Processor 3400+ AuthenticAMD GNU/Linux
$ ssh nfs-server uname -a
Linux bup 2.6.26-gentoo-r1 #2 Thu Jan 8 08:18:38 EST 2009 i686 AMD Athlon(tm) Processor AuthenticAMD GNU/Linux

Using net-fs/nfs-utils-1.1.3 on both machines.

$ sqlite3 --version
3.6.10

Might be related to https://bugzilla.mozilla.org/show_bug.cgi?id=463394 but doesn't seem so, NFS is sync...

$ grep home /etc/fstab
nfs-server:/home /home nfs hard,bg 0 1
$ ssh nfs-server grep home /etc/exports
/home/ nfs-server(sync,no_root_squash,no_subtree_check,rw)

But this is definitely an NFS locking issue. Perhaps NFS should not lie to SQLite...
Summary: corrupt profile on unexpected computer restarts → corrupt profile on unexpected computer restarts (over nfs)
Moving to NEW so we can get some movement here. This is biting a number of Linux users I know in school/office environments, from general anecdotal evidence, but I'd never connected those issues to sqlite+NFS. We may be able to work around the nfs brain-damage here by using dotfile locking with sqlite (either by explicitly passing "unix-dotfile" to sqlite3_open_v2 or by compiling with the appropriate SQLITE_ENABLE_LOCKING_STYLE; both are documented at <http://www.sqlite.org/compile.html#enable_locking_style>). Doing that might or might not also help the SMB issues in bug 502283. See also http://slashdot.org/comments.pl?sid=1506024&cid=30739010 for what prompted me to jump in here...
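
For illustration, selecting the "unix-dotfile" VFS does not have to go through the C API directly. A minimal sketch using Python's standard sqlite3 module and SQLite's URI filename syntax follows. It assumes a Unix build of SQLite (where the "unix-dotfile" VFS is registered), and the function name `connect_with_dotfile_locking` is hypothetical. It is not what Firefox does, only a way to try the locking style on a profile copy.

```python
import sqlite3
from pathlib import Path


def connect_with_dotfile_locking(path):
    """Open `path` through SQLite's "unix-dotfile" VFS.

    Assumption: the linked SQLite is a Unix build that registers this VFS.
    The vfs= URI parameter names the VFS, the same value that would be
    passed as the last argument of sqlite3_open_v2.
    """
    # as_uri() needs an absolute path and percent-encodes special characters.
    uri = Path(path).resolve().as_uri() + "?vfs=unix-dotfile"
    return sqlite3.connect(uri, uri=True)
```

Dotfile locking replaces POSIX advisory locks with a lock entry created next to the database, so it avoids the NFS lock daemon entirely, at the cost of coarser (exclusive-only) locking. Every process that opens the database has to use the same VFS for the locking to mean anything.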
Blocks: 502283
Status: UNCONFIRMED → NEW
Ever confirmed: true
Oh, and the point is that as we move more of our sqlite access off the main thread, slightly slower locking might be more palatable.
blocking2.0: --- → ?
Although, I'd still like to only do it for networked drives. The hard part is how to determine that. See also bug 484883.
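
One possible way to approach the "is this a networked drive" question on Linux is to look up the filesystem type of the mount point that contains the profile. The sketch below is an assumption-laden illustration, not the approach from bug 484883: the function names and the `NETWORK_FS_TYPES` set are hypothetical, and the set is not exhaustive. It takes the mount table as text so it can be fed the contents of /proc/mounts.

```python
# Assumed, non-exhaustive set of filesystem types treated as "networked".
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "smb3"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing `path`.

    `path` must be absolute and normalized. `mounts_text` uses the
    /proc/mounts layout: device, mount point, type, options, ...
    Only the \\040 (space) escape is decoded; other escapes are ignored.
    """
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount = fields[1].replace("\\040", " ")
        fs_type = fields[2]
        # Match on whole path components so /homework is not under /home.
        inside = (
            mount == "/"
            or path == mount
            or path.startswith(mount.rstrip("/") + "/")
        )
        # >= lets a later mount over the same mount point win.
        if inside and len(mount) >= len(best_mount):
            best_mount, best_type = mount, fs_type
    return best_type


def is_on_network_fs(path, mounts_text):
    """True if `path` lives on one of the assumed network filesystem types."""
    return fs_type_for(path, mounts_text) in NETWORK_FS_TYPES
```

On Linux this could be called as `is_on_network_fs(profile_dir, open("/proc/mounts").read())`. Other platforms expose the mount table differently, which is part of why the detection is the hard part.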
Depends on: 484883
Are people still seeing this in the wild?
This is not new, not a regression, so not blocking on it. Would take a safe fix, if that's possible.
blocking2.0: ? → -
I still see it. Fedora 13 with firefox-3.6.7-1.fc13.i686.
service nfslock restart (which restarts rpc.statd on fedora) on the client seems to help.
My problem may have been due to a bug in sm-notify. See https://bugzilla.redhat.com/show_bug.cgi?id=625531
Blocks: 719952
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
No longer blocks: tb-enterprise