Closed
Bug 368428
Opened 18 years ago
Closed 16 years ago
XUL FastLoad cache corruption when application running while upgrading
Categories
(Core :: Networking: JAR, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.1b1
People
(Reporter: glandium, Assigned: glandium)
Details
Attachments
(1 file, 5 obsolete files)
4.13 KB,
patch
|
Biesinger
:
review+
|
Details | Diff | Splinter Review |
This is something that does not happen on windows but will happen with unix builds. Here is the scenario to trigger the issue: - Start the application - Upgrade the application while running, replacing the files currently in use (this kind of thing can happen when upgrading through the linux distribution upgrade system while a user is running the application). - In the already running application, open a new kind of window, control, etc. For example, the preferences panel. It will more than probably show XML errors. - Stop the application - Start the application again. Go to the preferences panel again, if you went there and got XML errors. You will get the same errors. - Stop the application - Remove the XUL.mfasl file - Start the application. Try again. The XML errors are gone. To solve this, maybe it would be better to not cache invalid XUL files...
Assignee | ||
Comment 1•18 years ago
|
||
A bit more precision: - When you go to the preferences panel the first time, after upgrade, the XUL.mfasl is removed and recreated with the cache content for the broken preferences panel. - The corruption doesn't seem to happen with flat chrome (no jar files), so this may be related to jar handling vs. the jar file that changing.
Assignee | ||
Comment 2•18 years ago
|
||
I can confirm that bypassing zip cache in nsZipReaderCache::GetZip solves the problem. This cache should be invalidated when the zip file's mtime changes. This probably means adding the old mtime information to the nsJAR or the nsZipArchive class. Changing bug component accordingly (though it sounds weird that Jar is under Networking)
Component: XP Toolkit/Widgets: XUL → Networking: JAR
QA Contact: xptoolkit.xul → networking.jar
Assignee | ||
Comment 3•18 years ago
|
||
I'm not quite sure it doesn't leak the previously cached nsJAR object, but releasing ends in a dead lock... Not knowing the code doesn't help :-p
Attachment #253083 -
Flags: review?
Assignee | ||
Updated•18 years ago
|
Attachment #253083 -
Flags: review? → review?(darin.moz)
Assignee | ||
Comment 4•18 years ago
|
||
(Rereading comment #0, I feel I have to precise it doesn't happen on windows because the installer doesn't want to run if an instance is running)
Assignee | ||
Comment 5•18 years ago
|
||
Comment on attachment 253083 [details] [diff] [review] patch Mmmmmm for some reason, that leads to a lock...
Attachment #253083 -
Flags: review?(darin.moz)
Assignee | ||
Comment 6•18 years ago
|
||
mZips.Put was trying to release the collisioning zip, which ended in a lock in nsZipReaderCache::ReleaseZip. The new patch removes the zip from mZips beforehand.
Attachment #253083 -
Attachment is obsolete: true
Attachment #253090 -
Flags: review?(darin.moz)
Comment 7•18 years ago
|
||
Comment on attachment 253090 [details] [diff] [review] patch sorry, i'm not going to have time to give this a proper review. over to biesi.
Attachment #253090 -
Flags: review?(darin.moz) → review?(cbiesinger)
Comment 8•17 years ago
|
||
Mike, is this a problem on the trunk? biesi, do you think you could review the patch sometime?
Assignee: nobody → mh+mozilla
Assignee | ||
Comment 9•17 years ago
|
||
Updated patch for current trunk
Attachment #253090 -
Attachment is obsolete: true
Attachment #284746 -
Flags: review?(cbiesinger)
Attachment #253090 -
Flags: review?(cbiesinger)
Comment 10•17 years ago
|
||
Comment on attachment 284746 [details] [diff] [review] New patch against trunk + zipFile->GetLastModifiedTime(&mMtime); you should check that that succeeded + lock.unlock(); + mZips.Remove(&key); + lock.lock(); why are you doing that outside of the lock? And why is that safe? you should initialize mMtime in the constructor
Assignee | ||
Comment 11•17 years ago
|
||
(In reply to comment #10) > + lock.unlock(); > + mZips.Remove(&key); > + lock.lock(); > > why are you doing that outside of the lock? And why is that safe? mZips.Remove does end up locking when it needs to. If you don't unlock, that's a dead lock situation. I'll update the patch with your other comments when you tell me you're okay with this part.
Comment 12•17 years ago
|
||
It locks a different lock... consider ReleaseZip: That also removes the entry while holding the lock.
Comment 13•17 years ago
|
||
Comment on attachment 284746 [details] [diff] [review] New patch against trunk r-, please call .Remove while holding the lock
Attachment #284746 -
Flags: review?(cbiesinger) → review-
Assignee | ||
Comment 14•17 years ago
|
||
This should solve your concerns.
Attachment #284746 -
Attachment is obsolete: true
Attachment #308130 -
Flags: review?(cbiesinger)
Comment 15•17 years ago
|
||
Comment on attachment 308130 [details] [diff] [review] patch against trunk + if (zip && LL_EQ(Mtime, zip->GetMtime())) { You can actually use == these days for comparing PRInt64 values. + ReleaseZip(zip); I don't think it's necessary to call this... the mZips.Remove should be enough. Doesn't this cause a deadlock anyway?
Attachment #308130 -
Flags: review?(cbiesinger) → review+
Comment 16•17 years ago
|
||
Mike, I just noticed that Ubuntu is taking this patch upstream for XULRunner 1.9. Can you address biesi's nits, and then I can work on getting this in for Firefox 3.1 / Gecko 1.9.1? Thanks!
Status: NEW → ASSIGNED
Assignee | ||
Comment 17•17 years ago
|
||
I'll have to do some testing. IIRC, mZips.Remove alone leads to a deadlock situation.
Assignee | ||
Comment 18•17 years ago
|
||
Same patch with changes suggested in comment #15. I verified this works properly without ReleaseZip. Retrospectively, I wonder how I got deadlock situations before with the same code...
Attachment #308130 -
Attachment is obsolete: true
Comment 19•17 years ago
|
||
Downstream Ubuntu (and probably Debian) are taking this patch to fix corruption issues they are having when upgrading Firefox while it is running. Would be great to get it upstream. Possible to get it for Firefox 3.0.1?
Flags: wanted1.9.0.x?
Flags: blocking1.9.0.1?
Assignee | ||
Comment 20•16 years ago
|
||
Comment on attachment 322150 [details] [diff] [review] patch This patch is not good. I just got a lock situation while running epiphany due to the removal of ReleaseZip. Stacktrace: #0 0x00007ffc20821d04 in __lll_lock_wait () from /lib/libpthread.so.0 #1 0x00007ffc2081d984 in _L_lock_1007 () from /lib/libpthread.so.0 #2 0x00007ffc2081d7ae in pthread_mutex_lock () from /lib/libpthread.so.0 #3 0x00007ffc26f29709 in PR_Lock (lock=0x1d4a0c0) at ptsynch.c:206 #4 0x00007ffc1a9aef37 in nsZipReaderCache::ReleaseZip (this=0x1d49e10, zip=0x80) at ../../dist/include/xpcom/nsAutoLock.h:219 #5 0x00007ffc1a9af086 in nsJAR::Release (this=0x1d4bf20) at nsJAR.cpp:158 #6 0x00007ffc1b003dc8 in nsSupportsHashtable::Remove (this=<value optimized out>, aKey=<value optimized out>, value=0x0) at nsHashtable.cpp:854 #7 0x00007ffc1a9af161 in nsZipReaderCache::GetZip (this=0x1d49e10, zipFile=0x45bb7f0, result=0x8ee7518) at nsJAR.cpp:1147 #8 0x00007ffc1a9b1dc7 in nsJARInputThunk::EnsureJarStream (this=0x8ee7500) at nsJARChannel.cpp:138 #9 0x00007ffc1a9b1f91 in nsJARChannel::Open (this=0x8a59640, stream=0x7fff2fd750f0) at nsJARChannel.cpp:663 #10 0x00007ffc1a9a3837 in nsStringBundle::LoadProperties (this=0x1f79c90) at nsStringBundle.cpp:128 #11 0x00007ffc1a9a4e1f in nsStringBundle::GetStringFromName (this=0x1f79c90, aName=0x80, aResult=0x1) at nsStringBundle.cpp:269 This looks pretty much like the deadlocks that forced me to force an unlock or do ReleaseZip beforehand.
Attachment #322150 -
Attachment is obsolete: true
Comment 21•16 years ago
|
||
attachment 322150 [details] [diff] [review] + keep a ref on zip to not trigger ReleaseZip ... which causes the recurring lock attempt. I was able to 100% reproduce the deadlock by simply touching some .jar files in chrome/ and trying to access new chrome elements afterwards in firefox. I could verify that this patch fixes it for me. Any thoughts?
Assignee | ||
Comment 22•16 years ago
|
||
Does the older zip properly get released afterwards, then ? Replacing antiLockZipGrip = zip by ReleaseZip(zip) works, too...
Comment 23•16 years ago
|
||
Yes, putting the grip before the lock on the stack should make release happen after unlock.(In reply to comment #22) > Does the older zip properly get released afterwards, then ? > > Replacing antiLockZipGrip = zip by ReleaseZip(zip) works, too... > Don't know if I understand you correctly, but yes, this patch tries to take care that actual Release happens happen _outside_ the GetZip lock.
Comment 24•16 years ago
|
||
(In reply to comment #22) > Replacing antiLockZipGrip = zip by ReleaseZip(zip) works, too... > Just tried this approach and I get the same deadlock by just sudo touch /usr/lib/xulrunner-1.9/chrome/*.jar before opening any menu. Clicking on Edit will then deadlock firefox. Could you please verify that this is the case for you too and then test the updated patch I submitted? Merci.
Assignee | ||
Comment 25•16 years ago
|
||
Attachment 308130 [details] [diff] has always worked fine for me... it's strange that it deadlocks for you with ReleaseZip... there's no reason it should...
Assignee | ||
Comment 26•16 years ago
|
||
well, i have been fast talking, strangely enough, the releasezip patch finally failed for me... while yours works. I'll start applying your patch in next xulrunner upload in debian.
Updated•16 years ago
|
Attachment #323500 -
Flags: review?(cbiesinger)
Updated•16 years ago
|
Flags: blocking1.9.0.1? → blocking1.9.0.1-
Comment 27•16 years ago
|
||
What's the impact of this bug? Only on XULRunner apps, right?
Comment 28•16 years ago
|
||
(In reply to comment #27) > What's the impact of this bug? Only on XULRunner apps, right? > Comment #24 mentions deadlocking Firefox
Comment 29•16 years ago
|
||
Sure, if you upgrade XULRunner while Firefox is running. But is it the case that if a Linux user upgrades Firefox a bunch of their other applications fall over? And the fix is required on the 1.8 branch, not the 1.9 branch, right?
Assignee | ||
Comment 30•16 years ago
|
||
If you upgrade Firefox while it is running, chances are you will screw up your XUL FastLoad cache. This happens with any version and is unrelated to Firefox being built on top of xulrunner or not.
Updated•16 years ago
|
Attachment #323500 -
Flags: review?(cbiesinger) → review+
Updated•16 years ago
|
Keywords: checkin-needed
Comment 31•16 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/c61681813a11
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.1b1
Updated•16 years ago
|
Flags: wanted1.9.0.x? → wanted1.9.0.x-
You need to log in
before you can comment on or make changes to this bug.
Description
•