Closed Bug 342577 Opened 15 years ago Closed 14 years ago

nsZipArchive::SeekToItem can fail due to thread race

Categories

(Core :: Networking: File, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: mark, Assigned: mark)

Details

(Keywords: fixed1.8.1)

Attachments

(3 files)

nsZipArchive::SeekToItem can be called with the same nsZipItem on two different threads simultaneously.  This creates a race if SeekToItem has never been called on that nsZipItem previously to update its offset member (and set hasDataOffset).

http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/modules/libjar/nsZipArchive.cpp&rev=1.85&mark=1070,1090-1093,1097-1098#1065

thr 0x2254000 (A): PR_OpenFile(..., 1, \00) = 23
thr 0x2254000 (A): nsZipArchive::SeekToItem(0x219f558, [23]) enter
thr 0x2254000 (A): _MD_lseek: lseek(23, 492557, 0) = 0
thr 0x2262000 (B): PR_OpenFile(..., 1, \00) = 24
thr 0x2262000 (B): nsZipArchive::SeekToItem(0x219f558, [24]) enter
thr 0x2262000 (B): _MD_lseek: lseek(24, 492557, 0) = 0
thr 0x2254000 (A): _MD_lseek: lseek(23, 492620, 0) = 0
thr 0x2262000 (B): _MD_lseek: lseek(24, 492683, 0) = 0
thr 0x2254000 (A): nsZipArchive::SeekToItem: PR_Seek([23], 492683, 0) = 492620

(The letters A and B were added to make it easier to identify the two threads.  In this particular case, ... is omitted, but the file is "/path/to/MinefieldDebug.app/Contents/MacOS/chrome/browser.jar" on both threads, and aItem->name is "content/browser/utilityOverlay.js".  This behavior also occurs with other members and other jar files.)

As the transcript indicates, aItem->offset is initially 492557, and aItem->hasDataOffset is false, os 492557 refers to the local header.  Both threads enter the block intended to update the offset, read the local header, and set aItem->offset by incrementing it.  Thread A attempts to seek to the data offset before thread B has incremented the member, but by the time thread A's seek returns, thread B has incremented the member, causing thread A's check on the seek position to fail.  (The last line in the output contains the then-current value of offset, which is not the offset supplied as seek's argument.)  Meanwhile, thread B seeks to the wrong position.  The seek succeeds, but thread B will read garbage.

Threads A and B both share the same stack:

 0: 0x2b945f37 _ZN16nsJARInputStream4InitEP5nsJARP9nsZipItemP10PRFileDesc
 1: 0x2b949d63 _ZN5nsJAR14GetInputStreamEPKcPP14nsIInputStream
 2: 0x2b94efd1 _ZN15nsJARInputThunk15EnsureJarStreamEv
 3: 0x2b94f045 _ZN15nsJARInputThunk4ReadEPcjPj
 4: 0x2bc9d83d _ZN22nsInputStreamTransport4ReadEPcjPj
 5: 0x1389a77 _ZN16nsStreamCopierOB16FillOutputBufferEP15nsIOutputStreamPvPcjjPj
 6: 0x130757b _ZN18nsPipeOutputStream13WriteSegmentsEPFjP15nsIOutputStreamPvPcjj
PjES2_jS4_
 7: 0x1389ddd _ZN16nsStreamCopierOB6DoCopyEPjS0_
 8: 0x138a914 _ZN15nsAStreamCopier7ProcessEv
 9: 0x138ac3e _ZN15nsAStreamCopier3RunEv
10: 0x132f74d _ZN12nsThreadPool3RunEv
[...]

This problem is fairly easy to reproduce in Firefox on the trunk on a Mac immediately after rebooting (cold buffer cache) and doing autoreg (remove compatibility.ini).  I notice that at least one jar member can't be read during most startups, often causing bookmarks on the bookmark toolbar to not be displayed, or the browser to fail to respond to most commands.
Attached patch WorkaroundSplinter Review
This isn't really a fix for the underlying problem, but it works around it by storing the header and data offsets separately.
Assignee: nobody → mark
Status: NEW → ASSIGNED
Comment on attachment 226855 [details] [diff] [review]
Workaround

We might want to do something about the underlying race anyway, but how do you feel about this approach?
Attachment #226855 - Flags: review?(darin)
Comment on attachment 226855 [details] [diff] [review]
Workaround

r+sr=darin

This seems like something that is worth fixing on the 1.8 branch even though it is showing up with greater frequency on the trunk (due to thread manager)
Attachment #226855 - Flags: superreview+
Attachment #226855 - Flags: review?(darin)
Attachment #226855 - Flags: review+
Attachment #226855 - Flags: approval1.8.1?
Checked in on trunk.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment on attachment 226855 [details] [diff] [review]
Workaround

a=dbaron on behalf of drivers.  Please check in to the MOZILLA_1_8_BRANCH and mark the bug fixed1.8.1 once you have.
Attachment #226855 - Flags: approval1.8.1? → approval1.8.1+
Checked in on MOZILLA_1_8_BRANCH for 1.8.1b1.
Keywords: fixed1.8.1
A small nit to possibly fix in a followup patch, or taken up another patch touching this file:
instead of doing: ((nsZipItem*)aItem)->dataOffset += aItem->headerOffset + ...
it is better to   ((nsZipItem*)aItem)->dataOffset = aItem->headerOffset +
This make sure that, when due to MT, this code gets executes twice, the dataOffset will still be valid. 
Oops, that was a careless error.  It was supposed to be =, as it was in the trunk patch.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #230806 - Flags: review?(darin)
Attachment #230806 - Flags: review?(darin) → review+
Attachment #230806 - Flags: approval1.8.1?
Attachment #230806 - Flags: approval1.8.1? → approval1.8.1+
Fixed for real for 1.8.1b2.
Status: REOPENED → RESOLVED
Closed: 15 years ago14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.