311 bytes, text/html
2.71 KB, patch
|Details | Diff | Splinter Review|
2.13 KB, patch
|Details | Diff | Splinter Review|
3.97 KB, patch
|Details | Diff | Splinter Review|
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:220.127.116.11) Gecko/20091016 Firefox/3.5.4 Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:18.104.22.168) Gecko/20091016 Firefox/3.5.4 Visiting http://aegeriwetter.dyndns.org:8088/cgi-bin/guestimage.html Firefox 3.5.4 crashes after some time. A clean profile without extensions and themes was used to reproduce. After removing the clutter from the webpage source code all that remains to reproduce the crash was the image tag. The src-attribute refers to a multipart/x-mixed-replace resource. A reduced test case follows. Reproducible: Always Steps to Reproduce: 1. Visit test page 2. wait a few moments Actual Results: Firefox crashes, the dialog to submit the crash report appears Expected Results: Don't crash. Just show the image or reject it. Related crash report: http://crash-stats.mozilla.com/report/index/fd5a722f-55ac-45d3-90e0-9a0872091028?p=1
The problem rose up on the German Firefox community board and was first detected by stedenon. http://www.camp-firefox.de/forum/viewtopic.php?f=1&t=76154 Relying on the given statements Windows XP with Fx 3.0.14 and 3.5.3 is also affected.
Confirmed on Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:22.214.171.124pre) Gecko/20091027 Shiretoko/3.5.5pre http://crash-stats.mozilla.com/report/index/82a9b6f9-6e33-4f2e-b7ab-7e2c42091028?p=1 0 ntdll.dll RtlEnterCriticalSection 1 nspr4.dll PR_Lock nsprpub/pr/src/threads/combined/prulock.c:233 2 js3250.dll ClaimTitle js/src/jslock.cpp:536 3 xul.dll nsPNGEncoder::QueryInterface modules/libpr0n/encoders/png/nsPNGEncoder.cpp:51 4 nspr4.dll _PR_MD_UNLOCK nsprpub/pr/src/md/windows/w95cv.c:344 5 xul.dll imgContainer::ResetDiscardTimer modules/libpr0n/src/imgContainer.cpp:1272 6 xul.dll xul.dll@0x8da9eb I don't see a crash on Namoroka so far.
The second time I get a crash report more similar to the reporter's stack: http://crash-stats.mozilla.com/report/index/c0bf8132-bac4-44fb-8c79-7a4422091028?p=1 0 js3250.dll js_LookupPropertyWithFlags js/src/jsobj.cpp:3802 1 js3250.dll js_LookupProperty js/src/jsobj.cpp:3774 2 xul.dll imgContainer::GetFrameAt modules/libpr0n/src/imgContainer.cpp:221 3 xul.dll nsStringBundleService::FormatWithBundle intl/strres/src/nsStringBundle.cpp:807 4 mozcrt19.dll arena_malloc_small obj-firefox/memory/jemalloc/src/jemalloc.c:4072 5 xul.dll xul.dll@0x29fda9 6 xul.dll nsStringInputStream::Release xpcom/io/nsStringStream.cpp:120 7 xul.dll nsStringInputStreamConstructor xpcom/io/nsStringStream.cpp:429 8 xul.dll nsGenericFactory::CreateInstance obj-firefox/xpcom/build/nsGenericFactory.cpp:80 9 xul.dll nsCreateInstanceByContractID::operator obj-firefox/xpcom/build/nsComponentManagerUtils.cpp:210 10 xul.dll nsCOMPtr_base::assign_from_qi_with_error obj-firefox/xpcom/build/nsCOMPtr.cpp:105 11 xul.dll nsCOMPtr_base::assign_from_qi_with_error obj-firefox/xpcom/build/nsCOMPtr.cpp:107 12 xul.dll nsMultiMixedConv::SendData netwerk/streamconv/converters/nsMultiMixedConv.cpp:839 13 xul.dll nsMultiMixedConv::BufferData netwerk/streamconv/converters/nsMultiMixedConv.cpp:726 14 xul.dll nsMultiMixedConv::OnDataAvailable netwerk/streamconv/converters/nsMultiMixedConv.cpp:596
How long are you supposed to wait? It's been a few minutes now and I can't seem to reproduce it on the current trunk build (although Firefox claims that the page is always being loaded).
Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:126.96.36.199pre) Gecko/20091127 Shiretoko/3.5.6pre A minute or so. It is still crashing here.
Your user agent indicates that you are running 3.5.6. I am running Minefield (3.7a1pre), and with that, I did not experience the crash. Since 3.6 is soon to be released, can anyone check if the crash is present there?
Like I said in comment 3 this crash is not reproducible with Namoroko but I confirmed it for Firefox 3.5 and Firefox 3.0.
I see. Should this be fixed for 3.0 and 3.5, or are we going to wait for people to upgrade to 3.6 instead?
Firefox 3.0 and 3.5 releases are still updated with stability and security fixes and only developers are empowered to fix or to wontfix this bug.
I more or less completely rewrote multipart/x-mixed-replace for trunk, so I'm not surprised that the crash doesn't reproduce there. Unfortunately, that doesn't really tell us much about how to fix it on 3.0 and 3.5. That crash stack is pretty confusing. Joe, can you make anything of it?
Interesting, I've now had this on two separate (common) sites. Are you sure the only reason for this crash occurring is the case which this bug lists?
Just to clarify - my rewrite didn't make it into 3.6.
But I use a trunk build with the latest updates.
Can someone who can reproduce this bug find the first nightly where it stopped happening? Over here I can't seem to reproduce it, at least on Mac and Linux.
Actually, I've managed to reproduce. Just took forever. So in a debug build I get: ###!!! ASSERTION: nsVoidArray::FastElementAt: index out of range: '0 <= aIndex && aIndex < Count()', file ../../../dist/include/xpcom/nsVoidArray.h, line 72 and the top several frames are: #0 nsVoidArray::FastElementAt (this=0x7fffddb9d4a0, aIndex=0) at ../../../dist/include/xpcom/nsVoidArray.h:73 #1 0x00007fffe8b74166 in nsCOMArray_base::ObjectAt (this=0x7fffddb9d4a0, aIndex=0) at ../../../dist/include/xpcom/nsCOMArray.h:100 #2 0x00007fffe8b7590a in nsCOMArray<gfxIImageFrame>::ObjectAt (this= 0x7fffddb9d4a0, aIndex=0) at ../../../dist/include/xpcom/nsCOMArray.h:160 #3 0x00007fffe8b74bfe in nsCOMArray<gfxIImageFrame>::operator (this= 0x7fffddb9d4a0, aIndex=0) at ../../../dist/include/xpcom/nsCOMArray.h:170 #4 0x00007fffe8b6f624 in imgContainer::GetFrameAt (this=0x7fffddb9d470, index=0, _retval=0x7fffddaae028) at ../../../../mozilla/modules/libpr0n/src/imgContainer.cpp:218 So someone's doing a GetFrameAt on an imgContainer but the imgContainer's mFrames has length 0. Then we try to addref whatever random pointer we get out of mFrames and crash.
Oh, and the "someone" is nsJPEGDecoder::ProcessData. And mNumFrames is 1 in the imgContainer. The relevant code is gone on 1.9.2; it was removed in bug 753.
And the key is that under that GetFrameAt call we call imgContainer::RestoreDiscardedData which returns without restoring because mDiscarded is false. Looks like a regression from bug 296818. Presumably it can still bite us in other ways, if mDiscarded can be not matching reality. Joe, Bobby, can you look into this?
ccing Federico too, since this is his code.
imgContainer::Init sets mDiscarded to false but doesn't touch mFrames or mNumFrames. And of course imgContainer::Init is called when mDiscarded is true, like so: #0 imgContainer::Init (this=0x7fffde1e03a0, aWidth=640, aHeight=480, aObserver=0x7fffde127ce8) at ../../../../mozilla/modules/libpr0n/src/imgContainer.cpp:120 #1 0x00007fffe8b9a25e in nsJPEGDecoder::ProcessData (this=0x7fffddd89000, data= 0x7fffde70d800 "M10M-Web\r\nTYP=M\r\nBRD=2.1\r\nRAM=134217728\r\nCLK=206\r\nENDSECTION ID\r\nSECTION SENSORS\r\nSIN=0\r\nBTR=0\r\nBTL=0\r\nPIR=0\r\nILR=-141\r\nTIN=117\r\nENDSECTION SENSORS\r\nSECTION EVENT\r\nEVD=SLEEP\r\nEST=\r\nAST=\r\nREC=OFF\r\nMDT="..., count=1024, writeCount=0x7fffffffcd3c) at ../../../../../mozilla/modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp:472 #2 0x00007fffe8b998fe in ReadDataOut (in=0x7fffddd06100, closure=0x7fffddd89000, fromRawSegment= 0x7fffde70d800 "M10M-Web\r\nTYP=M\r\nBRD=2.1\r\nRAM=134217728\r\nCLK=206\r\nENDSECTION ID\r\nSECTION SENSORS\r\nSIN=0\r\nBTR=0\r\nBTL=0\r\nPIR=0\r\nILR=-141\r\nTIN=117\r\nENDSECTION SENSORS\r\nSECTION EVENT\r\nEVD=SLEEP\r\nEST=\r\nAST=\r\nREC=OFF\r\nMDT="..., toOffset=0, count=1024, writeCount=0x7fffffffcd3c) at ../../../../../mozilla/modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp:248 #3 0x00007ffff7516b4b in nsStringInputStream::ReadSegments (this=0x7fffddd06100, writer= 0x7fffe8b998bc <ReadDataOut(nsIInputStream*, void*, char const*, PRUint32, PRUint32, PRUint32*)>, closure=0x7fffddd89000, aCount=1024, result=0x7fffffffcd3c) at ../../../mozilla/xpcom/io/nsStringStream.cpp:276 #4 0x00007fffe8b999ed in nsJPEGDecoder::WriteFrom (this=0x7fffddd89000, inStr=0x7fffddd06100, count=1024, writeCount= 0x7fffffffcd3c) at ../../../../../mozilla/modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp:266 #5 0x00007fffe8b8a714 in imgRequest::OnDataAvailable (this=0x7fffde127ce0, aRequest=0x7fffddd27380, ctxt=0x0, inStr= 0x7fffddd06100, sourceOffset=1000, count=1024) at ../../../../mozilla/modules/libpr0n/src/imgRequest.cpp:993 #6 0x00007fffef1c4e6c in nsMultiMixedConv::SendData (this=0x7fffde17aa80, aBuffer= 0x7fffde70d800 "M10M-Web\r\nTYP=M\r\nBRD=2.1\r\nRAM=134217728\r\nCLK=206\r\nENDSECTION ID\r\nSECTION SENSORS\r\nSIN=0\r\nBTR=0\r\nBTL=0\r\nPIR=0\r\nILR=-141\r\nTIN=117\r\nENDSECTION SENSORS\r\nSECTION EVENT\r\nEVD=SLEEP\r\nEST=\r\nAST=\r\nREC=OFF\r\nMDT="..., aLen=1024) at ../../../../mozilla/netwerk/streamconv/converters/nsMultiMixedConv.cpp:839 #7 0x00007fffef1c399e in nsMultiMixedConv::OnDataAvailable (this=0x7fffde17aa80, request=0x7fffde1b4850, context=0x0, inStr= 0x7fffde11de40, sourceOffset=14097, count=1024) at ../../../../mozilla/netwerk/streamconv/converters/nsMultiMixedConv.cpp:596
And it's immediately after that call that we hit the FastElementAt assert.
Sorry, I'm not familiar with this code anymore :(
#2 top crash in 3.7a1 data
Chris, this particular crash is not an issue on 3.6 or m-c last I checked. What is comment 25 based on?
ah, maybe we have another new crash with signature RtlEnterCriticalSection http://crash-stats.mozilla.com/topcrasher/byversion/Firefox/3.7a1 http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&signature=RtlEnterCriticalSection&version=Firefox%3A3.7a1 Frame Module Signature [Expand] Source 0 ntdll.dll RtlEnterCriticalSection 1 nspr4.dll PR_Lock nsprpub/pr/src/threads/combined/prulock.c:233 2 xul.dll nsAutoLock::nsAutoLock obj-firefox/dist/include/nsAutoLock.h:219 3 xul.dll nsSSLIOLayerHelpers::isKnownAsIntolerantSite security/manager/ssl/src/nsNSSIOLayer.cpp:2230 4 xul.dll nsSSLIOLayerSetOptions security/manager/ssl/src/nsNSSIOLayer.cpp:3501 5 xul.dll nsNSSShutDownPreventionLock::~nsNSSShutDownPreventionLock security/manager/ssl/src/nsNSSShutDown.cpp:472 6 xul.dll nsSSLIOLayerNewSocket security/manager/ssl/src/nsNSSIOLayer.cpp:2250
Yes, that stack is totally unrelated to this bug.
So, I'm having trouble understanding: is or isn't this a topcrash (I think the answer is no, and the unrelated topcrash chofmann references is bug 546483 which is bug 546483)
(err, which is bug 521849)
> So, I'm having trouble understanding: is or isn't this a topcrash How does one tell? Note that this is not an issue on 1.9.2 or m-c; it's 1.9.1-only. Also note that any time it happens the top frame on the stack will be random, since the symptom of this bug is that we jump to a random part of the address space. From a brief look at crash-stats, that means that the "topcrash" report there is completely useless for understanding how often this crash happens. This crash will only happen with mjpeg and only if the update frequency is low enough to cause discards. That said, when it does happen it's almost certainly exploitable; see above about "jump to random part of address space".
And it looks like there's no way to search crash-stats for stacks _including_ a particular function. Or at least I can't find one.
Bug 536812 has what looks like a lower bound on crashes due to this bug. That lower bound is about 25 crashes per day on average; that's about 1/4 what our topcrash list of top 300 crashes terminates in. Again, the main issue here is that it's easy to create a page triggering this crash if you try...
Created attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace This class of webcams has a very low refresh rate, around 0.0167 fps, which is slow enough to allow discarding to come in between one frame and another. The fix for this is to simply not discard, which is what we do on trunk. The only formats that are discardable in 1.9.1 and 1.9.2 are JPEG and PNG, and while nobody does it on the web, multipart/x-mixed-replace doesn't require JPEG files only, so I fixed it in both places for safety.
Can we get a test?
I plan on writing a test for at least mozilla-central, but it might be tricky on older branches due to lackings in their httpd.js.
(In reply to comment #37) > I plan on writing a test for at least mozilla-central, but it might be tricky > on older branches due to lackings in their httpd.js. Sounds good enough to me.
Comment on attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace Is this ok to be just in jpeg/png? I see those are the only decoders that call SetDiscardable, I assume that this isn't an issue on the trunk? (And won't cause problems if someone does, say, a multipart/replace bmp?)
Right. SetDiscardable is what sets an image as discardable; all other images are not discardable, and this doesn't apply. This bug never applied to trunk because Bobby Holley overhauled discarding, and it's all handled centrally instead of in each decoder. That's the better fix, obviously, but it's also more invasive.
Created attachment 442848 [details] [diff] [review] test This is a test (against mozilla-central) for a multipart/x-mixed-replace JPEG with a long enough refresh timer to allow for discarding to happen. Sadly, I had to add another delay HTML webpage to make this work.
Comment on attachment 442848 [details] [diff] [review] test The delay is really unfortunate, but it's better than crashing.
Comment on attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace This doesn't affect 1.9.2 and above. It doesn't really bother me which 1.9.1 release this goes in to, as long as it does.
Comment on attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace a=LegNeato for 188.8.131.52. Please land ASAP.
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/6d296f4b8ad9 (Really, this was RESOLVED FIXED by bug 435296, and then a very minimal backport happened in this bug, but we don't have per-branch fixed status.)
Verified for 184.108.40.206 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:220.127.116.11) Gecko/20100504 Firefox/3.5.10 (.NET CLR 3.5.30729) using test page. 18.104.22.168 crashed cleanly on same box.
Accidentally unset status1.9.3.
Comment on attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace looks fine.
Comment on attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace Requesting approval1.9.0.next on this patch so that we can take it in upcoming Camino 2.0.x security and stability updates. If approved, I'll handle the checkins, unless the patch author requests otherwise.
Comment on attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace Approved for 22.214.171.124, a=dveditz
Checking in modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp; /cvsroot/mozilla/modules/libpr0n/decoders/jpeg/nsJPEGDecoder.cpp,v <-- nsJPEGDecoder.cpp new revision: 1.95; previous revision: 1.94 done Checking in modules/libpr0n/decoders/png/nsPNGDecoder.cpp; /cvsroot/mozilla/modules/libpr0n/decoders/png/nsPNGDecoder.cpp,v <-- nsPNGDecoder.cpp new revision: 1.85; previous revision: 1.84 done
Comment on attachment 441910 [details] [diff] [review] Don't allow discarding if we're multipart/x-mixed-replace Approved for 126.96.36.199, a=dveditz for release-drivers
Does anyone have any manual STR for this on branch? http://aegeriwetter.dyndns.org is no longer accessible and the attached testcase loads an image from it.