Closed Bug 1320899 Opened 6 years ago Closed 6 years ago

Assertion failure: aNewProgress & (FLAG_SIZE_AVAILABLE | FLAG_HAS_ERROR), at ProgressTracker.cpp:41 | AddressSanitizer: heap-buffer-overflow READ of size 1

Categories

(Core :: Graphics: ImageLib, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
Tracking Status
firefox-esr45 --- fixed
firefox50 --- wontfix
firefox51 + wontfix
firefox52 + verified
firefox-esr52 --- fixed
firefox53 + fixed
firefox54 --- fixed

People

(Reporter: cbook, Unassigned)

References

(Blocks 1 open bug, )

Details

(Keywords: assertion, csectype-bounds, sec-high, Whiteboard: [adv-main52-][post-critsmash-triage])

Attachments

(3 files)

Attached file bughunter stack
Found via bughunter and reproduced with latest tinderbox windows trunk debug build.

Steps to reproduce:
-> Load http://swling.com/UTC.htm
--> Crash after a few seconds 
----> Assertion

affects trunk to beta

[5496] WARNING: Forced to copy ObserverTable due to nested notifications: file c:\builds\moz2_slave\m-cen-w32-d-000000000000000000\build\src\image\ProgressTracker.h, line 88
Assertion failure: aNewProgress & (FLAG_SIZE_AVAILABLE | FLAG_HAS_ERROR), at c:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/src/image/ProgressTracker.cpp:41

also seems a sec issue, bughunter reports this as security bug and also was able to reproduce on asan builds.

==2214==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x611000d600d8 at pc 0x7faa8a5ea8c2 bp 0x7ffc1f6007b0 sp 0x7ffc1f6007a8
Attached file asan information/stack
[Tracking Requested - why for this release]:

bughunter found and i guess a sec bug
Timothy, could you take a look at this ?
Flags: needinfo?(tnikkel)
Summary: Assertion failure: aNewProgress & (FLAG_SIZE_AVAILABLE | FLAG_HAS_ERROR), at ProgressTracker.cpp:41 → Assertion failure: aNewProgress & (FLAG_SIZE_AVAILABLE | FLAG_HAS_ERROR), at ProgressTracker.cpp:41 | AddressSanitizer: heap-buffer-overflow READ of size 1
Bughunter showed the asan message with symbols.
Looking into it. Seems to be an issue with our multipart image code which is a fragile area of code.
The image causing problems is http://tycho.usno.navy.mil/cgi-bin/nph-usnoclock.gif?zone=UTC&ticks=04

The problem doesn't happen everytime you look the page, but I'm able to easily reproduce with a few shift reloads.

The assert seems like a legit imagelib issue. It comes about when the first part of a multipart image is corrupt and we cannot get the size of it. But my proposed solution is silencing it in the case of multipart images, because I think we handle it fine. It seems like there a bigger problems in this testcase though, so I'll deal with that assert in another bug.

It seems like problem might lie in network code. There are at least two problems I've observed.

1) The image data we get for parts is sometimes truncated. For example, sometimes a part will be 2 bytes and consist of "GI" (ie the first two bytes of GIF87a). The server could be sending the data truncated, but I never observed this when getting the image via wget.

2) We get more than one OnStopRequest call in a row without a OnStartRequest call in between.

In addition, the ASAN stack is completely in the network code, and as far as I can tell imagelib isn't asking network to do anything wrong.

Could we get someone on the networking team to evaluate? I'll move this to the network component, feel free to move back if you feel otherwise.
Component: ImageLib → Networking
Flags: needinfo?(tnikkel) → needinfo?(mcmanus)
#2 in comment 6 is particularly concerning.

honza, wdyt?
Flags: needinfo?(mcmanus) → needinfo?(honzab.moz)
Group: core-security → network-core-security
I'm currently digging a bit around here.  Apparently, the server is sending the response byte-by-byte!  And some of the code is not ready for it.  At the moment I don't know yet what's wrong, but I suspect poor coding in the multipart channel (theory only!)
Assignee: nobody → honzab.moz
Flags: needinfo?(honzab.moz)
OK, too much time spent already...  the server sends the response in even 1 byte small chunks.

- we are not sending duplicate onstart/onstop
- there is probably something broken in the multimixed conv:
  1. from cache (one chunk) delivers the part as 170 bytes but from net when cut to 1 byte chunkgs it's just 21 bytes
  2. the buffer overflow when checking the delimited in nsMultiMixedConv::OnDataAvailable
- image code causes the assertion (not us) in reaction to this chunked load, see the last stack in this comment (actually the crashing one gives the answer, if looked at more closely..)



When we are loading from cache, there is one chunk of the multipart data coming.  I've added some tracepoints (bless you Visual Studio).  This is how it looks when we load from cache:

nsPartChannel::nsPartChannel(nsIChannel *, unsigned int, nsIStreamListener *) 0x0f385500
nsPartChannel::SendOnStartRequest(nsISupports *) 0x0f385500
nsPartChannel::SendOnDataAvailable(nsISupports *, nsIInputStream *, unsigned __int64, unsigned int) 0x0f385500 0 170

>  0 = offset  170 = count (which means the whole chunk)

mozilla::image::LaunchDecodingTask(mozilla::image::IDecodingTask *, mozilla::image::RasterImage *, unsigned int, bool) aHaveSourceData=false aFlags=0
nsPartChannel::SendOnStopRequest(nsISupports *, nsresult) 0x0f385500

> This is where we set the required flag the first time (there is one more place we do, tho)
> apparently the decoder thread wins here

uri=0x1c9ffb88 "http://tycho.usno.navy.mil/cgi-bin/nph-usnoclock.gif?zone=UTC&ticks=04" mProgress |= FLAG_SIZE_AVAILABLE  
  xul.dll!mozilla::image::Decoder::PostSize
  xul.dll!mozilla::image::nsGIFDecoder2::BeginGIF
  xul.dll!mozilla::image::nsGIFDecoder2::FinishImageDescriptor
  xul.dll!mozilla::image::nsGIFDecoder2::ReadImageDescriptor
  xul.dll!mozilla::image::nsGIFDecoder2::DoDecode::__l2::<lambda>
  xul.dll!mozilla::image::StreamingLexer<enum mozilla::image::nsGIFDecoder2::State,16>::BufferedRead<mozilla::image::LexerTransition<enum mozilla::image::nsGIFDecoder2::State> <lambda>(mozilla::image::nsGIFDecoder2::State, const char *, unsigned int) >
  xul.dll!mozilla::image::StreamingLexer<enum mozilla::image::nsGIFDecoder2::State,16>::Lex<mozilla::image::LexerTransition<enum mozilla::image::nsGIFDecoder2::State> <lambda>(mozilla::image::nsGIFDecoder2::State, const char *, unsigned int) >
  xul.dll!mozilla::image::nsGIFDecoder2::DoDecode
  xul.dll!mozilla::image::Decoder::Decode
  xul.dll!mozilla::image::MetadataDecodingTask::Run
  xul.dll!mozilla::image::DecodePoolWorker::Run
  xul.dll!nsThread::ProcessNextEvent
  xul.dll!NS_ProcessNextEvent
  xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!nsThread::ThreadFunc
  nss3.dll!_PR_NativeRunThread
  nss3.dll!pr_root
  ucrtbase.dll!thread_start<unsigned int (__stdcall*)(void *)>
  kernel32.dll!@BaseThreadInitThunk@12
  ntdll.dll!__RtlUserThreadStart
  ntdll.dll!__RtlUserThreadStart@8
  
mozilla::image::RasterImage::OnImageDataComplete(nsIRequest *, nsISupports *, nsresult, bool) mTransient=true mHasSize=false
mozilla::image::LaunchDecodingTask(mozilla::image::IDecodingTask *, mozilla::image::RasterImage *, unsigned int, bool) aHaveSourceData=true aFlags=1

> and here we set it the second time, sync on the main thread
> note nsPartChannel::SendOnStopRequest is still on the stack

uri=0x1c9ffb88 "http://tycho.usno.navy.mil/cgi-bin/nph-usnoclock.gif?zone=UTC&amp;ticks=04" mProgress |= FLAG_SIZE_AVAILABLE  
  xul.dll!mozilla::image::Decoder::PostSize
  xul.dll!mozilla::image::nsGIFDecoder2::BeginGIF
  xul.dll!mozilla::image::nsGIFDecoder2::FinishImageDescriptor
  xul.dll!mozilla::image::nsGIFDecoder2::ReadImageDescriptor
  xul.dll!mozilla::image::nsGIFDecoder2::DoDecode::__l2::<lambda>
  xul.dll!mozilla::image::StreamingLexer<enum mozilla::image::nsGIFDecoder2::State,16>::BufferedRead<mozilla::image::LexerTransition<enum mozilla::image::nsGIFDecoder2::State> <lambda>(mozilla::image::nsGIFDecoder2::State, const char *, unsigned int) >
  xul.dll!mozilla::image::StreamingLexer<enum mozilla::image::nsGIFDecoder2::State,16>::Lex<mozilla::image::LexerTransition<enum mozilla::image::nsGIFDecoder2::State> <lambda>(mozilla::image::nsGIFDecoder2::State, const char *, unsigned int) >
  xul.dll!mozilla::image::nsGIFDecoder2::DoDecode
  xul.dll!mozilla::image::Decoder::Decode
  xul.dll!mozilla::image::MetadataDecodingTask::Run
  xul.dll!mozilla::image::DecodePool::SyncRunIfPossible
  xul.dll!mozilla::image::LaunchDecodingTask
  xul.dll!mozilla::image::RasterImage::DecodeMetadata
  xul.dll!mozilla::image::RasterImage::OnImageDataComplete
  xul.dll!mozilla::image::MultipartImage::OnImageDataComplete
  xul.dll!imgRequest::OnStopRequest
  xul.dll!ProxyListener::OnStopRequest
  xul.dll!mozilla::dom::MediaDocumentStreamListener::OnStopRequest
  xul.dll!mozilla::dom::ImageListener::OnStopRequest
  xul.dll!nsDocumentOpenInfo::OnStopRequest
  xul.dll!nsPartChannel::SendOnStopRequest
  xul.dll!nsMultiMixedConv::SendStop
  xul.dll!nsMultiMixedConv::OnDataAvailable
  xul.dll!nsDocumentOpenInfo::OnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::DoOnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::OnTransportAndData
  xul.dll!mozilla::net::TransportAndDataEvent::Run
  xul.dll!mozilla::net::ChannelEventQueue::RunOrEnqueue
  xul.dll!mozilla::net::HttpChannelChild::RecvOnTransportAndData
  xul.dll!mozilla::net::PHttpChannelChild::OnMessageReceived
  xul.dll!mozilla::dom::PContentChild::OnMessageReceived
  xul.dll!mozilla::ipc::MessageChannel::DispatchAsyncMessage
  xul.dll!mozilla::ipc::MessageChannel::DispatchMessageW
  xul.dll!mozilla::ipc::MessageChannel::RunMessage
  xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run
  xul.dll!nsThread::ProcessNextEvent
  xul.dll!NS_ProcessNextEvent
  xul.dll!mozilla::ipc::MessagePump::Run
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!nsBaseAppShell::Run
  xul.dll!nsAppShell::Run
  xul.dll!XRE_RunAppShell
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!XRE_InitChildProcess
  firefox.exe!content_process_main
  firefox.exe!NS_internal_main
  firefox.exe!wmain
  firefox.exe!__scrt_common_main_seh
  kernel32.dll!@BaseThreadInitThunk@12
  ntdll.dll!__RtlUserThreadStart
  ntdll.dll!__RtlUserThreadStart@8
  
> and here is the assertion check, still from the same nsPartChannel::SendOnStopRequest
> flag is set (1 = FLAG_SIZE_AVAILABLE), we are OK

URI=0x1c9ffb88 "http://tycho.usno.navy.mil/cgi-bin/nph-usnoclock.gif?zone=UTC&amp;ticks=04" mProgress=1 progress=8   
  xul.dll!mozilla::image::ProgressTracker::SyncNotifyProgress
  xul.dll!mozilla::image::RasterImage::NotifyProgress
  xul.dll!mozilla::image::RasterImage::NotifyForLoadEvent
  xul.dll!mozilla::image::RasterImage::OnImageDataComplete
  xul.dll!mozilla::image::MultipartImage::OnImageDataComplete
  xul.dll!imgRequest::OnStopRequest
  xul.dll!ProxyListener::OnStopRequest
  xul.dll!mozilla::dom::MediaDocumentStreamListener::OnStopRequest
  xul.dll!mozilla::dom::ImageListener::OnStopRequest
  xul.dll!nsDocumentOpenInfo::OnStopRequest
  xul.dll!nsPartChannel::SendOnStopRequest
  xul.dll!nsMultiMixedConv::SendStop
  xul.dll!nsMultiMixedConv::OnDataAvailable
  xul.dll!nsDocumentOpenInfo::OnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::DoOnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::OnTransportAndData
  xul.dll!mozilla::net::TransportAndDataEvent::Run
  xul.dll!mozilla::net::ChannelEventQueue::RunOrEnqueue
  xul.dll!mozilla::net::HttpChannelChild::RecvOnTransportAndData
  xul.dll!mozilla::net::PHttpChannelChild::OnMessageReceived
  xul.dll!mozilla::dom::PContentChild::OnMessageReceived
  xul.dll!mozilla::ipc::MessageChannel::DispatchAsyncMessage
  xul.dll!mozilla::ipc::MessageChannel::DispatchMessageW
  xul.dll!mozilla::ipc::MessageChannel::RunMessage
  xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run
  xul.dll!nsThread::ProcessNextEvent
  xul.dll!NS_ProcessNextEvent
  xul.dll!mozilla::ipc::MessagePump::Run
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!nsBaseAppShell::Run
  xul.dll!nsAppShell::Run
  xul.dll!XRE_RunAppShell
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!XRE_InitChildProcess
  firefox.exe!content_process_main
  firefox.exe!NS_internal_main
  firefox.exe!wmain
  firefox.exe!__scrt_common_main_seh
  kernel32.dll!@BaseThreadInitThunk@12
  ntdll.dll!__RtlUserThreadStart
  ntdll.dll!__RtlUserThreadStart@8






OK, and now when the server sends the response in small one or two byte chunks:




nsPartChannel::nsPartChannel(nsIChannel *, unsigned int, nsIStreamListener *) 0x0f1fc670
nsPartChannel::SendOnStartRequest(nsISupports *) 0x0f1fc670
nsPartChannel::SendOnDataAvailable(nsISupports *, nsIInputStream *, unsigned __int64, unsigned int) 0x0f1fc670 0 1

> the first chunk came as 1 byte

mozilla::image::LaunchDecodingTask(mozilla::image::IDecodingTask *, mozilla::image::RasterImage *, unsigned int, bool) aHaveSourceData=false aFlags=0
nsPartChannel::SendOnDataAvailable(nsISupports *, nsIInputStream *, unsigned __int64, unsigned int) 0x0f1fc670 1 21

> the rest is received complete, but note only 22 bytes...
> I've also seen all 22 bytes received one byte by one (final result is identical, tho)

nsPartChannel::SendOnStopRequest(nsISupports *, nsresult) 0x0f1fc670
mozilla::image::RasterImage::OnImageDataComplete(nsIRequest *, nsISupports *, nsresult, bool) mTransient=true mHasSize=false
mozilla::image::LaunchDecodingTask(mozilla::image::IDecodingTask *, mozilla::image::RasterImage *, unsigned int, bool) aHaveSourceData=true aFlags=1

URI=0x0f1fcb88 "http://tycho.usno.navy.mil/cgi-bin/nph-usnoclock.gif?zone=UTC&amp;ticks=04" mProgress=0 progress=520
> 520 = FLAG_LOAD_COMPLETE | FLAG_HAS_ERROR
  xul.dll!mozilla::image::ProgressTracker::SyncNotifyProgress
  xul.dll!mozilla::image::RasterImage::NotifyProgress
  xul.dll!mozilla::image::RasterImage::NotifyForLoadEvent
  xul.dll!mozilla::image::RasterImage::OnImageDataComplete
  xul.dll!mozilla::image::MultipartImage::OnImageDataComplete
  xul.dll!imgRequest::OnStopRequest
  xul.dll!ProxyListener::OnStopRequest
  xul.dll!mozilla::dom::MediaDocumentStreamListener::OnStopRequest
  xul.dll!mozilla::dom::ImageListener::OnStopRequest
  xul.dll!nsDocumentOpenInfo::OnStopRequest
  xul.dll!nsPartChannel::SendOnStopRequest
  xul.dll!nsMultiMixedConv::SendStop
  xul.dll!nsMultiMixedConv::OnDataAvailable
  xul.dll!nsDocumentOpenInfo::OnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::DoOnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::OnTransportAndData
  xul.dll!mozilla::net::TransportAndDataEvent::Run
  xul.dll!mozilla::net::ChannelEventQueue::RunOrEnqueue
  xul.dll!mozilla::net::HttpChannelChild::RecvOnTransportAndData
  xul.dll!mozilla::net::PHttpChannelChild::OnMessageReceived
  xul.dll!mozilla::dom::PContentChild::OnMessageReceived
  xul.dll!mozilla::ipc::MessageChannel::DispatchAsyncMessage
  xul.dll!mozilla::ipc::MessageChannel::DispatchMessageW
  xul.dll!mozilla::ipc::MessageChannel::RunMessage
  xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run
  xul.dll!nsThread::ProcessNextEvent
  xul.dll!NS_ProcessNextEvent
  xul.dll!mozilla::ipc::MessagePump::Run
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!nsBaseAppShell::Run
  xul.dll!nsAppShell::Run
  xul.dll!XRE_RunAppShell
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!XRE_InitChildProcess
  firefox.exe!content_process_main
  firefox.exe!NS_internal_main
  firefox.exe!wmain
  firefox.exe!__scrt_common_main_seh
  kernel32.dll!@BaseThreadInitThunk@12
  ntdll.dll!__RtlUserThreadStart
  ntdll.dll!__RtlUserThreadStart@8


And here is the assertion problem, note that ProgressTracker::SyncNotifyProgress is called recusively (through an observer):

  
URI=0x0f1fcb88 "http://tycho.usno.navy.mil/cgi-bin/nph-usnoclock.gif?zone=UTC&amp;ticks=04" mProgress=0 progress=8 
> 8 = FLAG_LOAD_COMPLETE
  xul.dll!mozilla::image::ProgressTracker::SyncNotifyProgress
  xul.dll!mozilla::image::MultipartImage::OnLoadComplete
  xul.dll!mozilla::image::SyncNotifyInternal::__l46::<lambda>
  xul.dll!mozilla::image::ImageObserverNotifier<mozilla::image::ObserverTable const *>::operator()<void <lambda>(mozilla::image::IProgressObserver *) >
  xul.dll!mozilla::image::SyncNotifyInternal<mozilla::image::ObserverTable const *>
  xul.dll!mozilla::image::ProgressTracker::SyncNotifyProgress::__l2::<lambda>
  xul.dll!mozilla::image::CopyOnWrite<mozilla::image::ObserverTable>::Read<void <lambda>(const mozilla::image::ObserverTable *) >
  xul.dll!mozilla::image::ProgressTracker::SyncNotifyProgress
  xul.dll!mozilla::image::RasterImage::NotifyProgress
  xul.dll!mozilla::image::RasterImage::NotifyForLoadEvent
  xul.dll!mozilla::image::RasterImage::OnImageDataComplete
  xul.dll!mozilla::image::MultipartImage::OnImageDataComplete
  xul.dll!imgRequest::OnStopRequest
  xul.dll!ProxyListener::OnStopRequest
  xul.dll!mozilla::dom::MediaDocumentStreamListener::OnStopRequest
  xul.dll!mozilla::dom::ImageListener::OnStopRequest
  xul.dll!nsDocumentOpenInfo::OnStopRequest
  xul.dll!nsPartChannel::SendOnStopRequest
  xul.dll!nsMultiMixedConv::SendStop
  xul.dll!nsMultiMixedConv::OnDataAvailable
  xul.dll!nsDocumentOpenInfo::OnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::DoOnDataAvailable
  xul.dll!mozilla::net::HttpChannelChild::OnTransportAndData
  xul.dll!mozilla::net::TransportAndDataEvent::Run
  xul.dll!mozilla::net::ChannelEventQueue::RunOrEnqueue
  xul.dll!mozilla::net::HttpChannelChild::RecvOnTransportAndData
  xul.dll!mozilla::net::PHttpChannelChild::OnMessageReceived
  xul.dll!mozilla::dom::PContentChild::OnMessageReceived
  xul.dll!mozilla::ipc::MessageChannel::DispatchAsyncMessage
  xul.dll!mozilla::ipc::MessageChannel::DispatchMessageW
  xul.dll!mozilla::ipc::MessageChannel::RunMessage
  xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run
  xul.dll!nsThread::ProcessNextEvent
  xul.dll!NS_ProcessNextEvent
  xul.dll!mozilla::ipc::MessagePump::Run
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!nsBaseAppShell::Run
  xul.dll!nsAppShell::Run
  xul.dll!XRE_RunAppShell
  xul.dll!mozilla::ipc::MessagePumpForChildProcess::Run
  xul.dll!MessageLoop::RunInternal
  xul.dll!MessageLoop::RunHandler
  xul.dll!MessageLoop::Run
  xul.dll!XRE_InitChildProcess
  firefox.exe!content_process_main
  firefox.exe!NS_internal_main
  firefox.exe!wmain
  firefox.exe!__scrt_common_main_seh
  kernel32.dll!@BaseThreadInitThunk@12
  ntdll.dll!__RtlUserThreadStart
  ntdll.dll!__RtlUserThreadStart@8
  
firefox.exe has triggered a breakpoint.




The bad flags are set at MultipartImage::OnLoadComplete.  I think the combination there is illegal regarding the expectations in CheckProgressConsistency.  Hence, an image bug.

Moving to the right component.
Assignee: honzab.moz → nobody
Component: Networking → ImageLib
Daniel, are you sure this particular bug (the assertion failure) is sec-high?
As I said in comment 6 I've debugged the assert in ProgressTracker.cpp and I think it should not be firing, I plan to write a patch to make it not fire in this testcase. I will do that in another bug since I think there are more serious problems going on in this testcase:

1) not delivering the full data from the network to consumers (this I haven't verified, so it's possible the server is truncating the data in some cases, but wget always seems to get the full data of each part)
2) missing onstart call
3) the ASAN report of a heap buffer overflow (stack completely in network, not in image)

(In reply to Honza Bambas (:mayhemer) from comment #9)
> - we are not sending duplicate onstart/onstop

Not duplicate onstart/onstop, a missing onstart. For multipart streams it seems we get an onstart/onstop pair for each part...except in this testcase sometimes an onstart is missing.

If you want to investigate the missing onstart call I'd recommend commenting out the assert in ProgressTracker.cpp and reloading the testcase until you trigger this assert

https://dxr.mozilla.org/mozilla-central/rev/cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/image/SourceBuffer.cpp#467

that should indicate that we've had two onstop calls in a row. Turn on logging beforehand to verify this.

(In reply to Honza Bambas (:mayhemer) from comment #10)
> Daniel, are you sure this particular bug (the assertion failure) is sec-high?

There is also an ASAN report of a heap buffer overflow which justifies at least sec-high.
Assignee: nobody → honzab.moz
Component: ImageLib → Networking
Flags: needinfo?(honzab.moz)
Track 51+/52+ as sec-high.
(In reply to Timothy Nikkel (:tnikkel) from comment #11)
> As I said in comment 6 I've debugged the assert in ProgressTracker.cpp and I
> think it should not be firing, 

True is that I don't know this code well enough.  Probably when you set FLAG_LOAD_COMPLETE (| FLAG_LAST_PART_COMPLETE) in MultipartImage::OnLoadComplete it's expected that one of the FLAG_SIZE_AVAILABLE | FLAG_HAS_ERROR is already on mProgress.

Is that so?

> I plan to write a patch to make it not fire
> in this testcase. I will do that in another bug since I think there are more
> serious problems going on in this testcase:
> 
> 1) not delivering the full data from the network to consumers (this I
> haven't verified, so it's possible the server is truncating the data in some
> cases, but wget always seems to get the full data of each part)

It might be that there is an issue with the mutipartconv, hence the second bug.  A second bug because I believe that an assertion in the image code caused by _a truncated only data_ should be fixed in the image code and not in the networking (lower level) code.

tho, indeed if we are not calling onstart, it's another thing and the assertion correctly tells us that networking code is wrong.  but that didn't confirm for me so far.

> 2) missing onstart call

I don't see onstart would be missing.  I think it's off the table.  There were always nsPartChannel::SendOnDataAvailable seen for the appropriate channel and consumer.  that method only forwards to the listener, not even checking non-nullness (the target is raster image here I think).

> 3) the ASAN report of a heap buffer overflow (stack completely in network,
> not in image)

Not much related.  It's actually a third bug.  I think this is because the delimiter parser looks for '-' after the delimiter token falsely assuming the data are there.  but when the server sends the response byte by byte, it's not there.  and hence the read after buffer boundery.  definitely a bad bug to be fixed, but fixing this particular nit will not fix this bug, afaict.

> 
> (In reply to Honza Bambas (:mayhemer) from comment #9)
> > - we are not sending duplicate onstart/onstop
> 
> Not duplicate onstart/onstop, a missing onstart. 

no, doesn't happening.

> For multipart streams it
> seems we get an onstart/onstop pair for each part...except in this testcase
> sometimes an onstart is missing.

if so, can you tell me how did you find out?  maybe I was missing some tracepoint or log to look at.  if that is the case, then we may have the cause of the image code assertion failure, yes.  but so far I always see onstart + the assertion failure.

> 
> If you want to investigate the missing onstart call I'd recommend commenting
> out the assert in ProgressTracker.cpp and reloading the testcase until you
> trigger this assert

I don't follow.

> 
> https://dxr.mozilla.org/mozilla-central/rev/
> cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/image/SourceBuffer.cpp#467

I've never hit this.

> 
> that should indicate that we've had two onstop calls in a row. Turn on
> logging beforehand to verify this.

which module exactly?  I didn't find any logging added in the image code :(

> 
> (In reply to Honza Bambas (:mayhemer) from comment #10)
> > Daniel, are you sure this particular bug (the assertion failure) is sec-high?
> 
> There is also an ASAN report of a heap buffer overflow which justifies at
> least sec-high.

explained above.  I'd like to fix that in the second bug I reported.



Putting back to the right component I believe this bug belongs to.  Hope the explanations justify that.

Let's mark this dep on bug 1321612 that will check the multimixconv first.
Assignee: honzab.moz → nobody
Component: Networking → ImageLib
Depends on: 1321612
Flags: needinfo?(honzab.moz) → needinfo?(tnikkel)
(Timothy, sorry I didn't cc you on bug 1321612 at the first place, that could be a source of the confusion here)
(In reply to Honza Bambas (:mayhemer) from comment #14)
> (Timothy, sorry I didn't cc you on bug 1321612 at the first place, that
> could be a source of the confusion here)

Ahh okay! That makes a lot more sense now.
I filed bug 1321946 (with patch) to fix the imagelib assert since it is not a security issue.
To reproduce the assert at

https://dxr.mozilla.org/mozilla-central/rev/cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/image/SourceBuffer.cpp#467

apply the patch from bug 1321946. Run Firefox with NSPR_LOG_MODULES=imgRequest:5, reload the testcase patch until you hit the assert. Then look at the output, there should be a line with "imgRequest::OnStopRequest" right before the assert. Search for the "this" pointer from this line. If you do it enough times you should see two lines in a row with the same "this" pointer and OnStopRequest.

(There is another situation where this assert can happen, when we get no OnDataAvailable call. It seems legit to not get an OnDataAvailable call, so I wrote patches to fix this case.)
Flags: needinfo?(tnikkel) → needinfo?(honzab.moz)
Flags: needinfo?(tnikkel)
(In reply to Timothy Nikkel (:tnikkel) from comment #17)
> To reproduce the assert at
> 
> https://dxr.mozilla.org/mozilla-central/rev/
> cd4cdcc9ad6c45dad8b8d8c0d40e459db2bca8a1/image/SourceBuffer.cpp#467
> 
> apply the patch from bug 1321946. Run Firefox with
> NSPR_LOG_MODULES=imgRequest:5, reload the testcase patch until you hit the
> assert. Then look at the output, there should be a line with
> "imgRequest::OnStopRequest" right before the assert. Search for the "this"
> pointer from this line. If you do it enough times you should see two lines
> in a row with the same "this" pointer and OnStopRequest.
> 
> (There is another situation where this assert can happen, when we get no
> OnDataAvailable call. It seems legit to not get an OnDataAvailable call, so
> I wrote patches to fix this case.)

A note: if imgRequest::OnStopRequest is out of order, then it might be an img lib bug.  Anyway, I will first fix the multimixedconv (which is totally messed up) and then see if it's fixed.  Let's go layer by layer.
I can reproduce two OnStopRequest's in a row with the patch from bug 1321612 (and the tokenizer patch from bug 1322825) by shift-reloading the page before the stream has finished (the page is still loading). I put some printfs in the networking code. It looks like nsMultiMixedConv sends the OnStopRequest for a part, and then the http channel sends OnStopRequest (I assume because the shift-reload cancelled the request) and this OnStopRequest gets passed on through.
(In reply to Timothy Nikkel (:tnikkel) from comment #19)
> I can reproduce two OnStopRequest's in a row with the patch from bug 1321612
> (and the tokenizer patch from bug 1322825) by shift-reloading the page
> before the stream has finished (the page is still loading). I put some
> printfs in the networking code. It looks like nsMultiMixedConv sends the
> OnStopRequest for a part, and then the http channel sends OnStopRequest (I
> assume because the shift-reload cancelled the request) and this
> OnStopRequest gets passed on through.

Thanks!  The patch for the multimix still needs some polishing, this will be part of it.  Interesting, any of the tests don't catch this :/
Flags: needinfo?(honzab.moz)
I tested the new version of the patch and I no longer see any issues. Thanks!
Depends on: 1323875
Depends on: 1323877
I filed bug 1323875 (for an issue I noticed while investigating this) and bug 1323877 (to fix imagelib if we get no OnDataAvailable calls, which only seems to happen before the patch from bug 1321612).

That should cover all the issues observed from this website/testcase.
Flags: needinfo?(tnikkel)
Tracking 53+ for this sec high issue.
Sounds like we should close this bug?
Flags: needinfo?(tnikkel)
We can close this bug as long as we don't need this bug to track all the issues with this testcase and all the tracking flags etc are migrated to dependent bugs.
Flags: needinfo?(tnikkel) → needinfo?(milan)
Let's leave it open then  until that's all sorted out.
Flags: needinfo?(milan)
just pinged in bug 1321612 to get that landed
Flags: needinfo?(dveditz)
I don't think we need to keep this open. Bug 1321612 should be the only remaining issue here. Someone just needs to make sure the tracking/sec flags are set properly in both bugs.
Mark 51 won't fix as 51 was released.
All dependant bugs landed.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Group: network-core-security → core-security-release
Flags: needinfo?(dveditz)
Whiteboard: [adv-main52-]
Flags: qe-verify+
Whiteboard: [adv-main52-] → [adv-main52-][post-critsmash-triage]
Tested on Windows 7 x32 with the build: http://ftp.mozilla.org/pub/firefox/tinderbox-builds/mozilla-release-win32-debug/1488485271/   provided by Tomcat, and I can't reproduce the issue anymore. 

Please note that I used the build: http://ftp.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-win32-debug/1480434387/ , the build from the day when the bug was reported and I could reproduce the crash 1 time. 

Based on the fact that I can't reproduce the initial crash with the latest build, I can confirm the fix.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.