Closed Bug 920725 Opened 6 years ago Closed 6 years ago

crash in nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, unsigned int, unsigned int*)

Categories

(Core :: DOM: HTML Parser, defect, critical)

x86
Windows NT
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla27
Tracking Status
firefox24 --- unaffected
firefox25 + verified
firefox26 --- verified
firefox27 --- verified

People

(Reporter: ashughes, Assigned: sworkman)

References

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-33845ac5-085e-4aaf-ab03-c6c112130925.
=============================================================
0 	xul.dll 	nsHtml5StreamParser::WriteStreamBytes(unsigned char const *,unsigned int,unsigned int *) 	parser/html/nsHtml5StreamParser.cpp
1 	xul.dll 	nsHtml5StreamParser::WriteSniffingBufferAndCurrentSegment(unsigned char const *,unsigned int,unsigned int *) 	parser/html/nsHtml5StreamParser.cpp
2 	xul.dll 	nsHtml5StreamParser::SniffStreamBytes(unsigned char const *,unsigned int,unsigned int *) 	parser/html/nsHtml5StreamParser.cpp
3 	xul.dll 	nsHtml5StreamParser::DoDataAvailable(unsigned char const *,unsigned int) 	parser/html/nsHtml5StreamParser.cpp
4 	xul.dll 	nsHtml5StreamParser::CopySegmentsToParser(nsIInputStream *,void *,char const *,unsigned int,unsigned int,unsigned int *) 	parser/html/nsHtml5StreamParser.cpp
5 	xul.dll 	nsStringInputStream::ReadSegments(tag_nsresult (*)(nsIInputStream *,void *,char const *,unsigned int,unsigned int,unsigned int *),void *,unsigned int,unsigned int *) 	xpcom/io/nsStringStream.cpp
6 	xul.dll 	nsHtml5StreamParser::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned __int64,unsigned int) 	parser/html/nsHtml5StreamParser.cpp
7 	xul.dll 	nsDocumentOpenInfo::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned __int64,unsigned int) 	uriloader/base/nsURILoader.cpp
8 	xul.dll 	nsStreamListenerWrapper::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned __int64,unsigned int) 	obj-firefox/dist/include/nsStreamListenerWrapper.h
9 	xul.dll 	nsHTTPCompressConv::do_OnDataAvailable(nsIRequest *,nsISupports *,unsigned __int64,char const *,unsigned int) 	netwerk/streamconv/converters/nsHTTPCompressConv.cpp
10 	xul.dll 	nsHTTPCompressConv::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned __int64,unsigned int) 	netwerk/streamconv/converters/nsHTTPCompressConv.cpp
11 	xul.dll 	mozilla::net::nsHttpChannel::OnDataAvailable(nsIRequest *,nsISupports *,nsIInputStream *,unsigned __int64,unsigned int) 	netwerk/protocol/http/nsHttpChannel.cpp
12 	xul.dll 	nsInputStreamPump::OnStateTransfer() 	netwerk/base/src/nsInputStreamPump.cpp
13 	xul.dll 	nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream *) 	netwerk/base/src/nsInputStreamPump.cpp
14 	xul.dll 	nsInputStreamReadyEvent::Run() 	xpcom/io/nsStreamUtils.cpp
15 	xul.dll 	nsThread::ProcessNextEvent(bool,bool *) 	xpcom/threads/nsThread.cpp
16 	xul.dll 	nsThread::ThreadFunc(void *) 	xpcom/threads/nsThread.cpp
17 	nss3.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c
18 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c
19 	msvcr100.dll 	_callthreadstartex 	f:\\dd\\vctools\\crt_bld\\self_x86\\crt\\src\\threadex.c
20 	msvcr100.dll 	_threadstartex 	f:\\dd\\vctools\\crt_bld\\self_x86\\crt\\src\\threadex.c
21 	kernel32.dll 	BaseThreadStart 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=nsHtml5StreamParser%3A%3AWriteStreamBytes%28unsigned%20char%20const*%2C%20unsigned%20int%2C%20unsigned%20int*%29

Explosive in Firefox 25 Beta 1, currently at #4 in browser crashes, #12 overall. Other Firefox versions are reported but in much lower volume.

Unfortunately I see no correlations nor any useful comments as of yet.
I've been playing around with flipora.com but have not experienced a crash in the last hour. At this point I think it would be useful to have a developer advise us what types of functionality use this code path.
Henri, you're listed as the module owner of the HTML Parser. Can you take a look at this stack and recent changes made in FF25? Thanks.
Flags: needinfo?(hsivonen)
I suspect bug 497003. sworkman, does that look plausible to you?

(How come the crash stack doesn't have line numbers?)
Flags: needinfo?(hsivonen) → needinfo?(sworkman)
Thanks!
Flags: needinfo?(sworkman)
Please stop cancelling need-info and spamming bugs.
Flags: needinfo?(sworkman)
(Sorry - just noticing this now...and probably not going to be able to do much until next Tue when I'm back in the office after summit).

From the stack trace, it looks like mLastBuffer is gone or invalid. And it's being accessed off main thread in OnDataAvailable - so, maybe something unset it ON the main thread? I see it set in OnStartRequest and then in WriteStreamBytes, which should be in the context of OnDataAvailable. OnStart and OnData should be called in sequence, never together - I don't think they're being called in parallel here. Also, I don't see mLastBuffer ever being set to nullptr (except in the constructor).

I'm wondering if the parser object is being deleted as part of a Cancel or something on the main thread?

mLastBuffer's class (nsHtml5OwningUTF16Buffer) uses nsAutoRefCnt for refcounting - since it's accessed on multiple threads, maybe it could use ThreadsafeAutoRefCnt.

All speculative musings after looking for 10 minutes :) STR would be great, of course ;) ... More next week.
Flags: needinfo?(sworkman)
(In reply to Steve Workman [:sworkman] from comment #7)
> All speculative musings after looking for 10 minutes :) STR would be great,
> of course ;) ... More next week.

sadly str haven't been possible thus far according to comment 2. let us know if we can pull anything from crash stats to help in the investigation. Hopefully we can do some hardening and land before Thursday's beta, especially if the fix is only speculative.
Assignee: nobody → sworkman
So, it looks like mLastBuffer is null. From debugging one of the minidumps, I see that buffer->end is at +0x08, the crash line is calling buffer->getEnd, and all the reports crash at 0x08. So, that's nice to know.

I need to figure out why mLastBuffer could be null here and what's the simplest/safest patch we could apply.
(Sorry, I did not intend to change those flags earlier - I was only writing a comment. And I can't set 'tracking-firefox25' back to +).

Not much more to add patch-wise, even speculative. It's unclear to me why mLastBuffer would be null and I don't have any great speculative fixes to offer. I could add a check for mLastBuffer and try to return somewhat safely - something like:

if (!mLastBuffer) {
    return NS_ERROR_UNEXPECTED;
}

so rather than crashing, the page will cause an error. But that might push the bug into the shadows.

Henri, your thoughts would be greatly appreciated here.
Flags: needinfo?(hsivonen)
Another option: the patch in Bug 915905 *may* help here. I'm wondering if we're seeing concurrent OnStartRequest and OnDataAvailable calls, which shouldn't happen. Bug 915905 aims to stop that.

Rationale: The only place I can see mLastBuffer being set to nullptr is in nsHtml5StreamParser's constructor. Then it is set to point to something in OnStartRequest, which should return before OnDataAvailable is called, before WriteStreamBytes. So, if it's null in WriteStreamBytes, maybe OnDataAvailable was called too early?

I have no other evidence to prove this, but if it's what's happening then the patch in Bug 915905 should help.

Risk: It's been on nightly for a couple of weeks with no known issues. But it hasn't baked on Aurora, so there's a risk that the larger audience on beta may find something.

I've pushed both options (1. Handle the nullptr as an error (comment 10) and 2. the patch from Bug 915905) to try.

Alex, let me know if you want to try this.
Current Ranking:
 * Release: N/A
 * Beta:    #4  (0.77%)
 * Aurora:  #38 (0.27%)
 * Nightly: N/A
Thanks Anthony: since it's not on Nightly, then I'm a bit more convinced to try the patch from Bug 915905, pushing to Aurora and Beta.

I'll set approval-aurora/beta? in that bug.
(In reply to Steve Workman [:sworkman] (PTO Fri Oct 10, 2013) from comment #7)
> mLastBuffer's class (nsHtml5OwningUTF16Buffer) uses nsAutoRefCnt for
> refcounting - since it's accessed on multiple threads, maybe it could use
> ThreadsafeAutoRefCnt.

Wouldn't that just add unnecessary locking/atomic operations, since concurrent access to instances of that class from different threads supposed to be prevented by nsHtml5StreamParser::mTokenizerMutex?

(In reply to Steve Workman [:sworkman] (PTO Fri Oct 10, 2013) from comment #11)
> Another option: the patch in Bug 915905 *may* help here. I'm wondering if
> we're seeing concurrent OnStartRequest and OnDataAvailable calls, which
> shouldn't happen. Bug 915905 aims to stop that.
> 
> Rationale: The only place I can see mLastBuffer being set to nullptr is in
> nsHtml5StreamParser's constructor. Then it is set to point to something in
> OnStartRequest, which should return before OnDataAvailable is called, before
> WriteStreamBytes. So, if it's null in WriteStreamBytes, maybe
> OnDataAvailable was called too early?
> 
> I have no other evidence to prove this, but if it's what's happening then
> the patch in Bug 915905 should help.

Seems reasonable to me that this would be the right fix.
Flags: needinfo?(hsivonen)
We're going to try to confirm this did the trick on Aurora (see https://bugzilla.mozilla.org/show_bug.cgi?id=915905#c9)
So, bsmedberg reports in bug 915905 comment 10 that crashes are still happening after the patch from bug 915905 landed. I see one crash on aurora with 20131013004004 and one on nightly with 20131011115413 (boo!). Both on flipora.com.

So, since that patch didn't work, what about the suggestion in comment 10:

if (!mLastBuffer) {
    return NS_ERROR_UNEXPECTED;
}

So, nsHtml5StreamParser::OnDataAvailable would return an error instead of crashing. Henri, thoughts, worries?
(In reply to Steve Workman [:sworkman] from comment #17)
> So, since that patch didn't work, what about the suggestion in comment 10:
> 
> if (!mLastBuffer) {
>     return NS_ERROR_UNEXPECTED;
> }
> 
> So, nsHtml5StreamParser::OnDataAvailable would return an error instead of
> crashing. Henri, thoughts, worries?

That's better than crashing. However, please be sure to call MarkAsBroken() before returning in order to avoid introducing a security bug by simply dropping input on the floor.
Can we try to get a fix in for Thursday's beta? Last chance for speculative fixes.
Henri, awesome. MarkAsBroken - for sure. Meant to include that.

Alex, top priority today.
Current Rank:
 * Release: N/A
 * Beta:    #4  (1.0%)
 * Aurora:  #20 (0.41%)
 * Nightly: N/A

This seems to be rising in affected branches (25, 26) since last week.
-- Return with error in nsHtml5StreamParser::WriteStreamBytes if mLastBuffer is null
-- NS_WARNING to get error output on console.
-- Calls MarkAsBroken() before returning.

try: https://tbpl.mozilla.org/?tree=Try&rev=ada03ad2a662
Attachment #817423 - Flags: review?(hsivonen)
Attachment #817423 - Flags: review?(hsivonen) → review+
Keywords: checkin-needed
Comment on attachment 817423 [details] [diff] [review]
v1.0 Return with error in nsHtml5StreamParser::WriteStreamBytes if mLastBuffer is null

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
-- Bug 497003.
 
User impact if declined:
-- Continued crashes, seems to be most likely on flipora.com according to crash
reports.

Testing completed (on m-c, etc.):
-- Just landed on m-c; passed try push.

Risk to taking this patch (and alternatives if risky):
-- No current alternatives.
-- Crash converted to error: so, no crash, but lower visibility for us.

String or IDL/UUID changes made by this patch:
-- None.
Attachment #817423 - Flags: approval-mozilla-beta?
Attachment #817423 - Flags: approval-mozilla-aurora?
(In reply to Steve Workman [:sworkman] from comment #24)
> Testing completed (on m-c, etc.):
> -- Just landed on m-c; passed try push.
> 
I meant just landed on m-i.
https://hg.mozilla.org/mozilla-central/rev/08a84b468021
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Comment on attachment 817423 [details] [diff] [review]
v1.0 Return with error in nsHtml5StreamParser::WriteStreamBytes if mLastBuffer is null

thanks Steve!
Attachment #817423 - Flags: approval-mozilla-beta?
Attachment #817423 - Flags: approval-mozilla-beta+
Attachment #817423 - Flags: approval-mozilla-aurora?
Attachment #817423 - Flags: approval-mozilla-aurora+
Given this was not reproducible we will evaluate whether this patch worked or not based on crashstats.
Fingers crossed...
This does not show up in the top-crash report for Firefox 25.0b9. It's #62 and dropping on Aurora, and non-existent on Nightly. Calling this verified fixed based on this data.
Thanks Anthony!
This bug still crashes my Seamonkey - Linux 64 Fedora 20. Both nightly from Mozilla FTP and my own build crashing as soon as i trying to read first unread RSS feed message in mailnews.

Today's report ID is a451a278-a2db-471e-989f-b7a702140124 for example.
Steve can confirm but I'm not sure this fix was intended to eliminate this crash altogether or whether it was just meant to reduce it's prevalence. As it stands there have only been 4 reports of this signature against Seamonkey, 1 report against Thunderbird, and 2 reports against Firefox in the last 7 days.

Misak, if you can reproduce this reliably then it might be worth filing a follow-up bug but I still consider *this* report to be resolved. 

Steve, what do you think?
Yes, i can reproduce it without any problem 100% with official nightly and my build. Filling follow-up.
Blocks: 964182
Yeah, Anthony's thinking was my thinking on two counts:
1. The fix was somewhat speculative, if I remember correctly. We had a good idea where the crash was happening, but not 100% accuracy on the reason.
2. A follow-up sounds like it might be helpful in determining the root cause. So, thanks for filing that.

I'll assign it to myself and continue comments over there.
You need to log in before you can comment on or make changes to this bug.