Open Bug 447987 Opened 16 years ago Updated 2 years ago

[mozTXTToHTMLConv] Long URL inside *structs* causes hang

Categories: Core :: Networking, defect, P5

Platform: x86, All

People: (Reporter: raydenxy, Unassigned)

Details: (Keywords: hang, Whiteboard: [necko-would-take])

Attachments: (2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Build Identifier: version 2.0.0.16 (20080708) + 2.0.0.14

Thunderbird 2.0.0.14 is prone to a remote denial-of-service attack because it fails to properly handle overly long URLs of the form www.[100000+ x 'a'].com. An example would be <a href="http://www.a.a.a.a.a....[100000+].com/">test</a> embedded in an HTML file sent as an attachment. When trying to open the email, Thunderbird will attempt to interpret the HTML page for inline display, consume large amounts of CPU and memory (RAM), and stop responding, thus hanging. A malicious attacker can send an email with such an HTML file attached, thereby causing a remote denial of service against Thunderbird clients that try to open the email.
Tested on Thunderbird 2.0.0.14 under Windows XP. Other versions might be affected too.
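A minimal sketch of how such a testcase could be generated (the helper name is mine, not from the report or the advisory): build the pathological host, "www." followed by many repetitions of "a." and a trailing "com".

```cpp
#include <string>

// Hypothetical reproduction helper: host of the form
// "www." + n repetitions of "a." + "com".
std::string BuildLongHost(int n) {
    std::string host = "www.";
    for (int i = 0; i < n; ++i)
        host += "a.";  // each repetition adds another dot for the URL scanner to consider
    host += "com";
    return host;
}
```

Embedding `<a href="http://...">test</a>` with `BuildLongHost(100000)` as the host in an HTML file, and attaching that file to a mail, matches the setup described above.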

Please visit the following url for the original advisory:
http://shinnok.evonet.ro/vulns_html/thunderbird.html

Reproducible: Always

Steps to Reproduce:
Please read the Details section.
Actual Results:  
Thunderbird hangs consuming cpu and memory.

Expected Results:  
Thunderbird should invalidate such URLs or recover gracefully when given very long URLs.
Summary: Thunderbird 2.0.0.14 url handling cpu+memory consumption DOS → Thunderbird 2.0.0.14+2.0.0.16 url handling cpu+memory consumption DOS
This is just a hang, and one of many ways to hang a browser.
Confirming with SeaMonkey trunk and moving to Core, but this could be a dupe.
Windbg fails to generate a stack trace for this hang...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: hang
Product: Thunderbird → Core
QA Contact: general → general
Summary: Thunderbird 2.0.0.14+2.0.0.16 url handling cpu+memory consumption DOS → Gecko hangs with very long URL
Version: unspecified → 1.9.0 Branch
Boris, is this possibly related to one of the Thunderbird hangs (bug 440641) that was eventually found to be a Core bug?
That bug has to do with selecting long text.  That's not happening here.  So something else is going on.

Loading the HTML file in question in a browser does NOT hang.  So my gut instinct is that this is an issue in the mailnews code.
I sent the URL via Send Page to myself, and in Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.1b4pre) Gecko/20090415 SeaMonkey/2.0b1pre there's no noticeable hang.

I can confirm that it hangs TB 2.0.0.19 for Linux: OS → All
OS: Windows XP → All
Yep, worksforme in comm-central seamonkey too.  Would be interesting to figure out when it stopped hanging.
Attached file gdb bt full log for hang (obsolete) —
I could reproduce the hang with a self-compiled SeaMonkey build. It only happens with the message body view set to plain text. The attachment shows the "bt full" output of an attached gdb.
Attached file sanitized gdb log
Now only with the relevant parts and loaded symbols.
Attachment #373720 - Attachment is obsolete: true
I see this as well. The plain text is the critical part.

I've run it in the debugger and paused and restarted several times and the critical function seems to be:

#3  0x00007fb36f4b4349 in mozTXTToHTMLConv::ScanTXT (this=0x7fb35bc8fbe0, aInString=0x7fb35a63f008, aInStringLength=4243554, whattodo=32767, aOutString=@0x7fff84e7b3b0)
    at /home/[...]/seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:1206
No locals.
#4  0x00007fb36f4b46b3 in mozTXTToHTMLConv::ScanTXT (this=0x7fb35bc8fbe0, text=0x7fb35a63f008, whattodo=14, _retval=0x7fff84e7b5e8)
    at /home/[...]/seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:1380
No locals.

In all my pauses, ScanTXT was the common factor.
So what is the expected behavior for "view as plain text" in this case?  Is it supposed to show the URL from the <a> so that the user can click on it or something?

That is, do we have some code that converts that HTML file to plaintext, so that the huge URL ends up as plaintext and we then do URL detection on the result?

And if so, would putting a long plaintext url into a text/plain e-mail have the same hang result?
I made some more tests. I don't see the hang for plain text mails and I don't see the hang if I insert the URL directly into a HTML-Mail.

I can only reproduce it with "view as plain text", an attached HTML-Mail with that long link and "display attachments as inline".
OK, so what's different in the other cases?  Is mozTXTToHTMLConv::ScanTXT called at all in those cases?
In any case, when it _is_ called in the above stack, why exactly is aInStringLength=4243554 for ScanTXT but aInLength=32767 for FindURL?

It'd be interesting to step through ScanTXT and FindURL in a build that shows the problem.
I don't understand that code so far, but I've found the following:

mozTXTToHTMLConv::CalculateURLBoundaries (this=0x7f65d6c44059, aInString=0x7f65c42bcfc0, aInStringLength=246818, pos=2425, whathasbeendone=32613, check=abbreviated, start=3603753352, end=4, txtURL=@0x7f65c3c021d0, desc=@0x7f65d6553a02, replaceBefore=@0x7f65c3c7a84a, replaceAfter=@0x7f65e3f9e235) at /seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:382

I've set a breakpoint there, and pos increases by 2 at every break. So it seems to take quite some time to reach the end of the string.
Are we making multiple FindURL calls too?
FindURL and ScanTXT seem to call each other, with FindURL's pos and ScanTXT's aInStringLength increased by 2 per call:

mozTXTToHTMLConv::FindURL (this=0x7f200f7f7b60, aInString=0x7f200723f008, aInLength=246818, pos=12891, whathasbeendone=14, outputHTML=@0x7fff3349b220, replaceBefore=@0x7fff3349b2c8, replaceAfter=@0x7fff3349b2c4) at /seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:524

mozTXTToHTMLConv::ScanTXT (this=0x7f200f7f7b60, aInString=0x7f200723f6f6, aInStringLength=12006, whattodo=12, aOutString=@0x7fff3349ae70) at /seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:1116

mozTXTToHTMLConv::FindURL (this=0x7f200f7f7b60, aInString=0x7f200723f008, aInLength=246818, pos=12893, whathasbeendone=14, outputHTML=@0x7fff3349b220, replaceBefore=@0x7fff3349b2c8, replaceAfter=@0x7fff3349b2c4) at /seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:524

 mozTXTToHTMLConv::ScanTXT (this=0x7f200f7f7b60, aInString=0x7f200723f6f6, aInStringLength=12008, whattodo=12, aOutString=@0x7fff3349ae70) at /seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:1116

And this goes on and on.
Uh....  ScanTXT when called under FindURL shouldn't be calling FindURL.  Can you show me a stack where it is, exactly?

So it does sound like we make multiple FindURL calls.  Can you tell me why?
Sorry, I misinterpreted the ddd output. What we have is the following:

#0  mozTXTToHTMLConv::ScanTXT (this=0x7fe20d0dee40, aInString=0x7fe20c93f008, aInStringLength=246818, whattodo=14, aOutString=@0x7fff3295b850) at /home/i6stud/sibresch/seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:1206
#1  0x00007fe21c5d0c2d in mozTXTToHTMLConv::ScanTXT (this=0x7fe20d0dee40, text=0x7fe20c93f008, whattodo=14, _retval=0x7fff3295bb18) at /home/i6stud/sibresch/seamonkey_hg/src/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp:1380
	outString = {<nsAString_internal> = {mData = 0x7fe20c302008, mLength = 12029, mFlags = 5}, <No data fields>}
	inLength = 246818
#2  0x00007fe2139db244 in MimeInlineTextPlain_parse_line (line=0x7fe20c083008 "\n    [2008-07-23] Thunderbird 2.0.0.14 url handling cpu+memory consumption DOS\n\nThunderbird 2.0.0.14 is prone to a remote denial of service attack because it \nfails to properly handle overly long url'"..., length=246818, obj=0x7fe20e5f1f60) at /home/i6stud/sibresch/seamonkey_hg/src/mailnews/mime/src/mimetpla.cpp:449

That's the beginning. Once ScanTXT at #0 is entered, there are after some time repeated calls to FindURL, with pos increased every time. FindURL calls CalculateURLBoundaries, which calls ScanTXT again (and that ScanTXT doesn't call FindURL), and then control returns to the #0 ScanTXT.

So no recursion here, but a lot of calls to FindURL.
Hmm.  Do we end up making a FindURL call for every '.', perhaps?  That's what it's looking like.

For one of those FindURL calls, what's the value of |pos| and what does it output for |replaceBefore| and |replaceAfter|?  And what does it return (true or false)?
Attached file minimal testcase
The problem is not the long link, that works fine (e.g. in plain text mails). What breaks us here is http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#1162

The /a&gt; at the beginning of the testcase sets structPhrase_italic to 1. The FindURL call at mozTXTToHTMLConv.cpp#1206 succeeds, but because the second condition includes structPhrase_italic, the if clause evaluates to false and the link isn't copied to outputHTML.
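A toy model (mine, not the real ScanTXT loop) of why the discarded FindURL result matters: when a match is emitted, the scanner can jump past the whole URL; when the struct-phrase condition rejects it, pos advances by only a character or so, and the next iteration re-finds nearly the same URL from one position further in.

```cpp
#include <cstddef>

// Iterations needed to scan a string of length n containing one URL
// spanning [0, urlLen). skipPastMatch models the normal case (the
// match is emitted and pos jumps past it); skipPastMatch == false
// models the bug: the result is discarded inside a struct phrase, so
// pos advances by one and the URL is re-found over and over. Each
// "found" iteration additionally re-scans the match inside FindURL,
// so the discarded case is quadratic overall.
std::size_t ScanIterations(std::size_t n, std::size_t urlLen, bool skipPastMatch) {
    std::size_t pos = 0, iters = 0;
    while (pos < n) {
        ++iters;
        bool foundUrl = pos < urlLen;   // a FindURL hit anywhere inside the URL
        if (foundUrl && skipPastMatch)
            pos = urlLen;               // jump past the whole match
        else
            ++pos;                      // discarded result: creep forward
    }
    return iters;
}
```

With n = 100 and urlLen = 90, the emit path takes 11 iterations while the discarded path takes 100, each of which pays the full FindURL cost again.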
(In reply to comment #21)
> based on comment 20, compare bug 122876, bug 142507 and bug 230423.

dup?
Product: Core → Thunderbird
QA Contact: general → general
Whiteboard: dupme
Version: 1.9.0 Branch → unspecified
Component: General → Networking
Product: Thunderbird → Core
QA Contact: general → networking
Summary: Gecko hangs with very long URL → [mozTXTToHTMLConv] Long URL inside *structs* causes hang
Whiteboard: dupme
Whiteboard: [necko-would-take]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P5
QA Whiteboard: qa-not-actionable

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: critical → --
Severity: -- → S3