Last Comment Bug 607222 - (CVE-2010-3765) Interleaving document.write and appendChild can lead to duplicate text frames and overrunning of text run buffers
(CVE-2010-3765)
: Interleaving document.write and appendChild can lead to duplicate text frames...
Status: RESOLVED FIXED
[sg:critical] 0day
: crash, testcase, verified1.9.1, verified1.9.2
Product: Core
Classification: Components
Component: DOM (show other bugs)
: unspecified
: All All
: P1 critical (vote)
: mozilla2.0b7
Assigned To: Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
:
:
Mentors:
: 607412 (view as bug list)
Depends on: 607576
Blocks: 607228 607240
  Show dependency treegraph
 
Reported: 2010-10-25 21:28 PDT by Reed Loden [:reed] (use needinfo?)
Modified: 2014-02-02 17:04 PST (History)
53 users (show)
bzbarsky: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
beta7+
.12+
.12-fixed
.15+
.15-fixed


Attachments
stack (12.98 KB, text/plain)
2010-10-25 21:55 PDT, Bob Clary [:bc:]
no flags Details
minimized (safe) click-to-crash testcase (894 bytes, text/html)
2010-10-25 22:21 PDT, Daniel Veditz [:dveditz]
no flags Details
simpler crash-on-load testcase (369 bytes, text/html)
2010-10-26 00:33 PDT, Jesse Ruderman
no flags Details
valgrind log (73.11 KB, text/plain)
2010-10-26 00:49 PDT, Bob Clary [:bc:]
no flags Details
WIP (1.60 KB, patch)
2010-10-26 10:03 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
bzbarsky: review+
jonas: superreview+
christian: approval1.9.2.12+
christian: approval1.9.1.15+
Details | Diff | Splinter Review
Alternative/complimentary patch (1.16 KB, patch)
2010-10-26 12:19 PDT, Jonas Sicking (:sicking) No longer reading bugmail consistently
bugs: review+
bzbarsky: review+
christian: approval1.9.2.12+
christian: approval1.9.1.15+
Details | Diff | Splinter Review
a bit faster patch (5.51 KB, patch)
2010-10-26 14:31 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
jonas: review+
bzbarsky: review-
Details | Diff | Splinter Review
same for 191 if needed (5.50 KB, patch)
2010-10-26 14:57 PDT, Olli Pettay [:smaug] (way behind * queues, especially ni? queue)
bzbarsky: review-
Details | Diff | Splinter Review

Description Reed Loden [:reed] (use needinfo?) 2010-10-25 21:28:45 PDT
Created attachment 485975 [details]
PoC (without the binary)

Morten Kråkvik of Telenor SOC reported an in-the-wild 0day affecting Firefox 3.6.x on Windows to security@:

=========================================================================

A compromised site is currently redirecting visitors to
hxxp://l-3com.dyndns-work.com/admissions/admin.php, which contains
exploits directed at Firefox users on Windows.

The exploit is confirmed successful on Windows XP SP3 + Firefox
3.6.11.
Comment 1 Bob Clary [:bc:] 2010-10-25 21:55:54 PDT
Created attachment 485979 [details]
stack

stack and variables from winxp 20101024 1.9.2 build on winxp with 3.6.11 ua string.
Comment 2 Bob Clary [:bc:] 2010-10-25 21:59:20 PDT
Forgot to mention. The PoC works/crashes if you change your Firefox version to 3.6.11. No need to go to the exploit site.
Comment 3 Daniel Veditz [:dveditz] 2010-10-25 22:05:38 PDT
Created attachment 485982 [details]
'scvhost.txt' (binary from malware site)
Comment 4 Daniel Veditz [:dveditz] 2010-10-25 22:13:05 PDT
Here's one crash I get with a minimized testcase (on mac, 3.6.12pre)
bp-957a753d-474c-4907-b0b4-1d41a2101025
Comment 5 Reed Loden [:reed] (use needinfo?) 2010-10-25 22:13:32 PDT
scvhost.exe : Not detected by Sandbox (Signature: NO_VIRUS)


 [ DetectionInfo ]
    * Filename: C:\analyzer\scan\scvhost.exe.
    * Sandbox name: NO_MALWARE
    * Signature name: NO_VIRUS.
    * Compressed: NO.
    * TLS hooks: NO.
    * Executable type: Application.
    * Executable file structure: OK.
    * Filetype: PE_I386.

 [ General information ]
    * File length:        48640 bytes.
    * MD5 hash: e8ead7641f68822c8fbfe53ad7f1bf52.
    * SHA1 hash: 244860d5c40d8d13c16fa8bba133c7608a09a276.

 [ Changes to filesystem ]
    * Creates file C:\WINDOWS\temp\symantec.exe.

 [ Changes to registry ]
    * Creates value "Microsoft Windows Update"="C:\WINDOWS\temp\symantec.exe" in key "HKCU\Software\Microsoft\Windows\CurrentVersion\Run".
    * Creates value "Microsoft Windows Update"="C:\WINDOWS\temp\symantec.exe" in key "HKLM\Software\Microsoft\Windows\CurrentVersion\Run".

 [ Network services ]
    * Connects to "update.microsoft.com" on port 80 (TCP).
    * Connects to "l-3com.dyndns-work.com" on port 443 (TCP).

 [ Process/window information ]
    * Creates process "reg.EXE".
    * Will automatically restart after boot (I'll be back...).
    * Creates process "cmd.exe".

 [ Signature Scanning ]
    * C:\WINDOWS\temp\symantec.exe (48640 bytes) : no signature detection.
Comment 6 Daniel Veditz [:dveditz] 2010-10-25 22:21:51 PDT
Created attachment 485985 [details]
minimized (safe) click-to-crash testcase
Comment 7 Daniel Veditz [:dveditz] 2010-10-25 22:28:15 PDT
bp-b1f1a539-9eec-4c62-a204-cee8f2101025
bp-66776ae0-e584-47f6-85c9-8e4292101025

two more crashes, the second one seems kind of odd.

I don't think the malware is that interesting to us, the fact that it gets to run at all is a fail. We should send it on to A-V companies, of course -- they'll want to see it.
Comment 8 Andreas Gal :gal 2010-10-25 22:44:24 PDT
###!!! ASSERTION: Invalid offset: 'aOffset <= mSkipChars->mCharCount', file ../../../../gfx/thebes/src/gfxSkipChars.cpp, line 92
###!!! ASSERTION: Overflow: 'aStart + aLength <= mCharacterCount', file ../../../../gfx/thebes/src/gfxFont.cpp, line 1890
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901
WARNING: Break suggested inside cluster!: file ../../../../gfx/thebes/src/gfxFont.cpp, line 1901

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6a56d50 in nsFontCache::GetMetricsFor (this=0x7fffd4b3d130, aFont=..., aLangGroup=0x7fffe0de7248, 
    aUserFontSet=0x0, aMetrics=@0x7fffffff7cc0) at ../../../../gfx/src/thebes/nsThebesDeviceContext.cpp:163
163	                tfm->GetThebesFontGroup()->UpdateFontList();
(gdb) bt
Comment 9 Daniel Veditz [:dveditz] 2010-10-25 22:49:59 PDT
Assuming this will block branch releases. Doesn't seem to crash Firefox 4/trunk.
Comment 10 Reed Loden [:reed] (use needinfo?) 2010-10-25 23:13:18 PDT
Affects all OSes on both Firefox 3.6.x and Firefox 3.5.x.
Comment 11 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-25 23:29:16 PDT
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fa1e6f870ff1&tochange=f60b3bbfa8ce
seems to be the fix range on mozilla-central, for 64-bit Linux nightlies
Comment 12 Daniel Veditz [:dveditz] 2010-10-25 23:57:21 PDT
Google's safe-browsing service now blocks the malware site.
Comment 13 christian 2010-10-26 00:08:19 PDT
Submitted to Panda Security, McAfee, Microsoft, Symantec, ClamAV. Trend Micro's upload form throws an error.
Comment 14 christian 2010-10-26 00:08:43 PDT
(And Google obviously)
Comment 15 Jesse Ruderman 2010-10-26 00:33:06 PDT
Created attachment 486011 [details]
simpler crash-on-load testcase
Comment 16 Bob Clary [:bc:] 2010-10-26 00:49:13 PDT
Created attachment 486012 [details]
valgrind log
Comment 17 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 01:04:27 PDT
OK, I've debugged Jesse's testcase to understanding the last step of the problem:  we're crashing because we've double-constructed frames for the "abarBAR" text (originally "a"), so when we notify the primary frame about the character data append in nsCSSFrameConstructor::CharacterDataChanged (when appending "barBAR"), we only notify the first of the two.  Then we crash when dealing with the second one, since its text run has an incorrect cached length (of 1 instead of 7), and we overrun data.  (At least that explains the first assertion.)
Comment 18 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 01:38:29 PDT
First ContentAppended is here:

#0  nsCSSFrameConstructor::ContentAppended (this=0x14d8e30, 
    aContainer=0x14dc0f0, aNewIndexInContainer=7)
    at /home/dbaron/builds/1.9.2/mozilla/layout/base/nsCSSFrameConstructor.cpp:6250
#1  0x00007feaee9d3656 in PresShell::ContentAppended (this=0x18ec780, 
    aDocument=0x18d2fc0, aContainer=0x14dc0f0, aNewIndexInContainer=7)
    at /home/dbaron/builds/1.9.2/mozilla/layout/base/nsPresShell.cpp:5044
#2  0x00007feaeecf1303 in nsNodeUtils::ContentAppended (aContainer=0x14dc0f0, 
    aNewIndexInContainer=7)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsNodeUtils.cpp:137
#3  0x00007feaeec5486f in nsContentSink::NotifyAppend (this=0x14d1620, 
    aContainer=0x14dc0f0, aStartIndex=7)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsContentSink.cpp:1380
#4  0x00007feaeee2933a in SinkContext::FlushTags (this=0x14f0810)
    at /home/dbaron/builds/1.9.2/mozilla/content/html/document/src/nsHTMLContentSink.cpp:1379
#5  0x00007feaeee2f8ed in HTMLContentSink::FlushTags (this=0x14d1620)
    at /home/dbaron/builds/1.9.2/mozilla/content/html/document/src/nsHTMLContentSink.cpp:3199
#6  0x00007feaeec551d2 in nsContentSink::BeginUpdate (this=0x14d1620, 
    aDocument=0x18d2fc0, aUpdateType=1)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsContentSink.cpp:1625
#7  0x00007feaeec9b93a in nsDocument::BeginUpdate (this=0x18d2fc0, 
    aUpdateType=1)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsDocument.cpp:3917
#8  0x00007feaeea464e9 in mozAutoDocUpdate::mozAutoDocUpdate (
    this=0x7fff3e43f370, aDocument=0x18d2fc0, aUpdateType=1, aNotify=1)
    at /home/dbaron/builds/1.9.2/mozilla/layout/generic/../../content/base/src/mozAutoDocUpdate.h:53
#9  0x00007feaeecd673f in nsGenericElement::doInsertChildAt (aKid=0x193df50, 
    aIndex=7, aNotify=1, aParent=0x14dc0f0, aDocument=0x18d2fc0, 
    aChildArray=...)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3219
#10 0x00007feaeecd620e in nsGenericElement::InsertChildAt (this=0x14dc0f0, 
    aKid=0x193df50, aIndex=7, aNotify=1)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3169
#11 0x00007feaeecd9110 in nsGenericElement::doReplaceOrInsertBefore (
    aReplace=0, aNewChild=0x193df98, aRefChild=0x0, aParent=0x14dc0f0, 
    aDocument=0x18d2fc0, aReturn=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3953
#12 0x00007feaeecd75d1 in nsGenericElement::InsertBefore (this=0x14dc0f0, 
    aNewChild=0x193df98, aRefChild=0x0, aReturn=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3488
#13 0x00007feaeedb0df8 in nsHTMLBodyElement::InsertBefore (this=0x14dc0f0, 
    newChild=0x193df98, refChild=0x0, _retval=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/html/content/src/nsHTMLBodyElement.cpp:95
---Type <return> to continue, or q <return> to quit---
#14 0x00007feaeecbf528 in nsGenericElement::AppendChild (this=0x14dc0f0, 
    aNewChild=0x193df98, aReturn=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.h:500
#15 0x00007feaeedb0ea7 in nsHTMLBodyElement::AppendChild (this=0x14dc0f0, 
    newChild=0x193df98, _retval=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/html/content/src/nsHTMLBodyElement.cpp:95
#16 0x00007feaee6a76ad in nsIDOMNode_AppendChild (cx=0x17b24e0, argc=1, 
    vp=0x17cea28) at dom_quickstubs.cpp:2617


and the second is here:

#0  nsCSSFrameConstructor::ContentAppended (this=0x14d8e30, 
    aContainer=0x14dc0f0, aNewIndexInContainer=7)
    at /home/dbaron/builds/1.9.2/mozilla/layout/base/nsCSSFrameConstructor.cpp:6250
#1  0x00007feaee9d3656 in PresShell::ContentAppended (this=0x18ec780, 
    aDocument=0x18d2fc0, aContainer=0x14dc0f0, aNewIndexInContainer=7)
    at /home/dbaron/builds/1.9.2/mozilla/layout/base/nsPresShell.cpp:5044
#2  0x00007feaeecf1303 in nsNodeUtils::ContentAppended (aContainer=0x14dc0f0, 
    aNewIndexInContainer=7)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsNodeUtils.cpp:137
#3  0x00007feaeecd68de in nsGenericElement::doInsertChildAt (aKid=0x193df50, 
    aIndex=7, aNotify=1, aParent=0x14dc0f0, aDocument=0x18d2fc0, 
    aChildArray=...)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3238
#4  0x00007feaeecd620e in nsGenericElement::InsertChildAt (this=0x14dc0f0, 
    aKid=0x193df50, aIndex=7, aNotify=1)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3169
#5  0x00007feaeecd9110 in nsGenericElement::doReplaceOrInsertBefore (
    aReplace=0, aNewChild=0x193df98, aRefChild=0x0, aParent=0x14dc0f0, 
    aDocument=0x18d2fc0, aReturn=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3953
#6  0x00007feaeecd75d1 in nsGenericElement::InsertBefore (this=0x14dc0f0, 
    aNewChild=0x193df98, aRefChild=0x0, aReturn=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.cpp:3488
#7  0x00007feaeedb0df8 in nsHTMLBodyElement::InsertBefore (this=0x14dc0f0, 
    newChild=0x193df98, refChild=0x0, _retval=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/html/content/src/nsHTMLBodyElement.cpp:95
#8  0x00007feaeecbf528 in nsGenericElement::AppendChild (this=0x14dc0f0, 
    aNewChild=0x193df98, aReturn=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/base/src/nsGenericElement.h:500
#9  0x00007feaeedb0ea7 in nsHTMLBodyElement::AppendChild (this=0x14dc0f0, 
    newChild=0x193df98, _retval=0x7fff3e43f780)
    at /home/dbaron/builds/1.9.2/mozilla/content/html/content/src/nsHTMLBodyElement.cpp:95
#10 0x00007feaee6a76ad in nsIDOMNode_AppendChild (cx=0x17b24e0, argc=1, 
    vp=0x17cea28) at dom_quickstubs.cpp:2617
Comment 19 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 01:51:08 PDT
And the "a" text and the <base> element also look like they're out-of-order in the document tree.
Comment 20 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 02:06:36 PDT
I tend to think nsGenericElement::doReplaceOrInsertBefore needs to flush the content sink *before* determining the index (insPos) to use for the insertion (when aRefChild is null).
Comment 21 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 02:07:50 PDT
(In reply to comment #20)
> I tend to think nsGenericElement::doReplaceOrInsertBefore needs to flush the
> content sink *before* determining the index (insPos) to use for the insertion
> (when aRefChild is null).

I should have said nsGenericElement::doReplaceOrInsertBefore or one of its callers needs ...
Comment 22 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 02:20:33 PDT
Inserting:
  mozAutoDocUpdate updateBatch(aDocument, UPDATE_CONTENT_MODEL, PR_TRUE);
right before the if (aRefChild) about 20 lines into that function does in fact fix the assertions/crash with both attachment 486011 [details] and attachment 485985 [details].

But that's probably not quite the right place, and it sort of looks to me like there might be invariants about other stuff that needs to be done *before* doing that (although maybe the calls to nsMutationGuard::DidMutate aren't really an invariant).
Comment 23 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 03:33:19 PDT
Would that kind of updateBatch regress Bug 423269?
Comment 24 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 03:58:01 PDT
mozAutoDocUpdate updateBatch(aDocument, UPDATE_CONTENT_MODEL, PR_TRUE)
seems to cause a hang here.
Comment 25 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 04:22:36 PDT
Or it does fixes the assertions I see with Jesse's testcase (which doesn't crash here), but https://bugzilla.mozilla.org/attachment.cgi?id=485985 hangs.
Comment 26 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 05:01:28 PDT
Ah, the hang is probably because of the long loop.
Comment 27 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 05:37:32 PDT
I don't understand 
http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsGenericElement.cpp?mark=4086-4086#4083
Why it is ok to not notify other cases? Jonas, I think you added
that code long ago in Bug 325730.

If the parameter is always true, I don't get crashes nor assertions using
the testcases in this bug.
Comment 28 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 08:49:13 PDT
(In reply to comment #22)
> Inserting:
>   mozAutoDocUpdate updateBatch(aDocument, UPDATE_CONTENT_MODEL, PR_TRUE);
> right before the if (aRefChild) about 20 lines into that function does in fact
> fix the assertions/crash with both attachment 486011 [details] and attachment 485985 [details].

Flushing the document right before returning from nsHTMLDocument::WriteCommon
fixes the crashes too, but I wonder if data coming from the necko to
parser could trigger this too.
Comment 29 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 10:00:18 PDT
Ah, so nsGenericElement::doInsertChildAt does have mozAutoDocUpdate.
So yes, I agree with dbaron.
We should have the update earlier, so that it flushes before we call
container->IndexOf or container->GetChildCount() .
Comment 30 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 10:03:34 PDT
Created attachment 486085 [details] [diff] [review]
WIP

This is almost what dbaron suggested in comment 22.
But I don't quite like this, since this may cause a small performance regression 
because of double batching.
Comment 31 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 10:43:07 PDT
Another option is to check mutation guards around the update batch starts (both of them) and reget insPos as needed like we do in the aReplace case when RemoveChildAt triggers mutations, right?
Comment 32 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 10:49:55 PDT
That might be a faster solution. Less extra virtual calls.
Comment 33 Jonas Sicking (:sicking) No longer reading bugmail consistently 2010-10-26 11:17:44 PDT
So we *should* not need to change the second argument to the auto-batch-guard. We definitely should move it to above getting the insPos, but we should be able to only do it when doing a replace or inserting a fragment. Which means that such a fix would not affect the testcase as it doesn't use replace or fragment.

The reason that should be enough is that every time we flush we should be expecting random mutations to the tree. So I think the fix is figuring out why that part isn't working currently. I.e. why doesn't the current mutation guards catch the mutation here and update all indexes as needed.
Comment 34 christian 2010-10-26 11:32:07 PDT
This has been made public @ http://www.norman.com/security_center/virus_description_archive/129146/. The press is picking it up FYI.
Comment 35 christian 2010-10-26 11:35:31 PDT
I should be clearer...

I have not seen anyone talk in specifics about this bug, only that it exists. Researchers have also got their hands on the malware so it is safe to assume they (and bad guys) got the page that was serving the 0day as well.
Comment 36 Mike Shaver (:shaver -- probably not reading bugmail closely) 2010-10-26 11:36:08 PDT
I don't think we should be too picky about small performance regressions here, if it adds risk (complexity) or delay.  We have an in-the-wild exploit, and we can replace with a faster version in a subsequent update if we think it's critical to restore the missing perf.

(Not that it's clear exactly what the performance impact would be, and on what loads, but a couple of virtual calls per batch or similar sounds like something we should just eat.)
Comment 37 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 11:36:44 PDT
There is no mutation guard around the part where the html sink ends up adding stuff to the DOM.  That part being the BeginUpdate() call...
Comment 38 Reed Loden [:reed] (use needinfo?) 2010-10-26 11:37:38 PDT
Einar Oftedal, Head of Malware Detection Team (MDT), Norman R&D had this to add about the vulnerability:

===============================================================================

So the vulnerability appears to some kind of race condition issue in the 
handling of DOM element (tags) properties. The exploit enumerates all 
properties of the tags AUDIO, A, and BASE and sets them to 'a' one by one 
in a loop that spins may times. Unfortunately I'm not too familar with the 
Firefox code base, but my theory is that at some point in time, Firefox 
uses an object in two separate places, but only has incremented the 
object's reference count once. Consequently, when either part of the code 
is done with the object, the reference count reaches zero and the object 
is freed. This triggers a use-after-free condition upon subsequent use of 
the object, such as seen below. In order for an attacker to exploit this 
issue, a heap spray would need to be created in order to fill the freed 
memory. In this case, it appears that a very big heap spray (>1GB) is 
needed because of the high address. Hence, low memory machines would 
probably fail at running the exploit (I also had some initial problems 
reproducing in a VM due to low memory). However, this could probably be 
addressed which leads me to think that whoever made this exploit did not 
put much effort into it. I also believe the vulnerability was found by 
fuzzing as the author seems to just have bruteforced his way through all 
the properties until some strange application behavior was triggered. Btw, 
I replaced the heap spray with my own (just long strings of 'A's) and was 
able to hit EIP 0x41414141 without much effort, hence it seems to be 
highly exploitable.

0:000> r  
eax=438a0ee0 ebx=00000000 ecx=038fdb80 edx=00000000 esi=0012ee78 
edi=00000000
eip=100ec811 esp=0012e8c8 ebp=0012ea74 iopl=0         nv up ei pl nz na po 
nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000 efl=00010202
xul!XPCWrappedNative::GetNewOrUsed+0x21:
100ec811 8b08            mov     ecx,dword ptr [eax] 
ds:0023:438a0ee0=????????

0:000> kb  
ChildEBP RetAddr  Args to Child 
0012ea74 100f77eb 0012ee78 038fdb80 02ab9c80 
xul!XPCWrappedNative::GetNewOrUsed+0x21 
[e:\builds\moz2_slave\win32_build\build\js\src\xpconnect\src\xpcwrappednative.cpp 
@ 327]
0012eb0c 100f6e58 0012ecf0 0012ebe0 00000000 
xul!XPCConvert::NativeInterface2JSObject+0x28b 
[e:\builds\moz2_slave\win32_build\build\js\src\xpconnect\src\xpcconvert.cpp 
@ 1199]
0012ebac 100efe12 0012ecf0 0012ebe0 0012ec60 
xul!XPCConvert::NativeData2JS+0x98 
[e:\builds\moz2_slave\win32_build\build\js\src\xpconnect\src\xpcconvert.cpp 
@ 471]
0012ee48 100fc072 0012ee78 00000001 00000000 
xul!XPCWrappedNative::CallMethod+0x5e2 
[e:\builds\moz2_slave\win32_build\build\js\src\xpconnect\src\xpcwrappednative.cpp 
@ 2810]
0012ef14 00335c4d 02cc4000 02e93640 00000000 xul!XPC_WN_GetterSetter+0x1b2 
[e:\builds\moz2_slave\win32_build\build\js\src\xpconnect\src\xpcwrappednativejsops.cpp 
@ 1784]
0012efc8 0032a533 02cc4000 00000000 03d3d0f0 js3250!js_Invoke+0x42d 
[e:\builds\moz2_slave\win32_build\build\js\src\jsinterp.cpp @ 1360]
0012effc 00337de0 02cc4000 02e93640 02e7a580 
js3250!js_InternalInvoke+0x103 
[e:\builds\moz2_slave\win32_build\build\js\src\jsinterp.cpp @ 1423]
0012f068 0033911a 02cc4000 02e93640 00000000 
js3250!js_GetPropertyHelper+0x310 
[e:\builds\moz2_slave\win32_build\build\js\src\jsobj.cpp @ 4275]
0012f2b4 003169c5 02cc4000 0012f364 01953090 js3250!js_Interpret+0x115a 
[e:\builds\moz2_slave\win32_build\build\js\src\jsops.cpp @ 1904]
0012f338 003041a1 02df8760 01953090 00000000 js3250!js_Execute+0x1a5 
[e:\builds\moz2_slave\win32_build\build\js\src\jsinterp.cpp @ 1601]
0012f364 100606f3 02cc4000 02df8760 01c6f7e4 
js3250!JS_EvaluateUCScriptForPrincipals+0x61 
[e:\builds\moz2_slave\win32_build\build\js\src\jsapi.cpp @ 5073]
0012f3d8 1018230e 0012f4ac 02df8760 01c6f7e0 
xul!nsJSContext::EvaluateString+0x15e 
[e:\builds\moz2_slave\win32_build\build\dom\base\nsjsenvironment.cpp @ 
1764]
0012f490 10014d80 02e992b0 0012f4ac 02e992b0 
xul!nsScriptLoader::EvaluateScript+0x177 
[e:\builds\moz2_slave\win32_build\build\content\base\src\nsscriptloader.cpp 
@ 711]
0012f544 1018f596 02e992b0 02e992b0 02beb5e4 
xul!nsScriptLoader::ProcessRequest+0x6f 
[e:\builds\moz2_slave\win32_build\build\content\base\src\nsscriptloader.cpp 
@ 625]
0012f8e4 100505ab 02beb5e4 02beb5e4 03d91c00 
xul!nsScriptLoader::ProcessScriptElement+0x2e6 
[e:\builds\moz2_slave\win32_build\build\content\base\src\nsscriptloader.cpp 
@ 577]
0012f8fc 10014b17 03d91cbc 03d91c00 02beb5c0 
xul!nsScriptElement::MaybeProcessScript+0x74 
[e:\builds\moz2_slave\win32_build\build\content\base\src\nsscriptelement.cpp 
@ 193]
0012f9a8 10014af7 1004227f 00000001 02e99370 
xul!nsHTMLScriptElement::MaybeProcessScript+0x1d 
[e:\builds\moz2_slave\win32_build\build\content\html\content\src\nshtmlscriptelement.cpp 
@ 565]
0012f9ac 1004227f 00000001 02e99370 00000000 
xul!nsHTMLScriptElement::DoneAddingChildren+0xc 
[e:\builds\moz2_slave\win32_build\build\content\html\content\src\nshtmlscriptelement.cpp 
@ 490]
0012f9c4 1009c49d 02beb5e4 00000000 03d56380 
xul!HTMLContentSink::ProcessSCRIPTEndTag+0x63 
[e:\builds\moz2_slave\win32_build\build\content\html\document\src\nshtmlcontentsink.cpp 
@ 3116]
0012f9e4 1009bce2 00000053 00000000 00000000 
xul!SinkContext::CloseContainer+0x11d 
[e:\builds\moz2_slave\win32_build\build\content\html\document\src\nshtmlcontentsink.cpp 
@ 1018]
Comment 39 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 11:38:34 PDT
> but a couple of virtual calls per batch or similar

The number depends on the number of observers; it ranges from a few to tens of thousands, depending on the page.  But yes, if I don't have a patch per comment 31 in the next 10 mins (which I hope I will), or if it's not safe, then we should just eat that cost.
Comment 40 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 11:44:48 PDT
OK, it's more complicated than I thought, because the BeginUpdate that triggers the flush is in nsGenericElement::doInsertChildAt, which just has an index to work with, not reference nodes and the like.

So I think the safe fix is probably Smaug's patch...
Comment 41 Jonas Sicking (:sicking) No longer reading bugmail consistently 2010-10-26 12:10:28 PDT
Indeed it is the inner batch that causes the mutation, since it currently is the only batch which is created.

Smaugs patch frontloads the mutations that the sink causes. But I'm worried that we can still get mutations when the inner (current) batch starts. Though I guess that is less likely to happen since we tend to only do work when entering and exiting the outer-most batch. Except for mutation events of course, but we tend to guard somewhat well against those these days.

Another fix would be to change nsGenericElement::doInsertChildAt to make it more stable in the face of mutations. The problem in this testcase is that we calculate isAppend before starting the batch. If we calculated it after starting the batch we'd simply insert the content before the last node. Technically wrong per specs, but shouldn't crash.
Comment 42 Jonas Sicking (:sicking) No longer reading bugmail consistently 2010-10-26 12:10:51 PDT
The best thing is probably to do both.
Comment 43 Jonas Sicking (:sicking) No longer reading bugmail consistently 2010-10-26 12:19:26 PDT
Created attachment 486128 [details] [diff] [review]
Alternative/complimentary patch

This should also fix the crash. Though probably worth taking both on branch. And definitely take this on m-c.
Comment 44 Reed Loden [:reed] (use needinfo?) 2010-10-26 12:39:12 PDT
*** Bug 607412 has been marked as a duplicate of this bug. ***
Comment 45 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 12:48:11 PDT
Note that on m-c this is not a problem in HTML because the HTML5 parser doesn't do weird stuff from BeginUpdate.  Is it a problem in XML?
Comment 46 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 12:56:57 PDT
Could it be problem with XHTML even in m-c if the code does something like?

// While page is being loaded...
element.appendChild(foobar);
initiateSyncXHR(); // Could parser modify DOM here?
element.appendChild(foobar2);
Comment 47 Daniel Veditz [:dveditz] 2010-10-26 13:36:51 PDT
removing status2.0 'unaffected' due to comment 43. It may or may not be exploitable on trunk, let's just fix it.
Comment 48 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 14:31:44 PDT
Created attachment 486178 [details] [diff] [review]
a bit faster patch

This shouldn't add new virtual nor new Addref/Release calls in the common case
where container is an element.
Comment 49 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 14:51:01 PDT
Comment on attachment 486128 [details] [diff] [review]
Alternative/complimentary patch

I don't understand why ::DisMutate needs to be moved.

Other than that, r=me
Comment 50 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-26 14:57:12 PDT
Created attachment 486189 [details] [diff] [review]
same for 191 if needed
Comment 51 christian 2010-10-26 15:58:15 PDT
Any chance of review for this soon? I know jonas isn't around until 5...can someone else pick this up?
Comment 52 Zack Weinberg (:zwol) 2010-10-26 17:06:11 PDT
I don't grok which data structure has become corrupt here, but does this indicate that something that should have been covered by frame poisoning wasn't?  (in 3.6)  Or that the content tree could benefit from similar protection?
Comment 53 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 17:29:29 PDT
I don't think frame poisoning should have helped here; that said, I did file some followup bugs (bug 607478 and bug 607494) on other remediation that might.

Here's a basic description of what went wrong:

We have the following sequence of events (in Jesse's testcase):

  document.write("<a></a>a");
  document.body.appendChild(base_element);
  // get scrollTop to flush layout
  document.write("barBAR");

After the first of these document.write() calls, the a *element* has been created and nsIDocumentObserver/nsIMutationObservers have been notified.  However, the "a" is still buffered somewhere and a text node has not yet been created for it.

Inside the appendChild implementation, we share code (I guess that's the purpose) by converting the append to an appropriate insert, doing common stuff, and then later converting those insertions that are appends back to appends for notification purposes.

Before we handle the append child, we clearly need to flush anything buffered in the parser/content sink (Flush_Content).  The problem is that we do this too late, *after* converting the append to the concept "insertion of node at index 7" and "is an append".  Then we flush, creating a text node that ends up in the content tree at index 7 within its parent, and notifying the document observers that an append has occured.  The signature of ContentAppended says "content has been appended to this parent's child list, and the index of the first new child is 7".  This causes a text frame to be created.

Then we actually do the insertion.  We insert the base element at index 7, incorrectly before the "a" text.  Then we send another ContentAppended (since we'd already determined it was an append), with the index of 7.  This notification means we go create frames for the content nodes at index 7 (base) and 8 ("a").  base is display:none, but now we create a *second* frame for the "a" text.  This one doesn't end up being the primary frame.

Then there's a layout flush, which causes text frame reflow and text run creation.

Then we come to the third document.write.  Since the child list of the body ends with text, we handle this write by appending text to an existing text node.  document observers are notified (CharacterDataChanged).  The frame constructor handles this by getting the primary frame for the text node (which gets only the *first* of the two text frames), and tells the text frame that its data changed, which makes it drop its text run.

Then, when we get to reflow, we get to the second text frame, which was not told to drop its text from, so it still has a text run mapping one character.  However, it knows that it has 7 characters, so it advances iterators within its text run's data by 7 characters, which is too far.  (That gets to the first assertion, at least, which is where I started debugging.)
Comment 54 Jonas Sicking (:sicking) No longer reading bugmail consistently 2010-10-26 17:52:48 PDT
Comment on attachment 486178 [details] [diff] [review]
a bit faster patch

>@@ -3211,17 +3212,18 @@ nsGenericElement::doInsertChildAt(nsICon
> 
>   PRUint32 childCount = aChildArray.ChildCount();
>   NS_ENSURE_TRUE(aIndex <= childCount, NS_ERROR_ILLEGAL_VALUE);
> 
>   nsMutationGuard::DidMutate();
> 
>   PRBool isAppend = (aIndex == childCount);
> 
>-  mozAutoDocUpdate updateBatch(aDocument, UPDATE_CONTENT_MODEL, aNotify);
>+  mozAutoDocUpdate updateBatch(aNoBeginEndUpdate ? nsnull : aDocument,
>+                               UPDATE_CONTENT_MODEL, aNotify);

Would be more readable as

mozAutoDocUpdate updateBatch(aDocument, UPDATE_CONTENT_MODEL,
                             aNotify && !aNoBeginEndUpdate);

r=me with that.
Comment 55 Zack Weinberg (:zwol) 2010-10-26 17:54:11 PDT
Thanks for the explanation.  Just to make sure I get it: the memory corruption that allows executing data is not from lifecycle problems but from running pointers off the end of an array (in the text run object)?
Comment 56 David Baron :dbaron: ⌚️UTC-7 (busy September 14-25) 2010-10-26 17:55:15 PDT
(In reply to comment #55)
> Thanks for the explanation.  Just to make sure I get it: the memory corruption
> that allows executing data is not from lifecycle problems but from running
> pointers off the end of an array (in the text run object)?

Yes, I believe so.
Comment 57 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 18:01:36 PDT
Comment on attachment 486178 [details] [diff] [review]
a bit faster patch

You can't make that optimization, because we have element classes that override InsertChildAt to do some things in addition to calling nsGenericElement::InsertChildAt.
Comment 58 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 18:02:06 PDT
In particular, html select and optgroup do, as does svg switch (and xtf, and maybe some others; I stopped looking).
Comment 59 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 18:09:04 PDT
Do be clear, I think we should take smaug's "wip" patch from earlier in this bug.
Comment 60 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 18:09:34 PDT
Comment on attachment 486085 [details] [diff] [review]
WIP

r=me
Comment 61 Daniel Veditz [:dveditz] 2010-10-26 18:20:44 PDT
Comment on attachment 486189 [details] [diff] [review]
same for 191 if needed

It's needed.

r=dveditz based on sicking's r+. The only difference (besides line numbers) is the old branch didn't name the mozAutoDocConditionContentUpdateBatch, but either way it's removed on both branches.
Comment 62 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 18:22:55 PDT
Comment on attachment 486189 [details] [diff] [review]
same for 191 if needed

r-, for the same reason
Comment 63 Daniel Veditz [:dveditz] 2010-10-26 18:25:14 PDT
The one time a mid-air might have saved me...  My r- did mid-air with your r- so I guess attachments can only mid-air with themselves.
Comment 64 Boris Zbarsky [:bz] (still a bit busy) 2010-10-26 18:33:23 PDT
Comment on attachment 486128 [details] [diff] [review]
Alternative/complimentary patch

r=me
Comment 65 christian 2010-10-26 18:35:59 PDT
Approved. Please land on mozilla-1.9.2 (default and the GECKO19211_20100930_RELBRANCH branch) and mozilla-1.9.1 (default and GECKO19114_20100930_RELBRANCH branch)
Comment 67 juan becerra [:juanb] 2010-10-26 22:56:48 PDT
I've observed that 3.6.11 across platforms crashes on the simplified tet cases immediately on WinXP, and after a few tries on Mac 10.6 and Linux 10.10, and I have verified that the 3.6.12 candidate builds do not crash after several times of trying (about 5:1 tries). Instead you get an unresponsive-script dialog which you can choose to stop or continue.

However, I noticed that if you check the "don't ask me again" checkbox, you hang the browser and the only option is to force quit.

I'll check 3.5.15 builds next.
Comment 68 Henri Sivonen (:hsivonen) (Not reading bugmail or doing reviews until 2016-10-03) 2010-10-26 23:15:12 PDT
(In reply to comment #45)
> Note that on m-c this is not a problem in HTML because the HTML5 parser doesn't
> do weird stuff from BeginUpdate.

Specifically, the HTML5 parser flushes trailing text before returning from document.write and makes sure all parser-inserted nodes have notified before running a script and before returning from document.write().
Comment 69 juan becerra [:juanb] 2010-10-26 23:25:53 PDT
I checked in Mac 10.6, XP, and Ubuntu 10.10

3.5.14: crashes immediately
3.5.15 build candidates: You get an unresponsive script dialog, which you can stop or continue, but it doesn't crash.

You can reproduce the hang if you check the "don't ask me again" and select to continue, but otherwise this bug fix is verified on 1.9.1 and 1.9.2.
Comment 70 Olli Pettay [:smaug] (way behind * queues, especially ni? queue) 2010-10-27 02:36:02 PDT
(In reply to comment #57)
> You can't make that optimization, because we have element classes that override
> InsertChildAt to do some things in addition to calling
> nsGenericElement::InsertChildAt.
Argh, I was searching only in content/base
Comment 71 Al Billings [:abillings] 2010-10-27 08:44:48 PDT
(In reply to comment #69)
> I checked in Mac 10.6, XP, and Ubuntu 10.10
> 
> 3.5.14: crashes immediately
> 3.5.15 build candidates: You get an unresponsive script dialog, which you can
> stop or continue, but it doesn't crash.

 I assume that you did this with 3.6.11 and 3.6.12 too since you marked this as verified1.9.2?
Comment 72 juan becerra [:juanb] 2010-10-27 10:23:54 PDT
(In reply to comment #71)

Yes, I checked 3.6.11/3.6.12 and 3.5.14/3.5.15 (see comment #67 and comment #69 respectively).
Comment 73 Al Billings [:abillings] 2010-10-27 11:10:17 PDT
Thanks, Juan. I missed comment 67 in the rush. I appreciate you doing the verification for QA.
Comment 74 tomaszkalinowski123 2010-10-29 13:07:46 PDT
I reported it too:

https://bugzilla.mozilla.org/show_bug.cgi?id=606790
Comment 75 tomaszkalinowski123 2010-10-29 13:09:33 PDT
Firefox 3.6.12 patch this vulnerable.
Comment 76 tomaszkalinowski123 2010-10-29 13:19:04 PDT
I reported document.write vulnerable one day before

https://bugzilla.mozilla.org/show_bug.cgi?id=607222

but now Firefox is safe.
Comment 77 Reed Loden [:reed] (use needinfo?) 2010-10-29 13:29:42 PDT
Bug 606790 is not the same as this bug.
Comment 78 alexa.gerancho 2014-02-02 17:04:06 PST Comment hidden (spam)

Note You need to log in before you can comment on or make changes to this bug.