With speculative HTML5 parsing, http://www.forbes.com/ causes a text node to be appended to a null parent, which causes a pending append notification to be posted with null argument.
Trying to debug this with assertions, I see non-sensical stacks. Also, I managed to get a non-null bogus pointer at the crash site. I suspect heap corruption. :-(
Maybe the non-sensical stacks were due to the debugger breaking on the main thread when another thread sends an interrupt signal (when assertions are set to trap). The value of the text node being supposedly appended to the 0x0 parent node is "\n\v\f\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklm" Which suggests the tree op is completely corrupted. Moreover, I tried asserting that an append is never created with a null parent but that assertion never fires. I also discovered that Valgrind doesn't work on Snow Leopard. :-( I'll try on Linux.
It crashes in the same place on Linux, which (fortunately) suggests that this isn't badness caused by a double-free or something like that. Valgrind doesn't complain about anything prior to the crash. Upon crash in Valgrind on Linux, the bad pointer is 0xa5a5a5a5 (0x0 on Mac without Valgrind). So far I've been unable to locate documentation saying what marker pattern 0xa5a5a5a5 is. Anyway, it looks like the parent pointer is uninitialized but somehow the opcode is an append. It seems to me that whenever eTreeOpAppend is used in an initialization, it is used with the right number of operands. I'm now asserting the non-nullness of those operands, but those assertions don't fire. So: * The same crash on at least two platforms. * Consistent crash. * The second pointer in the tree op is uninitialized. * The opcode is append. (Or at least something that acts like append in switch!) * The first pointer is interpreted (by gdb) to be a text node, but its memory looks bogus. * The tree op creation code in the tree builder looks correct. What could touch the tree op memory behind the back of the tree builder?
So 0xa5a5a5a5 is jemalloc uninitialized memory.
What must be happening here is that each tree op is initialized alright but when the append runs, previous tree ops haven't assigned the right pointers to the nsIContent** handles. Specifically, the createElement operation for the parent hasn't run. This is odd. I thought that whenever ops got trashed, the creation ops and subsequent ops depending on the creation ops would get trashed together.
One queue of tree ops has a handle for a td element. A subsequent run of tree ops starts with creating a text node for "\n" and appending it to td element, but the td element was never created. It doesn't help to clear the text buffer of the tree builder and to trash the op queue of the tree builder when a speculation fails.
Created attachment 410761 [details] [diff] [review] Be more careful about emptying the tree builder op queue The actual fix is in the two first hunks of the diff. The rest is just added debug code that's nice to put in the tree, too, IMO. So what happened here was that an element got created as a result of document.write(). The stack node for that element was properly cloned to the parser thread but I had forgotten to transfer the tree ops from the document.write() tree builder to the executor before going back to the off-the-main-thread parser.
This really needs a regression test, too.
Created attachment 411409 [details] [diff] [review] Crashtest
washingtonpost.com is crashing the latest Minefield builds as well and I believe it's the same bug. Here's my crash reports: http://crash-stats.mozilla.com/report/index/5a7738b3-f7b6-4602-8720-160cb2091129 http://crash-stats.mozilla.com/report/index/e0ef5b71-155e-4064-84b1-64e392091129
I've had this crash on three separate sites today and then I found this through the crash report. How is progress?
(In reply to comment #14) > I've had this crash on three separate sites today and then I found this through > the crash report. How is progress? The crash is understood and has a reviewed fix. However, the trunk is restricted to 1.9.2 blockers and this isn't a 1.9.2 blocker. Hence, I'm waiting for the tree to reopen. I'm sorry for the prolonged unfixedness. For the time being, if you aren't specifically testing HTML5 parsing, it's probably best to set html5.enable to false until the patch for this bug lands. OTOH, if you are testing HTML5 parsing specifically, tryserver builds with the fix for this bug (and other HTML5 parsing fixes, including Web compat fixes to the parsing algorithm itself) are available from https://firstname.lastname@example.org/
a=beltzner for landing on mozilla-central, this is not part of the build as it's off by default at the moment so it won't interfere with mozilla-1.9.2 landings.
Thanks for r & a. Pushed: http://hg.mozilla.org/mozilla-central/rev/2f11a4721c94 (I didn't push the test yet, to avoid pushing anything that'd run on tinderboxen.)