Closed Bug 561874 Opened 10 years ago Closed 10 years ago

[HTML5] 4MB text file crashes Minefield when HTML5 is enabled [@ nsGenericElement::GetBaseURI() ]


(Core :: DOM: HTML Parser, defect, P2, critical)

Windows 7



Tracking Status
blocking2.0 --- beta1+


(Reporter: geeknik, Assigned: hsivonen)



(Keywords: crash, regression)

Crash Data


(2 files, 2 obsolete files)

I've attached a 4MB text file, which contains 36512 lines in the following format:

filename[time] <irc_nick> text + url (or just a url by itself)

There is no HTML in this code. When I disabled HTML5, the file loads in Minefield after a brief pause, however if I re-enable HTML5, this file will instantly crash the browser every single time.
Keywords: crash
Hardware: x86 → All
Summary: [HTML5] 4MB text file crashes Minefield when HTML5 is enabled Firefox 3.7a5pre Crash Report [@ nsGenericElement::GetBaseURI() ] → [HTML5] 4MB text file crashes Minefield when HTML5 is enabled [@ nsGenericElement::GetBaseURI() ]
this is basically 'deeply nested dom', the old parser has a guard against this. i think we already have a bug for that.
Whiteboard: DUPEME
> There is no HTML in this code.

Sure there is.  Stuff in angle brackets is HTML tags, and the file is .html.

Henri, I thought you had reintroduced guards for this.  What gives?
blocking2.0: --- → ?
Attached patch Patch for the .java side (obsolete) — Splinter Review
(In reply to comment #2)
> Henri, I thought you had reintroduced guards for this.  What gives?

It was possible for the stack to grow past the guard and after that, the guard no longer worked (because it used == instead of >= assuming that it got evaluated for every step of growth, which is didn't.)

Here's what I think fixes this. I'll test the patch still a bit.

(BTW, the test case has script tags pointing to remote servers, so it's a good idea to remove those before loading the test case.)
Assignee: nobody → hsivonen
Comment on attachment 445713 [details] [diff] [review]
Patch for the .java side

I gave this some more thought. Popping is bad, because the thee builder mode can go out of sync and the subsequent tags can do unwanted things. Ignoring start tag tokens would be bad, because it would change the executability properties of pieces of content.

I think a better way to fix this is to let the tree builder algorithm run per spec but intercepting and retargeting append tree operations on the Gecko glue layer when the stack is deep..
Attachment #445713 - Attachment is obsolete: true
Like the older solutions, this solution makes the tree grow wide instead of deep when the depth reaches a magic number.

With this patch, a huge file can still lock up the app for a long time. I'm not sure what I can do about that from within the parser except to throw data away.

I haven't tested this patch on Windows, yet.
Sounds like this is a regression, so blocking.
blocking2.0: ? → beta1+
Keywords: regression
This patch is like the previous one, except this one drops elements and comments instead of surrogate-parenting them.

As a result, this makes text nodes is deep trees to get coalesced by the tree op executor on the main thread. This means that the browser doesn't crash but pages that end up having a lot of text coalesced on the main thread will take a *long* time to load. I think this is OK, since this is enough for the user to stop the load (the UI is responsive while the text is trickling in) and the point here is to prevent crashing--not making the pathological case fast.

This patch also makes the crash from bug 557180 go away in the HTML case (but not in the XML case).
Attachment #446218 - Attachment is obsolete: true
Attachment #447480 - Flags: review?(bzbarsky)
Blocks: 542268
Blocks: 557180
Blocks: 568460
Priority: -- → P2
Blocks: 569049
Comment on attachment 447480 [details] [diff] [review]
Run TreeBuilder.cpp per spec, make tree ops retarget text appends to a surrogate parent when the tree is deep, drop comments and elements

s/or rather small/is rather small/ and looks good.
Attachment #447480 - Flags: review?(bzbarsky) → review+
Duplicate of this bug: 570019
Thanks. Pushed with the s/or/is/ fix.
Closed: 10 years ago
Resolution: --- → FIXED
Duplicate of this bug: 569049
Crash Signature: [@ nsGenericElement::GetBaseURI() ]
Whiteboard: DUPEME
You need to log in before you can comment on or make changes to this bug.