Bug 659333 (Open), opened 14 years ago, updated 3 years ago

Crash in tree builder nsHtml5TreeBuilder::accumulateCharacters with large `innerHTML` assignments

Categories

(Core :: DOM: HTML Parser, defect)

5 Branch
x86
Windows XP
defect

Tracking


UNCONFIRMED

People

(Reporter: mingyiliu, Unassigned)

References

Details

(Keywords: crash, testcase)

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1

We have multiple JSON data files on our server. Loading one 48 MB JSON file in FF4 was fine and consumed 430 MB of memory. Loading another 62 MB JSON file in a second tab crashed FF4 (not just the tab; the whole browser goes down). This is very reproducible. Our JSON data are proprietary, but they are simple, very shallow structures composed of arrays and hashes no more than 3 levels deep, unremarkable except for their large size.

While it's understandable that a browser can't deal with unreasonably large data, FF4 should probably stay open (crashing just the one tab is probably OK, but not the whole browser). Google Chrome 11, on the other hand, used much less memory parsing the 48 MB JSON (220 MB vs. the 430 MB used by FF), but it also crashed once I tried to load a few tabs with such large data. However, Chrome only crashed the new tabs where I tried to load more large JSON data, not the old tabs that already had large JSON data open, and not the browser itself. Interestingly, though, Chrome also crashed the tab every time I tried to load the 62 MB JSON, even when it was the only tab open. I guess Chrome sets a lower limit on per-tab memory consumption.

Reproducible: Always

Steps to Reproduce:
1. Load a large JSON object (~60 MB).
2. Open a new tab and load another object of similar size.
3. Repeat 1-2 if needed.
4. The entire FF4 browser crashes.

Actual Results: The FF4 browser crashes.

Expected Results: The browser stays open and only the tab crashes, or the browser refuses to load the JSON and warns the user.

My WinXP 32-bit laptop has 3 GB of memory. This crash happens on new profiles with no addons, on existing profiles with addons disabled, and on profiles with addons enabled.
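For anyone trying to reproduce without the reporter's data files, a payload of comparable shape can be generated in script. This is a hedged sketch: the keys and values are invented for illustration, not taken from the reporter's proprietary data.

```javascript
// Build a JSON string of roughly `sizeMB` megabytes of shallow data,
// similar in spirit to the report (arrays/objects, ~3 levels deep).
// The row contents here are made up for illustration only.
function makeBigJson(sizeMB) {
  const row = { id: 0, name: "abcd", tags: ["x", "y", "z"] };
  const rowText = JSON.stringify(row);
  const rows = [];
  let bytes = 0;
  while (bytes < sizeMB * 1000000) {
    rows.push(row);
    bytes += rowText.length + 1; // +1 for the comma separator
  }
  return JSON.stringify({ data: rows });
}

const payload = makeBigJson(1); // 1 MB here; the report used 48-62 MB
console.log(payload.length >= 1000000); // → true
```

Bumping the argument to 48 or 62 approximates the sizes in the report.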
BTW, it would also be very helpful if the FF4 developers could look into why Chrome 11 uses just half the RAM that FF4 needs to parse and hold the same 48 MB JSON data; that would help FF4 support JS apps.
Did you get a Crash ID or a WinDbg trace log, by chance? https://developer.mozilla.org/En/How_to_get_a_stacktrace_for_a_bug_report
Version: unspecified → 4.0 Branch
Just installed FF5b2 and it crashed the same way. The Crash ID is bp-b2c7347a-4fd7-45cf-8f28-801142110525, and I disabled all addons this time.
Signature: mozalloc_abort(char const* const) | mozalloc_handle_oom() | jArray<unsigned short, int>::newJArray(int)
UUID: b2c7347a-4fd7-45cf-8f28-801142110525
Uptime: 6.8 minutes
Last Crash: 1528 seconds (25.5 minutes) before submission
Install Age: 548 seconds (9.1 minutes) since version was first installed
Install Time: 2011-05-25 13:21:05
Product: Firefox
Version: 5.0
Build ID: 20110517192056
Release Channel: beta
Branch: 2.2
OS: Windows NT
OS Version: 5.1.2600 Service Pack 3
CPU: x86
CPU Info: GenuineIntel family 6 model 15 stepping 11
Crash Reason: EXCEPTION_BREAKPOINT
Crash Address: 0x7f1a39
User Comments: same scenario as reported in https://bugzilla.mozilla.org/show_bug.cgi?id=659333
App Notes: Cisco VPN AdapterVendorID: 8086, AdapterDeviceID: 2a02, AdapterDriverVersion: 6.14.10.4785 D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers-

Frame / Module / Signature / Source:
0  mozalloc.dll  mozalloc_abort(char const* const)  memory/mozalloc/mozalloc_abort.cpp:77
1  mozalloc.dll  mozalloc_handle_oom()  memory/mozalloc/mozalloc_oom.cpp:54
2  xul.dll  jArray<unsigned short,int>::newJArray(int)  parser/html/jArray.h:57
3  xul.dll  nsHtml5TreeBuilder::accumulateCharacters(unsigned short const*,int,int)
4  xul.dll  nsHtml5TreeBuilder::characters(unsigned short const*,int,int)  parser/html/nsHtml5TreeBuilder.cpp:417
5  xul.dll  nsHtml5Tokenizer::stateLoop(int,unsigned short,int,unsigned short*,int,int,int)  parser/html/nsHtml5Tokenizer.cpp:3265
6  xul.dll  nsHtml5Tokenizer::tokenizeBuffer(nsHtml5UTF16Buffer*)  parser/html/nsHtml5Tokenizer.cpp:391
7  xul.dll  nsHtml5Parser::ParseHtml5Fragment(nsAString_internal const&,nsIContent*,nsIAtom*,int,int,int)  parser/html/nsHtml5Parser.cpp:537
8  xul.dll  nsGenericHTMLElement::SetInnerHTML(nsAString_internal const&)  content/html/content/src/nsGenericHTMLElement.cpp:757
9  xul.dll  nsIDOMNSHTMLElement_SetInnerHTML  obj-firefox/js/src/xpconnect/src/dom_quickstubs.cpp:21359
10  mozjs.dll  js::Shape::set(JSContext*,JSObject*,bool,js::Value*)  js/src/jsscopeinlines.h:278
11  mozjs.dll  js_SetPropertyHelper(JSContext*,JSObject*,int,unsigned int,js::Value*,int)  js/src/jsobj.cpp:5592
12  mozjs.dll  js::mjit::stubs::SetName<0>(js::VMFrame&,JSAtom*)  js/src/methodjit/StubCalls.cpp:260
13  @0x6eaed9f
14  mozjs.dll  js::Interpret(JSContext*,JSStackFrame*,unsigned int,JSInterpMode)  js/src/jsinterp.cpp:4710
Keywords: crash
Product: Firefox → Core
QA Contact: general → general
Version: 4.0 Branch → 5 Branch
This is an OOM crash. I find it interesting that we're crashing under a SetInnerHTML call instead of the JSON parsing code as I would have expected. Does this happen because we build a visual representation or something?
Component: General → HTML: Parser
QA Contact: general → parser
No way to tell given the complete lack of any actual HTML code showing the problem. Reporter, can you please provide a URL to such code?
FWIW, OOMing with the HTML parser on the stack is not a parser bug. It's a consequence of the decision to transition to "infallible" malloc.
Well, that depends. If the HTML parser is allocating something whose size is explicitly under page control using the infallible allocator, that's an HTML parser bug. But it's hard to say anything specific without a way to reproduce.
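From script, the analogous defensive pattern is to treat a very large allocation as fallible and catch the failure instead of letting it take down the caller. A minimal sketch (the sizes are arbitrary, chosen only to make the success and failure cases obvious):

```javascript
// Large buffer allocations can throw (e.g. RangeError for impossible
// sizes); catching the error is the script-level analogue of using a
// fallible allocator instead of aborting on OOM.
function tryAllocate(bytes) {
  try {
    return new ArrayBuffer(bytes);
  } catch (e) {
    // Allocation failed; report it to the caller rather than crashing.
    return null;
  }
}

console.log(tryAllocate(1024) !== null);    // → true: a small buffer succeeds
console.log(tryAllocate(2 ** 53) === null); // → true: an impossible size fails cleanly
```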
You're right, it's not JSON.parse that fails; the failure happens while setting innerHTML. I didn't realize that at first, because my 2-year-old code only started failing recently, when users began opening multiple tabs of large JSON data. The code actually uses the Ext JS library's Ext.Element.load() call, which takes a wrapped HTMLElement, puts a loading animation and text in the element, sends an ajax call, and, when the response comes back, updates the element with req.responseText and then invokes the user-supplied callback. After some testing I confirmed that the failing step really is the innerHTML assignment.

I can't post any proprietary JSON data or scripts, so I made up the following reproducible way of crashing the browser without Ext JS. The server script (bigJSON.pl):

```perl
#!/usr/bin/perl
# test browsers' abilities in handling big JSON objects
use strict;
use CGI qw/:standard -debug/;
use JSON;

my $size = param('size') || 50;
print "Content-type: text/javascript\n\n";

my $json = {data => []};
my $s;
while (1) {
    my $obj = {};
    for (my $i = 0; $i < 30; $i++) {
        $obj->{getKey()} = 'abcd';
    }
    $s += 420;
    push(@{$json->{data}}, $obj);
    last if $s >= $size * 1000000;
}
print to_json($json);

sub getKey {
    my @s;
    for (my $i = 0; $i < 4; $i++) {
        push(@s, chr(ord('a') + rand(25)));
    }
    return join('', @s);
}
```

And here's the frontend bigJSON.html file:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Test Big JSON</title>
<script type="text/javascript">
function getJSON() {
  var req = new XMLHttpRequest();
  // size is in MB; change it to test browser tolerance
  req.open('GET', 'http://my-cgi-bin/bigJSON.pl?size=55', true);
  req.onreadystatechange = function (aEvt) {
    if (req.readyState == 4) {
      if (req.status == 200) {
        // alert('back from server');
        document.getElementById('test').innerHTML = req.responseText;
        // alert('loaded');
        window.testjson = JSON.parse(req.responseText);
        // alert('parsed');
      } else {
        dump("Error loading page\n");
      }
    }
  };
  req.send(null);
}
</script>
</head>
<body onload="getJSON()">
<div id="test"> </div>
</body>
</html>
```

On my 3 GB laptop, 3 tabs of bigJSON.html make FF5b2 fail. If the innerHTML line is commented out, it's very hard to get FF to crash (it usually just refuses to load or reports an out-of-memory error), although I don't remember whether it's 100% crash-free.
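Worth noting for anyone hitting this in their own code: the testcase routes a multi-megabyte string of plain JSON text through the HTML parser. A hedged sketch of a workaround (assuming the goal is just to display the raw text) that bypasses HTML parsing entirely:

```javascript
// Assigning textContent skips HTML parsing, so the tree builder's
// character-accumulation path (nsHtml5TreeBuilder::accumulateCharacters
// in the crash stack) is never involved.
function showRawText(el, text) {
  el.textContent = text; // instead of: el.innerHTML = text
}

// Stand-in element so the sketch also runs outside a browser.
const fakeEl = { textContent: "" };
showRawText(fakeEl, '{"data":[]}');
console.log(fakeEl.textContent === '{"data":[]}'); // → true
```

In a real page, `el` would be the `<div id="test">` element from the testcase.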
Looks like this is simply a matter of the text accumulation buffer in the HTML parser's tree builder hitting OOM as it tries to grow to accommodate a lot of text.
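A rough back-of-the-envelope for why that allocation can fail on a 32-bit build: if the accumulation buffer grows by doubling (the doubling policy here is an assumption for illustration, not taken from the parser source), holding ~62 million UTF-16 code units ends up requesting a single contiguous allocation of well over 100 MB, on top of the copies of the string already alive:

```javascript
// Simulate doubling growth of a UTF-16 character buffer until it can
// hold `chars` code units; return the final allocation size in bytes.
// Doubling is assumed for illustration only.
function finalBufferBytes(chars, initial = 1024) {
  let capacity = initial;
  while (capacity < chars) capacity *= 2;
  return capacity * 2; // 2 bytes per UTF-16 code unit
}

// A 62 MB JSON response is roughly 62 million characters:
const bytes = finalBufferBytes(62 * 1000 * 1000);
console.log(bytes); // → 134217728 (a single ~128 MB contiguous request)
```

A single contiguous request that size can easily fail in a fragmented 32-bit address space even when total free memory is larger.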
Keywords: testcase
I confirm this bug. Firefox consumes a lot of memory when processing large arrays of JSON data and eventually crashes. Sample data can be viewed at this link: https://www.dropbox.com/s/rdh7jy6tlz6ogjg/twitter_json_data.json. A site where it crashes while parsing: http://json.parser.online.fr/ (excuse my bad English; I used Google Translate)
Attached file A large JSON data set
Alternative steps to reproduce: 1. Go to http://json.parser.online.fr/ 2. Load contents of attachment into input field. 3. Firefox crashes.
Blocks: 1128528
With the STR of comment #14, I got a regression window different from the one for comment #0, so I filed bug 1128528.
Severity: critical → S2

This is basically just an OOM. I'm not sure if it still reproduces (the sample JSON file is no longer available). If it does, it would be nice to handle it more cleanly, but it shouldn't be a major issue in practice, especially with e10s/Fission.

Severity: S2 → S3
Summary: JSON.parse crashes browser when decoding large JSON → Crash in tree builder nsHtml5TreeBuilder::accumulateCharacters with large `innerHTML` assignments