Closed Bug 643940 Opened 14 years ago Closed 13 years ago

Possible leak in the HTML5 parser

Categories

(Core :: DOM: HTML Parser, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 654106

People

(Reporter: ehsan.akhgari, Unassigned)

References

Details

(Keywords: memory-leak)

The OS X leaks tool is screaming about leaks with stacks similar to this:

Call stack: [thread 0x7fff7110eca0]: | start | main | XRE_main | nsAppStartup::Run() | nsAppShell::Run() | -[NSApplication run] | -[NSApplication sendEvent:] |       -[ToolbarWindow sendEvent:] | -[NSWindow sendEvent:] | -[ChildView mouseDown:] | nsChildView::DispatchWindowEvent(nsGUIEvent&) | nsChildView::DispatchEvent(nsGUIEvent*,       nsEventStatus&) | HandleEvent(nsGUIEvent*) | nsViewManager::DispatchEvent(nsGUIEvent*, nsIView*, nsEventStatus*) | PresShell::HandleEvent(nsIView*, nsGUIEvent*, int, n      sEventStatus*) | PresShell::HandlePositionedEvent(nsIView*, nsIFrame*, nsGUIEvent*, nsEventStatus*) | PresShell::HandleEventInternal(nsEvent*, nsIView*, nsEventStatus*)       | nsEventDispatcher::Dispatch(nsISupports*, nsPresContext*, nsEvent*, nsIDOMEvent*, nsEventStatus*, nsDispatchingCallback*, nsCOMArray<nsPIDOMEventTarget>*) | nsEventT      argetChainItem::HandleEventTargetChain(nsEventChainPostVisitor&, unsigned int, nsDispatchingCallback*, int, nsCxPusher*) | nsEventListenerManager::HandleEventInternal(n      sPresContext*, nsEvent*, nsIDOMEvent**, nsPIDOMEventTarget*, unsigned int, nsEventStatus*, nsCxPusher*) | nsEventListenerManager::HandleEventSubType(nsListenerStruct*,       nsIDOMEventListener*, nsIDOMEvent*, nsPIDOMEventTarget*, unsigned int, nsCxPusher*) | SharedStub | PrepareAndDispatch | nsXPCWrappedJS::CallMethod(unsigned short, XPTMe      thodDescriptor const*, nsXPTCMiniVariant*) | nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) | JS_CallF      unctionValue | js::ExternalInvoke(JSContext*, js::Value const&, js::Value const&, unsigned int, js::Value*, js::Value*) | js::Invoke(JSContext*, js::CallArgs const&, un      signed int) | js::RunScript(JSContext*, JSScript*, JSStackFrame*) | js::mjit::JaegerShot(JSContext*) | 0x1165f85f7 | js::mjit::ic::Call(js::VMFrame&, js::mjit::ic::Call      ICInfo*) | CallCompiler::update() | js::mjit::stubs::UncachedCallHelper(js::VMFrame&, unsigned int, js::mjit::stubs::UncachedCallResult*) | UncachedInlineCall(js::VMFra      me&, unsigned int, void**, bool*, unsigned int) | js::Interpret(JSContext*, JSStackFrame*, unsigned int, JSInterpMode) | js_fun_apply(JSContext*, unsigned int, js::Valu      e*) | js::Invoke(JSContext*, js::CallArgs const&, unsigned int) | js::RunScript(JSContext*, JSScript*, JSStackFrame*) | js::mjit::JaegerShot(JSContext*) | 0x11cb56195 |       js::mjit::stubs::CompileFunction(js::VMFrame&, unsigned int) | js::Interpret(JSContext*, JSStackFrame*, unsigned int, JSInterpMode) | js::mjit::JaegerShot(JSContext*)       | 0x1179d00f8 | js::mjit::ic::Call(js::VMFrame&, js::mjit::ic::CallICInfo*) | CallCompiler::update() | js::mjit::stubs::UncachedCallHelper(js::VMFrame&, unsigned int, j      s::mjit::stubs::UncachedCallResult*) | UncachedInlineCall(js::VMFrame&, unsigned int, void**, bool*, unsigned int) | js::Interpret(JSContext*, JSStackFrame*, unsigned i      nt, JSInterpMode) | js::mjit::JaegerShot(JSContext*) | 0x11cbc5c6a | js::mjit::ic::Call(js::VMFrame&, js::mjit::ic::CallICInfo*) | CallCompiler::update() | js::mjit::st      ubs::UncachedCallHelper(js::VMFrame&, unsigned int, js::mjit::stubs::UncachedCallResult*) | UncachedInlineCall(js::VMFrame&, unsigned int, void**, bool*, unsigned int)       | js::Interpret(JSContext*, JSStackFrame*, unsigned int, JSInterpMode) | js_SetPropertyHelper(JSContext*, JSObject*, long, unsigned int, js::Value*, int) | nsIDOMNSHTML      Element_SetInnerHTML(JSContext*, JSObject*, long, int, unsigned long long*) | nsGenericHTMLElement::SetInnerHTML(nsAString_internal const&) | nsHtml5Parser::ParseHtml5F      ragment(nsAString_internal const&, nsIContent*, nsIAtom*, int, int, int) | nsHtml5Tokenizer::tokenizeBuffer(nsHtml5UTF16Buffer*) | nsHtml5Tokenizer::stateLoop(int, unsi      gned short, int, unsigned short*, int, int, int) | nsHtml5Portability::newStringFromBuffer(unsigned short*, int, int) | nsAString_internal::Assign(unsigned short const*      , unsigned int) | nsAString_internal::ReplacePrepInternal(unsigned int, unsigned int, unsigned int, unsigned int) | nsAString_internal::MutatePrep(unsigned int, unsigne      d short**, unsigned int*) | malloc | malloc_zone_malloc

We have code like this: <http://mxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5Tokenizer.cpp#2839>, and as far as I can tell, publicIdentifier is never deleted anywhere.

Henri, do you know what's going on here?
(In reply to comment #0)
> We have code like this:
> <http://mxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5Tokenizer.cpp#2839>,
> and as far as I can tell, publicIdentifier is never deleted anywhere.

It is released in
http://mxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5Tokenizer.cpp#3677
and a couple of other places.

> Henri, do you know what's going on here?

I don't. However, if the parser was leaking nsStrings, surely our own leak detection code should catch it.

The only allocations in the parser that aren't tracked by our own leak detection code are allocations for jArray's backing buffers:
http://mxr.mozilla.org/mozilla-central/source/parser/html/jArray.h#57

When the leak detection tool dumps a stack, what has it detected with that stack?
(In reply to comment #1)
> I don't. However, if the parser was leaking nsStrings, surely our own leak
> detection code should catch it.

Yeah.  Do we stores masked pointers anywhere in the HTML5 parser?  Like, anding a pointer value with 0x1 and storing it?  This will confuse the leak detection tool.

> When the leak detection tool dumps a stack, what has it detected with that
> stack?

It shows you the stack under which the leaked memory was allocated.  You have to do the job of trying to make sense out of the stack yourself...
Blocks: mlk-fx5+
Out of curiosity, is there a reason we don't MOZ_COUNT_[C|D]TOR jArray?
(In reply to comment #2)
> (In reply to comment #1)
> > I don't. However, if the parser was leaking nsStrings, surely our own leak
> > detection code should catch it.
> 
> Yeah.  Do we stores masked pointers anywhere in the HTML5 parser?  Like, anding
> a pointer value with 0x1 and storing it?  This will confuse the leak detection
> tool.

No, not yet. (Though it's part of the plan for atom pointers.)

I don't know if that sort of thing happens somewhere inside nsString, but on superficial inspection, it doesn't look like it. (It looks like there's a separate mFlags.)

> > When the leak detection tool dumps a stack, what has it detected with that
> > stack?
> 
> It shows you the stack under which the leaked memory was allocated.  You have
> to do the job of trying to make sense out of the stack yourself...

OK.

(In reply to comment #3)
> Out of curiosity, is there a reason we don't MOZ_COUNT_[C|D]TOR jArray?

jArray is passed by value so that the length and the pointer to the backing memory get copied. When jArray goes out of scope, it doesn't release its backing memory, because the pointer may have been copied to another jArray that still cares about the backing memory. So basically, jArray is just a way to pass a pointer and a length around together in way that works as a function return value, etc.
(In reply to comment #4)
> (In reply to comment #2)
> > (In reply to comment #1)
> > > I don't. However, if the parser was leaking nsStrings, surely our own leak
> > > detection code should catch it.
> > 
> > Yeah.  Do we stores masked pointers anywhere in the HTML5 parser?  Like, anding
> > a pointer value with 0x1 and storing it?  This will confuse the leak detection
> > tool.
> 
> No, not yet. (Though it's part of the plan for atom pointers.)

In this case, I would suspect that the string is really leaked in some situations.  Someone more familiar with the code should investigate what situations may cause this.
The stack in description includes a setting of innerHTML which could mean this was caused by bug 654106. Other than that there was a leak in JSAtom which could cause string leaks (bug 655435).
(In reply to comment #6)
> The stack in description includes a setting of innerHTML which could mean
> this was caused by bug 654106. Other than that there was a leak in JSAtom
> which could cause string leaks (bug 655435).

Ooh, good catch!

Ehsan, can you re-run leaks and see if this still happens?
No longer blocks: mlk-fx5+
(In reply to comment #7)
> (In reply to comment #6)
> > The stack in description includes a setting of innerHTML which could mean
> > this was caused by bug 654106. Other than that there was a leak in JSAtom
> > which could cause string leaks (bug 655435).
> 
> Ooh, good catch!
> 
> Ehsan, can you re-run leaks and see if this still happens?

Hmm, I need to get an optimized build and try with that...
I gave this a shot again, and this time I didn't see this stack.  But you shouldn't read too much into this, since I don't exactly remember what I did the last time, and furthermore, my test case is gmail, which means that it's almost impossible to get the same execution twice in a row.
(In reply to comment #9)
> But you
> shouldn't read too much into this, since I don't exactly remember what I did
> the last time, and furthermore, my test case is gmail, which means that it's
> almost impossible to get the same execution twice in a row.

Given comment 6, I'm going to read enough that I'll close this bug.  It can be reopened or refiled if anyone reproduces it in the future.

And if you were an end-user I'd explain to you the importance of clear steps-to-reproduce when filing bugs, but because you're a developer I know I don't need to do that ;P
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
(In reply to comment #10)
> And if you were an end-user I'd explain to you the importance of clear
> steps-to-reproduce when filing bugs, but because you're a developer I know I
> don't need to do that ;P

Ah, right!  Sorry, I was testing bug 497808 and talking to roc on irc, and I just assumed that everybody in the future would have that context!  ;-)

FWIW, I opened a gmail tab and continued to use it mildly for about half an hour.
You need to log in before you can comment on or make changes to this bug.