Closed
Bug 515338
Opened 15 years ago
Closed 15 years ago
[HTML5] Make HTML5 parser internals not hold nsIContent or regular dynamic atoms
Categories
(Core :: DOM: HTML Parser, defect)
Core
DOM: HTML Parser
Tracking
()
RESOLVED
FIXED
People
(Reporter: hsivonen, Assigned: hsivonen)
References
(Blocks 1 open bug)
Details
Attachments
(3 files, 6 obsolete files)
13.68 KB,
patch
|
Details | Diff | Splinter Review | |
7.64 KB,
patch
|
Details | Diff | Splinter Review | |
149.03 KB,
patch
|
mozilla+ben
:
review+
|
Details | Diff | Splinter Review |
The tokenizer and tree builder must not hold main-thread-only objects like nsIContent and nsAtomImpl.
Assignee | ||
Comment 1•15 years ago
|
||
Assignee | ||
Comment 2•15 years ago
|
||
Attachment #400031 -
Attachment is obsolete: true
Assignee | ||
Updated•15 years ago
|
Attachment #400756 -
Flags: review?(bnewman)
Assignee | ||
Comment 3•15 years ago
|
||
Comment on attachment 400756 [details] [diff] [review] Change the memory management regime for atoms, nodes and attributes, v2 This is ready for review, except changes over in bug 514661 will affect how exactly this patch will call the scoped atom table stuff. However, those changes shouldn't affect the substance of this patch.
Comment 4•15 years ago
|
||
Does this patch depend on any others? It's not applying cleanly for me, even on top of 514661. Maybe I'm just confused.
Keywords: 4xp
Assignee | ||
Comment 5•15 years ago
|
||
It depends on bug 499642, bug 514661 and bug 515142. Bug 515142 is a spec correctness change to the Java bits, so I'm assuming it can go in as rs=sicking.
Comment 6•15 years ago
|
||
My comments: diff --git a/parser/html/nsHtml5Portability.cpp b/parser/html/nsHtml5Portability.cpp --- a/parser/html/nsHtml5Portability.cpp +++ b/parser/html/nsHtml5Portability.cpp @@ -115,20 +115,8 @@ nsHtml5Portability::releaseLocal(nsIAtom { NS_IF_RELEASE(local); } -// XXX Useless code -void -nsHtml5Portability::retainElement(nsIContent** element) -{ -} - -// XXX Useless code -void -nsHtml5Portability::releaseElement(nsIContent** element) -{ -} - PRBool nsHtml5Portability::localEqualsBuffer(nsIAtom* local, PRUnichar* buf, PRInt32 offset, PRInt32 length) { return local->Equals(nsDependentSubstring(buf + offset, buf + offset + length)); diff --git a/parser/html/nsHtml5Portability.h b/parser/html/nsHtml5Portability.h --- a/parser/html/nsHtml5Portability.h +++ b/parser/html/nsHtml5Portability.h @@ -68,10 +68,11 @@ class nsHtml5Portability static jArray<PRUnichar,PRInt32> newCharArrayFromString(nsString* string); static void releaseString(nsString* str); static void retainLocal(nsIAtom* local); static void releaseLocal(nsIAtom* local); - static void retainElement(nsIContent** elt); - static void releaseElement(nsIContent** elt); + // Advertise the uselessness here? + static void retainElement(nsIContent** elt) {} + static void releaseElement(nsIContent** elt) {} static PRBool localEqualsBuffer(nsIAtom* local, PRUnichar* buf, PRInt32 offset, PRInt32 length); static PRBool lowerCaseLiteralIsPrefixOfIgnoreAsciiCaseString(const char* lowerCaseLiteral, nsString* string); static PRBool lowerCaseLiteralEqualsIgnoreAsciiCaseString(const char* lowerCaseLiteral, nsString* string); static PRBool literalEqualsString(const char* literal, nsString* string); diff --git a/parser/html/nsHtml5TreeBuilder.cpp b/parser/html/nsHtml5TreeBuilder.cpp --- a/parser/html/nsHtml5TreeBuilder.cpp +++ b/parser/html/nsHtml5TreeBuilder.cpp @@ -617,8 +617,9 @@ nsHtml5TreeBuilder::startTag(nsHtml5Elem selfClosing = PR_FALSE; } else { appendToCurrentNodeAndPushElementMayFosterCamelCase(currNs, elementName, attributes); } + // Why do these assignments not use nsHtml5HtmlAttributes::EMPTY_ATTRIBUTES? attributes = nsnull; goto starttagloop_end; } else { attributes->adjustForMath(); diff --git a/parser/html/nsHtml5TreeOpExecutor.cpp b/parser/html/nsHtml5TreeOpExecutor.cpp --- a/parser/html/nsHtml5TreeOpExecutor.cpp +++ b/parser/html/nsHtml5TreeOpExecutor.cpp @@ -75,8 +75,10 @@ NS_IMPL_RELEASE_INHERITED(nsHtml5TreeOpE NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(nsHtml5TreeOpExecutor, nsContentSink) NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mFlushTimer) NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mScriptElement) + // This is gonna traverse a lot of elements. Maybe there's no other way, but + // I hope you're pretty sure this is necessary. NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedElements) NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedNonElements) NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END diff --git a/parser/html/nsHtml5TreeOpExecutor.h b/parser/html/nsHtml5TreeOpExecutor.h --- a/parser/html/nsHtml5TreeOpExecutor.h +++ b/parser/html/nsHtml5TreeOpExecutor.h @@ -112,8 +112,12 @@ class nsHtml5TreeOpExecutor : public nsI nsTArray<nsHtml5TreeOperation> mOpQueue; nsTArray<nsIContentPtr> mElementsSeenInThisAppendBatch; nsTArray<nsHtml5PendingNotification> mPendingNotifications; nsHtml5StreamParser* mStreamParser; + + // Any idea how often duplicate elements are inserted? + // From a brief glance at debugging spew, I didn't notice any obvious + // repetition, so the hash doesn't seem immediately necessary. nsCOMArray<nsIContent> mOwnedElements; // maybe change to hash // This could be optimized away by introducing more tree ops so that // non-elements wouldn't use the handle setup but the text node / comment diff --git a/parser/html/nsHtml5TreeOperation.cpp b/parser/html/nsHtml5TreeOperation.cpp --- a/parser/html/nsHtml5TreeOperation.cpp +++ b/parser/html/nsHtml5TreeOperation.cpp @@ -58,15 +58,15 @@ #include "nsIStyleSheetLinkingElement.h" #include "nsIDOMDocumentType.h" #define TO_CONTENT(_field) \ - (*((nsIContent**)_field)) + (*(static_cast<nsIContent**>(_field))) #define TO_ATOM(_field) \ - ((nsIAtom*)_field) + (static_cast<nsIAtom*>(_field)) #define TO_ATTRIBUTES(_field) \ - ((nsHtml5HtmlAttributes*)_field) + (static_cast<nsHtml5HtmlAttributes*>(_field)) nsHtml5TreeOperation::nsHtml5TreeOperation() : mOpCode(eTreeOpAppend) { @@ -188,9 +188,9 @@ nsHtml5TreeOperation::Perform(nsHtml5Tre } } } - stateMask = stateMask ^ PRUint32(node->IntrinsicState()); + stateMask ^= PRUint32(node->IntrinsicState()); if (stateMask && document) { MOZ_AUTO_DOC_UPDATE(document, UPDATE_CONTENT_STATE, PR_TRUE); document->ContentStatesChanged(node, nsnull, stateMask); } @@ -365,9 +365,9 @@ nsHtml5TreeOperation::Perform(nsHtml5Tre aBuilder->StartLayout(); // this causes a flush anyway return rv; } case eTreeOpDocumentMode: { - aBuilder->DocumentMode(*((nsHtml5DocumentMode*)mPtrOne)); + aBuilder->DocumentMode(*static_cast<nsHtml5DocumentMode*>(mPtrOne)); return rv; } default: { NS_NOTREACHED("Bogus tree op"); diff --git a/parser/html/nsHtml5TreeOperation.h b/parser/html/nsHtml5TreeOperation.h --- a/parser/html/nsHtml5TreeOperation.h +++ b/parser/html/nsHtml5TreeOperation.h @@ -99,8 +99,9 @@ class nsHtml5TreeOperation { ~nsHtml5TreeOperation(); inline void Init(nsIContent** aNode, nsIContent** aParent) { + // Why abandon the old meaningful variable names (mNode, mParent)? mPtrOne = (void*)aNode; mPtrTwo = (void*)aParent; } @@ -130,8 +131,9 @@ class nsHtml5TreeOperation { inline void Init(nsHtml5DocumentMode aMode) { mOpCode = eTreeOpDocumentMode; switch (aMode) { case QUIRKS_MODE: + // Oh, I see. mPtrOne = (void*) &sQuirks; return; case ALMOST_STANDARDS_MODE: mPtrOne = (void*) &sAlmostStandards; @@ -201,8 +203,9 @@ class nsHtml5TreeOperation { // 3) Make the queue take items the size of pointer and make the op code // decide how many operands it dequeues after it. eHtml5TreeOperation mOpCode; PRInt32 mInt; + // Can you at least make these union types? void* mPtrOne; void* mPtrTwo; void* mPtrThree; };
Comment 7•15 years ago
|
||
This trick essentially gives the value stored in mPtrOne a bunch of typed aliases. As each name corresponds to a specific type, the need for void* casting goes away. I still think having multiple members (m1, m2, m3) is useful, although technically you could have union members like struct { nsIContent **node, **parent }. Since the options provided by m1..3 will probably be distinct, the division ensures somewhat more type safety than having a single union type.
Assignee | ||
Comment 8•15 years ago
|
||
> -// XXX Useless code > -void > -nsHtml5Portability::retainElement(nsIContent** element) > -{ > -} > - > -// XXX Useless code > -void > -nsHtml5Portability::releaseElement(nsIContent** element) > -{ > -} I've now moved the ugliness to the Java source and the translator. I didn't remove these from the Java source, yet, because it seemed premature to remove the possibility of element refcounting C++ targets for non-Gecko uses. However, making the translator change take effect with this patch would be inconvenient, because by now, the translator also has other changes for the next iteration and those would break this patch. OK if those useless methods go away with the next landing after this one? > diff --git a/parser/html/nsHtml5TreeBuilder.cpp > b/parser/html/nsHtml5TreeBuilder.cpp > --- a/parser/html/nsHtml5TreeBuilder.cpp > +++ b/parser/html/nsHtml5TreeBuilder.cpp > @@ -617,8 +617,9 @@ nsHtml5TreeBuilder::startTag(nsHtml5Elem > selfClosing = PR_FALSE; > } else { > appendToCurrentNodeAndPushElementMayFosterCamelCase(currNs, > elementName, attributes); > } > + // Why do these assignments not use > nsHtml5HtmlAttributes::EMPTY_ATTRIBUTES? > attributes = nsnull; It felt more natural to use null as the marker that the the object's ownership has been adopted away. This is a bit awkward, but the attribute holder will cross the thread boundary and I wanted to avoid situation where an object participates in refcounting on two threads. Also, I wanted to avoid constructs that can't be expressed easily in the Java source (reference to pointer or pointer pointer that'd allow the method doing the adoption write into a stack variable of its invocator). > diff --git a/parser/html/nsHtml5TreeOpExecutor.cpp > b/parser/html/nsHtml5TreeOpExecutor.cpp > --- a/parser/html/nsHtml5TreeOpExecutor.cpp > +++ b/parser/html/nsHtml5TreeOpExecutor.cpp > @@ -75,8 +75,10 @@ NS_IMPL_RELEASE_INHERITED(nsHtml5TreeOpE > > NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(nsHtml5TreeOpExecutor, > nsContentSink) > NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mFlushTimer) > NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mScriptElement) > + // This is gonna traverse a lot of elements. Maybe there's no other way, > but > + // I hope you're pretty sure this is necessary. > NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedElements) > NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMARRAY(mOwnedNonElements) > NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END It's necessary if one wants the cycle collector to be able to correctly traverse the whole object graph. It's rather useless if it's useless to have the cycle collector fully traverse the object graph during the parse, though. Therefore, I'm not at all sure that this is necessary. I suspect it's unnecessary, but I'm not 100% sure it's unnecessary. That is, I don't know if there are cases where the parser stops in mid-parse and the cycle collector is relied upon for detecting the situation and collecting the whole graph. > diff --git a/parser/html/nsHtml5TreeOpExecutor.h > b/parser/html/nsHtml5TreeOpExecutor.h > --- a/parser/html/nsHtml5TreeOpExecutor.h > +++ b/parser/html/nsHtml5TreeOpExecutor.h > @@ -112,8 +112,12 @@ class nsHtml5TreeOpExecutor : public nsI > nsTArray<nsHtml5TreeOperation> mOpQueue; > nsTArray<nsIContentPtr> mElementsSeenInThisAppendBatch; > nsTArray<nsHtml5PendingNotification> mPendingNotifications; > nsHtml5StreamParser* mStreamParser; > + > + // Any idea how often duplicate elements are inserted? > + // From a brief glance at debugging spew, I didn't notice any obvious > + // repetition, so the hash doesn't seem immediately necessary. > nsCOMArray<nsIContent> mOwnedElements; // maybe change to > hash What I had in mind was trying to deref the nodes earlier in the future. In such a scenario, it would also have to be possible to addref the "form pointer" element. I've removed the comment. > - (*((nsIContent**)_field)) > + (*(static_cast<nsIContent**>(_field))) Casting replaced with use of unions. > - stateMask = stateMask ^ PRUint32(node->IntrinsicState()); > + stateMask ^= PRUint32(node->IntrinsicState()); Fixed. > + // Why abandon the old meaningful variable names (mNode, mParent)? Because many tree ops use the members for other purposes. > + // Can you at least make these union types? Made them into unions. > I still think having multiple members (m1, m2, m3) is useful, although > technically you could have union members like struct { nsIContent **node, > **parent }. Since the options provided by m1..3 will probably be distinct, the > division ensures somewhat more type safety than having a single union type. Type safety is up to the programmer here anyway, so I went with a single union type for the three members. Also changed the atom stuff to use the new patch for bug 514661.
Attachment #400756 -
Attachment is obsolete: true
Attachment #401834 -
Flags: review?(bnewman)
Attachment #400756 -
Flags: review?(bnewman)
Assignee | ||
Comment 9•15 years ago
|
||
Attachment #401834 -
Attachment is obsolete: true
Attachment #401876 -
Flags: review?(bnewman)
Attachment #401834 -
Flags: review?(bnewman)
Assignee | ||
Comment 10•15 years ago
|
||
Actually, avoiding leaks by using the destructor is wrong, because nsTArray copies its contents upon append. The best fix would be having a way to append an nsTArray to another by moving so that array items would neither be copy-constructed nor destructed by the append.
Assignee | ||
Comment 11•15 years ago
|
||
It seems the AppendElements() stuff wasn't in this patch but in the next one in my queue, so the copy-construction case doesn't arise yet in this patch.
Assignee | ||
Comment 12•15 years ago
|
||
Attachment #401876 -
Attachment is obsolete: true
Attachment #402824 -
Flags: review?(bnewman)
Attachment #401876 -
Flags: review?(bnewman)
Comment 13•15 years ago
|
||
Comment 14•15 years ago
|
||
Comment on attachment 402962 [details] [diff] [review] feedback for "Avoid leaks when tree ops are discarded without executing them (In reply to comment #13) > Created an attachment (id=402962) [details] > feedback for "Avoid leaks when tree ops are discarded without executing them Moving the union type into nsHtml5TreeOperation, using nsAutoPtr and nsAutoArrayPtr instead of the ~nsHtml5TreeOperation (seems cleaner to me), and const-ifying the nsHtml5Parser::mAtomTable member (since it never gets reassigned).
Attachment #402962 -
Attachment is patch: true
Attachment #402962 -
Attachment mime type: application/octet-stream → text/plain
Assignee | ||
Comment 15•15 years ago
|
||
(In reply to comment #14) > (From update of attachment 402962 [details] [diff] [review]) > (In reply to comment #13) > > Created an attachment (id=402962) [details] [details] > > feedback for "Avoid leaks when tree ops are discarded without executing them > > Moving the union type into nsHtml5TreeOperation, using nsAutoPtr and > nsAutoArrayPtr instead of the ~nsHtml5TreeOperation (seems cleaner to me), Won't this fail when we have speculation and tree ops in failed speculations get destroyed without ever calling Perform() them? (Even now forced Teminated() should lead to that situation.) > and > const-ifying the nsHtml5Parser::mAtomTable member (since it never gets > reassigned). Never reassiging is a bug, actually. It needs to get reassigned or otherwise emptied when the innerHTML parser is recycled.
Assignee | ||
Comment 16•15 years ago
|
||
I made the other changes except I didn't move the tree op member deletions due to concerns stated above.
Attachment #402824 -
Attachment is obsolete: true
Attachment #402824 -
Flags: review?(bnewman)
Assignee | ||
Comment 17•15 years ago
|
||
This step along the way introduced string leaks. Must be a failure to delete an attribute holder somewhere.
Assignee | ||
Comment 18•15 years ago
|
||
Bug 518104 is now in the queue ahead of this one, since I had to put one or the other first.
Depends on: 518104
Assignee | ||
Comment 19•15 years ago
|
||
Attachment #403779 -
Attachment is obsolete: true
Attachment #404803 -
Flags: review?(bnewman)
Comment 20•15 years ago
|
||
Comment on attachment 404803 [details] [diff] [review] Fix memory leaks Looks good to me.
Attachment #404803 -
Flags: review?(bnewman) → review+
Assignee | ||
Updated•15 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 21•15 years ago
|
||
Thank you for the review. Pushed as http://hg.mozilla.org/mozilla-central/rev/34bc9a8eb49e
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•