Closed
Bug 499642
Opened 15 years ago
Closed 15 years ago
[HTML5] Split the HTML5 parser into stream parser, doc.write parser and tree op executor
Categories
(Core :: DOM: HTML Parser, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: hsivonen, Assigned: hsivonen)
References
Details
Attachments
(1 file, 13 obsolete files)
175.22 KB,
patch
|
mozilla+ben
:
review+
mrbkap
:
superreview+
|
Details | Diff | Splinter Review |
In order to prepare for off-the-main-thread parsing, the HTML5 parser should be split into * The network stream parser (will move off the main thread) * document.write() parser (duplicates tokenizer & tree builder; will stay on the main thread) * Tree operation executor (will stay on the main thread) * Some kind of coordinator that implements nsIParser and wraps these.
Assignee | ||
Updated•15 years ago
|
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Assignee | ||
Comment 1•15 years ago
|
||
Assignee | ||
Comment 2•15 years ago
|
||
Attachment #395046 -
Attachment is obsolete: true
Assignee | ||
Comment 3•15 years ago
|
||
This stuff is more intertwingled than I thought. :-(
Attachment #395047 -
Attachment is obsolete: true
Assignee | ||
Comment 4•15 years ago
|
||
Up next, splitting off the stream parser from the rest of the parser. I think I'm going to leave the doc.write parser inside the coordinator object, since splitting those seems to carry no benefit.
Attachment #395575 -
Attachment is obsolete: true
Assignee | ||
Comment 5•15 years ago
|
||
Note to self: diffed against rev 32d7abba0d2b.
Attachment #395595 -
Attachment is obsolete: true
Assignee | ||
Comment 6•15 years ago
|
||
Requires htmlparser/translator svn rev 576 now.
Assignee | ||
Comment 7•15 years ago
|
||
Attachment #396703 -
Attachment is obsolete: true
Assignee | ||
Comment 8•15 years ago
|
||
Rebased, fixed document mode. I guess this one needs to go through tryserver and dogfooding a bit.
Attachment #396721 -
Attachment is obsolete: true
Assignee | ||
Comment 9•15 years ago
|
||
Sigh. This is broken when there's a late charset switch.
Assignee | ||
Comment 10•15 years ago
|
||
Attachment #396991 -
Attachment is obsolete: true
Assignee | ||
Comment 11•15 years ago
|
||
Comment on attachment 397002 [details] [diff] [review] Maybe now. With zeroing operator new. Seems to work finally. Rebased to 9798dcc4a19d. The only deliberate behavior change is ignoring late meta charset switches from document.written content.
Attachment #397002 -
Flags: review?(bnewman)
Assignee | ||
Comment 12•15 years ago
|
||
Note: This patch deliberately doesn't introduce dual tokenizers and tree builders yet. nsHtml5StreamParser just has pointers to the tokenizer and tree builder owned by nsHtml5Parser.
Comment 13•15 years ago
|
||
(In reply to comment #9) > Sigh. This is broken when there's a late charset switch. What breaks, out of curiosity?
Comment 14•15 years ago
|
||
Nits mostly, some of them probably beside the point at this stage. My biggest concern is with the lack of QueryInterface calls (you use static_cast instead). I think there may be something wrong with the interface table declarations of nsHtml5{TreeOpExecutor,StreamParser} that prevents QueryInterface from working properly (I get errors about nsISupports being an ambiguous base), but I'm not sure how to fix it. I assume ignoring late charset switches is a permanent design decision? Is there any treatment of that in the HTML5 spec (or elsewhere)? r=me if you'll consider and/or accept the following suggestions: diff --git a/parser/html/nsHtml5Parser.cpp b/parser/html/nsHtml5Parser.cpp --- a/parser/html/nsHtml5Parser.cpp +++ b/parser/html/nsHtml5Parser.cpp @@ -89,24 +89,24 @@ NS_INTERFACE_TABLE_HEAD(nsHtml5Parser) NS_INTERFACE_MAP_END NS_IMPL_CYCLE_COLLECTING_ADDREF(nsHtml5Parser) NS_IMPL_CYCLE_COLLECTING_RELEASE(nsHtml5Parser) NS_IMPL_CYCLE_COLLECTION_CLASS(nsHtml5Parser) NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN(nsHtml5Parser) - NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mExecutor, nsIContentSink); - NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mStreamParser, nsIStreamListener); + NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mExecutor, nsIContentSink) + NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mStreamParser, nsIStreamListener) tmp->mTreeBuilder->DoTraverse(cb); NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(nsHtml5Parser) - NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mExecutor); - NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mStreamParser); + NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mExecutor) + NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mStreamParser) tmp->mTreeBuilder->DoUnlink(); NS_IMPL_CYCLE_COLLECTION_UNLINK_END nsHtml5Parser::nsHtml5Parser() : mFirstBuffer(new nsHtml5UTF16Buffer(0)) , mLastBuffer(mFirstBuffer) , mExecutor(new nsHtml5TreeOpExecutor()) , mTreeBuilder(new nsHtml5TreeBuilder(mExecutor)) @@ -130,16 +130,17 @@ nsHtml5Parser::SetContentSink(nsIContent { NS_ASSERTION(aSink == static_cast<nsIContentSink*> (mExecutor), "Attempt to set a foreign sink."); } NS_IMETHODIMP_(nsIContentSink*) nsHtml5Parser::GetContentSink(void) { + // Why can't this be done with a QueryInterface? return static_cast<nsIContentSink*> (mExecutor); } NS_IMETHODIMP_(void) nsHtml5Parser::GetCommand(nsCString& aCommand) { aCommand.Assign("view"); } @@ -155,16 +156,17 @@ nsHtml5Parser::SetCommand(eParserCommand { NS_ASSERTION(aParserCommand == eViewNormal, "Parser command was not eViewNormal."); } NS_IMETHODIMP_(void) nsHtml5Parser::SetDocumentCharset(const nsACString& aCharset, PRInt32 aCharsetSource) { + // Why don't we call mExecutor->SetDocumentCharset(aCharset) here? NS_PRECONDITION(mExecutor->GetLifeCycle() == NOT_STARTED, "Document charset set too late."); NS_PRECONDITION(mStreamParser, "Tried to set charset on a script-only parser."); mStreamParser->SetDocumentCharset(aCharset, aCharsetSource); } NS_IMETHODIMP_(void) nsHtml5Parser::SetParserFilter(nsIParserFilter* aFilter) @@ -190,16 +192,17 @@ nsHtml5Parser::GetDTD(nsIDTD** aDTD) } NS_IMETHODIMP nsHtml5Parser::GetStreamListener(nsIStreamListener** aListener) { if (!mStreamParser) { mStreamParser = new nsHtml5StreamParser(mTokenizer, mExecutor, this); } + // Needs a QueryInterface: NS_ADDREF(*aListener = mStreamParser); return NS_OK; } NS_IMETHODIMP nsHtml5Parser::ContinueParsing() { UnblockParser(); @@ -505,18 +508,19 @@ nsHtml5Parser::CancelParsingEvents() } void nsHtml5Parser::Reset() { mExecutor->Reset(); mLastWasCR = PR_FALSE; mFragmentMode = PR_FALSE; - mBlocked = PR_FALSE; + UnblockParser(); mSuspending = PR_FALSE; + // Isn't this redundant with mExecutor->Reset()? mExecutor->SetLifeCycle(NOT_STARTED); mStreamParser = nsnull; // mUninterruptibleDocWrite = PR_FALSE; mRootContextKey = nsnull; mContinueEvent = nsnull; // weak ref // Portable parser objects while (mFirstBuffer->next) { nsHtml5UTF16Buffer* oldBuf = mFirstBuffer; diff --git a/parser/html/nsHtml5Parser.h b/parser/html/nsHtml5Parser.h --- a/parser/html/nsHtml5Parser.h +++ b/parser/html/nsHtml5Parser.h @@ -316,17 +316,17 @@ class nsHtml5Parser : public nsIParser { /** * The event loop will spin ASAP */ PRBool mSuspending; // script execution /** - * + * Can this be killed off? */ PRBool mUninterruptibleDocWrite; // Gecko integration void* mRootContextKey; nsIRunnable* mContinueEvent; // weak ref // Portable parser objects @@ -344,22 +344,22 @@ class nsHtml5Parser : public nsIParser { /** * The tree operation executor */ nsRefPtr<nsHtml5TreeOpExecutor> mExecutor; /** * The HTML5 tree builder */ - nsAutoPtr<nsHtml5TreeBuilder> mTreeBuilder; + const nsAutoPtr<nsHtml5TreeBuilder> mTreeBuilder; /** * The HTML5 tokenizer */ - nsAutoPtr<nsHtml5Tokenizer> mTokenizer; + const nsAutoPtr<nsHtml5Tokenizer> mTokenizer; /** * The stream parser. */ nsRefPtr<nsHtml5StreamParser> mStreamParser; }; #endif diff --git a/parser/html/nsHtml5StreamParser.cpp b/parser/html/nsHtml5StreamParser.cpp --- a/parser/html/nsHtml5StreamParser.cpp +++ b/parser/html/nsHtml5StreamParser.cpp @@ -193,16 +193,18 @@ nsHtml5StreamParser::FinalizeSniffing(co } rv = detector->Done(); NS_ENSURE_SUCCESS(rv, rv); // fall thru; callback may have changed charset } else { NS_ERROR("Could not instantiate charset detector."); } } + // Are we relying on the zeroing operator new to set mCharsetSource to + // kCharsetUninitialized by default? if (mCharsetSource == kCharsetUninitialized) { // Hopefully this case is never needed, but dealing with it anyway mCharset.Assign("windows-1252"); mCharsetSource = kCharsetFromWeakDocTypeDefault; mExecutor->SetDocumentCharset(mCharset); } return SetupDecodingAndWriteSniffingBufferAndCurrentSegment(aFromSegment, aCount, aWriteCount); } @@ -417,32 +419,33 @@ nsHtml5StreamParser::OnStopRequest(nsIRe PRUint32 writeCount; rv = FinalizeSniffing(nsnull, 0, &writeCount, 0); NS_ENSURE_SUCCESS(rv, rv); } switch (mExecutor->GetLifeCycle()) { case TERMINATED: break; case NOT_STARTED: + // What's the deal with these XXX!!! lines? // mExecutor->SetParser(this); XXX!!! // mTreeBuilder->setScriptingEnabled(mExecutor->IsScriptEnabled()); XXX!!! mTokenizer->start(); mExecutor->SetLifeCycle(STREAM_ENDING); break; case STREAM_ENDING: NS_ERROR("OnStopRequest when the stream lifecycle was already ending."); break; default: mExecutor->SetLifeCycle(STREAM_ENDING); break; } #ifdef DEBUG mStreamListenerState = eOnStop; #endif - if (!mExecutor->IsScriptExecutingImpl()) { + if (!mExecutor->IsScriptExecuting()) { ParseUntilSuspend(); } if (mObserver) { mObserver->OnStopRequest(aRequest, aContext, status); } return NS_OK; } @@ -479,17 +482,17 @@ nsHtml5StreamParser::OnDataAvailable(nsI mExecutor->MaybeFlush(); NS_PRECONDITION(eOnStart == mStreamListenerState || eOnDataAvail == mStreamListenerState, "Error: OnStartRequest() must be called before OnDataAvailable()"); NS_ASSERTION(mRequest == aRequest, "Got data on wrong stream."); PRUint32 totalRead; nsresult rv = aInStream->ReadSegments(ParserWriteFunc, static_cast<void*> (this), aLength, &totalRead); NS_ASSERTION(totalRead == aLength, "ReadSegments read the wrong number of bytes."); - if (!mExecutor->IsScriptExecutingImpl()) { + if (!mExecutor->IsScriptExecuting()) { ParseUntilSuspend(); } return rv; } void nsHtml5StreamParser::internalEncodingDeclaration(nsString* aEncoding) { diff --git a/parser/html/nsHtml5StreamParser.h b/parser/html/nsHtml5StreamParser.h --- a/parser/html/nsHtml5StreamParser.h +++ b/parser/html/nsHtml5StreamParser.h @@ -152,16 +152,18 @@ class nsHtml5StreamParser : public nsISt inline void Suspend() { mSuspending = PR_TRUE; } void ParseUntilSuspend(); // The stuff below here is conceptually private but needs to be reachable // from a C-style callback. + // If you made ParserWriteFunc a static method of nsHtml5StreamParser, then + // you wouldn't need to publicize SniffStreamBytes and WriteStreamBytes. /** * True when there is a Unicode decoder already */ inline PRBool HasDecoder() { return !!mUnicodeDecoder; } diff --git a/parser/html/nsHtml5TreeOpExecutor.cpp b/parser/html/nsHtml5TreeOpExecutor.cpp --- a/parser/html/nsHtml5TreeOpExecutor.cpp +++ b/parser/html/nsHtml5TreeOpExecutor.cpp @@ -59,17 +59,17 @@ #define NS_HTML5_TREE_OP_EXECUTOR_MAX_QUEUE_TIME 3000UL // milliseconds #define NS_HTML5_TREE_OP_EXECUTOR_DEFAULT_QUEUE_LENGTH 200 #define NS_HTML5_TREE_OP_EXECUTOR_MIN_QUEUE_LENGTH 100 #define NS_HTML5_TREE_OP_EXECUTOR_MAX_TIME_WITHOUT_FLUSH 5000 // milliseconds NS_IMPL_CYCLE_COLLECTION_CLASS(nsHtml5TreeOpExecutor) -NS_INTERFACE_TABLE_HEAD_CYCLE_COLLECTION_INHERITED(nsHtml5TreeOpExecutor) \ +NS_INTERFACE_TABLE_HEAD_CYCLE_COLLECTION_INHERITED(nsHtml5TreeOpExecutor) NS_INTERFACE_TABLE_INHERITED1(nsHtml5TreeOpExecutor, nsIContentSink) NS_INTERFACE_TABLE_TAIL_INHERITING(nsContentSink) NS_IMPL_ADDREF_INHERITED(nsHtml5TreeOpExecutor, nsContentSink) NS_IMPL_RELEASE_INHERITED(nsHtml5TreeOpExecutor, nsContentSink) @@ -94,16 +94,18 @@ nsHtml5TreeOpExecutor::nsHtml5TreeOpExec , mFlushTimer(do_CreateInstance("@mozilla.org/timer;1")) , mNeedsCharsetSwitch(PR_FALSE) { } nsHtml5TreeOpExecutor::~nsHtml5TreeOpExecutor() { + // Why not just assert emptiness here? Can the executor ever be destroyed + // when there are outstanding operations in the queue? mOpQueue.Clear(); if (mFlushTimer) { mFlushTimer->Cancel(); // XXX why is this even necessary? it is, though. } } static void TimerCallbackFunc(nsITimer* aTimer, void* aClosure) @@ -500,16 +502,18 @@ nsHtml5TreeOpExecutor::MaybeSuspend() { } } void nsHtml5TreeOpExecutor::MaybeExecuteScript() { if (mScriptElement) { // mUninterruptibleDocWrite = PR_FALSE; ExecuteScript(); + // This seems to be the only use of mStreamParser in nsHtml5TreeOpExecutor. + // Would it be possible to avoid letting the executor hold this reference? mStreamParser->Suspend(); } } void nsHtml5TreeOpExecutor::DeferredTimerFlush() { if (mTreeBuilder->GetOpQueueLength() > 0) { mNeedsFlush = PR_TRUE; diff --git a/parser/html/nsHtml5TreeOpExecutor.h b/parser/html/nsHtml5TreeOpExecutor.h --- a/parser/html/nsHtml5TreeOpExecutor.h +++ b/parser/html/nsHtml5TreeOpExecutor.h @@ -174,16 +174,17 @@ public: /** * Unimplemented. For interface compat only. */ NS_IMETHOD WillResume(); /** * Sets the parser. */ + // Can the TreeOpExecutor really handle arbitrary nsIParser*s, or just nsHtml5Parser*s? NS_IMETHOD SetParser(nsIParser* aParser); /** * No-op for backwards compat. */ virtual void FlushPendingNotifications(mozFlushType aType); /** @@ -201,16 +202,18 @@ public: virtual void UpdateChildCounts(); virtual nsresult FlushTags(); virtual void PostEvaluateScript(nsIScriptElement *aElement); /** * Sets up style sheet load / parse */ void UpdateStyleSheet(nsIContent* aElement); + // Shouldn't need all these 'inline' keywords inside the class definition... + // Getters and setters for fields from nsContentSink inline nsIDocument* GetDocument() { return mDocument; } inline nsNodeInfoManager* GetNodeInfoManager() { return mNodeInfoManager; } inline nsIDocShell* GetDocShell() { @@ -346,16 +349,17 @@ public: inline eHtml5ParserLifecycle GetLifeCycle() { return mLifeCycle; } inline void SetLifeCycle(eHtml5ParserLifecycle aLifeCycle) { mLifeCycle = aLifeCycle; } + // Does this need to be public? nsHtml5Tokenizer* GetTokenizer(); void MaybeExecuteScript(); inline void MaybePreventExecution() { if (mScriptElement) { nsCOMPtr<nsIScriptElement> script = do_QueryInterface(mScriptElement); NS_ASSERTION(script, "mScriptElement didn't QI to nsIScriptElement!");
Assignee | ||
Comment 15•15 years ago
|
||
(In reply to comment #14) > My biggest concern is with the lack of QueryInterface calls (you use > static_cast instead). I was thinking of QueryInterface as a dynamic_cast substitute and these cases are casts where it's known what the other object is. > I assume ignoring late charset switches is a permanent design decision? Is > there any treatment of that in the HTML5 spec (or elsewhere)? > > r=me if you'll consider and/or accept the following suggestions: Unfortunately, there was still at least one substantial bug left: It was possible for nsHtml5Parser to get refcounted to zero and released while nsHtml5StreamParser was still alive. I've tried to address that issue in a new patch by introducing a reference cycle that gets broken in the end just like the parser–sink cycle. Re-requesting review with this substantial difference. (But bnewman seems to be away right now.) Also, requesting sr for the nsIParser interface change. > - NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mExecutor, > nsIContentSink); > - NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mStreamParser, > nsIStreamListener); > + NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mExecutor, > nsIContentSink) > + NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR_AMBIGUOUS(mStreamParser, > nsIStreamListener) [...] > - NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mExecutor); > - NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mStreamParser); > + NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mExecutor) > + NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mStreamParser) Fixed. > NS_IMETHODIMP_(nsIContentSink*) > nsHtml5Parser::GetContentSink(void) > { > + // Why can't this be done with a QueryInterface? > return static_cast<nsIContentSink*> (mExecutor); > } It probably could, but it seems it doesn't need to. Is it poor form to do it this way? > NS_IMETHODIMP_(void) > nsHtml5Parser::SetDocumentCharset(const nsACString& aCharset, PRInt32 > aCharsetSource) > { > + // Why don't we call mExecutor->SetDocumentCharset(aCharset) here? > NS_PRECONDITION(mExecutor->GetLifeCycle() == NOT_STARTED, > "Document charset set too late."); > NS_PRECONDITION(mStreamParser, "Tried to set charset on a script-only > parser."); > mStreamParser->SetDocumentCharset(aCharset, aCharsetSource); > } Fixed. > NS_IMETHODIMP > nsHtml5Parser::GetStreamListener(nsIStreamListener** aListener) > { > if (!mStreamParser) { > mStreamParser = new nsHtml5StreamParser(mTokenizer, mExecutor, this); > } > + // Needs a QueryInterface: > NS_ADDREF(*aListener = mStreamParser); > return NS_OK; > } Since mStreamParser is a class pointer, doesn't the compiler to the right thing here without QI? > void > nsHtml5Parser::Reset() > { > mExecutor->Reset(); > mLastWasCR = PR_FALSE; > mFragmentMode = PR_FALSE; > - mBlocked = PR_FALSE; > + UnblockParser(); Fixed. > mSuspending = PR_FALSE; > + // Isn't this redundant with mExecutor->Reset()? > mExecutor->SetLifeCycle(NOT_STARTED); Yes. Removed. > /** > - * > + * Can this be killed off? > */ > PRBool mUninterruptibleDocWrite; Yes. Killed. > /** > * The HTML5 tree builder > */ > - nsAutoPtr<nsHtml5TreeBuilder> mTreeBuilder; > + const nsAutoPtr<nsHtml5TreeBuilder> mTreeBuilder; > > /** > * The HTML5 tokenizer > */ > - nsAutoPtr<nsHtml5Tokenizer> mTokenizer; > + const nsAutoPtr<nsHtml5Tokenizer> mTokenizer; Fixed. > + // Are we relying on the zeroing operator new to set mCharsetSource to > + // kCharsetUninitialized by default? > if (mCharsetSource == kCharsetUninitialized) { Yes. > switch (mExecutor->GetLifeCycle()) { > case TERMINATED: > break; > case NOT_STARTED: > + // What's the deal with these XXX!!! lines? > // mExecutor->SetParser(this); XXX!!! > // mTreeBuilder->setScriptingEnabled(mExecutor->IsScriptEnabled()); > XXX!!! Oops. Fixed with NS_NOTREACHED, because this is now handled in Parse() in nsHtml5Parser. > - if (!mExecutor->IsScriptExecutingImpl()) { > + if (!mExecutor->IsScriptExecuting()) { [...] > - if (!mExecutor->IsScriptExecutingImpl()) { > + if (!mExecutor->IsScriptExecuting()) { Fixed. > // The stuff below here is conceptually private but needs to be reachable > // from a C-style callback. > + // If you made ParserWriteFunc a static method of nsHtml5StreamParser, then > + // you wouldn't need to publicize SniffStreamBytes and WriteStreamBytes. Fixed. > -NS_INTERFACE_TABLE_HEAD_CYCLE_COLLECTION_INHERITED(nsHtml5TreeOpExecutor) \ > +NS_INTERFACE_TABLE_HEAD_CYCLE_COLLECTION_INHERITED(nsHtml5TreeOpExecutor) Fixed. > nsHtml5TreeOpExecutor::~nsHtml5TreeOpExecutor() > { > + // Why not just assert emptiness here? Can the executor ever be destroyed > + // when there are outstanding operations in the queue? > mOpQueue.Clear(); Fixed. > void > nsHtml5TreeOpExecutor::MaybeExecuteScript() { > if (mScriptElement) { > // mUninterruptibleDocWrite = PR_FALSE; > ExecuteScript(); > + // This seems to be the only use of mStreamParser in > nsHtml5TreeOpExecutor. > + // Would it be possible to avoid letting the executor hold this reference? > mStreamParser->Suspend(); > } > } My plan was for this to go away in a later iteration. > /** > * Sets the parser. > */ > + // Can the TreeOpExecutor really handle arbitrary nsIParser*s, or just > nsHtml5Parser*s? > NS_IMETHOD SetParser(nsIParser* aParser); It can't, but this method signature comes from nsIContentSink and didn't seem useful to introduce a second method that takes nsHtml5Parser*. > + // Shouldn't need all these 'inline' keywords inside the class > definition... Removed. Do compilers these days inline the methods anyway? > + // Does this need to be public? > nsHtml5Tokenizer* GetTokenizer(); Fixed.
Attachment #397002 -
Attachment is obsolete: true
Attachment #397854 -
Flags: superreview?(mrbkap)
Attachment #397854 -
Flags: review?(mrbkap)
Attachment #397002 -
Flags: review?(bnewman)
Assignee | ||
Updated•15 years ago
|
Attachment #397854 -
Attachment is obsolete: true
Attachment #397854 -
Flags: superreview?(mrbkap)
Attachment #397854 -
Flags: review?(mrbkap)
Assignee | ||
Comment 16•15 years ago
|
||
Comment on attachment 397854 [details] [diff] [review] Address review commets, make stream parser hold a strong ref back to parser The old problem that script global object doesn't exist until first OnDataAvailable came back to bite me. This patch is broken. Since the whole point is moving OnDataAvailable elsewhere, this situation is a bit of a problem.
Assignee | ||
Comment 17•15 years ago
|
||
I thought about putting a tree op for WillBuildModel on the queue, but that doesn't work right if the network stream ends and is removed from the load group before the queue is flushed. Maybe it's time to actually fix things so that script global object is initialized before calling into the parser.
Assignee | ||
Comment 18•15 years ago
|
||
Changing the docshell seemed too difficult and dangerous, so I fixed the issue (again) on the parser side this time leaving around comments to prevent the issue for reoccurring.
Attachment #398354 -
Flags: superreview?(mrbkap)
Attachment #398354 -
Flags: review?(mrbkap)
Assignee | ||
Updated•15 years ago
|
Attachment #398354 -
Attachment is obsolete: true
Attachment #398354 -
Flags: superreview?(mrbkap)
Attachment #398354 -
Flags: review?(mrbkap)
Assignee | ||
Comment 19•15 years ago
|
||
Comment on attachment 398354 [details] [diff] [review] Fix script global object order Still not baked enough. This stuff is incredible brittle. :-(
Assignee | ||
Comment 20•15 years ago
|
||
> I assume ignoring late charset switches is a permanent design decision? Is > there any treatment of that in the HTML5 spec (or elsewhere)? This is to address bug 487949 comment 69. It's not in the spec yet. (Now is a convenient time to address this, because changing this as a separate patch would be harder.)
Assignee | ||
Comment 21•15 years ago
|
||
Assignee | ||
Comment 22•15 years ago
|
||
Attachment #399051 -
Attachment is obsolete: true
Assignee | ||
Comment 23•15 years ago
|
||
Sorry about the bug spam. Managed to attached the wrong patch twice.
Attachment #399054 -
Attachment is obsolete: true
Assignee | ||
Updated•15 years ago
|
Attachment #399056 -
Flags: superreview?(mrbkap)
Attachment #399056 -
Flags: review?(bnewman)
Comment 24•15 years ago
|
||
Two quick questions before I grant the review: diff --git a/parser/html/nsHtml5Parser.cpp b/parser/html/nsHtml5Parser.cpp --- a/parser/html/nsHtml5Parser.cpp +++ b/parser/html/nsHtml5Parser.cpp @@ -292,16 +292,19 @@ nsHtml5Parser::Parse(const nsAString& aS PRBool aLastCall, nsDTDMode aMode) // ignored { NS_PRECONDITION(!mFragmentMode, "Document.write called in fragment mode!"); // Maintain a reference to ourselves so we don't go away // till we're completely done. nsCOMPtr<nsIParser> kungFuDeathGrip(this); + // I'm all for more kung fu, but it seems as though the death grip on the + // parser should protect the parser's nsRefPtr members, so these additional + // grips shouldn't be necessary. Any theory as to why they help? nsRefPtr<nsHtml5StreamParser> streamKungFuDeathGrip(mStreamParser); nsRefPtr<nsHtml5TreeOpExecutor> treeOpKungFuDeathGrip(mExecutor); // Return early if the parser has processed EOF switch (mExecutor->GetLifeCycle()) { case TERMINATED: return NS_OK; case NOT_STARTED: diff --git a/parser/html/nsHtml5StreamParser.cpp b/parser/html/nsHtml5StreamParser.cpp --- a/parser/html/nsHtml5StreamParser.cpp +++ b/parser/html/nsHtml5StreamParser.cpp @@ -394,16 +394,17 @@ nsHtml5StreamParser::OnStartRequest(nsIR mStreamListenerState = eOnStart; #endif mRequest = aRequest; /* * If you move the following line, be very careful not to cause * WillBuildModel to be called before the document has had its * script global object set. + * Can this be enforced with an assertion in nsHtml5TreeOpExecutor::WillBuildModel? */ mTokenizer->start(); mExecutor->SetLifeCycle(PARSING); if (mCharsetSource < kCharsetFromChannel) { // we aren't ready to commit to an encoding yet // leave converter uninstantiated for now return NS_OK;
Assignee | ||
Comment 25•15 years ago
|
||
(In reply to comment #24) > Two quick questions before I grant the review: > > diff --git a/parser/html/nsHtml5Parser.cpp b/parser/html/nsHtml5Parser.cpp > --- a/parser/html/nsHtml5Parser.cpp > +++ b/parser/html/nsHtml5Parser.cpp > @@ -292,16 +292,19 @@ nsHtml5Parser::Parse(const nsAString& aS > PRBool aLastCall, > nsDTDMode aMode) // ignored > { > NS_PRECONDITION(!mFragmentMode, "Document.write called in fragment mode!"); > > // Maintain a reference to ourselves so we don't go away > // till we're completely done. > nsCOMPtr<nsIParser> kungFuDeathGrip(this); > + // I'm all for more kung fu, but it seems as though the death grip on the > + // parser should protect the parser's nsRefPtr members, so these additional > + // grips shouldn't be necessary. Any theory as to why they help? > nsRefPtr<nsHtml5StreamParser> streamKungFuDeathGrip(mStreamParser); > nsRefPtr<nsHtml5TreeOpExecutor> treeOpKungFuDeathGrip(mExecutor); In the other case where the old parser does a kungFuDeathGrip, the new split parser kept crashing until I made it grip the stream parser and executor also. It's not clear to me why exactly this is, since I didn't audit the code outside the parser that holds references to the parts of the parser. In the case of this method, I don't have a theory of whether gripping the stream parser and executor is absolutely necessary. It's not fully clear to me why the old parser grips itself in this method. I added these grips, because the other old grip required gripping the other objects, too, and I wanted to avoid this landing getting delayed by a crash caused by failure to have the same grip pattern here as well. (My planned schedule was already disrupted by the other grip case.) I moved the grips to the top of the method just in case. Added comments about this. > + * Can this be enforced with an assertion in > nsHtml5TreeOpExecutor::WillBuildModel? Added.
Attachment #399056 -
Attachment is obsolete: true
Attachment #399974 -
Flags: superreview?(mrbkap)
Attachment #399974 -
Flags: review?(bnewman)
Attachment #399056 -
Flags: superreview?(mrbkap)
Attachment #399056 -
Flags: review?(bnewman)
Updated•15 years ago
|
Attachment #399974 -
Flags: review?(bnewman) → review+
Comment 26•15 years ago
|
||
(In reply to comment #25) > > + // I'm all for more kung fu, but it seems as though the death grip on the > > + // parser should protect the parser's nsRefPtr members, so these additional > > + // grips shouldn't be necessary. Any theory as to why they help? > > nsRefPtr<nsHtml5StreamParser> streamKungFuDeathGrip(mStreamParser); > > nsRefPtr<nsHtml5TreeOpExecutor> treeOpKungFuDeathGrip(mExecutor); > > In the other case where the old parser does a kungFuDeathGrip, the new split > parser kept crashing until I made it grip the stream parser and executor also. > It's not clear to me why exactly this is, since I didn't audit the code outside > the parser that holds references to the parts of the parser. My analysis was flawed, actually: the grips could well be necessary if mStreamParser or mExecutor are ever reassigned during parsing. It doesn't appear trivial to make those data members const, so the grips should definitely stay. And, as you argue, they don't hurt. mrbkap: nudge?
Updated•15 years ago
|
Attachment #399974 -
Flags: superreview?(mrbkap) → superreview+
Assignee | ||
Comment 27•15 years ago
|
||
Thank you for the r&sr. Pushed as: http://hg.mozilla.org/mozilla-central/rev/ac68245b5363
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 28•15 years ago
|
||
case eTreeOpProcessOfflineManifest: { - rv = aBuilder->ProcessOfflineManifest(mNode); + aBuilder->ProcessOfflineManifest(mNode); return rv; } This doesn't look right to me.
Comment 29•15 years ago
|
||
(In reply to comment #28) > case eTreeOpProcessOfflineManifest: { > - rv = aBuilder->ProcessOfflineManifest(mNode); > + aBuilder->ProcessOfflineManifest(mNode); > return rv; > } > > This doesn't look right to me. nsContentSink::ProcessOfflineManifest is void: http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsContentSink.h#251
Assignee | ||
Comment 30•15 years ago
|
||
The parser was written with the assumption that no-fail malloc would land long before the parser, so the parser drops OOM-or-OK type nsresults on the floor all over the place.
You need to log in
before you can comment on or make changes to this bug.
Description
•