Closed Bug 482918 Opened 14 years ago Closed 13 years ago

[HTML5] Move HTML5 parsing off the main thread

Categories

(Core :: DOM: HTML Parser, defect, P1)

Other Branch
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: hsivonen, Assigned: hsivonen)

References

Details

Attachments

(3 files, 10 obsolete files)

1.99 KB, patch
Details | Diff | Splinter Review
55.27 KB, patch
mozilla+ben
: review+
Details | Diff | Splinter Review
55.26 KB, patch
Details | Diff | Splinter Review
When loading an HTML5 document into a browsing context, the bytes to Unicode decoding, the tokenization and the HTML5 tree builder algorithms should run off the main thread.
Blocks: 482919
Priority: -- → P1
Depends on: 499642
Blocks: 503292
Depends on: 514661
Blocks: 514669
Depends on: 515142
Depends on: 515338
Depends on: 516406
Depends on: 516411
Attached patch Backup of WIP (obsolete) — Splinter Review
As far as I can tell, the way the functors wrap "this" in an nsRefPtr is incompatible with 'this' being a cycle collected (i.e. thread-unsafe nsISupports) object.

Should I have a variant of nsRefPtr that proxies releases to back to main thread or should I make the stream parser not participate in CC or something else?

(Also unaddressed so far is the Unicode decoder stuff. IIRC, encoding aliases are loaded lazily. Doing that off the main thread without a mutex seems like a problem.)
Attachment #402859 - Attachment is obsolete: true
Attached patch Entire mq in one patch (obsolete) — Splinter Review
So the patch should now work in theory but it doesn't work in practice.

Problem 1: The functor used for releasing in nsHtml5RefPtr gets a null this pointer wrapped in ref(), and there's a bad access upon trying to run the functor. (#if 0'ed out for now.)

Problem 2: If the pref is html5.offmainthread is set to true, there is almost always a crash when the DeadlockDetector tries to access its hashtable. Stack:
#0    0x0001826c in PL_HashTableRawLookup at plhash.c:178
#1    0x00520f3d in
mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::GetEntry
at DeadlockDetector.h:267
#2    0x00521def in
mozilla::DeadlockDetector<mozilla::BlockingResourceBase::DeadlockDetectorEntry>::CheckAcquisition
at DeadlockDetector.h:435
#3    0x005204ed in mozilla::BlockingResourceBase::CheckAcquire at
BlockingResourceBase.cpp:140
#4    0x005207ba in mozilla::Monitor::Enter at BlockingResourceBase.cpp:314
#5    0x1a972329 in mozilla::MonitorAutoEnter::MonitorAutoEnter at
Monitor.h:216
#6    0x1a974806 in RunnableFunctor<void>::AfterRun at RunnableFunctor.h:61
#7    0x1a9749f9 in RunnableFunctorImpl<void, Functor<void, NullType> >::Run at
RunnableFunctor.h:134
#8    0x0059a6c6 in nsThread::ProcessNextEvent at nsThread.cpp:527
#9    0x0051df68 in NS_ProcessNextEvent_P at nsThreadUtils.cpp:230
#10    0x0059a8d5 in nsThread::ThreadFunc at nsThread.cpp:254
#11    0x0016b1f7 in _pt_root at ptthread.c:228
#12    0x9069c155 in _pthread_start
#13    0x9069c012 in thread_start

Problem 3: OnDataAvailable uses an nsAutoArrayPtr in the functor to hold the data. When the destructor fires, malloc complains that the pointer being freed is misaligned.

Problem 4: When a monitor enters its monitor in AfterRun, the monitor's memory is filled with 0xfffd on Intel Mac (freed memory?). As a result, pthread_mutex_lock returns 22 and the app aborts itself.
#0	0x906d8e42 in __kill
#1	0x906d8e34 in kill$UNIX2003
#2	0x9074b23a in raise
#3	0x90757679 in abort
#4	0x001497c3 in PR_Assert at prlog.c:563
#5	0x001638d6 in PR_Lock at ptsynch.c:207
#6	0x001645f6 in PR_EnterMonitor at ptsynch.c:548
#7	0x0052044e in mozilla::Monitor::Enter at BlockingResourceBase.cpp:317
#8	0x12199329 in mozilla::MonitorAutoEnter::MonitorAutoEnter at Monitor.h:216
#9	0x1219b806 in RunnableFunctor<void>::AfterRun at RunnableFunctor.h:61
#10	0x1219b9f9 in RunnableFunctorImpl<void, Functor<void, NullType> >::Run at RunnableFunctor.h:134
#11	0x0059984a in nsThread::ProcessNextEvent at nsThread.cpp:527
#12	0x0051e332 in NS_ProcessPendingEvents_P at nsThreadUtils.cpp:180
#13	0x14649151 in nsBaseAppShell::NativeEventCallback at nsBaseAppShell.cpp:121
#14	0x145fedda in nsAppShell::ProcessGeckoEvents at nsAppShell.mm:413
#15	0x94c2340f in CFRunLoopRunSpecific
#16	0x94c23aa8 in CFRunLoopRunInMode
#17	0x92c992ac in RunCurrentEventLoopInMode
#18	0x92c98ffe in ReceiveNextEventCommon
#19	0x92c98f39 in BlockUntilNextEventMatchingListInMode
#20	0x936c96d5 in _DPSNextEvent
#21	0x936c8f88 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]
#22	0x936c1f9f in -[NSApplication run]
#23	0x145fce04 in nsAppShell::Run at nsAppShell.mm:766
#24	0x1431fd2a in nsAppStartup::Run at nsAppStartup.cpp:182
#25	0x000acc5e in XRE_main at nsAppRunner.cpp:3418
#26	0x00002863 in main at nsBrowserApp.cpp:156

bnewman, cjones, any advice on how to proceed?

(In good news, I believe I've seen Hixie's Live DOM viewer parse off the main thread twice when by timing luck none of the above problems killed the app first.)
Attachment #403254 - Attachment description: Entiry mq in one patch → Entire mq in one patch
The DeadlockDetector crash makes no sense to me, but looks possibly bad.  Would you please file a separate bug on it?
I've been unable to reproduce the deadlock detector crash since I rebooted and have had a different set of processes running. My current hypothesis is that problem #3 above corrupts the heap and the exact location of failure depends on timing, available real memory or things like that. I'm not seeing crashes in totally illogical places.
I took the functors out and used hand-rolled runnables. It's alive! Woohoo! (cnn.com is broken, though, but simpler pages seem to work.)
Attachment #403253 - Attachment is obsolete: true
Attachment #403254 - Attachment is obsolete: true
Attachment #403463 - Attachment is obsolete: true
(In reply to comment #8)
> Created an attachment (id=403777) [details]
> Proxy releases back to main thread

If you add another template parameter to nsRefPtr itself, then you can easily swap out the Release semantics without creating a whole new smart pointer class.  Then you could just

  typedef nsRefPtr<nsHtml5StreamParser,
                   ProxyingRefPtrReleasePolicy>
          nsHtml5StreamParserRefPtr;

and use nsHtml5StreamParserRefPtr wherever you need it.

I'm not sure it's appropriate to define the ProxyingRefPtrReleasePolicy in nsAutoPtr.h, but I would think it's a pretty common use case.
(In reply to comment #9)
> Created an attachment (id=403843) [details]
> Another approach to proxying that avoids duplicating nsRefPtr code

+      void
+      Release( T* ptr )
+        {
>>>        if ( ptr )
+            ptr->Release();
+        }

And, ah, um... just imagine that I checked those T* ptr's for null :)
(In reply to comment #10)
> And, ah, um... just imagine that I checked those T* ptr's for null :)

And that ProxyingRefPtrReleasePolicy::mMainThread was static (and called sMainThread).  And that my code followed the weird style conventions in nsAutoPtr.h.  And probably some other bits I missed.

In other words, please consider this patch just for illustration purposes.
Attached patch Address remaining assertions (obsolete) — Splinter Review
Attachment #403777 - Attachment is obsolete: true
(In reply to comment #9)
> Created an attachment (id=403843) [details]
> Another approach to proxying that avoids duplicating nsRefPtr code

> If you add another template parameter to nsRefPtr itself, then you can easily
> swap out the Release semantics without creating a whole new smart pointer
> class.

Is parametrizing templates with other templates known to work across all compilers that Gecko needs to compile on?

https://developer.mozilla.org/en/C___Portability_Guide#Be_very_careful_when_writing_C.2b.2b_templates looks discouraging.
Attachment #404025 - Attachment is obsolete: true
Attached patch Refresh patch (obsolete) — Splinter Review
Attachment #405484 - Attachment is obsolete: true
Attached patch Fix bitrot (obsolete) — Splinter Review
Attachment #406447 - Attachment is obsolete: true
Attached patch Fix bitrot, properly (obsolete) — Splinter Review
Attachment #406684 - Attachment is obsolete: true
Attachment #406685 - Flags: review?(bnewman)
Status: NEW → ASSIGNED
Attachment #406685 - Attachment is obsolete: true
Attachment #407270 - Flags: review?(bnewman)
Attachment #406685 - Flags: review?(bnewman)
Comment on attachment 407270 [details] [diff] [review]
Change nsCharsetAliasImpl thread-safety protection

Looks good. Two suggestions:

- You can add an unprotected !mFastTableCreated check here

  if (!mFastTableCreated) { // <-- add this
    // Probably better to make this non-lazy and get rid of the mutex
    mozilla::MutexAutoLock autoLock(mFastTableMutex);
    if (!mFastTableCreated) {
      ...
    }
  } // <-- and this

  to avoid locking after mFastTableCreated has been set to PR_TRUE. The existing (protected) check is still necessary when there's a race.

- You should be able to use the existing nsRunnableMethod class instead of nsHtml5RequestStopper and nsHtml5ContinueAfterScript.  See here:

  http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsThreadUtils.h#247

  Probably still a good idea to use the NS_NEW_RUNNABLE_METHOD macro. nsRunnableMethod won't work for nsHtml5DataAvailable, since it takes arguments.
Attachment #407270 - Flags: review?(bnewman) → review+
(In reply to comment #20)
> (From update of attachment 407270 [details] [diff] [review])
> Looks good. Two suggestions:

Thanks.

> - You can add an unprotected !mFastTableCreated check here
> 
>   if (!mFastTableCreated) { // <-- add this
>     // Probably better to make this non-lazy and get rid of the mutex
>     mozilla::MutexAutoLock autoLock(mFastTableMutex);
>     if (!mFastTableCreated) {
>       ...
>     }
>   } // <-- and this
> 
>   to avoid locking after mFastTableCreated has been set to PR_TRUE. The
> existing (protected) check is still necessary when there's a race.

Fixed.

> - You should be able to use the existing nsRunnableMethod class instead of
> nsHtml5RequestStopper and nsHtml5ContinueAfterScript.  See here:
> 
>   http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsThreadUtils.h#247
> 
>   Probably still a good idea to use the NS_NEW_RUNNABLE_METHOD macro.
> nsRunnableMethod won't work for nsHtml5DataAvailable, since it takes arguments.

This wouldn't work right because the stream parser is a CC participant and doesn't have thread-safe Release().
Here's the patch with the if inside and outside the mutex for reference.
Thank you for the review.

Landed as http://hg.mozilla.org/mozilla-central/rev/7cda86954b4c
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 593046
You need to log in before you can comment on or make changes to this bug.