1359490 - abstract away event loop spinning

Assignee

Description

•

7 years ago

Even something as simple as:

template<typename Pred>
void
SpinEventLoopUntil(Pred aPredicate, nsIThread* aThread = nullptr)
{
  nsIThread* thread = aThread ? aThread : NS_GetCurrentThread();

  while (!aPredicate()) {
    MOZ_ALWAYS_TRUE(NS_ProcessNextEvent(thread, true));
  }
}

would be a little bit nicer than what we have, and would be slightly more efficient for a lot of the cases that we have.  For Quantum DOM, this change would also enable us to add the thread-yielding smarts we need in one place, rather than having it scattered all over.

With any luck, we can *almost* eliminate NS_ProcessNextEvent except from a few well-known places.

Nathan Froyd [:froydnj]

Assignee

Comment 1

•

7 years ago

One sadness here is that the predicate typically has to access member variables and/or functions, and for refcounted things, that means we have to take an extra reference to satisfy the static analysis.  Kind of unfortunate, but that's life.

Gerald Squelart (he/him) (not at Mozilla since 2022-09-15)

Comment 2

•

7 years ago

Would there be a way to mark such functions, to indicate that they're using their function object parameters straight away?

Nathan Froyd [:froydnj]

Assignee

Comment 3

•

7 years ago

(In reply to Gerald Squelart [:gerald] from comment #2)
> Would there be a way to mark such functions, to indicate that they're using
> their function object parameters straight away?

Probably!  But it seems better to modify the static analysis to perform some sort of limited escape analysis on the lambda.  Something like:

* Not stored anywhere;
* Not passed to other functions;
* Not captured by other lambdas (which I guess is a special case of storing?).

Lambdas that pass the above, and maybe some other checks I haven't thought of, ought to be able to capture `this` without problems...though I think last time we tried to relax the analysis, we started writing use-after-free bugs.  So we'd have to be very careful in defining our conditions.

Nathan Froyd [:froydnj]

Assignee

Updated

•

7 years ago

Blocks: 1359923

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox55: --- → affected

status-firefox57: affected → ---

Nathan Froyd [:froydnj]

Assignee

Comment 4

•

7 years ago

Attached patch add an event loop spinning abstraction function — Details — Splinter Review

This function is arguably nicer than calling NS_ProcessNextEvent
manually, is slightly more efficient, and will enable better auditing
for NS_ProcessNextEvent when we do Quantum DOM scheduling changes.

Things I'd especially like feedback on:

* Lambda syntax; this is my first significant patch with lambdas;

* The big comment before ProcessFailureBehavior.  Strictly speaking, we don't
  *need* the enum, as we could just do:

  while (!SpinEventLoopUntil(...)) {}

  but that seems kind of ugly.  Better ways to address this welcome.

* Other things that I may not have thought of here!

Attachment #8862614 - Flags: review?(gsquelart)

Nathan Froyd [:froydnj]

Assignee

Comment 5

•

7 years ago

To be clear, this patch passes the static analysis, so the concerns from comment 1 are unfounded!

Gerald Squelart (he/him) (not at Mozilla since 2022-09-15)

Updated

•

7 years ago

Assignee: nobody → nfroyd

Gerald Squelart (he/him) (not at Mozilla since 2022-09-15)

Comment 6

•

7 years ago

Comment on attachment 8862614 [details] [diff] [review]
add an event loop spinning abstraction function

Review of attachment 8862614 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with my small nits addressed (or ignored as appropriate).

Happy to re-review if you implement my crazy suggestions!

::: devtools/server/nsJSInspector.cpp
@@ +73,5 @@
>  
>    mozilla::dom::AutoNoJSAPI nojsapi;
>  
>    uint32_t nestLevel = ++mNestedLoopLevel;
> +  if (!SpinEventLoopUntil([&, this]() { return mNestedLoopLevel < nestLevel; })) {

'&' also captures '*this' by reference, so 'this' is not needed.
(And more cases in this patch.)

::: dom/filehandle/ActorsParent.cpp
@@ +945,5 @@
>      MOZ_ASSERT(mShutdownComplete);
>      return;
>    }
>  
> +  SpinEventLoopUntil([&, this]() { return mShutdownComplete; });

In this kind of cases (only accessing member variables), `[this]` should be enough.

::: dom/indexedDB/ActorsParent.cpp
@@ +13596,5 @@
>        }
>  #endif // DEBUG
> +
> +      return false;
> +        }));

I'd unindent `}));` to be two columns left, below `return false;`.

::: dom/network/UDPSocketChild.cpp
@@ +122,5 @@
>      return NS_ERROR_FAILURE;
>    }
>  
> +  if (!SpinEventLoopUntil([&done]() { return done; })) {
> +    return NS_ERROR_FAILURE;

No more warning in the error case, is that intended?

::: dom/quota/ActorsParent.cpp
@@ -2861,5 @@
> -  nsIThread* currentThread = NS_GetCurrentThread();
> -  MOZ_ASSERT(currentThread);
> -
> -  while (!done) {
> -    MOZ_ALWAYS_TRUE(NS_ProcessNextEvent(currentThread));

You removed the MOZ_ALWAYS_TRUE, is that intended?
(Guessing: Probably yes, otherwise we would have got crash reports by now.)

::: dom/workers/ServiceWorkerRegistrar.cpp
@@ -992,5 @@
> -  nsCOMPtr<nsIThread> thread(do_GetCurrentThread());
> -  while (true) {
> -    MOZ_ALWAYS_TRUE(NS_ProcessNextEvent(thread));
> -    if (completed) {
> -      break;

Wow, someone worked really hard to write a `do while` loop!
I'm guessing that doing the extra initial test now is fine though.

No more MOZ_ALWAYS_TRUE?

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +3669,5 @@
>  
>    // Here we handle the event queue for dispatches to the main thread
>    // When the GTest thread is complete it will send one more dispatch
>    // with gTestsComplete == true.
> +  SpinEventLoopUntil([&gTestsComplete]() { return gTestsComplete; });

Assuming gFoo is global, there's no need to capture it.

::: storage/mozStorageService.cpp
@@ +944,3 @@
>          }
> +
> +        return !anyOpen;

Forget anyOpen, just do: `for(){if(){/*still open*/ return false;}} /*all closed*/ return true;`

::: toolkit/components/url-classifier/tests/gtest/Common.cpp
@@ -60,5 @@
> -    // Condition variable doesn't work here because instrusively
> -    // notifying the from NS_NewCheckSummedOutputStream() or
> -    // HashStore::WriteFile() is weird.
> -    if (!NS_ProcessNextEvent(NS_GetCurrentThread(), false)) {
> -      PR_Sleep(PR_MillisecondsToInterval(100));

You didn't keep the PR_Sleep -- intended?

::: xpcom/threads/SharedThreadPool.cpp
@@ -77,5 @@
>  SharedThreadPool::SpinUntilEmpty()
>  {
>    MOZ_ASSERT(NS_IsMainThread());
> -  while (!IsEmpty()) {
> -    sMonitor->AssertNotCurrentThreadIn();

The assertion doesn't just work inside the lambda?

::: xpcom/threads/ThrottledEventQueue.cpp
@@ -258,5 @@
>      // Once shutdown begins we set the Atomic<bool> mShutdownStarted flag.
>      // This prevents any new runnables from being dispatched into the
>      // TaskQueue.  Therefore this loop should be finite.
> -    while (!IsEmpty()) {
> -      MOZ_ALWAYS_TRUE(NS_ProcessNextEvent());

No more MOZ_ALWAYS_TRUE?

::: xpcom/threads/nsThread.cpp
@@ +995,5 @@
>  void
>  nsThread::WaitForAllAsynchronousShutdowns()
>  {
> +  // This is the motivating example for why SpinEventLoop has the template
> +  // parameter we are providing here.

I think you should put the full argument here!

::: xpcom/threads/nsThreadUtils.h
@@ +256,5 @@
> +  IgnoreAndContinue,
> +  ReportToCaller,
> +};
> +
> +template<enum ProcessFailureBehavior Behavior = ProcessFailureBehavior::ReportToCaller,

No need for the `enum` keyword here.

@@ +259,5 @@
> +
> +template<enum ProcessFailureBehavior Behavior = ProcessFailureBehavior::ReportToCaller,
> +         typename Pred>
> +bool
> +SpinEventLoopUntil(Pred aPredicate, nsIThread* aThread = nullptr)

0. Returned bool
I think you should document what its value means, probably in the first doc block above.
And it would gently introduce the behaviors. :-)

I see most calls just ignore it. It seems a bit dangerous in a way, especially when the default behavior mentions "Report", but then we ignore that report!

Crazy thinking about that: How about having two specializations, so that either the function will be `void`, or will return `MOZ_MUST_USE bool` (Assuming MOZ_MUST_USE can indeed be selectively applied to a specialization.)
Tweaked behaviors could be:
- ProcessFailureBehavior::ReportToCaller -> `MOZ_MUST_USE bool`, and `return false;` if !didSomething.
- ProcessFailureBehavior::Stop -> `void`, and `return;` if !didSomething.
- ProcessFailureBehavior::IgnoreAndContinue -> `void` and no early return.
I'd vote for ReportToCaller to still be the default, as it should force people to think a bit about which behavior to pick at every call, because either they'll need to understand and deal with the return value, or explicitly choose a non-returning behavior.


1. Function name:
Strictly speaking, you're not spinning "Until", but more "WhileNot" -- i.e., the predicate is tested even before the first event-processing loop.
But it sounds a bit awkward and mind-bending, so I'm not sure it would be such a good idea?

Alternatively, you could change it to "While", i.e., reverse the predicate's meaning -- but then you'd have to change every caller! :-P
However most of the original loops were `while (pred)`, which means the predicate would stay the same as it was, instead of having to reverse it as you have (painfully) done!

But then again, it's probably simpler to usually think about how to end a loop, instead of when to keep it running, so as a user I would probably like Until/WhileNot better.

AARGH! Sorry for the over-thinking. I'll let you decide...


2. Predicate parameter:
Take it by '&&'. As it's a templated type, the most appropriate passing method will be taken, potentially avoiding some copies.

@@ +269,5 @@
> +
> +    if (Behavior == ProcessFailureBehavior::IgnoreAndContinue) {
> +      // Don't care what happened, continue on.
> +      continue;
> +    } else if (!didSomething) {

Instead of `if (B==I) { continue; } else if (!d) {return}` you could just do `if (B==R && !d) {return}`, it feels simpler to me in this case.

::: xpfe/appshell/nsXULWindow.cpp
@@ +2003,5 @@
>    xulWin->LockUntilChromeLoad();
>  
>    {
>      AutoNoJSAPI nojsapi;
> +    SpinEventLoopUntil([=]() { return !xulWin->IsLocked(); });

`[=]` seems wrong; why not `[&]` like everywhere else?

Attachment #8862614 - Flags: review?(gsquelart) → review+

Gerald Squelart (he/him) (not at Mozilla since 2022-09-15)

Comment 7

•

7 years ago

> I'd unindent `}));` to be two columns left, below `return false;`.

To clarify, I meant two columns left compared to the return statement, i.e.:
> +      return false;
> +    }));

Nathan Froyd [:froydnj]

Assignee

Comment 8

•

7 years ago

Thanks for the review!

(In reply to Gerald Squelart [:gerald] from comment #6)
> ::: dom/network/UDPSocketChild.cpp
> @@ +122,5 @@
> >      return NS_ERROR_FAILURE;
> >    }
> >  
> > +  if (!SpinEventLoopUntil([&done]() { return done; })) {
> > +    return NS_ERROR_FAILURE;
> 
> No more warning in the error case, is that intended?

Sort of.  I guess we could warn.

> ::: dom/quota/ActorsParent.cpp
> @@ -2861,5 @@
> > -  nsIThread* currentThread = NS_GetCurrentThread();
> > -  MOZ_ASSERT(currentThread);
> > -
> > -  while (!done) {
> > -    MOZ_ALWAYS_TRUE(NS_ProcessNextEvent(currentThread));
> 
> You removed the MOZ_ALWAYS_TRUE, is that intended?
> (Guessing: Probably yes, otherwise we would have got crash reports by now.)

Ah, no.  I was originally moving MOZ_ALWAYS_TRUE into the event loop spinner itself, but that's obviously not correct in general.

> ::: dom/workers/ServiceWorkerRegistrar.cpp
> @@ -992,5 @@
> > -  nsCOMPtr<nsIThread> thread(do_GetCurrentThread());
> > -  while (true) {
> > -    MOZ_ALWAYS_TRUE(NS_ProcessNextEvent(thread));
> > -    if (completed) {
> > -      break;
> 
> Wow, someone worked really hard to write a `do while` loop!
> I'm guessing that doing the extra initial test now is fine though.
> 
> No more MOZ_ALWAYS_TRUE?

Same deal here.

> ::: toolkit/components/url-classifier/tests/gtest/Common.cpp
> @@ -60,5 @@
> > -    // Condition variable doesn't work here because instrusively
> > -    // notifying the from NS_NewCheckSummedOutputStream() or
> > -    // HashStore::WriteFile() is weird.
> > -    if (!NS_ProcessNextEvent(NS_GetCurrentThread(), false)) {
> > -      PR_Sleep(PR_MillisecondsToInterval(100));
> 
> You didn't keep the PR_Sleep -- intended?

Ugh.  So it was intended, because this is silly code: try to process an event if one's there...if not, sleep, and then try to do the same thing again.  Why not just wait in the thread's event queue itself instead?  Probably more efficient.

I thought the condition is set by a runnable sent to the current thread.  But it can actually be triggered by runnables run on other threads (?).  The comment suggests that we have to do things this way because other code can run on this thread (?), but I think the person who wrote the comment must have been confused.

Upshot: I think the conversion is correct, but you have to squint at it kind of funny to see.

> ::: xpcom/threads/SharedThreadPool.cpp
> @@ -77,5 @@
> >  SharedThreadPool::SpinUntilEmpty()
> >  {
> >    MOZ_ASSERT(NS_IsMainThread());
> > -  while (!IsEmpty()) {
> > -    sMonitor->AssertNotCurrentThreadIn();
> 
> The assertion doesn't just work inside the lambda?

Oh, it probably does!  I guess it will be checked "outside" the loop, but that's probably not such a bad thing.

> ::: xpcom/threads/ThrottledEventQueue.cpp
> @@ -258,5 @@
> >      // Once shutdown begins we set the Atomic<bool> mShutdownStarted flag.
> >      // This prevents any new runnables from being dispatched into the
> >      // TaskQueue.  Therefore this loop should be finite.
> > -    while (!IsEmpty()) {
> > -      MOZ_ALWAYS_TRUE(NS_ProcessNextEvent());
> 
> No more MOZ_ALWAYS_TRUE?

Same deal as before.

> Crazy thinking about that: How about having two specializations, so that
> either the function will be `void`, or will return `MOZ_MUST_USE bool`
> (Assuming MOZ_MUST_USE can indeed be selectively applied to a
> specialization.)
> Tweaked behaviors could be:
> - ProcessFailureBehavior::ReportToCaller -> `MOZ_MUST_USE bool`, and `return
> false;` if !didSomething.
> - ProcessFailureBehavior::Stop -> `void`, and `return;` if !didSomething.
> - ProcessFailureBehavior::IgnoreAndContinue -> `void` and no early return.
> I'd vote for ReportToCaller to still be the default, as it should force
> people to think a bit about which behavior to pick at every call, because
> either they'll need to understand and deal with the return value, or
> explicitly choose a non-returning behavior.

I kind of like this idea, but then I have to audit every site that I use this function at, and I'm not confident of being able to identify the correct behavior at every site.  I guess maybe ReportToCaller for all the places that use MOZ_ALWAYS_TRUE or similar, and IgnoreAndContinue for everybody else?  I'm also not sure that Stop would ever be the correct behavior during shutdown, for reasons (obliquely) described in the comment for the enum.

> AARGH! Sorry for the over-thinking. I'll let you decide...

I like the Until variant, seems more natural to me.
 
> ::: xpfe/appshell/nsXULWindow.cpp
> @@ +2003,5 @@
> >    xulWin->LockUntilChromeLoad();
> >  
> >    {
> >      AutoNoJSAPI nojsapi;
> > +    SpinEventLoopUntil([=]() { return !xulWin->IsLocked(); });
> 
> `[=]` seems wrong; why not `[&]` like everywhere else?

Fair enough.

Nathan Froyd [:froydnj]

Assignee

Comment 9

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #8)
> > ::: toolkit/components/url-classifier/tests/gtest/Common.cpp
> > @@ -60,5 @@
> > > -    // Condition variable doesn't work here because instrusively
> > > -    // notifying the from NS_NewCheckSummedOutputStream() or
> > > -    // HashStore::WriteFile() is weird.
> > > -    if (!NS_ProcessNextEvent(NS_GetCurrentThread(), false)) {
> > > -      PR_Sleep(PR_MillisecondsToInterval(100));
> > 
> > You didn't keep the PR_Sleep -- intended?
> 
> Ugh.  So it was intended, because this is silly code: try to process an
> event if one's there...if not, sleep, and then try to do the same thing
> again.  Why not just wait in the thread's event queue itself instead? 
> Probably more efficient.
> 
> I thought the condition is set by a runnable sent to the current thread. 
> But it can actually be triggered by runnables run on other threads (?).  The
> comment suggests that we have to do things this way because other code can
> run on this thread (?), but I think the person who wrote the comment must
> have been confused.
> 
> Upshot: I think the conversion is correct, but you have to squint at it kind
> of funny to see.

Apparently this is not the case; the patch consistently times out on Mac GTests, while being OK everywhere else.  Henry wrote this code in bug 1339050...Henry, why are we not waiting inside NS_ProcessNextEvent here?  What do the threads doing the async update look like at this point?  Does having to busy-wait here mean that |done| is getting written from some other thread than the one doing the waiting (ugh)?

Flags: needinfo?(hchang)

Henry Chang [:hchang]

Comment 10

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #9)
> (In reply to Nathan Froyd [:froydnj] from comment #8)
> > > ::: toolkit/components/url-classifier/tests/gtest/Common.cpp
> > > @@ -60,5 @@
> > > > -    // Condition variable doesn't work here because instrusively
> > > > -    // notifying the from NS_NewCheckSummedOutputStream() or
> > > > -    // HashStore::WriteFile() is weird.
> > > > -    if (!NS_ProcessNextEvent(NS_GetCurrentThread(), false)) {
> > > > -      PR_Sleep(PR_MillisecondsToInterval(100));
> > > 
> > > You didn't keep the PR_Sleep -- intended?
> > 
> > Ugh.  So it was intended, because this is silly code: try to process an
> > event if one's there...if not, sleep, and then try to do the same thing
> > again.  Why not just wait in the thread's event queue itself instead? 
> > Probably more efficient.
> > 
> > I thought the condition is set by a runnable sent to the current thread. 
> > But it can actually be triggered by runnables run on other threads (?).  The
> > comment suggests that we have to do things this way because other code can
> > run on this thread (?), but I think the person who wrote the comment must
> > have been confused.
> > 
> > Upshot: I think the conversion is correct, but you have to squint at it kind
> > of funny to see.
> 
> Apparently this is not the case; the patch consistently times out on Mac
> GTests, while being OK everywhere else.  Henry wrote this code in bug
> 1339050...Henry, why are we not waiting inside NS_ProcessNextEvent here? 
> What do the threads doing the async update look like at this point?  Does
> having to busy-wait here mean that |done| is getting written from some other
> thread than the one doing the waiting (ugh)?

Hi Nathan,

AysncApplyUpdate() would just post an event to a separate thread (let's call it
the update thread) for update and notify the result via a callback on the 
caller thread. 

The intention is to wrap that AysncApplyUpdate() to a synchronous update API
for test case itself. (since the very original API is synchronous and all the
test cases were written with that assumption in mind.)

I was supposed to use a condition variable or a simple busy-wait to synchronize with 
the AsyncApplyUpdate()'s completion. However, I ended up failing to do any of them
because somewhere on the update thread would "SyncRunnable to main thread" 
be done for initializing NS_CRYPTO_HASH_CONTRACTID [1], which is triggered by [2].
As a result, on the main thread, we have to wait and process pending events
to unblock the update thread. |done| is only a flag for the main thread to know
if AsyncApplyUpdate() has been complete.

To get rid of that busy-wait, I have no better solution than pre-initializing
NS_CRYPTO_HASH_CONTRACTID on non-main thread prior to calling AsyncApplyUpdate().
However, in that case, we still have to synchronize with the completion of 
creating NS_CRYPTO_HASH_CONTRACTID on the main thread.





[1] http://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/security/manager/ssl/nsNSSComponent.cpp#101
[2] http://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/toolkit/components/url-classifier/nsCheckSummedOutputStream.cpp#26

Flags: needinfo?(hchang)

Henry Chang [:hchang]

Comment 11

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #9)
> (In reply to Nathan Froyd [:froydnj] from comment #8)
> > > ::: toolkit/components/url-classifier/tests/gtest/Common.cpp
> > > @@ -60,5 @@
> > > > -    // Condition variable doesn't work here because instrusively
> > > > -    // notifying the from NS_NewCheckSummedOutputStream() or
> > > > -    // HashStore::WriteFile() is weird.
> > > > -    if (!NS_ProcessNextEvent(NS_GetCurrentThread(), false)) {
> > > > -      PR_Sleep(PR_MillisecondsToInterval(100));
> > > 
> > > You didn't keep the PR_Sleep -- intended?
> > 
> > Ugh.  So it was intended, because this is silly code: try to process an
> > event if one's there...if not, sleep, and then try to do the same thing
> > again.  Why not just wait in the thread's event queue itself instead? 
> > Probably more efficient.
> > 
> > I thought the condition is set by a runnable sent to the current thread. 
> > But it can actually be triggered by runnables run on other threads (?).  The
> > comment suggests that we have to do things this way because other code can
> > run on this thread (?), but I think the person who wrote the comment must
> > have been confused.
> > 
> > Upshot: I think the conversion is correct, but you have to squint at it kind
> > of funny to see.
> 
> Apparently this is not the case; the patch consistently times out on Mac
> GTests, while being OK everywhere else.  Henry wrote this code in bug
> 1339050...Henry, why are we not waiting inside NS_ProcessNextEvent here? 
> What do the threads doing the async update look like at this point?  Does
> having to busy-wait here mean that |done| is getting written from some other
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> thread than the one doing the waiting (ugh)?
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Yes. |done| will be written on the "brand new thread" and we wait on the main thread.
The reason is AsyncApplyUpdate() notifies the result via a callback on the
"caller thread". 

That said, SpinEventLoopUntil() is supposed to work because it looks no difference
from what I am doing, isn't it?


[1] http://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/toolkit/components/url-classifier/tests/gtest/Common.cpp#48

Nathan Froyd [:froydnj]

Assignee

Comment 12

•

7 years ago

(In reply to Henry Chang [:hchang] from comment #11)
> (In reply to Nathan Froyd [:froydnj] from comment #9)
> > Apparently this is not the case; the patch consistently times out on Mac
> > GTests, while being OK everywhere else.  Henry wrote this code in bug
> > 1339050...Henry, why are we not waiting inside NS_ProcessNextEvent here? 
> > What do the threads doing the async update look like at this point?  Does
> > having to busy-wait here mean that |done| is getting written from some other
>  
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > thread than the one doing the waiting (ugh)?
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Yes. |done| will be written on the "brand new thread" and we wait on the
> main thread.
> The reason is AsyncApplyUpdate() notifies the result via a callback on the
> "caller thread". 

Would it be problematic for the callback to post an event to the caller thread instead?  That would let use spin the event loop properly, without racing on |done|.

> That said, SpinEventLoopUntil() is supposed to work because it looks no
> difference
> from what I am doing, isn't it?

It's not the same, because the current code uses NS_ProcessNextEvent(..., false), which checks to see if there's a runnable event, and if there's not, immediately returns.  So the SpinEventLoopUntil version has a race, which looks like:

T1: Checks value of |done|, finds it to be false.
T1: Calls NS_ProcessNextEvent.
T1: Waits for an event to show up in the event queue.
T2: Sets |done| to true.

And now T1 is waiting for an event that will never get posted.

Flags: needinfo?(hchang)

Henry Chang [:hchang]

Comment 13

•

7 years ago

Attached patch 0001-Avoid-racing-on-done.patch (obsolete) — Details — Splinter Review

Flags: needinfo?(hchang)

Henry Chang [:hchang]

Comment 14

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #12)
> (In reply to Henry Chang [:hchang] from comment #11)
> > (In reply to Nathan Froyd [:froydnj] from comment #9)
> > > Apparently this is not the case; the patch consistently times out on Mac
> > > GTests, while being OK everywhere else.  Henry wrote this code in bug
> > > 1339050...Henry, why are we not waiting inside NS_ProcessNextEvent here? 
> > > What do the threads doing the async update look like at this point?  Does
> > > having to busy-wait here mean that |done| is getting written from some other
> >  
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > thread than the one doing the waiting (ugh)?
> >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> > Yes. |done| will be written on the "brand new thread" and we wait on the
> > main thread.
> > The reason is AsyncApplyUpdate() notifies the result via a callback on the
> > "caller thread". 
> 
> Would it be problematic for the callback to post an event to the caller
> thread instead?  That would let use spin the event loop properly, without
> racing on |done|.
> 
> > That said, SpinEventLoopUntil() is supposed to work because it looks no
> > difference
> > from what I am doing, isn't it?
> 
> It's not the same, because the current code uses NS_ProcessNextEvent(...,
> false), which checks to see if there's a runnable event, and if there's not,
> immediately returns.  So the SpinEventLoopUntil version has a race, which
> looks like:
> 
> T1: Checks value of |done|, finds it to be false.
> T1: Calls NS_ProcessNextEvent.
> T1: Waits for an event to show up in the event queue.
> T2: Sets |done| to true.
> 
> And now T1 is waiting for an event that will never get posted.

Thanks for your explanation! I have changed NS_ProcessNextEvent(..., false) 
to NS_ProcessNextEvent(..., true) and it's supposed to identical to using
SpinEventLoopUntil(). The change is verified to work perfectly.

Would you like me to file a new bug and resolve it first or just merge
to your patch here?

Thank you :)

Nathan Froyd [:froydnj]

Assignee

Comment 15

•

7 years ago

(In reply to Henry Chang [:hchang] from comment #14)
> Thanks for your explanation! I have changed NS_ProcessNextEvent(..., false) 
> to NS_ProcessNextEvent(..., true) and it's supposed to identical to using
> SpinEventLoopUntil(). The change is verified to work perfectly.
> 
> Would you like me to file a new bug and resolve it first or just merge
> to your patch here?

Please file a new bug, blocking this one.  And please change the AbstractThread::MainThread()->Dispatch(...) bit to just use NS_DispatchToMainThread instead.  Thanks!

Flags: needinfo?(hchang)

Henry Chang [:hchang]

Comment 16

•

7 years ago

Sure Nathan :)

BTW, just out of curious, why do you hope me to use NS_DispatchToMainThread
instead of AbstractThread::MainThread()? I was told lately to use 
AbstractThread::MainThread whenever possible. Is there a specific 
reason in this case?

Thanks!

Flags: needinfo?(hchang)

Henry Chang [:hchang]

Updated

•

7 years ago

Attachment #8865303 - Attachment is obsolete: true

Henry Chang [:hchang]

Updated

•

7 years ago

Depends on: 1363266

Nathan Froyd [:froydnj]

Assignee

Comment 17

•

7 years ago

(In reply to Henry Chang [:hchang] from comment #16)
> BTW, just out of curious, why do you hope me to use NS_DispatchToMainThread
> instead of AbstractThread::MainThread()? I was told lately to use 
> AbstractThread::MainThread whenever possible. Is there a specific 
> reason in this case?

Oh!  Why were you told to use AbstractThread::MainThread()?  This is news to me. :)

Flags: needinfo?(hchang)

Henry Chang [:hchang]

Comment 18

•

7 years ago

(In reply to Nathan Froyd [:froydnj] from comment #17)
> (In reply to Henry Chang [:hchang] from comment #16)
> > BTW, just out of curious, why do you hope me to use NS_DispatchToMainThread
> > instead of AbstractThread::MainThread()? I was told lately to use 
> > AbstractThread::MainThread whenever possible. Is there a specific 
> > reason in this case?
> 
> Oh!  Why were you told to use AbstractThread::MainThread()?  This is news to
> me. :)

After double check, avoid using NS_DispatchToMainThread() is only limited
to content process. Sorry for confusing you :p

Flags: needinfo?(hchang)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 19

•

7 years ago

(In reply to Henry Chang [:hchang] from comment #18)
> > Oh!  Why were you told to use AbstractThread::MainThread()?  This is news to
> > me. :)
> 
> After double check, avoid using NS_DispatchToMainThread() is only limited
> to content process. Sorry for confusing you :p

Why? I think NS_DispatchToMainThread should be preferred in the content process as well. Why is it better to use AbstractThread::MainThread?

Pulsebot

Comment 20

•

7 years ago

Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c8840a3a2c3b
add an event loop spinning abstraction function; r=gerald

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 21

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c8840a3a2c3b

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox55: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla55

add an event loop spinning abstraction function 7 years ago Nathan Froyd [:froydnj] 48.27 KB, patch	mozbugz : review+	Details \| Diff \| Splinter Review
0001-Avoid-racing-on-done.patch 7 years ago Henry Chang [:hchang] 2.96 KB, patch		Details \| Diff \| Splinter Review