Closed Bug 1131557 Opened 5 years ago Closed 5 years ago

Server multiple xpcom events in a single poll iteration

Categories

(Core :: Networking, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox39 --- fixed

People

(Reporter: dragana, Assigned: dragana)

References

Details

Attachments

(1 file, 4 obsolete files)

Till now we have been serving only one xpcom event between two poll iteration.

We will change this to server all pending events.

It will be possible to turn this off with a pref.

The time spend serving only xpcom events will be limited.
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Attached patch serveMultipleEvents_v1.patch (obsolete) — Splinter Review
From our discussion from yesterday I have decided to do this part in a separate bug.
Attachment #8562058 - Flags: feedback?(mcmanus)
Comment on attachment 8562058 [details] [diff] [review]
serveMultipleEvents_v1.patch

Review of attachment 8562058 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks! I'm interested in what a full try run (including talos - all platforms) has to say about this

::: netwerk/base/nsSocketTransportService2.cpp
@@ +752,5 @@
> +                    if (!mServingPendingQueue) {
> +                        nsresult rv = Dispatch(NS_NewRunnableMethod(this,
> +                            &nsSocketTransportService::MarkTheLastElementOfPendingQueue),
> +                            nsIEventTarget::DISPATCH_NORMAL);
> +                        NS_ENSURE_SUCCESS(rv, rv);

NS_ENSURE_SUCCESS has a return-on-failure semantic - that sounds bad here!

@@ +762,5 @@
> +                        pendingEvents = false;
> +                        thread->HasPendingEvents(&pendingEvents);
> +                    } while (pendingEvents && mServingPendingQueue &&
> +                             ((PR_Now() - eventQueueStart) <
> +                              PR_MillisecondsToInterval(20)));

that val should be preffable.. I had imagined something more like 100, but its just intuition

::: netwerk/base/nsSocketTransportService2.h
@@ +217,5 @@
>      // True if TCP keepalive is enabled globally.
>      bool        mKeepaliveEnabledPref;
>  
> +    bool        mServerMultipleEventsPerPollIter;
> +    bool        mServingPendingQueue;

I think that needs to be "volatile bool mServerPendingQueue".. (I wrote register in my email - whoops)
Attachment #8562058 - Flags: feedback?(mcmanus) → feedback+
(In reply to Patrick McManus [:mcmanus] from comment #2)
> Comment on attachment 8562058 [details] [diff] [review]
> serveMultipleEvents_v1.patch
> 
> Review of attachment 8562058 [details] [diff] [review]:
> -----------------------------------------------------------------

> @@ +762,5 @@
> > +                        pendingEvents = false;
> > +                        thread->HasPendingEvents(&pendingEvents);
> > +                    } while (pendingEvents && mServingPendingQueue &&
> > +                             ((PR_Now() - eventQueueStart) <
> > +                              PR_MillisecondsToInterval(20)));
> 
> that val should be preffable.. I had imagined something more like 100, but
> its just intuition
> 

First I though to connect it to duration of poll iteration (min, avr or something), but no need to make it complicate.
I will make pref and we can try it out.


> ::: netwerk/base/nsSocketTransportService2.h
> @@ +217,5 @@
> >      // True if TCP keepalive is enabled globally.
> >      bool        mKeepaliveEnabledPref;
> >  
> > +    bool        mServerMultipleEventsPerPollIter;
> > +    bool        mServingPendingQueue;
> 
> I think that needs to be "volatile bool mServerPendingQueue".. (I wrote
> register in my email - whoops)

Your email?
> Your email?

below.. spam?
----

I was thinking something like this as a substitute for not knowing how many events were in the xpcom queue

{
...
while (!mShutdown) {
  DoPollIteration();
  register bool runQ = true;
  PostEvent(thread, clearRunQ, &runQ);
  while (runQ) {
    NS_ProcessNextEvent(thread);
  }
 return NS_OK;
}

void ClearRunQ(bool *arg)
{
 *arg = false;
}
Note: don't use volatile.  Use Atomic<> if you need, probably with rel_acq ordering only.
Attached patch serveMultipleEvents_v2.patch (obsolete) — Splinter Review
Attachment #8562058 - Attachment is obsolete: true
Attachment #8562756 - Flags: review?(mcmanus)
joel, can you take a look at the try run in comment 7 and assess the talos runs in it?
Flags: needinfo?(jmaher)
Comment on attachment 8562756 [details] [diff] [review]
serveMultipleEvents_v2.patch

Review of attachment 8562756 [details] [diff] [review]:
-----------------------------------------------------------------

exciting. r=mcmanus, and I'll add a review flag for joel to give his input based on the try run.
Attachment #8562756 - Flags: review?(mcmanus)
Attachment #8562756 - Flags: review?(jmaher)
Attachment #8562756 - Flags: review+
hey, the try run looks good.  although there is one data point per test, there no regressions with those data points.
Flags: needinfo?(jmaher)
Comment on attachment 8562756 [details] [diff] [review]
serveMultipleEvents_v2.patch

Review of attachment 8562756 [details] [diff] [review]:
-----------------------------------------------------------------

this code is out of my domain knowledge, but the try run looks good and so do the talos results- just a slight question regarding the new prefs and using them in our automation

::: modules/libpref/init/all.js
@@ +1361,5 @@
> +// If this pref is false only one xpcom event will be served per poll
> +// iteration. This is the original behavior.
> +// If it is true multiple events will be served.
> +pref("network.sts.serve_multiple_events_per_poll_iteration", true);
> +pref("network.sts.max_time_for_events_between_to_polls", 100);

do we need to adjust this in talos or unittest automation?
Attachment #8562756 - Flags: review?(jmaher) → review+
Comment on attachment 8562756 [details] [diff] [review]
serveMultipleEvents_v2.patch

Review of attachment 8562756 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/nsSocketTransportService2.cpp
@@ +40,5 @@
>  #define SOCKET_LIMIT_TARGET 550U
>  #define SOCKET_LIMIT_MIN     50U
>  #define BLIP_INTERVAL_PREF "network.activity.blipIntervalMilliseconds"
> +#define SERVE_MULTIPLE_EVENTS_PREF "network.sts.serve_multiple_events_per_poll_iteration"
> +#define MAX_TIME_BETWEEN_TWO_POLLS "network.sts.max_time_for_events_between_to_polls"

the pref string is wrong

@@ +761,5 @@
> +                        } else {
> +                            mServingPendingQueue = true;
> +                        }
> +                    }
> +                    PRTime eventQueueStart = PR_Now();

instead use TimeStamp::NowLoRes please, using PR_Now here is really a bit mad..

@@ +768,5 @@
> +                        pendingEvents = false;
> +                        thread->HasPendingEvents(&pendingEvents);
> +                    } while (pendingEvents && mServingPendingQueue &&
> +                             ((PR_Now() - eventQueueStart) <
> +                              (PR_USEC_PER_MSEC * mMaxTimePerPollIter)));

my favorite way is (but up to you):

        bool processedEvent;
        nsresult rv;
        do {
          rv = NS_ProcessNextEvent(false, &processedEvent);
        } while (NS_SUCCEEDED(rv) && processedEvent);

@@ +772,5 @@
> +                              (PR_USEC_PER_MSEC * mMaxTimePerPollIter)));
> +                }
> +            } else {
> +
> +                if (pendingEvents) {

nit: 

} else if (pendingEvents) {

or even better:

if (pendingEvents) {
  if (mServe..) {
  } else {
  }
}
Attachment #8562756 - Flags: feedback-
> 
> @@ +761,5 @@
> > +                        } else {
> > +                            mServingPendingQueue = true;
> > +                        }
> > +                    }
> > +                    PRTime eventQueueStart = PR_Now();
> 
> instead use TimeStamp::NowLoRes please, using PR_Now here is really a bit
> mad..

the pr intervals are fine here. they are efficient and well suited to this case that we know will be using small time ranges and this file is already a pile of NSPR dependencies. at the same time, I don't have an objection to timestamp::nowlores.
(In reply to Patrick McManus [:mcmanus] from comment #13)
> > 
> > @@ +761,5 @@
> > > +                        } else {
> > > +                            mServingPendingQueue = true;
> > > +                        }
> > > +                    }
> > > +                    PRTime eventQueueStart = PR_Now();
> > 
> > instead use TimeStamp::NowLoRes please, using PR_Now here is really a bit
> > mad..
> 
> the pr intervals are fine here. they are efficient and well suited to this
> case that we know will be using small time ranges and this file is already a
> pile of NSPR dependencies. at the same time, I don't have an objection to
> timestamp::nowlores.

PR_Now is a wallclock time..  PR_IntervalNow() is what you probably mean.  I don't care, but honestly NoLoRes() is optimized for this kind of stuff.
(In reply to Honza Bambas (:mayhemer) from comment #14)
> (In reply to Patrick McManus [:mcmanus] from comment #13)
> 
> PR_Now is a wallclock time..  PR_IntervalNow() is what you probably mean.  I

yes - that is a bug. I conflated them in my head. (which admittedly is a convincing argument about it being a crappy api.)
Attached patch serveMultipleEvents_v3.patch (obsolete) — Splinter Review
Attachment #8562756 - Attachment is obsolete: true
Attachment #8565087 - Flags: review?(mcmanus)
Comment on attachment 8565087 [details] [diff] [review]
serveMultipleEvents_v3.patch

Review of attachment 8565087 [details] [diff] [review]:
-----------------------------------------------------------------

couple nits.. and then ship it!

::: modules/libpref/init/all.js
@@ +1362,5 @@
> +// If this pref is false only one xpcom event will be served per poll
> +// iteration. This is the original behavior.
> +// If it is true multiple events will be served.
> +pref("network.sts.serve_multiple_events_per_poll_iteration", true);
> +pref("network.sts.max_time_for_events_between_two_polls", 100);

comment on the units (ms)

::: netwerk/base/nsSocketTransportService2.cpp
@@ +68,5 @@
>      , mKeepaliveIdleTimeS(600)
>      , mKeepaliveRetryIntervalS(1)
>      , mKeepaliveProbeCount(kDefaultTCPKeepCount)
>      , mKeepaliveEnabledPref(false)
> +    , mServeMultipleEventsPerPollIter(false)

true - to match default config

@@ +70,5 @@
>      , mKeepaliveProbeCount(kDefaultTCPKeepCount)
>      , mKeepaliveEnabledPref(false)
> +    , mServeMultipleEventsPerPollIter(false)
> +    , mServingPendingQueue(false)
> +    , mMaxTimePerPollIter(0)

100 to match default config

@@ +1024,5 @@
> +        rv = tmpPrefService->GetBoolPref(SERVE_MULTIPLE_EVENTS_PREF,
> +                                         &serveMultiplePref);
> +        if (NS_SUCCEEDED(rv) &&
> +            serveMultiplePref != mServeMultipleEventsPerPollIter) {
> +            mServeMultipleEventsPerPollIter = serveMultiplePref;

its just a bool.. easier to do the assignment without the != conditional.
Attachment #8565087 - Flags: review?(mcmanus) → review+
Attached patch serveMultipleEvents_v4.patch (obsolete) — Splinter Review
Attachment #8565087 - Attachment is obsolete: true
Attachment #8565954 - Flags: review+
Keywords: checkin-needed
Backed out for frequent Linux32 xpcshell failures.
https://hg.mozilla.org/integration/mozilla-inbound/rev/7f4925153bd9

https://treeherder.mozilla.org/logviewer.html#?job_id=6724780&repo=mozilla-inbound

Also, on an interesting note, this patch appears to build fine on its own, but shortly after this landed, ckerschb tried to land bug 1109910 and hit bustage that looks very-related to this push. Some sort of unified build issue?
https://treeherder.mozilla.org/logviewer.html#?job_id=6726305&repo=mozilla-inbound
Ryan is right in comment 21.

Testing on my local machine:
* When I only compile the patch attached to this bug - works fine.
* When I only compile the patch from bug 1109910 - works fine.
* When I apply both patches and compile, I get:

> 0:24.79 In file included from /home/ckerschb/moz/mc-obj-dbg/netwerk/base/Unified_cpp_netwerk_base3.cpp:2:
> 0:24.79 /home/ckerschb/moz/mc/netwerk/base/nsSocketTransportService2.cpp:765:49: error: incomplete type 'mozilla::TimeStamp' named in nested name specifier
> 0:24.79                     TimeStamp eventQueueStart = TimeStamp::NowLoRes();
> 0:24.79                                                 ^~~~~~~~~~~
> 0:24.79 ../../dist/include/GeckoProfiler.h:55:7: note: forward declaration of 'mozilla::TimeStamp'
> 0:24.79 class TimeStamp;

I don't know what might cause such an issue, but it seems that a simple inclusion of TimeStamp.h fixes the issue. Inlining a few lines from the diff underneath:

> diff --git a/netwerk/base/nsSocketTransportService2.cpp b/netwerk/base/nsSocketTransportService2.cpp
> --- a/netwerk/base/nsSocketTransportService2.cpp
> +++ b/netwerk/base/nsSocketTransportService2.cpp
> #include "NetworkActivityMonitor.h"
> #include "nsIObserverService.h"
> #include "mozilla/Services.h"
>+#include "mozilla/TimeStamp.h"
> #include "mozilla/Preferences.h"
> #include "mozilla/Likely.h"
> #include "mozilla/PublicSSL.h"

Please note, that this fix only fixes the compile issues, not the xpcshell test failures pointed out by Ryan in Comment 21.
(In reply to Christoph Kerschbaumer [:ckerschb] from comment #22)
> Ryan is right in comment 21.
> 
> > diff --git a/netwerk/base/nsSocketTransportService2.cpp b/netwerk/base/nsSocketTransportService2.cpp
> > --- a/netwerk/base/nsSocketTransportService2.cpp
> > +++ b/netwerk/base/nsSocketTransportService2.cpp
> > #include "NetworkActivityMonitor.h"
> > #include "nsIObserverService.h"
> > #include "mozilla/Services.h"
> >+#include "mozilla/TimeStamp.h"
> > #include "mozilla/Preferences.h"
> > #include "mozilla/Likely.h"
> > #include "mozilla/PublicSSL.h"
> 
> Please note, that this fix only fixes the compile issues, not the xpcshell
> test failures pointed out by Ryan in Comment 21.

Sorry  for not writing comment earlier, I was attending to fix this build issue so that it works with your patch. You can push your patch and i will fix the build issue after i take a look at xpcshell error.
Depends on: 1135405
Blocks: 1130444
This patch fix the conflict with bug 1109910.

The bug 1135405 have fixed failing test from comment #21. The following try run are with both patches:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f0b46478f0f1
Attachment #8565954 - Attachment is obsolete: true
Attachment #8572879 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/7d6ee599c58b
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in before you can comment on or make changes to this bug.