Last Comment Bug 534698 - nsTransportEventSinkProxy implementation eats event notifications
: nsTransportEventSinkProxy implementation eats event notifications
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking: HTTP (show other bugs)
: Trunk
: All All
: -- normal (vote)
: ---
Assigned To: Honza Bambas (:mayhemer)
:
Mentors:
Depends on:
Blocks: 526207
  Show dependency treegraph
 
Reported: 2009-12-14 12:51 PST by Honza Bambas (:mayhemer)
Modified: 2010-06-26 10:34 PDT (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
v1 (1.14 KB, patch)
2009-12-14 12:51 PST, Honza Bambas (:mayhemer)
no flags Details | Diff | Splinter Review
v2 (1.83 KB, patch)
2009-12-28 11:26 PST, Honza Bambas (:mayhemer)
bzbarsky: review+
Details | Diff | Splinter Review
v2.1 [Checkin comment 15] (2.46 KB, patch)
2010-01-25 12:33 PST, Honza Bambas (:mayhemer)
honzab.moz: review+
Details | Diff | Splinter Review

Description Honza Bambas (:mayhemer) 2009-12-14 12:51:29 PST
Created attachment 417534 [details] [diff] [review]
v1

Http transport event sink proxy is setup to coalesce all events coming from the connection. It might lead to loose of some event notifications, often happens with STATUS_CONNECTED_TO, that is vital for successful logging. This happens because when there is still a pending event we simply do not fire (queue) a new event even it has a different status notification. It might make sense only for protocols other then http.

This bug is there from the beginning, see:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/netwerk/protocol/http/src/nsHttpTransaction.cpp&rev=1.104#184
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/netwerk/base/src/nsTransportUtils.cpp&rev=1.4#139

and the attachment.
Comment 1 Boris Zbarsky [:bz] (TPAC) 2009-12-14 13:00:41 PST
1) The comment is now incorrect, yes?
2) What is the practical impact on the number and type of events delivered as a
   result of this patch?  Can you give me some statistics for some typical
   websites?
Comment 2 Honza Bambas (:mayhemer) 2009-12-14 13:33:17 PST
(In reply to comment #1)
> 1) The comment is now incorrect, yes?

Yes

> 2) What is the practical impact on the number and type of events delivered as a
>    result of this patch? 

We get all events that can be used for logging in software like Firebug or just for correct receive of all expected chain of events when observing the transport notifications.

> Can you give me some statistics for some typical
>    websites?

I was trying to load (uncached) four web sites: www.google.cz, www.seznam.cz, www.mozilla.org and www.mozilla.com. I get 307 notification w/ this patch and 250 w/o this patch. It's about 23% raise. We have 6 states and this might indicate we were getting only 5 of them, now we get all.
Comment 3 Boris Zbarsky [:bz] (TPAC) 2009-12-14 13:39:17 PST
What do the numbers look like on www.cnn.com?

What about gmail?
Comment 4 Honza Bambas (:mayhemer) 2009-12-14 13:54:46 PST
(In reply to comment #3)
> What do the numbers look like on www.cnn.com?
> 

296 to 356 (~20% diff).

> What about gmail?

I unfortunately don't have a gmail account to test...
Comment 5 Honza Bambas (:mayhemer) 2009-12-14 14:01:06 PST
And to correct the description: we do fire next event, but overwrite the current (delete it). So, we replace (coalesce) CONNECTED_TO with e.g. SENDING_TO or such an event.
Comment 6 Boris Zbarsky [:bz] (TPAC) 2009-12-14 14:20:05 PST
> 296 to 356 (~20% diff).

Another question.  Any estimate on how long each event takes to handle?
Comment 7 Honza Bambas (:mayhemer) 2009-12-14 14:42:30 PST
(In reply to comment #6)
> Another question.  Any estimate on how long each event takes to handle?

nsHttpChannel::OnTransportStatus: about 51ms on my 2.4 GHz Athlon64, debug build.
Comment 8 Boris Zbarsky [:bz] (TPAC) 2009-12-14 14:47:13 PST
So 50 extra events is 2.5 seconds?  What about opt build?
Comment 9 Honza Bambas (:mayhemer) 2009-12-14 14:54:13 PST
It will take some time to test.

What to do when there is a high performance impact with this patch? Should we leave the event sink code actually 'broken'?
Comment 10 Boris Zbarsky [:bz] (TPAC) 2009-12-14 15:02:48 PST
Well, whether it's broken depends on what promises it makes....

If there's a high performance impact, then we should see where it's coming from.  If it's something fundamental, then perhaps we should have different modes for the events: a tracing mode for cases when we need every single event and a normal mode for what users normally need.
Comment 11 Honza Bambas (:mayhemer) 2009-12-15 10:03:20 PST
(In reply to comment #8)
> So 50 extra events is 2.5 seconds?  What about opt build?

PerformanceCounter on windows tells me 0 for a release build, really a jump...

(In reply to comment #10)
> Well, whether it's broken depends on what promises it makes....
> 
> If there's a high performance impact, then we should see where it's coming
> from.  If it's something fundamental, then perhaps we should have different
> modes for the events: a tracing mode for cases when we need every single event
> and a normal mode for what users normally need.

We could use nsIHttpActivityObserver.isActive attribute for this, it's being checked right under the event sink initiation (coalesce = !isActive).
Comment 12 Honza Bambas (:mayhemer) 2009-12-28 11:26:45 PST
Created attachment 419330 [details] [diff] [review]
v2

Using activity distributor isActive attribute to decide if to coalesce events or not. No impact to normal usage, developers gets all notifications as would expect.
Comment 13 Boris Zbarsky [:bz] (TPAC) 2009-12-31 16:50:43 PST
Comment on attachment 419330 [details] [diff] [review]
v2

>     // create transport event sink proxy that coalesces all events

This comment needs to be right above the net_NewTransportEventSinkProxy call, and probably be changed to reflect the new reality.

>+    PRBool active = PR_FALSE;

I'd actually prefer we just set active to false in the else branch here (the one where we land if NS_FAILED(rv)).

And maybe rename this to activityDistributorActive?

r=me with those changes.
Comment 14 Honza Bambas (:mayhemer) 2010-01-25 12:33:15 PST
Created attachment 423393 [details] [diff] [review]
v2.1 [Checkin comment 15]

Updated patch ready to land.
Comment 15 Honza Bambas (:mayhemer) 2010-01-26 05:37:55 PST
Comment on attachment 423393 [details] [diff] [review]
v2.1 [Checkin comment 15]

http://hg.mozilla.org/mozilla-central/rev/79745cb186d5
Comment 16 Honza Bambas (:mayhemer) 2010-01-27 11:34:25 PST
Comment on attachment 423393 [details] [diff] [review]
v2.1 [Checkin comment 15]

For Firebug this would be nice to have also on 1.9.2.x (Fx3.6).  This is complementary fix to bug 488270 that has landed on 1.9.2.
Comment 17 Daniel Veditz [:dveditz] 2010-03-12 13:31:54 PST
Comment on attachment 423393 [details] [diff] [review]
v2.1 [Checkin comment 15]

Didn't make 1.9.2.2, we'll look at next time.
Comment 18 Daniel Veditz [:dveditz] 2010-06-26 08:37:50 PDT
Comment on attachment 423393 [details] [diff] [review]
v2.1 [Checkin comment 15]

Clearing old approval requests now that 1.9.2.4 has shipped. If you believe this patch is still necessary on the 1.9.2 branch please re-request approval along with a risk/benefit analysis explaining why we need it.

Note You need to log in before you can comment on or make changes to this bug.