Closed Bug 770264 Opened 12 years ago Closed 12 years ago

HTTP Connection Diagnostics

Categories

(Core :: Networking: HTTP, enhancement)

16 Branch
x86_64
Linux
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16
Tracking Status
firefox15 --- verified

People

(Reporter: mcmanus, Assigned: mcmanus)

References

Details

(Whiteboard: [spdy])

Attachments

(4 files)

bugs like 762162 demand a HTTP diagnostic other than the NSPR log. The log can only be made by restarting firefox (or by a developer with really keen debugger skills) which can have the side effect of clearing the error condition.. furthermore the error may take a long time to trigger and in those cases the size of the speed-cost of the log may make its use prohibitive.

765849 proposes a about:net information page - which is a good idea. There is a summer of code project looking at combining that with some simple diagnostics. Something like that should be the end goal.

But I need some of this info now to solve the other bugs.

This bug proposes (and implements) a quick and dirty data dump that can be used in the interim to satisfy the immediate need for some of this information.

The mecahnism it uses is crude (a dump to javascript console triggered by a pref change), but it should be extremely resistant to regression problems and can therefore be safely backported. The data will need to be cut and pasted and post processed, but the key pieces should be there. No changes to existing code paths or data structures are made.
Assignee: nobody → mcmanus
Attached file sample report
Attached patch patch 0Splinter Review
I'd love to get this reviewed as quick as possible to increase its chances of being accepted for backport.

I acknowledge its crude, but my goal here was to try and avoid adding any code that wasn't basically just a printf of an existing data structure. Other than the prototypes the code even manages to basically live in a new file.

To try it out, just open the javascript console - make sure it is showing the warning level and then flip network.http.diagnostics to true. (you can get a new report by going back to false and then true again - it prints each time it transitions to true.)
Attachment #638432 - Flags: review?(honzab.moz)
Attached patch aurora versionSplinter Review
an aurora (ff15) version
Attached patch beta versionSplinter Review
a beta (ff14) version
Comment on attachment 638432 [details] [diff] [review]
patch 0

Review of attachment 638432 [details] [diff] [review]:
-----------------------------------------------------------------

r=honzab with few nits.

::: netwerk/protocol/http/ConnectionDiagnostics.cpp
@@ +58,5 @@
> +                              self->AtActiveConnectionLimit(ent, NS_HTTP_ALLOW_KEEPALIVE));
> +  self->mLogData.AppendPrintf("   RestrictConnections = %d\n",
> +                              self->RestrictConnections(ent));
> +  self->mLogData.AppendPrintf("   Pending Q Length = %d\n",
> +                              ent->mPendingQ.Length());

It should be %u, nsTArray::Length is unsigned.

@@ +72,5 @@
> +                              ent->mUsingSpdy, ent->mTestedSpdy, ent->mSpdyPreferred);
> +  self->mLogData.AppendPrintf("   pipelinestate = %d penalty = %d\n",
> +                              ent->mPipelineState, ent->mPipeliningPenalty);
> +  for (i = 0; i < nsAHttpTransaction::CLASS_MAX; ++i) {
> +    self->mLogData.AppendPrintf("   pipeline per class penalty %u %d\n",

Be consistent with how you log the class (classification).  On other place you are logging it as %x.  Choose only one.

@@ +76,5 @@
> +    self->mLogData.AppendPrintf("   pipeline per class penalty %u %d\n",
> +                                i, ent->mPipeliningClassPenalty[i]);
> +  }
> +  for (i = 0; i < ent->mActiveConns.Length(); ++i) {
> +    self->mLogData.AppendPrintf("   :: Active Connection #%d\n", i);

#%u ?

@@ +88,5 @@
> +    self->mLogData.AppendPrintf("   :: Half Open #%d\n", i);
> +    ent->mHalfOpens[i]->PrintDiagnostics(self->mLogData);
> +  }
> +    
> +  return PL_DHASH_NEXT;

Log also pending transactions (request path, classification, caps, restart count?)

@@ +103,5 @@
> +  if (mPrimarySynStarted.IsNull())
> +    log.AppendPrintf("    primary not started\n");
> +  else
> +    log.AppendPrintf("    primary started %ldms ago\n",
> +                     (mPrimarySynStarted - now).ToMilliseconds());

This must be logged as e.g. %.2fms - ToMilliseconds() returns double.

@@ +112,5 @@
> +    log.AppendPrintf("    backup started %ldms ago\n",
> +                     (mBackupSynStarted - now).ToMilliseconds());
> +    
> +  log.AppendPrintf("    primary transport %p, backup transport %p\n",
> +                   mSocketTransport.get(), mBackupTransport.get());

I'd rather just expose the information about whether the transport is present or not (i.e. %d with !!m*Transport, or so).  As I understand, the log doesn't have a different purpose.

@@ +130,5 @@
> +  log.AppendPrintf("    iskeepalive = %d  dontReuse = %d isReused = %d\n",
> +                   IsKeepAlive(), mDontReuse, mIsReused);
> +
> +  log.AppendPrintf("    mTransaction = %p mSpdySession = %p\n",
> +                   mTransaction.get(), mSpdySession.get());

As well here.

@@ +141,5 @@
> +                   mMaxBytesRead, mTotalBytesRead, mTotalBytesWritten);
> +    
> +  log.AppendPrintf("    rtt = %ums\n", PR_IntervalToMilliseconds(mRtt));
> +
> +  log.AppendPrintf("    idlemonitoring = %d transactionCount=%d\n",

transactionCount=%u

@@ +178,5 @@
> +  log.AppendPrintf("     Ping Timeout = %ums\n",
> +                   PR_IntervalToMilliseconds(gHttpHandler->SpdyPingTimeout()));
> +  log.AppendPrintf("     Idle for Any Activity (ping) = %ums\n",
> +                   PR_IntervalToMilliseconds(now - mLastReadEpoch));    
> +  log.AppendPrintf("     Idle for Data Activity = %umsn",

%ums\n ?

@@ +215,5 @@
> +  log.AppendPrintf("     Ping Timeout = %ums\n",
> +                   PR_IntervalToMilliseconds(gHttpHandler->SpdyPingTimeout()));
> +  log.AppendPrintf("     Idle for Any Activity (ping) = %ums\n",
> +                   PR_IntervalToMilliseconds(now - mLastReadEpoch));    
> +  log.AppendPrintf("     Idle for Data Activity = %umsn",

%ums\n

::: netwerk/protocol/http/nsHttpConnectionMgr.h
@@ +395,5 @@
>          void SetSpeculative(bool val) { mSpeculative = val; }
>  
>          bool HasConnected() { return mHasConnected; }
>  
> +        void PrintDiagnostics (nsCString &log);

Space between PrintDiagnostics and (

@@ +607,5 @@
>                                               nsAutoPtr<nsConnectionEntry> &ent,
>                                               void *closure);
> +
> +    // For diagnostics
> +    void OnMsgPrintDiagnostics     (PRInt32, void *);

Spaces.

@@ +608,5 @@
>                                               void *closure);
> +
> +    // For diagnostics
> +    void OnMsgPrintDiagnostics     (PRInt32, void *);
> +    static PLDHashOperator PrintDiagnosticsCB (const nsACString &key,

Space.

::: netwerk/protocol/http/nsHttpHandler.cpp
@@ +1148,5 @@
> +    if (PREF_CHANGED(HTTP_PREF("diagnostics"))) {
> +        rv = prefs->GetBoolPref(HTTP_PREF("diagnostics"), &cVar);
> +        if (NS_SUCCEEDED(rv) && cVar) {
> +            if (mConnMgr)
> +                mConnMgr->PrintDiagnostics();

You should ignore this when called to load all prefs (with pref == null).  Otherwise it will print out on startup (may cause unexpected trouble and is not necessary ; also people could mistakenly send the first report...).
Attachment #638432 - Flags: review?(honzab.moz) → review+
Comment on attachment 638432 [details] [diff] [review]
patch 0

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 

This is a diagnostic only feature meant to help solve HTTP connection bugs - specifically spdy bugs 762126 762025. 

It was designed to be as safe to backport as possible. The diagnostic mechanism is dump to the JS error console, which obviates the previous need to have started firefox with NSPR logging. When the problem is witnessed a diagnostic can be produced by changing a about:config pref at that time and the information copied out of the javascript console.

User impact if declined: 

continued lack of insight into the aforementioned bugs.

Testing completed (on m-c, etc.):  on m-c

Risk to taking this patch (and alternatives if risky): 

designed to be very low. Unless the pref (network.http.diagnositics) is flipped to a non-default value no new code is executed.

String or UUID changes made by this patch: 

the diagnostic strings are not localized (they are generally code level diagnostic names), no uuid changes are made.
Attachment #638432 - Flags: approval-mozilla-beta?
Attachment #638432 - Flags: approval-mozilla-aurora?
Comment on attachment 638432 [details] [diff] [review]
patch 0

[Triage Comment]
It's too late in the cycle to take a forward change of this nature for FF14 (our final beta build is about to be kicked off). FF15 will be on beta in about a week, so we can pick up feedback from that channel then. Approved for Aurora 15.
Attachment #638432 - Flags: approval-mozilla-beta?
Attachment #638432 - Flags: approval-mozilla-beta-
Attachment #638432 - Flags: approval-mozilla-aurora?
Attachment #638432 - Flags: approval-mozilla-aurora+
Target milestone reflects when it was fixed on trunk. Use the status flags for Aurora/Beta.

https://hg.mozilla.org/mozilla-central/rev/d9351d6dd5e3
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: mozilla15 → mozilla16
Depends on: 775508
Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0 (20120724191344)

I tried to verify this fix by the guidelines in comment 2, but nothing was displayed in the console when flipping network.http.diagnostics.

I tried with the Web Console, Error Console and Firebug. Made sure warning level was showing but I still didn't get anything displayed.

Patrick, which console are you referring to by "javascript console"? Should I change any other prefs, or perform any other actions to see the logged information?
(In reply to Ioana Budnar [QA] from comment #12)
> Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0
> (20120724191344)
> 
>> 
> Patrick, which console are you referring to by "javascript console"? 

error console - ctrl shift j

Should
> I change any other prefs, or perform any other actions to see the logged
> information?

flip the pref from false to true. (going the other way has no impact)

I just reconfirmed it works fine on both linux and windows. I've even used it to fix a bug in something else.
Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0 (20120724191344)

I verified this again, on a fresh profile, by checking the Error Console (Messages section, not Warnings) and it seems to work ok. 

It still doesn't work on my old profile, but that is a pretty dirty one. I disabled all my 16 addons and it still reproduces. When I flip the pref to true I usually get this in the Error Console: 

Timestamp: 07/27/2012 07:55:26 PM
Warning: ReferenceError: reference to undefined property b.view
Source File: chrome://global/content/bindings/tree.xml
Line: 1038

I will investigate this and file a new issue if I can find any way of reproducing it a second time.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: