Last Comment Bug 671591 - pipeline robustness - restart partial idempotent transaction
: pipeline robustness - restart partial idempotent transaction
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking: HTTP (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla14
Assigned To: Patrick McManus [:mcmanus]
:
:
Mentors:
Depends on: 599164
Blocks: 742827
  Show dependency treegraph
 
Reported: 2011-07-14 10:12 PDT by Patrick McManus [:mcmanus]
Modified: 2012-04-05 11:41 PDT (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch 1 (14.84 KB, patch)
2011-07-14 10:16 PDT, Patrick McManus [:mcmanus]
no flags Details | Diff | Splinter Review
patch 2 (20.08 KB, patch)
2012-01-11 13:34 PST, Patrick McManus [:mcmanus]
honzab.moz: review+
Details | Diff | Splinter Review
v3 (21.40 KB, patch)
2012-02-15 07:59 PST, Patrick McManus [:mcmanus]
honzab.moz: review+
Details | Diff | Splinter Review
v4 (21.37 KB, patch)
2012-02-15 12:49 PST, Patrick McManus [:mcmanus]
mcmanus: review+
Details | Diff | Splinter Review

Description Patrick McManus [:mcmanus] 2011-07-14 10:12:52 PDT
originally split out from bug 599164

WIP has this problem

   if (mHaveAllHeaders) {
        // XXX TODO
        // honza notes:
        // This creates a race condition: it may happen that we grab mResponseHead to
        // mRestartedResponseHeaders but in the same time it is grabbed by http channel on
        // the main thread.
        if (!mRestartInProgress)
            mRestartInProgress = new RestartVerifier();
        mReadBeforeRestart = PR_MAX(mReadBeforeRestart, mContentRead);
        mRestartInProgress->Set(mContentLength, mResponseHead);
	mResponseHead = nsnull;
    }

plus need for some kind of automated test - which will be hard in our infrastructure
Comment 1 Patrick McManus [:mcmanus] 2011-07-14 10:16:51 PDT
Created attachment 545941 [details] [diff] [review]
patch 1
Comment 2 Honza Bambas (:mayhemer) 2011-07-14 10:21:43 PDT
Adding also some people around tools we would like to update and use for the test.

My plan is to finish the work on pipelining support in httpd.js.  I don't think it should be that hard.  Working on it will be a well spent time.  We will probably want a bug for that separately, though.
Comment 3 Patrick McManus [:mcmanus] 2012-01-11 13:34:47 PST
Created attachment 587805 [details] [diff] [review]
patch 2

This version addresses the race condition with a very minor lock.

I've tested it by forcing a reset error (just by adding c code) for any connection that had read more than 100K * (restarts + 1) and then loading pages that contained objects >100k (but less than 1M) and watching the right things happen.
Comment 4 Honza Bambas (:mayhemer) 2012-01-12 12:08:24 PST
(In reply to Patrick McManus [:mcmanus] from comment #3)
> I've tested it by forcing a reset error (just by adding c code) for any
> connection that had read more than 100K * (restarts + 1) and then loading
> pages that contained objects >100k (but less than 1M) and watching the right
> things happen.

Thanks.  I think we could land this, while pipelining pref'ed off, w/o an automated test, then continue on the other stage and in parallel create automated tests for all the critical stuff in pipelining that would be a precondition for pref'ing on.
Comment 5 Honza Bambas (:mayhemer) 2012-02-09 13:23:37 PST
Comment on attachment 587805 [details] [diff] [review]
patch 2

Review of attachment 587805 [details] [diff] [review]:
-----------------------------------------------------------------

r=honzab, check on comments please.

Now there is a time to write the tests...

::: netwerk/protocol/http/nsHttpTransaction.cpp
@@ +860,5 @@
> +
> +        if (!mTakenResponseHeader && !mForTakeResponseHead) {
> +            // TakeResponseHeader() has not been called yet and this
> +            // is the first restart. Store the resp headers exclusively
> +            // for TakeResponseHeader()

It's TakeResponseHead().

Please add a comment why we are doing this at the first place.

As I understand, presence of mResponseHead is a code logic flag and after restart is must be null since we want to recreate the object and let it parse the incoming headers again, from scratch.  

But, since there had been some content processed, nsHttpChannel::OnStartRequest might be called.  But it gets called on a different thread and we don't know when exactly.  And we have to give it something, so we save the head in mForTakeResponseHead to make it available to the channel.

@@ +1261,4 @@
>      }
>  
>      mDidContentStart = true;
> +    mRestartInProgressVerifier.Set(mContentLength, mResponseHead);

Maybe do explicit IsSetup() check when calling this to emphasize here this is done just ones?

@@ +1331,5 @@
> +        mToReadBeforeRestart -= ignore;
> +        memmove(buf, buf + ignore, *contentRead + *contentRemaining);
> +        if (!mToReadBeforeRestart)
> +            mRestartInProgressVerifier.SetActive(false);
> +    }

This would be better moved to a method of the verifier.  It's a lot changing its state and I also think mToReadBeforeRestart could be encapsulated in that class.

Up to you.

@@ +1597,5 @@
> +    if (!matchOld(newHead, mLastModified, nsHttp::Last_Modified))
> +        return false;
> +
> +    if (!matchOld(newHead, mETag, nsHttp::ETag))
> +        return false;

Hmmm... I think if there are in sum no headers we may base the check on, just don't allow restart.  If there is simply neither etag nor last-modified, we cannot say the content is the same and can be appended based merely on content length.

::: netwerk/protocol/http/nsHttpTransaction.h
@@ +230,5 @@
>      // mTransactionDone  := transaction ran to completion or was interrupted
>      // mResponseComplete := transaction ran to completion
> +
> +    // For Restart-In-Progress Functionality
> +    PRInt64                         mToReadBeforeRestart;

This can be in the RestartVerifier class.

@@ +236,5 @@
> +    bool                            mReportedResponseHeader;
> +
> +    // protected by nsHttp::GetLock()
> +    nsHttpResponseHead             *mForTakeResponseHead;
> +    bool                            mTakenResponseHeader;

s/mTakenResponseHeader/mResponseHeadTaken/ ?  (or at least remove the trailing 'er')

@@ +260,5 @@
> +        ~RestartVerifier() {}
> +        
> +        void Set(PRInt64 contentLength, nsHttpResponseHead *head);
> +        bool Verify(PRInt64 contentLength, nsHttpResponseHead *head);
> +        bool Active() { return mActive; }

Maybe a bit better would be to call this IsDiscardingContent, what is the actual meaning of this flag.  It will make the code more readable.

@@ +261,5 @@
> +        
> +        void Set(PRInt64 contentLength, nsHttpResponseHead *head);
> +        bool Verify(PRInt64 contentLength, nsHttpResponseHead *head);
> +        bool Active() { return mActive; }
> +        void SetActive(bool val) { mActive = val; }

According the proposal to encapsulate the discarding code into this class, you may not need this setter.

@@ +287,5 @@
> +        bool                            mActive;
> +
> +        // true when ::Set has been called with a response header
> +        bool                            mSetup;
> +    } mRestartInProgressVerifier;

I was hoping this could be allocated dynamically, but we have to save headers before the channel takes the head.  That maneuver is not really a good design.  But what to do.. this will make the code and allocations larger (strings) just for a rare case of restart in progress..  I don't have a better idea, though.
Comment 6 Patrick McManus [:mcmanus] 2012-02-15 07:59:11 PST

> just don't allow restart.  If there is simply neither etag nor
> last-modified, we cannot say the content is the same and can be appended
> based merely on content length.

I really thought this was too conservative an approach. But after sleeping on it, I'm also concerned about false positives so I changed it to fail verification if we didn't store the original etag or l-m.

I'm going to ask you to re r? just because I took your verifier suggestions and that changed a significant amount of code.

also there was one case of IsActive() (it was the condition to Verify()), that should have been IsSetup() as active (now called isdiscarding()) gets cleared eventually, but you want to verify the headers each time you are restarted and even when there are 0 bytes passed back to content (because content could still have read the headers).

I also added the "exceeded max restarts check" to restartinprogress instead of relying on the one in restart(), so that we won't reset the state of a bunch of things and then close the transaction anyhow without resetting it.
Comment 7 Patrick McManus [:mcmanus] 2012-02-15 07:59:50 PST
Created attachment 597411 [details] [diff] [review]
v3
Comment 8 Honza Bambas (:mayhemer) 2012-02-15 09:48:31 PST
Comment on attachment 597411 [details] [diff] [review]
v3

Review of attachment 597411 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks

r=honzab

::: netwerk/protocol/http/nsHttpTransaction.cpp
@@ +847,5 @@
> +
> +    // Lock RestartInProgress() and TakeResponseHead() against main thread
> +    MutexAutoLock lock(*nsHttp::GetLock());
> +
> +    // Don't try and restartinprogress things that haven't gotten a response header

restartinprogress

@@ +867,5 @@
> +
> +    if (!mResponseHeadTaken && !mForTakeResponseHead) {
> +        // TakeResponseHeader() has not been called yet and this
> +        // is the first restart. Store the resp headers exclusively
> +        // for TakeResponseHead() which is called fro the main thread and

fro

@@ +1333,5 @@
>          *contentRead = count;
>      }
> +    
> +    if (mRestartInProgressVerifier.IsDiscardingContent() &&
> +        mRestartInProgressVerifier.ToReadBeforeRestart() && *contentRead) {

check on both IsDiscardingContent() and ToReadBeforeRestart() seems redundant.

@@ +1336,5 @@
> +    if (mRestartInProgressVerifier.IsDiscardingContent() &&
> +        mRestartInProgressVerifier.ToReadBeforeRestart() && *contentRead) {
> +        PRUint32 ignore =
> +            PR_MIN(*contentRead,
> +                   PRUint32(mRestartInProgressVerifier.ToReadBeforeRestart()));

Ups, I didn't spot this before... you have to PR_MIN(ToReadBeforeRestart(), PR_UINT32_MAX).  |ignore| could become something unexpected (even 0).

Maybe cache ToReadBeforeReastart() ?

::: netwerk/protocol/http/nsHttpTransaction.h
@@ +267,5 @@
> +            mAlreadyProcessed = val;
> +            mToReadBeforeRestart = val;
> +        }
> +        PRInt64 ToReadBeforeRestart() { return mToReadBeforeRestart; }
> +        void HaveReadBeforeRestart(PRUint32 amt)

ToDiscard and HaveDiscarded, but here I may be too picky...
Comment 9 Patrick McManus [:mcmanus] 2012-02-15 12:48:26 PST
(In reply to Honza Bambas (:mayhemer) from comment #8)

> ToDiscard and HaveDiscarded, but here I may be too picky...

I'm going to pass on that one, other than that all suggestions applied.
Comment 10 Patrick McManus [:mcmanus] 2012-02-15 12:49:37 PST
Created attachment 597531 [details] [diff] [review]
v4
Comment 11 Patrick McManus [:mcmanus] 2012-03-20 10:38:16 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/8dbf8965bac9
Comment 12 Matt Brubeck (:mbrubeck) 2012-03-20 12:59:37 PDT
Sorry, I backed this out on inbound because something in the push appeared to cause a leak in mochitest-browser-chrome:
https://hg.mozilla.org/integration/mozilla-inbound/rev/577da08878cb
Comment 13 Patrick McManus [:mcmanus] 2012-03-22 18:03:39 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/da069eeaa55a
Comment 14 Marco Bonardo [::mak] 2012-03-23 05:58:00 PDT
https://hg.mozilla.org/mozilla-central/rev/da069eeaa55a

Note You need to log in before you can comment on or make changes to this bug.