Closed Bug 1310197 Opened 8 years ago Closed 7 years ago

Implement TCP Fast Open in NSPR

Categories

(NSPR :: NSPR, defect)

defect
Not set
normal

Tracking

(firefox55 fixed)

RESOLVED FIXED
Tracking Status
firefox55 --- fixed

People

(Reporter: dragana, Assigned: dragana)

References

Details

Attachments

(2 files, 6 obsolete files)

Linux use SendTo to send data during TCP Fast Open.
OSX uses connectx
and Windows uses ConnectEx.

I made the interface for all platforms to be the same as on Linux. So necko and nss will use PR_SendTo to send data during TCP Fast Open.
Blocks: 1188435
Blocks: 1310201
Attached patch bug_1310197_nspr_v1.patch (obsolete) — Splinter Review
Attachment #8801102 - Flags: review?(ted)
Comment on attachment 8801102 [details] [diff] [review]
bug_1310197_nspr_v1.patch

sorry, dropping the review flag, I want to make a small change.
Attachment #8801102 - Flags: review?(ted)
Attached patch bug_1310197_nspr_v1.patch (obsolete) — Splinter Review
Attachment #8801102 - Attachment is obsolete: true
Attachment #8803345 - Flags: review?(ted)
Ted, we would really like to move forward with this patch. aiui it is additive and shouldn't really pose a regression risk. if you would like me to review it as well, I would be happy to do so. thanks!
Flags: needinfo?(ted)
Summary: Implement TCP Fast Open → Implement TCP Fast Open in NSPR
Comment on attachment 8803345 [details] [diff] [review]
bug_1310197_nspr_v1.patch

Review of attachment 8803345 [details] [diff] [review]:
-----------------------------------------------------------------

ted, I've read this code and believe it to be correct and 100% accretive (and therefore of low regression risk). it blocks another important feature - can we proceed on that basis?
Attachment #8803345 - Flags: review?(ted) → review+
Flags: needinfo?(ted)
Flags: needinfo?(ted)
Comment on attachment 8803345 [details] [diff] [review]
bug_1310197_nspr_v1.patch

Daniel, can you look at this patch as well? Thanks!!!
Attachment #8803345 - Flags: review?(daniel)
Comment on attachment 8803345 [details] [diff] [review]
bug_1310197_nspr_v1.patch

Review of attachment 8803345 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Just one question and some very minor nits.

::: nsprpub/pr/src/io/prsocket.c
@@ +1164,5 @@
>  {
>      *out_flags = 0;
> +#if defined(_WIN64)
> +    if (in_flags & PR_POLL_WRITE) {
> +        if (fd->secret->alreadyConnected) {

I couldn't see that 'alreadyConnected' is ever switched off again. Won't that make polling on this socket always return writable even after the connection has been established?

::: nsprpub/pr/src/md/windows/w95sock.c
@@ +436,5 @@
> +    // ConnectEx return TRUE on a success and FALSE on an error.
> +    if (_pr_win_connectex( (SOCKET)osfd, (struct sockaddr *) addr,
> +                           addrlen, buf, amount,
> +                           &rvSent, &fd->secret->ol) == TRUE)
> +    {

nit: brace moved to the previous line?

@@ +472,5 @@
> +            // this data already send.
> +            return amount;
> +        }
> +        while (err == ERROR_IO_PENDING)
> +        {

nit: open brace on the same line as "while" and the same comment for the three following if() clauses below.

::: nsprpub/pr/src/pthreads/ptio.c
@@ +2057,5 @@
> +#if !defined(DARWIN) || HAS_CONNECTX
> +    PRInt32 syserrno, bytes = -1;
> +    PRBool fNeedContinue = PR_FALSE;
> +    pt_SockLen addr_len;
> +        const PRNetAddr *addrp = addr;

indent mistake? (and one below)

@@ +2068,5 @@
> +
> +    PR_ASSERT(IsValidNetAddr(addr) == PR_TRUE);
> +#if defined(_PR_INET6)
> +        if (addr->raw.family == PR_AF_INET6) {
> +                md_af = AF_INET6;

nit: indent level hiccup

@@ +2125,5 @@
> +        bytes = pt_Continue(&op);
> +        syserrno = op.syserrno;
> +    }
> +    if (bytes < 0)
> +        pt_MapError(_PR_MD_MAP_SENDTO_ERROR, syserrno);

Use { braces } ?
Attachment #8803345 - Flags: review?(daniel) → review+
(In reply to Patrick McManus [:mcmanus] from comment #5)
> Comment on attachment 8803345 [details] [diff] [review]
> bug_1310197_nspr_v1.patch
> 
> Review of attachment 8803345 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ted, I've read this code and believe it to be correct and 100% accretive
> (and therefore of low regression risk). it blocks another important feature
> - can we proceed on that basis?

Yes, if the networking peers are satisfied with this code that's good enough for me.
Flags: needinfo?(ted)
@ted - thanks.

when we've got an updated patch with an r+ from me and :bagder, let's commit to m-c and ni? ted to make sure an nspr release happens in parallel. thanks!
(In reply to Daniel Stenberg [:bagder] from comment #7)
> Comment on attachment 8803345 [details] [diff] [review]
> bug_1310197_nspr_v1.patch
> 
> Review of attachment 8803345 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good. Just one question and some very minor nits.
> 
> ::: nsprpub/pr/src/io/prsocket.c
> @@ +1164,5 @@
> >  {
> >      *out_flags = 0;
> > +#if defined(_WIN64)
> > +    if (in_flags & PR_POLL_WRITE) {
> > +        if (fd->secret->alreadyConnected) {
> 
> I couldn't see that 'alreadyConnected' is ever switched off again. Won't
> that make polling on this socket always return writable even after the
> connection has been established?
> 

Thanks!! Go catch. This will happen really rare; it can only happen if connectEx returns true, which means socket is connected.
Attached patch bug_1310197_nspr_v1.patch (obsolete) — Splinter Review
rebased and fix Daniel's comments.
Attachment #8803345 - Attachment is obsolete: true
Attachment #8854947 - Flags: review+
Attached patch bug_1310197_nspr_v1.patch (obsolete) — Splinter Review
Update commit message.
Attachment #8854947 - Attachment is obsolete: true
Attachment #8854949 - Flags: review+
Attached patch bug_1310197_nspr_part2.patch (obsolete) — Splinter Review
This is needed for bug 1188435 comment 46. sndto will not be null but _PR_InvalidInt.
_PR_InvalidInt is private so doing it in more elegant way.
Attachment #8854951 - Flags: review?(mcmanus)
Comment on attachment 8854951 [details] [diff] [review]
bug_1310197_nspr_part2.patch

Review of attachment 8854951 [details] [diff] [review]:
-----------------------------------------------------------------

That isn't a natural solution for nspr. how about

1] extern PRIntn _PR_InvalidInt(void); extern PRInt64 _PR_InvalidInt64(void); - ? Its kludgy, but we already do some kludgy nspr stuff and at least we'd fail at link time if something changed.

or

2] testing sendto() at startup time with something bogus, but checking for PR_INVALID_METHOD_ERROR as the response
Attachment #8854951 - Flags: review?(mcmanus) → review-
(In reply to Patrick McManus [:mcmanus] from comment #14)
> Comment on attachment 8854951 [details] [diff] [review]
> bug_1310197_nspr_part2.patch
> 
> Review of attachment 8854951 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> That isn't a natural solution for nspr. how about
> 
> 1] extern PRIntn _PR_InvalidInt(void); extern PRInt64
> _PR_InvalidInt64(void); - ? Its kludgy, but we already do some kludgy nspr
> stuff and at least we'd fail at link time if something changed.
> 

I have tried that but it does not compile, it cannot find _PR_InvalidInt. I need to change the linking dependence but I do not know where that is :) I could try to figure it out, probably I need to expose _PR_InvalidInt or something


> or
> 
> 2] testing sendto() at startup time with something bogus, but checking for
> PR_INVALID_METHOD_ERROR as the response

not sure about this one. nspr will crash if you call sendto and that is not implemented. If I change that I am changing nspr interface and we do not want that.
(In reply to Dragana Damjanovic [:dragana] from comment #15)
> (In reply to Patrick McManus [:mcmanus] from comment #14)
> > Comment on attachment 8854951 [details] [diff] [review]
> > bug_1310197_nspr_part2.patch
> > 
> > Review of attachment 8854951 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > That isn't a natural solution for nspr. how about
> > 
> > 1] extern PRIntn _PR_InvalidInt(void); extern PRInt64
> > _PR_InvalidInt64(void); - ? Its kludgy, but we already do some kludgy nspr
> > stuff and at least we'd fail at link time if something changed.
> > 
> 
> I have tried that but it does not compile, it cannot find _PR_InvalidInt. I
> need to change the linking dependence but I do not know where that is :) I
> could try to figure it out, probably I need to expose _PR_InvalidInt or
> something
> 

yeah, that's true because nspr can live outside of libxul I suppose.

hmm.


> 
> > or
> > 
> > 2] testing sendto() at startup time with something bogus, but checking for
> > PR_INVALID_METHOD_ERROR as the response
> 
> not sure about this one. nspr will crash if you call sendto and that is not
> implemented. If I change that I am changing nspr interface and we do not
> want that.

I'm having flashbacks to seeing this before. it asserts, right? Which is kinda bizarre for doing feature detection.
The proposed patch doesn't solve the problem either the more I think about it.. if you need nspr v2 for sendto but only v1 is installed (the point of the patch) PR_SentToMethodPresent isn't going to be present either and some dynamic linker fail is going to happen :(
nsprpub/pr/tests/version.c seems to provide a guide to explicitly version testing.. its ugly but could work I suppose.
(In reply to Patrick McManus [:mcmanus] from comment #18)
> nsprpub/pr/tests/version.c seems to provide a guide to explicitly version
> testing.. its ugly but could work I suppose.

I do not why I have not think about it earlier. We have a couple of reserved_fn_* methods and they are all _PR_InvalidInt. I can compare sendto to them.

I have tried to check version using guideline from nsprpub/pr/tests/version.c, but it does not work. there is no libVersionPoint. I tried 2 different ways but it cannot find it in a nspr lib :(
sounds promising!
Attachment #8854951 - Attachment is obsolete: true
Pushed by dd.mozilla@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/54d50e4a99b2
Implement fast open nspr part. r=mcmanus,bagder
I have push this to mozilla-inbound.

Ted, we will need a nspr release.
Flags: needinfo?(ted)
Flags: needinfo?(ted)
Flags: needinfo?(dd.mozilla)
Attached patch bug_1310197_nspr_v1.patch (obsolete) — Splinter Review
Attachment #8854949 - Attachment is obsolete: true
Comment on attachment 8858328 [details] [diff] [review]
bug_1310197_nspr_v1.patch

ConnectContinue and Poll was not correct.
Attachment #8858328 - Flags: review?(mcmanus)
Attachment #8858328 - Flags: review?(daniel)
Comment on attachment 8858328 [details] [diff] [review]
bug_1310197_nspr_v1.patch

Review of attachment 8858328 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, I only detected some minor nits.

::: nsprpub/pr/src/io/prsocket.c
@@ +318,5 @@
> +    if (fd->secret->overlappedActive) {
> +        PR_ASSERT(fd->secret->nonblocking);
> +        PRInt32 rvSent;
> +        if (GetOverlappedResult(osfd, &fd->secret->ol, &rvSent, FALSE) == TRUE)
> +        {

open brace at the end of the if () line?

::: nsprpub/pr/src/pthreads/ptio.c
@@ +2082,5 @@
> +#ifndef _PR_HAVE_SOCKADDR_LEN
> +        addrCopy = *addr;
> +        addrCopy.raw.family = AF_INET6;
> +        addrp = &addrCopy;
> +#endif

For the case when _PR_HAVE_SOCKADDR_LEN is *not* defined, the code copies the addr struct and sets the family field to INET6 only when the family is already set to PR_AF_INET6, while for the case when _PR_HAVE_SOCKADDR_LEN *is* defined it always copies the struct and sets the family field. This difference could use a little comment to explain it, since it isn't that easily spotted at a first glance.

@@ +2132,5 @@
> +        bytes = pt_Continue(&op);
> +        syserrno = op.syserrno;
> +    }
> +    if (bytes < 0)
> +        pt_MapError(_PR_MD_MAP_SENDTO_ERROR, syserrno);

Forgot { braces } for this conditional block.

@@ +3251,5 @@
>      pt_Shutdown,
>      pt_Recv,
>      pt_Send,
>      (PRRecvfromFN)_PR_InvalidInt,
> +    pt_TCP_SendTo, // Thiss is for TCP Fast Open. Linux uses SendTo function for this. OSX uses connectx, but we imitate Linux.

s/Thiss/this
Attachment #8858328 - Flags: review?(daniel) → review+
Comment on attachment 8858328 [details] [diff] [review]
bug_1310197_nspr_v1.patch

Review of attachment 8858328 [details] [diff] [review]:
-----------------------------------------------------------------

::: nsprpub/pr/src/io/prsocket.c
@@ +1165,5 @@
> +
> +#if defined(_WIN64)
> +    if (in_flags & PR_POLL_WRITE) {
> +        if (fd->secret->alreadyConnected) {
> +            fd->secret->alreadyConnected = PR_FALSE;

let's clear this when we take action on the socket, not in level triggered poll
Attachment #8858328 - Flags: review?(mcmanus) → review+
addressed comments.
Attachment #8858328 - Attachment is obsolete: true
Attachment #8859886 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/919b9ff342f3
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment on attachment 8859886 [details] [diff] [review]
bug_1310197_nspr_v1.patch

Review of attachment 8859886 [details] [diff] [review]:
-----------------------------------------------------------------

::: nsprpub/pr/src/pthreads/ptio.c
@@ +2094,5 @@
> +#endif
> +
> +#ifdef _PR_HAVE_SOCKADDR_LEN
> +    // if _PR_HAVE_SOCKADDR_LEN is defined and it is PR_AF_INET6 we set family
> +    / to AF_INET6 and we set address length.

It seems that it missed a slash and result in build-fail.
Flags: needinfo?(dd.mozilla)
Sorry, I will have a patch to fix it.
Flags: needinfo?(dd.mozilla)
This fix a comment only!
Attachment #8860259 - Flags: review+
Depends on: 1358466
Unfortunately you haven't followed the NSPR project rules when committing these changes to the Mozilla tree.

NSPR is a separate product, with its own release cycle.

The nsprpub directory inside the Mozilla/Firefox tree must use released or beta snapshots of the NSPR project, only.
Direct commits into the mozilla/nsprpub aren't allowed.

The reason is that some consumers of Firefox, such as Linux distributions, package NSPR and NSS separately, and don't take the copy shipped as part of the Firefox sources, but use the released NSPR project archives for producing the system NSPR package. This means, if you don't follow the proper process, then those Linux distributions will miss the code that you have landed directly in the Firefox tree, and things will be really messed up...

All changes to NSPR must be checked in to the NSPR project tree at 
  https://hg.mozilla.org/projects/nspr/

Then you need to request that the NSPR/NSS group uplifts a new NSPR snapshot into mozilla-central.
The current bug for this process is bug 1350291.

(Hopefully we'll soon have a commit hook that prevents direct landings into the nsprpub directory, this is being worked on in bug 1348852.)

I'd like to kindly ask all reviewers to NSPR code to remind contributors of this process in the future.
I've pushed the NSPR portions of your patch to the NSPR tree:
  https://hg.mozilla.org/projects/nspr/rev/f39f28732459374ed4fa5a5a5b5b03d81c5d5aac
Target Milestone: --- → 4.15
I know the rules, I have not push this change to nspr project yet, because I wanted the changes to be tested for a short time on nightly to see if everything is fine before pushing it to nspr project. The gecko changes that actually use nspr code landed last Friday
Your commit was almost 3 weeks ago. It would have been helpful if you had mentioned the temporary nature of your commit in the commit message, this would have allowed me to identify your intentions. The commit messages in this bug also didn't give an easily to find clue.

Also, this bug had been marked resolved fixed, which suggested that the work on this is done. For a temporary experiment, I would have expected the bug to remain open until the experiment is completed and the final decisions have been made.

I would like to suggest, if you intend to repeat such experiments in the future, please announce them to the NSPR/NSS developers, to ensure that more people aware of the requirement to potentially backout local changes prior to a beta merge.
Sorry for confusion. I would like to ship this in 55. I just have not push it into nspr project, because I was waiting to see if everything is fine, actually to test this code. And that took too long, I pushed gecko code that uses this nspr code only on Friday. I wanted to be sure that code is stable before pushing it to nspr project.

For now the NSPR part of the code for TCP fast open looks stable. I have not seen any crashes in there. So thank you for pushing it to nspr project repo. We can keep it and I hope everything is going to work just fine.
Blocks: 1350291
(In reply to Kai Engert (:kaie:) from comment #39)
> I've pushed the NSPR portions of your patch to the NSPR tree:
>  
> https://hg.mozilla.org/projects/nspr/rev/
> f39f28732459374ed4fa5a5a5b5b03d81c5d5aac

I had missed the change to "configure" when landing into NSPR,
this commit adds the missing piece:
  https://hg.mozilla.org/projects/nspr/rev/f869b8701568
Depends on: 1374538
Sorry if this is a little unrelated and on a closed bug but I'm running into an issue upgrading to NSPR4.15 in my source tree when building Win 64-bit, which is directly related to the code added here. I need some help; maybe you can provide a pointer as to how to solve the following Win 64 build bustage?

 1:05.31 w95sock.obj : error LNK2019: unresolved external symbol WSAIoctl referenced in function _pr
_set_connectex
 1:05.31
 1:05.31 nss3.dll : fatal error LNK1120: 1 unresolved externals
Mark, please file a new, separate bug on this issue, where you explain what software (that uses NSPR) you are trying you build, and CC the people who were active on this bug. In the new bug you could needinfo Dragana, as she had provided the patches, and might know best what new link options you require.
See Also: → 1392513
Sorry for asking a dumb question here, but: from the code changed, TCP Fast Open seems to be available only to 64bit windows build and letting 32bit windows build untouched. Does it mean TCP Fast Open is only available in 64bit Windows?
Regressions: 1584791
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: