Closed Bug 1289145 Opened 8 years ago Closed 8 years ago

PollEvent can block on Clear

Categories

(NSPR :: NSPR, defect)

4.12
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1309681

People

(Reporter: dragana, Assigned: dragana)

References

Details

(Whiteboard: [necko-active])

Attachments

(4 files, 3 obsolete files)

Attached patch bug_pollEventBlocking.patch (obsolete) — Splinter Review
This is not a big problem, but when PollableEvent is created as a pipe it is blocking and it uses:
https://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptio.c#1281
and 
https://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptio.c#1308
 to read and write which will block because secret->nonblocking is 0.

So if we do not have an event signal nsSocketTransportService loop will block at mPollableEvent->Clear() and not at Poll (https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsSocketTransportService2.cpp#1136).


I notice that pt_Read with secret->nonblocking == 0 are used for some file reads.

The attached patch makes pipe file descriptor non blocking. We can eliminate this by checking if mSignaled==true in PollableEvent::Clear() as well.
Flags: needinfo?(mcmanus)
Whiteboard: [necko-active]
This patch does not changes nspr. There can be a cleaner solution if nspr would be changed (e.g. Get/SetSockOption added)
is it fair to summarize as saying that it isn't enough to have the OS FD be non blocking, the meta info in NSPR needs to match?

I can see that..

I rhink you ought to patch it up in nspr. Ted was really good last time about having a parallel checking to moz-central along with the nspr tree and that was pretty helpful to keeping development moving. We would still have to cut a new nspr release, but that's something the nspr team checks on periodically every time the channels move up.
Flags: needinfo?(mcmanus)
I say we should make the nspr change because your gecko-only patch is obviously super fragile
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Attachment #8774360 - Attachment is obsolete: true
Attachment #8774641 - Flags: review?(ted)
Attachment #8774642 - Flags: review?(mcmanus)
Comment on attachment 8774642 [details] [diff] [review]
bug_pollEventBlocking.patch

Review of attachment 8774642 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/PollableEvent.cpp
@@ +138,5 @@
>  #ifdef USEPIPE
>    SOCKET_LOG(("PollableEvent() using pipe\n"));
>    if (PR_CreatePipe(&mReadFD, &mWriteFD) == PR_SUCCESS) {
>      // make the pipe non blocking. NSPR asserts at
>      // trying to use SockOpt here

fix the comment
Attachment #8774642 - Flags: review?(mcmanus) → review+
Comment on attachment 8774641 [details] [diff] [review]
implementsetgetsockoptions_for_pipe.patch

Review of attachment 8774641 [details] [diff] [review]:
-----------------------------------------------------------------

dragana - maybe you should rebase that without the whitespace to make it easier for ted to review. ted - this might be related to a hang problem we see, could you take a look?
comment 7
Flags: needinfo?(ted)
remove the white space change.
Attachment #8774641 - Attachment is obsolete: true
Attachment #8774641 - Flags: review?(ted)
Attachment #8779303 - Flags: review?(ted)
Comment on attachment 8779303 [details] [diff] [review]
implementsetgetsockoptions_for_pipe.patch

Review of attachment 8779303 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think allowing all of PR_SetSocketOption on pipes makes sense, that's just going to allow some strange things to happen. Looking at the code, it looks like NSPR already sets pipe fds to nonblocking when they're created:
http://hg.mozilla.org/projects/nspr/file/default/pr/src/pthreads/ptio.c#l3340

which is called from PR_CreatePipe:
http://hg.mozilla.org/projects/nspr/file/6cf50e545775/pr/src/pthreads/ptio.c#l4487
Attachment #8779303 - Flags: review?(ted) → review-
Do we just need to fix that code to set `(*readPipe)->secret->nonblocking` and `(*writePipe)->secret->nonblocking`?
Flags: needinfo?(ted)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #11)
> Do we just need to fix that code to set `(*readPipe)->secret->nonblocking`
> and `(*writePipe)->secret->nonblocking`?

If this is ok. I am fine with it :)
Attachment #8779303 - Attachment is obsolete: true
Attachment #8786304 - Flags: review?(ted)
Comment on attachment 8786304 [details] [diff] [review]
bug_1289145.patch

Review of attachment 8786304 [details] [diff] [review]:
-----------------------------------------------------------------

That's much simpler. :)
Attachment #8786304 - Flags: review?(ted) → review+
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Feel free to cherry-pick this into mozilla-central, otherwise we'll pick it up whenever we sync up with NSPR next.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #16)
> Feel free to cherry-pick this into mozilla-central, otherwise we'll pick it
> up whenever we sync up with NSPR next.

Please don't cherry pick, distributions rely on being able to identify the snapshots used by Firefox based on the NSPR version number.
Let's check if the change works with all Firefox tests.
Started a try build:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a760fd3e8c26
Blocks: 1293329
(In reply to Kai Engert (:kaie) from comment #17)
> Please don't cherry pick, distributions rely on being able to identify the
> snapshots used by Firefox based on the NSPR version number.

As long as we don't rely on some newly-added feature, we should continue to work with older versions of NSPR. Distributions trying to match exactly what we ship vendored into mozilla-central but using external source packages is not a use case I care about all that much, honestly.
Comment on attachment 8786304 [details] [diff] [review]
bug_1289145.patch

Review of attachment 8786304 [details] [diff] [review]:
-----------------------------------------------------------------

::: nsprpub/pr/src/pthreads/ptio.c
@@ +3337,5 @@
>                  break;
>              case PR_DESC_PIPE:
>                  fd->methods = PR_GetPipeMethods();
>                  pt_MakeFdNonblock(osfd);
> +                fd->secret->nonblocking = PR_TRUE;

This change seems wrong. This change implies all NSPR
pipes are in blocking mode now.
(In reply to Wan-Teh Chang from comment #20)
> Comment on attachment 8786304 [details] [diff] [review]
> bug_1289145.patch
> 
> Review of attachment 8786304 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: nsprpub/pr/src/pthreads/ptio.c
> @@ +3337,5 @@
> >                  break;
> >              case PR_DESC_PIPE:
> >                  fd->methods = PR_GetPipeMethods();
> >                  pt_MakeFdNonblock(osfd);
> > +                fd->secret->nonblocking = PR_TRUE;
> 
> This change seems wrong. This change implies all NSPR
> pipes are in blocking mode now.

That is why I made the fist patch that introduced SetSocketOption for pipes.
Ted suggested that fd->secret->nonblocking = PR_TRUE; is ok :)

I can make a patch similar to my fist patch that allows only setting blocking/nonblocking sock option to a pipe.
Comment on attachment 8774642 [details] [diff] [review]
bug_pollEventBlocking.patch

Review of attachment 8774642 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/PollableEvent.cpp
@@ +138,5 @@
>  #ifdef USEPIPE
>    SOCKET_LOG(("PollableEvent() using pipe\n"));
>    if (PR_CreatePipe(&mReadFD, &mWriteFD) == PR_SUCCESS) {
>      // make the pipe non blocking. NSPR asserts at
>      // trying to use SockOpt here

I see. This is why you wrote the NSPR patch for ptio.c.

I think we should instead modify PR_NewPollableEvent()
in prpolevt.c. I will attach a patch.
Hi Dragana and Ted:

This patch outlines an alternative solution. This solution
assumes it is fine to restrict the new function to NSPR
internal use only.

If code outside NSPR also needs to put a pipe created by
PR_CreatePipe() in non-blocking mode, then we need to export the
new function or just make PR_SetSocketOption work on pipes (which
we've avoided because of the pipe/socket confusion).

I will let you two decide the final solution. Just wanted
to propose this solution for your consideration.
This is another alternative solution. We can implement
PR_NewPollableEvent() using PR_NewTCPSocketPair() on
Unix as well. A big advantage of this approach is that
no new function is needed.
I prefer that we do not use a socket pair. So I prefer the first one (side note: we will need ot change necko to use R_NewPollableEvent).
I'm changing this bug to the NSPR component, because all commited patches were to NSPR, so it can show up in the bug query for the NSPR release.
Component: Networking → NSPR
Product: Core → NSPR
Target Milestone: --- → 4.13
Version: unspecified → 4.12
This change broke NTLM single-sign-on in Linux. Firefox now gives up on talking to the ntlm_auth helper, because it gets -EAGAIN on reading from it when previously it would have blocked.

https://bugzilla.redhat.com/show_bug.cgi?id=1383918
(In reply to David Woodhouse from comment #27)
> This change broke NTLM single-sign-on in Linux. Firefox now gives up on
> talking to the ntlm_auth helper, because it gets -EAGAIN on reading from it
> when previously it would have blocked.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1383918

Thanks for reporting this and linking to the redhat bug.

Can you please file a new bug for this in our bugzilla to get it the right attention from our side?  

Our testing of NTLM support is virtually non-existent...  A long term problem, causing exactly these issues.
^^^ comment 28.  Thanks.
Flags: needinfo?(dwmw2)
Bug 1309681
Flags: needinfo?(dwmw2)
Depends on: 1309681
I suggest that we revert this change, and release a NSPR 4.13.1, where the only difference from NSPR 4.13 is the change to restore the original behavior.

NSPR is supposed to be a stable API.

We shouldn't change the behavior of the library in a way that introduces regressions for consuming applications.

The Firefox bug should be fixed in a different way.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Can we backout this patch?
Flags: needinfo?(ted)
Ted agreed with the plan from comment 31.

I'll work on the backout and a 4.13.1 release soon.

NSPR 4.13 is currently being used by mozilla-central 52 and aurora 51.

Is it fine to change both firefox 51/52 to use NSPR 4.13.1 ?
(In reply to Kai Engert (:kaie) from comment #33)
> Ted agreed with the plan from comment 31.
> 
> I'll work on the backout and a 4.13.1 release soon.
> 
> NSPR 4.13 is currently being used by mozilla-central 52 and aurora 51.
> 
> Is it fine to change both firefox 51/52 to use NSPR 4.13.1 ?

From gecko networking code it is fine to back it out.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #15)
> https://hg.mozilla.org/projects/nspr/rev/
> c5f7511ed27673f72c9b4bef15b4e476bc98f402
> Bug 1289145 - Make pipe non blocking. r=:tedmielczarek

Backed out from both NSPR trunk and NSPR_4_13_BRANCH, and releases as 4.13.1, with Ted's agreement.

https://hg.mozilla.org/projects/nspr/rev/1c573a95fb89
https://hg.mozilla.org/projects/nspr/rev/0fbb19c298ed

I've filed bug 1311366 to track the uplift into firefox 52 and 51.
I will close this bug as dup of 1309681. The solution is/will be posted there.
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → DUPLICATE
Flags: needinfo?(ted)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: