Closed Bug 996535 Opened 6 years ago Closed 6 years ago

[RTSP] Convert linux socket to NSPR/PRFileDesc

Categories

(Firefox OS Graveyard :: RTSP, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(tracking-b2g:backlog)

RESOLVED FIXED
2.0 S1 (9may)
tracking-b2g backlog

People

(Reporter: johnshih.bugs, Assigned: johnshih.bugs)

References

Details

Attachments

(1 file, 6 obsolete files)

No description provided.
Hi John,

Thanks for filing this bug.
The link below is the overview of RTSP architecture depicted by Vincent.
https://bug831645.bugzilla.mozilla.org/attachment.cgi?id=778316

Although some class names were out of date, most of the component relationships are still valid.
You can get an overall view of point of the architecture.
OS: Linux → Gonk (Firefox OS)
Hardware: x86_64 → ARM
blocking-b2g: --- → backlog
Hi Steve,
I have a question with setting socket into block/blocking mode.
After some study, I can see the socket open from NSPR API is set to non-blocking mode by default [1]. Besides, there is also an API (PR_SetSocketOption) that allow us to set PR_SockOpt_Nonblocking flag [2]. So what I'm curious about it that the relation between osfd non-blocking mode and fd non-blocking flag. Does setting non-blocking flag to FALSE work the same as setting osfd to blocking mode? Or what the behavior (blocking or non-blocking) would the socket be if I didn't set the non-blocking flag?

Thanks,

[1] http://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptio.c#3266
[2] http://dxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsServerSocket.cpp?from=nsServersocket.cpp#349
Flags: needinfo?(sworkman)
Hi John,
This - http://www.scottklement.com/rpg/socktut/nonblocking.html - should explain blocking vs non-blocking sockets. As far as I know, osfd and fd non-blocking states should be the same.

Do you need to set the fd to be blocking for RTSP?
Flags: needinfo?(sworkman)
(In reply to Steve Workman [:sworkman] from comment #3)
> Hi John,
> This - http://www.scottklement.com/rpg/socktut/nonblocking.html - should
> explain blocking vs non-blocking sockets. As far as I know, osfd and fd
> non-blocking states should be the same.
> 
> Do you need to set the fd to be blocking for RTSP?

Yes, I found there are some code making socket blocking in RTSP, such as [1]. So I can just replace it by using PR_SetSocketOption to make prfd blocking, right?

Since using PR_SetSocketOption to make prfd blocking just set a flag to TRUE [2], so I just worry about if this flag can do the same thing as setting osfd with O_NONBLOCK flag. However, I bet they are the same too.. :-)

[1] http://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/rtsp/rtsp/ARTSPConnection.cpp?from=artspconnection.cpp&case=true#512
[2] http://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/io/prmapopt.c#54
Hi Steve,
The work is almost done, but I encounter a problem:
While I'm replacing select() by PR_Select [1], I found that PR_Select is likely to be obsolete in the near future... do you have any idea with that?

Thanks

[1] http://dxr.mozilla.org/mozilla-central/source/nsprpub/pr/include/obsolete/probslet.h?from=probslet.h#1
Flags: needinfo?(sworkman)
We use PR_Poll in nsSocketTransportService so you could start with that.

I assume this bug is just about converting the linux fds to PRFileDescs, right? And it's NOT about using the nsSocketTransportService to get the fd first??
Flags: needinfo?(sworkman)
(In reply to Steve Workman [:sworkman] from comment #6)
> We use PR_Poll in nsSocketTransportService so you could start with that.

I see. Thanks!

> 
> I assume this bug is just about converting the linux fds to PRFileDescs,
> right? And it's NOT about using the nsSocketTransportService to get the fd
> first??

Yes, you are right. But I can see if there is a chance to move forward of using nsSocketTransportService directly.
Hi Steve,

The current work includes just replacing linux socket APIs by NSPR APIs. Though our eventual goal is to use nsISocketTransportService directly, considering the complexity the work would be, I think this work is still a good start for us.

Besides, the patch definitely needs your careful review. Meanwhile, I'll keep testing it in order to make sure it won't cause regression.. :-)

Thanks!
Attachment #8412475 - Attachment is obsolete: true
Attachment #8412489 - Flags: review?(sworkman)
Comment on attachment 8412489 [details] [diff] [review]
Bug 996535 - [RTSP] Convert linux socket to NSPR/PRFileDesc. r=sworkman

Review of attachment 8412489 [details] [diff] [review]:
-----------------------------------------------------------------

Halfway through this, but I need to move on to other work. Some of the comments below should apply to all the files in the patch. So, maybe you can deal with those first and then upload another patch for review after.

Good work - f+.

::: netwerk/protocol/rtsp/rtsp/ARTPConnection.h
@@ +64,5 @@
>          kWhatPollStreams,
>          kWhatInjectPacket,
>      };
>  
> +    static const PRUint32 kSelectTimeoutUs;

s/Select/SocketPoll/

::: netwerk/protocol/rtsp/rtsp/ARTSPConnection.cpp
@@ +47,5 @@
>  namespace android {
>  
>  // static
> +const PRUint32 ARTSPConnection::kSelectTimeoutUs = 1000;
> +const PRUint32 ARTSPConnection::kSelectTimeoutRetries = 10000;

You changed these from signed to unsigned. Is that what you want?

@@ +63,5 @@
>      MakeUserAgent(&mUserAgent);
>  }
>  
>  ARTSPConnection::~ARTSPConnection() {
> +    if (mSocket != nullptr) {

if (mSocket) { ...

@@ +201,5 @@
>  }
>  
> +static PRStatus MakeSocketBlocking(PRFileDesc *fd, bool blocking) {
> +    // Check if socket is closed.
> +    if (fd == nullptr) {

if (!fd) {...

@@ +211,3 @@
>  
> +    opt.option = PR_SockOpt_Nonblocking;
> +    opt.value.non_blocking = blocking;

= !blocking;

Right?

@@ +211,4 @@
>  
> +    opt.option = PR_SockOpt_Nonblocking;
> +    opt.value.non_blocking = blocking;
> +    rv =  PR_SetSocketOption(fd, &opt);

Remove one space after '='.

@@ +215,2 @@
>  
> +    return rv;

Keep the return type as status_t, then...

return rv = PR_SUCCESS ? UNKNOWN_ERROR : OK;

@@ +254,5 @@
>      if (mUser.size() > 0) {
>          LOGV("user = '%s', pass = '%s'", mUser.c_str(), mPass.c_str());
>      }
>  
> +    //struct hostent *ent = gethostbyname(host.c_str());

Remove this line.

@@ +257,5 @@
>  
> +    //struct hostent *ent = gethostbyname(host.c_str());
> +    PRStatus status;
> +    PRHostEnt hostentry;
> +    void* buffer = PR_Malloc(PR_NETDB_BUF_SIZE);

Probably moz_xmalloc is fine here. Or whatever is used most commonly in the rest of this file.

@@ +260,5 @@
> +    PRHostEnt hostentry;
> +    void* buffer = PR_Malloc(PR_NETDB_BUF_SIZE);
> +
> +    status = PR_GetHostByName(
> +        host.c_str(), (char*)buffer, PR_NETDB_BUF_SIZE, &hostentry);

PR_GetHostByName is fine for now. But we'll want to integrate this with nsDNSService when we integrate with nsSocketTransportService. But that will look very different anyway ... :)

@@ +278,5 @@
>      MakeSocketBlocking(mSocket, false);
>  
> +    PRNetAddr remote;
> +    remote.inet.family = PR_AF_INET;
> +    remote.inet.ip = *((PRUint32 *) hostentry.h_addr_list[0]);

Another reason to use SocketTransportService at some point ... this code will use the first address from the DNS record only. If the connection attempt fails, the code will not retry the connection with another address from the record. SocketTransportService takes care of this internally, making sure that all addresses in the record are tried if necessary.

But fine for this bug :)

@@ +314,5 @@
>  }
>  
>  void ARTSPConnection::performDisconnect() {
> +    PR_Close(mSocket);
> +    mSocket = nullptr;

These two lines are repeated often. You might want to consider a CloseSocket function as part of ARTSPConnection.

@@ +356,5 @@
>          reply->post();
>          return;
>      }
>  
> +    PRPollDesc ws;

s/ws/writePollDesc/

@@ +364,2 @@
>  
> +    PRInt32 res = PR_Poll(&ws, 1, PR_MicrosecondsToInterval(kSelectTimeoutUs));

s/kSelectTimeoutUs/kSocketPollTimeoutUs/

s/res/numSocketsReadyToWrite/

@@ +367,5 @@
>  
>      if (res == 0) {
>          // select() timed out. Not yet connected.
>          if (mNumSelectTimeoutRetries < kSelectTimeoutRetries) {
>              mNumSelectTimeoutRetries++;

s/Select/SocketPoll/

You might need to change these for multiple variables, but since you're using PR_Poll and not select, it is a good idea.

@@ +386,2 @@
>  
> +    if (err != PR_IO_PENDING_ERROR) {

Is this definitely right? It means that if err == PR_SUCCESS we will enter this if branch.

@@ +443,2 @@
>  
> +        PRErrorCode code = PR_GetError();

s/code/errCode/

@@ +455,5 @@
>  
>                  reply->setInt32("result", ERROR_IO);
>                  reply->post();
>              } else {
> +                LOGE("Error sending rtsp request.");

Removing the string is ok, but please add %d for errCode.

@@ +476,5 @@
>      if (mState != CONNECTED) {
>          return;
>      }
>  
> +    PRPollDesc rs;

s/rs/readPollDesc/

@@ +483,2 @@
>  
> +    PRInt32 res = PR_Poll(&rs, 1, PR_MicrosecondsToInterval(kSelectTimeoutUs));

s/res/numSocketsReadyToRead/

@@ +531,2 @@
>  
> +        PRErrorCode code = PR_GetError();

s/code/errCode/

Please change all instances of this.

@@ +541,5 @@
>                  // Server closed the connection.
>                  LOGE("Server unexpectedly closed the connection.");
>                  return ERROR_IO;
>              } else {
> +                // LOGE("Error reading rtsp response. (%s)", strerror(errno));

Keep the LOGE; change %s and strerror to %d errCode.

Please change all instances of this.

::: netwerk/protocol/rtsp/rtsp/ARTSPConnection.h
@@ +44,5 @@
>  
>      void observeBinaryData(const sp<AMessage> &reply);
>  
>      static bool ParseURL(
> +            const char *url, AString *host, PRUint16 *port, AString *path,

s/PRUint16/uint16_t/

Check the other files too and try to use standard C++ types here. The NSPR types should be equivalent.

@@ +74,5 @@
>          DIGEST
>      };
>  
> +    static const PRUint32 kSelectTimeoutUs;
> +    static const PRUint32 kSelectTimeoutRetries;

s/PRUint32/uint32_t/

@@ +86,5 @@
>      int32_t mConnectionID;
>      int32_t mNextCSeq;
>      bool mReceiveResponseEventPending;
> +    PRFileDesc *mSocket;
> +    PRUint32 mNumSelectTimeoutRetries;

s/PRUint32/uint32_t/
Attachment #8412489 - Flags: review?(sworkman) → feedback+
(In reply to Steve Workman [:sworkman] from comment #10)
> Comment on attachment 8412489 [details] [diff] [review]
> Bug 996535 - [RTSP] Convert linux socket to NSPR/PRFileDesc. r=sworkman
> 
> Review of attachment 8412489 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Halfway through this, but I need to move on to other work. Some of the
> comments below should apply to all the files in the patch. So, maybe you can
> deal with those first and then upload another patch for review after.
> 
> Good work - f+.
> 

Thanks for your quick reviews! Really appreciate with that.

> 
> ::: netwerk/protocol/rtsp/rtsp/ARTSPConnection.cpp
> @@ +47,5 @@
> >  namespace android {
> >  
> >  // static
> > +const PRUint32 ARTSPConnection::kSelectTimeoutUs = 1000;
> > +const PRUint32 ARTSPConnection::kSelectTimeoutRetries = 10000;
> 
> You changed these from signed to unsigned. Is that what you want?

yes, its due to the definition of PR_MicrosecondsToInterval(..)

> 
> @@ +211,3 @@
> >  
> > +    opt.option = PR_SockOpt_Nonblocking;
> > +    opt.value.non_blocking = blocking;
> 
> = !blocking;
> 
> Right?
> 

You are right! thanks

> 
> @@ +257,5 @@
> >  
> > +    //struct hostent *ent = gethostbyname(host.c_str());
> > +    PRStatus status;
> > +    PRHostEnt hostentry;
> > +    void* buffer = PR_Malloc(PR_NETDB_BUF_SIZE);
> 
> Probably moz_xmalloc is fine here. Or whatever is used most commonly in the
> rest of this file.

What's the difference here? It seems that moz_xmalloc is also used not often as PR_Malloc is. (Note, malloc function is used since this patch)
patch update
Attachment #8412489 - Attachment is obsolete: true
Attachment #8413556 - Flags: review?(sworkman)
(In reply to John Shih from comment #11)
> (In reply to Steve Workman [:sworkman] from comment #10)
> > @@ +257,5 @@
> > >  
> > > +    //struct hostent *ent = gethostbyname(host.c_str());
> > > +    PRStatus status;
> > > +    PRHostEnt hostentry;
> > > +    void* buffer = PR_Malloc(PR_NETDB_BUF_SIZE);
> > 
> > Probably moz_xmalloc is fine here. Or whatever is used most commonly in the
> > rest of this file.
> 
> What's the difference here? It seems that moz_xmalloc is also used not often
> as PR_Malloc is. (Note, malloc function is used since this patch)

Sorry, I should have explained: moz_xmalloc (and its friends) are infallible, meaning that they will always return a non-null pointer. Out-of-memory errors are handled internally (abort!). There is another explanation here http://dxr.mozilla.org/mozilla-central/source/memory/mozalloc/mozalloc.h#65. As far as I know, this is useful for most memory allocations. You should only use the fallible version (moz_malloc) if you want to respond to the out-of-memory by allocating a smaller amount of memory (or something like that). In this case, if you can't allocate a buffer for a net address, something has gone really wrong :) There is no real way to recover here.
Blocks: 966625
Replace PR_Malloc, PR_Callor by moz_xmalloc, moz_xcalloc
Attachment #8413556 - Attachment is obsolete: true
Attachment #8413556 - Flags: review?(sworkman)
Attachment #8414956 - Flags: review?(sworkman)
(In reply to Steve Workman [:sworkman] from comment #13)
> (In reply to John Shih from comment #11)
> > (In reply to Steve Workman [:sworkman] from comment #10)
> > > @@ +257,5 @@
> > > >  
> > > > +    //struct hostent *ent = gethostbyname(host.c_str());
> > > > +    PRStatus status;
> > > > +    PRHostEnt hostentry;
> > > > +    void* buffer = PR_Malloc(PR_NETDB_BUF_SIZE);
> > > 
> > > Probably moz_xmalloc is fine here. Or whatever is used most commonly in the
> > > rest of this file.
> > 
> > What's the difference here? It seems that moz_xmalloc is also used not often
> > as PR_Malloc is. (Note, malloc function is used since this patch)
> 
> Sorry, I should have explained: moz_xmalloc (and its friends) are
> infallible, meaning that they will always return a non-null pointer.
> Out-of-memory errors are handled internally (abort!). There is another
> explanation here
> http://dxr.mozilla.org/mozilla-central/source/memory/mozalloc/mozalloc.h#65.
> As far as I know, this is useful for most memory allocations. You should
> only use the fallible version (moz_malloc) if you want to respond to the
> out-of-memory by allocating a smaller amount of memory (or something like
> that). In this case, if you can't allocate a buffer for a net address,
> something has gone really wrong :) There is no real way to recover here.

Thanks for your clarification! It's more clear now :)
Comment on attachment 8414956 [details] [diff] [review]
Bug 996535 - [RTSP] Convert linux socket to NSPR/PRFileDesc. r=sworkman

Review of attachment 8414956 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Getting there. Just some small changes, so one more review after this one I think.

I see you've added changes to do with TCP interleaving. Is this patch to be applied after Ethan's TCP interleaving one? I think I'm fine with the minor changes being included here, but please add a comment in the patch description to explain which bits of the interleaving you're adding.

Also, one small point. Please don't add r=sworkman to the patch description until I r+ it :) Thanks!

::: netwerk/protocol/rtsp/rtsp/ARTPConnection.cpp
@@ +59,5 @@
>  struct ARTPConnection::StreamInfo {
> +    PRFileDesc *mRTPSocket;
> +    PRFileDesc *mRTCPSocket;
> +    int mInterleavedRTP;
> +    int mInterleavedRTCP;

Assuming these are indexes, s/mInterleavedRTP/mInterleavedRTPIdx/ and for RTCP.

@@ +227,5 @@
>  }
>  
>  void ARTPConnection::onRemoveStream(const sp<AMessage> &msg) {
> +    PRFileDesc *rtpSocket, *rtcpSocket;
> +    void *s;

Init all three to nullptr.

@@ +273,4 @@
>      for (List<StreamInfo>::iterator it = mStreams.begin();
>           it != mStreams.end(); ++it) {
>          if ((*it).mIsInjected) {
> +            index += 2;

Can you add a comment for all the 'index += 2' statements please?

Would it be better to set in index in the for loop statement?

for (List<StreamInfo>::iterator it = mStreams.begin(), index = 0;
     it != mStreams.end();
     ++it, index += 2) { ...

Also, s/index/pollIndex/ please.

@@ +294,3 @@
>      }
>  
> +    if (count == 0) {

s/count/numSocketsToPoll/

::: netwerk/protocol/rtsp/rtsp/ARTSPConnection.cpp
@@ +204,4 @@
>          return UNKNOWN_ERROR;
>      }
>  
> +    PRStatus rv;

Init to PR_FAILURE.

@@ +251,5 @@
>      if (mUser.size() > 0) {
>          LOGV("user = '%s', pass = '%s'", mUser.c_str(), mPass.c_str());
>      }
>  
> +    PRStatus status;

Init to PR_FAILURE.

@@ +256,5 @@
> +    PRHostEnt hostentry;
> +    void* buffer = moz_xmalloc(PR_NETDB_BUF_SIZE);
> +
> +    status = PR_GetHostByName(
> +        host.c_str(), (char*)buffer, PR_NETDB_BUF_SIZE, &hostentry);

Try declaring buffer on the stack.

char buffer[PR_NDETFB_BUF_SIZE);

@@ +474,4 @@
>  
> +    int32_t numSocketsReadyToRead = PR_Poll(&readPollDesc, 1,
> +        PR_MicrosecondsToInterval(kSocketPollTimeoutUs));
> +    CHECK_GE(numSocketsReadyToRead, 0);

Can you add another check to see if it's greater than 1? That would be an error condition, right? I know the code currently just continues quietly here, but it seems like more of an error that might indicate that something else is wrong.

::: netwerk/protocol/rtsp/rtsp/ARTSPConnection.h
@@ +74,5 @@
>          DIGEST
>      };
>  
> +    static const uint32_t kSocketPollTimeoutUs;
> +    static const uint32_t kSocketPollTimeoutRetries;

Thank you for these renames!

::: netwerk/protocol/rtsp/rtsp/RTSPConnectionHandler.h
@@ +323,4 @@
>  
>          AString source;
>          AString server_port;
> +        PRStatus status;

Init to PR_FAILURE, please.

@@ +332,5 @@
>  
> +            void* buffer = moz_xmalloc(PR_NETDB_BUF_SIZE);
> +            PRHostEnt hostentry;
> +            status = PR_GetHostByName(mSessionHost.c_str(),
> +                (char*)buffer, PR_NETDB_BUF_SIZE, &hostentry);

I think you can just declare buffer on the stack; I see PR_GetHostByName used that way in several places.

char buffer[PR_NETDB_BUF_SIZE];

That way you don't have to worry about freeing it (which you don't do here ;) ).

@@ +394,5 @@
> +            return false;
> +        }
> +        ssize_t n = PR_SendTo(
> +                  rtpSocket, buf->data(), buf->size(), 0,
> +                  &addr, PR_INTERVAL_NO_WAIT);

nit: Please reformat this so that the params start on the same line as PR_SendTo, like you did with the rtcpSocket.

@@ +1303,5 @@
>          AString mURL;
> +        PRFileDesc *mRTPSocket;
> +        PRFileDesc *mRTCPSocket;
> +        int mInterleavedRTP;
> +        int mInterleavedRTCP;

s/mInterleavedRTP/mInterleavedRTPIdx/
Same with RTCP. Otherwise, it might be mistaken for a boolean.

@@ +1418,5 @@
>  
>              request.append("Transport: RTP/AVP/TCP;interleaved=");
>              request.append(interleaveIndex);
>              request.append("-");
>              request.append(interleaveIndex + 1);

Let's be explicit here and use info->mInterleavedRTP and ~RTCP in the values appended to the request.
Attachment #8414956 - Flags: review?(sworkman) → review-
Attached patch rtsp-conn-log.patch (obsolete) — Splinter Review
Print logs for RTSP connection handler.
Hi Steve,
After some debugging, the patch finally works fine on interleaving case (by applying Ethan's patch)

Also, I found there are other places using linux socket API, so I modifying them as well.

Thanks
Attachment #8414956 - Attachment is obsolete: true
Attachment #8418574 - Flags: review?(sworkman)
Comment on attachment 8418574 [details] [diff] [review]
Bug 996535 - [RTSP] Convert linux socket to NSPR/PRFileDesc. r=sworkman

Review of attachment 8418574 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. r=me. Just some minor changes to make before landing.

::: netwerk/protocol/rtsp/rtsp/ARTPConnection.cpp
@@ +278,1 @@
>          if ((*it).mIsInjected) {

Just in case, let's insert a check for pollIndex < pollCount at the top of the loop.

if (pollIndex >= pollCount) {
  CRASH;
}

::: netwerk/protocol/rtsp/rtsp/ARTPSession.cpp
@@ +96,5 @@
>  }
>  
>  // static
> +void ARTPSession::MakeUDPSocket(PRFileDesc **s, unsigned port) {
> +    *s = PR_OpenUDPSocket(PR_AF_INET);

null check: if (!s) CRASH

@@ +109,2 @@
>  
> +    if (PR_Bind(*rtpSocket, &addr) == PR_FAILURE) {

Where is rtpSocket defined?

::: netwerk/protocol/rtsp/rtsp/RTSPConnectionHandler.h
@@ +1309,5 @@
>          AString mURL;
> +        PRFileDesc *mRTPSocket;
> +        PRFileDesc *mRTCPSocket;
> +        int mInterleavedRTPIdx;
> +        int mInterleavedRTCPIdx;

Thank you for renaming these :)
Attachment #8418574 - Flags: review?(sworkman) → review+
Patch revised. r=sworkman
Attachment #8418574 - Attachment is obsolete: true
Attachment #8419162 - Flags: review+
Keywords: checkin-needed
Attachment #8417961 - Attachment is obsolete: true
Hi John, is there a recent Try push for this patch? If you need some pointers on what to run, you can consult with the link below.
https://wiki.mozilla.org/Sheriffing/How:To:Recommended_Try_Practices
Keywords: checkin-needed
Hi, 
here is try result: https://tbpl.mozilla.org/?tree=Try&rev=57ae44604d52
looks good (with only one unrelated fail)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/7770095ccedf
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.0 S1 (9may)
John,
Good job! Thanks for your hard work to resolve this bug. :)

Hi William,
This bug does some internal refactoring related to RTSP connection/socket codes.
It is not supposed to affect any RTSP functionality.

But since we don't have automated tests on Try server yet, I would like to have your support to take care of this when you execute manual tests.
The point is to pay more attention to RTSP connection to see if any regression occurs.
I guess maybe you already have enough test cases on this part.
Just let you be aware of this change. :)
Notice this! Thanks Ethan!:)
I will pay more attention on this while I run the second fullrun.
Depends on: 1009449
blocking-b2g: backlog → ---
You need to log in before you can comment on or make changes to this bug.