nsSocketTransport::IsAlive does I/O on the main thread

NEW
Unassigned

Status

()

Core
Networking
P3
normal
4 years ago
4 months ago

People

(Reporter: briansmith, Unassigned)

Tracking

({main-thread-io, perf})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [snappy][main thread disk AND network I/O][necko-backlog])

nsSocketTransport::IsAlive can be called on the main thread. nsSocketTransport::IsAlive calls PR_Recv. PR_Recv on an SSL socket will attempt to drive the SSL handshake forward. If this happens while a Certificate handshake message is pending, then the libssl I/O layer will parse the Certificate message during the PR_Recv call. This may access the NSS certificate database on disk. Thus, main thread I/O.
now that you raise the issue, it seems also bad to do blocking disk i/o on the socket thread in the more general non-isalive case too. hadn't realized that was going on.
(In reply to Patrick McManus [:mcmanus] from comment #1)
> now that you raise the issue, it seems also bad to do blocking disk i/o on
> the socket thread in the more general non-isalive case too. hadn't realized
> that was going on.

I agree. I think that we may be able to remove the bad I/O generated from IsAlive, but within libssl there are some access to the certificate database that can cause disk reads. New libssl APIs are needed to fix that. I filed bug 951781 in NSS for this. However, I think we should find a workaround in Gecko if possible, because I suspect that this can cause severe jank due to the possibility of blocking the main thread on OCSP requests as well as disk I/O.
Keywords: perf
See Also: → bug 950240
Whiteboard: [snappy][main thread disk AND network I/O]
My understanding is that the purpose of this IsAlive() call is to determine if the socket has been closed (e.g. connection reset). So, I think one alternative solution is for IsAlive to be changed so that it navigates through the I/O layer stack until it gets to a socket I/O layer, and then doing the recv(PR_MSG_PEEK) directly on that I/O layer. Not only would that resolve this bug, but I think it would also be more correct for the intended purpose.
See Also: → bug 827264
Whiteboard: [snappy][main thread disk AND network I/O] → [snappy][main thread disk AND network I/O][necko-backlog]
You need to log in before you can comment on or make changes to this bug.