Closed Bug 1240168 Opened 8 years ago Closed 8 years ago

psm test assumed blocking semantics from main thread

Categories

(Core :: Security: PSM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: mcmanus, Assigned: mcmanus)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

hey david - we talked about auditing whether all tls was done on the
socket thread.. I expanded that question to see whether all socket
access was done from the socket thread.. I did it with a simple nspr
io layer that asserted the socket thread. I'll put that in a bug when
it runs cleanly - for now I'm picking through the fails.

this one is a psm test that was doing io from the main thread because
openoutputstream was giving direct access to the FD. I changed that
(again, in another patch not uploaded yet) to put a pipe in the middle
of the socket thread and the main thread which is the way all the
normal channel code works (this test isn't using the channel
code).. that fixed the assert but the test failed.

The test failed because it was assuming it was going to get a fully
synchronous write() semantic on the stream - so when it went to write
bytes to a server that was expected to fail (e.g. no_cipher_overlap)
it expected a synchronous error.. but now that the pipe was in between
that didn't happen.

I personally don't think nsITransport.idl offers a synchronous
interface.. it offers some semantics around blocking on full pipes,
but that's not the same thing. And I think all you could check was the
handshake that way anyhow (because the write handler had been
invoked)- it didn't really offer any semantics around the bytes that
were being written.

So I updated the test to simply allow the anticipated error in either
the read or the write handler. Hope that's ok.

This won't change the implementation for any channel users at all.
Comment on attachment 8708534 [details] [diff] [review]
test assumed blocking semantics from main thread

Review of attachment 8708534 [details] [diff] [review]:
-----------------------------------------------------------------

Ok - seems reasonable.

::: security/manager/ssl/tests/unit/test_weak_crypto.js
@@ +131,1 @@
>          deferred.reject(e);

nit: indentation

@@ +131,2 @@
>          deferred.reject(e);
> +       }

nit: I think this brace needs to be indented one more space
Attachment #8708534 - Flags: review?(dkeeler) → review+
hmm.. https://treeherder.mozilla.org/#/jobs?repo=try&revision=4dd10cb7bcf2&selectedJob=15543185

that try run combines this patch with the one that moves the IO off the main thread by putting a pipe between the main thread and the socket thread (the same way the https channels work). I think we all agree we don't want the IO on the main thread.

the test fails, inconsistently due to timing, on the case that is checking for NET_RESET intolerance fallback. It gets no_cipher_overlap instead.

I added some diags and psm does indeed still generate RESET - but it does so, by design according to the comments, exactly once.. after that it uses the underlying no_cipher_overlap error. So in the case the test fails the RESET has been absorbed by the async write, not yet visible to the main thread, and the available() call gets no_cipher_overlap.

This all works in an https channel because the http code and the tls code are all running on the same thread - so it does see the synchronous error and generates a retry without ever notifying the channel code (running mostly on the main thread).

I'm not 100% sure what to do here - one suggestion would be to just make the reset subtests into ones that use https channels.. that's a little more of a blackbox than the current unit test, but it would test the semantic. We could add an attribute to nsIHttpChannelInternal to indicate the number of retries the channel did to strengthen the check.

any other suggestions are even uglier.
Flags: needinfo?(dkeeler)
Blocks: 1240569, 1240568
(In reply to Patrick McManus [:mcmanus] from comment #4)
> the test fails, inconsistently due to timing, on the case that is checking
> for NET_RESET intolerance fallback. It gets no_cipher_overlap instead.

I'm not sure the part that tests that there is a NET_RESET before a NS_OK is really useful.

> I'm not 100% sure what to do here - one suggestion would be to just make the
> reset subtests into ones that use https channels.. that's a little more of a
> blackbox than the current unit test, but it would test the semantic.

Yeah, I think all we really need to be testing here is the end result. That is, if RC4 fallback isn't enabled for an RC4-only server, the result is no_cypher_overlap. If it is enabled, the result is a successful connection.

> We
> could add an attribute to nsIHttpChannelInternal to indicate the number of
> retries the channel did to strengthen the check.

That might be nice if it isn't too much work, but again I'm not sure it's necessary.
Flags: needinfo?(dkeeler)
so this version let's the intolerance iteration be either NET_RESET or no-cipher-overlap and then checks for OK on the next iteration as usual

 https://treeherder.mozilla.org/#/jobs?repo=try&revision=922a62a88d2a
(In reply to Patrick McManus [:mcmanus] from comment #8)
> so this version let's the intolerance iteration be either NET_RESET or
> no-cipher-overlap and then checks for OK on the next iteration as usual

I don't think this is correct. We have to check a non-NET_RESET error to make sure that the insecure fallback logic works. This change will make the test effectively useless.
> We have to check a non-NET_RESET error

Sorry, it's opposite. We have to check a NET_RESET error.
(In reply to Masatoshi Kimura [:emk] from comment #10)
> > We have to check a non-NET_RESET error
> 
> Sorry, it's opposite. We have to check a NET_RESET error.

see comment 5.

The patch changes where we checked for net_reset to allow net_reset or no-overlap but the next test repeats the same conditions and checks for OK (and therefore that there was an intolerance fallback performed). In short the test still checks that that an error created the intolerance fallback, it just is a little less picky about verifying which code that error was.

and that's because the net_reset simply isn't exposed reliably to the js in the main thread when psm is accessed asynchronously. (and its a pretty big bug that js isn't forced to do asynchronous networking - that's what this change blocks.) its trying to asynchronously test a semantic only necko can synchronously test - so a bit of an integration test is used instead.
Comment on attachment 8711831 [details] [diff] [review]
weak_crypto test assumed blocking semantics from main thread

Review of attachment 8711831 [details] [diff] [review]:
-----------------------------------------------------------------

This seems reasonable.
Attachment #8711831 - Flags: review?(dkeeler) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/5fdfe83cadc1db6d0940409e90cd70a62ab4122b
Bug 1240168 - weak_crypto test assumed blocking semantics from main thread r=keeler
https://hg.mozilla.org/mozilla-central/rev/5fdfe83cadc1
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: