Closed Bug 570616 Opened 14 years ago Closed 14 years ago

nsNSSSocketInfo cannot deserialize when mCert was null

Categories

(Core :: Security: PSM, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
Tracking Status
fennec 2.0b1+ ---

People

(Reporter: mayhemer, Assigned: mayhemer)

References

Details

Attachments

(3 files, 2 obsolete files)

When nsNSSSocketInfo::mCert is null, WriteCompoundObject(NS_ISUPPORTS_CAST(nsIX509Cert*, mCert)) silently fails.  Nothing is written to the stream.

During read, the object stream is trying to read CID and IID of the object, nsNSSCertificate is expected present in the stream.  But the object stream actually reads the data following after the place the certificate should be serialized and the stream read chain gets broken.


Possible solutions:

1. let nsNSSSocketInfo store a flag if mCert is null or not.  However, how to recognize we are reading an older version without that flag?  Not possible because the version number is written AFTER mCert object, so it cannot be used to recognize it.

2. we may hack this somehow similarly as we did in bug 329869 (patch 3, v2) [1].  But I don't see a way to peek from the stream in this case.

3. if it is ok to be dependent on the object stream implementation, store two in-existent GUIDs to fail the object load (instantiation) while preserving the stream consistency.

4. when mCert is null, store an object of a class inheriting nsISerializable and nsIClassInfo instead. That class would be returning in-existent GUID as ClassID and writing nothing to the stream.


[1] http://hg.mozilla.org/mozilla-central/rev/4851a9d9a60c
This should have been using NS_WriteOptionalCompoundObject, if the object can be null.  :(
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Blocks: 570003
Attached patch v1 (obsolete) — Splinter Review
I cannot find a better solution then this...  I don't think we necessarily need to be compatible.  If there were no certificate stored to the serialization stream, that stream is broken anyway.

Change to nsSSLStatus::nsSSLStatus (mCipherName = "") is added because nsSSLStatus::Write() calls nsBinaryOutputStream::WriteStringZ with mCipherName.get() that returns NULL when mCipherName has never been assigned a value (it is nsXPIDLCString with flag P_VOIDED).  nsBinaryOutputStream::WriteStringZ cannot handle null strings - crashes.  Maybe rather fix the nsBinaryOutputStream class?
Attachment #450757 - Flags: review?(bzbarsky)
(In reply to comment #2)
> 
> I cannot find a better solution then this...  I don't think we necessarily need
> to be compatible.  If there were no certificate stored to the serialization
> stream, that stream is broken anyway.

What happens when you attempt to read a broken stream?
Could it crash?
(In reply to comment #3)
> What happens when you attempt to read a broken stream?

If you mean a version 2 or less of the stream without the certificate, then happens following:
- binary stream reads CID and IID (it doesn't check it really is there)
- later it fails to instantiate the class, because the CID is broken and cannot read from the stream further
- it means the stream cursor moves somewhere in the middle of the Description message string, actually quit unexpectedly

This happens even now, w/o the patch for this bug.

> Could it crash?

It could potentially read a large number as a length of a string and try to allocate huge amount of memory to store that string to.  Question is, if we could do anything about it.


Another way to fix this could be to wrap the stream to a seekable stream (we can peek from or roll back in it) and check there is a CID of nsNSSCertificate class present or not.

However, I'm not sure we ever serialize the sec info WITHOUT the certificate except the new case of carrying it between processes.
I understand we talk about objects that are serialized to disk, and will be retrieved later from disk. This includes the times when users switch from older software to newer software.

In my opinion, the potential fatal results that you described in the previous comment must be avoided at all costs, random crashes (or random out-of-memory) because of this are not acceptable.

In my opinion, we must pay for the mistake by introducing a new on-disk version. The new on-disk format should introduce a new header, that will never be produced by the old code. That way you can distinguish the old stream from the new stream and avoid reading unexpected data.
Just to note: I don't see another way that to wrap the whole nsIObjectInputStream and hook ReadID to check whether one of the newly proposed magic or the nsNSSCertificate CID or the version magic or something even older is present.
Is this bug about disk-cache?
(In reply to comment #7)
> Is this bug about disk-cache?
As well.
Yeah, the only idea I have:

The new code should write a sequence of 2-4 magic bytes, use a byte combination that seems impossible (or very unlikely) to occur as the initial bytes for old the object serialization.

Have the new code read the magic bytes. If it doesn't match, un-read the 2-4 bytes (rewind), and use a read logic compatible with the old stream.


(I wish our object serialization would use a logic that always works right, and not require the programmer to distinguish between may-be-null or never-null. It should use some sort of data scheme and versioning.)
I'm just building and testing a patch that plays with the first (m0) 32 bits part of IIDs.  It clones some of the code of the nsObjectStream class, so we will be strongly dependent on the stream implementation.  I'm actually breaking ReadObject method apart to branch on different values of stored IDs.
Attached patch v2 (obsolete) — Splinter Review
This is it.  Most ugly patch I ever wrote.
Attachment #452056 - Flags: review?(kaie)
Hmm..  maybe not to be dependent on the object stream implementation, store the cert object our self in the Write() method using, WriteID, or even Write32 etc, and mCert->Write(stream) directly.  Opinions?
Attached patch v3Splinter Review
This is what works for me and is more or less OK.  This is urgent to review, we need bug 536301 landed soon.
Attachment #450757 - Attachment is obsolete: true
Attachment #452056 - Attachment is obsolete: true
Attachment #462039 - Flags: review?(bzbarsky)
Attachment #450757 - Flags: review?(bzbarsky)
Attachment #452056 - Flags: review?(kaie)
This might make the review simpler: the two methods (Read and Write) after the patch is applied.
Comment on attachment 462039 [details] [diff] [review]
v3

r=me.

This will still cause trouble if the new format is read by an old build, right?  Anything we can do about that?
Attachment #462039 - Flags: review?(bzbarsky) → review+
We can throw away all the magic and flagging stuff, and rely only on checking  the certificate CID is there or not.  That would go a bit against comment 5, but would be simpler, compatible in both directions, and actually, we still already do this in case we read an old version w/o the certificate (in that state the sec info will never be stored to cache or session store!).
So this blocks SecurityInfo--requesting blocking-fennec.

From my read of comment 16, it sounds like we want a followup bug to fix compat with old builds?
tracking-fennec: --- → ?
tracking-fennec: ? → 2.0a1+
blocking an a1 blocker
landed on e10s:

http://hg.mozilla.org/projects/electrolysis/rev/7f0bd8be0e64
tracking-fennec: 2.0a1+ → ?
(In reply to comment #17)
> So this blocks SecurityInfo--requesting blocking-fennec.
> 
> From my read of comment 16, it sounds like we want a followup bug to fix compat
> with old builds?

No, we had to figure this out before we have landed this patch.
Does the cache have a "version"? I'm perfectly fine with simply not using an incompatible cache (in both directions).
(In reply to comment #21)
> Does the cache have a "version"? 

Not automatically, the stream is responsible for versioning explicitly.  We recently added a version number to the stream (it was missing -- a bug), but it had been put AFTER the serialization of the certificate, so we cannot decide the cert is there or not.

> I'm perfectly fine with simply not using an
> incompatible cache (in both directions).

The old code will try to read the CID and IID of the stored certificate where a magic and bool flag are stored by the new code.  The old code cannot recognize it is 'a wrong cache'.

It is quit unpredictable how the old code will handle this, and it may introduce crashes, unexpected security states after restore from session store or after navigation back and forward after we modified the content with document.write (introducing a wyciwyg channel) => leads to a security or spoofing bugs.  It would need a downgrade to older Fx version, but anyway.


I start thinking of reintroducing the very first patch of this bug.  Just update it to check there is the CID of the certificate at the very start of the stream able to rollback when it doesn't to fix case when an older version didn't store the certificate.
We need to figure this out in the next few hours, or back out secinfo. Keep us posted!
To summarize the goals:

1. we want to fix missing certificate in the current and newer version of the code/cache - i.e. from now to recognize it was stored or not
2. we want to be backward compatible - i.e. the new code must be able to read the old code and as well handle case the cert was not stored by the old code (if this ever happened)
3. we have to be able to store the data in a way the old code is able to handle it correctly or at least safely


To fix 1 we have to introduce some flag and rise the version.

To fix 2 we still must "manually" check the cert had been stored or not, i.e. check there is its CID stored but be able to rollback when it is not and bypass cert load at all

To fix 3 we have to store some kind of an object OR an non-existing CID to let the old code fail to instantiate the cert object ; the stream first reads the CID and IID and then tries to instantiate the object using that CID.  If it fails, it doesn't call object->Read(this) then, i.e. it bypasses the object read, but had already red the CID and IID although it might not be stored there at all!

So, after a chat with jst, we decided on following solution to cover all of this:

The new code write:
- when there is not a certificate, store two bogus UIDs
- otherwise the same, just rise the version number

The new code read:
- read a single byte
- check m0 of the magic is this byte
- if equals, read and check the two bogus UIDs and flag there is not any certificate stored
- else, check this single byte with m0 of nsNSSCertificate CID.m0
- if equals, check the whole CID, IID and read the certificate (not using the object stream implementation)
- else, the single byte we read is the version, and flag there is no cert stored (we read older version w/o the certificate at this moment)

The old code read [1]:
- tries to load the cert from stream by calling stream->ReadObject
- the stream implementation reads CID and IID
- then it does do_CreateInstance using the CID
- if it fails, it bypasses the object read, but has already cut the two IDs from the stream
- so, we get null on output (no certificate) ; we have to double check, but the reinterpret_cast will leave the null pointer null
- now the old code continues reading following data with the stream at the correct position

[1] http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/security/manager/ssl/src/nsNSSIOLayer.cpp&rev=1.166#595
Benjamin advised to rise the disk cache version number.  No need to be compatible at all.  This code is not used for storing anywhere else but the cache, AFAIK.

This must go to mozilla-central along with the serialization patch.
Attachment #465696 - Flags: review?(benjamin)
Attachment #465696 - Flags: review?(benjamin) → review+
It sounds like things are really messed up now.
Let's not ship any code anywhere that produces data that can cause old software versions.

If it helps, let's back it out.

If you don't have a good fix that works forwards and backwards, the new software should throw away all old caches and write to new filenames.
(In reply to comment #27)
> Let's not ship any code anywhere that produces data that can cause old software
> versions.
............ to crash
(In reply to comment #26)
> Comment on attachment 465696 [details] [diff] [review]
> update the cache version number
> 
> http://hg.mozilla.org/projects/electrolysis/rev/593b16786b35

Do this also apply to the offline cache?
Comment on attachment 462039 [details] [diff] [review]
v3

>+    nsCOMPtr<nsISerializable> serializable =
>+        do_CreateInstance(kNSSCertificateCID, &rv);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+
>+    serializable->Read(stream);
>+    mCert = reinterpret_cast<nsNSSCertificate*>(serializable.get());
Thanks for the crash... try a static_cast next time?
(In reply to comment #30)
> Thanks for the crash... try a static_cast next time?

This has been there since forever, but not from nsISerializable but from nsISupports.

Is it a bug?  Have you reported it?
Depends on: 588298
tracking-fennec: ? → 2.0b1+
(In reply to comment #31)
> (In reply to comment #30)
> > Thanks for the crash... try a static_cast next time?
> This has been there since forever, but not from nsISerializable but from
> nsISupports.
But fortunately the nsISupports you received must have been the one for the first entry in the vtable, and thus safe to reinterpret.

> Is it a bug?  Have you reported it?
I can't remember the STR. I just patched it locally to avoid annoying myself.
(In reply to comment #31)
> Is it a bug?
Of course, it is a bug in the sense of "it is obviously wrong to reinterpret an nsISerializable* as an nsNSSCertificate*".

The bug is easier to avoid if you don't use XPCOM, of course:
nsRefPtr<nsNSSCertificate> cert = new nsNSSCertificate();
cert->Read(stream);
mCert = cert.forget();
Anyway, it looks like bug 588884 was filed for that.
What's the status on this bug? The m-c patch just bumped the cache version number.
This has to closed.  All patches has been landed and there is nothing more to do.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: