Closed Bug 531188 Opened 15 years ago Closed 15 years ago

Decompression failure with https://livechat.merlin.pl/

Categories

(NSS :: Libraries, defect, P2)

3.12.5
x86
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED
3.12.6

People

(Reporter: wtc, Assigned: agl)

Details

Attachments

(2 files, 3 obsolete files)

      No description provided.
While debugging Chrome issue http://crbug.com/28570, I found that
we're getting a decompression failure when talking to https://livechat.merlin.pl/.

But there are two bugs in the code that need to be fixed first
before you can investigate the decompression failure.

1. There is an unmatched
      ssl_ReleaseSpecReadLock(ss);

Unless we need to acquire the spec read lock at that point, the
fix is to simply remove the ssl_ReleaseSpecReadLock(ss) call.

2. The ssl_MapLowLevelError(SSL_ERROR_DECOMPRESSION_FAILURE)
call is wrong because crSpec->decompressor doesn't set an
error code when it fails, so the error code is stale.  In
my debugging, the stale error code is -5998 (PR_WOULD_BLOCK_ERROR),
so ssl_MapLowLevelError leaves it unchanged.

Either crSpec->decompressor needs to set an error code, or
we should just call PORT_SetError to set SSL_ERROR_DECOMPRESSION_FAILURE
directly here.  This patch does the latter, but I think the former
might be better.

Adam, could you please review this patch and also investigate
the decompression failure?  Thanks.
Attachment #414586 - Flags: review?(agl)
If you get a certificate revoked error in Chrome, please apply
the NSS patch in bug 515279, or turn off certificate revocation
checking.
Attachment #414586 - Flags: review?(agl) → review+
Comment on attachment 414586 [details] [diff] [review]
Remove unlock, set error code directly

Nelson, please see comment 1 for a description of this patch.
Would you prefer that I continue to use ssl_MapLowLevelError
and set the error code in crSpec->decompressor instead?
Attachment #414586 - Flags: superreview?(nelson)
(due to logistical reasons, I've lost my development box until Monday at the earliest so I won't be able to look at the decompression error until then.)
Wan-Teh, An error code should definitely be set in ssl3_DeflateDecompress.
Whether you set some generic error there, and then use ssl_MapLowLevelError 
to set error SSL_ERROR_DECOMPRESSION_FAILURE or just set error code 
SSL_ERROR_DECOMPRESSION_FAILURE in ssl3_DeflateDecompress is a matter of 
taste.  I don't have a strong preference.
Comment on attachment 414586 [details] [diff] [review]
Remove unlock, set error code directly

I'll give this SR+ because it's OK as is.
Attachment #414586 - Flags: superreview?(nelson) → superreview+
I decided to set an error code in ssl3_DeflateDecompress and
keep the ssl_MapLowLevelError call.

I checked in this patch on the NSS trunk (NSS 3.12.6).

Checking in ssl3con.c;
/cvsroot/mozilla/security/nss/lib/ssl/ssl3con.c,v  <--  ssl3con.c
new revision: 1.125; previous revision: 1.124
done
Attachment #414586 - Attachment is obsolete: true
Attachment #414622 - Flags: review?(nelson)
We should add a new error code for compression failure in
ssl3_DeflateCompress.  ssl3_DeflateInit and ssl3_DeflateInit
should also set an error code on failure.
Attachment #414622 - Flags: review?(nelson) → review+
Comment on attachment 414622 [details] [diff] [review]
Remove unlock, set error code in ssl3_DeflateDecompress (checked in)

r=nelson
zlib returns Z_BUF_ERROR is there's nothing to do (the avail_in is 0). We were previously triggering this and the zlib error was considered fatal.
Attachment #415281 - Flags: review?
Comment on attachment 415281 [details] [diff] [review]
fix cases where we use zlib with 0-length inputs.

r=wtc, with the change to declare z_stream* context first
in the functions for pre-C99 compilers.

Should we also check for Z_BUF_ERROR and handle it as a
non-fatal error?  See http://www.zlib.net/zlib_faq.html#faq05.
Attachment #415281 - Flags: review? → review+
Fixed patch. (And I messed up last time and included the git diff, not the CVS one.)

I don't think that we should handle Z_BUF_ERROR directly. The other case where it can occur is if we run out of output space. However, we should always have enough output space so I'd rather know about it in that case.
Attachment #415293 - Flags: review?
Is there some command line flag we can give to gcc to tell it to NOT allow
c99 specific features, but generate compile time errors instead?  It would 
surely help if gcc was working for us, rather than against us, on this matter.
Comment on attachment 415293 [details] [diff] [review]
move the conditional down to avoid breaking non-C99 compilers.

r=nelson
Attachment #415293 - Flags: review? → review+
Hmm.  The page Adam cited says that -std=c89 is equivalent to -ansi.  
But the Makefile for linux already specifies -ansi.  See line 137 of
security/coreconf/Linux.mk which begins

OS_CFLAGS               = $(DSO_CFLAGS) $(OS_REL_CFLAGS) $(ARCHFLAG) -ansi
Attachment #415281 - Attachment is obsolete: true
I fixed some style nits in Adam's patch and checked it in
on the NSS trunk (NSS 3.12.6).

Checking in ssl3con.c;
/cvsroot/mozilla/security/nss/lib/ssl/ssl3con.c,v  <--  ssl3con.c
new revision: 1.126; previous revision: 1.125
done
Attachment #415293 - Attachment is obsolete: true
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
I verified Adam's fix with Chromium on Linux.

Is it normal to send a 0-length SSL application data record?
The 0-length application data record seems to be the first
record of the server's response (consisting of two records
to our HTTP GET request:

<-- [
(202 bytes of 32, with 165 left over)
SSLRecord { [Tue Dec  1 10:19:01 2009]
   0: 17 03 01 00  20                                     | .... 
   type    = 23 (application_data)
   version = { 3,1 }
   length  = 32 (0x20)
            < encrypted >
}
(202 bytes of 160)
SSLRecord { [Tue Dec  1 10:19:01 2009]
   0: 17 03 01 00  a0                                     | .....
   type    = 23 (application_data)
   version = { 3,1 }
   length  = 160 (0xa0)
            < encrypted >
}
]
Status: RESOLVED → VERIFIED
So, this server was sending us EMPTY application data records !?!

RFC 2246 doesn't seem to disallow that, but I'd sure consider it a bug if
NSS sent such a record.  Maybe we should report this to livechat.merlin.pl
Do we know whose TLS library they use?
http://www.livechatinc.com/en/ seems to be the company. Their ticket page goes to a 404, so I wouldn't hold out much hope.
I added some printf statements to print the application data
records we receive from the server.  The printf output confirmed
what I speculated based on the SSL packet trace in comment 18.

The first application data record is 0-length.

The second application data record is the HTTP response:
HTTP/1.1 404 Not Found
Content-type: text/plain
Content-Length: 45
Connection: close

LIVECHAT Server 5.5.15 SQLAPI: File not found

So 0-length application data records allowed by the SSL protocol.
Perhaps there is a server bug that calls the SSL write function
with a 0-length buffer, and the SSL library they use faithfully
sends a 0-length application data record.
To be precise, the first application data record is 0-length,
after removing the block cipher padding (11 + 1 bytes) and the
MAC (20 bytes).  The original record length is 32 bytes.
I can replicate the problem with lighttpd+OpenSSL, so this quirk appears to be pretty common.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: