Closed Bug 244177 Opened 20 years ago Closed 20 years ago

nsScanner::Append() can overwrite the storage in the buffer it allocates.

Categories

(Core :: DOM: HTML Parser, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla1.8beta1

People

(Reporter: jst, Assigned: smontagu)

References

Details

(4 keywords, Whiteboard: [sg:fix])

Attachments

(2 files)

In nsScanner::Append(const char* aBuffer, PRUint32 aLen), we're given the number
of bytes in the input, we then call:

    mUnicodeDecoder->GetMaxLength(aBuffer, aLen, &unicharBufLen);
    nsScannerString::Buffer* buffer = nsScannerString::AllocBuffer(unicharBufLen
+ 1);

And then go on to attempt to convert the given characters to unicode using the
converter for the assumed charset. If the assumed charset is wrong (i.e. UTF-16)
and the given data is single-byte data, we see that the conversion fails, and we
attempt to consume a byte at a time and convert again. Now in this case, we're
given UTF-8 (or whatever single byte-based input) and we think our max length is
the given number of bytes/2 (that's what GetMaxLength() on the UTF-16 decoder
tells us), so if we're given 30 bytes, we'll allocate storage for 15 characters,
and as long as the unicode conversion fails, we'll keep chopping of character
after character until we've gone through them all. That means we'll in this case
end up trying to store 30 characters in the buffer we allocated to hold 15
characters.
Hmm... This could be a problem with any caller that uses our Unicode conversion
routines and does error recovery (eg the converter stream).  We probably need to
audit all callsites of GetMaxLength().  :(

I wish we had sane stream-based Unicode conversion APIs with error recovery in
the converters....
It looks like this problem exists before the string branch landing too.

Do you think it is enough to simply bounds check the unichars array and bail if
there isn't enough room?  should we return an error in that case?
Severity: normal → major
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.8alpha
Could that break fallback in some non-pathological cases?  If so, we need to do
something smarter; if not, that's probably fine.
*** Bug 246450 has been marked as a duplicate of this bug. ***
Attached patch PatchSplinter Review
Here's the patch I had prepared for the dupe, which does what Darin suggests in
comment 2.

I think the "something smarter" that we should really be doing is to fix bug
197120, but in the meanwhile this would be a useful safeguard.
Group: security
Comment on attachment 150637 [details] [diff] [review]
Patch

Maybe if I ask for reviews someone will comment ;-)
Attachment #150637 - Flags: superreview?(darin)
Attachment #150637 - Flags: review?(hjtoi-bugzilla)
Target Milestone: mozilla1.8alpha1 → mozilla1.8alpha2
Flags: blocking1.7?
Flags: blocking1.7? → blocking1.7-
I've always thought that we should skip 2-bytes at a type for a 2-byte encoding
(see bug 128896 comment 22).  That could be another possible fix, although a
little more involved...
So speaking of bug 128896, and in particular bug 128896 comment 23, do we need
to fix nsConverterInputStream::Fill as well?

The original analysis is, I assume, predicated on the unicode converter
reporting failure rather than success when it's passed a length of 0.  Is that
actually the case?
So, actually, based on what unicode decoders are *supposed* to be doing, I don't
see how this bug happens, since if *aDestLength is 0 the converter should return
a success value.  The converters in nsUCS2BEToUnicode.cpp don't quite follow
that contract, though, so I could see this happening if what they saw as a
backwards BOM occurred in the input.  (And I'm also a little puzzled how
nsUTF16LEToUnicode ever works on a big-endian platform for a file beginning with
a BOM, but that's getting off-topic.)
(I reread the code, so feel free to ignore my previous parenthetical.)
This was being triggered recently by a bug (246194) in the UTF16 decoder which
has since been fixed.
Target Milestone: mozilla1.8alpha2 → mozilla1.8beta
Comment on attachment 150637 [details] [diff] [review]
Patch

r=dveditz
Let's try to get this in.
Attachment #150637 - Flags: review?(hjtoi-bugzilla)
Attachment #150637 - Flags: review+
Attachment #150637 - Flags: approval1.7.x?
Attachment #150637 - Flags: approval-aviary?
Attachment #150637 - Flags: superreview?(darin) → superreview?(dbaron)
Comment on attachment 150637 [details] [diff] [review]
Patch

sr=dbaron given that a concise version of comment 10 is added in a source
comment
Attachment #150637 - Flags: superreview?(dbaron) → superreview+
Comment on attachment 150637 [details] [diff] [review]
Patch

a=asa for branches checkin.
Attachment #150637 - Flags: approval1.7.x?
Attachment #150637 - Flags: approval1.7.x+
Attachment #150637 - Flags: approval-aviary?
Attachment #150637 - Flags: approval-aviary+
(In reply to comment #14)
> (From update of attachment 150637 [details] [diff] [review])
> sr=dbaron given that a concise version of comment 10 is added in a source
> comment
> 

I'm afraid I don't understand comment 10 well enough to add a concise version of
it, especially if it is based on the assumption in comment 8 which I don't think
is correct, though perhaps that is my misunderstanding.
Why didn't this land?
Whiteboard: [sg:fix]
Because I can't meet the conditions for sr. See comment 16.
How about:

This is only needed because some decoders don't follow the nsIUnicodeDecoder
contract by always returning NS_OK_UDEC_MOREOUTPUT when *aDestLength is 0.  See
bug 244177.
-> smontagu
Assignee: darin → smontagu
Status: ASSIGNED → NEW
I was confused by the proposed text in comment 19. Did you mean returning
NS_OK_UDEC_MOREOUTPUT was wrong, or that that is what the buggy ones are NOT
doing? I'm sure I could look up the relevant interfaces and work it out, but if
it's meant to help it should probably make sense on its own.

OK, I looked it up, and despite copious comments in nsIUnicodeDecoder.h it was
not mentioned that the decoders are supposed to return NS_OK_UDEC_MOREOUTPUT.
Maybe the @return section could be updated to not refer to the deprecated
(according to the comments near the #defines) NS_PARTIAL_MORE_OUTPUT. I can see
how implementors got confused, though, zero translation was done, not partial.

If I'm understanding this right, how about

  This is only needed because some decoders don't follow the nsIUnicodeDecoder
  contract: they return a failure when *aDestLength is 0 rather than the
  correct NS_OK_UDEC_MOREOUTPUT.  See bug 244177.

Fix checked in to trunk and 1.7 branch.
Status: NEW → RESOLVED
Closed: 20 years ago
Keywords: fixed1.7
Resolution: --- → FIXED
Here's a patch backported to 1.4 (pre-string-defragmentation).	I think this is
the correct way to do things way back when, but I'd like a sanity check.
Attachment #164230 - Flags: superreview?(bzbarsky)
Attachment #164230 - Flags: review?(bzbarsky)
Comment on attachment 164230 [details] [diff] [review]
Backported to 1.4 branch

r+sr=bzbarsky
Attachment #164230 - Flags: superreview?(bzbarsky)
Attachment #164230 - Flags: superreview+
Attachment #164230 - Flags: review?(bzbarsky)
Attachment #164230 - Flags: review+
Blocks: 248511
This was *not* fixed on the aviary branch.
Flags: blocking-aviary1.0.3?
Keywords: fixed1.7fixed1.7.5
Whiteboard: [sg:fix] → [sg:fix] needed-aviary1.0
Attachment #150637 - Flags: approval-aviary1.0.3?
Comment on attachment 150637 [details] [diff] [review]
Patch

Received a=drivers for 1.0.3
Attachment #150637 - Flags: approval-aviary1.0.3? → approval-aviary1.0.3+
Fix checked into aviary branch

/cvsroot/mozilla/htmlparser/src/Attic/nsScanner.cpp,v  <--  nsScanner.cpp
new revision: 3.125.6.8.2.1; previous revision: 3.125.6.8
Flags: blocking-aviary1.0.3? → blocking-aviary1.0.3+
Whiteboard: [sg:fix] needed-aviary1.0 → [sg:fix]
Group: security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: