Closed Bug 732951 Opened 12 years ago Closed 12 years ago

EnsureMutable() returns true (success) even when it failed due to OOM

Categories

(Core :: XPCOM, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla14
Tracking Status
firefox10 - wontfix
firefox11 - wontfix
firefox12 + fixed
firefox13 + fixed
firefox14 + fixed
firefox-esr10 12+ fixed
status1.9.2 --- wanted

People

(Reporter: decoder, Assigned: MatsPalmgren_bugz)

References

Details

(Keywords: crash, Whiteboard: [sg:critical][qa-])

Crash Data

Attachments

(3 files, 2 obsolete files)

Tested on m-c revision 8ea5c983743f: I don't know exactly what's wrong here, but the OOM condition as indicated by the allocation trace below (after crash trace), led to this crash which looks potentially dangerous:


Program received signal SIGSEGV, Segmentation fault.
nsContentUtils::ASCIIToLower (aStr=...) at /srv/repos/browser/mozilla-central/content/base/src/nsContentUtils.cpp:5042
5042        PRUnichar c = *iter;
(gdb) bt 8
#0  nsContentUtils::ASCIIToLower (aStr=...) at /srv/repos/browser/mozilla-central/content/base/src/nsContentUtils.cpp:5042
#1  0x00002aaaac77f6b0 in RuleHash_CIHashKey (table=<optimized out>, key=<optimized out>) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:166
#2  0x00002aaaad249529 in PL_DHashTableOperate (table=0x2bde9d8, key=0x22bae90, op=PL_DHASH_ADD)
    at /srv/repos/browser/mozilla-central/objdir-ff-gcc64dbg/xpcom/build/pldhash.cpp:615
#3  0x00002aaaac780659 in RuleHash::AppendRuleToTable (this=0x2bde9a0, aTable=<optimized out>, aKey=<optimized out>, aRuleInfo=...)
    at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:566
#4  0x00002aaaac780745 in RuleHash::AppendRule (this=0x2bde9a0, aRuleInfo=...) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:610
#5  0x00002aaaac783648 in AddRule (aCascade=0x2bde9a0, aRuleInfo=0x2446cf0) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:2808
#6  RefreshRuleCascade (aPresContext=<optimized out>, this=0x27ef130) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:3148
#7  nsCSSRuleProcessor::RefreshRuleCascade (this=0x27ef130, aPresContext=<optimized out>) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:3095
(gdb) l
5037    nsContentUtils::ASCIIToLower(nsAString& aStr)
5038    {
5039      PRUnichar* iter = aStr.BeginWriting();
5040      PRUnichar* end = aStr.EndWriting();
5041      while (iter != end) {
5042        PRUnichar c = *iter;
5043        if (c >= 'A' && c <= 'Z') {
5044          *iter = c + ('a' - 'A');
5045        }
5046        ++iter;
(gdb) p iter
$1 = (PRUnichar *) 0x2aaaae4b3000
(gdb) p *iter
Cannot access memory at address 0x2aaaae4b3000
(gdb) p end        
$2 = (PRUnichar *) 0x7fffffffa4a0

The backtrace of the failing allocation is as follows:

#0 /srv/repos/browser/mozilla-central/objdir-ff-gcc64dbg/dist/bin/libmozalloc.so(moz_malloc+0x5f)
#1 nsStringBuffer::Alloc(unsigned long) at xpcom/string/src/nsSubstring.cpp:210
#2 nsAString_internal::MutatePrep(unsigned int, unsigned short**, unsigned int*) at xpcom/string/src/nsTSubstring.cpp:163
#3 nsAString_internal::ReplacePrepInternal(unsigned int, unsigned int, unsigned int, unsigned int) at xpcom/string/src/nsTSubstring.cpp:199
#4 nsAString_internal::Assign(unsigned short const*, unsigned int) at xpcom/string/src/nsTSubstring.cpp:335
#5 nsAString_internal::Assign(unsigned short const*, unsigned int) at xpcom/string/src/nsTSubstring.cpp:331
#6 nsAString_internal::EnsureMutable(unsigned int) at xpcom/string/src/nsTSubstring.cpp:295
#7 nsAString_internal::BeginWriting() at objdir-ff-gcc64dbg/content/base/src/../../../dist/include/nsTSubstring.h:159
#8 nsContentUtils::ASCIIToLower(nsAString_internal&) at content/base/src/nsContentUtils.cpp:5040
#9 RuleHash_CIHashKey at layout/style/nsCSSRuleProcessor.cpp:167
#10 PL_DHashTableOperate at objdir-ff-gcc64dbg/xpcom/build/pldhash.cpp:616
#11 AddSelector at layout/style/nsCSSRuleProcessor.cpp:2742
#12 AddRule at layout/style/nsCSSRuleProcessor.cpp:2848
#13 nsCSSRuleProcessor::GetRuleCascade(nsPresContext*) at layout/style/nsCSSRuleProcessor.cpp:3089
#14 nsCSSRuleProcessor::RulesMatching(AnonBoxRuleProcessorData*) at layout/style/nsCSSRuleProcessor.cpp:2320
#15 EnumRulesMatching<AnonBoxRuleProcessorData> at layout/style/nsStyleSet.cpp:479


If you need longer traces, more GDB information or a reproduction of this, let me know. Guessing Layout as component, but not sure about that of course.
nsStringBuffer::Alloc returns NULL if the malloc fails
 nsAString_internal::MutatePrep returns false
  nsAString_internal::ReplacePrepInternal returns false
  nsAString_internal::ReplacePrep (inlined) returns false
   nsAString_internal::Assign @335 is a NOP
    nsAString_internal::Assign @331 (nsTSubstring.cpp):

   315  void
   316  nsTSubstring_CharT::Assign( const char_type* data, size_type length )
   317    {
   318        // unfortunately, some callers pass null :-(
   319      if (!data)
   320        {
   321          Truncate();
   322          return;
   323        }
   324  
   325      if (length == size_type(-1))
   326        length = char_traits::length(data);
   327  
   328      if (IsDependentOn(data, data + length))
   329        {
   330          // take advantage of sharing here...
   331          Assign(string_type(data, length));
   332          return;
   333        }
   334  
   335      if (ReplacePrep(0, mLength, length))
   336        char_traits::copy(mData, data, length);
   337    }


Line 331 creates a temporary nsTString_CharT (nsTString.h):

    70        nsTString_CharT( const char_type* data,
                               size_type length = size_type(-1) )
    71          : substring_type()
    72          {
    73            Assign(data, length);
    74          }

and substring_type() is (nsTSubstring.h):

   620        nsTSubstring_CharT()
   621          : mData(char_traits::sEmptyBuffer),
   622            mLength(0),
   623            mFlags(F_TERMINATED) {}

since the Assign call on line 73 is a NOP this temp has the above values.
Then, on line 331 again, we Assign from the temp:

   364  nsTSubstring_CharT::Assign( const self_type& str )
   365    {
   366      // |str| could be sharable.  we need to check its flags to know how to
   367      // deal with it.
   368  
   369      if (&str == this)
   370        return;
   371  
   372      if (!str.mLength)
   373        {
   374          Truncate();
   375          mFlags |= str.mFlags & F_VOIDED;
   376        }
            ...

since str.mLength is zero we Truncate, aka SetCapacity(0):

   533  nsTSubstring_CharT::SetCapacity( size_type capacity )
   534    {
   535      // capacity does not include room for the terminating null char
   536  
   537      // if our capacity is reduced to zero, then free our buffer.
   538      if (capacity == 0)
   539        {
   540          ::ReleaseData(mData, mFlags);
   541          mData = char_traits::sEmptyBuffer;
   542          mLength = 0;
   543          SetDataFlags(F_TERMINATED);
   544        }
   545      else
            ...

assigning mData!

     nsAString_internal::EnsureMutable :

   292          // promote to a shared string buffer
   293          char_type* prevData = mData;
   294          Assign(mData, mLength);
   295          return mData != prevData;

EnsureMutable returns true!


      char_iterator BeginWriting()
        {
          return EnsureMutable() ? mData : char_iterator(0);
        }


=> we will write to char_traits::sEmptyBuffer !!!
Component: Layout → XPCOM
QA Contact: layout → xpcom
Christian, if my theory above is correct, the value of 'iter' should be
the same as &gNullChar or thereabout, can you confirm?
(In reply to Mats Palmgren [:mats] from comment #2)
> Christian, if my theory above is correct, the value of 'iter' should be
> the same as &gNullChar or thereabout, can you confirm?

Like this?

(gdb) p &gNullChar
$1 = (PRUnichar *) 0x2aaaae464cb8
(gdb) p iter
$2 = (PRUnichar *) 0x2aaaae4b3000

If you need anything else, let me know :)
Yes, thanks.  So 0x2aaaae4b3000 - 0x2aaaae464cb8 = 320328 (decimal),
which seems to support my theory.

Do you do your own builds?  if so, can you try saving the start value for 'iter',
like so:

 /* static */
+PRUnichar* start;
 void
 nsContentUtils::ASCIIToLower(nsAString& aStr)
 {
   PRUnichar* iter = aStr.BeginWriting();
   PRUnichar* end = aStr.EndWriting();
+  start = iter;
   while (iter != end) {
...

then the values of 'start' and &gNullChar should be the same.
Attached patch Like so? (obsolete) — — Splinter Review
Perhaps would could mark the string as F_VOIDED when MutatePrep fails
and have EnsureMutable() check for that instead.
Also, does making gNullChar const put it in a non-writable section
on some platforms? (should cause a safer crash)
(In reply to Mats Palmgren [:mats] from comment #4)
> then the values of 'start' and &gNullChar should be the same.

Confirmed:

(gdb) p start
$2 = (PRUnichar *) 0x2aaaae464cc8
(gdb) p &gNullChar
$3 = (PRUnichar *) 0x2aaaae464cc8

:) I'll try your patch now.
With the patch from comment 5 I get a crash due to iter being NULL now:

Program received signal SIGSEGV, Segmentation fault.
nsContentUtils::ASCIIToLower (aStr=...) at /srv/repos/browser/mozilla-central/content/base/src/nsContentUtils.cpp:5042
5042        PRUnichar c = *iter;
(gdb) p iter
$1 = (PRUnichar *) 0x0

If this is intended, then I would like to suggest to use the mozmalloc_abort function instead of crashing here, to keep crash stats consistent and allow automated detection of unhandled OOM bugs (such as my tool here does).
Yes, 0x0 is expected, that's what BeginWriting()/EndWriting() returns when
EnsureMutable() fails.  I'm not sure if we can make them infallible.
I think [sg:critical] is warranted given the large number of consumers that indirectly
depends on a correct return value of ns*String::EnsureMutable().
Assignee: nobody → matspal
Whiteboard: [sg:critical]
> If this is intended, then I would like to suggest to use the mozmalloc_abort

I'm not going to do that in this bug, because there's too many consumers
of BeginWriting/EndWriting which would become infallible.  And I think
it would be confusing to have some string methods be infallible while
others are fallible.

I'm in favor of making strings infallible eventually though, but not here
and now.
Summary: OOM Crash [@ nsContentUtils::ASCIIToLower] → EnsureMutable() returns true (success) even when it failed due to OOM
(In reply to Mats Palmgren [:mats] from comment #10)
> I'm not going to do that in this bug, because there's too many consumers
> of BeginWriting/EndWriting which would become infallible.  And I think
> it would be confusing to have some string methods be infallible while
> others are fallible.
> 
> I'm in favor of making strings infallible eventually though, but not here
> and now.

I didn't actually mean to make strings infallible in general (although that would of course be good). What I meant is that if we crash in this particular piece of code anyway guaranteed on OOM (in ASCIIToLower), can we check the pointer and abort instead?
Oh, I misunderstood you then.  I haven't actually looked at who calls
nsContentUtils::ASCIIToLower and if we can make that infallible.  I guess
I morphed this bug into the more general problem.  I'll make a separate patch
for ASCIIToLower...
Attached patch fix (obsolete) — — Splinter Review
Now using SetIsVoid(true) instead of just adding the F_VOIDED bit --
to maintain the invariant that mData == sEmptyBuffer:
http://mxr.mozilla.org/mozilla-central/source/xpcom/string/public/nsTSubstring.h#795

Try results pending:
https://tbpl.mozilla.org/?usebuildbot=1&tree=Try&rev=fe31ec4a4c33
Attachment #603410 - Flags: review?(bzbarsky)
Attachment #603088 - Attachment is obsolete: true
Comment on attachment 603410 [details] [diff] [review]
fix

I'd really like someone who knows the string code to review this....
Attachment #603410 - Flags: review?(bzbarsky) → review?(benjamin)
I don't think I understand yet why we should mark the string as void if mutateprep fails. 
Void strings have a particular meaning for the DOM which is different from a failure case. I would expect that if mutateprep fails, we should leave the string unchanged (and return false). Can we not just unwind by checking the return value of mutateprep?

I definitely think we ought to end up making string infallible by default and requiring explicit fallible_t calls in the cases where we can commonly expect incoming buffers to cause DOS-style crashes.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> I don't think I understand yet why we should mark the string as void if
> mutateprep fails. 

I needed some state to signal that the Assign inside the ctor failed.
Since we currently truncate the original string on OOM anyway, I figured
marking it Void would be acceptable.

> if mutateprep fails, we should leave the string unchanged

I agree.

> Can we not just unwind by checking the return value of mutateprep?

I guess I could invent a new flag to signal the failed Assign...
But I don't see why we need to use that, I think SetLength should work.
Assign(mData, mLength) is really slow too -- it *always* allocates
a temporary string, copies the characters there, then Mutateprep
the original (allocating again in this case), copies the characters
back, then deallocates the temp string.

Also, making sEmptyBuffer 'static const' didn't really work since
some consumers actually overwrite the terminating NUL.
(It did crash on Linux64, I just missed the orange on Try)
Attached patch fix, v2 — — Splinter Review
I think we can use SetLength() in this case too.  As a bonus, it's
faster because it just allocates a new StringBuffer and copies the
characters to that, avoiding an extra malloc/free and copying the
characters twice.
Attachment #603410 - Attachment is obsolete: true
Attachment #604114 - Flags: review?(benjamin)
Attachment #603410 - Flags: review?(benjamin)
Attachment #604117 - Flags: review?(bzbarsky)
Comment on attachment 604114 [details] [diff] [review]
fix, v2

... and this leaves the string unchanged if the allocation fails.
(In reply to Mats Palmgren [:mats] from comment #16)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> > I don't think I understand yet why we should mark the string as void if
> > mutateprep fails. 
> 
> I needed some state to signal that the Assign inside the ctor failed.
> Since we currently truncate the original string on OOM anyway, I figured
> marking it Void would be acceptable.

I don't like the idea of using Void for this.  Void-ness should be as opaque as possible to the string library -- it's a hack designed to support the DOM string-or-null concept, and otherwise shouldn't be used.  As far as the string library is concerned you should think of "void" as having as much meaning as "purple".
OK, the new patch doesn't use it.
bsmedberg: If we change the string code signatures what binary components break? Do people link against the Firefox copy of strings, or do they compile in their own copy of the string library through the SDK? Since the ESR is affected we need to figure out the impact.
This doesn't affect the external string API at all, so I don't think this affects anything.
Comment on attachment 604115 [details] [diff] [review]
Make nsContentUtils::ASCIIToLower/ASCIIToUpper return NS_ERROR_OUT_OF_MEMORY when string allocation fails

r=me
Attachment #604115 - Flags: review?(bzbarsky) → review+
Comment on attachment 604117 [details] [diff] [review]
Propagate nsContentUtils::ASCIIToLower/ASCIIToUpper error

r=me
Attachment #604117 - Flags: review?(bzbarsky) → review+
Comment on attachment 604114 [details] [diff] [review]
fix, v2

I assume that we currently cannot make SetLength be NS_WARN_UNUSED_RESULT because large parts of the tree would fail to compile? Please make sure there is a followup to make an infallible version of this API and make the fallible version use NS_WARN_UNUSED_RESULT.

In nsTSubstring_CharT::SetLength I think it would be more readable to switch the conditionals:

if (!SetCapacity(length))
  return false;

mLength = length;
return true;

r=me with that note & change
Attachment #604114 - Flags: review?(benjamin) → review+
[Triage Comment]

Now that this has been on central for a bit can someone set a? on the patches for ESR landing? Please nominate if this is not showing any regressions.
Comment on attachment 604114 [details] [diff] [review]
fix, v2

[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: crash, some risk of being exploitable
Testing completed (on m-c, etc.): baked on trunk since 2012-03-21
Risk to taking this patch (and alternatives if risky): medium risk, no alternatives
String changes made by this patch: none
Attachment #604114 - Flags: approval-mozilla-esr10?
Attachment #604114 - Flags: approval-mozilla-beta?
Attachment #604114 - Flags: approval-mozilla-aurora?
Attachment #604115 - Flags: approval-mozilla-esr10?
Attachment #604115 - Flags: approval-mozilla-beta?
Attachment #604115 - Flags: approval-mozilla-aurora?
Attachment #604117 - Flags: approval-mozilla-esr10?
Attachment #604117 - Flags: approval-mozilla-beta?
Attachment #604117 - Flags: approval-mozilla-aurora?
(In reply to Mats Palmgren [:mats] from comment #32)
> Risk to taking this patch (and alternatives if risky): medium risk, no
> alternatives

What kind of fallout would we be on the lookout for if approved for FF12? Would this show up as a crash, web regression, or something else entirely?
Comment on attachment 604114 [details] [diff] [review]
fix, v2

[Triage Comment]
Approving for all branches given this is sg:crit and there's still 2 weeks before release to hopefully shake out any fallout.
Attachment #604114 - Flags: approval-mozilla-esr10?
Attachment #604114 - Flags: approval-mozilla-esr10+
Attachment #604114 - Flags: approval-mozilla-beta?
Attachment #604114 - Flags: approval-mozilla-beta+
Attachment #604114 - Flags: approval-mozilla-aurora?
Attachment #604114 - Flags: approval-mozilla-aurora+
Attachment #604115 - Flags: approval-mozilla-esr10?
Attachment #604115 - Flags: approval-mozilla-esr10+
Attachment #604115 - Flags: approval-mozilla-beta?
Attachment #604115 - Flags: approval-mozilla-beta+
Attachment #604115 - Flags: approval-mozilla-aurora?
Attachment #604115 - Flags: approval-mozilla-aurora+
Attachment #604117 - Flags: approval-mozilla-esr10?
Attachment #604117 - Flags: approval-mozilla-esr10+
Attachment #604117 - Flags: approval-mozilla-beta?
Attachment #604117 - Flags: approval-mozilla-beta+
Attachment #604117 - Flags: approval-mozilla-aurora?
Attachment #604117 - Flags: approval-mozilla-aurora+
Is this something QA is able to verify?
Whiteboard: [sg:critical] → [sg:critical][qa?]
> Is this something QA is able to verify?

I think you would need the Reporter's OOM-testing tool for that.
Christian, would you mind testing this to confirm it's fixed in Firefox 12 Beta, 13 Aurora, 14 Nightly and latest-esr10 nightly?
Whiteboard: [sg:critical][qa?] → [sg:critical][qa-]
Is it an expected side effect that e.g. the following returns a pointer to char_traits::sEmptyBuffer (which wasn't the case before):
  nsCAutoString a;
  nsCAutoString b(a);
  char *c = b.BeginWriting();

It's not entirely a problem because bug 504660 removed the constness of sEmptyBuffer, but that seems dangerous...
Yes, that is expected.
(In reply to Mats Palmgren [:mats] from comment #41)
> Yes, that is expected.

But is it right? Doesn't it get you back to comment 1 scenario.
(In reply to Mike Hommey [:glandium] from comment #42)
> (In reply to Mats Palmgren [:mats] from comment #41)
> > Yes, that is expected.
> 
> But is it right? Doesn't it get you back to comment 1 scenario.

Forget it, I see what's happening.
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: