Last Comment Bug 732951 - EnsureMutable() returns true (success) even when it failed due to OOM
: EnsureMutable() returns true (success) even when it failed due to OOM
Status: RESOLVED FIXED
[sg:critical][qa-]
: crash
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Trunk
: x86_64 Linux
: -- critical (vote)
: mozilla14
Assigned To: Mats Palmgren (vacation)
:
Mentors:
Depends on:
Blocks: 687256
  Show dependency treegraph
 
Reported: 2012-03-05 06:40 PST by Christian Holler (:decoder)
Modified: 2012-05-18 13:25 PDT (History)
14 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
wontfix
-
wontfix
+
fixed
+
fixed
+
fixed
12+
fixed
wanted


Attachments
Like so? (3.80 KB, patch)
2012-03-05 15:57 PST, Mats Palmgren (vacation)
no flags Details | Diff | Splinter Review
fix (3.80 KB, patch)
2012-03-06 12:28 PST, Mats Palmgren (vacation)
no flags Details | Diff | Splinter Review
fix, v2 (2.54 KB, patch)
2012-03-08 10:29 PST, Mats Palmgren (vacation)
benjamin: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
akeybl: approval‑mozilla‑esr10+
Details | Diff | Splinter Review
Make nsContentUtils::ASCIIToLower/ASCIIToUpper return NS_ERROR_OUT_OF_MEMORY when string allocation fails (4.42 KB, patch)
2012-03-08 10:32 PST, Mats Palmgren (vacation)
bzbarsky: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
akeybl: approval‑mozilla‑esr10+
Details | Diff | Splinter Review
Propagate nsContentUtils::ASCIIToLower/ASCIIToUpper error (4.45 KB, patch)
2012-03-08 10:32 PST, Mats Palmgren (vacation)
bzbarsky: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
akeybl: approval‑mozilla‑esr10+
Details | Diff | Splinter Review

Description Christian Holler (:decoder) 2012-03-05 06:40:18 PST
Tested on m-c revision 8ea5c983743f: I don't know exactly what's wrong here, but the OOM condition as indicated by the allocation trace below (after crash trace), led to this crash which looks potentially dangerous:


Program received signal SIGSEGV, Segmentation fault.
nsContentUtils::ASCIIToLower (aStr=...) at /srv/repos/browser/mozilla-central/content/base/src/nsContentUtils.cpp:5042
5042        PRUnichar c = *iter;
(gdb) bt 8
#0  nsContentUtils::ASCIIToLower (aStr=...) at /srv/repos/browser/mozilla-central/content/base/src/nsContentUtils.cpp:5042
#1  0x00002aaaac77f6b0 in RuleHash_CIHashKey (table=<optimized out>, key=<optimized out>) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:166
#2  0x00002aaaad249529 in PL_DHashTableOperate (table=0x2bde9d8, key=0x22bae90, op=PL_DHASH_ADD)
    at /srv/repos/browser/mozilla-central/objdir-ff-gcc64dbg/xpcom/build/pldhash.cpp:615
#3  0x00002aaaac780659 in RuleHash::AppendRuleToTable (this=0x2bde9a0, aTable=<optimized out>, aKey=<optimized out>, aRuleInfo=...)
    at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:566
#4  0x00002aaaac780745 in RuleHash::AppendRule (this=0x2bde9a0, aRuleInfo=...) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:610
#5  0x00002aaaac783648 in AddRule (aCascade=0x2bde9a0, aRuleInfo=0x2446cf0) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:2808
#6  RefreshRuleCascade (aPresContext=<optimized out>, this=0x27ef130) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:3148
#7  nsCSSRuleProcessor::RefreshRuleCascade (this=0x27ef130, aPresContext=<optimized out>) at /srv/repos/browser/mozilla-central/layout/style/nsCSSRuleProcessor.cpp:3095
(gdb) l
5037    nsContentUtils::ASCIIToLower(nsAString& aStr)
5038    {
5039      PRUnichar* iter = aStr.BeginWriting();
5040      PRUnichar* end = aStr.EndWriting();
5041      while (iter != end) {
5042        PRUnichar c = *iter;
5043        if (c >= 'A' && c <= 'Z') {
5044          *iter = c + ('a' - 'A');
5045        }
5046        ++iter;
(gdb) p iter
$1 = (PRUnichar *) 0x2aaaae4b3000
(gdb) p *iter
Cannot access memory at address 0x2aaaae4b3000
(gdb) p end        
$2 = (PRUnichar *) 0x7fffffffa4a0

The backtrace of the failing allocation is as follows:

#0 /srv/repos/browser/mozilla-central/objdir-ff-gcc64dbg/dist/bin/libmozalloc.so(moz_malloc+0x5f)
#1 nsStringBuffer::Alloc(unsigned long) at xpcom/string/src/nsSubstring.cpp:210
#2 nsAString_internal::MutatePrep(unsigned int, unsigned short**, unsigned int*) at xpcom/string/src/nsTSubstring.cpp:163
#3 nsAString_internal::ReplacePrepInternal(unsigned int, unsigned int, unsigned int, unsigned int) at xpcom/string/src/nsTSubstring.cpp:199
#4 nsAString_internal::Assign(unsigned short const*, unsigned int) at xpcom/string/src/nsTSubstring.cpp:335
#5 nsAString_internal::Assign(unsigned short const*, unsigned int) at xpcom/string/src/nsTSubstring.cpp:331
#6 nsAString_internal::EnsureMutable(unsigned int) at xpcom/string/src/nsTSubstring.cpp:295
#7 nsAString_internal::BeginWriting() at objdir-ff-gcc64dbg/content/base/src/../../../dist/include/nsTSubstring.h:159
#8 nsContentUtils::ASCIIToLower(nsAString_internal&) at content/base/src/nsContentUtils.cpp:5040
#9 RuleHash_CIHashKey at layout/style/nsCSSRuleProcessor.cpp:167
#10 PL_DHashTableOperate at objdir-ff-gcc64dbg/xpcom/build/pldhash.cpp:616
#11 AddSelector at layout/style/nsCSSRuleProcessor.cpp:2742
#12 AddRule at layout/style/nsCSSRuleProcessor.cpp:2848
#13 nsCSSRuleProcessor::GetRuleCascade(nsPresContext*) at layout/style/nsCSSRuleProcessor.cpp:3089
#14 nsCSSRuleProcessor::RulesMatching(AnonBoxRuleProcessorData*) at layout/style/nsCSSRuleProcessor.cpp:2320
#15 EnumRulesMatching<AnonBoxRuleProcessorData> at layout/style/nsStyleSet.cpp:479


If you need longer traces, more GDB information or a reproduction of this, let me know. Guessing Layout as component, but not sure about that of course.
Comment 1 Mats Palmgren (vacation) 2012-03-05 14:56:33 PST
nsStringBuffer::Alloc returns NULL if the malloc fails
 nsAString_internal::MutatePrep returns false
  nsAString_internal::ReplacePrepInternal returns false
  nsAString_internal::ReplacePrep (inlined) returns false
   nsAString_internal::Assign @335 is a NOP
    nsAString_internal::Assign @331 (nsTSubstring.cpp):

   315  void
   316  nsTSubstring_CharT::Assign( const char_type* data, size_type length )
   317    {
   318        // unfortunately, some callers pass null :-(
   319      if (!data)
   320        {
   321          Truncate();
   322          return;
   323        }
   324  
   325      if (length == size_type(-1))
   326        length = char_traits::length(data);
   327  
   328      if (IsDependentOn(data, data + length))
   329        {
   330          // take advantage of sharing here...
   331          Assign(string_type(data, length));
   332          return;
   333        }
   334  
   335      if (ReplacePrep(0, mLength, length))
   336        char_traits::copy(mData, data, length);
   337    }


Line 331 creates a temporary nsTString_CharT (nsTString.h):

    70        nsTString_CharT( const char_type* data,
                               size_type length = size_type(-1) )
    71          : substring_type()
    72          {
    73            Assign(data, length);
    74          }

and substring_type() is (nsTSubstring.h):

   620        nsTSubstring_CharT()
   621          : mData(char_traits::sEmptyBuffer),
   622            mLength(0),
   623            mFlags(F_TERMINATED) {}

since the Assign call on line 73 is a NOP this temp has the above values.
Then, on line 331 again, we Assign from the temp:

   364  nsTSubstring_CharT::Assign( const self_type& str )
   365    {
   366      // |str| could be sharable.  we need to check its flags to know how to
   367      // deal with it.
   368  
   369      if (&str == this)
   370        return;
   371  
   372      if (!str.mLength)
   373        {
   374          Truncate();
   375          mFlags |= str.mFlags & F_VOIDED;
   376        }
            ...

since str.mLength is zero we Truncate, aka SetCapacity(0):

   533  nsTSubstring_CharT::SetCapacity( size_type capacity )
   534    {
   535      // capacity does not include room for the terminating null char
   536  
   537      // if our capacity is reduced to zero, then free our buffer.
   538      if (capacity == 0)
   539        {
   540          ::ReleaseData(mData, mFlags);
   541          mData = char_traits::sEmptyBuffer;
   542          mLength = 0;
   543          SetDataFlags(F_TERMINATED);
   544        }
   545      else
            ...

assigning mData!

     nsAString_internal::EnsureMutable :

   292          // promote to a shared string buffer
   293          char_type* prevData = mData;
   294          Assign(mData, mLength);
   295          return mData != prevData;

EnsureMutable returns true!


      char_iterator BeginWriting()
        {
          return EnsureMutable() ? mData : char_iterator(0);
        }


=> we will write to char_traits::sEmptyBuffer !!!
Comment 2 Mats Palmgren (vacation) 2012-03-05 14:58:53 PST
Christian, if my theory above is correct, the value of 'iter' should be
the same as &gNullChar or thereabout, can you confirm?
Comment 3 Christian Holler (:decoder) 2012-03-05 15:26:23 PST
(In reply to Mats Palmgren [:mats] from comment #2)
> Christian, if my theory above is correct, the value of 'iter' should be
> the same as &gNullChar or thereabout, can you confirm?

Like this?

(gdb) p &gNullChar
$1 = (PRUnichar *) 0x2aaaae464cb8
(gdb) p iter
$2 = (PRUnichar *) 0x2aaaae4b3000

If you need anything else, let me know :)
Comment 4 Mats Palmgren (vacation) 2012-03-05 15:49:07 PST
Yes, thanks.  So 0x2aaaae4b3000 - 0x2aaaae464cb8 = 320328 (decimal),
which seems to support my theory.

Do you do your own builds?  if so, can you try saving the start value for 'iter',
like so:

 /* static */
+PRUnichar* start;
 void
 nsContentUtils::ASCIIToLower(nsAString& aStr)
 {
   PRUnichar* iter = aStr.BeginWriting();
   PRUnichar* end = aStr.EndWriting();
+  start = iter;
   while (iter != end) {
...

then the values of 'start' and &gNullChar should be the same.
Comment 5 Mats Palmgren (vacation) 2012-03-05 15:57:23 PST
Created attachment 603088 [details] [diff] [review]
Like so?

Perhaps would could mark the string as F_VOIDED when MutatePrep fails
and have EnsureMutable() check for that instead.
Also, does making gNullChar const put it in a non-writable section
on some platforms? (should cause a safer crash)
Comment 6 Christian Holler (:decoder) 2012-03-05 16:28:06 PST
(In reply to Mats Palmgren [:mats] from comment #4)
> then the values of 'start' and &gNullChar should be the same.

Confirmed:

(gdb) p start
$2 = (PRUnichar *) 0x2aaaae464cc8
(gdb) p &gNullChar
$3 = (PRUnichar *) 0x2aaaae464cc8

:) I'll try your patch now.
Comment 7 Christian Holler (:decoder) 2012-03-05 16:35:09 PST
With the patch from comment 5 I get a crash due to iter being NULL now:

Program received signal SIGSEGV, Segmentation fault.
nsContentUtils::ASCIIToLower (aStr=...) at /srv/repos/browser/mozilla-central/content/base/src/nsContentUtils.cpp:5042
5042        PRUnichar c = *iter;
(gdb) p iter
$1 = (PRUnichar *) 0x0

If this is intended, then I would like to suggest to use the mozmalloc_abort function instead of crashing here, to keep crash stats consistent and allow automated detection of unhandled OOM bugs (such as my tool here does).
Comment 8 Mats Palmgren (vacation) 2012-03-05 17:09:59 PST
Yes, 0x0 is expected, that's what BeginWriting()/EndWriting() returns when
EnsureMutable() fails.  I'm not sure if we can make them infallible.
Comment 9 Mats Palmgren (vacation) 2012-03-06 11:57:37 PST
I think [sg:critical] is warranted given the large number of consumers that indirectly
depends on a correct return value of ns*String::EnsureMutable().
Comment 10 Mats Palmgren (vacation) 2012-03-06 12:01:34 PST
> If this is intended, then I would like to suggest to use the mozmalloc_abort

I'm not going to do that in this bug, because there's too many consumers
of BeginWriting/EndWriting which would become infallible.  And I think
it would be confusing to have some string methods be infallible while
others are fallible.

I'm in favor of making strings infallible eventually though, but not here
and now.
Comment 11 Christian Holler (:decoder) 2012-03-06 12:04:27 PST
(In reply to Mats Palmgren [:mats] from comment #10)
> I'm not going to do that in this bug, because there's too many consumers
> of BeginWriting/EndWriting which would become infallible.  And I think
> it would be confusing to have some string methods be infallible while
> others are fallible.
> 
> I'm in favor of making strings infallible eventually though, but not here
> and now.

I didn't actually mean to make strings infallible in general (although that would of course be good). What I meant is that if we crash in this particular piece of code anyway guaranteed on OOM (in ASCIIToLower), can we check the pointer and abort instead?
Comment 12 Mats Palmgren (vacation) 2012-03-06 12:26:23 PST
Oh, I misunderstood you then.  I haven't actually looked at who calls
nsContentUtils::ASCIIToLower and if we can make that infallible.  I guess
I morphed this bug into the more general problem.  I'll make a separate patch
for ASCIIToLower...
Comment 13 Mats Palmgren (vacation) 2012-03-06 12:28:30 PST
Created attachment 603410 [details] [diff] [review]
fix

Now using SetIsVoid(true) instead of just adding the F_VOIDED bit --
to maintain the invariant that mData == sEmptyBuffer:
http://mxr.mozilla.org/mozilla-central/source/xpcom/string/public/nsTSubstring.h#795

Try results pending:
https://tbpl.mozilla.org/?usebuildbot=1&tree=Try&rev=fe31ec4a4c33
Comment 14 Boris Zbarsky [:bz] 2012-03-07 10:00:07 PST
Comment on attachment 603410 [details] [diff] [review]
fix

I'd really like someone who knows the string code to review this....
Comment 15 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-03-07 11:37:28 PST
I don't think I understand yet why we should mark the string as void if mutateprep fails. 
Void strings have a particular meaning for the DOM which is different from a failure case. I would expect that if mutateprep fails, we should leave the string unchanged (and return false). Can we not just unwind by checking the return value of mutateprep?

I definitely think we ought to end up making string infallible by default and requiring explicit fallible_t calls in the cases where we can commonly expect incoming buffers to cause DOS-style crashes.
Comment 16 Mats Palmgren (vacation) 2012-03-08 10:25:57 PST
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> I don't think I understand yet why we should mark the string as void if
> mutateprep fails. 

I needed some state to signal that the Assign inside the ctor failed.
Since we currently truncate the original string on OOM anyway, I figured
marking it Void would be acceptable.

> if mutateprep fails, we should leave the string unchanged

I agree.

> Can we not just unwind by checking the return value of mutateprep?

I guess I could invent a new flag to signal the failed Assign...
But I don't see why we need to use that, I think SetLength should work.
Assign(mData, mLength) is really slow too -- it *always* allocates
a temporary string, copies the characters there, then Mutateprep
the original (allocating again in this case), copies the characters
back, then deallocates the temp string.

Also, making sEmptyBuffer 'static const' didn't really work since
some consumers actually overwrite the terminating NUL.
(It did crash on Linux64, I just missed the orange on Try)
Comment 17 Mats Palmgren (vacation) 2012-03-08 10:29:07 PST
Created attachment 604114 [details] [diff] [review]
fix, v2

I think we can use SetLength() in this case too.  As a bonus, it's
faster because it just allocates a new StringBuffer and copies the
characters to that, avoiding an extra malloc/free and copying the
characters twice.
Comment 18 Mats Palmgren (vacation) 2012-03-08 10:32:00 PST
Created attachment 604115 [details] [diff] [review]
Make nsContentUtils::ASCIIToLower/ASCIIToUpper return NS_ERROR_OUT_OF_MEMORY when string allocation fails
Comment 19 Mats Palmgren (vacation) 2012-03-08 10:32:59 PST
Created attachment 604117 [details] [diff] [review]
Propagate nsContentUtils::ASCIIToLower/ASCIIToUpper error
Comment 20 Mats Palmgren (vacation) 2012-03-08 10:33:34 PST
https://tbpl.mozilla.org/?tree=Try&rev=625d184c44c9
Comment 21 Mats Palmgren (vacation) 2012-03-08 10:40:30 PST
Comment on attachment 604114 [details] [diff] [review]
fix, v2

... and this leaves the string unchanged if the allocation fails.
Comment 22 David Baron :dbaron: ⌚️UTC+8 (review requests must explain patch) 2012-03-08 12:14:50 PST
(In reply to Mats Palmgren [:mats] from comment #16)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> > I don't think I understand yet why we should mark the string as void if
> > mutateprep fails. 
> 
> I needed some state to signal that the Assign inside the ctor failed.
> Since we currently truncate the original string on OOM anyway, I figured
> marking it Void would be acceptable.

I don't like the idea of using Void for this.  Void-ness should be as opaque as possible to the string library -- it's a hack designed to support the DOM string-or-null concept, and otherwise shouldn't be used.  As far as the string library is concerned you should think of "void" as having as much meaning as "purple".
Comment 23 Mats Palmgren (vacation) 2012-03-08 12:23:58 PST
OK, the new patch doesn't use it.
Comment 24 Daniel Veditz [:dveditz] 2012-03-08 13:26:33 PST
bsmedberg: If we change the string code signatures what binary components break? Do people link against the Firefox copy of strings, or do they compile in their own copy of the string library through the SDK? Since the ESR is affected we need to figure out the impact.
Comment 25 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-03-08 13:34:03 PST
This doesn't affect the external string API at all, so I don't think this affects anything.
Comment 26 Boris Zbarsky [:bz] 2012-03-09 23:29:34 PST
Comment on attachment 604115 [details] [diff] [review]
Make nsContentUtils::ASCIIToLower/ASCIIToUpper return NS_ERROR_OUT_OF_MEMORY when string allocation fails

r=me
Comment 27 Boris Zbarsky [:bz] 2012-03-09 23:29:40 PST
Comment on attachment 604117 [details] [diff] [review]
Propagate nsContentUtils::ASCIIToLower/ASCIIToUpper error

r=me
Comment 28 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2012-03-12 11:38:52 PDT
Comment on attachment 604114 [details] [diff] [review]
fix, v2

I assume that we currently cannot make SetLength be NS_WARN_UNUSED_RESULT because large parts of the tree would fail to compile? Please make sure there is a followup to make an infallible version of this API and make the fallible version use NS_WARN_UNUSED_RESULT.

In nsTSubstring_CharT::SetLength I think it would be more readable to switch the conditionals:

if (!SetCapacity(length))
  return false;

mLength = length;
return true;

r=me with that note & change
Comment 31 Lukas Blakk [:lsblakk] use ?needinfo 2012-04-06 14:58:41 PDT
[Triage Comment]

Now that this has been on central for a bit can someone set a? on the patches for ESR landing? Please nominate if this is not showing any regressions.
Comment 32 Mats Palmgren (vacation) 2012-04-07 15:39:49 PDT
Comment on attachment 604114 [details] [diff] [review]
fix, v2

[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: crash, some risk of being exploitable
Testing completed (on m-c, etc.): baked on trunk since 2012-03-21
Risk to taking this patch (and alternatives if risky): medium risk, no alternatives
String changes made by this patch: none
Comment 33 Alex Keybl [:akeybl] 2012-04-09 13:00:55 PDT
(In reply to Mats Palmgren [:mats] from comment #32)
> Risk to taking this patch (and alternatives if risky): medium risk, no
> alternatives

What kind of fallout would we be on the lookout for if approved for FF12? Would this show up as a crash, web regression, or something else entirely?
Comment 34 Alex Keybl [:akeybl] 2012-04-10 11:57:55 PDT
Comment on attachment 604114 [details] [diff] [review]
fix, v2

[Triage Comment]
Approving for all branches given this is sg:crit and there's still 2 weeks before release to hopefully shake out any fallout.
Comment 37 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-04-16 10:00:23 PDT
Is this something QA is able to verify?
Comment 38 Mats Palmgren (vacation) 2012-04-16 10:10:19 PDT
> Is this something QA is able to verify?

I think you would need the Reporter's OOM-testing tool for that.
Comment 39 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-04-16 10:23:34 PDT
Christian, would you mind testing this to confirm it's fixed in Firefox 12 Beta, 13 Aurora, 14 Nightly and latest-esr10 nightly?
Comment 40 Mike Hommey [:glandium] 2012-05-10 11:34:16 PDT
Is it an expected side effect that e.g. the following returns a pointer to char_traits::sEmptyBuffer (which wasn't the case before):
  nsCAutoString a;
  nsCAutoString b(a);
  char *c = b.BeginWriting();

It's not entirely a problem because bug 504660 removed the constness of sEmptyBuffer, but that seems dangerous...
Comment 41 Mats Palmgren (vacation) 2012-05-10 12:57:43 PDT
Yes, that is expected.
Comment 42 Mike Hommey [:glandium] 2012-05-10 23:52:34 PDT
(In reply to Mats Palmgren [:mats] from comment #41)
> Yes, that is expected.

But is it right? Doesn't it get you back to comment 1 scenario.
Comment 43 Mike Hommey [:glandium] 2012-05-11 00:06:05 PDT
(In reply to Mike Hommey [:glandium] from comment #42)
> (In reply to Mats Palmgren [:mats] from comment #41)
> > Yes, that is expected.
> 
> But is it right? Doesn't it get you back to comment 1 scenario.

Forget it, I see what's happening.

Note You need to log in before you can comment on or make changes to this bug.