Last Comment Bug 634373 - crash [@ nsCSSStyleSheet::WillDirty] using cross_fuzzv3
: crash [@ nsCSSStyleSheet::WillDirty] using cross_fuzzv3
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: CSS Parsing and Computation (show other bugs)
: Trunk
: All All
: P3 critical (vote)
: mozilla5
Assigned To: David Baron :dbaron: ⌚️UTC-8
:
: Jet Villegas (:jet)
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-02-15 12:52 PST by eherokles
Modified: 2011-05-04 14:26 PDT (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed


Attachments
Here is the WinDbg output (97.48 KB, text/plain)
2011-02-15 12:52 PST, eherokles
no flags Details
Seems to be the same (105.30 KB, text/plain)
2011-02-17 05:40 PST, eherokles
no flags Details
The windbg output from nightly 23.2.2011 (79.79 KB, text/plain)
2011-02-23 12:11 PST, eherokles
no flags Details
audit for null sheet handling (4.86 KB, patch)
2011-04-29 11:25 PDT, David Baron :dbaron: ⌚️UTC-8
bzbarsky: review+
Details | Diff | Splinter Review
audit for null sheet handling (7.59 KB, patch)
2011-04-30 08:16 PDT, David Baron :dbaron: ⌚️UTC-8
no flags Details | Diff | Splinter Review
patch (8.47 KB, patch)
2011-04-30 11:44 PDT, David Baron :dbaron: ⌚️UTC-8
bzbarsky: review+
bugzilla: approval‑mozilla‑aurora+
Details | Diff | Splinter Review

Description eherokles 2011-02-15 12:52:57 PST
Created attachment 512565 [details]
Here is the WinDbg output

I´ve userd cross_fuzzv3 on
firefox4b11 32bit windowsxp

PRIMARY_PROBLEM_CLASS:  NULL_CLASS_PTR_DEREFERENCE
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2011-02-15 13:03:23 PST
Hmm.  That's crashing at address 0x3c == 60, which is the offset of mComplete in nsCSSStyleSheetInner, suggesting that mInner is null.  I don't see how it can be null on trunk, though: it's set in the constructor and never nulled out.  David, any ideas?
Comment 2 Boris Zbarsky [:bz] (still a bit busy) 2011-02-15 13:08:55 PST
Though.... looks like 60 might also be the offset of mInner in nsCSSStyleSheet (though my count says 56... but it depends on what happens with those PRPackedBools, etc).  So it's possible the _sheet_ is null.

Perhaps DeclarationChanged is being called with a second arg of PR_TRUE after the sheet has gone away?  I don't see anything obvious preventing that from happening.
Comment 3 David Baron :dbaron: ⌚️UTC-8 2011-02-15 17:57:24 PST
What version of Firefox was this in?
Comment 4 David Baron :dbaron: ⌚️UTC-8 2011-02-15 17:58:25 PST
oh, never mind.  I was just skimming for a UA string, and didn't see one.
Comment 5 David Baron :dbaron: ⌚️UTC-8 2011-02-15 18:19:41 PST
(In reply to comment #2)
> Though.... looks like 60 might also be the offset of mInner in nsCSSStyleSheet
> (though my count says 56... but it depends on what happens with those
> PRPackedBools, etc).  So it's possible the _sheet_ is null.
> 
> Perhaps DeclarationChanged is being called with a second arg of PR_TRUE after
> the sheet has gone away?  I don't see anything obvious preventing that from
> happening.

Given the windbg output, it's easy enough to tell for sure:

 1. On a Linux box, download
http://releases.mozilla.org/pub/mozilla.org/firefox/releases/4.0b11/update/win32/en-US/firefox-4.0b11.complete.mar

 2. run the alias unmar-full-update='MAR=~/builds/mozilla-central/obj/firefox-debugopt/dist/host/bin/mar ~/builds/mozilla-central/mozilla/tools/update-packaging/unwrap_full_update.pl' to extract it

 3. objdump -Cd xul.dll

Which shows the context is:

1027130a:       57                      push   %edi
1027130b:       8b f8                   mov    %eax,%edi
1027130d:       8b 47 3c                mov    0x3c(%edi),%eax  <== CRASH HERE
10271310:       83 78 4c 00             cmpl   $0x0,0x4c(%eax)
10271314:       0f 85 ca 27 15 00       jne    0x103c3ae4
1027131a:       33 c0                   xor    %eax,%eax
1027131c:       5f                      pop    %edi
1027131d:       c3                      ret    


hg cat -rFIREFOX_4_0b11_RELEASE nsCSSStyleSheet.cpp | cat -n gives (crash on line 1553):

  1550  nsresult
  1551  nsCSSStyleSheet::WillDirty()
  1552  {
  1553    if (!mInner->mComplete) {
  1554      // Do nothing
  1555      return NS_OK;
  1556    }
  1557  
  1558    if (EnsureUniqueInner() == eUniqueInner_CloneFailed) {
  1559      return NS_ERROR_OUT_OF_MEMORY;
  1560    }
  1561    return NS_OK;
  1562  }

So pretty clearly |this| is null.

FWIW, the offsets, for 32-bit, look to me like:

nsCSSStyleSheet:
  0x00 vtbl. nsIStyleSheet
  0x04 vtbl. nsIDOMCSSStyleSheet
  0x08 vtbl. nsICSSLoaderObserver
  0x0c mRefCnt
  0x10 mTitle
           mData
  0x14     mLength
  0x18     mFlags
  0x1c mMedia
  0x20 mNext
  0x24 mParent
  0x28 mOwnerRule
  0x2c mRuleCollection
  0x30 mDocument
  0x34 mOwningNode
  0x38 mDisabled
  0x39 mDirty
  0x3c mInner
  ...

nsCSSStyleSheetInner:
  0x00 mSheets
           mHdr
  0x04     Header: mLength
  0x08     Header: mCapacity
  0x0c     padding for 64-bit alignment (?)
  0x10     array item 0
  0x2c     array item 7
  0x30 mSheetURI
  0x34 mOriginalSheetURI
  0x38 mBaseURI
  0x3c mPrincipal
  0x40 mOrderedRules
  0x44 mNameSpaceMap
  0x48 mFirstChild
  0x4c mComplete
  ...

which agrees with this.
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2011-02-15 19:37:26 PST
Ah, I missed mRefCnt on the sheet; that's why I was getting the offset wrong there.  Yeah, so we have a null sheet.  We currently assert this doesn't happen; we should check instead.
Comment 7 Boris Zbarsky [:bz] (still a bit busy) 2011-02-15 19:37:51 PST
Or CC sheets + rules and have rules hold strong refs to sheets.
Comment 8 eherokles 2011-02-17 05:40:46 PST
Created attachment 513102 [details]
Seems to be the same
Comment 9 eherokles 2011-02-23 12:10:09 PST
I´ve used cross_fuzz(random seed, no logging) against firefox4pre13b (nightly:23.2.2011), and get the same crash, but it seems to start at another origin.
Well, have a look and the new Windbg output. May be it is helpful.
Comment 10 eherokles 2011-02-23 12:11:34 PST
Created attachment 514569 [details]
The windbg output from nightly 23.2.2011
Comment 11 David Baron :dbaron: ⌚️UTC-8 2011-04-29 11:25:54 PDT
Created attachment 529146 [details] [diff] [review]
audit for null sheet handling

It was pretty obvious what to do in most cases, except the one directly related to this bug (the StyleRule::DeclarationChanged case).

I should really add some tests too.
Comment 14 David Baron :dbaron: ⌚️UTC-8 2011-04-30 08:16:59 PDT
Created attachment 529291 [details] [diff] [review]
audit for null sheet handling

As I said, I think the ~GroupRule change is actually unneeded -- replaced with an assertion.

This also adds tests.
Comment 15 David Baron :dbaron: ⌚️UTC-8 2011-04-30 08:18:09 PDT
And any hint as to how I might make the first and third test show a crash (without the patch) would be appreciated, especially since I *think* the first test ought to be equivalent to the original report.
Comment 16 David Baron :dbaron: ⌚️UTC-8 2011-04-30 11:21:48 PDT
Oh... I think I can make the rule match something, and then not restyle yet.
Comment 17 David Baron :dbaron: ⌚️UTC-8 2011-04-30 11:44:04 PDT
Created attachment 529304 [details] [diff] [review]
patch

Now I have tests for all the cases that show the crash without the patch.  I also left in the two tests that don't crash, since they exercise the dom-wrapper-with-no-rule case.
Comment 18 Boris Zbarsky [:bz] (still a bit busy) 2011-05-02 14:19:30 PDT
Comment on attachment 529304 [details] [diff] [review]
patch

r=me
Comment 19 David Baron :dbaron: ⌚️UTC-8 2011-05-02 18:49:45 PDT
http://hg.mozilla.org/mozilla-central/rev/064d7c5425a6
Comment 20 David Baron :dbaron: ⌚️UTC-8 2011-05-02 18:51:39 PDT
Comment on attachment 529304 [details] [diff] [review]
patch

This is a bunch of null-checks (some in old code, some in new-for-animations code) -- so low risk crash fixes that fix crashes that seem somewhat likely to be hit by fuzzers.
Comment 21 David Baron :dbaron: ⌚️UTC-8 2011-05-04 14:25:43 PDT
https://hg.mozilla.org/releases/mozilla-aurora/rev/b11d407b1975

Note You need to log in before you can comment on or make changes to this bug.