This testcases causes nsBlockFrame::FindLineFor to look at a freed frame




13 years ago
12 years ago


(Reporter: jruderman, Assigned: mats)


(Blocks 1 bug, {testcase, verified1.8.0.7, verified1.8.1})

Dependency tree / graph
Bug Flags:
blocking1.8.1 +
blocking1.8.0.7 +
in-testsuite +

Firefox Tracking Flags

(Not tracked)


(Whiteboard: [sg:critical?])


(4 attachments, 2 obsolete attachments)



13 years ago
Steps to reproduce:

1. Run a Firefox build with DEBUG_TRACEMALLOC_FRAMEARENA defined under Valgrind.  (Additional setup that probably isn't relevant: Pear has Valgrind 3.1.1, and the Firefox build has patches from bug 287116 and bug 184614 to fix other errors Valgrind detects.)

2. Load the testcase.

Result: Valgrind complains about an invalid read on a deleted frame.  Simplified output below, will attach more complete output.

Invalid read of size 4
  at nsBlockFrame::FindLineFor(nsIFrame*) 
  by nsBlockFrame::PrepareChildIncrementalReflow(nsBlockReflowState&) 
  by nsBlockFrame::Reflow(nsPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned&)

Address 0x7ED3380 is 48 bytes inside a block of size 52 free'd
  at free (vg_replace_malloc.c:235)
  by PR_Free   
  by FrameArena::FreeFrame(unsigned, void*) 
  by PresShell::FreeFrame(unsigned, void*) 
  by nsFrame::Destroy() 
  by nsSplittableFrame::Destroy()

Comment 1

13 years ago
Posted file testcase (obsolete) —

Comment 2

13 years ago
So is the frame passed to FindLineFor dead?  Or something else?  Line numbers would really help here.

Comment 4

13 years ago
I can't reproduce with a debug build (built from the same source as the opt build with which I can reproduce it).  I think that's going to make it difficult to get a stack trace with line numbers.

One thing I tried was adding an assertion to nsBlockFrame::FindLineFor that attempts to check whether it has been given a deleted frame.  My assertion complains if aFrame->GetParent() & 0ffff0000 == 0xdddd0000.  I was able to reproduce a bug (this bug?) with a different testcase earlier this week using that trick, but now I can't.  That testcase involved height and clear.

Any ideas for what else I can try?  Would it help if you had an account on the Linux QA machines I use through VNC?
You could try compiling an opt build with -g to add symbols without changing the code.

Comment 6

13 years ago
The opt build already has symbols -- that's why the stacks from Valgrind are readable.
You should be able to get line numbers compiled into your opt build too.
(In reply to comment #6)
> The opt build already has symbols -- that's why the stacks from Valgrind are
> readable.

There are a variety of types of symbols which might or might not be present for each piece of the program compiled. Valgrind, like gdb, works with what it can get; even my stripped /usr/lib/mozilla-firefox/firefox-bin image has 44283 symbols (exported symbols, GOT/PLT relocs, etc). Your valgrind stack has some frames with source line numbers, and some without. The -g compiler flag will add .debug_foo sections to the currently compiling obj file, which will greatly increase the resolution of things like Valgrind. But it's not an all-or-nothing affair. You might need to do a clean rebuild with your opt flags set to '-O2 -g'.

To check, run 'readelf -S' to get a section list for If there's a section called .debug_line in there, *some* of the objects contributing to the .so have debug line records. Unless the SIZE column in the output shows lots of data (in mozilla's case, probably megabytes) you might not have rebuilt every object in the .so with -g.

Comment 9

13 years ago
Not sure why this approach worked, but it did ;)
Attachment #221938 - Attachment is obsolete: true

Comment 10

13 years ago
Attachment #221939 - Attachment is obsolete: true

Comment 11

13 years ago
So I think it's fc->mPlaceholder that's dead, not aFrame.  I still can't figure out how to reproduce on Mac OS X, though :/

Comment 12

13 years ago
Posted file Frame dump
The problem seem to occur when removing a placeholder in
nsBlockFrame::RemoveFrame that is also in the float cache.

Comment 13

13 years ago
Posted patch Patch rev. 1Splinter Review
This fixes the two testcases. I also ran it over a set of ~220 URLs and
got this statistics: there were 2217 calls to RemovePlaceholderDescendantsOf
and only 8 of those had a float cache item (exactly 1 in all cases).
For that item we peeked at 2.125 frames on average (including aFrame) and got
no match (i.e. none of the pages triggered the bug).
(so I don't think the suggested fix will have any performance impact)
Attachment #222785 - Flags: review?(roc)

Comment 14

13 years ago
Comment on attachment 222785 [details] [diff] [review]
Patch rev. 1

The NS_ABORT_IF_FALSE in RemovePlaceholderDescendantsOf
is a mistake, I'll remove that...


13 years ago
Assignee: nobody → mats.palmgren

Comment 15

13 years ago
I got this error on the "Linux prometheus" Tinderbox:
nsLineBox.cpp:518: error: invalid conversion from `nsIFrame*' to `long int'

So I did this change:
-      if (NS_UNLIKELY(frame)) {
+      if (NS_UNLIKELY(frame != nsnull)) {

I wonder why this should be necessary though...
(it didn't break any of the other builds so what's special with "prometheus"?)

Comment 16

13 years ago
Checked in on trunk 2006-05-24 18:48 PDT

Closed: 13 years ago
Resolution: --- → FIXED
Compiler version?

Comment 18

13 years ago
prometheus tinderbox build log says:

uname: Linux 2.4.21-27.0.4.ELsmp #1 SMP Sat Apr 16 18:43:06 EDT 2005 i686 i686 i386 GNU/Linux
Compiler is -- gcc (gcc (GCC) 3.3.2 20031022 (Red Hat Linux 3.3.2-1)

The compile error I got (comment 15) seems like a bug to me, I would expect
that if "if(x);" compiles then so should "if(NS_UNLIKELY(x));" for any given x.

Is sizeof(long)==32 and sizeof(void*)==64 on this box by any chance?
(In reply to comment #18)
> uname: Linux 2.4.21-27.0.4.ELsmp #1 SMP Sat Apr 16
> 18:43:06 EDT 2005 i686 i686 i386 GNU/Linux

> Is sizeof(long)==32 and sizeof(void*)==64 on this box by any chance?

Not with that uname output.

We should probably just fix the NS_LIKELY and NS_UNLIKELY macros to cast in some way (but only in the case where they're actually interesting).

Comment 20

13 years ago
(filed bug 340244 on improving NS_LIKELY/NS_UNLIKELY)
Neither testcase seems to crash ff1.5.0.4 or a debug Bon Echo build, is this a trunk-only bug?

Comment 22

13 years ago
(In reply to comment #21)
> Neither testcase seems to crash ff1.5.0.4 or a debug Bon Echo build,
> is this a trunk-only bug?

Note that isn't a crasher. I tried one of my pre-patches that has trace
printf's and I can see that, 1) the testcase do trigger the code that the
patch adds. 2) without the patch we do find a placeholder that is a
descendant of a deleted frame in nsBlockFrame::FindLineFor.

I don't have a Valgrind build for the 1.8 branch, but I would say the
printf's strongly suggests the invalid read also occurs on branch (and
that the patch would fix it).
Flags: blocking1.8.1?
Flags: blocking1.8.0.6?
Whiteboard: [sg:critical?]
Attachment #222785 - Flags: approval1.8.1?
Attachment #222785 - Flags: approval1.8.0.6?


13 years ago
Flags: blocking1.8.1? → blocking1.8.1+


13 years ago
Attachment #222785 - Flags: approval1.8.1? → approval1.8.1+
Flags: blocking1.8.0.6? → blocking1.8.0.6+

Comment 23

13 years ago
Checked in to MOZILLA_1_8_BRANCH at 2006-07-28 11:20 PDT.
Keywords: fixed1.8.1
Comment on attachment 222785 [details] [diff] [review]
Patch rev. 1

approved for 1.8.0 branch, a=dveditz for drivers
Attachment #222785 - Flags: approval1.8.0.7? → approval1.8.0.7+

Comment 25

13 years ago
Checked in to MOZILLA_1_8_0_BRANCH at 2006-08-10 03:18 PDT.
Keywords: fixed1.8.0.7
Depends on: 348688

Comment 26

13 years ago
The real fix for this bug is in bug 348688 (which backed out this patch).
Real fix will be on branches soon...

Comment 27

13 years ago
ff2b2 debug windows, linux no crash
ff2b2 windows, linux, macppc no crash

Comment 28

13 years ago
v.fixed on both branches with 8/24 nightly builds, no crash with testcase:
Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv: Gecko/20060824 Firefox/
Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1b2) Gecko/20060824 BonEcho/2.0b2


13 years ago
Group: security
Flags: in-testsuite?

Comment 29

12 years ago
Crashtests checked in.  I think sayrer runs through the crashtests with valgrind enabled frequently enough ;)
Flags: in-testsuite? → in-testsuite+
You need to log in before you can comment on or make changes to this bug.