Closed
Bug 946658
Opened 11 years ago
Closed 10 years ago
Intermittent browser_dbg_variables-view-popup-07.js | [@ nsView::DoResetWidgetBounds(bool, bool)] or [@ nsViewManager::ProcessPendingUpdatesForView(nsView*, bool)]
Categories
(Core :: Web Painting, defect)
Tracking
()
RESOLVED
FIXED
mozilla30
Tracking | Status | |
---|---|---|
firefox27 | --- | wontfix |
firefox28 | --- | wontfix |
firefox29 | --- | fixed |
firefox30 | + | fixed |
firefox-esr24 | --- | unaffected |
b2g-v1.2 | --- | unaffected |
b2g-v1.3 | --- | fixed |
b2g-v1.3T | --- | fixed |
b2g-v1.4 | --- | fixed |
People
(Reporter: cbook, Assigned: MatsPalmgren_bugz)
References
()
Details
(5 keywords, Whiteboard: [adv-main29+])
Crash Data
Attachments
(3 files, 3 obsolete files)
12.18 KB,
text/plain
|
Details | |
55.79 KB,
text/plain
|
Details | |
7.23 KB,
patch
|
tnikkel
:
review+
abillings
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Rev5 MacOSX Mountain Lion 10.8 fx-team debug test mochitest-browser-chrome on 2013-12-04 19:55:34 PST for push bf901d895201 slave: talos-mtnlion-r5-061 https://tbpl.mozilla.org/php/getParsedLog.php?id=31472407&tree=Fx-Team Summary TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/devtools/debugger/test/browser_dbg_variables-view-popup-07.js | application terminated with exit code 256 PROCESS-CRASH | chrome://mochitests/content/browser/browser/devtools/debugger/test/browser_dbg_variables-view-popup-07.js | application crashed [@ nsView::DoResetWidgetBounds(bool, bool)] also leaked TEST-UNEXPECTED-FAIL | leakcheck | 2968 bytes leaked (CompositorChild, CondVar, Mutex, PCompositorChild, PLayerTransactionChild, ...) but not sure if this belongs to the tracking bugs 20:36:06 WARNING - PROCESS-CRASH | chrome://mochitests/content/browser/browser/devtools/debugger/test/browser_dbg_variables-view-popup-07.js | application crashed [@ nsView::DoResetWidgetBounds(bool, bool)] 20:36:06 INFO - Crash dump filename: /var/folders/22/cg9zv2f53yd1fn8byjsck83r00000w/T/tmp0zxQG7/minidumps/DA3C4403-9249-4042-930A-83DE146F429D.dmp 20:36:06 INFO - Operating system: Mac OS X 20:36:06 INFO - 10.8.0 12A269 20:36:06 INFO - CPU: amd64 20:36:06 INFO - family 6 model 42 stepping 7 20:36:06 INFO - 8 CPUs 20:36:06 INFO - Crash reason: EXC_BAD_ACCESS / 0x0000000d 20:36:06 INFO - Crash address: 0x0 20:36:06 INFO - Thread 0 (crashed) 20:36:06 INFO - 0 XUL!nsView::DoResetWidgetBounds(bool, bool) [nsViewManager.h:bf901d895201 : 63 + 0x0] 20:36:06 INFO - rbx = 0x000000018883bac0 r12 = 0x0000000112632760 20:36:06 INFO - r13 = 0x0000000000000000 r14 = 0x000000018883bac0 20:36:06 INFO - r15 = 0x0000000000000000 rip = 0x00000001027038e1 20:36:06 INFO - rsp = 0x00007fff5fbe1e30 rbp = 0x00007fff5fbe1ed0 20:36:06 INFO - Found by: given as instruction pointer in context 20:36:06 INFO - 1 XUL!nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) [nsViewManager.cpp:bf901d895201 : 383 + 0xe] 20:36:06 INFO - rbx = 0x000000018883bac0 r12 = 0x0000000112632760 20:36:06 INFO - r13 = 0x0000000000000000 r14 = 0x000000018883bac0 20:36:06 INFO - r15 = 0x0000000000000000 rip = 0x000000010270717c 20:36:06 INFO - rsp = 0x00007fff5fbe1ee0 rbp = 0x00007fff5fbe1f30 20:36:06 INFO - Found by: call frame info 20:36:06 INFO - 2 XUL!nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) [nsViewManager.cpp:bf901d895201 : 389 + 0x18] 20:36:06 INFO - rbx = 0x000000018883bac0 r12 = 0x0000000112632760 20:36:06 INFO - r13 = 0x0000000000000000 r14 = 0x000000015dc3af00 20:36:06 INFO - r15 = 0x0000000000000000 rip = 0x000000010270719e 20:36:06 INFO - rsp = 0x00007fff5fbe1f40 rbp = 0x00007fff5fbe1f90 20:36:06 INFO - Found by: call frame info 20:36:06 INFO - 3 XUL!nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) [nsViewManager.cpp:bf901d895201 : 389 + 0x18] 20:36:06 INFO - rbx = 0x000000015dc3af00 r12 = 0x0000000112632760 20:36:06 INFO - r13 = 0x0000000000000000 r14 = 0x000000016297fae0 20:36:06 INFO - r15 = 0x0000000000000000 rip = 0x000000010270719e 20:36:06 INFO - rsp = 0x00007fff5fbe1fa0 rbp = 0x00007fff5fbe1ff0 20:36:06 INFO - Found by: call frame info 20:36:06 INFO - 4 XUL!nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) [nsViewManager.cpp:bf901d895201 : 389 + 0x18] 20:36:06 INFO - rbx = 0x000000016297fae0 r12 = 0x0000000112632760 20:36:06 INFO - r13 = 0x0000000000000000 r14 = 0x0000000166119790 20:36:06 INFO - r15 = 0x0000000000000000 rip = 0x000000010270719e 20:36:06 INFO - rsp = 0x00007fff5fbe2000 rbp = 0x00007fff5fbe2050 20:36:06 INFO - Found by: call frame info
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•11 years ago
|
Component: General → Graphics: Layers
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 18•10 years ago
|
||
Looks like a Layout bug to me. I don't like seeing 0x5a5a5a5a5a5a5a5a in the top stack frame ... smells use-after-free.
Group: layout-core-security
Severity: normal → critical
Component: Graphics: Layers → Layout: View Rendering
Keywords: csectype-uaf,
sec-critical
Assignee | ||
Comment 19•10 years ago
|
||
Assignee | ||
Comment 20•10 years ago
|
||
stack frame #0: https://hg.mozilla.org/integration/mozilla-inbound/file/bb5c1b2dc7e4/view/public/nsView.h#l286 I'm guessing this is the call site: https://hg.mozilla.org/integration/mozilla-inbound/file/bb5c1b2dc7e4/view/src/nsViewManager.cpp#l382 and that 'aView' has been deleted. Looking at the loop at line 387, and seeing what this method does, in particular the script blocker on line 417, it seems plausible that arbitrary nsViews might get deleted inside the loop, including 'childView' which would lead to the crash at hand. Am I missing something?
Comment 21•10 years ago
|
||
Sounds plausible to me.
Comment 22•10 years ago
|
||
Actually it might be a bit more complicated then that. We get to ProcessPendingUpdatesForView via PresShell::FlushPendingNotifications, so it must be a call to UpdateWidgetGeometry, which means aFlushDirtyRegion is false and we don't hit the script blocker code. But we know (from eg https://bug968838.bugzilla.mozilla.org/attachment.cgi?id=8378668) that ResetWidgetBounds can re-enter into WillPaintWindow which will call ProcessPendingUpdatesForView with aFlushDirtyRegion true. Maybe we can walk the view tree, saving the widgets we find that need painting or bounds resetting in an array (that holds refs to the widgets), then process the array by getting the view from each widget and doing the painting or bounds resetting. (A small unrelated bug I just noticed is that ProcessPendingUpdatesForView always has this being the root vm (as the assetion at the top can attest) even with views from child vms, but it also looks members of this in ways which only make sense if this was the vm for the view. Probably doesn't cause many (any?) problems since the only child widgets we have now are popups and plugins.)
Comment 23•10 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #18) > Looks like a Layout bug to me. I don't like seeing 0x5a5a5a5a5a5a5a5a in > the top stack frame ... smells use-after-free. Yes, that's definitely our poison-on-free() value.
Updated•10 years ago
|
status-firefox27:
--- → ?
status-firefox28:
--- → affected
status-firefox29:
--- → affected
status-firefox30:
--- → affected
tracking-firefox30:
--- → +
Comment 24•10 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #22) > (A small unrelated bug I just noticed is that ProcessPendingUpdatesForView > always has this being the root vm (as the assetion at the top can attest) > even with views from child vms, but it also looks members of this in ways > which only make sense if this was the vm for the view. Probably doesn't > cause many (any?) problems since the only child widgets we have now are > popups and plugins.) Filed this as bug 978001 with patch.
Assignee | ||
Comment 25•10 years ago
|
||
This implements what you suggested in comment 22. Also, it splits up the method into three. The entry point ProcessPendingUpdatesForView now adds a script blocker around the entire operation, not separately per widget. Ditto for SetPainting(true/false). The 'this' vm is the root vm, as before. It then calls ProcessPendingUpdatesRecurse on aView's vm, which recurse over the view tree and collect the widgets into an array, or calls FlushDirtyRegionToWidget directly when there isn't one. Finally it iterates the widget array and calls ProcessPendingUpdatesPaint for the associated view with its vm as 'this'. It should be equivalent to the current code, except for wrapping the entire view tree operation in a script blocker. I think it would be good if we can get away with that. The methods names are rather random, I chose a common prefix ProcessPendingUpdates to associate them with each other. Bike shedding is welcome. It did trigger one assertion in the TextFrame code though: https://tbpl.mozilla.org/php/getParsedLog.php?id=35491913&tree=Try&full=1 which is this assertion: http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsTextFrame.cpp#2642 so appearantly it lead to a situation where we are painting a text frame before it has been reflowed. This doesn't really seem like a bug to me, since with interruptable reflow you can easily have unreflowed text frames in the tree. I see that we have a few bugs on file on this assertion already. I've included a fix. https://tbpl.mozilla.org/?tree=Try&rev=da343bba5e14 https://tbpl.mozilla.org/?tree=Try&rev=ec9a45f789f6 (I've triggered a bunch of "bc" runs on 10.8 to see if the crash still occurs)
Assignee | ||
Comment 26•10 years ago
|
||
Fwiw, there were no crashes in 139 "bc" runs on 10.8. And it looks green otherwise, with the usual crop of known oranges.
Assignee | ||
Comment 27•10 years ago
|
||
I will investigate the nsTextFrame assertion a bit further...
Assignee | ||
Comment 28•10 years ago
|
||
It's the "testuser2" nsTextFrame inside a nsTextControlFrame that hasn't been reflowed yet. Its ancestors up to and including the nsTextControlFrame are marked dirty. Given the test is test_basic_form_autocomplete.html I suspect the <input> value was changed by form fill. The old code is more eager to run script runners - it's wrapping each "presShell->Paint" with a separate script blocker, thus running new ones after each paint. In my first "wip" patch, they are delayed and only run after all painting is done. So it might be that some script runner in nsTextControlFrame / form-fill code depend on the old behavior.
Assignee | ||
Comment 29•10 years ago
|
||
Running the ResetWidgetBounds() part outside the script blocker seems to work without triggering the nsTextFrame assertion. This should be robust enough to not crash I think. https://tbpl.mozilla.org/?tree=Try&rev=320cbb7d11a0 (I still think the nsTextFrame assertion is essentially bogus, but I don't have time to pursue it.)
Attachment #8384289 -
Attachment is obsolete: true
Attachment #8385389 -
Flags: review?(tnikkel)
Assignee | ||
Comment 30•10 years ago
|
||
Forgot to include a link to the Android/B2G test run: https://tbpl.mozilla.org/?tree=Try&rev=51111a806cfe
Comment 31•10 years ago
|
||
Comment on attachment 8385389 [details] [diff] [review] fix >+ for (size_t i = 0; i < widgets.Length(); ++i) { I think Length returns a uint32_t. >+ nsView* view = nsView::GetViewFor(widgets[i]); >+ if (view) { >+ view->ResetWidgetBounds(false, true); >+ } >+ } >+ if (aFlushDirtyRegion) { >+ nsAutoScriptBlocker scriptBlocker; >+ SetPainting(true); Since ResetWidgetBounds can re-enter I think at this point we need to guard against |this| being destroyed. Perhaps just do this per-widget and call it lower down when we have a view and a vm from the widget. >+ for (size_t i = 0; i < widgets.Length(); ++i) { >+ nsIWidget* widget = widgets[i]; >+ nsView* view = nsView::GetViewFor(widget); >+ if (view && widget->NeedsPaint()) { >+ view->GetViewManager()->ProcessPendingUpdatesPaint(widget); >+ } >+ } >+ SetPainting(false); >+ } >+} It looks like if we encounter a widget that has NeedsPaint() being false then we won't call FlushDirtyRegionToWidget, whereas we did before. Otherwise looks good.
Attachment #8385389 -
Flags: review?(tnikkel)
Assignee | ||
Comment 32•10 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #31) > >+ for (size_t i = 0; i < widgets.Length(); ++i) { > > I think Length returns a uint32_t. Fixed, but I think it's a bug that it doesn't return a size_t. > >+ if (aFlushDirtyRegion) { > >+ nsAutoScriptBlocker scriptBlocker; > >+ SetPainting(true); > > Since ResetWidgetBounds can re-enter I think at this point we need to guard > against |this| being destroyed. Perhaps just do this per-widget and call it > lower down when we have a view and a vm from the widget. Fair enough. I'd like to keep it wrapped around the whole loop though so I added a strong ref on the shell and check that 'this' is still its vm. > It looks like if we encounter a widget that has NeedsPaint() being false > then we won't call FlushDirtyRegionToWidget, whereas we did before. ProcessPendingUpdatesRecurse does that for non-widget views. https://tbpl.mozilla.org/?tree=Try&rev=3f8f8b4e505f https://tbpl.mozilla.org/?tree=Try&rev=faa0e0080fa0
Assignee: nobody → matspal
Attachment #8385389 -
Attachment is obsolete: true
Attachment #8386749 -
Flags: review?(tnikkel)
Assignee | ||
Comment 33•10 years ago
|
||
Comment on attachment 8386749 [details] [diff] [review] fix > > It looks like if we encounter a widget that has NeedsPaint() being false > > then we won't call FlushDirtyRegionToWidget, whereas we did before. > > ProcessPendingUpdatesRecurse does that for non-widget views. Oh, I see that I misunderstood your comment there.
Attachment #8386749 -
Flags: review?(tnikkel)
Updated•10 years ago
|
Summary: Intermittent TEST-UNEXPECTED-FAIL | test/browser_dbg_variables-view-popup-07.js | application terminated with exit code 256 | [@ nsView::DoResetWidgetBounds(bool, bool)] | | leakcheck | 2968 bytes leaked (CompositorChild, CondVar, Mutex, PCompositorChild → Intermittent browser_dbg_variables-view-popup-07.js | [@ nsView::DoResetWidgetBounds(bool, bool)] or [@ nsViewManager::ProcessPendingUpdatesForView(nsView*, bool)]
Assignee | ||
Comment 35•10 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #31) > It looks like if we encounter a widget that has NeedsPaint() being false > then we won't call FlushDirtyRegionToWidget, whereas we did before. Good catch. The name "NeedsPaint" seems slightly misleading though - "HasNonEmptyVisibleArea" would be a better name for what it actually does so skipping FlushDirtyRegionToWidget when it's false would probably be OK. Anyway, let's play it safe and not change that bit. https://tbpl.mozilla.org/?tree=Try&rev=00286c634d36
Attachment #8386749 -
Attachment is obsolete: true
Attachment #8387388 -
Flags: review?(tnikkel)
Comment 36•10 years ago
|
||
Comment on attachment 8387388 [details] [diff] [review] fix Looks good, thanks!
Attachment #8387388 -
Flags: review?(tnikkel) → review+
Comment 37•10 years ago
|
||
Maybe this will fix bug 974542?
Assignee | ||
Comment 38•10 years ago
|
||
https://tbpl.mozilla.org/?usebuildbot=1&tree=Mozilla-Inbound
Flags: in-testsuite-
Comment 39•10 years ago
|
||
landing |
I think you meant https://hg.mozilla.org/integration/mozilla-inbound/rev/4fe82d0824d1
Assignee | ||
Comment 40•10 years ago
|
||
Yes, thank you. I've backed it out temporarily to investigate a Tp5 regression: https://hg.mozilla.org/integration/mozilla-inbound/rev/d1c54bc2bb71
Assignee | ||
Comment 41•10 years ago
|
||
The perf-o-matic chart btw: http://graphs.mozilla.org/graph.html#tests=[[257,131,35]]&sel=1394102638000,1394275438000&displayrange=7&datatype=running
Comment 42•10 years ago
|
||
There's since been many new results and the graph has not gone back done. Something else must have caused it.
Comment 43•10 years ago
|
||
Looking at the graph values this changeset also seems to be in the regression range https://hg.mozilla.org/integration/mozilla-inbound/rev/45ac7b7d7466
Assignee | ||
Comment 44•10 years ago
|
||
I don't see a difference after the backout either, so I've re-landed it: https://hg.mozilla.org/integration/mozilla-inbound/rev/399e4c0b9a6f
Assignee | ||
Comment 45•10 years ago
|
||
(In reply to Timothy Nikkel (:tn) from comment #43) > Looking at the graph values this changeset also seems to be in the > regression range Thanks, the other people with patches in the regression window are already notified.
Comment 46•10 years ago
|
||
landing |
https://hg.mozilla.org/mozilla-central/rev/399e4c0b9a6f
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Updated•10 years ago
|
Group: layout-core-security → core-security
Comment 47•10 years ago
|
||
Is this safe to take on 28 this late in the game?
status-b2g-v1.3:
--- → affected
status-b2g-v1.4:
--- → fixed
status-firefox-esr24:
--- → unaffected
Flags: needinfo?(matspal)
Assignee | ||
Comment 48•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #47) > Is this safe to take on 28 this late in the game? No, that's too risky IMO. I'd like at least a couple of weeks in Beta for something like this to find any regressions. It should be OK for Aurora though.
Flags: needinfo?(matspal)
Assignee | ||
Comment 49•10 years ago
|
||
Comment on attachment 8387388 [details] [diff] [review] fix [Approval Request Comment] Bug caused by (feature/regressing bug #): - User impact if declined: use-after-free crash Testing completed (on m-c, etc.): on m-c since 2014-03-08 Risk to taking this patch (and alternatives if risky): low-to-medium risk String or IDL/UUID changes made by this patch: none
Attachment #8387388 -
Flags: approval-mozilla-aurora?
Updated•10 years ago
|
Attachment #8387388 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 50•10 years ago
|
||
landing |
https://hg.mozilla.org/releases/mozilla-aurora/rev/28d2834f0617
Comment 51•10 years ago
|
||
We may want to get this on b2g28 (v1.3) at some point still.
Assignee | ||
Comment 52•10 years ago
|
||
Note that you need the patches in bug 978001 too.
Comment 53•10 years ago
|
||
Took a look at crash-stats. This seems to have fixed our top two crashes that have nsView in the signature.
Comment 54•10 years ago
|
||
We've shipped a couple betas now that include this change. Mats, think this has baked enough that we're OK getting it on b2g28?
Flags: needinfo?(matspal)
Comment 56•10 years ago
|
||
Thanks! https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/31177245de3f
status-b2g-v1.2:
--- → unaffected
Updated•10 years ago
|
status-b2g-v1.3T:
--- → fixed
Updated•10 years ago
|
Whiteboard: [adv-main29+]
Updated•9 years ago
|
Group: core-security
Updated•6 years ago
|
Component: Layout: View Rendering → Layout: Web Painting
You need to log in
before you can comment on or make changes to this bug.
Description
•