Closed Bug 847002 Opened 12 years ago Closed 12 years ago

crash in AlignWithLayerPixels @ PresShell::Paint

Categories

(Core Graveyard :: Widget: Android, defect)

22 Branch
ARM
Android
defect
Not set
critical

Tracking

(firefox20 unaffected, firefox21 fixed, firefox22+ fixed)

RESOLVED FIXED
mozilla22
Tracking Status
firefox20 --- unaffected
firefox21 --- fixed
firefox22 + fixed

People

(Reporter: scoobidiver, Assigned: kats)

References

Details

(4 keywords, Whiteboard: [native-crash])

Crash Data

Attachments

(2 files, 1 obsolete file)

With the below stack trace, it first showed up in 22.0a1/20130301 and is now #2 top crasher in that build (currently hit by 6 users). The regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b0e08db3bc2a&tochange=993d7aff3109 I suspect bug 844275. Signature PresShell::Paint(nsView*, nsRegion const&, unsigned int) More Reports Search UUID 4d56fe7c-3c63-43ec-bf76-9c1792130301 Date Processed 2013-03-01 23:08:42 Uptime 74 Last Crash 1.4 minutes before submission Install Age 8.1 hours since version was first installed. Install Time 2013-03-01 15:03:38 Product FennecAndroid Version 22.0a1 Build ID 20130301030909 Release Channel nightly OS Android OS Version 0.0.0 Linux 2.6.32.9-perf #1 PREEMPT Mon Apr 2 10:39:26 KST 2012 armv7l KDDI/PTI06/jmasai/8x50:2.2.1/FRG83/sirius_alpha_ver4_0402:user/release-keys Build Architecture arm Build Architecture Info Crash Reason SIGSEGV Crash Address 0x0 App Notes AdapterDescription: 'Qualcomm -- Adreno 200 -- OpenGL ES 2.0 1309647 -- Model: IS06, Product: PTI06, Manufacturer: PANTECH, Hardware: qcom' EGL? EGL+ GL Context? GL Context+ GL Layers? GL Layers+ PANTECH IS06 KDDI/PTI06/jmasai/8x50:2.2.1/FRG83/sirius_alpha_ver4_0402:user/release-keys Processor Notes sp-processor01.phx1.mozilla.com_16743:2008; exploitablity tool: ERROR: unable to analyze dump EMCheckCompatibility True Adapter Vendor ID Qualcomm Adapter Device ID Adreno 200 Device PANTECH IS06 Android API Version 8 (REL) Android CPU ABI armeabi-v7a Frame Module Signature Source 0 libxul.so PresShell::Paint obj-firefox/dist/include/Layers.h:262 1 libxul.so nsView::GetNearestWidget const view/src/nsView.cpp:852 2 libxul.so AlignWithLayerPixels layout/generic/nsGfxScrollFrame.cpp:1787 More reports at: https://crash-stats.mozilla.com/report/list?product=FennecAndroid&signature=PresShell%3A%3APaint%28nsView*%2C%20nsRegion%20const%26%2C%20unsigned%20int%29
The stack in these crashes looks messed up, frame 0 is the only reasonable thing. That being said, it does look like it could be caused by bug 844275, because if GetLayerManager returns null in PresShell::Paint, then the function could end up calling HasShadowManager() on a null object, which would explain the Layers.h:262 crash source. I will investigate.
Assignee: nobody → bugmail.mozilla
Blocks: 844275
Crash Signature: [@ PresShell::Paint(nsView*, nsRegion const&, unsigned int)] → [@ PresShell::Paint(nsView*, nsRegion const&, unsigned int) ]
I stuck a breakpoint at nsPresShell::Paint to see where it gets called from. The main codepaths I found are these: #0 PresShell::Paint (this=0x5eef6540, aViewToPaint=0x621fe9c0, aDirtyRegion=..., aFlags=1) at /Users/kats/zspace/mozilla-git/layout/base/nsPresShell.cpp:5513 #1 0x6788d878 in nsViewManager::ProcessPendingUpdatesForView (this=0x62402a00, aView=0x621fe9c0, aFlushDirtyRegion=true) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:400 #2 0x6788f3ee in nsViewManager::ProcessPendingUpdates (this=0x62402a00) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:1120 #3 0x67250d6a in nsRefreshDriver::Tick (this=0x61299200, aNowEpoch=1362398490480590, aNowTime=...) at /Users/kats/zspace/mozilla-git/layout/base/nsRefreshDriver.cpp:960 #0 PresShell::Paint (this=0x5eef6540, aViewToPaint=0x621fe9c0, aDirtyRegion=..., aFlags=2) at /Users/kats/zspace/mozilla-git/layout/base/nsPresShell.cpp:5513 #1 0x6788d5a4 in nsViewManager::Refresh (this=0x62402a00, aView=0x621fe9c0, aRegion=...) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:335 #2 0x6788e1b4 in nsViewManager::PaintWindow (this=0x62402a00, aWidget=0x612b5140, aRegion=..., aFlags=0) at /Users/kats/zspace/mozilla-git/view/src/nsViewManager.cpp:650 #3 0x6788c6aa in nsView::PaintWindow (this=0x621fe9c0, aWidget=0x612b5140, aRegion=..., aFlags=0) at /Users/kats/zspace/mozilla-git/view/src/nsView.cpp:972 #4 0x681cf6cc in nsWindow::DrawTo (this=0x612b5140, targetSurface=0x64ebf220, invalidRect=...) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:981 #5 0x681cf92e in nsWindow::DrawTo (this=0x612b4c00, targetSurface=0x64ebf220, invalidRect=...) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:1021 #6 0x681cfb9a in nsWindow::OnDraw (this=0x612b4c00, ae=0x61f67050) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:1072 #7 0x681cf1f4 in nsWindow::OnGlobalAndroidEvent (ae=0x61f67050) at /Users/kats/zspace/mozilla-git/widget/android/nsWindow.cpp:871 The first codepath is guarded by a call to widget->NeedsPaint [1] which should return false because sCompositorPaused will be true while we don't have a layer manager. The second codepath is guarded by a direct check of sCompositorPaused at [2] so that also shouldn't get hit when we don't have a layer manager. There must be some other way this codepath is getting, so I'd like to turn the NS_ASSERTION into something that actually aborts in order to get a stack from it. [1] http://mxr.mozilla.org/mozilla-central/source/view/src/nsViewManager.cpp#375 [2] http://mxr.mozilla.org/mozilla-central/source/widget/android/nsWindow.cpp#1047
I'm hoping crashing when layer manager is null will get us a more reliable stack as to when this condition is getting hit. The NS_ASSERTION just above doesn't do anything in release builds.
Attachment #720674 - Flags: review?(matt.woodrow)
Whiteboard: [native-crash] → [native-crash][leave open]
Attachment #720674 - Flags: review?(matt.woodrow) → review+
Well. It's crashing at the MOZ_CRASH, but the stack is still bananas. https://crash-stats.mozilla.com/report/index/fc6e630e-5fc9-4f36-9bba-8ca542130306
I'm able to reproduce this crash using the following steps: 1. Open Fennec 2. Install Adblock Plus 3. Press "Filter subscriptions" list button in Adblock settings -build: Firefox for Android 22.0a1 (2013-03-12) -device: Samsung Galaxy Nexus -OS: Android 4.1.1
Keywords: reproducible
Blocks: 840703
No longer blocks: 840703
(In reply to Andreea Pod from comment #7) > I'm able to reproduce this crash using the following steps: > > 1. Open Fennec > 2. Install Adblock Plus > 3. Press "Filter subscriptions" list button in Adblock settings > Awesome, thanks for the STR! I'm able to reproduce this crash as well now. It looks like the code is trying to paint a eWindowType_popup window with mPopupType of ePopupTypeMenu. For this kind of window, ShouldUseOffMainThreadCompositing returns false, and so we don't use the global/static OMTC layer manager. However nsWindow::NeedsPaint() still returns true for this window because it is in the top window's window hierarchy and sCompositorPaused is false. I think the correct fix here is to return false from nsWindow::NeedsPaint() if GetLayerManager() returns null. I had debated doing that in bug 844275 but decided not to since I didn't see how this condition could be hit. Well, now I know.
Slight cosmetic fixup
Attachment #724724 - Attachment is obsolete: true
Attachment #724724 - Flags: review?(chrislord.net)
Attachment #724725 - Flags: review?(chrislord.net)
Component: Layout → Widget: Android
Comment on attachment 724725 [details] [diff] [review] Guard codepaths that assume GetLayerManager doesn't return null Review of attachment 724725 [details] [diff] [review]: ----------------------------------------------------------------- Looks fine. I guess we're none the wiser as to when this was happening? ::: widget/android/nsWindow.cpp @@ +2329,5 @@ > > bool > nsWindow::NeedsPaint() > { > + if (sCompositorPaused || FindTopLevel() != nsWindow::TopWindow() || !GetLayerManager(nullptr)) { Can there be a null layer manager when the compositor isn't paused?
Attachment #724725 - Flags: review?(chrislord.net) → review+
(In reply to Chris Lord [:cwiiis] from comment #11) > Looks fine. I guess we're none the wiser as to when this was happening? Well it was trying to paint a nsWindow that wasn't the main window. On Android we don't really support that, and in this case the "undefined behaviour" resulting was a crash. But we shouldn't ever crash so this fixes the crash. > > Can there be a null layer manager when the compositor isn't paused? For the non-main window, yes. For the main window there will always be a non-null layer manager if the compositor isn't paused. https://hg.mozilla.org/integration/mozilla-inbound/rev/cf64837a7dd7
I also backed out the patch from attachment 720674 [details] [diff] [review] since it didn't really help with getting better stack traces and it clutters the code. https://hg.mozilla.org/integration/mozilla-inbound/rev/7d91d21471d0
Whiteboard: [native-crash][leave open] → [native-crash]
Attachment #720674 - Flags: checkin+ → checkin-
Comment on attachment 724725 [details] [diff] [review] Guard codepaths that assume GetLayerManager doesn't return null [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 844275 and friends User impact if declined: sometimes crashes when multiple XUL windows are present, e.g. in the STR in comment 7 Testing completed (on m-c, etc.): locally only. this should bake on m-c to verify it fixes the crashes before being uplifted. Risk to taking this patch (and alternatives if risky): low risk, fennec only String or UUID changes made by this patch: none
Attachment #724725 - Flags: approval-mozilla-aurora?
Comment 15 corresponds to the backout patch in comment 13. The m-c cset for the fix from comment 12 is: https://hg.mozilla.org/mozilla-central/rev/cf64837a7dd7
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Attachment #724725 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Target Milestone: --- → mozilla22
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8) > It looks like the code is trying to paint a eWindowType_popup window with > mPopupType of ePopupTypeMenu. (In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12) > Well it was trying to paint a nsWindow that wasn't the main window. On > Android we don't really support that, and in this case the "undefined > behaviour" resulting was a crash. But we shouldn't ever crash so this fixes > the crash. Why do we create these popup windows on android then? Can we just not create them in the first place?
That would be the correct solution, yes. But I don't know who's creating the popup windows. In the general case it might even be addons, which we can't fix, so having this guard is good.
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: