1153736 - crash in nsTextFrame::EnsureTextRun(nsTextFrame::TextRunType, gfxContext*, nsIFrame*, nsLineList_iterator const*, unsigned int*)

Reporter

Description

•

10 years ago

This bug was filed from the Socorro interface and is report bp-0bb425dc-34db-467f-87cb-85c862150413. ============================================================= crashed while doing sheriff work on treeherder (like marking failures etc) Crashing Thread Frame Module Signature Source 0 XUL nsTextFrame::EnsureTextRun(nsTextFrame::TextRunType, gfxContext*, nsIFrame*, nsLineList_iterator const*, unsigned int*) obj-firefox/x86_64/dist/include/nsExpirationTracker.h 1 XUL nsTextFrame::PaintText(nsRenderingContext*, nsPoint, nsRect const&, nsCharClipDisplayItem const&, gfxTextContextPaint*, nsTextFrame::DrawPathCallbacks*) layout/generic/nsTextFrame.cpp 2 XUL nsDisplayText::Paint(nsDisplayListBuilder*, nsRenderingContext*) layout/generic/nsTextFrame.cpp 3 XUL mozilla::FrameLayerBuilder::PaintItems(nsTArray<mozilla::FrameLayerBuilder::ClippedDisplayItem>&, nsIntRect const&, gfxContext*, nsRenderingContext*, nsDisplayListBuilder*, nsPresContext*, nsIntPoint const&, float, float, int) layout/base/FrameLayerBuilder.cpp 4 XUL mozilla::FrameLayerBuilder::DrawPaintedLayer(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*) layout/base/FrameLayerBuilder.cpp 5 XUL mozilla::layers::ClientTiledLayerBuffer::PostValidate(nsIntRegion const&) gfx/layers/client/TiledContentClient.cpp 6 XUL mozilla::layers::TiledLayerBuffer<mozilla::layers::ClientTiledLayerBuffer, mozilla::layers::TileClient>::Update(nsIntRegion const&, nsIntRegion const&) gfx/layers/TiledLayerBuffer.h 7 XUL mozilla::layers::ClientTiledLayerBuffer::PaintThebes(nsIntRegion const&, nsIntRegion const&, void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*) gfx/layers/client/TiledContentClient.cpp 8 XUL mozilla::layers::ClientTiledPaintedLayer::RenderHighPrecision(nsIntRegion&, nsIntRegion const&, void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*) gfx/layers/client/ClientTiledPaintedLayer.cpp 9 XUL mozilla::layers::ClientTiledPaintedLayer::RenderLayer() gfx/layers/client/ClientTiledPaintedLayer.cpp 10 XUL mozilla::layers::ClientContainerLayer::RenderLayer() gfx/layers/client/ClientContainerLayer.h 11 XUL mozilla::layers::ClientContainerLayer::RenderLayer() gfx/layers/client/ClientContainerLayer.h 12 XUL mozilla::layers::ClientLayerManager::EndTransactionInternal(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) gfx/layers/client/ClientLayerManager.cpp 13 XUL mozilla::layers::ClientLayerManager::EndTransaction(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) gfx/layers/client/ClientLayerManager.cpp 14 XUL nsDisplayList::PaintRoot(nsDisplayListBuilder*, nsRenderingContext*, unsigned int) layout/base/nsDisplayList.cpp 15 XUL nsLayoutUtils::PaintFrame(nsRenderingContext*, nsIFrame*, nsRegion const&, unsigned int, unsigned int) layout/base/nsLayoutUtils.cpp 16 XUL PresShell::Paint(nsView*, nsRegion const&, unsigned int) layout/base/nsPresShell.cpp 17 XUL nsViewManager::ProcessPendingUpdatesPaint(nsIWidget*) view/nsViewManager.cpp 18 XUL nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) view/nsViewManager.cpp 19 XUL nsRefreshDriver::Tick(long long, mozilla::TimeStamp) layout/base/nsRefreshDriver.cpp 20 XUL nsRefreshDriver::FinishedWaitingForTransaction() layout/base/nsRefreshDriver.cpp 21 XUL mozilla::layers::CompositorChild::RecvDidComposite(unsigned long long const&, unsigned long long const&) gfx/layers/ipc/CompositorChild.cpp 22 XUL mozilla::layers::PCompositorChild::OnMessageReceived(IPC::Message const&) obj-firefox/x86_64/ipc/ipdl/PCompositorChild.cpp 23 XUL mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) ipc/glue/MessageChannel.cpp 24 XUL mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message const&) ipc/glue/MessageChannel.cpp 25 XUL mozilla::ipc::MessageChannel::OnMaybeDequeueOne() ipc/glue/MessageChannel.cpp 26 XUL MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) ipc/chromium/src/base/message_loop.cc 27 XUL MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc 28 XUL mozilla::ipc::DoWorkRunnable::Run()

Mats Palmgren (inactive)

Comment 1

•

10 years ago

The top frame indicates this source line: http://mxr.mozilla.org/mozilla-central/source/xpcom/ds/nsExpirationTracker.h?rev=fec90cbfbaad#149 We probably got there from MarkUsed(), which is called here: http://mxr.mozilla.org/mozilla-central/source/layout/generic/nsTextFrame.cpp?rev=fec90cbfbaad#2622

BMO Automation

Updated

•

10 years ago

Crash Signature: [@ nsTextFrame::EnsureTextRun(nsTextFrame::TextRunType, gfxContext*, nsIFrame*, nsLineList_iterator const*, unsigned int*)] → [@ nsTextFrame::EnsureTextRun(nsTextFrame::TextRunType, gfxContext*, nsIFrame*, nsLineList_iterator const*, unsigned int*)] [@ nsTextFrame::EnsureTextRun]

BugBot [:suhaib / :marco/ :calixte]

Comment 2

•

9 years ago

Crash volume for signature 'nsTextFrame::EnsureTextRun': - nightly (version 50): 0 crashes from 2016-06-06. - aurora (version 49): 0 crashes from 2016-06-07. - beta (version 48): 25 crashes from 2016-06-06. - release (version 47): 43 crashes from 2016-05-31. - esr (version 45): 4 crashes from 2016-04-07. Crash volume on the last weeks: W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7 - nightly 0 0 0 0 0 0 0 - aurora 0 0 0 0 0 0 0 - beta 3 2 1 0 5 11 3 - release 3 7 8 3 5 6 4 - esr 0 2 1 1 0 0 0 Affected platforms: Windows, Mac OS X

status-firefox48: --- → affected

status-firefox-esr45: --- → affected

BugBot [:suhaib / :marco/ :calixte]

Comment 3

•

6 years ago

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → WONTFIX

BugBot [:suhaib / :marco/ :calixte]

Comment 4

•

6 years ago

Closing because no crashes reported for 12 weeks.

Calixte Denizet (:calixte)

Comment 5

•

6 years ago

There are still some crashes so reopen it.

status-firefox48: affected → wontfix

status-firefox64: --- → affected

status-firefox65: --- → affected

status-firefox-esr45: affected → wontfix

status-firefox-esr60: --- → affected

Calixte Denizet (:calixte)

Updated

•

6 years ago

Status: RESOLVED → REOPENED

Resolution: WONTFIX → ---

Marcia Knous [:marcia]

Comment 6

•

6 years ago

Marking 66/67 as affected. Overall, not a very high volume crash. nsTextFrame::EnsureTextRun signature is the only one that is still crashing. High correlation to accessibility being on.

(100.0% in signature vs 24.18% overall) accessibility = Active [100.0% vs 29.81% if startup_crash = 0]
(73.33% in signature vs 03.51% overall) reason = SIGSEGV /SEGV_ACCERR
(53.33% in signature vs 01.93% overall) Module "libcsc.so" = true
(53.33% in signature vs 02.54% overall) Module "libexynosgscaler.so" = true
(53.33% in signature vs 02.78% overall) Module "libexynosscaler.so" = true
(60.00% in signature vs 12.58% overall) Module "libSEC_EGL.so" = true
(60.00% in signature vs 13.06% overall) Module "gralloc.exynos5.so" = true
(60.00% in signature vs 13.06% overall) Module "u:object_r:system_security_prop:s0" = true
(60.00% in signature vs 14.51% overall) Module "esecomm.jar" = true
(60.00% in signature vs 15.24% overall) Module "boot-esecomm.oat" = true
(60.00% in signature vs 15.24% overall) Module "system@framework@boot-esecomm.art" = true

status-firefox66: --- → affected

status-firefox67: --- → affected

[:philipp]

Comment 7

•

6 years ago

could this spike in crashes on android during 66 be related to bug 1518247?

Flags: needinfo?(eitan)

Eitan Isaacson [:eeejay]

Assignee

Comment 8

•

6 years ago

Good chance it is. I think this is because we calculate accessible names much more often. It looks like it aggrevated something in layout. At least from the accessibility modules perspective, I think we are doing are due diligence and null checking the frames. I'll see if I can somehow trigger this locally.

Flags: needinfo?(eitan)

Marcia Knous [:marcia]

Comment 9

•

6 years ago

This is #6 overall top crash in Fennec 66b11. I guess we shall see where it ends up in b13 which just shipped.

Eitan Isaacson [:eeejay]

Assignee

Comment 10

•

6 years ago

I'm hoping the patch in bug 1531346 will remedy this. But I want to wait before calling it a dup.

Eitan Isaacson [:eeejay]

Assignee

Updated

•

6 years ago

Assignee: nobody → eitan

Marcia Knous [:marcia]

Comment 11

•

6 years ago

It looks as if one signature in 67 has a crash in 20190308093915, the other signature has a crash in 20190307094951. Since then on nightly 67 I haven't seen further crashes.

RaresB

Updated

•

4 years ago

QA Whiteboard: qa-not-actionable

BMO Automation

Updated

•

3 years ago

Severity: critical → S2

Daniel Holbert [:dholbert]

Comment 12

•

3 years ago

•

Edited

Right now, it looks like a substantial portion of our crashes with this signature are stack overflows, without any particularly deep backtrace (i.e. not a recursive death spiral type overflow). e.g. this one: bp-c4fa80d3-88b7-4416-9f85-c24cb0221013

These stack overflows seem to be pointing to our call to BuildTextRuns -- we're exhausting stack memory when we try to allocate space for our local variables when we enter that function. Maybe there's something we can do to make that function a little cheaper in terms of stack-space-usage?

At first glance, it looks like the heavyweight things there are:

BuildTextRunsScanner scanner which brings along some AutoTArray instances which each reserve some stack space. (on the order of 100 pointers worth in total, I think? Three arrays with size 10, 50, and 10, where the first array stores pointer-pairs, and the last two arrays store individual pointers.)
Another AutoTArray<char16_t, BIG_TEXT_NODE_SIZE> buffer;
https://searchfox.org/mozilla-central/rev/76ccfc801e6b736c844cde3fddeab7a748fc8515/layout/generic/nsTextFrame.cpp#1590

Note that BIG_TEXT_NODE_SIZE is 4096, defined here, which goes back to bug 333659 in 2007 in this commit. 4096 feels quite large... That means this one AutoTArray uses at least 8KB of stack space (uint16_t is 2 bytes, and we're storing 4096 2-byte values, which is 8192 i.e. 8KB). That feels substantial. I wonder how often we actually grow to use all 4096 entries there, and if we could start lower and failing over to heap allocation when we do need the space.

jfkthame, do you think we could change our AutoTArray usage there (and maybe BIG_TEXT_NODE_SIZE in general) to be more conservative of stack space, to avoid triggering stack overflows when entering this function?

Crash Signature: [@ nsTextFrame::EnsureTextRun(nsTextFrame::TextRunType, gfxContext*, nsIFrame*, nsLineList_iterator const*, unsigned int*)] [@ nsTextFrame::EnsureTextRun] → [@ nsTextFrame::EnsureTextRun]

Flags: needinfo?(jfkthame)

Daniel Holbert [:dholbert]

Comment 13

•

3 years ago

(note: the crash report I linked in the previous comment was for a 32-bit system. There are also quite a few stack-overflow crashes with this signature on 64-bit systems, but so far I haven't been able to find a stack overflow here for a 64-bit system with a useful backtrace; they all seem to be missing any call-tree info. e.g. bp-c8be21bc-3676-43c5-a2c6-47f980220902 . I wonder if that's indicative of something else going wrong, vs. if we've somehow lost the ability to stackwalk when there's a stack overflow on some 64-bit systems, vs. something else.)

Daniel Holbert [:dholbert]

Comment 14

•

3 years ago

(In reply to Daniel Holbert [:dholbert] from comment #12)

jfkthame, do you think we could change our AutoTArray usage there (and maybe BIG_TEXT_NODE_SIZE in general) to be more conservative of stack space, to avoid triggering stack overflows when entering this function?

(This is just spitballing a little bit; I'm also not entirely sure that reducing this weight would reduce overall crash volume. Our stack limit is on the order of multiple megabytes, according to kmag's recollection -- so if we're reaching that limit, then shaving off a portion of 8KB in just this one function probably won't help us too much; maybe we'd still overshoot a little later on.)

Jonathan Kew [:jfkthame]

Comment 15

•

3 years ago

We could certainly try reducing BIG_TEXT_NODE_SIZE, though before picking a new number it might be interesting to do some instrumentation to see what the distribution of sizes is like while browsing various kinds of content. But as you suggest, it's not that big compared to the stack limit, so it's unclear whether it would really help significantly. In many cases, I think the real problem is recursion further up the call stack. EnsureTextRun (and the BuildTextRunsScanner it uses) does have a fairly large stack frame, and so it will often show up as the frame that finally overflows, but if we make that frame smaller it just means we'll get a few more levels of recursion in before it happens.

The other thing I wonder, looking at bp-c4fa80d3-88b7-4416-9f85-c24cb0221013 from comment 12, is what else is already on the main thread's stack, beyond the ~70 frames here which seem to be a callback processing a window-resized message. These stack frames don't look like nearly enough to cause overflow. Is it possible for this callback to be called by Windows while we're deep inside something else already?

Flags: needinfo?(jfkthame)

Wayne Mery (:wsmwk)

Updated

•

3 years ago

status-thunderbird_esr102: --- → affected

Whiteboard: [tbird crash]

BugBot [:suhaib / :marco/ :calixte]

Comment 16

•

3 years ago

Since the crash volume is low (less than 15 per week), the severity is downgraded to S3. Feel free to change it back if you think the bug is still critical.

For more information, please visit auto_nag documentation.

Severity: S2 → S3

Bugzilla

crash in nsTextFrame::EnsureTextRun(nsTextFrame::TextRunType, gfxContext, nsIFrame, nsLineList_iterator const, unsigned int)

Categories

(Core :: Layout, defect)

Tracking

()

People

(Reporter: cbook, Assigned: eeejay)

References

Details

(Keywords: crash, Whiteboard: [tbird crash])

Crash Data

Security

(public)

User Story

Description

Comment 1

Updated

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Updated

Comment 11

Updated

Updated

Comment 12

Comment 13

Comment 14

Comment 15

Updated

Comment 16