Closed Bug 1270232 Opened 8 years ago Closed 2 years ago

Startup crash in mozilla::FrameLayerBuilder::PaintItems

Categories

(Core :: Web Painting, defect)

49 Branch
Unspecified
Windows 10
defect

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox47 --- affected
firefox48 --- affected
firefox49 --- affected
firefox-esr45 --- affected
firefox50 --- affected

People

(Reporter: u279076, Unassigned)

Details

(Keywords: crash, Whiteboard: gfx-noted)

Crash Data

This bug was filed from the Socorro interface and is report bp-2e08b5e2-68e2-43f0-9add-aeafe2160502.
=============================================================
0 	xul.dll 	mozilla::FrameLayerBuilder::PaintItems(nsTArray<mozilla::FrameLayerBuilder::ClippedDisplayItem>&, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, gfxContext*, nsRenderingContext*, nsDisplayListBuilder*, nsPresContext*, mozilla::gfx::IntPointTyped<mozilla::gfx::UnknownUnits> const&, float, float, int) 	layout/base/FrameLayerBuilder.cpp:5576
1 	xul.dll 	mozilla::FrameLayerBuilder::DrawPaintedLayer(mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*) 	layout/base/FrameLayerBuilder.cpp:5725
2 	xul.dll 	mozilla::layers::ClientPaintedLayer::PaintThebes() 	gfx/layers/client/ClientPaintedLayer.cpp:95
3 	xul.dll 	mozilla::layers::ClientPaintedLayer::RenderLayerWithReadback(mozilla::layers::ReadbackProcessor*) 	gfx/layers/client/ClientPaintedLayer.cpp:149
4 	xul.dll 	mozilla::layers::ClientContainerLayer::RenderLayer() 	gfx/layers/client/ClientContainerLayer.h:65
5 	xul.dll 	mozilla::layers::ClientLayerManager::EndTransactionInternal(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/client/ClientLayerManager.cpp:299
6 	xul.dll 	mozilla::layers::ClientLayerManager::EndTransaction(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/client/ClientLayerManager.cpp:342
7 	xul.dll 	nsDisplayList::PaintRoot(nsDisplayListBuilder*, nsRenderingContext*, unsigned int) 	layout/base/nsDisplayList.cpp:1844
8 	xul.dll 	nsLayoutUtils::PaintFrame(nsRenderingContext*, nsIFrame*, nsRegion const&, unsigned int, nsDisplayListBuilderMode, nsLayoutUtils::PaintFrameFlags) 	layout/base/nsLayoutUtils.cpp:3572
9 	xul.dll 	PresShell::Paint(nsView*, nsRegion const&, unsigned int) 	layout/base/nsPresShell.cpp:6366
10 	xul.dll 	nsViewManager::ProcessPendingUpdatesPaint(nsIWidget*) 	view/nsViewManager.cpp:467
11 	xul.dll 	nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) 	view/nsViewManager.cpp:398
12 	xul.dll 	nsViewManager::ProcessPendingUpdates() 	view/nsViewManager.cpp:1100
13 	xul.dll 	nsRefreshDriver::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:1900
14 	xul.dll 	mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, __int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:274
15 	xul.dll 	mozilla::RefreshDriverTimer::TickRefreshDrivers(__int64, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) 	layout/base/nsRefreshDriver.cpp:246
16 	xul.dll 	mozilla::RefreshDriverTimer::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:265
17 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:588
18 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:508
19 	xul.dll 	nsRunnableMethodImpl<void ( mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::*)(mozilla::TimeStamp), 1, mozilla::TimeStamp>::Run() 	obj-firefox/dist/include/nsThreadUtils.h:709
20 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:989
21 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp:290
22 	xul.dll 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:98
23 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:226
24 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:206
25 	xul.dll 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp:156
26 	xul.dll 	nsAppShell::Run() 	widget/windows/nsAppShell.cpp:262
27 	xul.dll 	nsAppStartup::Run() 	toolkit/components/startup/nsAppStartup.cpp:284
28 	xul.dll 	XREMain::XRE_mainRun() 	toolkit/xre/nsAppRunner.cpp:4347
29 	xul.dll 	XREMain::XRE_main(int, char** const, nsXREAppData const*) 	toolkit/xre/nsAppRunner.cpp:4451
30 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:4559
31 	firefox.exe 	do_main 	browser/app/nsBrowserApp.cpp:220
32 	firefox.exe 	NS_internal_main(int, char**, char**) 	browser/app/nsBrowserApp.cpp:360
33 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:135
34 	firefox.exe 	__scrt_common_main_seh 	f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:255
Ø 35 	kernel32.dll 	kernel32.dll@0xa553 	
Ø 36 	ntdll.dll 	ntdll.dll@0x4f720 	
Ø 37 	kernelbase.dll 	kernelbase.dll@0x514f 	
=============================================================
More reports: https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3AFrameLayerBuilder%3A%3APaintItems&date=2016-05-04+18%3A35%3A48&product=Firefox&version=Firefox%3A49.0a1&range_unit=hours&range_value=168#tab-reports

This crash has been reported 48 times in a couple of days, apparently all by a single Nightly user. The user is on Windows 10 with an NVIDIA GeForce 9500 GT using the latest driver (9.18.13.4195).

The reports start with the Nightly from May 2nd. I'm not sure whether anything in this range stands out as a regression:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1461a4071341c282afcf7b72e33036412d2251d4&tochange=77cead2cd20300623eea2416bc9bce4d5021df09

It is intriguing, however, that this seems isolated to a single person. Because this is a startup crash, the user is likely stranded. The last report was from 2016-05-02T18:52:28+00:00, so I suspect the user either did something on their end to work around it or we've lost them altogether.

I'm not sure what we can do in this situation but I'm filing it nonetheless.
Flags: needinfo?(milan)
Matt, any ideas?
Component: Graphics: Layers → Layout
Flags: needinfo?(milan) → needinfo?(matt.woodrow)
Yes, it looks like this user got a corrupt DLL somehow.

We're crashing due to an invalid instruction.

The minidump contains the memory for the instructions it was trying to run, and I compared those bytes to what was in the DLL downloaded from the symbol server.

A single byte changed from 0x4D in the symbol server DLL to 0xCD in the user's memory (bit 7 flipped from 0 to 1).

That's no longer a valid instruction, and we crash every time this function runs (it's painting, so basically as soon as we start up).
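
To make the comparison concrete, here is a minimal sketch (not the tooling actually used on this minidump) of diffing two byte ranges and reporting which bits differ. The helper names `flipped_bits` and `first_mismatches` are hypothetical; only the values 0x4D and 0xCD come from this bug.

```python
def flipped_bits(expected: int, observed: int) -> list[int]:
    """Return the bit positions (0 = least significant) where two bytes differ."""
    diff = expected ^ observed
    return [bit for bit in range(8) if diff & (1 << bit)]


def first_mismatches(reference: bytes, dumped: bytes) -> list[tuple[int, int, int]]:
    """Return (offset, reference_byte, dumped_byte) for every differing position."""
    return [
        (offset, ref, got)
        for offset, (ref, got) in enumerate(zip(reference, dumped))
        if ref != got
    ]


# The byte pair reported in this bug: only the top bit differs.
print(flipped_bits(0x4D, 0xCD))  # [7]
```

A result with exactly one entry is what a single-bit flip looks like; several differing bits or many differing offsets would point toward a different kind of corruption.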

I have no useful guesses as to how this user ended up with a corrupt DLL on their drive.
Flags: needinfo?(matt.woodrow)
This definitely isn't a layout bug, but I don't know where it belongs either.

I suspect we only caught this because the user was persistent enough to try 48 times before finally giving up.

If random binary corruption happens to other users, they'd likely give up (and abandon Firefox) much earlier than this, and the crash volume would be too low to register in any analysis.

I would assume each corruption would be for a random bit (so an entirely different crash stack), and the only correlation between them would be the EXCEPTION_ILLEGAL_INSTRUCTION crash reason.

It seems plausible that this happens to a lot of users (see bug 1034706 comment 44 for another example of corrupted memory), and we're failing to spot the correlation and thus volume.
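
One way to look for that correlation would be to aggregate by crash reason instead of by signature. The sketch below assumes crash reports are available as dictionaries with hypothetical `reason` and `signature` keys; it does not use any real Socorro API, and the sample records are invented for illustration.

```python
from collections import Counter, defaultdict


def summarize_by_reason(reports):
    """Map each crash reason to (total reports, number of distinct signatures)."""
    totals = Counter()
    signatures = defaultdict(set)
    for report in reports:
        totals[report["reason"]] += 1
        signatures[report["reason"]].add(report["signature"])
    return {
        reason: (count, len(signatures[reason]))
        for reason, count in totals.items()
    }


# Hypothetical sample data: field names and values are assumptions.
sample = [
    {"reason": "EXCEPTION_ILLEGAL_INSTRUCTION",
     "signature": "mozilla::FrameLayerBuilder::PaintItems"},
    {"reason": "EXCEPTION_ILLEGAL_INSTRUCTION", "signature": "hypothetical::A"},
    {"reason": "EXCEPTION_ACCESS_VIOLATION_READ", "signature": "hypothetical::B"},
]
for reason, (count, distinct) in summarize_by_reason(sample).items():
    print(reason, count, distinct)
```

A reason with many reports spread thinly across many distinct signatures is the pattern described above: each signature is too rare to notice, but the total may not be.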

Anthony, do you know who would be the best person to look into this further?
Flags: needinfo?(anthony.s.hughes)
(In reply to Matt Woodrow (:mattwoodrow) from comment #3)
> Anthony, do you know who would be the best person to look into this further?

Unfortunately not.
Flags: needinfo?(anthony.s.hughes)
Nicholas, anything "Uptime" can do about figuring out a home for weird crashes like this one? See comment 2 and bug 1034706 comment 44, for example. The volume is low here, but if there is randomness involved, there may be a lot lurking.
Flags: needinfo?(n.nethercote)
In bug 1034706 and bug 1270554 we've been having discussions about JIT crashes that appear to be caused by bad hardware -- bit flips probably caused by faulty memory. One idea for dealing with it is to perform some kind of checksum on JITted code to make sure that this hasn't happened.

Corrupted binaries are a similar sort of problem. Maybe we could do a checksum of our code, such as libxul? I'll add a note to the Uptime wiki about this idea.
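
As a rough illustration of the checksum idea, the sketch below hashes a file on disk and compares it to a known-good digest. This is only an assumption about how such a check might look, not anything Firefox does: the function names are hypothetical, SHA-256 is an arbitrary choice, and where the expected digest would come from (for example, a build manifest) is left open.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large library is never loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_hex: str) -> bool:
    """Return True if the file's digest matches the expected (known-good) digest."""
    return file_sha256(path) == expected_hex.lower()
```

A check like this would catch a corrupt file on disk, such as the DLL in this bug. It would not catch bits that flip in memory after the file is loaded, which is the case the JIT discussion is concerned with.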
Flags: needinfo?(n.nethercote)
Crash volume for signature 'mozilla::FrameLayerBuilder::PaintItems':
 - nightly(version 50):0 crashes from 2016-06-06.
 - aurora (version 49):3 crashes from 2016-06-07.
 - beta   (version 48):102 crashes from 2016-06-06.
 - release(version 47):422 crashes from 2016-05-31.
 - esr    (version 45):31 crashes from 2016-04-07.

Crash volume on the last weeks:
            W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - nightly       0       0       0       0       0       0       0
 - aurora        1       0       1       0       0       0       1
 - beta          5      28       8      12      15      15      14
 - release      63      51      47      57      64      61      57
 - esr           0       2       1       3      12       5       0

Affected platforms: Windows, Mac OS X, Linux
Crash volume for signature 'mozilla::FrameLayerBuilder::PaintItems':
 - nightly (version 51): 0 crashes from 2016-08-01.
 - aurora  (version 50): 1 crash from 2016-08-01.
 - beta    (version 49): 16 crashes from 2016-08-02.
 - release (version 48): 88 crashes from 2016-07-25.
 - esr     (version 45): 33 crashes from 2016-05-02.

Crash volume on the last weeks (Week N is from 08-22 to 08-28):
            W. N-1  W. N-2  W. N-3
 - nightly       0       0       0
 - aurora        0       1       0
 - beta          5       6       4
 - release      24      26      12
 - esr           1       1       1

Affected platforms: Windows, Mac OS X, Linux

Crash rank on the last 7 days:
           Browser   Content     Plugin
 - nightly
 - aurora
 - beta    #5767     #1187
 - release #701
 - esr     #4858
Component: Layout → Layout: View Rendering
Component: Layout: View Rendering → Layout: Web Painting
QA Whiteboard: qa-not-actionable

Since the crash volume is low (less than 5 per week), the severity is downgraded to S3. Feel free to change it back if you think the bug is still critical.

For more information, please visit auto_nag documentation.

Severity: critical → S3

The crashing code no longer exists.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME