Intermittent Assertion failure: !aState.GetIsCurrentlyDecoded(), at FrameAnimator.cpp:471

RESOLVED FIXED in Firefox 60

Status

()

P4
normal
RESOLVED FIXED
a year ago
10 months ago

People

(Reporter: aryx, Assigned: aosmond)

Tracking

(Blocks: 1 bug, {intermittent-failure, regression})

unspecified
mozilla60
intermittent-failure, regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox57 wontfix, firefox58 wontfix, firefox59 wontfix, firefox60 fixed)

Details

Attachments

(2 attachments, 1 obsolete attachment)

https://treeherder.mozilla.org/logviewer.html#?job_id=122856153&repo=autoland

09:49:58     INFO - REFTEST TEST-LOAD | file:///Users/cltbld/tasks/task_1502638480/build/tests/reftest/tests/layout/reftests/invalidation/test-animated-image-layers-background.html | 898 / 996 (90%)
09:49:58     INFO - ++DOMWINDOW == 313 (0x124d34800) [pid = 1644] [serial = 2723] [outer = 0x10fe5d000]
09:49:58     INFO - [Child 1644] WARNING: stylo: ServoStyleSets cannot respond to document state changes yet (only matters for chrome documents). See bug 1290285.: file /home/worker/workspace/build/src/layout/base/PresShell.cpp, line 4327
09:49:58     INFO - Assertion failure: !aState.GetIsCurrentlyDecoded(), at /home/worker/workspace/build/src/image/FrameAnimator.cpp:471
09:49:58     INFO - #01: mozilla::image::RasterImage::LookupFrameInternal(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, unsigned int, mozilla::image::PlaybackType) [image/RasterImage.cpp:294]
09:49:58     INFO - 
09:49:58     INFO - #02: mozilla::image::RasterImage::LookupFrame(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, unsigned int, mozilla::image::PlaybackType) [image/RasterImage.cpp:358]
09:49:58     INFO - 
09:49:58     INFO - #03: mozilla::image::RasterImage::GetFrameInternal(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, unsigned int, unsigned int) [image/RasterImage.cpp:582]
09:49:58     INFO - 
09:49:58     INFO - #04: mozilla::image::RasterImage::GetCurrentImage(mozilla::layers::ImageContainer*, unsigned int) [mfbt/Tuple.h:329]
09:49:58     INFO - 
09:49:58     INFO - #05: mozilla::image::RasterImage::GetImageContainer(mozilla::layers::LayerManager*, unsigned int) [mfbt/Tuple.h:329]
09:49:58     INFO - 
09:49:58     INFO - #06: nsDisplayImageContainer::GetContainer(mozilla::layers::LayerManager*, nsDisplayListBuilder*) [xpcom/base/nsCOMPtr.h:402]
09:49:58     INFO - 
09:49:58     INFO - #07: nsDisplayBackgroundImage::BuildLayer(nsDisplayListBuilder*, mozilla::layers::LayerManager*, mozilla::ContainerLayerParameters const&) [mfbt/AlreadyAddRefed.h:121]
09:49:58     INFO - 
09:49:58     INFO - #08: mozilla::ContainerState::ProcessDisplayItems(nsDisplayList*) [mfbt/AlreadyAddRefed.h:121]
09:49:58     INFO - 
09:49:58     INFO - #09: mozilla::FrameLayerBuilder::BuildContainerLayerFor(nsDisplayListBuilder*, mozilla::layers::LayerManager*, nsIFrame*, nsDisplayItem*, nsDisplayList*, mozilla::ContainerLayerParameters const&, mozilla::gfx::Matrix4x4Typed<mozilla::gfx::UnknownUnits, mozilla::gfx::UnknownUnits> const*, unsigned int) [layout/painting/FrameLayerBuilder.cpp:5704]
09:49:58     INFO - 
09:49:58     INFO - #10: nsDisplayList::PaintRoot(nsDisplayListBuilder*, gfxContext*, unsigned int) [layout/painting/nsDisplayList.cpp:2203]
09:49:58     INFO - 
09:49:58     INFO - #11: nsLayoutUtils::PaintFrame(gfxContext*, nsIFrame*, nsRegion const&, unsigned int, nsDisplayListBuilderMode, nsLayoutUtils::PaintFrameFlags) [mfbt/AlreadyAddRefed.h:121]
09:49:58     INFO - 
09:49:58     INFO - #12: mozilla::PresShell::Paint(nsView*, nsRegion const&, unsigned int) [layout/base/PresShell.cpp:6473]
09:49:58     INFO - 
09:49:58     INFO - #13: nsDOMWindowUtils::UpdateLayerTree() [gfx/src/nsRegion.h:75]
09:49:58     INFO - 
09:49:59     INFO - ###!!! [Parent][MessageChannel] Error: (msgtype=0x150078,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
09:49:59     INFO - Assertion failure: mDestroyed, at /home/worker/workspace/build/src/gfx/layers/composite/ImageLayerComposite.cpp:44
09:49:59     INFO - #01: <name omitted> [memory/mozalloc/mozalloc.h:224]
09:49:59     INFO - 
09:49:59     INFO - #02: mozilla::layers::ContainerLayer::RemoveChild(mozilla::layers::Layer*) [gfx/layers/Layers.h:827]
09:49:59     INFO - 
09:49:59     INFO - #03: mozilla::layers::ContainerLayerComposite::~ContainerLayerComposite() [gfx/layers/composite/ContainerLayerComposite.cpp:666]
09:49:59     INFO - 
09:49:59     INFO - #04: mozilla::layers::ContainerLayerProperties::~ContainerLayerProperties() [memory/mozalloc/mozalloc.h:224]
09:49:59     INFO - 
09:49:59     INFO - #05: nsTArray_Impl<mozilla::UniquePtr<mozilla::layers::LayerPropertiesBase, mozilla::DefaultDelete<mozilla::layers::LayerPropertiesBase> >, nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned long, unsigned long) [xpcom/ds/nsTArray.h:2011]
09:49:59     INFO - 
09:49:59     INFO - #06: mozilla::layers::ContainerLayerProperties::~ContainerLayerProperties() [xpcom/ds/nsTArray-inl.h:21]
09:49:59     INFO - 
09:49:59     INFO - #07: mozilla::layers::ContainerLayerProperties::~ContainerLayerProperties() [memory/mozalloc/mozalloc.h:224]
09:49:59     INFO - 
09:49:59     INFO - #08: nsTArray_Impl<mozilla::UniquePtr<mozilla::layers::LayerPropertiesBase, mozilla::DefaultDelete<mozilla::layers::LayerPropertiesBase> >, nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned long, unsigned long) [xpcom/ds/nsTArray.h:2011]
09:49:59     INFO - 
09:49:59     INFO - #09: mozilla::layers::ContainerLayerProperties::~ContainerLayerProperties() [xpcom/ds/nsTArray-inl.h:21]
09:49:59     INFO - 
09:49:59     INFO - #10: mozilla::layers::ContainerLayerProperties::~ContainerLayerProperties() [memory/mozalloc/mozalloc.h:224]
09:49:59     INFO - 
09:49:59     INFO - #11: mozilla::layers::LayerManagerComposite::UpdateAndRender() [mfbt/UniquePtr.h:340]
09:49:59     INFO - 
09:49:59     INFO - #12: mozilla::layers::LayerManagerComposite::EndTransaction(mozilla::TimeStamp const&, mozilla::layers::LayerManager::EndTransactionFlags) [mfbt/RefPtr.h:314]
09:49:59     INFO - 
09:49:59     INFO - #13: mozilla::layers::CompositorBridgeParent::CompositeToTarget(mozilla::gfx::DrawTarget*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*) [gfx/layers/ipc/CompositorBridgeParent.cpp:1044]
09:49:59     INFO - 
09:49:59     INFO - #14: mozilla::layers::CompositorVsyncScheduler::Composite(mozilla::TimeStamp) [gfx/layers/ipc/CompositorVsyncScheduler.cpp:263]
09:49:59     INFO - 
09:49:59     INFO - #15: mozilla::detail::RunnableMethodImpl<mozilla::layers::CompositorVsyncScheduler*, void (mozilla::layers::CompositorVsyncScheduler::*)(mozilla::TimeStamp), true, (mozilla::RunnableKind)1, mozilla::TimeStamp>::Run() [xpcom/threads/nsThreadUtils.h:1122]
09:49:59     INFO - 
09:49:59     INFO - #16: MessageLoop::RunTask(already_AddRefed<nsIRunnable>) [xpcom/base/nsCOMPtr.h:374]
09:49:59     INFO - 
09:49:59     INFO - #17: MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask&&) [ipc/chromium/src/base/message_loop.cc:460]
09:49:59     INFO - 
09:49:59     INFO - #18: MessageLoop::DoWork() [ipc/chromium/src/base/message_loop.cc:535]
09:49:59     INFO - 
09:49:59     INFO - #19: base::MessagePumpDefault::Run(base::MessagePump::Delegate*) [ipc/chromium/src/base/message_pump_default.cc:36]
09:49:59     INFO - 
09:49:59     INFO - #20: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:599]
09:49:59     INFO - 
09:49:59     INFO - #21: base::Thread::ThreadMain() [ipc/chromium/src/base/thread.cc:184]
09:49:59     INFO - 
09:49:59     INFO - #22: ThreadFunc(void*) [ipc/chromium/src/base/platform_thread_posix.cc:39]
09:49:59     INFO - 
09:49:59     INFO - #23: libsystem_pthread.dylib + 0x405a
09:49:59     INFO - 
09:49:59     INFO - #24: libsystem_pthread.dylib + 0x3fd7
Comment hidden (Intermittent Failures Robot)
Priority: -- → P4
Summary: Intermittent Assertion failure: !aState.GetIsCurrentlyDecoded(), at FrameAnimator.cpp:471 → stylo: Intermittent Assertion failure: !aState.GetIsCurrentlyDecoded(), at FrameAnimator.cpp:471
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
[Tracking Requested - why for this release]: regression

bughunter sees this on https://www.stickmanbangkok.com/readers-submissions/
you may need to reload. occurs in Windows, Linux. Reproduced manually on Windows 7 64bit.
Blocks: 532972, 1343597
tracking-firefox57: --- → ?
Keywords: regression
Assignee: nobody → bwerth
Created attachment 8910537 [details]
Reduced testcase

The greatly reduced testcase shows that this is just an animated GIF that is being manipulated by script in some way. The script is minified jQuery and so it's not at all obvious to me what is being done to the image.

This is still intermittent. I can reproduce on macOS by repeatedly opening a new tab with the testcase. Reloading an existing tab does not trigger the crash for me. When the crash occurs, it seems to do so before the image has been rendered, which may explain the intermittent. If the script runs before the first paint: a crash.

I'm still working on understanding the rendering state when the crash happens. The assert that is failing is that both mCompositedFrameInvalid and mHasBeenDecoded are true on the AnimationState when it's time to draw the frame in FrameAnimator::GetCompositedFrame.
Thanks for reducing this!

Can you confirm that this only happens with stylo disabled? Looking at the history, it seems like this bug was tagged as a stylo bug because it appeared one on the stylo-only CI. But that might just be chance, and given that this is sounding pretty image-y I'm wondering if it's unrelated to stylo.
Flags: needinfo?(bwerth)
Bughunter is running with the default prefs and should have had stylo enabled when it was reproduced there. I just reproduced locally on Fedora definitely with stylo enabled.
Reproduces without servo enabled for me on macOS. Does not appear to be a stylo bug.
Flags: needinfo?(bwerth)
Summary: stylo: Intermittent Assertion failure: !aState.GetIsCurrentlyDecoded(), at FrameAnimator.cpp:471 → Intermittent Assertion failure: !aState.GetIsCurrentlyDecoded(), at FrameAnimator.cpp:471
status-firefox57: --- → wontfix
status-firefox58: --- → affected
tracking-firefox57: ? → ---
Comment hidden (Intermittent Failures Robot)
Comment hidden (mozreview-request)
I believe attachment 8944820 [details] solves the problem; it at least prevents my reduced testcase attachment 8910537 [details] from crashing. I'm not exactly sure why this fix works, but the change is a commonsense optimization that possibly removes a race condition. I hope that bbirtles can verify whether or not this is sensible in a review.
Attachment #8944820 - Flags: review?(bbirtles)
Attachment #8944820 - Flags: review?(bbirtles) → review?(aosmond)
Comment hidden (Intermittent Failures Robot)
(Assignee)

Comment 14

11 months ago
(In reply to Brad Werth [:bradwerth] from comment #12)
> I believe attachment 8944820 [details] solves the problem; it at least
> prevents my reduced testcase attachment 8910537 [details] from crashing. I'm
> not exactly sure why this fix works, but the change is a commonsense
> optimization that possibly removes a race condition. I hope that bbirtles
> can verify whether or not this is sensible in a review.

Sorry for the delay on this. I have been looking at this on and off, but I haven't convinced myself about the fix yet. These assertions were tricky to get right in my head the first time :).
(Assignee)

Comment 15

11 months ago
There is a race condition here related to sync decoding. I'll see what I can dig up from rr.
(Assignee)

Comment 16

11 months ago
Comment on attachment 8944820 [details]
Bug 1389988 Part 1: Prevent AnimationState::UpdateStateInternal from repeatedly setting mIsCurrentlyDecoded true every frame; avoiding race condition.

The problem is very different and sort of complicated. I put together something that will (should?) fix the intermittent, assuming it is the same as the attached test case.

I could reproduce easily if I changed dom.ipc.processCount from 4 to 1 to ensure the additional tabs are in the same content process.
Attachment #8944820 - Flags: review?(aosmond)
(Assignee)

Comment 17

11 months ago
Created attachment 8946348 [details] [diff] [review]
0001-Bug-1389988-Handle-a-data-race-between-a-new-sync-de.patch, v1

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=070b850c991bf0a00133fb6d1139df5412d9679c
(Assignee)

Comment 18

11 months ago
mozreview-review
Comment on attachment 8944820 [details]
Bug 1389988 Part 1: Prevent AnimationState::UpdateStateInternal from repeatedly setting mIsCurrentlyDecoded true every frame; avoiding race condition.

https://reviewboard.mozilla.org/r/214978/#review222084

This shows there is a data race elsewhere.
Attachment #8944820 - Flags: review-
Comment on attachment 8944820 [details]
Bug 1389988 Part 1: Prevent AnimationState::UpdateStateInternal from repeatedly setting mIsCurrentlyDecoded true every frame; avoiding race condition.

Happy to have a real fix. Thanks!
Attachment #8944820 - Attachment is obsolete: true
Assignee: bwerth → nobody
(Assignee)

Comment 20

11 months ago
(In reply to Brad Werth [:bradwerth] from comment #19)
> Comment on attachment 8944820 [details]
> Bug 1389988 Part 1: Prevent AnimationState::UpdateStateInternal from
> repeatedly setting mIsCurrentlyDecoded true every frame; avoiding race
> condition.
> 
> Happy to have a real fix. Thanks!

And thank you for the reduced test case and making me stare at the review for a while, I wouldn't have solved it otherwise.
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
(Assignee)

Updated

11 months ago
Attachment #8946348 - Flags: review?(tnikkel)
Component: Layout → ImageLib
Comment on attachment 8946348 [details] [diff] [review]
0001-Bug-1389988-Handle-a-data-race-between-a-new-sync-de.patch, v1

Review of attachment 8946348 [details] [diff] [review]:
-----------------------------------------------------------------

::: image/DecoderFactory.cpp
@@ +135,5 @@
>    decoder->SetOutputSize(aOutputSize);
>    decoder->SetDecoderFlags(aDecoderFlags | DecoderFlags::FIRST_FRAME_ONLY);
>    decoder->SetSurfaceFlags(aSurfaceFlags);
>  
> +  nsresult rv = decoder->Init();

This is super nit picky, but this could also return already initialized and we don't want to confuse that with the surface cache insertion failure. So I guess just return a generic failure here?

::: image/DecoderFactory.h
@@ +63,5 @@
>     *                    image.
>     * @param aDecoderFlags Flags specifying the behavior of this decoder.
>     * @param aSurfaceFlags Flags specifying the type of output this decoder
>     *                      should produce.
> +   * @param aTask Task representing the decoder.

I think the style is to use aOutTask for this, no?

@@ +68,2 @@
>     */
> +  static nsresult

Since the already initialized return value has a special meaning you should document that please.
Attachment #8946348 - Flags: review?(tnikkel) → review+

Comment 22

10 months ago
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e8e9b5a102ac
Handle a data race between a new sync decode request and a pending decoder. r=tnikkel
(Assignee)

Updated

10 months ago
Duplicate of this bug: 1398316

Comment 24

10 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/e8e9b5a102ac
Status: ASSIGNED → RESOLVED
Last Resolved: 10 months ago
status-firefox60: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Andrew, do you think this is applicable to 59 and worth uplift to beta?
status-firefox58: affected → wontfix
status-firefox59: --- → ?
Flags: needinfo?(aosmond)
(Assignee)

Comment 26

10 months ago
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #25)
> Andrew, do you think this is applicable to 59 and worth uplift to beta?

This was introduced in bug 1343341 and release 55, so it has been there for a while. I think it is safe to uplift, but there isn't any urgency for an end user perspective. It will only assert in debug builds, and probably flicker once at worst in a release build. Happy to put in the request if it helps for automated tests.
Flags: needinfo?(aosmond)
No sign of this actively affecting mozilla-beta, so let's have this ride the trains.
status-firefox59: ? → wontfix
status-firefox-esr52: --- → unaffected
You need to log in before you can comment on or make changes to this bug.