Closed Bug 1501923 Opened Last year Closed Last year

Crash in RtlAcquireSRWLockExclusive | mozilla::image::imgFrame::AddSizeOfExcludingThis

Categories

(Core :: ImageLib, defect, P3, critical)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- unaffected
firefox64 --- unaffected
firefox65 --- fixed

People

(Reporter: calixte, Assigned: aosmond)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is
report bp-118850ca-34db-4d7a-ae5a-e07d20181025.
=============================================================

Top 10 frames of crashing thread:

0 ntdll.dll RtlAcquireSRWLockExclusive 
1 xul.dll mozilla::image::imgFrame::AddSizeOfExcludingThis image/imgFrame.cpp:1054
2 xul.dll mozilla::image::AnimationFrameDiscardingQueue::AddSizeOfExcludingThis image/AnimationFrameBuffer.cpp:314
3 xul.dll mozilla::image::AnimationSurfaceProvider::AddSizeOfExcludingThis image/AnimationSurfaceProvider.cpp:211
4 xul.dll void mozilla::image::CachedSurface::SurfaceMemoryReport::Add image/SurfaceCache.cpp:199
5 xul.dll static void mozilla::image::SurfaceCache::CollectSizeOfSurfaces image/SurfaceCache.cpp:1694
6 xul.dll mozilla::image::RasterImage::CollectSizeOfSurfaces image/RasterImage.cpp:730
7 xul.dll mozilla::image::ImageMemoryCounter::ImageMemoryCounter image/Image.cpp:44
8 xul.dll static __int64 imgMemoryReporter::ImagesContentUsedUncompressedDistinguishedAmount image/imgLoader.cpp:186
9 xul.dll nsMemoryReporterManager::GetImagesContentUsedUncompressed xpcom/base/nsMemoryReporterManager.cpp:2663

=============================================================

There is 1 crash in nightly 65 with buildid 20181024133237. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1465619.

[1] https://hg.mozilla.org/mozilla-central/rev?node=453d21db2a30
Flags: needinfo?(aosmond)
Definitely me, although it puzzles me, as I don't see how the frame pointer could be null.

There are a few cases where it updates the AnimationFrameDiscardingQueue:

1) When a new frame is inserted, it will move a RefPtr into the queue. This is most certainly not null. It won't ever clear the pointer (unless it is a recycling queue, but that doesn't appear to be active, and should be still disabled by default at the moment) -- it just removes entries from the queue itself.

2) When the queue is initialized, it takes all of the existing frames from the AnimationFrameRetainedBuffer. In theory the first frame pointer is stored separately, and the first move to the unique storage could prevent us from putting the other copy into the mDisplay queue. However we are supposed to guarantee that we have advanced past the first frame, so that should never happen...

I cast suspicion on #2. Maybe if the user got really unlucky and somehow switched to a discarding queue when we advanced to the end of the animation? Not sure on the exact mechanism yet...
Assignee: nobody → aosmond
Priority: -- → P3
Andrew, is this something you can prioritize? We have about 2.5 weeks before soft freeze for 65.
(In reply to Jared Wein [:jaws] (Regression Engineering Owner for 65) (please needinfo? me) from comment #2)
> Andrew, is this something you can prioritize? We have about 2.5 weeks before
> soft freeze for 65.

This crash has happened only once, as far as I can tell. I don't have a fix because I don't know how it could happen. If the volume increases on beta, I can reinvestigate.
Flags: needinfo?(aosmond)
Ironic that the volume went up on nightly the day after I said it only happened once... let me take another look.
Flags: needinfo?(aosmond)
I think I see the problem. The animation probably got reset, and then immediately crossed the threshold, so mGetIndex == 0. We already moved the first frame reference out of the array, so we insert a null. I think this might visually appear as a frozen animation, until a memory report is generated and we crash.
Flags: needinfo?(aosmond)
If an animated frame buffer was reset just before the necessary frame to
cross the discard threshold, followed by said frame being inserted by
the decoder, it would insert a null pointer into the display queue for
the first frame. This is because it assumed that we have always advanced
past the first frame -- which was true, but the reset placed us back at
the beginning.

This would initially manifest to the user as the animation stopping,
since it could not advance past the first frame. Once a memory report
was requested, it would crash because we assume every frame in the
display queue is valid.

This patch removes the assumption about what frame we have advanced to.
Status: NEW → ASSIGNED
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3bc19b3fb522
Fix crash where we reset an animation just before crossing the discard threshold. r=tnikkel
https://hg.mozilla.org/mozilla-central/rev/3bc19b3fb522
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
You need to log in before you can comment on or make changes to this bug.