1826257 - Crash in [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow ]

Ryan VanderMeulen [:RyanVM]

Reporter

Description

•

2 years ago

Crash report: https://crash-stats.mozilla.org/report/index/e158d570-766d-46ee-ae0d-4d5ec0230403

Reason: SIGSEGV / SEGV_MAPERR

Top 10 frames of crashing thread:

0  libc.so  libc.so@0x1abc0  
1  libxul.so  mozilla::image::SurfaceFilter::AdvanceRow  image/SurfacePipe.h:128
1  libxul.so  mozilla::image::SwizzleFilter<mozilla::image::BlendAnimationFilter<mozilla::image::SurfaceSink> >::DoAdvanceRowFromBuffer  image/SurfaceFilters.h:98
2  libxul.so  mozilla::image::SurfaceFilter::AdvanceRow  image/SurfacePipe.h:141
2  libxul.so  mozilla::image::SurfaceFilter::WriteBuffer<unsigned int>  image/SurfacePipe.h:300
2  libxul.so  mozilla::image::SurfacePipe::WriteBuffer<unsigned int>  image/SurfacePipe.h:705
2  libxul.so  mozilla::image::nsPNGDecoder::WriteRow  image/decoders/nsPNGDecoder.cpp:851
3  libxul.so  MOZ_PNG_push_proc_row  media/libpng/pngpread.c
3  libxul.so  MOZ_PNG_proc_IDAT_data  media/libpng/pngpread.c:879
4  libxul.so  MOZ_PNG_push_read_IDAT  media/libpng/pngpread.c:755

Timothy Nikkel (:tnikkel)

Comment 1

•

2 years ago

I think this might be related to bug 1753060.

Updated

•

2 years ago

Severity: -- → S3

Andrew McCreight [:mccr8]

Updated

•

5 months ago

Crash Signature: [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] → [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow ]

Summary: Crash in [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] → Crash in [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow ]

Gabriele Svelto [:gsvelto]

Comment 2

•

5 months ago

Bug 1895527 means that all the stacks for this crash tend to coalesce on a smaller number of signatures, which I've added. I may have misinterpreted the results from our bit-flip detection heuristic in bug 1753060. It's true that it triggers a lot in some signatures, but the addresses that we detect as potential bit-flips are very close to a very large allocation, so it looks more like we've overflowed a large buffer than hit an actual bit-flip. We know that the bit-flip detection logic can give false positives in this case, and the Android crashes look even more like potential overflows because they often happen on a page boundary. We should look into this again.
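To make the ambiguity described above concrete, here is a minimal sketch of how one faulting address can look like both an overflow and a bit-flip. It is not the heuristic from bug 1753060: the `Allocation` struct, the function names, and the 4096-byte page size are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a large heap allocation near the faulting address.
struct Allocation {
  std::uintptr_t base;  // first byte of the allocation
  std::size_t size;     // length in bytes
};

// Assumed page size; real devices may use a different value.
constexpr std::uintptr_t kPageSize = 4096;

// True if addr falls within `slack` bytes past the end of the allocation,
// which is where a linear buffer overrun would fault.
inline bool LooksLikeOverflow(const Allocation& a, std::uintptr_t addr,
                              std::size_t slack = kPageSize) {
  assert(a.base + a.size >= a.base);  // the range must not wrap around
  const std::uintptr_t end = a.base + a.size;
  return addr >= end && addr - end < slack;
}

// True if addr sits exactly on a page boundary.
inline bool IsPageAligned(std::uintptr_t addr) { return addr % kPageSize == 0; }

// True if flipping exactly one bit of addr yields an address inside the
// allocation, i.e. what a naive bit-flip detector would flag.
inline bool OneBitFromAllocation(const Allocation& a, std::uintptr_t addr) {
  for (unsigned bit = 0; bit < sizeof(addr) * 8; ++bit) {
    const std::uintptr_t candidate = addr ^ (std::uintptr_t{1} << bit);
    if (candidate >= a.base && candidate - a.base < a.size) return true;
  }
  return false;
}
```

For a made-up allocation at `0x10000` of size `0x4000`, the address `0x14000` is the first byte past the end, is page aligned, and is also a single bit away from `0x10000`. All three checks fire, which is the kind of false positive the comment describes.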

Crash Signature: [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow ] → [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::ResetToFirstRow] [@ memcpy | mozilla::image::BlendAnimationFilter<T>::DoAdvanceRow] [@ …

Gabriele Svelto [:gsvelto]

Comment 3

•

5 months ago

Note: the crash isn't spiking; it's just a signature change. The volume hasn't changed much over time.

BugBot [:suhaib / :marco/ :calixte]

Comment 4

•

5 months ago

The bug is linked to a topcrash signature, which matches the following criterion:

Top 10 AArch64 and ARM crashes on release

:tnikkel, could you consider increasing the severity of this top-crash bug?

For more information, please visit BugBot documentation.

Flags: needinfo?(tnikkel)

Keywords: topcrash

Chris Peterson [:cpeterson]

Comment 5

•

5 months ago

Looks like we have some new libc crash signatures on Android starting a few months ago.

Crash Signature: [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::ResetToFirstRow] [@ memcpy | mozilla::image::BlendAnimationFilter<T>::DoAdvanceRow] [@ … → [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::ResetToFirstRow] [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so@0x52ba0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ lib…

status-firefox126: --- → affected

status-firefox127: --- → affected

status-firefox128: --- → ?

Timothy Nikkel (:tnikkel)

Updated

•

5 months ago

Flags: needinfo?(tnikkel)

Gabriele Svelto [:gsvelto]

Comment 6

•

5 months ago

I think I may have found something useful for diagnosing the bug. Many comments mention one or more of these three things: watching video, Firefox being very slow (probably due to swapping), and the UI briefly flashing white. This led me to check the contents of the GraphicsCriticalError annotation, and practically all the crashes I've looked at have this error:

CompositorBridgeChild receives IPC close with reason=AbnormalShutdown

So IIUC the GPU process crashed and the crash we're experiencing here is likely fallout from this issue.
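The manual check described above could be repeated over a batch of reports along these lines. This is only a sketch: it assumes the GraphicsCriticalError annotation text has already been extracted into plain strings, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Error message quoted in the comment above.
const std::string kAbnormalShutdown =
    "CompositorBridgeChild receives IPC close with reason=AbnormalShutdown";

// True if one annotation's text contains the compositor shutdown message.
inline bool MentionsAbnormalShutdown(const std::string& annotation) {
  return annotation.find(kAbnormalShutdown) != std::string::npos;
}

// Fraction of the given annotations that contain the message.
inline double FractionWithAbnormalShutdown(
    const std::vector<std::string>& annotations) {
  assert(!annotations.empty());  // a fraction of zero reports is undefined
  std::size_t hits = 0;
  for (const std::string& text : annotations) {
    if (MentionsAbnormalShutdown(text)) ++hits;
  }
  return static_cast<double>(hits) / static_cast<double>(annotations.size());
}
```

A fraction close to 1.0 across a signature would support the reading that these crashes follow a GPU process crash.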

Ryan VanderMeulen [:RyanVM]

Reporter

Updated

•

3 months ago

Crash Signature: [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::ResetToFirstRow] [@ libc.so@0x1abc0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so@0x52ba0 | mozilla::image::SurfaceFilter::AdvanceRow] [@ lib… → [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow] [@ libc.so | mozilla::image::SurfaceFilter::ResetToFirstRow] [@ memcpy | mozilla::image::BlendAnimationFilter<T>::DoAdvanceRow] [@ memcpy | mozilla::image::BlendAnimationFilter<T>::DoResetToFirstR…

status-firefox126: affected → wontfix

status-firefox127: affected → wontfix

status-firefox128: ? → affected

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 7

•

3 months ago

This seems to have spiked around the time of Fx126 shipping in case that points to anything obvious.

status-firefox129: --- → affected

status-firefox130: --- → affected

status-firefox-esr128: --- → affected

Timothy Nikkel (:tnikkel)

Comment 8

•

3 months ago

Spiked at exactly the same time as bug 1903810, which apparently only occurs after the GPU process has been disabled (https://bugzilla.mozilla.org/show_bug.cgi?id=1907135#c2).

Comment 9

•

3 months ago

I think that's probably a coincidence. My assumption is that the spike in bug 1903810 was probably due to a software update. We had previously encountered the same crash on different Samsung devices in bug 1868825, which also required the GPU process to have been disabled.

Timothy Nikkel (:tnikkel)

Comment 10

•

3 months ago

Comment 6 here linked this bug to a GPU process crash. Could whatever caused that GPU process crash be behind the GPU process getting disabled for bug 1903810?

Jamie Nicol [:jnicol]

Comment 11

•

3 months ago

GPU process crashes will happen for a large variety of reasons, and a single crash will not cause the GPU process to be disabled.

It seems we do have an issue with the GPU process being disabled too frequently, and while the app is in the background, so the user doesn't notice. My hunch is that while the app is in the background the process is repeatedly getting launched for some reason, then killed by the OS to free resources. This was probably responsible for the vast majority of cases in bug 1903810. We're tracking that in bug 1907135.

Comment 6 indicates that users here are seeing the GPU process crash while the app is in the foreground, so it is likely due to a genuine crash as opposed to an OS kill. A quick glance at some crash reports in these signatures shows that the launch count is low and that the GPU process is still enabled. So it seems to be related to the GPU process crashing, but not to it being disabled.

Could we perhaps be attempting to write to a shmem that has become invalid following a GPU process crash?
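If that hypothesis holds, the failure would be a row copy into a destination that is no longer valid. The sketch below shows the general shape of a guarded row write. It is not ImageLib code: `SharedSurface`, `WriteResult`, and `WriteRow` are hypothetical names, and the `valid` flag stands in for whatever signal would indicate that the peer process has gone away.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a surface backed by shared memory.
struct SharedSurface {
  SharedSurface(std::uint8_t* d, std::size_t rb, std::size_t r)
      : data(d), rowBytes(rb), rows(r) {}

  std::uint8_t* data;             // start of the mapped pixel data
  std::size_t rowBytes;           // bytes per row
  std::size_t rows;               // number of rows in the surface
  std::atomic<bool> valid{true};  // cleared when the backing is torn down
};

enum class WriteResult { Ok, SurfaceInvalid, OutOfBounds };

// Copies one decoded row into the surface, refusing to touch memory that has
// been marked invalid or that lies outside the surface.
inline WriteResult WriteRow(SharedSurface& s, std::size_t row,
                            const std::uint8_t* src) {
  assert(src != nullptr);
  if (!s.valid.load(std::memory_order_acquire) || s.data == nullptr) {
    return WriteResult::SurfaceInvalid;
  }
  if (row >= s.rows) {
    return WriteResult::OutOfBounds;
  }
  std::memcpy(s.data + row * s.rowBytes, src, s.rowBytes);
  return WriteResult::Ok;
}
```

A flag check like this is still racy if the mapping can be removed between the check and the `memcpy`. A real fix would more likely need to keep the mapping alive for as long as the decoder can write to it.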

Categories

(Core :: Graphics: ImageLib, defect)

Tracking

()

People

(Reporter: RyanVM, Unassigned)

References

Details

(Keywords: crash, topcrash)

Crash Data

Security

(public)
