Crash in [@ libc.so | mozilla::image::SurfaceFilter::AdvanceRow ]
Categories: Core :: Graphics: ImageLib, defect
People: Reporter: RyanVM, Unassigned
Keywords: crash, topcrash
Crash Data
Crash report: https://crash-stats.mozilla.org/report/index/e158d570-766d-46ee-ae0d-4d5ec0230403
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libc.so libc.so@0x1abc0
1 libxul.so mozilla::image::SurfaceFilter::AdvanceRow image/SurfacePipe.h:128
1 libxul.so mozilla::image::SwizzleFilter<mozilla::image::BlendAnimationFilter<mozilla::image::SurfaceSink> >::DoAdvanceRowFromBuffer image/SurfaceFilters.h:98
2 libxul.so mozilla::image::SurfaceFilter::AdvanceRow image/SurfacePipe.h:141
2 libxul.so mozilla::image::SurfaceFilter::WriteBuffer<unsigned int> image/SurfacePipe.h:300
2 libxul.so mozilla::image::SurfacePipe::WriteBuffer<unsigned int> image/SurfacePipe.h:705
2 libxul.so mozilla::image::nsPNGDecoder::WriteRow image/decoders/nsPNGDecoder.cpp:851
3 libxul.so MOZ_PNG_push_proc_row media/libpng/pngpread.c
3 libxul.so MOZ_PNG_proc_IDAT_data media/libpng/pngpread.c:879
4 libxul.so MOZ_PNG_push_read_IDAT media/libpng/pngpread.c:755
Updated•2 years ago
Updated•5 months ago
Comment 2•5 months ago
Bug 1895527 means that all the stacks for this crash tend to coalesce onto a smaller number of signatures, which I've added. I may have misinterpreted the results from our bit-flip detection heuristic in bug 1753060. It's true that the heuristic triggers a lot on some signatures, but the addresses we detect as potential bit-flips are very close to a very large allocation, so this looks more like we've overflowed a large buffer than like an actual bit-flip. We know that the bit-flip detection logic can give false positives in this case, and the Android crashes look even more like potential overflows, as they often happen on a page boundary. We should look into this again.
Comment 3•5 months ago
Note: the crash isn't spiking; it's just a signature change, and the volume hasn't changed much over time.
Comment 4•5 months ago
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 AArch64 and ARM crashes on release
:tnikkel, could you consider increasing the severity of this top-crash bug?
For more information, please visit BugBot documentation.
Comment 5•5 months ago
Looks like we have some new libc crash signatures on Android starting a few months ago.
Updated•5 months ago
Comment 6•5 months ago
I think I may have found something useful for diagnosing the bug. Many comments mention one or more of these three things: watching video, Firefox being very slow (probably due to swapping), and the UI briefly flashing white. This led me to check the contents of the GraphicsCriticalError annotation, and practically all the crashes I've looked at have this error:
CompositorBridgeChild receives IPC close with reason=AbnormalShutdown
So IIUC the GPU process crashed and the crash we're experiencing here is likely fallout from this issue.
Reporter
Updated•3 months ago
Reporter
Comment 7•3 months ago
This seems to have spiked around the time of Fx126 shipping in case that points to anything obvious.
Comment 8•3 months ago
Spiked at exactly the same time as bug 1903810, which apparently only occurs after the GPU process has been disabled (https://bugzilla.mozilla.org/show_bug.cgi?id=1907135#c2).
Comment 9•3 months ago
I think that's probably a coincidence. My assumption is that the spike in bug 1903810 was due to a software update. We had previously encountered the same crash on different Samsung devices in bug 1868825, which also required the GPU process to have been disabled.
Comment 10•3 months ago
Comment 6 here linked this bug to a GPU process crash. Could whatever caused that GPU process crash be behind the GPU process getting disabled for bug 1903810?
Comment 11•3 months ago
GPU process crashes will happen for a large variety of reasons, and a single crash will not cause the GPU process to be disabled.
It seems we do have an issue with the GPU process being disabled too frequently, and with it happening whilst the app is in the background, without the user noticing. My hunch is that whilst the app is in the background the process is repeatedly getting launched for some reason, then killed by the OS to free resources. This was probably responsible for the vast majority of cases in bug 1903810. We're tracking that in bug 1907135.
Comment 6 indicates that users here are seeing the GPU process crash whilst the app is in the foreground, so it is likely a genuine crash as opposed to an OS kill. A quick glance at some crash reports in these signatures shows that the launch count is low and that the GPU process is still enabled. So this seems related to the GPU process crashing, but not being disabled.
Could we perhaps be attempting to write to a shmem that has become invalid following a GPU process crash?