Closed Bug 623161 Opened 15 years ago Closed 9 years ago

Crash in neon_composite_src_8888_0565

Categories

(Core :: Graphics, defect)

ARM
Android
defect
Not set
critical

Tracking

()

RESOLVED INCOMPLETE
Tracking Status
blocking2.0 --- -
fennec - ---

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Keywords: crash, mobile, qawanted, Whiteboard: [need-str][at-risk][mobile-crash][native-crash])

Crash Data

It is #14 top crasher in Fennec 4.0b3 for the last week.

Signature: libxul.so@0xbff7ec
UUID: 7641d3fb-26b5-479e-8327-3e6c82101222
Time: 2010-12-22 15:28:30.514616
Uptime: 2091
Install Age: 2091 seconds (34.9 minutes) since version was first installed.
Product: Fennec
Version: 4.0b3
Build ID: 20101221205132
Branch: 1.9
OS: Linux
OS Version: 0.0.0 Linux 2.6.32.15-leedroid_2.2f #1 PREEMPT Fri Nov 5 18:14:11 CET 2010 armv7l
CPU: arm
Crash Reason: SIGSEGV
Crash Address: 0x409391d0
App Notes: HTC HTC Desire htc_wwe/htc_bravo/bravo/bravo:2.2/FRF91/226611:user/release-keys

Frame Module Signature Source
0 libxul.so libxul.so@0xbff7ec
1 libxul.so neon_composite_src_8888_0565 gfx/cairo/libpixman/src/pixman-arm-neon.c:45
2 libxul.so neon_composite_src_8888_0565 gfx/cairo/libpixman/src/pixman-arm-neon.c:45
3 libxul.so _moz_pixman_image_composite32 gfx/cairo/libpixman/src/pixman.c:837
4 libxul.so _moz_pixman_image_composite gfx/cairo/libpixman/src/pixman.c:881
5 libxul.so _cairo_image_surface_composite gfx/cairo/cairo/src/cairo-image-surface.c:1175
6 libxul.so _cairo_surface_composite gfx/cairo/cairo/src/cairo-surface.c:1828
7 libxul.so _clip_and_composite_trapezoids gfx/cairo/cairo/src/cairo-surface-fallback.c:790
8 libxul.so _cairo_surface_fallback_paint gfx/cairo/cairo/src/cairo-surface-fallback.c:1042
9 libxul.so _cairo_surface_paint gfx/cairo/cairo/src/cairo-surface.c:2020
10 libxul.so _cairo_gstate_paint gfx/cairo/cairo/src/cairo-gstate.c:988
11 libxul.so _moz_cairo_paint gfx/cairo/cairo/src/cairo.c:2118
12 libxul.so _moz_cairo_paint_with_alpha gfx/cairo/cairo/src/cairo.c:2147
13 libxul.so gfxContext::Paint gfx/thebes/gfxContext.cpp:739
14 libxul.so gfxPlatform::OptimizeImage gfx/thebes/gfxPlatform.cpp:389
15 libxul.so imgFrame::Optimize nsAutoPtr.h:954
16 libxul.so mozilla::imagelib::RasterImage::DecodingComplete modules/libpr0n/src/RasterImage.cpp:1055
17 libxul.so mozilla::imagelib::Decoder::PostDecodeDone nsCOMPtr.h:800
18 libxul.so mozilla::imagelib::nsPNGDecoder::end_callback modules/libpr0n/decoders/nsPNGDecoder.cpp:830
19 libxul.so MOZ_PNG_push_have_end modules/libimg/png/pngpread.c:1907
20 libxul.so MOZ_PNG_push_read_chunk modules/libimg/png/pngpread.c:364
21 libxul.so MOZ_PNG_proc_some_data modules/libimg/png/pngpread.c:65
22 libxul.so MOZ_PNG_process_data modules/libimg/png/pngpread.c:39
23 libxul.so mozilla::imagelib::nsPNGDecoder::WriteInternal modules/libpr0n/decoders/nsPNGDecoder.cpp:349
24 libxul.so mozilla::imagelib::Decoder::Write modules/libpr0n/src/Decoder.cpp:102
25 libxul.so mozilla::imagelib::RasterImage::WriteToDecoder modules/libpr0n/src/RasterImage.cpp:2199
26 libxul.so mozilla::imagelib::RasterImage::AddSourceData modules/libpr0n/src/RasterImage.cpp:1181
27 libxul.so mozilla::imagelib::RasterImage::WriteToRasterImage modules/libpr0n/src/RasterImage.cpp:2674
28 libxul.so nsStringInputStream::ReadSegments xpcom/io/nsStringStream.cpp:288
29 libxul.so imgTools::DecodeImageData modules/libpr0n/src/imgTools.cpp:119
30 libxul.so nsFaviconService::OptimizeFaviconImage toolkit/components/places/src/nsFaviconService.cpp:1021
31 libxul.so mozilla::places::SetFaviconDataStep::Run toolkit/components/places/src/AsyncFaviconHelpers.cpp:820
32 libxul.so mozilla::places::AsyncFaviconStepperInternal::Step nsCOMPtr.h:492
33 libxul.so mozilla::places::FetchNetworkIconStep::OnStopRequest toolkit/components/places/src/AsyncFaviconHelpers.cpp:736
34 libxul.so nsHttpChannel::OnStopRequest nsCOMPtr.h:663
35 libxul.so nsInputStreamPump::OnStateStop nsCOMPtr.h:663
36 libxul.so nsInputStreamPump::OnInputStreamReady netwerk/base/src/nsInputStreamPump.cpp:403
37 libxul.so nsInputStreamReadyEvent::Run nsCOMPtr.h:663
38 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:626
39 libxul.so NS_ProcessNextEvent_P nsThreadUtils.cpp:250
40 libxul.so mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:111
41 libxul.so MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:220
42 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:512
43 libxul.so nsBaseAppShell::Run widget/src/xpwidgets/nsBaseAppShell.cpp:198
44 libxul.so nsAppStartup::Run toolkit/components/startup/src/nsAppStartup.cpp:192
45 libxul.so XRE_main toolkit/xre/nsAppRunner.cpp:3693
46 libxul.so GeckoStart toolkit/xre/nsAndroidStartup.cpp:131
47 libc.so libc.so@0x10f47
48 libc.so libc.so@0x10a33

More reports at: http://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=libxul.so%400xbff7ec
tracking-fennec: --- → ?
Summary: Fennec crash mainly at startup [@ libxul.so@0xbff7ec ] → Fennec crash mainly at startup [@ libxul.so@0xbff7ec ] [@ libxul.so@0xbff808 ] [@ libxul.so@0xbff818 ] [@ libxul.so@0xbff7f0 ]
Summary: Fennec crash mainly at startup [@ libxul.so@0xbff7ec ] [@ libxul.so@0xbff808 ] [@ libxul.so@0xbff818 ] [@ libxul.so@0xbff7f0 ] → Fennec crash mainly at startup [@ neon_composite_src_8888_0565 ] [@ libxul.so@0xbff7ec ] [@ libxul.so@0xbff808 ] [@ libxul.so@0xbff818 ] [@ libxul.so@0xbff7f0 ]
blocking2.0: --- → ?
tracking-fennec: ? → 2.0+
blocking2.0: ? → -
Summary: Fennec crash mainly at startup [@ neon_composite_src_8888_0565 ] [@ libxul.so@0xbff7ec ] [@ libxul.so@0xbff808 ] [@ libxul.so@0xbff818 ] [@ libxul.so@0xbff7f0 ] → Fennec crash mainly at startup [@ neon_composite_src_8888_0565 ] [@ libxul.so@0xbff7ec ] [@ libxul.so@0xbff808 ] [@ libxul.so@0xbff818 ] [@ libxul.so@0xbff7f0 ] [@ libxul.so@0xc31be0 ][@ libxul.so@0xc29f98 ]
Jeff, mind taking a look at this?
Assignee: nobody → joe
Joe, have you looked into this at all?
Whiteboard: [need-str][at-risk]
Keywords: mobile
This is topcrasher #41 in b4. Not sure if it still blocks, given that.
Keywords: qawanted
Not 100% sure, but could this be related to bug 622470? Would it be possible that the DecodeComplete code is somehow accessing an already discarded surface? Another possibility is that the current cairo has some unknown memory corruption issue that is fixed in Cairo 1.10, but I don't know exactly which cairo change fixes it... I had some issues and strange crashes 6 months ago (see bug 569669), and it still shows some incorrectly rendered input fields at some zoom levels (amazon login page and some others) on my Galaxy Tab.
Summary: Fennec crash mainly at startup [@ neon_composite_src_8888_0565 ] [@ libxul.so@0xbff7ec ] [@ libxul.so@0xbff808 ] [@ libxul.so@0xbff818 ] [@ libxul.so@0xbff7f0 ] [@ libxul.so@0xc31be0 ][@ libxul.so@0xc29f98 ] → Fennec crash mainly at startup [@ neon_composite_src_8888_0565 ][@ @0x0 | neon_composite_src_8888_0565 ][@ libxul.so@0xbff7ec ][@ libxul.so@0xbff808 ][@ libxul.so@0xbff818 ][@ libxul.so@0xbff7f0 ][@ libxul.so@0xc31be0 ][@ libxul.so@0xc29f98 ]
Are there any desktop firefox crashes with gfxPlatform::OptimizeImage showing up in a backtrace?
> Are there any desktop firefox crashes with gfxPlatform::OptimizeImage showing
> up in a backtrace?
Bug 619048 for the imgFrame::Optimize frame, but stack traces are different.
Anyway, I would suggest looking for problems somewhere around mozilla::imagelib::RasterImage::DecodingComplete, as Oleg already mentioned. Pixman itself has a reasonably reliable test suite for this part of the code, which makes off-by-one or similar bugs really difficult to introduce. Of course nothing can be 100% guaranteed, but I would be really surprised if this particular issue turned out to be a bug somewhere in the pixman NEON optimizations.
It seems to me that this is much more likely to be a memory corruption issue elsewhere in Firefox. And I bet that bug 619048 is similar/the same, since they're both crashes under Optimize, but I don't have any ideas as to what this is. We should keep our eyes on this, because it's worrying, but I don't think this needs to block Fennec 2, as it's a low topcrasher. Re-nominating for consideration.
tracking-fennec: 2.0+ → ?
tracking-fennec: ? → 2.0-
Summary: Fennec crash mainly at startup [@ neon_composite_src_8888_0565 ][@ @0x0 | neon_composite_src_8888_0565 ][@ libxul.so@0xbff7ec ][@ libxul.so@0xbff808 ][@ libxul.so@0xbff818 ][@ libxul.so@0xbff7f0 ][@ libxul.so@0xc31be0 ][@ libxul.so@0xc29f98 ] → Crash [@ neon_composite_src_8888_0565 ][@ @0x0 | neon_composite_src_8888_0565 ][@ libxul.so@0xc1bf88 ][@ libxul.so@0xc1bf98 ][@ libxul.so@0xc1bfa4 ][@ libxul.so@0xc1bfb0 ][@ libxul.so@0xc1bf6c ][@ libxul.so@0xc1bf68 ][@ libxul.so@0xc1bf60 ]
The reports with SIGBUS indicate alignment problems. Unless the pointers to buffers are total garbage, the buffers might be just not allocated right. Pixman requires 4 byte alignment for any image buffers even for 8bpp and 16bpp pixel formats.
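To make the alignment requirement above concrete, here is a minimal sketch of the kind of check a caller could run before handing a buffer to pixman. Both function names are hypothetical and are not pixman or Gecko API. The stride check is an assumption on my part: the comment only states the 4-byte requirement for the buffer itself, but an unaligned stride would misalign every following scanline.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not pixman API: true when the buffer start and the
 * row stride (in bytes) are both multiples of 4, so that every scanline
 * begins on a 32-bit boundary. */
static bool image_buffer_is_word_aligned(const void *bits, size_t stride_bytes)
{
    return ((uintptr_t)bits % 4 == 0) && (stride_bytes % 4 == 0);
}

/* Hypothetical helper: rounds a row length up to the next multiple of
 * 4 bytes, which keeps 8bpp and 16bpp rows aligned as well. */
static size_t padded_stride(size_t width_px, size_t bytes_per_pixel)
{
    return (width_px * bytes_per_pixel + 3) & ~(size_t)3;
}
```

For example, a 3-pixel-wide r5g6b5 row occupies 6 bytes, and padded_stride(3, 2) rounds that up to 8 so the next row starts aligned.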
I'm a bit lost trying to understand how these automated reports could be useful the way they are. Are they just a hint at the part of the code that somebody has to carefully review, guessing what could be wrong, and then blindly fix something and see whether the overall crash statistics improve? So do we have absolutely no idea how to reproduce this issue? Even a URL where it is likely to be triggered would be a great help.
Well, in any case it looks like:
1. SIGBUS-related problems are prevailing now, and this is different from the SIGSEGV originally reported.
2. Android uses stricter /proc/cpu/alignment settings than some random Linux: http://groups.google.com/group/android-kernel/browse_thread/thread/e6f978ec3803cfb1

I'll try to test Fennec a bit later today to see if I manage to reproduce any alignment issues.
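For reference, a small sketch of how the value written to /proc/cpu/alignment can be decoded. The bit meanings (1 = warn, 2 = fixup, 4 = signal) follow my recollection of the Linux ARM alignment documentation and should be treated as an assumption; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed bit layout of the user-mode value in /proc/cpu/alignment. */
enum {
    UM_WARN   = 1u << 0, /* kernel logs a message about the access */
    UM_FIXUP  = 1u << 1, /* kernel emulates the unaligned access   */
    UM_SIGNAL = 1u << 2  /* kernel sends SIGBUS to the process     */
};

/* Hypothetical struct describing the decoded policy. */
struct alignment_policy {
    bool warns;
    bool fixes_up;
    bool sends_sigbus;
};

/* Splits the numeric mode into its individual flags. */
static struct alignment_policy decode_alignment_mode(unsigned mode)
{
    struct alignment_policy p;
    p.warns        = (mode & UM_WARN) != 0;
    p.fixes_up     = (mode & UM_FIXUP) != 0;
    p.sends_sigbus = (mode & UM_SIGNAL) != 0;
    return p;
}
```

Under that assumption, a device configured with mode 4 would turn a user-space unaligned access into SIGBUS, while a desktop kernel set to fix up the access would hide the same bug.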
I tried a long browsing session but could not reproduce any bugs or crashes (I guess I suck as a tester). Looking at the crash reports, there are suspiciously many SIGILL/SIGBUS in addition to SIGSEGV, which reminds me of one issue. Old revisions of ARM Cortex-A8 have multiple thumb/thumb2 related hardware bugs which have similar SIGILL/SIGBUS symptoms unless worked around. The workarounds are also not entirely free and cost some performance.

A very brief description of the thumb bug is the following: after a context switch, it may happen that the code starts executing in the wrong arm/thumb state, naturally causing the unlucky application to die for various reasons, SIGILL/SIGBUS being quite common among them. I don't know of any thumb problems with snapdragon processors though, and these also seem to be showing up in the crash reports. So not everything is so clear.

Anyway, there are two questions regarding crash statistics:
1. Can we in any way get the values in registers at the time of the crash and identify the exact crash location (with the disassembly dump)? Checking whether we are in a wrong state should be pretty easy.
2. Can we get separate statistics for N900 to see whether it is also affected by these crashes? As far as I know, the N900 build of Fennec is supposed to be thumb free.
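As a rough illustration of the "wrong state" check in question 1: on ARM the Thumb state is recorded in the T bit (bit 5) of the CPSR, so given the saved CPSR from a crash dump and knowledge of how the crashing function was compiled, a mismatch is a one-line test. This is only a sketch; the function names are hypothetical, and whether the crash reports actually carry the CPSR is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* T bit of the ARM CPSR: set while the core executes Thumb code. */
#define CPSR_THUMB_BIT (UINT32_C(1) << 5)

/* True when the saved CPSR says the core was in Thumb state. */
static bool cpsr_is_thumb(uint32_t cpsr)
{
    return (cpsr & CPSR_THUMB_BIT) != 0;
}

/* Hypothetical check: code_is_thumb describes how the crashing function
 * was compiled (taken from the build, not from the crash dump). A
 * mismatch suggests the core resumed in the wrong instruction set. */
static bool state_mismatch(uint32_t cpsr, bool code_is_thumb)
{
    return cpsr_is_thumb(cpsr) != code_is_thumb;
}
```

For instance, neon_composite_src_8888_0565 is ARM-mode code in a typical build (an assumption about the build flags), so a saved CPSR with the T bit set at that address would point at the erratum rather than at pixman.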
So, that's a fun bug. Note that the kernel needs to protect processes from this, and as far as I remember the N900 kernel doesn't - we trust that processes will not use those instructions.
One more observation. We can compare a 'traditional' bug from the top of the current crashers list
https://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2011-04-02%2006%3A00%3A00&signature=nsThebesFontMetrics%3A%3AGetMetrics&version=Fennec%3A4.0
with this 'neon_composite_src_8888_0565' bug
https://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2011-04-02%2006%3A00%3A00&signature=neon_composite_src_8888_0565&version=Fennec%3A4.0

A noticeable difference is that for the 'neon_composite_src_8888_0565' bug we have a really high percentage of reports coming from 'cyanogenmod', while for the other bug 'cyanogenmod' is not so common. After googling a bit, I got the impression that 'cyanogenmod' is an unofficial custom android firmware. So my random guess is that 'cyanogenmod' might have a slightly broken kernel. Even in the linux world I had to struggle with some kernel bugs and missing errata workarounds:
http://www.spinics.net/lists/arm-kernel/msg79995.html
http://lists.denx.de/pipermail/u-boot/2010-February/067923.html

What is (not so) fun is that some people found it hard to believe that their deadlocks or image corruption issues were not the fault of NEON optimizations in pixman, but were actually caused by problems in their bootloader/kernel.
Same crash on Firefox 4.0.1 for linux

Signature: libxul.so@0xed0892
UUID: d2c221d2-8f25-4046-9aa2-86ed02110511
Uptime: 2.0 days
Last Crash: 173272 seconds (2.0 days) before submission
Install Age: 348420 seconds (4.0 days) since version was first installed.
Install Time: 2011-05-04 08:17:51
Product: Firefox
Version: 4.0.1
Build ID: 20110429093851
Release Channel: unknown
Branch: 2.0
OS: Linux
OS Version: 0.0.0 Linux 2.6.35-23-generic-pae #41~lucid1-Ubuntu SMP Thu Dec 2 23:51:29 UTC 2010 i686
CPU: x86
CPU Info: GenuineIntel family 6 model 23 stepping 10
Crash Reason: SIGSEGV
Crash Address: 0x19bada7c

Signature: libxul.so@0xeb93d2
UUID: f1f437e5-321b-4ffc-9dfd-c7a302110512
Uptime: 20.8 hours
Last Crash: 75402 seconds (20.9 hours) before submission
Install Age: 707432 seconds (1.2 weeks) since version was first installed.
Install Time: 2011-05-04 08:17:51
Product: Firefox
Version: 4.0.1
Build ID: 20110429093851
Release Channel: unknown
Branch: 2.0
OS: Linux
OS Version: 0.0.0 Linux 2.6.35-23-generic-pae #41~lucid1-Ubuntu SMP Thu Dec 2 23:51:29 UTC 2010 i686
CPU: x86
CPU Info: GenuineIntel family 6 model 23 stepping 10
Crash Reason: SIGSEGV
Crash Address: 0x87344858
User Comments: damn
Processor Notes: EMCheckCompatibility True
Please change platform to All !
It isn't only a Fennec crash; regular Firefox crashes too.
(In reply to comment #22)
> It isn't only Fennec crash, regular Firefox too
Surely regular Firefox is not totally bug free either. But what makes you think that it's the same bug?
I just thought that this bug report contains several bugs. And [@ libxul.so@0xc1bf60 ] is one of them.
(In reply to comment #24)
> I just thought that this bug report contains several bugs. And
> [@ libxul.so@0xc1bf60 ] is one of them.
Your stack traces don't include neon_composite_src_8888_0565 as in comment 0, so it is not the same bug. File a new one.
OK
Crash Signature: [@ neon_composite_src_8888_0565 ] [@ @0x0 | neon_composite_src_8888_0565 ] [@ libxul.so@0xc1bf88 ] [@ libxul.so@0xc1bf98 ] [@ libxul.so@0xc1bfa4 ] [@ libxul.so@0xc1bfb0 ] [@ libxul.so@0xc1bf6c ] [@ libxul.so@0xc1bf68 ] [@ libxul.so@0xc1bf60 ]
It still happens in Fennec 9.0b1.
Crash Signature: [@ neon_composite_src_8888_0565 ] [@ @0x0 | neon_composite_src_8888_0565 ] [@ libxul.so@0xc1bf88 ] [@ libxul.so@0xc1bf98 ] [@ libxul.so@0xc1bfa4 ] [@ libxul.so@0xc1bfb0 ] [@ libxul.so@0xc1bf6c ] [@ libxul.so@0xc1bf68 ] [@ libxul.so@0xc1bf60 ] → libxul.so@0xa6a390 ] [@ libxul.so@0xa6a3dc ] [@ neon_composite_src_8888_0565 ] [@ @0x0 | neon_composite_src_8888_0565 ] [@ libxul.so@0xc1bf88 ] [@ libxul.so@0xc1bf98 ] [@ libxul.so@0xc1bfa4 ] [@ libxul.so@0xc1bfb0 ] [@ libxul.so@0xc1bf6c ] [@ libx…
Summary: Crash [@ neon_composite_src_8888_0565 ][@ @0x0 | neon_composite_src_8888_0565 ][@ libxul.so@0xc1bf88 ][@ libxul.so@0xc1bf98 ][@ libxul.so@0xc1bfa4 ][@ libxul.so@0xc1bfb0 ][@ libxul.so@0xc1bf6c ][@ libxul.so@0xc1bf68 ][@ libxul.so@0xc1bf60 ] → Crash in neon_composite_src_8888_0565
Whiteboard: [need-str][at-risk] → [need-str][at-risk][mobile-crash]
Keywords: topcrash
This has happened on Native Fennec too. They're mostly near-NULL crashes under Imagelib. It seems unlikely that these would be OOM, if for no other reason than that tends to be handled pretty well inside Cairo. I'm sort of at a loss here.
Whiteboard: [need-str][at-risk][mobile-crash] → [need-str][at-risk][mobile-crash][native-crash]
Will bug 695498 fix it?
Crash Signature: libxul.so@0xa6a390 ] [@ libxul.so@0xa6a3dc ] [@ neon_composite_src_8888_0565 ] [@ @0x0 | neon_composite_src_8888_0565 ] [@ libxul.so@0xc1bf88 ] [@ libxul.so@0xc1bf98 ] [@ libxul.so@0xc1bfa4 ] [@ libxul.so@0xc1bfb0 ] [@ libxul.so@0xc1bf6c ] [@ libx… → libxul.so@0xa82cd7 | neon_composite_src_8888_0565] [@ neon_composite_src_8888_0565 ] [@ @0x0 | neon_composite_src_8888_0565 ] [@ libxul.so@0xa84108 | libxul.so@0xa82d97 | neon_composite_src_8888_0565] [@ libxul.so@0xa87564 | libxul.so@0xa82dd7 | neon_c…
I only saw one crash for this signature: Product: 'GT-I9100', Manufacturer: 'samsung', at about:blank with build 20120117042008. Looks like a startup crash. https://crash-stats.mozilla.com/report/index/fa0c1990-c92a-47c7-b351-c82ce2120117
Whiteboard: [need-str][at-risk][mobile-crash][native-crash] → [need-str][at-risk][mobile-crash]
(In reply to Naoki Hirata :nhirata from comment #31)
> Seems to be resolved.
It's back in FennecAndroid 14.0b2 and is #54 top crasher. See https://crash-stats.mozilla.com/report/list?signature=neon_composite_src_8888_0565
Crash Signature: [@ neon_composite_src_8888_0565 ] [@ @0x0 | neon_composite_src_8888_0565 ] [@ libxul.so@0xa84108 | libxul.so@0xa82d97 | neon_composite_src_8888_0565] [@ libxul.so@0xa87564 | libxul.so@0xa82dd7 | neon_composite_src_0565_0565] [@ libxul.so@0xa84048 | li… → [@ neon_composite_src_8888_0565 ] [@ @0x0 | neon_composite_src_8888_0565 ]
Whiteboard: [need-str][at-risk][mobile-crash] → [need-str][at-risk][mobile-crash][native-crash]
There are only 4 crashes in 22.0.
Assignee: joe → nobody
Closing this bug report as incomplete since this has no recent reports with supported Fennec versions. Please reopen this bug report if you can reproduce this crash.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 9 years ago
Resolution: --- → INCOMPLETE