Closed Bug 609257 Opened 11 years ago Closed 11 years ago

Crash on closing test pilot popup reminder [@ @0x0 | MemoryReporter_MallocAllocated::GetMemoryUsed ]

Categories

(Core :: XPCOM, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

VERIFIED DUPLICATE of bug 611405

People

(Reporter: andrea.turrini, Assigned: cjones)

Details

(Keywords: crash)

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:2.0b7pre) Gecko/20101012 Firefox/4.0b7pre
Build Identifier: Mozilla/5.0 (X11; Linux x86_64; rv:2.0b7pre) Gecko/20101012 Firefox/4.0b7pre

Today, 3 Nov., Test Pilot announced a new study (A Week in the Life of a Browser (v2)) via a popup reminder. If I click the close button in the top right corner of the popup, firefox crashes after about 5 seconds, triggering the Crash reporter.

Crash id is fc9be095-d412-455b-bd56-bf5102101103
Crash f806c3d9-4c04-4da4-93c7-87c1c2101103 is obtained using the same profile with all add-ons disabled, except for TestPilot.
The problem does not occur with a fresh profile.


Reproducible: Always

Steps to Reproduce:
1. Start firefox
2. Wait for testpilot popup
3. Close the popup
Actual Results:  
Firefox crashes

Expected Results:  
Firefox should close the popup without crashing
Attached file Crash backtrace
The crash also occurs when I click on the "More info" link in the popup. A new window opens; its content is as usual, except that the title and the description of the study are replaced by a string that reads more or less "Loading test case informations...", and then the whole of firefox crashes.
More testing, more sources of crashes...

I have edited the prefs.js file in order to add the following property:
user_pref("extensions.testpilot.popup.showOnNewStudy", false);
in order to avoid the popup.
As a side effect, firefox now crashes between 5 and 30 seconds after I launch it, and the property is reverted to its default value (true).

I will attach a backtrace of such crash.

For your information, I did not see similar behaviour with the previous test pilot study ("Firefox 4 Beta Interface (part 2)"), even using the same profile and with the same enabled extensions.

I have now disabled the test pilot add-on in order to be able to use firefox for more than 10 minutes without a crash or an annoying popup...
Severity: normal → critical
Component: General → XPCOM
Keywords: crash
Product: Firefox → Core
QA Contact: general → xpcom
Summary: Crash on closing test pilot popup reminder → Crash on closing test pilot popup reminder [@ getMallocAllocated | MemoryReporter_MallocAllocated::GetMemoryUsed ]
Now firefox also crashes when a fresh profile is used: I created a new user, launched firefox, closed it, added user_pref("extensions.testpilot.popup.showOnNewStudy", false); to prefs.js and then re-launched firefox, which crashed within 20 seconds.
Duplicate of this bug: 610199
confirmed by bug 610199
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Crash on closing test pilot popup reminder [@ getMallocAllocated | MemoryReporter_MallocAllocated::GetMemoryUsed ] → Crash on closing test pilot popup reminder [@ @0x0 | MemoryReporter_MallocAllocated::GetMemoryUsed ]
As expected it also crashes opening about:memory
With FF4.0b7 on x86 (32bit) I get another signature while opening about:memory.
[@ @0x0 | XPCWrappedNative::CallMethod ] 

Example of a crash is here:
http://crash-stats.mozilla.com/report/index/b95dd675-d03a-4298-86f5-93f6f2101107

According to the comments attached to that signature, the steps to reproduce are the same.
blocking2.0: --- → ?
I also get a crash when opening about:memory, but I no longer think that test-pilot is the culprit: I removed it from /usr/lib64/firefox/extensions (so it is no longer available as an extension; in fact it is not listed in about:addons), removed the .mozilla directory of my testing user, and then started firefox. about:memory still crashes firefox.
Andrea, this is expected. The test-pilot study is just the trigger to exercise the code leading to the crash. The about:memory page does it the same way.
I'm going to mark this as blocking beta 7 until we determine the root cause or a valid mitigation technique (like turning off the test pilot study until this is fixed in a future beta, etc).
blocking2.0: ? → beta7+
Group: mozilla-confidential
Hi guys,
I just looked at this bug.  It seems like it's got nothing to do with the notification per se, but rather happens when the "Week in the Life of a Browser" study starts up and queries the memory usage stats (using the exact same code as about:memory).  

We have the ability to issue an update for a study that is already running; clients will pick up the update on restart or when the Test Pilot extension checks for updates (once per 24 hours).  If we can't figure out a fix for the underlying bug and we need to release b7, my proposed workaround is this: issue an update to the study so that it will simply skip the memory usage query if the firefox version equals b7, thereby avoiding the code that triggers the crash.
Accidentally marked it moco conf.
Group: mozilla-confidential
Is there a downside to doing comment 13 asap? I believe we can still reproduce this with about:memory so we should just turn off the about:memory test pilot part on b7. We can later re-enable if we fix this for b7.
I cannot reproduce this with beta7 (build1) on ubuntu 9.04.  about:memory also doesn't crash. 

For those that do see the crash, might there be something distinct about your setups?  I'm using a vanilla profile with no other extensions.
I also can't reproduce this on Mac64. Jono, is there any way we can turn off the about:memory bits only for certain versions on certain platforms?
Never saw this on Mac, Win7 nor win XP in sign off testing for the study.

Wouldn't fixing the root cause in about:memory (if it can be solved) be a better fix than just turning off one trigger?
Assignee: nobody → jones.chris.g
(In reply to comment #18)
> Wouldn't fixing the root cause in about:memory (if it can be solved) be a
> better fix than just turning off one trigger?

Yes, but as we are trying to kick beta7 out the door turning it off is a valid mitigation strategy (just for that build of course...we'd still want to fix the underlying issue eventually).
Yes, we can turn off the about:memory bits for only certain versions on certain platforms.

Question: For which OS should I disable it?  All OS?  Just Windows?  Just Win 7?
I'll clone for the test pilot stuff. Let's keep this bug on fixing the underlying issue.
Blocks: 610488
I can't repro on Ubuntu 10.4 x86-64 either.

Andrea or Wolfgang, which linux distros/versions are you using to reproduce?
I've been trying to reproduce this problem on Win 7, Mac 10.6, and Linux Ubuntu 10.10 (32 and 64bit), but no luck all day.
tl;dr: this specific bug shouldn't block beta7, but I think we already knew that.  Someone with the powers mind clearing the flag?

A random sample of the kernel versions above seems to mostly point back at suse 11.3, with a smattering of very out-of-date ubuntu 8.04.  Suse 11.3 appears to use gcc 4.5, and ubuntu 8.04 uses gcc 4.2.  I can't repro the crash on my ubuntu 8.04 machine that's up to date.

I suspect that the problem here is in our (non-android) linux jemalloc reporter's use of __attribute__((weak)) to link to jemalloc symbols.  Since our build machines use gcc 4.3.3, my guess is that
 (1) the very old gcc 4.2 dynamic linker chokes on 4.3.3 __attribute__((weak)) symbols.  This was fixed in an update.
 (2) the gcc 4.5.(?) dynamic linker chokes on 4.3.3 __attribute__((weak)) symbols.  This may be either an outstanding gcc bug or has been fixed in gcc trunk.

Either way with (2), we're up a creek without a paddle.  I totally agree with just turning off memory reporting for linux in bug 610488.

Our options for fixing are
 (a) Stop using __attribute__((weak)) to link to jemalloc stats.  Probably the best solution, but pretty hard in our current (non-android) linux build config.
 (b) Switch our builders to gcc 4.5 and hope the problem goes away.  Easiest.  (I don't care about way out-of-date ubuntu 8.04 machines and apparently no one else has either.)
 (c) Use __attribute__((weak)) symbols fallibly, that is, null-check the resolved symbols.  This sucks and won't protect against crashing on nonnull (i.e. possibly exploitably) from other linker compat bugs.
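
For illustration, option (c) could look roughly like the C sketch below. The symbol name hypothetical_malloc_allocated and the wrapper get_malloc_allocated are made up for this example and are not the real jemalloc or Gecko names. The sketch assumes gcc or a compatible compiler on Linux, where an unresolved weak function symbol evaluates to a null address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical allocator-stats hook, declared weak. If nothing in the
 * final link defines it, its address is expected to resolve to NULL
 * rather than producing a link error. */
extern size_t hypothetical_malloc_allocated(void) __attribute__((weak));

/* Fallible wrapper: returns 0 and stores the value in *out when the
 * weak symbol resolved, or returns -1 and leaves *out untouched when
 * it did not. */
int get_malloc_allocated(size_t *out)
{
    assert(out != NULL);
    if (hypothetical_malloc_allocated == NULL) {
        return -1; /* symbol missing: report "unavailable", do not call */
    }
    *out = hypothetical_malloc_allocated();
    return 0;
}
```

A caller would treat a -1 return as "stat not available" and skip that reporter. As noted in (c), this only guards against a null address; it does nothing if a linker incompatibility resolves the symbol to some other bogus nonnull address.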
Flag cleared, forgot previously.
blocking2.0: beta7+ → ---
(In reply to comment #22)
> I can't repro on Ubuntu 10.4 x86-64 either.
> 
> Andrea or Wolfgang, which linux distros/versions are you using to reproduce?

I am using openSUSE 11.3 fully updated. I installed firefox from  http://download.opensuse.org/repositories/mozilla:/experimental/openSUSE_11.3/ repository

gcc -v output is:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib64/gcc/x86_64-suse-linux/4.5/lto-wrapper
Target: x86_64-suse-linux
Configured with: ../configure --prefix=/usr --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.5 --enable-ssp --disable-libssp --disable-plugin --with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux' --disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --enable-version-specific-runtime-libs --program-suffix=-4.5 --enable-linux-futex --without-system-libunwind --enable-gold --with-plugin-ld=/usr/bin/gold --with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
Thread model: posix
gcc version 4.5.0 20100604 [gcc-4_5-branch revision 160292] (SUSE Linux) 
and kernel is 2.6.34.7-0.5-desktop
Are we going to hit this when we update GCC on thursday? CCing catlee.
Presumably the suse 11.3 problem will go away, but who knows about machines with 4.4/4.3/4.2 installations.  Would be good to know.
I fired off 13fd707b5c5f at tryserver to get some builds for testing.
[@ @0x0 | XPCWrappedNative::CallMethod ] is another Linux only stack that shows up in Beta7 data with comments relating to closing the Test Pilot popup.
It's likely something in the toolchain. I verified that a stock 4.0b8pre build from mozilla is not crashing while the builds created on openSUSE 11.3 (gcc 4.5) and also 11.2 (gcc 4.4) are.
Andrea, Wolfgang, could you try to verify that this problem is now fixed by the updated study? See bug 610488
about:memory wfm on x64 ubuntu 10.4 (gcc 4.4.3).
(In reply to comment #33)
> Andrea, Wolfgang, could you try to verify that this problem is now fixed by the
> updated study? See bug 610488

Now "A Week in the Life of a Browser (v2)" starts without problems, at least for me.
Similarly, the user_pref("extensions.testpilot.popup.showOnNewStudy", false);
property does not crash firefox when I launch it.
Depends on: 611405
The patches in bug 611405 fix the crash and make about:memory functional
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 611405
Status: RESOLVED → VERIFIED
No longer depends on: 611405
Crash Signature: [@ @0x0 | MemoryReporter_MallocAllocated::GetMemoryUsed ]