Closed Bug 891840 Opened 10 years ago Closed 9 years ago

Intermittent test_meta_viewport6.html,test_pixel_lengths.html | application timed out after 330 seconds with no output and/or application crashed [@ nsPresContext::PreferenceChanged(char const*)][@ nsPresContext::AppUnitsPerDevPixel()]

Categories

(Core :: DOM: Core & HTML, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla36
Tracking Status
firefox31 --- disabled
firefox32 --- disabled
firefox33 --- disabled
firefox34 --- disabled
firefox35 --- fixed
firefox36 --- fixed
firefox-esr24 --- unaffected
firefox-esr31 --- disabled
b2g-v1.3 --- disabled
b2g-v1.3T --- disabled
b2g-v1.4 --- disabled
b2g-v2.0 --- disabled
b2g-v2.0M --- disabled
b2g-v2.1 --- disabled
b2g-v2.2 --- fixed

People

(Reporter: cbook, Assigned: johns)

References

()

Details

(Keywords: crash, intermittent-failure)

https://tbpl.mozilla.org/php/getParsedLog.php?id=25120138&tree=Mozilla-Inbound

b2g_ics_armv7a_gecko_emulator_vm mozilla-inbound opt test mochitest-1 on 2013-07-10 04:47:47 PDT for push 0d65b1c05b2f

slave: tst-linux64-ec2-135

05:33:35  WARNING -  TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_meta_viewport6.html | application timed out after 330 seconds with no output
05:33:35     INFO -  INFO | automation.py | Application ran for: 0:39:33.740034
05:33:35     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpILxctSpidlog
05:33:37     INFO -  mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-inbound-ics_armv7a_gecko/1373451898/b2g-25.0a1.en-US.android-arm.crashreporter-symbols.zip
05:33:37     INFO -  checking for crashes in '/data/local/tests/profile/minidumps'
05:33:37     INFO -  Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-inbound-ics_armv7a_gecko/1373451898/b2g-25.0a1.en-US.android-arm.crashreporter-symbols.zip
05:34:02  WARNING -  PROCESS-CRASH | /tests/content/base/test/test_meta_viewport6.html | application crashed [@ nsPresContext::PreferenceChanged(char const*)]
05:34:02     INFO -  Crash dump filename: /tmp/tmpa2tVRD/5e579ef0-cd7a-b841-6f368fc0-71fa0a22.dmp
05:34:02     INFO -  Operating system: Android
05:34:02     INFO -                    0.0.0 Linux 2.6.29-00294-g701690d #1 Mon May 20 22:43:07 CST 2013 armv7l Android/full/generic:4.0.4.0.4.0.4/OPENMASTER/1893:eng/test-keys
05:34:02     INFO -  CPU: arm
05:34:02     INFO -       0 CPUs
05:34:02     INFO -  Crash reason:  SIGSEGV
05:34:02     INFO -  Crash address: 0x10
05:34:02     INFO -  Thread 0 (crashed)
05:34:02     INFO -   0  libxul.so!nsPresContext::PreferenceChanged(char const*) [nsDeviceContext.h:0d65b1c05b2f : 61 + 0x0]
05:34:02     INFO -       r4 = 0x418bbcdc    r5 = 0x4490d800    r6 = 0xbe8d9880    r7 = 0x00000001
05:34:02     INFO -       r8 = 0x0000003f    r9 = 0x00010011   r10 = 0x43934740    fp = 0x41f7fa10
05:34:02     INFO -       sp = 0xbe8d9840    lr = 0x40652c05    pc = 0x407c2f62
05:34:02     INFO -      Found by: given as instruction pointer in context
05:34:02     INFO -   1  libxul.so!nsPresContext::PrefChangedCallback(char const*, void*) [nsPresContext.cpp:0d65b1c05b2f : 161 + 0x7]
05:34:02     INFO -       r4 = 0x43934740    r5 = 0x00000003    r6 = 0x4074e331    r7 = 0x00000000
05:34:02     INFO -       r8 = 0x0000003f    r9 = 0x00010011   r10 = 0x43934740    fp = 0x41f7fa10
05:34:02     INFO -       sp = 0xbe8d98c8    pc = 0x407c3123
05:34:02     INFO -      Found by: call frame info
05:34:02     INFO -   2  libxul.so!mozilla::ValueObserver::Observe(nsISupports*, char const*, unsigned short const*) [Preferences.cpp:0d65b1c05b2f : 143 + 0x3]
05:34:02     INFO -       r4 = 0x43934740    r5 = 0x00000003    r6 = 0x4074e331    r7 = 0x00000000
05:34:02     INFO -       r8 = 0x0000003f    r9 = 0x00010011   r10 = 0x43934740    fp = 0x41f7fa10
05:34:02     INFO -       sp = 0xbe8d98d0    pc = 0x4074e367
05:34:02     INFO -      Found by: call frame info
05:34:02     INFO -   3  libxul.so!nsPrefBranch::NotifyObserver(char const*, void*) [nsPrefBranch.cpp:0d65b1c05b2f : 724 + 0xb]
05:34:02     INFO -       r4 = 0xbe8d9938    r5 = 0xbe8d99cc    r6 = 0x4074e331    r7 = 0x00000000
05:34:02     INFO -       r8 = 0x0000003f    r9 = 0x00010011   r10 = 0x43934740    fp = 0x41f7fa10
05:34:02     INFO -       sp = 0xbe8d9938    pc = 0x40750bdb
05:34:02     INFO -      Found by: call frame info
05:34:02     INFO -   4  libxul.so!pref_DoCallback [prefapi.cpp:0d65b1c05b2f : 956 + 0x7]
05:34:02     INFO -       r4 = 0x43499a50    r5 = 0x00000000    r6 = 0x00000000    r7 = 0x41f585b8
05:34:02     INFO -       r8 = 0x439348a0    r9 = 0x00000000   r10 = 0x00000000    fp = 0x00000000
05:34:02     INFO -       sp = 0xbe8d9a60    pc = 0x40752435
05:34:02     INFO -      Found by: call frame info
05:34:02     INFO -   5  libxul.so!pref_HashPref [prefapi.cpp:0d65b1c05b2f : 802 + 0x3]
05:34:02     INFO -       r4 = 0x00000020    r5 = 0x43499938    r6 = 0x41f585b8    r7 = 0x43115794
05:34:02     INFO -       r8 = 0xbe8d9ddc    r9 = 0x00000000   r10 = 0x00000000    fp = 0x00000000
05:34:02     INFO -       sp = 0xbe8d9a78    pc = 0x407525cb
05:34:02     INFO -      Found by: call frame info
05:34:02     INFO -   6  libxul.so!PREF_SetCharPref [prefapi.cpp:0d65b1c05b2f : 258 + 0x3]
Summary: Intermittent TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_meta_viewport6.html | application timed out after 330 seconds with no output and crashed application crashed [@ nsPresContext::PreferenceChanged(char const*)] → Intermittent TEST-UNEXPECTED-FAIL | test_meta_viewport6.html,test_pixel_lengths.html | application timed out after 330 seconds with no output and/or application crashed [@ nsPresContext::PreferenceChanged(char const*)]
Johnny, this is a long-running crash we've been hitting in automation. Any chance you could help find an owner? :)
Flags: needinfo?(jst)
Summary: Intermittent TEST-UNEXPECTED-FAIL | test_meta_viewport6.html,test_pixel_lengths.html | application timed out after 330 seconds with no output and/or application crashed [@ nsPresContext::PreferenceChanged(char const*)] → Intermittent test_meta_viewport6.html,test_pixel_lengths.html | application timed out after 330 seconds with no output and/or application crashed [@ nsPresContext::PreferenceChanged(char const*)]
Summary: Intermittent test_meta_viewport6.html,test_pixel_lengths.html | application timed out after 330 seconds with no output and/or application crashed [@ nsPresContext::PreferenceChanged(char const*)] → Intermittent test_meta_viewport6.html,test_pixel_lengths.html | application timed out after 330 seconds with no output and/or application crashed [@ nsPresContext::PreferenceChanged(char const*)][@ nsPresContext::AppUnitsPerDevPixel()]
Sorry for the silence here, I did spend some time digging in here trying to figure out what we're actually crashing on here, but failed to make much headway. It seems pretty clear that this is in layout code though, so bumping this to the layout folks. Jet, can you please have one of your guys dig in here?
Flags: needinfo?(jst) → needinfo?(bugs)
To Jonathan for a look. It seems like our usage of SpecialPowers.pushPrefEnv() is running into a very bad race condition (smells similar to bug 701060.) Can you help resolve this one?
Assignee: nobody → jwatt
Flags: needinfo?(bugs)
One thing that is clear here is that the symbols are busted:

15:48:56     INFO -   0  libxul.so!nsPresContext::PreferenceChanged(char const*) [task.h:31abf5bacc6c : 27 + 0x0]
15:48:56     INFO -      eip = 0xb54bfe2c   esp = 0xbff73b70   ebp = 0xbff73d2c   ebx = 0xb6dd98c8
15:48:56     INFO -      esi = 0x9c581c00   edi = 0xbff73bc4   eax = 0x00000000   ecx = 0x00006970
15:48:56     INFO -      edx = 0xb619aeb4   efl = 0x00210202
15:48:56     INFO -      Found by: given as instruction pointer in context

nsPresContext::PreferenceChanged doesn't call CancelableTask's ctor or dtor anywhere (checked under a debugger), so the task.h:27 part is bogus, which is unfortunate as that doesn't reveal much about the crash...  Ted, any idea why?
Flags: needinfo?(ted)
I really don't know what's going on with the symbols here, but I don't think this is a Breakpad issue, it seems like it's just the compiler being weird. The crash in comment 233 is at libxul.so + 0x15b1323, and the relevant bits of the symbol file look like (with filenames+line numbers expanded into hg links):
15b130f http://hg.mozilla.org/releases/mozilla-beta/annotate/10904f84c309/layout/base/nsPresContext.cpp#l935
15b1320 http://hg.mozilla.org/releases/mozilla-beta/annotate/10904f84c309/xpcom/base/nsAutoPtr.h#l1029
15b1323 http://hg.mozilla.org/releases/mozilla-beta/annotate/10904f84c309/ipc/chromium/src/base/task.h#l27
15b1326 http://hg.mozilla.org/releases/mozilla-beta/annotate/10904f84c309/layout/base/nsPresContext.cpp#l914

It would be interesting to grab a full build with symbols from tinderbox and disassemble this function to see WTF the codegen actually looks like here.
Flags: needinfo?(ted)
The symbols on the stack looks valid though - it's a preference change that
is being notified to a pres context.  Maybe we crash b/c the pres context is in
the process of being destroyed?  I wonder if we can stress test this to crash
it more reliably.  The prefs we listen for are here:
http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresContext.cpp?rev=5333221d0d23#1079
Changing one of those, while creating/destroying/reloading a lot of <iframe>s
might trigger it.
(In reply to Mats Palmgren (:mats) from comment #238)
> The symbols on the stack looks valid though - it's a preference change that
> is being notified to a pres context.  Maybe we crash b/c the pres context is
> in
> the process of being destroyed?  I wonder if we can stress test this to crash
> it more reliably.  The prefs we listen for are here:
> http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresContext.
> cpp?rev=5333221d0d23#1079
> Changing one of those, while creating/destroying/reloading a lot of <iframe>s
> might trigger it.

Any luck with this, Mats? :). We're still hitting this crash pretty reliably in automation as you can see.
Flags: needinfo?(matspal)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #270)
> Any luck with this, Mats? :). We're still hitting this crash pretty reliably
> in automation as you can see.

I'm busy with other work for the foreseeable future so it's not likely I'll
get to this anytime soon, sorry.  If someone else can make a standalone
testcase that reproduces the crash reliably, then I'll take a look.
Flags: needinfo?(matspal)
Per comment 235, I guess that last needinfo really should have gone to jwatt anyway.
Flags: needinfo?(jwatt)
I think it's time to considering disabling these tests if we can't fix this crash.
Flags: needinfo?(jst)