Closed Bug 1054819 Opened 5 years ago Closed 5 years ago

Intermittent test_bug348497.html,test_bug409604.html,test_copypaste.xhtml | application crashed [@ mozPersonalDictionary::Release()] when new HTTP cache is enabled

Categories

(Core :: Spelling checker, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla34
Tracking Status
firefox32 --- fixed
firefox33 --- fixed
firefox34 --- fixed
firefox-esr24 --- unaffected
firefox-esr31 --- unaffected

People

(Reporter: jduell.mcbugs, Assigned: michal)

References

Details

(Keywords: crash, intermittent-failure)

Attachments

(1 file, 3 obsolete files)

This is blocking turning on mochitests for the new HTTP cache, which is about to land on release, so fairly urgent.

For some reason this test is crashing reliably (but only for e10s-mochitests, and only on the Linux-x64 results:  the Linux x64 Opt results are fine) when the new HTTP cache is turned on.

See the try results here for examples of the failure (note the patch that turns on the new cache):  it's in the M-e10s "1" tests:

   https://tbpl.mozilla.org/?tree=Try&rev=fe5dcd1b7abb
Drew: you wrote this test--can you take a look?
Flags: needinfo?(adw)
Note that bug 1043843 may very well fix this. It appears to be pretty close to landing.
How is it that we haven't seen this on Beta/Aurora/Nightly?  We have trains, don't we?
Because cache2 was explicitly disabled in automation when it landed and was never re-enabled. See bug 1053517.
How can we then release cache2?
That's kind of the point of these bugs...
Also, e10s Mochitests have only been enabled very recently.  (And in fact, this particular test suite is probably going to be hidden again soon for leaking all the time...)
> How can we then release cache2?

We had it passing automation when we landed it, and it's been preff'd on for all nightly/aurora/beta users for the whole train ride (only disabled for automated tests), so we're reasonably confident that it actually works--we just have these mop-up bugs for tests that have gone orange since we actually landed it on trunk.  Other than one memory leak we haven't seen anything that's actually broken with the cache.

But yes, this is not how I generally like to release code.
I can take a look yet this week (if there is no other owner already looking at this).
Assignee: nobody → honzab.moz
Thanks, I probably won't be able to look at this soon.
Flags: needinfo?(adw)
I didn't see this failure on my latest try run on the cache, FWIW:

   https://tbpl.mozilla.org/?tree=Try&rev=2513b38dfc51
This also frequently hits test_bug409604.html and occasionally test_bug348497.html.
https://tbpl.mozilla.org/php/getParsedLog.php?id=46139573&tree=Try
https://tbpl.mozilla.org/php/getParsedLog.php?id=46139692&tree=Try

18:31:07  WARNING -  TEST-UNEXPECTED-FAIL | /tests/dom/events/test/test_bug409604.html | application terminated with exit code 11
18:31:07     INFO -  runtests.py | Application ran for: 0:04:22.789839
18:31:07     INFO -  zombiecheck | Reading PID log: /tmp/tmpId7QP7pidlog
18:31:07     INFO -  ==> process 1740 launched child process 1780
18:31:07     INFO -  zombiecheck | Checking for orphan process with PID: 1780
18:31:20     INFO -  4295 INFO mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/7ef713c7-ba4f-6585-598ad749-03d53780.dmp
18:31:20     INFO -  4296 INFO mozcrash Saved app info as /builds/slave/test/build/blobber_upload_dir/7ef713c7-ba4f-6585-598ad749-03d53780.extra
18:31:20  WARNING -  PROCESS-CRASH | /tests/dom/events/test/test_bug409604.html | application crashed [@ mozPersonalDictionary::Release()]
18:31:20     INFO -  Crash dump filename: /tmp/tmpsySkfH.mozrunner/minidumps/7ef713c7-ba4f-6585-598ad749-03d53780.dmp
18:31:20     INFO -  Operating system: Linux
18:31:20     INFO -                    0.0.0 Linux 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686
18:31:20     INFO -  CPU: x86
18:31:20     INFO -       GenuineIntel family 6 model 62 stepping 4
18:31:20     INFO -       1 CPU
18:31:20     INFO -  Crash reason:  SIGSEGV
18:31:20     INFO -  Crash address: 0x0
18:31:20     INFO -  Thread 37 (crashed)
18:31:20     INFO -   0  libxul.so!mozPersonalDictionary::Release() [mozPersonalDictionary.cpp:fe5dcd1b7abb : 36 + 0x16]
18:31:20     INFO -      eip = 0xb42c1d4a   esp = 0xa68ff020   ebp = 0xa68ff048   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9ef993e0   edi = 0xb725c180   eax = 0x00000000   ecx = 0xb75d98ac
18:31:20     INFO -      edx = 0x00000000   efl = 0x00010282
18:31:20     INFO -      Found by: given as instruction pointer in context
18:31:20     INFO -   1  libxul.so!mozPersonalDictionaryLoader::~mozPersonalDictionaryLoader() [nsAutoPtr.h:fe5dcd1b7abb : 852 + 0x8]
18:31:20     INFO -      eip = 0xb42c1fdd   esp = 0xa68ff050   ebp = 0xa68ff068   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9fac2880   edi = 0x9fac2884
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   2  libxul.so!mozPersonalDictionaryLoader::~mozPersonalDictionaryLoader() [mozPersonalDictionary.cpp:fe5dcd1b7abb : 48 + 0x8]
18:31:20     INFO -      eip = 0xb42c200c   esp = 0xa68ff070   ebp = 0xa68ff088   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9fac2880   edi = 0x9fac2884
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   3  libxul.so!nsRunnable::Release() [nsThreadUtils.cpp:fe5dcd1b7abb : 32 + 0x8]
18:31:20     INFO -      eip = 0xb2956b22   esp = 0xa68ff090   ebp = 0xa68ff0c8   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9fac2880   edi = 0x9fac2884
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   4  libxul.so!nsCOMPtr<nsIRunnable>::~nsCOMPtr() [nsCOMPtr.h:fe5dcd1b7abb : 527 + 0x8]
18:31:20     INFO -      eip = 0xb28f45cc   esp = 0xa68ff0d0   ebp = 0xa68ff0e8   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0xa68ff128   edi = 0xa68ff128
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   5  libxul.so!nsThreadPool::Run() [nsThreadPool.cpp:fe5dcd1b7abb : 221 + 0xb]
18:31:20     INFO -      eip = 0xb293d4dc   esp = 0xa68ff0f0   ebp = 0xa68ff148   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0xa85ffd60   edi = 0xa68ff128
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   6  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:fe5dcd1b7abb : 770 + 0x13]
18:31:20     INFO -      eip = 0xb293a894   esp = 0xa68ff150   ebp = 0xa68ff1a8   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9ef99470   edi = 0xa68ff178
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   7  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:fe5dcd1b7abb : 265 + 0xf]
18:31:20     INFO -      eip = 0xb2956d12   esp = 0xa68ff1b0   ebp = 0xa68ff1e8   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9efa8a60   edi = 0x9ec38ec0
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   8  libxul.so!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [MessagePump.cpp:fe5dcd1b7abb : 326 + 0xb]
18:31:20     INFO -      eip = 0xb2bce83a   esp = 0xa68ff1f0   ebp = 0xa68ff238   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9efa8a60   edi = 0x9ec38ec0
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -   9  libxul.so!MessageLoop::RunInternal() [message_loop.cc:fe5dcd1b7abb : 229 + 0x8]
18:31:20     INFO -      eip = 0xb2bb24f0   esp = 0xa68ff240   ebp = 0xa68ff268   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9ec38ec0   edi = 0x9ec38ec0
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -  10  libxul.so!MessageLoop::Run() [message_loop.cc:fe5dcd1b7abb : 222 + 0x7]
18:31:20     INFO -      eip = 0xb2bb2516   esp = 0xa68ff270   ebp = 0xa68ff298   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9ec38ec0   edi = 0x9ec38ec0
18:31:20     INFO -      Found by: call frame info
18:31:20     INFO -  11  libxul.so!nsThread::ThreadFunc(void*) [nsThread.cpp:fe5dcd1b7abb : 347 + 0x8]
18:31:20     INFO -      eip = 0xb2939f23   esp = 0xa68ff2a0   ebp = 0xa68ff2e8   ebx = 0xb6d4f8a4
18:31:20     INFO -      esi = 0x9ef99470   edi = 0x9ec38ec0
18:31:20     INFO -      Found by: call frame info
Summary: test_copypaste.xhtml failing when new HTTP cache is enabled. → Intermittent test_bug348497.html,test_bug409604.html,test_copypaste.xhtml | application crashed [@ mozPersonalDictionary::Release()] when new HTTP cache is enabled
Attached patch fix (obsolete) — Splinter Review
The event posted to the main thread probably finishes earlier than the one that posted it, so the refptr is released on the wrong thread.
Attachment #8475121 - Flags: review?(honzab.moz)
Component: Serializers → Spelling checker
Looks like there's a problem with this patch:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46263202&tree=Try and lots more like it.
Attached patch patch v2 (obsolete) — Splinter Review
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #14)
> Looks like there's a problem with this patch:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=46263202&tree=Try and lots
> more like it.

Sorry, this patch works correctly.
Assignee: honzab.moz → michal.novotny
Attachment #8475121 - Attachment is obsolete: true
Attachment #8475121 - Flags: review?(honzab.moz)
Attachment #8475212 - Flags: review?(honzab.moz)
Comment on attachment 8475212 [details] [diff] [review]
patch v2

Review of attachment 8475212 [details] [diff] [review]:
-----------------------------------------------------------------

::: extensions/spellcheck/src/mozPersonalDictionary.cpp
@@ +59,3 @@
>  
> +    // Release the dictionary on the main thread
> +    nsCOMPtr<nsIThread> mainThread = do_GetMainThread();

do_GetMainThread() may return null during shutdown, you need to handle that case here.

@@ +59,5 @@
>  
> +    // Release the dictionary on the main thread
> +    nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
> +    mozPersonalDictionary *dict = nullptr;
> +    mDict.swap(dict);

Nit: the more common way of doing this would be:

mDict.forget(&dict);
Attachment #8475212 - Flags: review?(honzab.moz) → review-
On the trunk push including the v2 patch, the crashes do appear to be fixed. However, there's an intermittent leak that looks maybe-related:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46282126&tree=Try

I wonder if Ehsan's comments will address it, though.
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #17)
> On the trunk push including the v2 patch, the crashes do appear to be fixed.
> However, there's an intermittent leak that looks maybe-related:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=46282126&tree=Try
> 
> I wonder if Ehsan's comments will address it, though.

I said it would on IRC, but on the second thought, it's not clear to me if it does any more...
Michal, isn't simpler to make that crap's refcnt thread-safe?
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #17)
> On the trunk push including the v2 patch, the crashes do appear to be fixed.
> However, there's an intermittent leak that looks maybe-related:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=46282126&tree=Try
> 
> I wonder if Ehsan's comments will address it, though.

No, in fact when do_GetMainThread() returns null, we will intentionally leak the dictionary since it is better than releasing the object on a wrong thread.
Attached patch patch v3Splinter Review
Attachment #8475212 - Attachment is obsolete: true
Attachment #8475835 - Flags: review?(ehsan)
Attached patch patch v4 (obsolete) — Splinter Review
(In reply to Honza Bambas (:mayhemer) from comment #19)
> Michal, isn't simpler to make that crap's refcnt thread-safe?

Maybe, have a look at the patch.

https://tbpl.mozilla.org/?tree=Try&rev=6d40bfa651df
Attachment #8475854 - Flags: review?(honzab.moz)
Comment on attachment 8475854 [details] [diff] [review]
patch v4

Review of attachment 8475854 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, I'm afraid we need to keep the CC support - I wasn't aware :((
Comment on attachment 8475854 [details] [diff] [review]
patch v4

Review of attachment 8475854 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, I'm afraid we need to keep the CC support - I wasn't aware :((
Attachment #8475854 - Flags: review?(honzab.moz)
Attachment #8475854 - Attachment is obsolete: true
Attachment #8475835 - Flags: review?(ehsan) → review+
Comment on attachment 8475835 [details] [diff] [review]
patch v3

Approval Request Comment
[Feature/regressing bug #]: Http cache2
[User impact if declined]:  test failure (we want to turn on tests for cache2)
[Describe test coverage new/current, TBPL]:  yes, mochitest-1 discovered this.
[Risks and why]: low--just proxying release of an object to the main thread.
[String/UUID change made/needed]: none
Attachment #8475835 - Flags: approval-mozilla-beta?
Attachment #8475835 - Flags: approval-mozilla-aurora?
Attachment #8475835 - Flags: approval-mozilla-beta?
Attachment #8475835 - Flags: approval-mozilla-beta+
Attachment #8475835 - Flags: approval-mozilla-aurora?
Attachment #8475835 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/9c62e399d472
Status: NEW → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in before you can comment on or make changes to this bug.