Closed Bug 924348 Opened 11 years ago Closed 10 years ago

Intermittent PROCESS-CRASH | /tests/dom/indexedDB/test/test_add_put.html | application crashed [@ sqlite3LeaveMutexAndCloseZombie] or [@ hashDestroy]

Categories

(Toolkit :: Storage, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla29
Tracking Status
firefox26 --- wontfix
firefox27 --- fixed
firefox28 --- fixed
firefox29 --- fixed
firefox-esr24 --- unaffected
b2g-v1.2 --- fixed
b2g-v1.3 --- fixed

People

(Reporter: cbook, Assigned: janv)

Details

(5 keywords, Whiteboard: [adv-main27+])

Crash Data

Attachments

(1 file, 1 obsolete file)

Rev4 MacOSX Lion 10.7 fx-team opt test mochitest-3 on 2013-10-08 03:34:28 PDT for push 295575989d53

slave: talos-r4-lion-057

https://tbpl.mozilla.org/php/getParsedLog.php?id=28820726&tree=Fx-Team


WARNING -  PROCESS-CRASH | /tests/dom/indexedDB/test/test_add_put.html | application crashed [@ sqlite3LeaveMutexAndCloseZombie]
03:38:12     INFO -  Crash dump filename: /var/folders/hk/v8w8k7ss2g1fjccmkfk0t1l400000w/T/tmpolKVBu/minidumps/6553A854-BC1C-43EC-A742-F6D75017B8B9.dmp
03:38:12     INFO -  Operating system: Mac OS X
03:38:12     INFO -                    10.7.2 11C74
03:38:12     INFO -  CPU: amd64
03:38:12     INFO -       family 6 model 23 stepping 10
03:38:12     INFO -       2 CPUs
03:38:12     INFO -  Crash reason:  EXC_BAD_ACCESS / 0x0000000d
03:38:12     INFO -  Crash address: 0x0
03:38:12     INFO -  Thread 41 (crashed)
03:38:12     INFO -   0  libnss3.dylib!sqlite3LeaveMutexAndCloseZombie [sqlite3.c : 116556 + 0x0]
03:38:12     INFO -      rbx = 0x00000000ffffffff   r12 = 0x0000000000000000
03:38:12     INFO -      r13 = 0x0000000000000000   r14 = 0x000aa80000001500
03:38:12     INFO -      r15 = 0x000000011d23fc00   rip = 0x0000000100522d04
03:38:12     INFO -      rsp = 0x000000013ef6ab20   rbp = 0x000000013ef6ab50
03:38:12     INFO -      Found by: given as instruction pointer in context
03:38:12     INFO -   1  libnss3.dylib!sqlite3Close [sqlite3.c : 116460 + 0x7]
03:38:12     INFO -      rbx = 0x0000000000000000   r12 = 0x000000011f425630
03:38:12     INFO -      r13 = 0x000000011d23fc00   r14 = 0x0000000000000002
03:38:12     INFO -      r15 = 0x0000000000000048   rip = 0x000000010052e2c8
03:38:12     INFO -      rsp = 0x000000013ef6ab60   rbp = 0x000000013ef6ab90
03:38:12     INFO -      Found by: call frame info
03:38:12     INFO -   2  XUL!mozilla::storage::Connection::internalClose() [mozStorageConnection.cpp:295575989d53 : 834 + 0x8]
03:38:12     INFO -      rbx = 0x0000000000000000   r12 = 0x000000011f425630
03:38:12     INFO -      r13 = 0x0000000000000000   r14 = 0x000000011af5c270
03:38:12     INFO -      r15 = 0x000000011f425600   rip = 0x0000000101e0fc66
03:38:12     INFO -      rsp = 0x000000013ef6aba0   rbp = 0x000000013ef6abc0
03:38:12     INFO -      Found by: call frame info
03:38:12     INFO -   3  XUL!mozilla::storage::Connection::Close() [mozStorageConnection.cpp:295575989d53 : 1068 + 0x7]
03:38:12     INFO -      rbx = 0x0000000000000000   r12 = 0x000000011f425630
03:38:12     INFO -      r13 = 0x0000000000000000   r14 = 0x000000011af5c270
03:38:12     INFO -      r15 = 0x000000011f425600   rip = 0x0000000101e0e92b
03:38:12     INFO -      rsp = 0x000000013ef6abd0   rbp = 0x000000013ef6abf0
03:38:12     INFO -      Found by: call frame info
03:38:12     INFO -   4  XUL!mozilla::dom::indexedDB::CommitHelper::Run() [IDBTransaction.cpp:295575989d53 : 923 + 0x5]
03:38:12     INFO -      rbx = 0x000000011f425620   r12 = 0x000000011f425630
03:38:12     INFO -      r13 = 0x0000000000000000   r14 = 0x0000000000000003
03:38:12     INFO -      r15 = 0x000000011f425600   rip = 0x0000000101a4604b
03:38:12     INFO -      rsp = 0x000000013ef6ac00   rbp = 0x000000013ef6aca0
03:38:12     INFO -      Found by: call frame info
Tentatively moving into Storage since we recently upgraded SQLite.
Component: DOM: IndexedDB → Storage
Product: Core → Toolkit
This has been happening for a while now and marked as bug 881641.
From looking at bug 881641, it looks like this started some time around early September.
Summary: Intermittent PROCESS-CRASH | /tests/dom/indexedDB/test/test_add_put.html | application crashed [@ sqlite3LeaveMutexAndCloseZombie] → Intermittent PROCESS-CRASH | /tests/dom/indexedDB/test/test_add_put.html | application crashed [@ sqlite3LeaveMutexAndCloseZombie] or [@ hashDestroy]
I actually think it may be more related to the recent changes in bug 874814. I suppose early September or late August doesn't make a big difference.
(In reply to Marco Bonardo [:mak] from comment #14)
> I actually think it may be more related to the recent changes in bug 874814.
> I suppose early September or late August doesn't make a big difference.

My Try bisecting so far solidly supports a regression range starting in the second week of September. Maybe I'm reading too much into things, but I can't help but notice that both this and bug 925251 are showing libnss on the stack as well.
Depends on: 925251
Congratulations, Jan! Try bisection confirms that you are the lucky winner of the "who caused this orange?" award for your breakthrough performance in bug 785884!

If you could please take a look, it would be greatly appreciated :)
Assignee: nobody → Jan.Varga
Blocks: 785884
Flags: needinfo?(Jan.Varga)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #19)
> Congratulations, Jan! Try bisection confirms that you are the lucky winner
> of the "who caused this orange?" award for your breakthrough performance in
> bug 785884!
> 
> If you could please take a look, it would be greatly appreciated :)

What a pleasure :)
Seriously, I tried to avoid random failures by running under valgrind and re-triggering on try multiple times. I'll see what I can do.
Flags: needinfo?(Jan.Varga)
(In reply to Jan Varga [:janv] from comment #20)
> What a pleasure :)
> Seriously, I tried to avoid random failures by running under valgrind and
> re-triggering on try multiple times. I'll see what I can do.

This is now merged to even more branches. Can you please take a look soon? :)
Flags: needinfo?(Jan.Varga)
Flags: needinfo?(Jan.Varga)
Ping (again).
Flags: needinfo?(Jan.Varga)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #118)
> Ping (again).

I'm investigating it, upgrading to SQLite 3.8.1 didn't help.
It's quite hard since it only happens in optimized builds :(
Flags: needinfo?(Jan.Varga)
(In reply to Jan Varga [:janv] from comment #119)
> I'm investigating it, upgrading to SQLite 3.8.1 didn't help.

Given that this started prior to SQLite 3.8.0 landing, is that a surprise?
Andrew, is there a way we can bump the priority of investigating this top orange that affects nearly every active branch?
Flags: needinfo?(overholt)
Sorry for not getting back here sooner.  Jan and I discussed this the other day and he's going to investigate in the very near future.
Flags: needinfo?(overholt)
(In reply to Andrew Overholt [:overholt] from comment #191)
> Sorry for not getting back here sooner.  Jan and I discussed this the other
> day and he's going to investigate in the very near future.

Jan, don't suppose you have a rough ETA for when you might be able to take a look at this? :-)
Flags: needinfo?(Jan.Varga)
(In reply to Ed Morley [:edmorley UTC+0] from comment #216)
> (In reply to Andrew Overholt [:overholt] from comment #191)
> > Sorry for not getting back here sooner.  Jan and I discussed this the other
> > day and he's going to investigate in the very near future.
> 
> Jan, don't suppose you have a rough ETA for when you might be able to take a
> look at this? :-)

The good news is that I'm now able to reproduce it locally, though not easily.
But still better than debugging on try :)

I'm on Mac OS X 10.9.1, Xcode 5.0.2, using the mozconfig from browser/config/mozconfigs/macosx-universal/nightly
Flags: needinfo?(Jan.Varga)
It seems I have a fix for this:
https://tbpl.mozilla.org/?tree=Try&rev=c601f6159ff0
Attached patch fix (obsolete) — Splinter Review
OK, I couldn't reproduce the crash with this patch either locally or on try (2 x 100 retriggers).
Note that I also saw different stack traces while I was debugging it, so this patch may also eliminate other crashes that are tracked separately.
Attachment #8359089 - Flags: review?(bent.mozilla)
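As a rough way to read the "2 x 100 retriggers" result: the sketch below computes how likely 200 clean runs would be if the crash were still present. It assumes runs are independent and uses hypothetical per-run crash rates, since the bug does not report a measured rate; `prob_all_clean` is an illustrative name, not something from the patch.

```python
def prob_all_clean(crash_rate, runs):
    """Probability that `runs` independent runs all pass when each run
    crashes with probability `crash_rate`."""
    if not 0.0 <= crash_rate <= 1.0:
        raise ValueError("crash_rate must be between 0 and 1")
    return (1.0 - crash_rate) ** runs


# Hypothetical per-run crash rates; none of these is taken from the bug.
for rate in (0.01, 0.02, 0.05):
    chance = prob_all_clean(rate, 200)
    print(f"rate={rate:.2f}: P(200 clean runs) = {chance:.4f}")
```

Under these assumptions, 200 clean runs become unlikely once the underlying crash rate is more than a couple of percent, which is why a large retrigger count is a reasonable smoke test for an intermittent failure.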
Attached patch fix v2 — Splinter Review
It seems Ben is away...
This patch uses a new method LockedAddRef(), suggested by Kyle.
Attachment #8359089 - Attachment is obsolete: true
Attachment #8359089 - Flags: review?(bent.mozilla)
Attachment #8359307 - Flags: review?(khuey)
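The patch itself is not reproduced in this thread, so the following is only a guess at the general pattern that a name like LockedAddRef() suggests: taking a reference while the caller already holds the lock that guards the object's lifetime, so that lookup and increment cannot race with a release on another thread. The `Entry` and `Registry` classes and all method names are hypothetical and written in Python for brevity; they are not Mozilla code.

```python
import threading


class Entry:
    """Toy ref-counted object whose count is guarded by the registry lock."""

    def __init__(self, registry, key):
        self._registry = registry
        self.key = key
        self.refcnt = 0

    def locked_add_ref(self):
        # Hypothetical analogue of LockedAddRef(): the caller is expected to
        # hold the registry lock already. locked() only checks that the lock
        # is held by some thread, so this is a sanity check, not a proof.
        assert self._registry.lock.locked(), "registry lock must be held"
        self.refcnt += 1

    def release(self):
        with self._registry.lock:
            self.refcnt -= 1
            if self.refcnt == 0:
                # Last reference gone: unregister while still holding the lock.
                del self._registry.entries[self.key]


class Registry:
    def __init__(self):
        self.lock = threading.Lock()
        self.entries = {}

    def acquire(self, key):
        # Lookup (or creation) and the increment share one critical section,
        # so a concurrent release() cannot drop the entry in between.
        with self.lock:
            entry = self.entries.get(key)
            if entry is None:
                entry = self.entries[key] = Entry(self, key)
            entry.locked_add_ref()
            return entry
```

In this sketch, `Registry().acquire("db")` called twice returns the same entry with a count of 2, and the entry is only unregistered after both holders call `release()`.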
https://hg.mozilla.org/mozilla-central/rev/d2fe5f5b558e

Jan, thank you very much for tracking this down! One final request - please request approval for uplift to Aurora/Beta/b2g26 :)
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Ok, but I think we should wait a day or so to be 100% sure the fix is correct.
Comment on attachment 8359307 [details] [diff] [review]
fix v2

[Approval Request Comment]
Bug caused by (feature/regressing bug #): temporary storage, bug 785884
User impact if declined: The browser crashes in an unpredictable way (possible security issue).
Testing completed: The fix landed on m-c on Jan 13th, 2014. The patch was tested on try prior to landing: M3 was retriggered 200 times and we didn't see any crashes. RyanVM says "no issues with bug 924348 so far".
Risk to taking this patch (and alternatives if risky): The patch should be safe.
String or UUID changes made by this patch: No strings.
Attachment #8359307 - Flags: approval-mozilla-beta?
Attachment #8359307 - Flags: approval-mozilla-b2g26?
Attachment #8359307 - Flags: approval-mozilla-aurora?
Do we need to do something else with this bug, like marking it as security-sensitive? It's been public for a long time.
Flags: needinfo?(dveditz)
Attachment #8359307 - Flags: approval-mozilla-beta?
Attachment #8359307 - Flags: approval-mozilla-beta+
Attachment #8359307 - Flags: approval-mozilla-b2g26?
Attachment #8359307 - Flags: approval-mozilla-b2g26+
Attachment #8359307 - Flags: approval-mozilla-aurora?
Attachment #8359307 - Flags: approval-mozilla-aurora+
OK, I landed the patch on mozilla-aurora and mozilla-beta; RyanVM will land it on mozilla-b2g26_v1_2.
I don't know if it does any good to hide the bug at this point but at least doing so will trigger inclusion in the release advisories.
Group: core-security
Flags: needinfo?(dveditz)
Whiteboard: [adv-main27+]
Group: core-security