Closed Bug 832609 Opened 13 years ago Closed 12 years ago

test_823965.html : Assertion failure: (next == this) == (prev == this), at ../../../dist/include/mozilla/LinkedList.h:191

Categories

(Core :: DOM: Device Interfaces, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla24
Tracking Status
firefox19 --- fixed
firefox20 --- fixed
firefox21 --- wontfix
firefox22 + fixed
firefox23 + fixed
firefox24 --- fixed
b2g18 --- fixed

People

(Reporter: jfkthame, Assigned: justin.lebar+bug)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(2 files, 1 obsolete file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=18943119&tree=Mozilla-Inbound Assertion failure: (next == this) == (prev == this), at ../../../dist/include/mozilla/LinkedList.h:191 TEST-UNEXPECTED-FAIL | /tests/dom/devicestorage/test/test_823965.html | Exited with code 11 during test run PROCESS-CRASH | /tests/dom/devicestorage/test/test_823965.html | application crashed [@ mozilla::LinkedListElement<nsDOMMemoryFile::DataOwner>::isInList() const] Thread 31 (crashed) Operating system: Linux 0.0.0 Linux 2.6.31.5-127.fc12.x86_64 #1 SMP Sat Nov 7 21:11:14 EST 2009 x86_64 CPU: amd64 family 6 model 23 stepping 10 2 CPUs Crash reason: SIGSEGV Crash address: 0x0 Thread 31 (crashed) 0 libxul.so!mozilla::LinkedListElement<nsDOMMemoryFile::DataOwner>::isInList() const [LinkedList.h : 191 + 0x18] rbx = 0x000000000b07f720 r12 = 0x0000000000000000 r13 = 0x0000000000000001 r14 = 0x00007f1d6aafecc0 r15 = 0x0000000000000000 rip = 0x00007f1dccbd2f9e rsp = 0x00007f1d6aafeb10 rbp = 0x00007f1d6aafeb10 Found by: given as instruction pointer in context 1 libxul.so!mozilla::LinkedListElement<nsDOMMemoryFile::DataOwner>::~LinkedListElement [LinkedList.h : 120 + 0x4] rbx = 0x000000000b07f720 r12 = 0x0000000000000000 r13 = 0x0000000000000001 r14 = 0x00007f1d6aafecc0 r15 = 0x0000000000000000 rip = 0x00007f1dccbd3039 rsp = 0x00007f1d6aafeb20 rbp = 0x00007f1d6aafeb30 Found by: call frame info 2 libxul.so!nsDOMMemoryFile::DataOwner::Release() [nsDOMFile.h : 399 + 0x7] rbx = 0x000000000b07f720 r12 = 0x0000000000000000 r13 = 0x0000000000000001 r14 = 0x00007f1d6aafecc0 r15 = 0x0000000000000000 rip = 0x00007f1dccbee8f9 rsp = 0x00007f1d6aafeb40 rbp = 0x00007f1d6aafeb50 Found by: call frame info 3 libxul.so!DataOwnerAdapter::Release() [nsDOMFile.cpp : 87 + 0x4] rbx = 0x0000000002c04be0 r12 = 0x0000000000000000 r13 = 0x0000000000000001 r14 = 0x00007f1d6aafecc0 r15 = 0x0000000000000000 rip = 0x00007f1dccbee9d8 rsp = 0x00007f1d6aafeb60 rbp = 0x00007f1d6aafeb70 Found by: call frame info 4 libxul.so!nsTArray_Impl<nsCOMPtr<nsIInputStream>, nsTArrayInfallibleAllocator>::DestructRange(unsigned int, unsigned int) [nsTArray.h : 432 + 0x4] rbx = 0x0000000003244ab0 r12 = 0x0000000003244ab0 r13 = 0x0000000000000001 r14 = 0x00007f1d6aafecc0 r15 = 0x0000000000000000 rip = 0x00007f1dcd843657 rsp = 0x00007f1d6aafeb80 rbp = 0x00007f1d6aafeb90 Found by: call frame info 5 libxul.so!nsTArray_Impl<nsCOMPtr<nsIInputStream>, nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned int, unsigned int) [nsTArray.h : 1017 + 0xc] rbx = 0x0000000000000000 r12 = 0x0000000003e682b8 r13 = 0x0000000000000001 r14 = 0x00007f1d6aafecc0 r15 = 0x0000000000000000 rip = 0x00007f1dcd8438e3 rsp = 0x00007f1d6aafeba0 rbp = 0x00007f1d6aafebc0 Found by: call frame info ...etc.
Can this LinkedList be used by multiple threads?
...without locking? What's happening here is that the linked list destructor is running, and it's checking whether the node is in a list. In this case, it's partially in a list (one of the pointer is to self, but the other pointer is not). That's only a valid state if the list is currently being mutated. Which means that either we have memory corruption or the list is being mutated /while the destructor is running/.
Well DataOwner's dtor can run on any thread, so if there's no locking we need to add some.
Attached patch Patch (obsolete) — Splinter Review
Assignee: nobody → khuey
Status: NEW → ASSIGNED
Attachment #705443 - Flags: review?(justin.lebar+bug)
Depends on: 833913
khuey and I talked about this on IRC, and we agree this patch is scary. So what we're going to do is disable this memory reporter on b2g18 (that's safest possible change we could make there) and then try a simpler approach for trunk (register one memory reporter for each large blob). This simpler approach requires bug 833913.
Attachment #705443 - Attachment is obsolete: true
Attachment #705443 - Flags: review?(justin.lebar+bug)
Attachment #705531 - Flags: review?(justin.lebar+bug)
Comment on attachment 705531 [details] [diff] [review] Disable the reporter on b2g18. [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 803684 User impact if declined: crashes, memory corruption, bad times Testing completed: Looks about right to me. Risk to taking this patch (and alternatives if risky): Hard to imagine things getting worse as a result of this patch. String or UUID changes made by this patch: none
Attachment #705531 - Flags: review?(justin.lebar+bug)
Attachment #705531 - Flags: review+
Attachment #705531 - Flags: approval-mozilla-b2g18?
Attachment #705531 - Flags: approval-mozilla-b2g18? → approval-mozilla-b2g18+
Whiteboard: checkin-needed
Keywords: checkin-needed
Whiteboard: checkin-needed
Comment on attachment 705531 [details] [diff] [review] Disable the reporter on b2g18. [Approval Request Comment] See comment 9.
Attachment #705531 - Flags: approval-mozilla-beta?
Attachment #705531 - Flags: approval-mozilla-aurora?
Attachment #705531 - Flags: approval-mozilla-beta?
Attachment #705531 - Flags: approval-mozilla-beta+
Attachment #705531 - Flags: approval-mozilla-aurora?
Attachment #705531 - Flags: approval-mozilla-aurora+
Aand we never followed up with this bug. :(
The paper-over fix needs to be uplifted to Aurora again, and we need to actually fix this. It's not just the dtor; the constructor also can run on any thread, afaict from bug 873623.
Comment on attachment 705531 [details] [diff] [review] Disable the reporter on b2g18. We need to re-land this on Aurora. :(
Attachment #705531 - Flags: approval-mozilla-aurora+ → approval-mozilla-aurora?
Ah crap. I hate it when I lose track of bugs.
I'm compiling a patch for this that involves making a StaticMutex, which is something I've wanted in many other cases. If this works, it would be pretty nice...
Assignee: khuey → justin.lebar+bug
No way this will compile anywhere else. https://tbpl.mozilla.org/?tree=Try&rev=6469459a5db5
Depends on bug 873801, which has its own dependency.
Attachment #751407 - Flags: review?(khuey)
Depends on: 873801
Comment on attachment 705531 [details] [diff] [review] Disable the reporter on b2g18. Approving for Aurora 23 - should we also get this on Beta 22? We aren't tracking this bug because it's not clear that we have signs of significant, reproducible user impact here. Please nominate if you believe it to be more serious.
Attachment #705531 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
> We aren't tracking this bug because it's not clear that we have signs of significant, > reproducible user impact here. Probably because few sites use workers. > should we also get this on Beta 22? Yes; thanks for asking. > Please nominate if you believe it to be more serious. I believe this to be pretty serious. I'll elaborate in a PM.
No longer depends on: 833913
Comment on attachment 705531 [details] [diff] [review] Disable the reporter on b2g18. Re-nom'ing for beta. It's a shame we missed this for the last release, but that ship has sailed.
Attachment #705531 - Flags: approval-mozilla-beta+ → approval-mozilla-beta?
Checkin-needed to re-land on Aurora (and beta, if and when we get re-approval).
Keywords: checkin-needed
Whiteboard: [for checkin-needed, see comment 28]
Keywords: checkin-needed
Whiteboard: [for checkin-needed, see comment 28]
(In reply to Ryan VanderMeulen [:RyanVM][Out May 23-27] from comment #29) > https://hg.mozilla.org/releases/mozilla-aurora/rev/729fb1f47da8 > > Not sure what the beta holdup is here... It's not clear that this intermittent failure is hitting FF22, so we were waiting for jlebar to elaborate on severity here (didn't get a PM).
I sent a PM to lsblakk immediately after posting the comment. I didn't get a response; maybe it didn't go through. I'll just say it here: I think you could exploit this bug without a lot of difficulty.
Attachment #705531 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Keywords: checkin-needed
Gah, https://hg.mozilla.org/integration/mozilla-inbound/rev/56f89cd46261 landed with this bug in the commit message instead of bug 873801. At least they're related bugs...
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Ah, that's why this hasn't landed on beta: The m-c merge removed checkin-needed. I'll land this on beta myself.
That and the whole holiday weekend thing...
I just hit the following assertion: http://pastebin.mozilla.org/2565047 It was my understanding that this bug should have fixed it.
That's not an assertion, that's just a stack. I bet that's a refcounting threadsafety assertion firing.
Well, now it's an assertion + a stack:) http://pastebin.mozilla.org/2565061 And yes, it looks like a refcounting threadsafety assertion.
Yeah my guess is that we create it on whatever thread wants to use it first which may not be the main thread. It should really just have threadsafe refcounting. Open a new bug please.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: