Closed Bug 1026951 Opened 10 years ago Closed 10 years ago

Intermittent blank.html | application crashed [@ nsInputStreamPump::OnStateStop()] | Assertion failure: NS_IsMainThread() (OnStateStop should only be called on the main thread.)

Categories

(Core :: General, defect)

x86
Windows 8
defect
Not set
normal

Tracking

RESOLVED FIXED
mozilla33
Tracking Status
firefox32 --- unaffected
firefox33 --- fixed
firefox34 --- fixed
firefox-esr24 --- unaffected
firefox-esr31 --- unaffected
b2g-v1.4 --- unaffected
b2g-v2.0 --- unaffected
b2g-v2.1 --- fixed

People

(Reporter: cbook, Assigned: michal)

Details

(Keywords: crash, intermittent-failure)

Attachments

(1 file)

WINNT 6.2 mozilla-inbound debug test reftest on 2014-06-18 01:12:34 PDT for push 2dab457aed98

slave: t-w864-ix-064

https://tbpl.mozilla.org/php/getParsedLog.php?id=41940450&tree=Mozilla-Inbound

01:16:18  WARNING -  PROCESS-CRASH | file:///C:/slave/test/build/tests/reftest/tests/layout/reftests/reftest-sanity/blank.html | application crashed [@ nsInputStreamPump::OnStateStop()]
01:16:18     INFO -  Crash dump filename: c:\users\cltbld~1.t-w\appdata\local\temp\tmpjreb6h.mozrunner\minidumps\1de4aef1-18e4-42c1-a00d-c348bc0266a7.dmp
01:16:18     INFO -  Operating system: Windows NT
01:16:18     INFO -                    6.2.9200
01:16:18     INFO -  CPU: x86
01:16:18     INFO -       GenuineIntel family 6 model 30 stepping 5
01:16:18     INFO -       8 CPUs
01:16:18     INFO -  Crash reason:  EXCEPTION_BREAKPOINT
01:16:18     INFO -  Crash address: 0x7023d665
01:16:18     INFO -  Thread 29 (crashed)
01:16:18     INFO -   0  xul.dll!nsInputStreamPump::OnStateStop() [nsInputStreamPump.cpp:2dab457aed98 : 676 + 0x20]
01:16:18     INFO -      eip = 0x7023d665   esp = 0x0f43fa28   ebp = 0x0f43fa40   ebx = 0x0ebb32b0
01:16:18     INFO -      esi = 0x0ebb32b0   edi = 0x00000003   eax = 0x00000000   ecx = 0x7460ff12
01:16:18     INFO -      edx = 0x0f43de10   efl = 0x00000216
01:16:18     INFO -      Found by: given as instruction pointer in context
01:16:18     INFO -   1  xul.dll!nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream *) [nsInputStreamPump.cpp:2dab457aed98 : 440 + 0xc]
01:16:18     INFO -      eip = 0x7023dbb1   esp = 0x0f43fa48   ebp = 0x0f43fa60
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   2  xul.dll!nsInputStreamReadyEvent::Run() [nsStreamUtils.cpp:2dab457aed98 : 88 + 0x10]
01:16:18     INFO -      eip = 0x701b5f1c   esp = 0x0f43fa68   ebp = 0x0f43fa78
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   3  xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:2dab457aed98 : 766 + 0xd]
01:16:18     INFO -      eip = 0x701ce694   esp = 0x0f43fa80   ebp = 0x0f43fadc
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   4  xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:2dab457aed98 : 263 + 0xc]
01:16:18     INFO -      eip = 0x70166066   esp = 0x0f43fae4   ebp = 0x0f43faf0
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   5  xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [MessagePump.cpp:2dab457aed98 : 336 + 0x9]
01:16:18     INFO -      eip = 0x70417d63   esp = 0x0f43faf8   ebp = 0x0f43fb1c
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   6  xul.dll!MessageLoop::RunInternal() [message_loop.cc:2dab457aed98 : 229 + 0x8]
01:16:18     INFO -      eip = 0x703e7c84   esp = 0x0f43fb24   ebp = 0x0f43fb3c
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   7  xul.dll!MessageLoop::RunHandler() [message_loop.cc:2dab457aed98 : 222 + 0x4]
01:16:18     INFO -      eip = 0x703e9d49   esp = 0x0f43fb44   ebp = 0x0f43fb70
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   8  xul.dll!MessageLoop::Run() [message_loop.cc:2dab457aed98 : 196 + 0x6]
01:16:18     INFO -      eip = 0x703ea3a9   esp = 0x0f43fb78   ebp = 0x0f43fb90
01:16:18     INFO -      Found by: call frame info
01:16:18     INFO -   9  xul.dll!nsThread::ThreadFunc(void *) [nsThread.cpp:2dab457aed98 : 346 + 0x11]


01:16:04     INFO -  Assertion failure: NS_IsMainThread() (OnStateStop should only be called on the main thread.), at c:\builds\moz2_slave\m-in-w32-d-0000000000000000000\build\netwerk\base\src\nsInputStreamPump.cpp:676
Looks cache-related?
Flags: needinfo?(honzab.moz)
See Also: → 1026965, 1027803
Not necessarily cache-related. The thread root doesn't seem to be the cache IO thread.

Could be caused by bug 1013638, but that landed on 2014-06-13, and this only started occurring regularly a bit later.

Michal, any thoughts?

Also CC'ing other people who might know of a change that could be causing this.
Flags: needinfo?(honzab.moz)
Since it seems most probable, tentatively blocking bug 1013638.
Blocks: 1013638
Michal, any idea here?
Flags: needinfo?(michal.novotny)
Blocks: 1039536
Honza, any other suggestions for who can look into this? I'm getting ready to attempt a backout soon due to the ongoing, cross-branch nature of this and the total lack of attention it's getting.
Flags: needinfo?(honzab.moz)
I'll work on it once I finish another bug. I hope I'll get to it today or tomorrow.
Flags: needinfo?(michal.novotny)
Flags: needinfo?(honzab.moz)
(In reply to Michal Novotny (:michal) from comment #74)
> I'll work on it once I finish another bug. I hope I'll get to it today or
> tomorrow.

Any updates here? :)
Flags: needinfo?(michal.novotny)
nsInputStreamPump::OnStateStop is called off the main thread in one very specific case:

- the pump has been retargeted to a background thread
- nsInputStreamPump::OnInputStreamReady calls nsInputStreamPump::OnStateTransfer on the given event target
- the pump is suspended on some other thread while OnStateTransfer has left the monitor at http://hg.mozilla.org/mozilla-central/annotate/5299864050ee/netwerk/base/src/nsInputStreamPump.cpp#l596
- nsInputStreamPump::OnStateTransfer re-enters the monitor and returns STATE_STOP
- the pump wants to retarget to the main thread at http://hg.mozilla.org/mozilla-central/annotate/5299864050ee/netwerk/base/src/nsInputStreamPump.cpp#l464
- but since the pump is now suspended, EnsureWaiting() is not called and the loop is not broken at http://hg.mozilla.org/mozilla-central/annotate/5299864050ee/netwerk/base/src/nsInputStreamPump.cpp#l474
- the monitor is exited at the end of the for loop iteration and the pump is resumed on the other thread
- the monitor is entered again at the beginning of the next iteration; the pump is no longer suspended and the state is STATE_STOP, so OnStateStop() is called on the wrong thread
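
The interleaving above can be replayed with a small single-threaded model. This is only a sketch: PumpModel, Resume, SimulateStopThread and the guard flag are hypothetical names invented for illustration, not the real nsInputStreamPump code, and the guard shown is just one way to close the gap under these assumptions, not necessarily what the attached patch does. The "other thread" is simulated by performing its suspend and resume at the points where the loop does not hold the monitor.

```cpp
#include <cassert>

// Hypothetical, simplified model of the state loop described above.
enum class State { Transfer, Stop, Idle };
enum class Thread { None, Main, Background };

struct PumpModel {
    State state = State::Transfer;
    int suspendCount = 0;
    bool retargeting = false;   // delivery should move back to the main thread
    bool waiting = true;        // true while the running loop owns delivery
    bool queuedOnMain = false;  // a follow-up pass was scheduled on the main thread
    Thread stopRanOn = Thread::None;
};

// Resume as performed by the other thread: it only reschedules the pump
// when no loop currently owns event delivery.
void Resume(PumpModel& p) {
    assert(p.suspendCount > 0);
    if (--p.suspendCount == 0 && !p.waiting) {
        p.queuedOnMain = true;
    }
}

// Replays the interleaving and reports the thread on which the stop step ran.
// guard == true adds the extra "suspended while retargeting" check.
Thread SimulateStopThread(bool guard) {
    PumpModel p;
    bool resumePending = false;  // the other thread still owes a Resume()

    auto runLoop = [&](Thread current) {
        for (;;) {
            // Monitor is entered at the top of each iteration.
            if (p.suspendCount > 0 || p.state == State::Idle) {
                p.waiting = false;
                break;
            }
            State next = State::Idle;
            if (p.state == State::Transfer) {
                // The transfer step leaves the monitor for a while; the other
                // thread suspends the pump inside that window.
                ++p.suspendCount;
                resumePending = true;
                next = State::Stop;  // the stream is exhausted
                if (current != Thread::Main) p.retargeting = true;
            } else {  // State::Stop
                p.retargeting = false;
                p.stopRanOn = current;
            }
            if (guard && p.suspendCount > 0 && p.retargeting) {
                // Leave the loop; the pending Resume() will reschedule.
                p.state = next;
                p.waiting = false;
                break;
            }
            if (p.suspendCount == 0 && p.retargeting) {
                // Normal retargeting path: hand over to the main thread.
                p.state = next;
                p.waiting = false;
                p.queuedOnMain = true;
                break;
            }
            p.state = next;
            // Monitor is exited here; the other thread resumes in the gap.
            if (resumePending) {
                resumePending = false;
                Resume(p);
            }
        }
    };

    runLoop(Thread::Background);
    if (resumePending) {  // the loop already exited, so the resume happens now
        resumePending = false;
        Resume(p);
    }
    if (p.queuedOnMain) {
        p.queuedOnMain = false;
        p.waiting = true;
        runLoop(Thread::Main);
    }
    return p.stopRanOn;
}
```

In this model, SimulateStopThread(false) reports Thread::Background, which corresponds to the failing assertion, while SimulateStopThread(true) reports Thread::Main because the loop exits while the pump is suspended and the later resume schedules the stop step on the main thread.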
Flags: needinfo?(michal.novotny)
Attached patch: fix
https://tbpl.mozilla.org/?tree=Try&rev=dd0f1401e4ad
https://tbpl.mozilla.org/?tree=Try&rev=84de54b3212e

The first push should verify that nothing was broken by the patch.
The second push runs reftests on Windows many times to verify that the assertion was fixed.
Assignee: nobody → michal.novotny
Attachment #8472950 - Flags: review?(sworkman)
Comment on attachment 8472950 [details] [diff] [review]
fix

Review of attachment 8472950 [details] [diff] [review]:
-----------------------------------------------------------------

Impressive find. r=me.
Attachment #8472950 - Flags: review?(sworkman) → review+
test_crash_manager.js has been updated recently, so that failure might be fixed. Took the liberty of pushing to try with an updated repo:

https://tbpl.mozilla.org/?tree=Try&rev=892f2985807b
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=892f2985807b
I went ahead and pushed this to inbound because the Try run is green and we're running out of time to get this uplifted to Aurora and Beta.

https://hg.mozilla.org/integration/mozilla-inbound/rev/9d229b7007cc
https://hg.mozilla.org/mozilla-central/rev/9d229b7007cc

Thanks for the patch, Michal! Please request Aurora approval on this when you get a chance :)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Comment on attachment 8472950 [details] [diff] [review]
fix

Approval Request Comment
[Feature/regressing bug #]: new http cache
[User impact if declined]: crash
[Describe test coverage new/current, TBPL]: existing reftest
[Risks and why]: fairly low: we found a special case that needed to be handled with a 3-line fix.
[String/UUID change made/needed]: none
Attachment #8472950 - Flags: approval-mozilla-aurora?
Attachment #8472950 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
QA Whiteboard: [qa-]
Target Milestone: mozilla34 → mozilla33
I'm on FF beta and Aurora channels but I still get this error. Is it just me?
(In reply to comexx from comment #119)
> I'm on FF beta and Aurora channels but I still get this error. Is it just me?

This bug was filed for a specific instance we were hitting in our test automation. If you're hitting this crash as well, you should file a new bug in Core::Networking with the details (build, steps to reproduce, crash reporter links if you have them, etc). Thanks!
Thanks, Ryan! Sorry for posting in the wrong section.
No problem, thanks for asking at least!