Closed Bug 529668 Opened 15 years ago Closed 15 years ago

test_getNewsMessage.js fails after bug 515051 check in (Stream listener registered in a network request channel eats JS error messages)

Categories

(MailNews Core :: Networking: NNTP, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3.1a1

People

(Reporter: standard8, Assigned: standard8)

References

Details

(Keywords: regression)

Attachments

(3 files)

Since bug 515051 landed (and its comm-central build bustage bug 529057) in mozilla-central, test_getNewsMessage.js fails with

TEST-UNEXPECTED-FAIL | /buildbot/linux-comm-central-check/build/objdir/mozilla/_tests/xpcshell/test_news/unit/test_getNewsMessage.js | test failed (with xpcshell return code: -11), see following log:

(log attached)

I've not looked at the code to know if it is anything simple or not.
Oh, this could be a crash (which our tinderboxes probably wouldn't show as one) so running this in debug or something may just show it up.
Attached file Failure Log
and here's the actual log...
Assignee: nobody → honzab.moz
My debug build hasn't shown this failure. Additionally a couple of builders have passed tests. I'm by no means convinced yet, but I'm trying a couple of clobbers on various boxes to see if that fixes the problems.
FWIW a release build (built from scratch locally) also works.
Clobber of the tinderboxes didn't work.
Ok finally reproduced it, top of stack:

#0  0x020bdcb4 in nsStreamListenerTee::OnStopRequest (this=0x873030, request=0x871990, context=0x88c0a0, status=0) at /Users/moztest/comm/main/src/mozilla/netwerk/base/src/nsStreamListenerTee.cpp:73
#1  0x01ba59b5 in nsNNTPProtocol::CleanupAfterRunningUrl (this=0x871980) at /Users/moztest/comm/main/src/mailnews/news/src/nsNNTPProtocol.cpp:5117
#2  0x01bb6656 in nsNNTPProtocol::ProcessProtocolState (this=0x871980, url=0x88c0a4, inputStream=0x871d8c, sourceOffset=97, length=287) at /Users/moztest/comm/main/src/mailnews/news/src/nsNNTPProtocol.cpp:5045
#3  0x0185a07f in nsMsgProtocol::OnDataAvailable (this=0x87198c, request=0x871c20, ctxt=0x88c0a0, inStr=0x871d8c, sourceOffset=97, count=287) at /Users/moztest/comm/main/src/mailnews/base/util/nsMsgProtocol.cpp:359
#4  0x02088e4f in nsInputStreamPump::OnStateTransfer (this=0x871c20) at /Users/moztest/comm/main/src/mozilla/netwerk/base/src/nsInputStreamPump.cpp:508
#5  0x020893aa in nsInputStreamPump::OnInputStreamReady (this=0x871c20, stream=0x871d8c) at /Users/moztest/comm/main/src/mozilla/netwerk/base/src/nsInputStreamPump.cpp:398
#6  0x00584f2e in nsInputStreamReadyEvent::Run (this=0x4dbbcd0) at /Users/moztest/comm/main/src/mozilla/xpcom/io/nsStreamUtils.cpp:112
#7  0x005b87b8 in nsThread::ProcessNextEvent (this=0x80f450, mayWait=1, result=0xbfffd0d4) at /Users/moztest/comm/main/src/mozilla/xpcom/threads/nsThread.cpp:527

The failure is EXC_BAD_ACCESS, mObserver seems to be 0xe3 on my run which seems a little wrong.

This doesn't happen 100% of the time, but it does happen very frequently.
Seems like this uncovers a bad bug, someone is writing to a location it should not write to... Really not simple to find.

I have no useful stack on linux, on mac (10.6) after I attach and continue gdb xpcshell in check-interactive mode, xpcshell quits immediately. It seems not reproducible on windows.
I did a bit of debugging on this and was able to catch the point at which the memory gets overwritten. The basic procedure seems to be:

- The test starts of, sets up the urls and protocols and starts getting mail.
- This calls into nsStreamListenerTee::Init which sets mObserver to 0x0.
- The news then does its job...
- then on completion of getting the mail the code calls  nsNNTPProtocol::CleanupAfterRunningUrl which in turn calls nsStreamListenerTee::OnStopRequest
- The nsStreamListenerTee::OnStopRequest in turn calls into the js code in the test here:

http://hg.mozilla.org/comm-central/annotate/54c135ebb58c/mailnews/news/test/unit/test_getNewsMessage.js#l31

when it gets to the do_check_eq on line 140 we get these calls to getStack/CreateStack.

At this point, that is where the memory containing mObserver is overwritten.

If I step back out from that point to nsStreamListenerTee::OnStopRequest, then 'this' is an nsIStackFrame.
So now I've done that analysis, I've just realised what is happening here.

In streamListener.onStopRequest in the test file, we're calling closeCachedConnections before returning.

That's wrong, because we're in the middle of onStopRequest and that is closing the cached connections which tidies up the nsNNTPProtocol and nsStreamListenerTee objects, hence when js comes along and wants to write a stack, it overwrites the freed memory.

We need to let the onStopRequest unwrap itself and so we need to move closeCachedConnections out to a separate function called after a minimal timeout.
Attached patch The fixSplinter Review
Assignee: honzab.moz → bugzilla
Status: NEW → ASSIGNED
Attachment #413847 - Flags: review?(bienvenu)
Attachment #413847 - Flags: review?(bienvenu) → review+
Checked in: http://hg.mozilla.org/comm-central/rev/7a4e8a9fd5ed

The first linux run has gone green, so this should be fixed now.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 3.1a1
You need to log in before you can comment on or make changes to this bug.