Last Comment Bug 390219 - Trunk topcrash [@ nsXMLHttpRequest::OnStartRequest]
: Trunk topcrash [@ nsXMLHttpRequest::OnStartRequest]
Status: RESOLVED FIXED
: crash, regression, verified1.8.1.18
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: All All
: -- critical (vote)
: mozilla1.9alpha8
Assigned To: Jonas Sicking (:sicking) No longer reading bugmail consistently
:
:
Mentors:
: 457411 457746 (view as bug list)
Depends on:
Blocks: 326337
  Show dependency treegraph
 
Reported: 2007-07-30 16:14 PDT by Benjamin Smedberg [:bsmedberg]
Modified: 2011-06-09 14:58 PDT (History)
18 users (show)
benjamin: blocking1.9+
samuel.sidler+old: blocking1.8.1.18+
bzbarsky: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch to fix (4.19 KB, patch)
2007-09-05 17:50 PDT, Jonas Sicking (:sicking) No longer reading bugmail consistently
jst: review+
jst: superreview+
jst: approval1.9+
Details | Diff | Splinter Review
tests (6.86 KB, patch)
2007-09-18 16:49 PDT, Jonas Sicking (:sicking) No longer reading bugmail consistently
peterv: review+
peterv: superreview+
jonas: approval1.9+
Details | Diff | Splinter Review
What actually got checked in (2.30 KB, patch)
2008-09-29 12:59 PDT, Boris Zbarsky [:bz] (still a bit busy)
dveditz: approval1.8.1.18+
Details | Diff | Splinter Review

Description Benjamin Smedberg [:bsmedberg] 2007-07-30 16:14:20 PDT
Trunk topcrash #32 is a new regression.

http://crash-stats.mozilla.com/report/list?range_unit=weeks&branch=1.9&range_value=2&signature=nsXMLHttpRequest%3A%3AOnStartRequest(nsIRequest+*%2CnsISupports+*)

I'm going to presumptively blame bug 389508
Comment 2 Mike Connor [:mconnor] 2007-07-30 20:20:02 PDT
I think this isn't bad enough to block M7, its #17 for 3.0a7pre, with 1/40th of the crashes of the #1 topcrash.  Definitely needed for M8.
Comment 3 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-31 09:51:26 PDT
Would be great if someone could attach a testcase or give steps to reproduce.
Comment 4 Adam Guthrie 2007-07-31 15:25:29 PDT
I'm not sure what graph I was looking at before, but this is the one I wanted: http://crash-stats.mozilla.com/report/list?range_unit=months&query_search=signature&query_type=contains&product=Firefox&branch=1.9&signature=nsXMLHttpRequest%3A%3AOnStartRequest(nsIRequest+*%2CnsISupports+*)&query=nsXMLHttpRequest%3A%3AOnStartRequest&range_value=3.

This started on 2007-07-02, which is when the patch for bug 326337 landed. In fact, it's crashing on one of the lines of code that peterv added.
Comment 5 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-07-31 15:28:37 PDT
Btw, I think that line should set the owner on |channel|, not |mChannel|. The two are different for multipart requests. Need mochitests here.
Comment 6 Peter Van der Beken [:peterv] 2007-08-25 12:43:55 PDT
Sicking: I'm assigning this to you, because I've seen you write a patch :-).
Comment 7 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-09-05 17:50:08 PDT
Created attachment 279830 [details] [diff] [review]
Patch to fix

What's happening is that we get OnStartRequest/OnStopRequests from old requests that has been canceled. These requests come in after we have reopened a new request, so the current checks for being in the uninitialized state don't work as a new request has been opened.
Comment 8 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-09-05 17:53:02 PDT
I've verified that the patch does fix the problem, but i'd like to create a mochitest too. Still working on that part.
Comment 9 Johnny Stenback (:jst, jst@mozilla.com) 2007-09-05 18:12:56 PDT
Comment on attachment 279830 [details] [diff] [review]
Patch to fix

- In nsXMLHttpRequest::OnStartRequest():

+  nsCOMPtr<nsIMultiPartChannel> mpChannel = do_QueryInterface(request);
+  if (mpChannel) {
+    nsCOMPtr<nsIChannel> baseChannel;
+    rv = mpChannel->GetBaseChannel(getter_AddRefs(baseChannel));
+    NS_ENSURE_SUCCESS(rv, rv);
+    
+    if (baseChannel != mChannel) {
+      return NS_OK;
+    }
+  }
+  else if (request != mChannel) {
+    return NS_OK;
+  }

This hunk of code is duplicated in OnStopRequest(). Not a big deal, but it could be split out for better maintainability, maybe.

r+sr=jst either way.
Comment 10 Nickolay_Ponomarev 2007-09-15 09:28:11 PDT
Any reason this wasn't checked in?
Comment 11 Peter Van der Beken [:peterv] 2007-09-15 09:38:51 PDT
This was checked in 10 days ago.
Comment 12 Peter Van der Beken [:peterv] 2007-09-15 09:40:33 PDT
Still need a mochitest though.
Comment 14 Jonas Sicking (:sicking) No longer reading bugmail consistently 2007-09-18 16:49:00 PDT
Created attachment 281392 [details] [diff] [review]
tests

This adds tests for this bug and bug 326337
Comment 15 Boris Zbarsky [:bz] (still a bit busy) 2008-09-29 12:59:52 PDT
Created attachment 340973 [details] [diff] [review]
What actually got checked in

This applies to 1.8.1 and fixes bug 457411.
Comment 16 Boris Zbarsky [:bz] (still a bit busy) 2008-09-29 13:00:16 PDT
*** Bug 457411 has been marked as a duplicate of this bug. ***
Comment 17 Boris Zbarsky [:bz] (still a bit busy) 2008-09-29 13:01:19 PDT
These tests landed
Comment 18 Boris Zbarsky [:bz] (still a bit busy) 2008-09-29 13:01:44 PDT
*** Bug 457746 has been marked as a duplicate of this bug. ***
Comment 19 Daniel Veditz [:dveditz] 2008-10-01 14:52:57 PDT
Comment on attachment 340973 [details] [diff] [review]
What actually got checked in

Approved for 1.8.1.18, a=dveditz for release-drivers
Comment 20 Boris Zbarsky [:bz] (still a bit busy) 2008-10-07 08:36:43 PDT
Checked in on 1.8 branch.
Comment 21 Henrik Skupin (:whimboo) [away 09/30 - 10/06] 2008-10-21 15:42:39 PDT
Boris, this crash still happens for Firefox 3.0.3 even there is a much smaller amount of frames (lets say only 2):

http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=nsXMLHttpRequest%3A%3AOnStartRequest&date=&range_value=1&range_unit=months&do_query=1&signature=nsXMLHttpRequest%3A%3AOnStartRequest(nsIRequest*%2C%20nsISupports*)
Comment 22 Boris Zbarsky [:bz] (still a bit busy) 2008-10-21 18:12:51 PDT
Henrik, that link shows no results...  In any case you'd want to file a separate bug on it if it's happening.
Comment 23 Henrik Skupin (:whimboo) [away 09/30 - 10/06] 2008-10-21 23:24:24 PDT
(In reply to comment #22)
> Henrik, that link shows no results...  In any case you'd want to file a
> separate bug on it if it's happening.

Boris, just to double check before filing a new bug. Could you have a short look at these crashes?

bp-ae3452ae-9965-11dd-8cb4-001a4bd43e5c
bp-25bf57e8-9633-11dd-80b7-001cc45a2c28
bp-57b2ca5e-9eb3-11dd-b91c-001321b13766 (suspicious module)
Comment 24 Jonas Sicking (:sicking) No longer reading bugmail consistently 2008-10-22 03:04:20 PDT
Definitely different.
Comment 25 Boris Zbarsky [:bz] (still a bit busy) 2008-10-22 06:01:20 PDT
The first one is crashing on this line:

  1501 hpradhan 1.90 if (mState & XML_HTTP_REQUEST_PARSEBODY) {

The second one on this one:

  1460 jst 1.101 nsCOMPtr<nsIDOMEventListener> proxy = new nsLoadListenerProxy(requestWeak);

The third at:

  1474 nsCOMPtr<nsIHttpChannel> httpChannel(do_QueryInterface(mChannel));  

None are the crash location for this bug, and while the first and third may be due to the |this| object being garbage, the second can only happen due to OOM or something.  At least assuming the line numbers are correct.  If they are, I would wager that these are cases when we crashed with a corrupted stack and that the stack info is just bogus or something.
Comment 26 timeless 2008-10-22 06:20:09 PDT
the mState crash looks more like a dead this than a null this based on the crash address... not certain....

the second is a null pointer crash
1458 nsWeakPtr requestWeak =
1459 jwalden 1.186 do_GetWeakReference(static_cast<nsIXMLHttpRequest*>(this));
1460 jst 1.101 nsCOMPtr<nsIDOMEventListener> proxy = new nsLoadListenerProxy(requestWeak);

my guess is that this is null, the weakreference would return an error in the omitted out parameter and you'd have a null pointer which is probably . 

the one w/ crash address 0 implies a null this pointer - the third is almost certainly here:
1471 request->GetStatus(&status); 

with request being null.
-- these are probably not related to this bug.
Comment 27 Al Billings [:abillings] 2008-10-28 12:25:32 PDT
Marking verified for 1.8.1.18 since it fixes bug 457411, which is verified.

Note You need to log in before you can comment on or make changes to this bug.