1031697 - Huge Aurora 32 crash spike after landing of bug 1013587, several signatures

Reporter

Description

•

11 years ago

This bug was filed from the Socorro interface and is report bp-c90f7e96-2101-4cf8-8d55-5f0242140628.
=============================================================
https://hg.mozilla.org/releases/mozilla-aurora/rev/18d3fdc4a940
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0 ID:20140628004001
Aurora32.0as(2014-06-28) crashes after update.

Alice0775 White

Reporter

Comment 1

•

11 years ago

Reproducible: often
Steps To Reproduce:
1. Open mail.live.com and log in
2. Open www.jongla.com if necessary
3. Exit the browser and restart it
4. Restore Previous Session

Alice0775 White

Reporter

Comment 2

•

11 years ago

Regression window (mozilla-aurora)
Bad:
https://hg.mozilla.org/releases/mozilla-aurora/rev/d9c6731ad8c3
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0 ID:20140627064630
Crash:
https://hg.mozilla.org/releases/mozilla-aurora/rev/04303003b896
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0 ID:20140627073329
Pushlog:
http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=d9c6731ad8c3&tochange=04303003b896
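For anyone narrowing a window like this, the pushlog link can be built from the two changeset IDs. The following is a minimal sketch that only reproduces the URL pattern visible in the comment above; the function and parameter names are made up for illustration.

```python
from urllib.parse import urlencode

# Base URL copied from the pushlog link in the comment above.
PUSHLOG_BASE = "http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml"


def pushlog_url(from_change: str, to_change: str, base: str = PUSHLOG_BASE) -> str:
    """Build a pushlog URL for the window between two changeset IDs."""
    query = urlencode({"fromchange": from_change, "tochange": to_change})
    return f"{base}?{query}"


# Changeset IDs taken from the regression window in this comment.
print(pushlog_url("d9c6731ad8c3", "04303003b896"))
```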

Alice0775 White

Reporter

Updated

•

11 years ago

Keywords: regression

Alice0775 White

Reporter

Updated

•

11 years ago

Summary: crash in nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions() → Aurora32.0a1, crash in nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions()

Alice0775 White

Reporter

Comment 3

•

11 years ago

More crash IDs:
bp-3b676ac9-8088-4573-ba5c-ebe4a2140628
bp-69c80ed5-b5dc-414d-9fe8-b92d02140628

Crash Signature: [@ nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions()] → [@ nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions()] , crash in NS_CycleCollectorSuspect3 , [@ nsXPCWrappedJS::Call…

Kacper Michajłow [:kasper93]

Comment 4

•

11 years ago

(In reply to Alice0775 White from comment #3)
> another crash id
> bp-3b676ac9-8088-4573-ba5c-ebe4a2140628
> bp-69c80ed5-b5dc-414d-9fe8-b92d02140628

No point in posting crash ids... Here you have almost 10000 reports.
https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=3&signature=nsTArray_base%3CnsTArrayFallibleAllocator%2C+nsTArray_CopyWithConstructors%3CJS%3A%3AHeap%3CJSObject*%3E+%3E+%3E%3A%3AIncrementLength%28unsigned+int%29+|+mozilla%3A%3Anet%3A%3AHttpBaseChannel%3A%3AApplyContentConversions%28%29#tab-reports

Alice0775 White

Reporter

Comment 5

•

11 years ago

In local build
Last Good: 417056b401e5
First Bad: 80f77d10896b
Regressed by:
80f77d10896b Honza Bambas - Bug 1013587 - HTTP cache v2: Start preload on input stream open for existing entries. r=michal, a=lmandel

Blocks: 1013587
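A "Last Good / First Bad" pair like the one above is typically found by bisecting an ordered list of changesets. The following is a generic sketch of that search, not the tooling actually used here. The `crashes` callback stands in for building a changeset and running the steps from comment 1, and the two extra entries in the example history are hypothetical placeholders.

```python
from typing import Callable, Sequence


def first_bad(changesets: Sequence[str], crashes: Callable[[str], bool]) -> str:
    """Binary-search an ordered changeset list for the first one that crashes.

    Assumes changesets[0] is known good, changesets[-1] is known bad, and the
    crash persists in every changeset after the one that introduced it.
    """
    lo, hi = 0, len(changesets) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(changesets[mid]):
            hi = mid  # crash already present, look earlier
        else:
            lo = mid  # still good, look later
    return changesets[hi]


# "older-good" and "newer-bad" are placeholders; the middle two IDs are from this comment.
history = ["older-good", "417056b401e5", "80f77d10896b", "newer-bad"]
known_bad = {"80f77d10896b", "newer-bad"}
print(first_bad(history, lambda cs: cs in known_bad))
```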

Alice0775 White

Reporter

Comment 6

•

11 years ago

Workaround: set browser.cache.use_new_backend_temp = false
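If the browser crashes before about:config can be reached, the pref can also be set from outside by adding a `user_pref` line to `user.js` in the profile directory. This is a rough sketch under that assumption; the helper name is made up, the profile path has to be supplied by the user, and the demo below writes to a temporary directory rather than a real profile.

```python
import tempfile
from pathlib import Path


def set_user_pref(profile_dir: str, name: str, value: bool) -> Path:
    """Append a boolean user_pref line to user.js in the given profile directory."""
    user_js = Path(profile_dir) / "user.js"
    literal = "true" if value else "false"
    with user_js.open("a", encoding="utf-8") as fh:
        fh.write(f'user_pref("{name}", {literal});\n')
    return user_js


# Demo against a throwaway directory standing in for a profile folder.
demo_profile = tempfile.mkdtemp()
written = set_user_pref(demo_profile, "browser.cache.use_new_backend_temp", False)
print(written.read_text(encoding="utf-8"), end="")
```

Prefs in `user.js` are reapplied on every startup, so the line should be removed again once a fixed build is installed.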

Alice0775 White

Reporter

Comment 7

•

11 years ago

Bug 1013587 should be backed out from Aurora.

Flags: needinfo?(honzab.moz)

Robert Kaiser

Comment 8

•

11 years ago

This is URGENT. Due to this, we have 20x the crashes as before on Aurora (and we already had elevated crash rates before).

Crash Signature: [@ nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions()] , crash in NS_CycleCollectorSuspect3 , [@ nsXPCWrappedJS::Call… → [@ nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions()] [@ NS_CycleCollectorSuspect3 ] [@ nsXPCWrappedJS::CallMethod(uns…

Robert Kaiser

Comment 10

•

11 years ago

Sheriffs, can the backout of bug 1013587 on Aurora please be done ASAP? This is causing a *really* bad amount of crashes (we had 77 crashes per 100 ADI yesterday, where usual rate should be around 2 crashes per 100 ADI).

Crash Signature: [@ nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions()] [@ NS_CycleCollectorSuspect3 ] [@ nsXPCWrappedJS::CallMethod(uns… → [@ nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions()] [@ nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithCons…

Flags: needinfo?(sheriffs)
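To put the figures from comment 10 in proportion, here is a small sketch of the arithmetic. It assumes "crashes per 100 ADI" means crash count divided by active daily installs, times 100; the function names are illustrative only.

```python
def crashes_per_100_adi(crashes: float, adi: float) -> float:
    """Normalize a crash count to crashes per 100 active daily installs."""
    if adi <= 0:
        raise ValueError("adi must be positive")
    return crashes / adi * 100


def spike_factor(observed: float, baseline: float) -> float:
    """Return how many times higher the observed rate is than the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return observed / baseline


# Rates quoted in comment 10: 77 per 100 ADI against a usual rate of about 2.
print(spike_factor(77, 2))  # 38.5
```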

Honza Bambas (:mayhemer)

Comment 11

•

11 years ago

Bug 1013587 cannot be landed without bug 1013638. I caused the confusion here: I dropped the approval request for that bug because of an intermittent test failure that is probably caused by its patch, and forgot to remove the a? for this patch as well. Sorry!

Flags: needinfo?(honzab.moz)

Nigel Babu [:nigelb]

Comment 12

•

11 years ago

Backed out from Aurora https://hg.mozilla.org/releases/mozilla-aurora/rev/8e6d0f924669

Nigel Babu [:nigelb]

Updated

•

11 years ago

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Robert Kaiser

Comment 13

•

11 years ago

Thanks Nigel. We also should see to get a "nightly" aurora build done ASAP after the backout goes green.

Flags: needinfo?(sheriffs)

Summary: Aurora32.0a1, crash in nsTArray_base<nsTArrayFallibleAllocator, nsTArray_CopyWithConstructors<JS::Heap<JSObject*> > >::IncrementLength(unsigned int) | mozilla::net::HttpBaseChannel::ApplyContentConversions() → Huge Aurora 32 crash spike after landing of bug 1013587, several signatures

Robert Kaiser

Updated

•

11 years ago

status-firefox32: affected → fixed

tracking-firefox32: ? → ---

Keywords: topcrash

Nigel Babu [:nigelb]

Comment 14

•

11 years ago

The tree looks okay. I've triggered a new nightly build. Should be done in a while! Thanks for hand-holding me through this, Kairo :)

Katelyn Gadd (:kael)

Comment 16

•

11 years ago

Is there an affordance for end-users stuck in this scenario when a breakage like this makes it past Nightly? Grabbing the latest build (with the fix) required starting up Aurora manually using a separate profile, and that seems beyond the skill level of a typical user. Maybe it doesn't matter on the assumption that a broken build like this would never hit release channel. (I seem to remember things like session restore getting disabled in this scenario in older builds of FF, but in this case I had 3 restarts restore session then immediately crash.)

Ed Morley [:emorley]

Comment 17

•

11 years ago

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #10)
> Sheriffs, can the backout of bug 1013587 on Aurora please be done ASAP? This
> is causing a *really* bad amount of crashes (we had 77 crashes per 100 ADI
> yesterday, where usual rate should be around 2 crashes per 100 ADI).

Be aware that needinfo doesn't work on watch-alias email addresses due to bug 179701, only CC does (which the needinfo may/may not do, depending on the watch-alias' settings).

Also in these situations anyone can perform the backout, not just sheriffs - so in many cases it's best to ask on IRC, or even ping the original developer since there may be conflicts when performing the backout - which they would be much better suited to handle :-)

Robert Kaiser

Comment 18

•

11 years ago

(In reply to Ed Morley [:edmorley UTC+0] from comment #17)
> Also in these situations anyone can perform the backout, not just sheriffs -
> so in many cases it's best to ask on IRC, or even ping the original
> developer since there may be conflicts when performing the backout - which
> they would be much better suited to handle :-)

I did ask on IRC as well and nigelb thankfully stepped in after a bit. It was Sunday, I was on my way off to some event (otherwise I'd have done the backout myself), Honza wasn't on when I asked (he came in when Nigel was already on it), and this was *really* bad (see the crash rates I stated in comment #10, for yesterday it even jumped to 280 crashes per 100 ADI, and I heard many comments that Aurora was completely unusable), so I pulled out all the stops I could think of to get this fixed as soon as in any way possible.

FWIW, the second aurora "nightly" build for yesterday that Nigel triggered when the backout was green looks pretty decent in the little crash data we have from it, back to normal levels (but given that the Windows version was only uploaded at 10pm UTC and we use full UTC days in crash-stats, it's not reflected much in yesterday's data yet).

Ryan VanderMeulen [:RyanVM]

Updated

•

11 years ago

status-firefox33: --- → unaffected

Target Milestone: --- → mozilla32

Robert Kaiser

Comment 19

•

11 years ago

Honza, I was just thinking about this one in more depth, and wondered why we did not catch such a large issue with our automation shown on tbpl - are we missing some test coverage there?

Flags: needinfo?(honzab.moz)

Honza Bambas (:mayhemer)

Comment 20

•

11 years ago

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #19)
> Honza, I was just thinking about this one in more depth, and wondered why we
> did not catch such a large issue with our automation shown on tbpl - are we
> missing some test coverage there?

I think I've explained this enough already. I just should have dropped the a+ from the second patch (the one you landed on m-a), since it depended on a prerequisite patch (without it, it crashes) whose a+ I removed at [1] (it was suspected of causing some test failures). You've just landed something w/o a prerequisite, you have not checked dependencies.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1013638#c9

Flags: needinfo?(honzab.moz)
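The dependency check described in comment 20 could be sketched roughly as follows. This is an illustration only, not an existing Bugzilla or sheriffing tool: the `depends_on` mapping and the `approved` set are hypothetical inputs that someone would have to fill in from the bug data before an uplift.

```python
from typing import Dict, List, Set


def missing_prerequisites(
    bug: int, depends_on: Dict[int, List[int]], approved: Set[int]
) -> List[int]:
    """Return all direct and indirect prerequisites of `bug` that lack approval."""
    missing: List[int] = []
    seen: Set[int] = set()
    stack = list(depends_on.get(bug, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already checked; also guards against dependency cycles
        seen.add(dep)
        if dep not in approved:
            missing.append(dep)
        stack.extend(depends_on.get(dep, []))
    return sorted(missing)


# Situation from this thread: 1013587 needs 1013638, whose approval was dropped.
depends_on = {1013587: [1013638]}
approved = {1013587}
print(missing_prerequisites(1013587, depends_on, approved))  # [1013638]
```

An empty result would mean every prerequisite carries approval, so the patch can land on its own.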

Robert Kaiser

Comment 21

•

11 years ago

(In reply to Honza Bambas (:mayhemer) from comment #20)
> You've just landed something w/o a prerequisite, you have not checked
> dependencies.

Yes, I know all that! What I was asking is why we had no automated tests catching it after it was done. It took us more than a day and looking at crash-stats from the wild to catch the problem, while we should be catching huge issues like that right after checkin by tests going orange. Could we reasonably have had tests for this and just missed creating them, or are the cases where we hit this simply untestable and we were unfortunate?

Flags: needinfo?(honzab.moz)

Ed Morley [:emorley]

Comment 22

•

11 years ago

kairo has an extremely valid point - Honza, please can you follow this up (perhaps in a new bug). Whilst you weren't to blame for the crashes, it still highlighted a gap in our test coverage that would be good to fill if possible. Thanks :-)

Honza Bambas (:mayhemer)

Comment 23

•

11 years ago

Ah, I get it now. I honestly hit the bug (and filed bug 1013638) only locally and only rarely (not with every request). I can open a new bug here to build a test for this, but I will probably not have time to work on it anyway. Are you OK with that?

Flags: needinfo?(honzab.moz)

Ed Morley [:emorley]

Comment 24

•

11 years ago

New bug filed for making a test, with someone needinfo'd sounds good to me :-)

Honza Bambas (:mayhemer)

Updated

•

11 years ago

Depends on: 1034087

Robert Kaiser

Comment 25

•

11 years ago

(In reply to Ed Morley [:edmorley UTC+0] from comment #24)
> New bug filed for making a test, with someone needinfo'd sounds good to me
> :-)

I agree, thanks for doing that!

(Away)

Comment 26

•

11 years ago

This issue still accounts for half of our Aurora crashes on any given day. Do we have any way to get people past this build?

Andrei Eftimie

Updated

•

11 years ago

Depends on: 1045509

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Updated

•

11 years ago

Keywords: verifyme

Ada [:adalucinet]

Comment 27

•

11 years ago

Reproduced on Aurora 2014-06-28 with STR from comment 1.

No crash encountered with Firefox 32 beta 4 (Build ID: 20140804164216) on Windows 7 x32:
Mozilla/5.0 (Windows NT 6.1; rv:32.0) Gecko/20100101 Firefox/32.0

Crash reports for 32 beta 3 are present for the last 2 signatures:
[@ nsXPCWrappedJS::AddRef()]:
https://crash-stats.mozilla.com/report/list?signature=nsXPCWrappedJS%3A%3AAddRef%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&version=Firefox%3A32.0b&hang_type=any&date=2014-08-05+13%3A00%3A00&range_value=2#tab-reports
[@ NS_CycleCollectorSuspect3 ]:
https://crash-stats.mozilla.com/report/list?signature=NS_CycleCollectorSuspect3&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&version=Firefox%3A32.0b&hang_type=any&date=2014-08-05+13%3A00%3A00&range_value=2#reports

Any idea why? Are those related? If there's anyone else who could help me out on this matter earlier - since KaiRo is currently out of office - please let me know.

Flags: needinfo?(kairo)

Robert Kaiser

Comment 28

•

11 years ago

(In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #27)
> Any idea why? Are those related?

As you can see in those signatures' "Bugzilla" tabs, there's a lot of different bugs connected to those rather generic signatures. It's expected that a few of the signatures on this bug will still exist, as they did before as well - as long as the spike is gone, everything's fine here, though.

Flags: needinfo?(kairo)

[SV Manager] Florin Mezei, QA (:FlorinMezei)

Comment 29

•

11 years ago

(In reply to Robert Kaiser (:kairo@mozilla.com, slow reaction due to vacation backlog) from comment #28)
> (In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #27)
> > Any idea why? Are those related?
>
> As you can see in those signatures' "Bugzilla" tabs, there's a lot of
> different bugs connected to those rather generic signatures. It's expected
> that a few of the signatures on this bug will still exist, as they did
> before as well - as long as the spike is gone, everything's fine here,
> though.

Marking as Verified, thanks Robert!

Status: RESOLVED → VERIFIED

status-firefox32: fixed → verified

Keywords: verifyme