Closed Bug 852467 Opened 11 years ago Closed 11 years ago

nsDisableOldMaxSmartSizePrefEvent runs on the gecko main thread, blocks for long periods of time

Tracking

Status: RESOLVED FIXED

Milestone: mozilla23

Tracking Flags (tracking / status):

firefox21: + / affected

firefox22: --- / affected

firefox23: --- / fixed

fennec: 21+ / ---

People

(Reporter: kats, Assigned: michal)

Details

(Whiteboard: [snappy])

Attachments

(1 file)

fix (patch, 1.14 KB), 11 years ago, Michal Novotny [:michal]; mayhemer: review+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Reporter

Description

11 years ago

See https://bugzilla.mozilla.org/show_bug.cgi?id=797615#c342.

This runnable runs on the gecko main thread and blocks on the cache in its call to nsCacheService::SetDiskSmartSize(). As a result, other things that need to run on the gecko main thread (e.g. some quick compositor operations) don't get to run for many seconds (I reported seeing a delay of ~18 seconds in bug 797615 comment 326). The Java UI thread needs to run these operations synchronously and times out waiting for them, which snowballs into various graphics initialization errors and much badness.

Also tagging as snappy because an 18-second block on the gecko main thread seems pretty bad.
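
For readers who want to see the shape of the problem in isolation, here is a minimal sketch in standard C++, not Gecko code. The names gCacheLock, SlowCacheOpen, RunEventNeedingCacheLock and MeasureMainThreadStall are all made up for illustration. It assumes one thread holds a lock while doing slow I/O and a second thread (standing in for the gecko main thread) needs the same lock for a trivial operation.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>
#include <utility>

using Ms = std::chrono::milliseconds;

// Hypothetical stand-in for the cache service lock; not the real Gecko lock.
std::mutex gCacheLock;

// Background thread: takes the lock, signals that it holds it, then
// simulates slow disk I/O while still holding the lock.
void SlowCacheOpen(Ms ioTime, std::promise<void> lockTaken) {
  std::lock_guard<std::mutex> guard(gCacheLock);
  lockTaken.set_value();
  std::this_thread::sleep_for(ioTime);
}

// "Main thread" event: needs the same lock for a trivial amount of work.
// Returns how long the calling thread was stuck inside this event.
Ms RunEventNeedingCacheLock() {
  const auto start = std::chrono::steady_clock::now();
  {
    // Blocks here until the background thread releases the lock.
    std::lock_guard<std::mutex> guard(gCacheLock);
  }
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<Ms>(end - start);
}

// Starts the slow lock holder, waits until it owns the lock, then runs the
// event on the calling thread and reports how long that thread was stalled.
Ms MeasureMainThreadStall(Ms ioTime) {
  assert(ioTime.count() >= 0);
  std::promise<void> lockTaken;
  std::future<void> ready = lockTaken.get_future();
  std::thread io(SlowCacheOpen, ioTime, std::move(lockTaken));
  ready.wait();
  const Ms stalled = RunEventNeedingCacheLock();
  io.join();
  return stalled;
}
```

Calling MeasureMainThreadStall(Ms(200)) should report a stall close to the simulated I/O time, even though the event itself does almost nothing. Anything else queued behind that event on the same thread waits just as long.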

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Reporter

Comment 1

11 years ago

Disabling the disk cache also fixes the intermittent testSystemPages failure. (https://tbpl.mozilla.org/?tree=Try&rev=336832f5b5d5)

tracking-fennec: --- → ?

Jason Duell

Comment 2

11 years ago

Poked at this, and it looks like the only obvious thing that could cause the blocking here is that the code grabs the cache lock.  So the real culprit is whatever other code is holding the lock for that time.
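
One way to check that the time goes to waiting for the lock rather than to the work done under it is to time the two phases separately. The following is a rough sketch in standard C++ under that assumption. LockTiming, TimeLockedSection and DemoContention are hypothetical names and are not part of nsCacheService.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

using Clock = std::chrono::steady_clock;
using Ms = std::chrono::milliseconds;

struct LockTiming {
  Ms waited;  // time spent waiting to acquire the lock
  Ms worked;  // time spent in the work once the lock was held
};

// Acquires `lock`, runs `work` under it, and reports both phases separately.
template <typename Work>
LockTiming TimeLockedSection(std::mutex& lock, Work work) {
  const auto t0 = Clock::now();
  std::unique_lock<std::mutex> guard(lock);
  const auto t1 = Clock::now();
  work();
  const auto t2 = Clock::now();
  return {std::chrono::duration_cast<Ms>(t1 - t0),
          std::chrono::duration_cast<Ms>(t2 - t1)};
}

// Demo helper: another thread holds the lock for `holdTime` while the
// calling thread tries to run an empty critical section.
LockTiming DemoContention(Ms holdTime) {
  assert(holdTime.count() >= 0);
  std::mutex lock;
  std::promise<void> held;
  std::future<void> ready = held.get_future();
  std::thread holder([&lock, &held, holdTime] {
    std::lock_guard<std::mutex> guard(lock);
    held.set_value();                       // tell the caller we own the lock
    std::this_thread::sleep_for(holdTime);  // simulated slow work under lock
  });
  ready.wait();
  const LockTiming timing = TimeLockedSection(lock, [] {});
  holder.join();
  return timing;
}
```

In the demo, `waited` comes out close to the hold time and `worked` stays near zero, which is the pattern you would expect if some other lock holder is the real culprit.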

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Reporter

Comment 3

11 years ago

The try build log at [1] has the stack traces for all the threads (there are 12 instances of the crash in that log file, but they are all very similar). As an example, from the first crash in that log, there are threads running in methods like:

nsHttpConnectionMgr::GetSpdyPreferredConn(nsHttpConnectionMgr::nsConnectionEntry*)

and

nsDiskCacheBlockFile::Open(nsIFile*, unsigned int, unsigned int, nsDiskCache::CorruptCacheInfo*)

I don't know this code that well but it looks like that second one is probably holding the cache lock.

[1] https://tbpl.mozilla.org/php/getParsedLog.php?id=20803898&tree=Try&full=1

Mark Finkle (:mfinkle) (use needinfo?)

Updated

11 years ago

tracking-fennec: ? → 22+

Erin Lancaster [:elan]

Updated

11 years ago

tracking-fennec: 22+ → ?

Erin Lancaster [:elan]

Updated

11 years ago

tracking-fennec: ? → 22+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Reporter

Updated

11 years ago

Blocks: 856811

Scoobidiver (away)

Updated

11 years ago

Blocks: 834243

Brad Lassey [:blassey] (use needinfo?)

Comment 4

11 years ago

The bug this is blocking is tracking Firefox 21.

tracking-fennec: 22+ → 21+

tracking-firefox21: --- → ?

bhavana bajaj [:bajaj]

Updated

11 years ago

status-firefox21: --- → affected

tracking-firefox21: ? → +

Chris Peterson [:cpeterson]

Updated

11 years ago

status-firefox22: --- → affected

status-firefox23: --- → affected

bhavana bajaj [:bajaj]

Comment 5

11 years ago

Patrick, I spoke to :kats, and we believe the necko team should help with the investigation and next steps here.
Assigning this to you to help with reassignment as needed.

Bug 834243, which is a top crasher, is blocked on this investigation.

Assignee: nobody → mcmanus

Patrick McManus [:mcmanus]

Updated

11 years ago

Assignee: mcmanus → michal.novotny

Doug Turner (:dougt)

Comment 6

11 years ago

Bajaj, how is this bug related to bug 834243?

bhavana bajaj [:bajaj]

Comment 7

11 years ago

(In reply to Doug Turner (:dougt) from comment #6)
> Bajaj, how is this bug related to bug 834243?

Doug, check https://bugzilla.mozilla.org/show_bug.cgi?id=834243#c28 and https://bugzilla.mozilla.org/show_bug.cgi?id=834243#c27. I also checked with :kats, and he said we need help from the necko team to resolve this issue before we can move forward on the top crasher.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Reporter

Comment 8

11 years ago

Most of my investigation about why this is a problem is on bug 797615. See comment 342 and nearby comments on that bug.

Doug Turner (:dougt)

Comment 9

11 years ago

Kats, do you know what the other threads are doing when we are holding that lock?

Michal Novotny [:michal]

Assignee

Comment 10

11 years ago

(In reply to Doug Turner (:dougt) from comment #9)
> Kats, do you know what the other threads are doing when we are holding that
> lock?

The thread that is holding the lock is opening the disk cache. It seems that just opening a few files and reading headers/bitmaps takes ages on Android.

I'm currently testing a fix for this issue on the try server: https://tbpl.mozilla.org/?tree=Try&rev=8626f02da4d4

Michal Novotny [:michal]

Assignee

Comment 11

11 years ago

Attached patch: fix

Attachment #741377 - Flags: review?(honzab.moz)

Honza Bambas (:mayhemer)

Updated

11 years ago

Attachment #741377 - Flags: review?(honzab.moz) → review+

Michal Novotny [:michal]

Assignee

Comment 12

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/e73333270ce5

Ryan VanderMeulen [:RyanVM]

Comment 13

11 years ago

https://hg.mozilla.org/mozilla-central/rev/e73333270ce5

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla23

Scoobidiver (away)

Updated

11 years ago

Blocks: 864103

Scoobidiver (away)

Updated

11 years ago

status-firefox23: affected → fixed
