Diagnose how long cache IO thread queues can be

RESOLVED FIXED in Firefox 50

People

(Reporter: mayhemer, Assigned: mayhemer)

Tracking

(Blocks: 1 bug)

Trunk
mozilla50

Firefox Tracking Flags

(firefox49 affected, firefox50 fixed)

Details

(Whiteboard: [necko-active])

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

2 years ago
We have a low-rate crash in mozilla::net::CacheIOThread::LoopOneLevel:

https://crash-stats.mozilla.com/search/?signature=~mozilla%3A%3Anet%3A%3ACacheIOThread%3A%3ALoopOneLevel&date=%3C%3D2016-06-01&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature

It sometimes crashes on OOM. It might be worth checking how long the queue can get.

I think a DIAGNOSTIC_ASSERT for 1000 operations could do as a start. Or should it be upliftable telemetry?
(Assignee)

Updated

2 years ago
Assignee: nobody → honzab.moz
(Assignee)

Updated

2 years ago
Whiteboard: [necko-active]
(Assignee)

Updated

2 years ago
Summary: Diagnose how long cache IO thread queues can be long → Diagnose how long cache IO thread queues can be
(Assignee)

Updated

2 years ago
Status: NEW → ASSIGNED
(Assignee)

Comment 1

2 years ago
Created attachment 8761654 [details] [diff] [review]
v1

- we accumulate at the moment we dequeue a runnable on the IO thread, separately for each level
- there are 10 buckets for each IO thread level, with a granularity of 30
- when there are e.g. 65 pending runnables at the moment we dequeue, we accumulate bucket #2 (60) and remember to report (accumulate) next time only when the queue length is more than 90 (60 + 30)
- everything over 300 is reported in the last bucket
- some type cleanup
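
The bucketing described above can be sketched roughly as follows. This is not the code from the attached patch: the class name, method name, the number of levels, and the initial threshold of 0 are all assumptions made for illustration, and the sketch returns the bucket index instead of calling any telemetry API. Whether and when the threshold resets is not stated above, so the sketch never resets it.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical names throughout; only the numbers 10 and 30 come from the comment.
constexpr std::size_t kLevels = 3;          // assumed number of IO thread levels
constexpr std::uint32_t kBuckets = 10;      // buckets per level
constexpr std::uint32_t kGranularity = 30;  // queue-length width of one bucket

class QueueLengthReporter {
 public:
  // Call when a runnable is dequeued at `level` while `pending` runnables
  // are queued. Returns the bucket index to accumulate, or std::nullopt
  // when the length does not exceed the remembered threshold for that level.
  std::optional<std::uint32_t> OnDequeue(std::size_t level,
                                         std::uint32_t pending) {
    std::uint32_t& threshold = mThreshold.at(level);
    if (pending <= threshold) {
      return std::nullopt;
    }
    // Lengths past the covered range are clamped into the last bucket.
    const std::uint32_t bucket =
        std::min<std::uint32_t>(pending / kGranularity, kBuckets - 1);
    // Report again only once the queue grows past the next bucket boundary.
    threshold = (bucket + 1) * kGranularity;
    return bucket;
  }

 private:
  // Per-level thresholds, zero-initialized (an assumption).
  std::array<std::uint32_t, kLevels> mThreshold{};
};
```

With the example from the list, `OnDequeue(0, 65)` yields bucket 2 and raises the level-0 threshold to 90, so a later call with 80 pending runnables reports nothing, while 95 yields bucket 3.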
Attachment #8761654 - Flags: review?(michal.novotny)
Attachment #8761654 - Flags: review?(michal.novotny) → review+
(Assignee)

Updated

2 years ago
Keywords: checkin-needed

Comment 2

2 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5038112b5f78
Cache I/O queue length telemetry, r=michal
Keywords: checkin-needed
Sorry, had to back this out for bustage like https://treeherder.mozilla.org/logviewer.html#?job_id=30066432&repo=mozilla-inbound
Flags: needinfo?(honzab.moz)

Comment 4

2 years ago
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/977d6bed6e0a
Backed out changeset 5038112b5f78 for bustage
(Assignee)

Comment 5

2 years ago
Created attachment 8765441 [details] [diff] [review]
v1.1

https://treeherder.mozilla.org/#/jobs?repo=try&revision=cee75e4f673b14f8a714dfa368e8a71a795bbefb

The two errors are fixed; let's see.
Attachment #8761654 - Attachment is obsolete: true
Flags: needinfo?(honzab.moz)
Attachment #8765441 - Flags: review+
(Assignee)

Comment 6

2 years ago
Green, let's do it!
Keywords: checkin-needed

Comment 7

2 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4070b1ce5332
Cache I/O queue length telemetry, r=michal
Keywords: checkin-needed

Comment 8

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/4070b1ce5332
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox50: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50