graph innodb pending buffer writes and log checkpoint age

RESOLVED FIXED

Status

Cloud Services
Operations: Metrics/Monitoring
7 years ago
7 years ago

People

(Reporter: atoll, Assigned: petef)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

Created attachment 561408 [details] [diff] [review]
untested implementation of collection for new values and laying groundwork for the new graphs

Hi, there are two sets of values that expose where we're exceeding IO capacity in SCL2.

One is the "Pending writes" line under BUFFER POOL, which contains three useful values (LRU, flush list, single page).  These should generally be 0, but whenever incoming writes exceed available IO, some or all of them will spike.  Graph the absolute values; flush list is the most important.

The other is the "Checkpoint age" lines under LOG, which contain two useful values ("Checkpoint age" and "Checkpoint age target").  As the checkpoint age approaches 100% of the target, the database will begin to delay incoming requests so that it can flush pending IO to disk.
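
For illustration, here is a minimal sketch of how these values might be pulled out of the InnoDB status text. It is not the attached patch. The sample text, the function names, and the metric key names are all hypothetical, and the exact line format is an assumption that may differ between server versions. The percentage is one reading of "approaches 100%": checkpoint age relative to its target.

```python
import re

# Illustrative excerpt only (values are made up); real SHOW ENGINE INNODB STATUS
# output differs between server versions and builds.
SAMPLE_STATUS = """\
---
LOG
---
Checkpoint age target 80000000
Checkpoint age        60000000
----------------------
BUFFER POOL AND MEMORY
----------------------
Pending writes: LRU 0, flush list 12, single page 0
"""

PENDING_RE = re.compile(
    r"Pending writes:\s*LRU\s+(\d+),\s*flush list\s+(\d+),\s*single page\s+(\d+)"
)
# Anchored to whole lines so "Checkpoint age" does not match the "target" line.
AGE_RE = re.compile(r"^Checkpoint age[ \t]+(\d+)[ \t]*$", re.MULTILINE)
TARGET_RE = re.compile(r"^Checkpoint age target[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def parse_innodb_status(text):
    """Return the metrics found in text as a dict of ints (hypothetical key names)."""
    metrics = {}
    pending = PENDING_RE.search(text)
    if pending:
        lru, flush_list, single_page = (int(v) for v in pending.groups())
        metrics["pending_writes_lru"] = lru
        metrics["pending_writes_flush_list"] = flush_list
        metrics["pending_writes_single_page"] = single_page
    age = AGE_RE.search(text)
    if age:
        metrics["checkpoint_age"] = int(age.group(1))
    target = TARGET_RE.search(text)
    if target:
        metrics["checkpoint_age_target"] = int(target.group(1))
    return metrics


def checkpoint_age_percent(metrics):
    """Checkpoint age as a percentage of its target, or None if unavailable."""
    age = metrics.get("checkpoint_age")
    target = metrics.get("checkpoint_age_target")
    if age is None or not target:
        return None
    return 100.0 * age / target


if __name__ == "__main__":
    parsed = parse_innodb_status(SAMPLE_STATUS)
    print(parsed)
    print(checkpoint_age_percent(parsed))
```

The pending-write counts would be graphed as-is, while the checkpoint age could be graphed either as raw values or as the percentage above.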

I believe that the high checkpoint age values in SCL2 are also why InnoDB shutdowns are taking so long right now.  New, empty databases have 0 for flush list and a small checkpoint age, while the upgraded sync1 (full) database has a nonzero flush list and a high checkpoint age.

I've included a partial patch with this request to help explain what I'm talking about.  It's untested, but may help speed things along a bit.
(Assignee)

Updated

7 years ago
Assignee: nobody → petef
Status: NEW → ASSIGNED