Bug 834955 (Open), opened 11 years ago, updated 2 years ago

Sqlite.jsm API to get database size

Categories: Toolkit :: Storage (enhancement, P3)

People: Reporter: gps; Assignee: Unassigned

For FHR, we want to put a Telemetry probe on the size of the database. While we may consider putting such a telemetry probe on all databases via Storage or Sqlite.jsm, I figured a good place to start would be an API on Sqlite.jsm to get the current database's size. We probably want to use OS.File instead of the nsIFile on the mozIStorageConnection instance because nsIFile is synchronous. Although, as Vladan stated on another bug, if we measure at database open time, the stat() is probably cached or cheaply looked up.

Marco: Feel free to push back if you feel this is a silly idea or if you think we should do things differently.
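A minimal sketch of what such a helper might look like, assuming the caller already knows the database path; getDatabaseFileSize and recordTelemetry are hypothetical names used only for illustration, not existing APIs:

Components.utils.import("resource://gre/modules/osfile.jsm");

// Hypothetical helper: resolve to the on-disk size of a database file using
// the async OS.File.stat instead of the synchronous nsIFile.fileSize.
function getDatabaseFileSize(path) {
  return OS.File.stat(path).then(info => info.size);
}

// Usage (the file name and recordTelemetry callback are illustrative):
// getDatabaseFileSize(OS.Path.join(OS.Constants.Path.profileDir, "healthreport.sqlite"))
//   .then(size => recordTelemetry(size));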
I think using OS.File is quite good.
Maybe I'd not add a bunch of properties like .fileSize, .memSize, .whatSize, but rather a getDatabaseStatus() (or similar) call that returns an object of properties, where we can add further stuff in the future?
Actually, there is one thing to take into account: do we care about the real file size, or the database size?
The question sounds dumb, but there's a reason: chunked growth.
To reduce file system fragmentation we grow the database file by a fixed amount at a time; for example, places.sqlite grows by 10 MB at a time, but the actual database contents only occupy part of this space.

The file size is the one you get with OS.File.stat.
The size excluding the chunked-growth padding is PRAGMA page_count * PRAGMA page_size.

places.sqlite and cookies.sqlite are the only internal DBs using chunked growth, AFAIK.
PS: the pragma calculation doesn't take some headers into account, so it's not 100% correct, but those headers are small and uninteresting in the overall measure.
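To make the two measures concrete, here is a minimal sketch that reports both, assuming a Sqlite.jsm connection `conn` and the database file path `path`; getDatabaseStats is a hypothetical name, not an existing Sqlite.jsm API:

Components.utils.import("resource://gre/modules/Task.jsm");
Components.utils.import("resource://gre/modules/osfile.jsm");

// Hypothetical sketch; getDatabaseStats does not exist in Sqlite.jsm.
function getDatabaseStats(conn, path) {
  return Task.spawn(function* () {
    // Real file size on disk, including any chunked-growth padding.
    let fileSize = (yield OS.File.stat(path)).size;
    // Logical size: pages allocated by SQLite (ignores the small header overhead).
    let pageCount = (yield conn.execute("PRAGMA page_count"))[0].getResultByIndex(0);
    let pageSize = (yield conn.execute("PRAGMA page_size"))[0].getResultByIndex(0);
    return { fileSize: fileSize, databaseSize: pageCount * pageSize };
  });
}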
(In reply to Marco Bonardo [:mak] from comment #2)
> Maybe I'd not add a bunch of .fileSize .memSize .whatSize, but a
> getDatabaseStatus (or such) call, that returns an object of properties where
> we can add further stuff in future?

Assuming the primary use case is reporting on groups of statistics, a monolithic API is OK. If we have users only interested in single components, then individual APIs are probably better (less overhead).

(In reply to Marco Bonardo [:mak] from comment #3)
> Actually, there is a fact to take into account, do we care about real file
> size, or database size?

Probably both! We have an open issue for FHR to investigate a more optimal chunked growth setting.
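For reference on the chunked growth side, this is roughly how a growth increment is configured on a raw mozIStorageConnection today, assuming setGrowthIncrement takes (bytes, databaseName); the 10 MB value mirrors the places.sqlite figure above and the file name is only illustrative:

Components.utils.import("resource://gre/modules/Services.jsm");
Components.utils.import("resource://gre/modules/FileUtils.jsm");

// Sketch only: ask for a 10 MB growth increment on the main database
// (the "" argument means the main DB). Real code would keep the connection open.
let file = FileUtils.getFile("ProfD", ["healthreport.sqlite"]);
let conn = Services.storage.openUnsharedDatabase(file);
conn.setGrowthIncrement(10 * 1024 * 1024, "");
conn.close();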
(In reply to Gregory Szorc [:gps] from comment #5)
> Assuming the primary use case is reporting on groups of statistics, a
> monolithic API is OK. If we have users only interested in single components,
> then individual APIs are probably better (less overhead).

Or we may make a frankenAPI like:
db.stats("bacon") => { bacon: "ok" }
db.stats(["bacon", "eggs"]) => { bacon: "ok", eggs: "ok" }
db.stats("all") => { all the stats }
Component: General → Storage
Type: defect → enhancement
Priority: -- → P3
Severity: normal → S3