668378 - Telemetry sqlite IO

Assignee

Description

•

14 years ago

My proposal for monitoring sqlite io via telemetry is to shim the sqlite vfs with telemetry-wrapped equivalents. Then we get histograms for sqlite_open, sqlite_read. The benefit of this approach is that we get time and amount of io. Alternatively we can monitor every sqlite command and just measure the time spent on that. That will require wrapping every sqlite-usage in some sort of telemetry.

(dormant account)

Assignee

Comment 1

•

14 years ago

Attached patch begining of an sqlite wrapper (obsolete) — Details — Splinter Review

wip

(dormant account)

Assignee

Comment 2

•

14 years ago

Attached patch basically it (obsolete) — Details — Splinter Review

This basically does what I want. Just need to clean it up.

Attachment #545302 - Attachment is obsolete: true

(dormant account)

Assignee

Comment 3

•

14 years ago

Marco, So this is a rough prototype. How should I structure the code so I can land this? Should I put the wrapper funcs into a separate file? any other conventions I should follow here?

(dormant account)

Assignee

Updated

•

14 years ago

Blocks: 572459

Marco Bonardo [:mak]

Comment 4

•

14 years ago

well, for sure this code can't live in the Service source file, the patch looks scarier than it may be :) We already have another vfs for quota, that lives in db/sqlite3/src/test_quota.c this may probably be structured similarly, even if, being made of our own code, it should probably live in Storage and methods should not be called sqlite3_something, probably moz_sqlite3_something may work. mozTelemetryVFSShim may work for the file name. But right now I guess I may have more questions than answers. How is this interacting with the quota shim? To me looks like if quota is enabled it overrides your VFS and so we don't measure any db using quota (like indexedDB)? And I assume the same for any db using OpenDatabaseWithVFS. I guess if, being both measuring I/O in some sense, there shouldn't rather be a single VFS doing both. Did you measure the overhead of calling telemetry at each read and write? Even if I expect IO time to dominate, we should obviously reduce as much as possible the probe effect. Does accumulate act only on memory stuff?

(dormant account)

Assignee

Comment 5

•

14 years ago

(In reply to comment #4) > well, for sure this code can't live in the Service source file, the patch > looks scarier than it may be :) > We already have another vfs for quota, that lives in > db/sqlite3/src/test_quota.c this may probably be structured similarly, even > if, being made of our own code, it should probably live in Storage and > methods should not be called sqlite3_something, probably > moz_sqlite3_something may work. mozTelemetryVFSShim may work for the file > name. > > But right now I guess I may have more questions than answers. > > How is this interacting with the quota shim? To me looks like if quota is > enabled it overrides your VFS and so we don't measure any db using quota > (like indexedDB)? And I assume the same for any db using OpenDatabaseWithVFS. Sounds plausible. > > I guess if, being both measuring I/O in some sense, there shouldn't rather > be a single VFS doing both. Good idea. I didn't realize we already did this elsewhere. > Did you measure the overhead of calling telemetry at each read and write? > Even if I expect IO time to dominate, we should obviously reduce as much as > possible the probe effect. Does accumulate act only on memory stuff? The overhead fine. It's a few method calls. " Does accumulate act only on memory stuff?" I assume you are asking if accumulate only touches memory(and not disk). That's correct. No extra IO caused by this. I'll look into this some more tomorrow. So far it sounds like it would make sense to just have a single vfs that can opt in to quotas and telemetry

(dormant account)

Assignee

Comment 6

•

14 years ago

Attached patch how about this? (obsolete) — Details — Splinter Review

Attachment #545537 - Attachment is obsolete: true

Attachment #546041 - Flags: review?(mak77)

(dormant account)

Assignee

Comment 7

•

14 years ago

Comment on attachment 546041 [details] [diff] [review] how about this? Didn't mean to imply this was review quality yet.

Attachment #546041 - Flags: review?(mak77) → feedback?(mak77)

Marco Bonardo [:mak]

Comment 8

•

13 years ago

Some thought I had along the week end. This is going to give us general SQLite IO patterns, but what's the final scope of this measurement? The issue is that I feel like we are going to get polluted data, due to a single issue: add-ons. While it's absolutely useful to measure internal SQLite implementers, it may be tricky in the wild. I don't expect add-ons to have the same attention we try to have with IO patterns and queries quality, a single add-on may easily put cookies or Places or any other db on its knees, due to badly optimized queries. We'd get almost useless data, since we don't have direct control on add-ons code. As an internal issue instead, the IO may greatly change based on the datasize, a 10MB places.sqlite is likely going to have different patterns from a 100MB one, a db with lots of bookmarks or tags will have different patterns from one with few bookmarks but large history even if they have same size, operations like VACUUM add a bunch of IO operations that are not really interesting to take into account. How will we handle this discrepancies? I'd not want to add overhead (even if minimal) to each IO operation without a plan to distinguish good measures, since numbers are pretty much pointless if you can't trust them.

Marco Bonardo [:mak]

Comment 9

•

13 years ago

A stupid example, Google toolbar in the past was literally adding hundreds of MBs to places.sqlite to store thumbnails of pages (on mainthread too!), based on these numbers and considering how many GT installs exist in the wild, we'd think places.sqlite sucks more than it really does. Another example is that I commonly run a lot of heavy operations though SQLite Manager add-on while testing, these would be reported as my IO pattern but it really doesn't hurt me.

(dormant account)

Assignee

Comment 10

•

13 years ago

I thought about this. a) we will have extension lists b) I spaced the histogram so one has (1-Whatever read)(CHUNK_SIZE), so we can tell when addons are reading from non-chunked databases. One can split this up further, but I think this will be good enough

Marco Bonardo [:mak]

Comment 11

•

13 years ago

So if I get that correctly, for each single datapoint submitted, I can tell which addons were installed? What would be useful, for example in the places.aqlite file (it should be a good example for its size), is to have a datapoint for IO, being able to discard it if we see it may depend on add-ons (ev. blame the add-on). If we keep it, being able to correlate it to data we'll collect in bug 671042. So "we have this IO pattern with this much history and this many bookmarks on a DB of this size". This seems to be what you are heading toward, right? I just quickly skimmed through the patch, the separate file approach is much better and cleaner. Does this work on quota dbs yet?

(dormant account)

Assignee

Comment 12

•

13 years ago

(In reply to comment #11) > So if I get that correctly, for each single datapoint submitted, I can tell > which addons were installed? That's the plan > What would be useful, for example in the places.aqlite file (it should be a > good example for its size), is to have a datapoint for IO, being able to > discard it if we see it may depend on add-ons (ev. blame the add-on). If we > keep it, being able to correlate it to data we'll collect in bug 671042. So > "we have this IO pattern with this much history and this many bookmarks on a > DB of this size". > This seems to be what you are heading toward, right? yes > > I just quickly skimmed through the patch, the separate file approach is much > better and cleaner. Does this work on quota dbs yet? It should. I added code for it. I'll post a non-broken patch tonight. One thing this patch doesn't do is per-file sqlite io. it may or may not be useful to have MOZ_SQLITE_PLACES, MOZ_SQLITE_URLCLASSIFIER, MOZ_SQLITE_OTHER prefixes. The other thing we should do is report sizes of key sqlite files, but that's a separate bug.

Marco Bonardo [:mak]

Comment 13

•

13 years ago

(In reply to comment #12) > One thing this patch doesn't do is per-file sqlite io. it may or may not be > useful to have MOZ_SQLITE_PLACES, MOZ_SQLITE_URLCLASSIFIER, MOZ_SQLITE_OTHER > prefixes. Well, personally I feel pretty much pointless having a giant pile'o'data without being able to associate it to a single db, what are we supposed to learn from that, apart "it's a lot"/"it's not a lot", without being able to blame each single component for its usage?

(dormant account)

Assignee

Comment 14

•

13 years ago

(In reply to comment #13) > (In reply to comment #12) > > One thing this patch doesn't do is per-file sqlite io. it may or may not be > > useful to have MOZ_SQLITE_PLACES, MOZ_SQLITE_URLCLASSIFIER, MOZ_SQLITE_OTHER > > prefixes. > > Well, personally I feel pretty much pointless having a giant pile'o'data > without being able to associate it to a single db, what are we supposed to > learn from that, apart "it's a lot"/"it's not a lot", without being able to > blame each single component for its usage? Ok.

(dormant account)

Assignee

Comment 15

•

13 years ago

So I thought about this. Bytes read/written should separate out separate dbs. Stuff that times fsyncs might benefit from it too. Timing for read/writes is ok to keep amalgamated.

Marco Bonardo [:mak]

Comment 16

•

13 years ago

it may work, times of read/writes are also heavily dependent on the platform itself.

Andrew Sutherland [:asuth] (he/him)

Comment 17

•

13 years ago

(In reply to comment #15) > Stuff that times fsyncs might benefit from it too. Since fsyncs are correlated with transactions, it definitely seems useful to split out. Specifically, something issuing write statements one-by-one so they don't get batched in a transaction, and thus resulting in N fsyncs, would stick out more obviously. One othere thing that might be useful to parameterize would be to indicate the thread that the writes are happening on. I/O on the main thread is mucho horribleo, I/O on the async threads is not.

(dormant account)

Assignee

Comment 18

•

13 years ago

Attached patch v1 — Details — Splinter Review

This tracks urlclassifier, places, cookies + other databases

Attachment #546041 - Attachment is obsolete: true

Attachment #546679 - Flags: review?(mak77)

Attachment #546041 - Flags: feedback?(mak77)

Marco Bonardo [:mak]

Comment 19

•

13 years ago

Asuth, would you like to go through this, for feedback or even as reviewer? You know SQLite internals (and the vfs) far better than me.

Marco Bonardo [:mak]

Updated

•

13 years ago

Attachment #546679 - Flags: feedback?(bugmail)

Andrew Sutherland [:asuth] (he/him)

Comment 20

•

13 years ago

Comment on attachment 546679 [details] [diff] [review] v1 Looks good to me, and I believe all of mak's questions/desires were addressed, so I'll r=asuth since it doesn't sound like mak urgently desires to also review. As always, thanks for your tireless efforts to speed everything up or make it feasible to speed everything up, taras! nit that doesn't really matter: not all the histogram timestamps have units, but some do. missing: MOZ_SQLITE_OPEN MOZ_SQLITE_TRUNCATE MOZ_SQLITE_*_SYNC

Attachment #546679 - Flags: review?(mak77)

Attachment #546679 - Flags: review+

Attachment #546679 - Flags: feedback?(bugmail)

Marco Bonardo [:mak]

Comment 21

•

13 years ago

the only thing that would still be really useful would be to differentiate per-thread operations, as you suggested in comment 17

(dormant account)

Assignee

Comment 22

•

13 years ago

(In reply to comment #20) > Comment on attachment 546679 [details] [diff] [review] [review] > v1 > > Looks good to me, and I believe all of mak's questions/desires were > addressed, so I'll r=asuth since it doesn't sound like mak urgently desires > to also review. As always, thanks for your tireless efforts to speed > everything up or make it feasible to speed everything up, taras! > > nit that doesn't really matter: not all the histogram timestamps have units, > but some do. missing: MOZ_SQLITE_OPEN MOZ_SQLITE_TRUNCATE MOZ_SQLITE_*_SYNC Yeah I only put in units to differentiate ones that have ms and bytes. (In reply to comment #21) > the only thing that would still be really useful would be to differentiate > per-thread operations, as you suggested in comment 17 Will do in a followup. I didn't want to make this any more complicated for this bug. landed on inbound http://hg.mozilla.org/integration/mozilla-inbound/rev/d34df472649c

Whiteboard: [inbound]

(no longer active)

Comment 23

•

13 years ago

http://hg.mozilla.org/mozilla-central/rev/d34df472649c

Assignee: nobody → tglek

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Whiteboard: [inbound]

Target Milestone: --- → mozilla8

(dormant account)

Assignee

Updated

•

13 years ago

Blocks: 673727

Shawn Wilsher :sdwilsh

Comment 24

•

13 years ago

It would have been nice if the coding style of this patch was 1) consistent and 2) followed storage's style guide... Also, copyright is attributed to Oracle, which isn't right.

Shawn Wilsher :sdwilsh

Updated

•

13 years ago

Depends on: 676064

BMO Automation

Updated

•

5 months ago

Product: Toolkit → Core

begining of an sqlite wrapper 14 years ago (dormant account) 6.82 KB, patch		Details \| Diff \| Splinter Review
basically it 14 years ago (dormant account) 13.58 KB, patch		Details \| Diff \| Splinter Review
how about this? 14 years ago (dormant account) 15.45 KB, patch		Details \| Diff \| Splinter Review
v1 13 years ago (dormant account) 18.42 KB, patch	asuth : review+	Details \| Diff \| Splinter Review