691268 - Use filesystem transactions instead of SQLite transactions to initialize profile

David Teller [:Yoric] - still alive but not very active

Reporter

Description

•

14 years ago

A considerable part of profile initialization is done with SQLite. Tracing (see bug 611837) indicates that we work with the strictest level of transactions, which causes considerable I/O. This is costly, in particular on Android devices. We use these transactions to ensure that the profile cannot be corrupted because of accidents during start-up. However, there is another, less costly mechanism, that can be used: file-system based transactions. Current process: - create myfile.sqlite; - adopt pragma synchronous = FULL; - create tables, populate them, etc.; - commit; // Lots of I/O - in case of commit error, fail startup. Proposed process: - create myfile.sqlite.tmp - adopt pragma synchronous = OFF; - (possibly) adopt pragma temp_store = MEMORY; - (possibly) adopt pragma journal_mode = OFF; - create tables, populate them, etc. - commit; // Little I/O - in case of commit error, fail startup; - adopt pragma synchronous = FULL; //No I/O - rename "myfile.sqlite.tmp" to "myfile.sqlite"; // Trivial I/O - proceed as usual. Now, we need to check whether this can work under Windows.

Mike Hommey [:glandium]

Comment 1

•

14 years ago

We should just not create sqlite databases until we add the first data to them.

David Teller [:Yoric] - still alive but not very active

Reporter

Comment 2

•

14 years ago

Mike: there's a chance that this might add a noticeable main-thread pause at runtime, couldn't it?

Mike Hommey [:glandium]

Comment 3

•

14 years ago

are all storage uses on main thread?

David Teller [:Yoric] - still alive but not very active

Reporter

Comment 4

•

14 years ago

I hope not but the truth is that I have no idea. We would definitely need to investigate this before moving to a fully async initialization.

Benjamin Smedberg

Comment 5

•

14 years ago

This is all storage and really has nothing to do with the profile system ;-)

Assignee: dteller → nobody

Component: Profile: BackEnd → Storage

Product: Core → Toolkit

QA Contact: profile-manager-backend → storage

Marco Bonardo [:mak]

Comment 6

•

14 years ago

Storage uses synchronous NORMAL by default, not FULL. Consumers using a WAL journal should also never invoke fsync in all of this process (well, in Places after creating the database we force an fsync on another thread to ensure the structure is correctly saved). Do you have specific Storage consumers in mind? They may just do things the wrong way. Btw, having some measure of the time and I/O difference between creating a db correctly in normal mode and with your suggestion alternative, would help a lot.

David Teller [:Yoric] - still alive but not very active

Reporter

Comment 7

•

14 years ago

Attached patch Seems to work. A bit too simple to be true. — Details — Splinter Review

I attach a work-in-progress patch on nsPermissionManager.cpp. It seems to work but I am not confident in the quality of the patch yet: I do not see it performing *any* fsync, which is worrying. I may be missing a pragma to instruct sqlite to flush immediately.

David Teller [:Yoric] - still alive but not very active

Reporter

Updated

•

14 years ago

Depends on: 692408

David Teller [:Yoric] - still alive but not very active

Reporter

Comment 8

•

14 years ago

mak: well noted for NORMAL, thanks for the correction. I have put together a list of fsync during/shortly after startup here: https://bug611837.bugzilla.mozilla.org/attachment.cgi?id=561211 . As I understand these fsyncs, many of them are caused by COMMITs in the default synchronous mode. Does this answer your question? My initial test, as mentioned in Comment 7, indicates that we are going down from 6 to 0 fsyncs for the nsPermissionManager - which, in itself, is a bit surprising. I will try and write a more elementary benchmark.

Marco Bonardo [:mak]

Comment 9

•

14 years ago

most of these would be resolved by shipping pre-built databases rather than creating them on the fly. Btw, I don't completely understand why we are measuring these on an empty database, do we care more about the single time a profile is created rather than on the multiple times it is used? Almost all of these are commits used to create the schemas, that ideally run once in the profile life.

Marco Bonardo [:mak]

Comment 10

•

14 years ago

And with that I'm not saying the first experience is not important, I'm saying we should probably first work on removing fsyncs from common starts of an already existing profile, that is what majority of users hit everyday.

David Teller [:Yoric] - still alive but not very active

Reporter

Comment 11

•

14 years ago

Yes, we are investigating both techniques. The advantages of creating the database, however, is that 1/ it doesn't increase the download size; 2/ it doesn't complicate or build/release system even further. If the programmatic solution doesn't succeed, we will fallback to that solution. And, we are measuring this on an empty database because first start-up is, by definition, the first contact of users with Firefox. Our first start-up is currently way too long on Android and we fear that this may drive users away from Firefox. That, and the work has already essentially been done on second start-up.

Vladan Djeric (:vladan)

Assignee

Comment 12

•

14 years ago

Attached patch Unpack DBs from omni.jar when profile is initialized — Details — Splinter Review

See attachment for a patch that unpacks pre-generated databases from omni.jar when profile is first initialized. The databases contain schemas but no data. It reduces the number of fsyncs done on the very first run from 76 to 10. The omni.jar is only 10-20K bigger as a result. I didn't include binary databases in the above patch. Also, I'm not really sure where in the tree to put the pre-generated DB directory. Right now they're in browser/locales/generic/databases/.

Assignee: nobody → vdjeric

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699616

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699613

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699614

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699615

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699617

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699618

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699619

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699620

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699621

Vladan Djeric (:vladan)

Assignee

Updated

•

14 years ago

Depends on: 699622

Justin Dolske [:Dolske]

Comment 13

•

14 years ago

What's the actual measured improvement to startup as a result of doing this? I think I'm ok with doing this in principle, but I'd like to know what the quantitative benefit is, as this does add a bit of maintenance/complexity cost.

Shawn Wilsher :sdwilsh

Comment 14

•

14 years ago

(In reply to Mike Hommey [:glandium] from comment #3) > are all storage uses on main thread? There is still quite a bit of it. (In reply to Benjamin Smedberg [:bsmedberg] from comment #5) > This is all storage and really has nothing to do with the profile system ;-) Actually, this has nothing to with storage either! It's all about the consumers of it.

Component: Storage → General

Product: Toolkit → Firefox

QA Contact: storage → general

Version: unspecified → Trunk

Shawn Wilsher :sdwilsh

Comment 15

•

14 years ago

(In reply to David Rajchenbach Teller [:Yoric] from comment #0) > Proposed process: > - create myfile.sqlite.tmp > - adopt pragma synchronous = OFF; > - (possibly) adopt pragma temp_store = MEMORY; > - (possibly) adopt pragma journal_mode = OFF; > - create tables, populate them, etc. > - commit; // Little I/O > - in case of commit error, fail startup; > - adopt pragma synchronous = FULL; //No I/O > - rename "myfile.sqlite.tmp" to "myfile.sqlite"; // Trivial I/O > - proceed as usual. A long time ago, I wanted to modify openDatabase to take a schema and various callbacks so all this could be done asynchronously or managed by storage any way we deemed it to be necessary: http://groups.google.com/group/mozilla.dev.platform/browse_thread/thread/0bfbb3a880998029/8f47d830ba7cbb89?#8f47d830ba7cbb89

Justin Dolske [:Dolske]

Comment 16

•

14 years ago

(In reply to Justin Dolske [:Dolske] from comment #13) > What's the actual measured improvement to startup as a result of doing this? Is there data on this yet?

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Comment 17

•

14 years ago

(In reply to David Rajchenbach Teller [:Yoric] from comment #0) > Proposed process: > - create myfile.sqlite.tmp > - adopt pragma synchronous = OFF; > - (possibly) adopt pragma temp_store = MEMORY; > - (possibly) adopt pragma journal_mode = OFF; > - create tables, populate them, etc. > - commit; // Little I/O > - in case of commit error, fail startup; > - adopt pragma synchronous = FULL; //No I/O > - rename "myfile.sqlite.tmp" to "myfile.sqlite"; // Trivial I/O > - proceed as usual. At no point in this process will any of the files be guaranteed to be on disk, because there are no syncs. Is this what is intended? Also, on some Linux filesystems, you have to do (multiple) fsyncs after a rename, in order for the rename to stick. I think the pre-created databases are a great idea, but they have the same issue--once we unpack the files, they aren't necessarily on the disk unless/until we do a bunch of fsyncs. So, I think the main part of this work is ensuring that it is safe to have these empty databases go away during startup. But, if we were to do that, I think that it would also work to just build them in-place (like we do now) with synchronous = OFF, switch syncronous=FULL, and then wait for the first thing to be committed.

Marco Bonardo [:mak]

Comment 18

•

14 years ago

fwiw, not a lot of our DBs use synchronous=FULL, indeed afaict only XPIPRovider's database does. Places uses WAL with NORMAL that fsyncs everytime the journal reaches 512KB (actually Places forces an off-mainthread fsync after schema creation, and some other that can likely be removed now, I still have to file a bug on that). Cookies uses WAL with OFF, that means it will never fsync. Permission manager, content-prefs and the disk cache use OFF as well. All of these are completely ignoring data safety, so I assume they contain data that can be rebuilt at any time, otherwise it's a bit crazy to go the OFF path when WAL+NORMAL guarantees similar performances. Any other DB uses NORMAL and is likely fsyncing more. Note though, that Vladan found on the first startup the journal change (for example from the default DELETE to WAL) seems to cause an additional fsync, even if the database is empty.

Vladan Djeric (:vladan)

Assignee

Comment 19

•

14 years ago

I measured the speedup of unpacking DBs vs initializing them with SQL on a borrowed Galaxy S phone. The gains looked to be minor but there was significant variation between measurements. It will be hard to properly evaluate the benefits of this change without real-world telemetry so I am marking this as dependent on enabling Telemetry in Nightly by default (bug 699806)

Depends on: 699806

Shawn Wilsher :sdwilsh

Comment 20

•

14 years ago

Note: the Galaxy S has probably one of the worst performing file systems of any android phone out there.

Justin Dolske [:Dolske]

Comment 21

•

14 years ago

I'm going to propose we WONTFIX this. This is a pretty invasive change without any clear data to support doing it. If we get telemetry for dbinits that shows this is a cost worth paying, then we can reconsider.

Justin Dolske [:Dolske]

Comment 22

•

14 years ago

WONTFIXing until we have some data that says the benefit of doing this is worth the cost. There's a bit of confusing overlap with the discussion in bug 699615, imma add a comment there too.

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → WONTFIX

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

13 years ago

No longer depends on: 699806

Seems to work. A bit too simple to be true. 14 years ago David Teller [:Yoric] - still alive but not very active 11.63 KB, patch		Details \| Diff \| Splinter Review
Unpack DBs from omni.jar when profile is initialized 14 years ago Vladan Djeric (:vladan) 34.60 KB, patch		Details \| Diff \| Splinter Review