Closed
Bug 691268
Opened 13 years ago
Closed 13 years ago
Use filesystem transactions instead of SQLite transactions to initialize profile
Categories
(Firefox :: General, defect)
Firefox
General
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: Yoric, Assigned: vladan)
References
(Depends on 5 open bugs, Blocks 1 open bug)
Details
(Keywords: perf, Whiteboard: mobilestartupshrink)
Attachments
(2 files)
11.63 KB,
patch
|
Details | Diff | Splinter Review | |
34.60 KB,
patch
|
Details | Diff | Splinter Review |
A considerable part of profile initialization is done with SQLite. Tracing (see bug 611837) indicates that we work with the strictest level of transactions, which causes considerable I/O. This is costly, in particular on Android devices.
We use these transactions to ensure that the profile cannot be corrupted because of accidents during start-up. However, there is another, less costly mechanism, that can be used: file-system based transactions.
Current process:
- create myfile.sqlite;
- adopt pragma synchronous = FULL;
- create tables, populate them, etc.;
- commit; // Lots of I/O
- in case of commit error, fail startup.
Proposed process:
- create myfile.sqlite.tmp
- adopt pragma synchronous = OFF;
- (possibly) adopt pragma temp_store = MEMORY;
- (possibly) adopt pragma journal_mode = OFF;
- create tables, populate them, etc.
- commit; // Little I/O
- in case of commit error, fail startup;
- adopt pragma synchronous = FULL; //No I/O
- rename "myfile.sqlite.tmp" to "myfile.sqlite"; // Trivial I/O
- proceed as usual.
Now, we need to check whether this can work under Windows.
Comment 1•13 years ago
|
||
We should just not create sqlite databases until we add the first data to them.
Reporter | ||
Comment 2•13 years ago
|
||
Mike: there's a chance that this might add a noticeable main-thread pause at runtime, couldn't it?
Comment 3•13 years ago
|
||
are all storage uses on main thread?
Reporter | ||
Comment 4•13 years ago
|
||
I hope not but the truth is that I have no idea. We would definitely need to investigate this before moving to a fully async initialization.
Comment 5•13 years ago
|
||
This is all storage and really has nothing to do with the profile system ;-)
Assignee: dteller → nobody
Component: Profile: BackEnd → Storage
Product: Core → Toolkit
QA Contact: profile-manager-backend → storage
Comment 6•13 years ago
|
||
Storage uses synchronous NORMAL by default, not FULL. Consumers using a WAL journal should also never invoke fsync in all of this process (well, in Places after creating the database we force an fsync on another thread to ensure the structure is correctly saved). Do you have specific Storage consumers in mind? They may just do things the wrong way.
Btw, having some measure of the time and I/O difference between creating a db correctly in normal mode and with your suggestion alternative, would help a lot.
Reporter | ||
Comment 7•13 years ago
|
||
I attach a work-in-progress patch on nsPermissionManager.cpp. It seems to work but I am not confident in the quality of the patch yet: I do not see it performing *any* fsync, which is worrying. I may be missing a pragma to instruct sqlite to flush immediately.
Reporter | ||
Comment 8•13 years ago
|
||
mak: well noted for NORMAL, thanks for the correction.
I have put together a list of fsync during/shortly after startup here: https://bug611837.bugzilla.mozilla.org/attachment.cgi?id=561211 . As I understand these fsyncs, many of them are caused by COMMITs in the default synchronous mode. Does this answer your question?
My initial test, as mentioned in Comment 7, indicates that we are going down from 6 to 0 fsyncs for the nsPermissionManager - which, in itself, is a bit surprising. I will try and write a more elementary benchmark.
Comment 9•13 years ago
|
||
most of these would be resolved by shipping pre-built databases rather than creating them on the fly.
Btw, I don't completely understand why we are measuring these on an empty database, do we care more about the single time a profile is created rather than on the multiple times it is used? Almost all of these are commits used to create the schemas, that ideally run once in the profile life.
Comment 10•13 years ago
|
||
And with that I'm not saying the first experience is not important, I'm saying we should probably first work on removing fsyncs from common starts of an already existing profile, that is what majority of users hit everyday.
Reporter | ||
Comment 11•13 years ago
|
||
Yes, we are investigating both techniques. The advantages of creating the database, however, is that 1/ it doesn't increase the download size; 2/ it doesn't complicate or build/release system even further. If the programmatic solution doesn't succeed, we will fallback to that solution.
And, we are measuring this on an empty database because first start-up is, by definition, the first contact of users with Firefox. Our first start-up is currently way too long on Android and we fear that this may drive users away from Firefox.
That, and the work has already essentially been done on second start-up.
Assignee | ||
Comment 12•13 years ago
|
||
See attachment for a patch that unpacks pre-generated databases from omni.jar when profile is first initialized. The databases contain schemas but no data. It reduces the number of fsyncs done on the very first run from 76 to 10. The omni.jar is only 10-20K bigger as a result.
I didn't include binary databases in the above patch. Also, I'm not really sure where in the tree to put the pre-generated DB directory. Right now they're in browser/locales/generic/databases/.
Assignee: nobody → vdjeric
Comment 13•13 years ago
|
||
What's the actual measured improvement to startup as a result of doing this?
I think I'm ok with doing this in principle, but I'd like to know what the quantitative benefit is, as this does add a bit of maintenance/complexity cost.
Comment 14•13 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #3)
> are all storage uses on main thread?
There is still quite a bit of it.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #5)
> This is all storage and really has nothing to do with the profile system ;-)
Actually, this has nothing to with storage either! It's all about the consumers of it.
Component: Storage → General
Product: Toolkit → Firefox
QA Contact: storage → general
Version: unspecified → Trunk
Comment 15•13 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] from comment #0)
> Proposed process:
> - create myfile.sqlite.tmp
> - adopt pragma synchronous = OFF;
> - (possibly) adopt pragma temp_store = MEMORY;
> - (possibly) adopt pragma journal_mode = OFF;
> - create tables, populate them, etc.
> - commit; // Little I/O
> - in case of commit error, fail startup;
> - adopt pragma synchronous = FULL; //No I/O
> - rename "myfile.sqlite.tmp" to "myfile.sqlite"; // Trivial I/O
> - proceed as usual.
A long time ago, I wanted to modify openDatabase to take a schema and various callbacks so all this could be done asynchronously or managed by storage any way we deemed it to be necessary: http://groups.google.com/group/mozilla.dev.platform/browse_thread/thread/0bfbb3a880998029/8f47d830ba7cbb89?#8f47d830ba7cbb89
Comment 16•13 years ago
|
||
(In reply to Justin Dolske [:Dolske] from comment #13)
> What's the actual measured improvement to startup as a result of doing this?
Is there data on this yet?
Comment 17•13 years ago
|
||
(In reply to David Rajchenbach Teller [:Yoric] from comment #0)
> Proposed process:
> - create myfile.sqlite.tmp
> - adopt pragma synchronous = OFF;
> - (possibly) adopt pragma temp_store = MEMORY;
> - (possibly) adopt pragma journal_mode = OFF;
> - create tables, populate them, etc.
> - commit; // Little I/O
> - in case of commit error, fail startup;
> - adopt pragma synchronous = FULL; //No I/O
> - rename "myfile.sqlite.tmp" to "myfile.sqlite"; // Trivial I/O
> - proceed as usual.
At no point in this process will any of the files be guaranteed to be on disk, because there are no syncs. Is this what is intended?
Also, on some Linux filesystems, you have to do (multiple) fsyncs after a rename, in order for the rename to stick.
I think the pre-created databases are a great idea, but they have the same issue--once we unpack the files, they aren't necessarily on the disk unless/until we do a bunch of fsyncs.
So, I think the main part of this work is ensuring that it is safe to have these empty databases go away during startup. But, if we were to do that, I think that it would also work to just build them in-place (like we do now) with synchronous = OFF, switch syncronous=FULL, and then wait for the first thing to be committed.
Comment 18•13 years ago
|
||
fwiw, not a lot of our DBs use synchronous=FULL, indeed afaict only XPIPRovider's database does.
Places uses WAL with NORMAL that fsyncs everytime the journal reaches 512KB (actually Places forces an off-mainthread fsync after schema creation, and some other that can likely be removed now, I still have to file a bug on that).
Cookies uses WAL with OFF, that means it will never fsync. Permission manager, content-prefs and the disk cache use OFF as well. All of these are completely ignoring data safety, so I assume they contain data that can be rebuilt at any time, otherwise it's a bit crazy to go the OFF path when WAL+NORMAL guarantees similar performances.
Any other DB uses NORMAL and is likely fsyncing more.
Note though, that Vladan found on the first startup the journal change (for example from the default DELETE to WAL) seems to cause an additional fsync, even if the database is empty.
Assignee | ||
Comment 19•13 years ago
|
||
I measured the speedup of unpacking DBs vs initializing them with SQL on a borrowed Galaxy S phone. The gains looked to be minor but there was significant variation between measurements. It will be hard to properly evaluate the benefits of this change without real-world telemetry so I am marking this as dependent on enabling Telemetry in Nightly by default (bug 699806)
Depends on: 699806
Comment 20•13 years ago
|
||
Note: the Galaxy S has probably one of the worst performing file systems of any android phone out there.
Comment 21•13 years ago
|
||
I'm going to propose we WONTFIX this. This is a pretty invasive change without any clear data to support doing it. If we get telemetry for dbinits that shows this is a cost worth paying, then we can reconsider.
Comment 22•13 years ago
|
||
WONTFIXing until we have some data that says the benefit of doing this is worth the cost. There's a bit of confusing overlap with the discussion in bug 699615, imma add a comment there too.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•