Last Comment Bug 679852 - (leveldb) integrate leveldb
(leveldb)
: integrate leveldb
Status: RESOLVED WONTFIX
:
Product: Core
Classification: Components
Component: General (show other bugs)
: Trunk
: All All
: -- normal with 3 votes (vote)
: ---
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
http://code.google.com/p/leveldb/
Depends on: 44959 701522
Blocks: 580099 663979 679267 679883 687566
  Show dependency treegraph
 
Reported: 2011-08-17 13:26 PDT by Josh Aas
Modified: 2014-09-03 05:02 PDT (History)
50 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Code import, V1 (740.08 KB, patch)
2011-10-19 14:04 PDT, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
Makefiles, V0.5 (4.78 KB, patch)
2011-10-19 14:08 PDT, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
Lock primitives V0.5 (2.90 KB, patch)
2011-10-19 14:38 PDT, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
Checkpoint: Basic API on Linux|Mac with test (24.82 KB, patch)
2011-11-01 12:22 PDT, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
V1, DB interface. Start here. (4.03 KB, text/plain)
2011-11-11 18:25 PST, Matthew McPherrin [:mcpherrin]
no flags Details
V1, Options objects IDL file. (2.20 KB, text/plain)
2011-11-11 18:27 PST, Matthew McPherrin [:mcpherrin]
no flags Details
V1, Status object. Returned by many DB methods. (570 bytes, text/plain)
2011-11-11 18:28 PST, Matthew McPherrin [:mcpherrin]
no flags Details
V1, Iterator over database. (1.92 KB, text/plain)
2011-11-11 18:29 PST, Matthew McPherrin [:mcpherrin]
no flags Details
V1, Batch object. Group writes together. (1.03 KB, text/plain)
2011-11-11 18:30 PST, Matthew McPherrin [:mcpherrin]
no flags Details
Code Import, V1.1 (738.44 KB, patch)
2011-11-16 01:10 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
Makefiles, V1.0 (6.40 KB, patch)
2011-11-16 01:11 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
Part of the port. V0.7 (2.97 KB, patch)
2011-11-16 01:13 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
XPCOM interface, V1 (33.37 KB, patch)
2011-11-16 01:16 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
xpcshell tests, V0.7 (8.17 KB, patch)
2011-11-16 01:18 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
XPCOM interface, V1.0.1 (fix batches) (33.27 KB, patch)
2011-11-16 15:50 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
xpcshell tests, V1.0.1 (8.46 KB, patch)
2011-11-16 15:51 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
Code import, V2.0 (747.14 KB, patch)
2012-01-12 20:27 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
port/env v2.0 (21.45 KB, patch)
2012-01-12 20:28 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
XPCOM interface, V2.0 (35.20 KB, patch)
2012-01-12 20:31 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
Makefiles, V2.0 (7.88 KB, patch)
2012-01-12 20:32 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review
tests, v2.0 (10.85 KB, patch)
2012-01-12 20:33 PST, Matthew McPherrin [:mcpherrin]
no flags Details | Diff | Splinter Review

Description Josh Aas 2011-08-17 13:26:03 PDT
leveldb keeps coming up as a possibly better choice for upcoming projects than sqlite. We should investigate integrating it and do so if it makes sense so that we can start making use of it.
Comment 1 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-08-18 13:36:01 PDT
As far as I can tell leveldb doesn't even build on Windows.
Comment 2 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-08-18 13:49:40 PDT
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #1)
> As far as I can tell leveldb doesn't even build on Windows.

I think we'd do something like chromium does, see https://src.chromium.org/viewvc/chrome/trunk/src/third_party/leveldatabase/
Comment 3 Matthew McPherrin [:mcpherrin] 2011-08-30 10:45:40 PDT
I'm starting work on this.

I agree with Ben Turner about doing what chromium does: Implement an environment, port, and integrate it into the build system.
Comment 4 Jason Duell [:jduell] (needinfo me) 2011-08-31 17:17:10 PDT
Do we have any story on how to evict entries when we use levelDB?  I.e for the HTTP cache we want the ability to do least-recently-used (LRU) or ideally a plug-in-able algorithm.  Does leveldb give us any efficient way to get LRU data?

This is an issue for DNS prefetching too, and probably many other potential use cases.

Given the docs here

   http://leveldb.googlecode.com/svn/trunk/doc/index.html

the only algorithm that comes to mind is to 

 1) store a "last use" timestamp in the data, and 

 2) have a background thread do occasional scans of all key/value pairs and evict those that are older than a certain date (or which have expired).

That's a pretty blunt hammer: I'm not sure how we could do things like "evict 10 MB", or prioritize evicting larger objects.
Comment 5 Dietrich Ayala (:dietrich) 2011-10-18 13:12:01 PDT
No update for quite a while. Josh or Matt, are you actually pursuing this?

If so, what is the ETA? We've got a number of potential uses for a smaller hammer/footgun than SQLite both in desktop and mobile Firefox.

We also might want to have Taras and the perf team do an analysis of how LevelDB could work in the context of one of our potential use-cases. It would be useful to have a comparative analysis of memory usage, IO patterns, binary size (both lib and data) between this and SQLite.
Comment 6 Dietrich Ayala (:dietrich) 2011-10-18 13:13:45 PDT
(In reply to Jason Duell (:jduell) from comment #4)
> I'm not sure how we could do things like
> "evict 10 MB", or prioritize evicting larger objects.

GetApproximateSizes looks like it can give bytes-used for specific key ranges, which sounds like it might do what you want.
Comment 7 Matthew McPherrin [:mcpherrin] 2011-10-18 13:28:48 PDT
(In reply to Dietrich Ayala (:dietrich) from comment #5)
> No update for quite a while. Josh or Matt, are you actually pursuing this?

Yes, I am actively working on this.  Initial patches to be posted soon.

Sorry for any slowness, but I'm new to both of these codebases.
Comment 8 Dietrich Ayala (:dietrich) 2011-10-18 13:35:30 PDT
That's great news Matthew! When possible, please do put status updates and work-in-progress patches here, so that interested parties can help and follow along :)
Comment 9 Matthew McPherrin [:mcpherrin] 2011-10-19 14:04:27 PDT
Created attachment 568183 [details] [diff] [review]
Code import, V1

First in a series of in-progress patches:

Import LevelDB code into source tree.
Comment 10 Matthew McPherrin [:mcpherrin] 2011-10-19 14:08:16 PDT
Created attachment 568187 [details] [diff] [review]
Makefiles, V0.5

In progress: Makefiles to build leveldb as part of Firefox build.
Comment 11 Matthew McPherrin [:mcpherrin] 2011-10-19 14:38:53 PDT
Created attachment 568213 [details] [diff] [review]
Lock primitives V0.5

Some lock primitives that LevelDB requires.
Comment 12 Matthew McPherrin [:mcpherrin] 2011-11-01 12:22:34 PDT
Created attachment 571100 [details] [diff] [review]
Checkpoint: Basic API on Linux|Mac with test

Progress checkpoint.
One patch for working additions (apply code import first).

Individual patches will be posted later today.
Comment 13 Honza Bambas (:mayhemer) 2011-11-02 11:07:39 PDT
Comment on attachment 571100 [details] [diff] [review]
Checkpoint: Basic API on Linux|Mac with test

Review of attachment 571100 [details] [diff] [review]:
-----------------------------------------------------------------

::: db/leveldb/public/IMozLevelDB.idl
@@ +14,5 @@
> +  ldbStatus Get(in ldbReadOptions options, in string key, out ACString value);
> +  ldbStatus Put(in ldbWriteOptions options, in string key, in string value);
> +  ldbStatus Write(in ldbWriteOptions options, in ldbWriteBatch updates);
> +  ldbStatus Delete(in ldbWriteOptions options, in string key);
> +};

In general, please add doxygen-like comments to all method and also their arguments.  Even when these interfaces are so young.


How can I change the default options?  As I have seen in the test bellow in this patch, passing null as options means to use the default options, right?

How is the API for range-requests/gets gonna look?  That is what I'm most interested in.

And thanks for these patches!
Comment 14 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-11-09 02:47:48 PST
Is all the I/O happening in a non-main-thread? And does the API and implementation make sure that the 
main-thread can't be blocked?
Comment 15 Matthew McPherrin [:mcpherrin] 2011-11-09 11:31:47 PST
(In reply to Olli Pettay [:smaug] from comment #14)
> Is all the I/O happening in a non-main-thread? And does the API and
> implementation make sure that the 
> main-thread can't be blocked?

I/O happens on background threads, however, the API can block on locks while waiting for the background thread to catch up if buffers fill.

The API is intended to be used synchronously.  If there's a strong desire for an async API, I can make something happen.  I envisioned the async behaviour to take place at the level of services calling levelDB.

A documented, feature-complete API will be posted within 24 hours for review, btw.
Comment 16 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-11-09 11:36:47 PST
We must stop using blocking calls in the main thread, and because of that, I think it would be
better to not even provide synchronous/blocking API in the main thread.

Or is there some reason why the API needs to by synchronous?
Comment 17 Honza Bambas (:mayhemer) 2011-11-09 11:47:06 PST
I think when LevelDB already does things in the background we could try using the sync APIs first.  Advantage is easy of use, mainly when using e10s (e.g. localStorage.getItem).  

Question is how fast reading will be.  If our async API could be build on the natural "threadiness" of LevelDB then it would be much better then inventing another layer of ours over it - i.e. no new threads or whatever.  I don't remember whether LevelDB has an API we could use for that.
Comment 18 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-11-09 11:49:30 PST
Could we add some assertions or aborts if the sync API is used in the main thread?
Comment 19 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-11-09 11:56:46 PST
For E10s I would assume chrome would run leveldb in a background thread and
content process would communicate with that. We must not block chrome process' main thread.
Blocking content process' main thread is kind ok, and, well, required by localStorage.
Comment 20 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-11-09 12:02:40 PST
IMO I think we should just integrate leveldb now with it's normal sync API. We need this for indexedDB, for instance. Then, we should just strictly enforce (via hard abort) that the sync API is never used on the main thread in the chrome process. We can build a generic async API later. This bug should be about just getting it building in our tree, right?
Comment 21 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-11-09 12:05:34 PST
(In reply to ben turner [:bent] from comment #20)
> IMO I think we should just integrate leveldb now with it's normal sync API.
> We need this for indexedDB, for instance. Then, we should just strictly
> enforce (via hard abort) that the sync API is never used on the main thread
> in the chrome process. We can build a generic async API later. This bug
> should be about just getting it building in our tree, right?

That sounds ok if "Then, we should just strictly
enforce (via hard abort) that the sync API is never used on the main thread
in the chrome process" is changed to "When landing the patch, we should just strictly
enforce (via hard abort) that the sync API is never used on the main thread
in the chrome process"
Comment 22 Shawn Wilsher :sdwilsh 2011-11-09 12:33:02 PST
(In reply to Olli Pettay [:smaug] from comment #21)
> That sounds ok if "Then, we should just strictly
> enforce (via hard abort) that the sync API is never used on the main thread
> in the chrome process" is changed to "When landing the patch, we should just
> strictly
> enforce (via hard abort) that the sync API is never used on the main thread
> in the chrome process"
This.
Comment 23 Matthew McPherrin [:mcpherrin] 2011-11-09 12:49:06 PST
(In reply to Olli Pettay [:smaug] from comment #21)
> (In reply to ben turner [:bent] from comment #20)
> > IMO I think we should just integrate leveldb now with it's normal sync API.
> > We need this for indexedDB, for instance. Then, we should just strictly
> > enforce (via hard abort) that the sync API is never used on the main thread
> > in the chrome process. We can build a generic async API later. This bug
> > should be about just getting it building in our tree, right?
> 
> That sounds ok if "Then, we should just strictly
> enforce (via hard abort) that the sync API is never used on the main thread
> in the chrome process" is changed to "When landing the patch, we should just
> strictly
> enforce (via hard abort) that the sync API is never used on the main thread
> in the chrome process"

This sounds reasonable. I'll do this if nobody has any strong objections.
Comment 24 Matthew McPherrin [:mcpherrin] 2011-11-11 18:24:25 PST
I'm attaching the .idl files that describe LevelDB's public xpcom interface.

Please provide commentary on the API.  In particular, I'm looking for feedback from people who have uses in mind, and how well it fits with their use cases.
Comment 25 Matthew McPherrin [:mcpherrin] 2011-11-11 18:25:07 PST
Created attachment 573978 [details]
V1, DB interface.  Start here.
Comment 26 Matthew McPherrin [:mcpherrin] 2011-11-11 18:27:03 PST
Created attachment 573979 [details]
V1, Options objects IDL file.
Comment 27 Matthew McPherrin [:mcpherrin] 2011-11-11 18:28:09 PST
Created attachment 573980 [details]
V1, Status object. Returned by many DB methods.
Comment 28 Matthew McPherrin [:mcpherrin] 2011-11-11 18:29:25 PST
Created attachment 573981 [details]
V1, Iterator over database.

This is the second most important file. The interface for iterating over DB keys/values.
Comment 29 Matthew McPherrin [:mcpherrin] 2011-11-11 18:30:17 PST
Created attachment 573982 [details]
V1, Batch object. Group writes together.
Comment 30 Matthew McPherrin [:mcpherrin] 2011-11-16 01:10:27 PST
Created attachment 574839 [details] [diff] [review]
Code Import, V1.1
Comment 31 Matthew McPherrin [:mcpherrin] 2011-11-16 01:11:38 PST
Created attachment 574840 [details] [diff] [review]
Makefiles, V1.0
Comment 32 Matthew McPherrin [:mcpherrin] 2011-11-16 01:13:54 PST
Created attachment 574841 [details] [diff] [review]
Part of the port.  V0.7

Porting is incomplete. This is the "port" portion (locks, primitives), "Env" portion (nsIthread, file I/O) is still in progress and will be posted within a week.  Once we have both, then this should run anywhere.
Comment 33 Matthew McPherrin [:mcpherrin] 2011-11-16 01:16:17 PST
Created attachment 574842 [details] [diff] [review]
XPCOM interface, V1

XPCOM interface for LevelDB.  Documented and working.  Please comment.

(N.B. Writing batches is failing currently; will determine why ASAP).
Comment 34 Matthew McPherrin [:mcpherrin] 2011-11-16 01:18:41 PST
Created attachment 574844 [details] [diff] [review]
xpcshell tests, V0.7

Test of the majority of the interesting APIs.  Consider these as "examples", but they may be a little tiny bit crufty.
Comment 35 Matthew McPherrin [:mcpherrin] 2011-11-16 15:50:24 PST
Created attachment 575024 [details] [diff] [review]
XPCOM interface, V1.0.1  (fix batches)
Comment 36 Matthew McPherrin [:mcpherrin] 2011-11-16 15:51:28 PST
Created attachment 575025 [details] [diff] [review]
xpcshell tests, V1.0.1

fix batches and test them
Comment 37 Boris Zbarsky [:bz] 2011-11-17 13:37:07 PST
Some general comments on the API:

1)  Would prefixing the interface names with "ns" be a good idea?
2)  The SeekToFirst/SeekToLast comments call them SeekStart/SeekEnd.
3)  Does getNext return false when you reach the end?  If so, it should document that
    clearly.
4)  We typically use camelCase, not underscores (so create_if_missing may be better as
    createIfMissing; similar for error_if_exists).
5)  Are the various unWrap methods actually planned to be used?  If not, should we still
    have them?
6)  Should these interfaces be marked as only being implementable via C++?  Seems like it,
    given all the noscript stuff on them...
7)  The [notxpcom] parts should probably also be [nostdcall].
8)  Should ILdbStatus be using readonly attributes?
9)  Method names should generally start with a lowercase letter.  This is all over the
    interfaces.
10) Nit on the C++: add a space between "if" and '('.
Comment 38 Marco Bonardo [::mak] 2011-11-23 11:55:35 PST
(In reply to Boris Zbarsky (:bz) from comment #37)
> 1)  Would prefixing the interface names with "ns" be a good idea?

or use moz, so mozILevelDB instead of IMozLevelDB, though someone still prefers sticking to the old ns prefix.(In reply to ben turner [:bent] from comment #20)

> IMO I think we should just integrate leveldb now with it's normal sync API.
> We can build a generic async API later. This bug
> should be about just getting it building in our tree, right?

While I agree we may need both kind of APIs, this is exactly what happened with Storage, and now we have issues to solve. We should try to learn from errors.
An abort is a decent solution but does not block addons from abusing the API on mainthread. Since the issue is mostly thread-locking and contention, would it be possible, when working on mainthread, to enforce a sort of exclusive lock, so that it's impossible for any other thread or process to even try access the same "database"? This way we'd at least have protection against contention on mainthread.
Unless the idea was to use a RUNTIME_ABORT.
Comment 39 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-11-23 11:57:48 PST
We should totally use release aborts to prevent addons from doing synchronous main thread i/o.  We have a quick way to abort if off the main thread in a release build already that's used for the cycle collector.  Just flip the polarity and we're good to go.
Comment 40 Marco Bonardo [::mak] 2011-11-23 12:09:31 PST
If we completely disallow mainthread, can we still use levelDB for localStorage API (that is a synchronous API)?
Comment 41 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-11-23 13:50:35 PST
(In reply to Marco Bonardo [:mak] from comment #38)
> An abort is a decent solution but does not block addons from abusing the API
> on mainthread.

I mean an abort even for release builds. No funny business from extensions.
Comment 42 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-11-23 13:51:29 PST
(In reply to Marco Bonardo [:mak] from comment #40)
> If we completely disallow mainthread, can we still use levelDB for
> localStorage API (that is a synchronous API)?

Do we care? I thought we were done trying to improve that API and were instead funneling people towards indexedDB.
Comment 43 Marco Bonardo [::mak] 2011-11-23 14:06:04 PST
We care because it's one of the current mainthread offenders (bug 627635), and the plan to reduce that pain, was to move to levelDB (bug 687566). If we won't be able to do that, a different plan should be made.
Actually, if the web standards are moving far away from localStorage, and it's going to be deprecated soon for indexedDB, we may even just survive with some bandaid SQLite improvement (bug 704933 is a start).
Comment 44 Shawn Wilsher :sdwilsh 2011-11-24 10:25:23 PST
Please, let's not repeat past mistakes.  No main thread I/O.

(In reply to Marco Bonardo [:mak] from comment #43)
> We care because it's one of the current mainthread offenders (bug 627635),
> and the plan to reduce that pain, was to move to levelDB (bug 687566). If we
> won't be able to do that, a different plan should be made.
localStorage will just have to get a different engineered solution.  We can't give developers APIs that allow synchronous disk I/O on the GUI thread because they will use those instead (as we've seen).
Comment 45 (dormant account) 2011-11-28 15:12:33 PST
(In reply to Shawn Wilsher :sdwilsh from comment #44)
> (In reply to Marco Bonardo [:mak] from comment #43)
> > We care because it's one of the current mainthread offenders (bug 627635),
> > and the plan to reduce that pain, was to move to levelDB (bug 687566). If we
> > won't be able to do that, a different plan should be made.
> localStorage will just have to get a different engineered solution.  We
> can't give developers APIs that allow synchronous disk I/O on the GUI thread
> because they will use those instead (as we've seen).

We give browser developers(us) private C++ apis. We can't block a useful backend api on "people might use it wrong" grounds.
Comment 46 Shawn Wilsher :sdwilsh 2011-11-29 20:18:52 PST
(In reply to Taras Glek (:taras) from comment #45)
> We give browser developers(us) private C++ apis. We can't block a useful
> backend api on "people might use it wrong" grounds.
We are already the ones who use it wrong.  Footgun APIs are going to get misused, that's why we shouldn't make them (and that's why you've got a super reviewer saying the same thing in comment 21).
Comment 47 Matthew McPherrin [:mcpherrin] 2012-01-12 20:27:28 PST
Created attachment 588293 [details] [diff] [review]
Code import, V2.0
Comment 48 Matthew McPherrin [:mcpherrin] 2012-01-12 20:28:36 PST
Created attachment 588294 [details] [diff] [review]
port/env v2.0
Comment 49 Matthew McPherrin [:mcpherrin] 2012-01-12 20:31:05 PST
Created attachment 588295 [details] [diff] [review]
XPCOM interface, V2.0
Comment 50 Matthew McPherrin [:mcpherrin] 2012-01-12 20:32:08 PST
Created attachment 588296 [details] [diff] [review]
Makefiles, V2.0
Comment 51 Matthew McPherrin [:mcpherrin] 2012-01-12 20:33:38 PST
Created attachment 588298 [details] [diff] [review]
tests, v2.0
Comment 52 Ksec 2012-01-21 03:48:53 PST
Something make be worth looking into? ( May be too late in the cycle )

https://github.com/shuttler/nessDB
Comment 53 Shawn Wilsher :sdwilsh 2012-01-22 11:26:28 PST
So I guess we are sticking with the synchronous API, eh?
Comment 54 Richard Newman [:rnewman] 2012-01-22 12:03:17 PST
"Asynchronous API later" is like a "California maybe". Synchronous APIs snowball in engineering cost. My fro will twitch just as soon as you're done fsyncing. 

Without reading this whole thread, I'll just add that Sync runs on the chrome thread, accessing storage APIs from js, and will block the hell out of your ui if there is no async option available. 

I am happy to provide a Sync perspective on blocking calls.
Comment 55 Josh Aas 2012-01-22 13:24:08 PST
(In reply to Shawn Wilsher :sdwilsh from comment #53)
> So I guess we are sticking with the synchronous API, eh?

I don't think this decision has been finalized. Development is paused while we are looking for a new owner since Matthew's internship is done.
Comment 56 Josh Aas 2012-07-04 14:33:49 PDT
Due to success in optimizing sqlite usage we've decided to stop work on leveldb integration. Marking WONTFIX, we can re-open if we change course in the future.

If we do re-open I strongly recommend that we do not provide any synchronous APIs. I was quiet on the subject before but in the mean time I've seen more of what such footguns can do and I'm strongly against them. If we are going to provide a synchronous API that can do disk i/o it needs to have code enforcing non-main-thread-only usage, failing if called on the main thread.

Note You need to log in before you can comment on or make changes to this bug.