Closed Bug 1131776 Opened 5 years ago Closed 5 years ago

Use WITHOUT ROWID tables for IndexedDB

Categories

(Core :: Storage: IndexedDB, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed

People

(Reporter: bent.mozilla, Assigned: bent.mozilla)

References

Details

Attachments

(6 files, 4 obsolete files)

6.48 KB, patch
Details | Diff | Splinter Review
12.58 KB, patch
Details | Diff | Splinter Review
30.66 KB, patch
janv
: review+
Details | Diff | Splinter Review
17.80 KB, patch
Details | Diff | Splinter Review
228.37 KB, patch
janv
: review+
Details | Diff | Splinter Review
1.79 KB, patch
janv
: review+
Details | Diff | Splinter Review
This lets us save a ton of space and time.

Should block bug 866846 so that we only need one schema change/VACUUM to go forward.
Attached patch Patch, v1 (obsolete) — Splinter Review
Ok, here's the new schema, and the upgrade code. This cuts our database size by about 30% on some of my tests! Reduced size also means reduced I/O times.

Jonas wanted me to upgrade our keys since we're doing a vacuum anyway, so sicking will you please look at the UpgradeKeyFunction::CopyAndUpgradeKeyBufferInternal() function? And I'd love for you to look at the WriteCompressedNumber()/ReadCompressedNumber() functions too (those are the ones that store the index keys).

Jan, the rest is all for you!

This is the SQLite doc on WITHOUT ROWID: http://www.sqlite.org/withoutrowid.html

A few notes:

1. In the near future I'm going to start doing some vacuum/analyze work, so I left space in the db metadata for that so we don't have to do another upgrade.
2. I turned on the incremental vacuum stuff because that has crazy requirements around when it can be turned on. We still won't actually vacuum until we explicitly ask SQLite to do it, so that will be for a future bug.
3. The upgrade is really complicated, and it basically requires a rewrite of every table. The big new addition is the |object_data.index_data_values| column. It stores a compressed representation of all the index keys that came from the |object_data| row. We need this information to make delete/update fast.
4. Once this upgrade is complete we won't have any implicit or explicit indexes in SQLite!
Attachment #8569405 - Flags: review?(jonas)
Attachment #8569405 - Flags: review?(Jan.Varga)
Oh, and we turn off FOREIGN KEY support for opt builds since we don't need the cascading deletes any more.
Comment on attachment 8569405 [details] [diff] [review]
Patch, v1

Review of attachment 8569405 [details] [diff] [review]:
-----------------------------------------------------------------

Get rid of the distinction between the 'Max' and the 'Array' key tag types. The code doesn't work if they are different anyway.

::: dom/indexedDB/ActorsParent.cpp
@@ +605,5 @@
> +  MOZ_ASSERT(aNumber);
> +
> +  // All bytes have 7 bits available.
> +  uint32_t count = 1;
> +  while ((aNumber = aNumber >> 7)) {

aNumber =>> 7

@@ +613,5 @@
> +  return count;
> +}
> +
> +uint32_t
> +CompressedByteCountForIndexId(int64_t aIndexId, bool aUnique)

aUnique is unused.

@@ +636,5 @@
> +  const uint64_t originalNumber = aNumber;
> +#endif
> +
> +  while (true) {
> +    if (uint64_t shiftedNumber = (aNumber >> 7)) {

Nit: IMO

uint64_t shiftedNumber = aNumber >> 7;
if (shiftedNumber) {
  ...


would be more readable.

@@ +651,5 @@
> +               CompressedByteCountForNumber(originalNumber));
> +}
> +
> +uint64_t
> +ReadCompressedNumber(const uint8_t** aIterator, const uint8_t* aEnd)

This code is actually hard to read due to all the assertions being mixed in with actual code :(

It would be much nicer if you can just assert stuff in the beginning and the end.

@@ +661,5 @@
> +
> +  uint64_t result = 0;
> +
> +  const uint8_t*& buffer = *aIterator;
> +  const uint8_t* bufferStart = buffer;

Make these debug-only, and then have a multiplier which starts at 1 and is multiplied by 128 each time through the loop?

@@ +674,5 @@
> +
> +      const uint64_t shiftedNumber =
> +        (uint64_t(thisByte) << (shiftMultiplier * 7));
> +
> +      MOZ_ASSERT(UINT64_MAX - result > shiftedNumber);

This assertion seems more confusing than not.

Just add at the end something like

MOZ_ASSERT((*aIterator - bufferStart) * 7 < 64);

@@ +2814,5 @@
> +  static MOZ_CONSTEXPR_VAR uint8_t kOldStringTag = 0x3;
> +  static MOZ_CONSTEXPR_VAR uint8_t kOldArrayTag = 0x4;
> +  static MOZ_CONSTEXPR_VAR uint8_t kOldMaxType = kOldArrayTag;
> +
> +  MOZ_ASSERT(aTagOffset <=  Key::kMaxArrayCollapse);

Move this up with the other asserts.

@@ +2839,5 @@
> +
> +    // Numbers and Dates are encoded as 64-bit integers, but trailing 0
> +    // bytes have been removed.
> +    const uint32_t byteCount =
> +      std::min(uint32_t(sizeof(uint64_t)), uint32_t(aSourceEnd - aSource));

Might be more readable if you introduce a function like

inline
uint32_t adjustedSize(uint_t length, uint8_t* aSource,
                      uint8_t* aSourceEnd)
{
  return std::min(uint32_t(length), uint32_t(aSourceEnd - aSource));
}

That way you can use adjustSize in the string handling code too.
And then change the above to
const uint32_t byteCount =
  adjustedSize(sizeof(uint64_t), aSource, aSourceEnd);

@@ +2879,5 @@
> +    return NS_OK;
> +  }
> +
> +  if (NS_WARN_IF(sourceTag < kOldArrayTag) ||
> +      NS_WARN_IF(sourceTag > kOldMaxType * Key::kMaxArrayCollapse)) {

The second NS_WARN_IF is redundant. You already did exactly the same check higher up.

@@ +2886,5 @@
> +  }
> +
> +  aTagOffset++;
> +
> +  if (aTagOffset == Key::kMaxArrayCollapse) {

Could you assert that |sourceTag == kOldArrayTag| here?

::: dom/indexedDB/Key.cpp
@@ +92,5 @@
>   end of the encoded buffer.
>   
> + "foo"         // 0x30 s s s
> + 1             // 0x10 bf f0
> + ["a", "b"]    // 0x80 s 0x30 s

Can you add a 0 after the first 's'?
Attachment #8569405 - Flags: review?(jonas) → review+
I have all the patches applied on my mac and
./mach mochitest-plain dom/indexedDB fails on test_app_isolation_inproc.html

ActorsParent.cpp
CreateStorageConnection()
executing "PRAGMA auto_vacuum = INCREMENTAL;" fails with NS_ERROR_FILE_NO_DEVICE_SPACE

When I comment out this pragma, everything works just fine.

./mach mochitest-plain --e10s dom/indexedDB
test_lowDiskSpace.html fails for the same reason

I think there must be something special happening when sqlite autovacuums which is not handled properly in our VFS implementation.
Attached patch Fix truncate, v1 (obsolete) — Splinter Review
Ah, I found it. xTruncate wasn't handling the chunk size (growth increment) properly.

This applies on top of the other.
Attachment #8577569 - Flags: review?(Jan.Varga)
Attached patch Fix truncate, v1Splinter Review
Oops, that was an old version.
Attachment #8577569 - Attachment is obsolete: true
Attachment #8577569 - Flags: review?(Jan.Varga)
Attachment #8577570 - Flags: review?(Jan.Varga)
(In reply to Jan Varga [:janv] from comment #5)
> executing "PRAGMA auto_vacuum = INCREMENTAL;" fails with
> NS_ERROR_FILE_NO_DEVICE_SPACE

It's actually unrelated to the vacuum stuff. The test triggers a silent underflow of quota usage and then this is the first SQLite query that needs to write to disk. When we go to see if there is more space available it looks like the disk is crazy-full.
Attached patch Patch, v1.1 (obsolete) — Splinter Review
Includes tests and truncate fix.
Attachment #8569405 - Attachment is obsolete: true
Attachment #8569405 - Flags: review?(Jan.Varga)
Attachment #8577644 - Flags: review?(Jan.Varga)
Comment on attachment 8577570 [details] [diff] [review]
Fix truncate, v1

This doesn't need separate review, but it's a useful interdiff.
Attachment #8577570 - Flags: review?(Jan.Varga)
This applies on top of the others, it's just a simple rename so that we have consistent names in our linked tables.
Attachment #8578832 - Flags: review?(Jan.Varga)
In terms of the B*-trees versus B-trees impacts of this change discussed in the 4.0 section of http://www.sqlite.org/withoutrowid.html what's our plan?  It seems like the ideal thing would be a SQLite enhancement to support still having index pages with high fan-out despite having cool complex primary keys.

Our mitigations would seem to be:
- bug 964561 where we spill the structured clone data to disk (or a separate, non-"without rowid" table) when it gets large enough to start impacting the btree's efficacy.
- create separate database tables for each object store instead of using one big soup object_data table.
- have a moz-specific hack creation option that lets some object stores that are expecting to store large data live in a non-"without rowid" table (or their own table)
- have content create entirely separate IndexedDB databases for content with larger values to avoid damage from the big soup object_data table.

I expect a non-starter is exposing the structured clone algorithm directly to content so it could workaround things on its own with Blobs.  (And JSON.stringify seems like a bad alternative.)

Disclaimer/context: I'm not able to intuitively reason about the implications of a combination of object stores with small values and large values.  My fear is that pathological situations could occur where the large values, especially if more numerous than the small values, could really decrease the efficiency of the table.  If anyone does intuitively understand the worst-case scenario or has done the legwork, I'd be really interested in hearing what it is.
(In reply to Andrew Sutherland [:asuth] from comment #12)
> - bug 964561 where we spill the structured clone data to disk (or a

Yeah, that's the plan so far. Once we have that mechanism in place we can tweak the numbers to see what works the best.
Attached patch Fix VFS, v1Splinter Review
Ok, Jan, I think I finally figured out what was wrong with the VFS. SQLite uses xTruncate to *grow* the file in some cases, and it uses xFileControl(SQLITE_FCNTL_SIZE_HINT) to resize the file ahead of xWrite. Maybe it just does this more often with WAL and mmap enabled, but this looks like a long-standing bug in our VFS.

Since xTruncate can grow the file it doesn't make sense to have separate UpdateSize and MaybeAllocateSpace functions on QuotaObject, so I combined them. The only tricky part is that xWrite can write into the middle of a file, so we don't want to always set mSize to the new size.

And I hooked xFileControl to give us the chunk size and let us know when it is going to use sneaky unhookable OS functions to resize the file out from under us.
Attached patch Patch, v1.2Splinter Review
This is the latest full patch.
Attachment #8577644 - Attachment is obsolete: true
Attachment #8577644 - Flags: review?(Jan.Varga)
Attachment #8579015 - Flags: review?(Jan.Varga)
Awesome, thanks for fixing it!
I'll review it tomorrow.
Comment on attachment 8578832 [details] [diff] [review]
Rename object_data.key_value column (no functional changes), v1

Review of attachment 8578832 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/indexedDB/ActorsParent.cpp
@@ +16044,5 @@
>    if (NS_WARN_IF(NS_FAILED(rv))) {
>      return rv;
>    }
>  
> +  rv = aObjectStoreKey.BindToStatement(updateStmt,NS_LITERAL_CSTRING("key"));

Nit: add space after ","
Attachment #8578832 - Flags: review?(Jan.Varga) → review+
Comment on attachment 8579015 [details] [diff] [review]
Patch, v1.2

Review of attachment 8579015 [details] [diff] [review]:
-----------------------------------------------------------------

I'll go through ActorsParent.cpp one more time tomorrow, but I think this all looks very good!

All IDB related tests work for me except:
./mach mochitest-plain dom/datastore

::: dom/indexedDB/ActorsParent.cpp
@@ +489,5 @@
> +  }
> +
> +  bool
> +  operator==(const IndexDataValue& aOther) const
> +  { 

Nit: extra whitespace

@@ +1015,5 @@
>    // Table `database`
>    nsresult rv = aConnection->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>      "CREATE TABLE database"
> +      "( name TEXT PRIMARY KEY"
> +      ", origin TEXT NOT NULL"

How is the origin column helpful ?

@@ +1061,5 @@
>    rv = aConnection->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
>      "CREATE TABLE object_data"
> +      "( object_store_id INTEGER NOT NULL"
> +      ", key_value BLOB NOT NULL"
> +      ", index_data_values BLOB DEFAULT NULL"

Could you add a comment with description of index_data_values column ?

@@ +3071,5 @@
> +    return rv;
> +  }
> +
> +  rv = aConnection->ExecuteSimpleSQL(NS_LITERAL_CSTRING(
> +      // This will eventually become the |object_store| table.

Nit: bad indentation of the comment

@@ +20976,5 @@
>    {
>      size_t compressedLength = snappy::MaxCompressedLength(uncompressedLength);
>  
>      // moz_malloc is equivalent to NS_Alloc, which we use because mozStorage
> +    // expectsc to be able to free the adopted pointer with NS_Free.

a typo "expectsc"

::: dom/indexedDB/Key.cpp
@@ +101,5 @@
>  nsresult
>  Key::EncodeJSValInternal(JSContext* aCx, JS::Handle<JS::Value> aVal,
>                           uint8_t aTypeOffset, uint16_t aRecursionDepth)
>  {
> +  NS_ENSURE_TRUE(aRecursionDepth < kMaxRecursionDepth, NS_ERROR_DOM_INDEXEDDB_DATA_ERR);

convert to NS_WARN_IF ?

@@ +183,5 @@
>  Key::DecodeJSValInternal(const unsigned char*& aPos, const unsigned char* aEnd,
>                           JSContext* aCx, uint8_t aTypeOffset, JS::MutableHandle<JS::Value> aVal,
>                           uint16_t aRecursionDepth)
>  {
> +  NS_ENSURE_TRUE(aRecursionDepth < kMaxRecursionDepth, NS_ERROR_DOM_INDEXEDDB_DATA_ERR);

convert to NS_WARN_IF ?

@@ +539,5 @@
> +  const uint8_t* data;
> +  uint32_t dataLength = 0;
> +
> +  nsresult rv = aSource->GetSharedBlob(aIndex, &dataLength, &data);
> +  NS_ENSURE_SUCCESS(rv, NS_ERROR_DOM_INDEXEDDB_UNKNOWN_ERR);

Again, NS_WARN_IF
The style in this file is mixed anyway, there is both NS_ASSERTION/MOZ_ASSERT and NS_ENSURE_SUCCESS/NS_WARN_IF. So I guess we should use the new style for new code or lines that are being modified.

::: dom/quota/QuotaObject.h
@@ +35,2 @@
>    bool
> +  MaybeUpdateSize(int64_t aSize, bool aTruncate);

Hm, aSize doesn't sound absolutely correct. Especially in the case of calling from write(), it's not the size.
Maybe rename to aMaxOffset ?
Attachment #8579015 - Flags: review?(Jan.Varga) → review+
This applies on top of the others. We need it because the packages upgrade xpcshell tests have databases that do not have incremental vacuuming enabled because they were created/packaged on desktop builds. So we can't assert that all upgraded dbs on mobile have incremental vacuuming. It's easier to just always set the pragma rather than do silly debug checking.
Attachment #8583124 - Flags: review?(Jan.Varga)
Oops, right patch now.
Attachment #8583124 - Attachment is obsolete: true
Attachment #8583124 - Flags: review?(Jan.Varga)
Attachment #8583127 - Flags: review?(Jan.Varga)
Attachment #8583127 - Flags: review?(Jan.Varga) → review+
Comment on attachment 8579015 [details] [diff] [review]
Patch, v1.2

Review of attachment 8579015 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/indexedDB/ActorsParent.cpp
@@ +263,5 @@
> +  }
> +};
> +
> +template <typename T>
> +using UniqueMozFreePtr = UniquePtr<T, MozFreeDeleter>;

This is a quite convoluted way to do something that you don't even need at all. You're using moz_malloc, but you should be using fallible new instead, in which case delete would work just fine instead of moz_free.
(In reply to Mike Hommey [:glandium] from comment #25)
> This is a quite convoluted way to do something that you don't even need at
> all. You're using moz_malloc, but you should be using fallible new instead,
> in which case delete would work just fine instead of moz_free.

The requirement for using moz_free comes from our storage code... When I call BindAdoptedBlobByName I hand in a uint8_t* and it calls moz_free sometime later.

Are you saying there's a way around that?
(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #26)
> (In reply to Mike Hommey [:glandium] from comment #25)
> > This is a quite convoluted way to do something that you don't even need at
> > all. You're using moz_malloc, but you should be using fallible new instead,
> > in which case delete would work just fine instead of moz_free.
> 
> The requirement for using moz_free comes from our storage code... When I
> call BindAdoptedBlobByName I hand in a uint8_t* and it calls moz_free
> sometime later.
> 
> Are you saying there's a way around that?

There is one: don't care. moz_free, delete and free are, in practice, the same thing. That's why there's ongoing simplification, starting with bug 1138293 (which removes moz_malloc/moz_free, and that I just landed (and that's also why I noticed this bug, since it added a use of moz_malloc/moz_free)).
(In reply to Mike Hommey [:glandium] from comment #28)
> There is one: don't care. moz_free, delete and free are, in practice, the
> same thing.

Hrm, I feel weird about pairing new with free... Won't tools like asan/valgrind/vtune scream about that sort of thing?
Flags: needinfo?(bent.mozilla)
https://hg.mozilla.org/mozilla-central/rev/165cbd99c2ce
https://hg.mozilla.org/mozilla-central/rev/bfa0009deb1e
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
> Hrm, I feel weird about pairing new with free... Won't tools like
> asan/valgrind/vtune scream about that sort of thing?

Valgrind does. |new| and |delete| should still always be used together. But moz_malloc and moz_free have been removed; just use malloc and free.
(In reply to Nicholas Nethercote [:njn] from comment #34)
> just use malloc and free.

I am. The pain point is that UniquePtr calls delete by default.
Depends on: 1248851
You need to log in before you can comment on or make changes to this bug.