1208840 - IndexedDB blocks DOM rendering

Reporter

Description

•

10 years ago

Attached file database-comparison.zip — Details

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:41.0) Gecko/20100101 Firefox/41.0 Build ID: 20150917150946 Steps to reproduce: 1. Do a long-running IndexedDB operation in a transaction 2. Meanwhile, run an animated GIF in an <img/> tag or a JavaScript animation See a live demo here: http://nolanlawson.github.io/database-comparison/ Steps: 1. Choose "IndexedDB" 2. Choose 10000 or 100000 3. Click "Insert docs" 4. Notice that Kirby stops dancing while the documents are being inserted. The DOM is not affected by IndexedDB when run in a web worker. Also I note that this bug also exists in Chrome, but doesn't exist in Safari or Edge. Actual results: The animation is paused during the IDB transaction. Expected results: The animation should continue running smoothly even while IndexedDB operations are in progress.

Nolan Lawson

Reporter

Comment 1

•

10 years ago

Issue filed on Chromium: https://code.google.com/p/chromium/issues/detail?id=536620

Andrew Sutherland [:asuth] (he/him)

Comment 2

•

10 years ago

(In reply to nolan from comment #0) > Expected results: > > The animation should continue running smoothly even while IndexedDB > operations are in progress. From running on non-e10s nightly using the Firefox devtools profiler, it appears that the expected thing is happening: the animated gif is unable to paint while the testing code on the main thread is generating 10,000 documents and issuing 10,000 put calls as part of the over-arching transaction. Once the JS code yields control flow, painting can resume. I took 2 screenshots of my run; unfortunately this is on an active profile because the IndexedDB cleanup logic seems broken (I got "Error: InvalidStateError: A mutation operation was attempted on a database that did not allow mutations." after trying to re-run on a cleaner profile after running clean.): * The waterfall view: https://clicky.visophyte.org/files/screenshots/20150927-150058.png * The JS Flame Chart view: https://clicky.visophyte.org/files/screenshots/20150927-150146.png Familiarizing ourselves quickly, on the big waterfall and the mini-timeline waterfall, the green squares represent the paints Kirby is making us do, and the yellow stuff is the testing logic. I've zoomed the selection area down to the approximate time window of the test. The longest promise callback is 107.69ms in test.js:133 which is this function from http://nolanlawson.github.io/database-comparison/tester.js: return new Promise(function (resolve, reject) { var txn = db.transaction('docs', 'readwrite'); var oStore = txn.objectStore('docs'); for (var i = 0; i < numDocs; i++) { var doc = createDoc(); doc.id = 'doc_' + i; oStore.put(doc); } txn.oncomplete = resolve; txn.onerror = reject; txn.onblocked = reject; }); Now, the testing logic for me says "Took 371ms". This is consistent with the initial promise callbacks through to the "complete" DOM event on the waterfall. The testing logic is tracking how long the commit took to happen. We can see paints did happen asynchronously while the transaction was running in the background, but of course, none happened while JS was on the main thread issuing the transaction. And if we flip over to the JS Flame Chart we can see that part of the reason the paints were further delayed was because the flipping Google +1 buttons (plusone.js) from the Chrome web extensions documentation pages are thrashing the browser because I apparently desperately need realtime updates on how many people have +1'ed the documentation page. But I'll go address this by making a snarky tweet now.

Nolan Lawson

Reporter

Comment 3

•

10 years ago

Maybe I'm misunderstanding, but why does the function you posted block for so long? When I run the "Regular object" test with 100000 documents, it only blocks the GIF for a split-second, then reports ~150ms. When I run the same number of documents with IndexedDB, though, it reports ~7600ms and blocks Kirby for at least 4 seconds by my eye. It's creating the same number of objects, but it's 7 seconds slower with significantly more dropped frames. Isn't that IndexedDB blocking the DOM?

Andrew Sutherland [:asuth] (he/him)

Comment 4

•

10 years ago

Your regular object test is just: function createDoc() { return { data: Math.random() }; } function regularObjectTest(numDocs) { var obj = {}; for (var i = 0; i < numDocs; i++) { obj['doc_' + i] = createDoc(); } } Although this will generate a fair amount of garbage memory, this is the type of micro-test the SpiderMonkey JIT should eat for breakfast, especially since it's creating small objects with a stable shape/type that do not require cross-compartment wrappers, etc., so it's not surprising that it runs very quickly. The IndexedDB test I ran generated 10,000 separate IndexedDB put requests, which includes running a structured clone operation, creating non-trivial C++/WebIDL bound objects using a native non-continuous-arena allocator, etc. This is inherently more expensive. My point is that the JS execution model is that you are going to block other main-thread things while your JS is running on the main thread. Unless off-the-main-thread-animation is taking responsibility for animating the gif (it may on some platforms/in some cases? or maybe in the future), during the idbTest for loop, we expect that Kirby won't animate. What I meant to convey previously is that you really should be capturing 2 numbers: 1) The wall clock time for the write to durably complete. 2) The amount of time your code is blocking the main thread. The dominating factor will be your for-loop, but you could also wrap other pieces of your code too. We do expect Kirby to stop animating for the time interval(s) covered by that second number. We may also potentially expect some secondary fallout related to the cycle collector potentially needing to be involved in garbage collecting stuff (or not). But if Kirby is stopping significantly more than covered by that interval, then there might be something IndexedDB could do to improve its performance. (Albeit, limited by what the test does. For example, if you put event-handlers on the requests returned by the put() call, we would expect the main thread to be slaughtered by N callbacks firing.) This is likely to explain why Chrome/Chromium would behave similarly (assuming they depend on the main thread to help animate gifs on your platform.) The profiler definitely helps you and me understand what might be going on. Especially since the JS Flame Chart is actually showing what Gecko is getting up to even for things not caused by your page/compartment. If I zoomed in more, I saw some stuff attributed to mozStorage, which is suspicious, but quite possibly related to the Places API and its shenanigans. It would be great if you could play around with the profiler in your Firefox instance to see if you are getting similar results, etc.

Nolan Lawson

Reporter

Comment 5

•

10 years ago

I'm on OS X 10.10.5. I don't know much about Firefox internals, but Kirby is definitely blocked for several seconds if you insert 100000 objects into IndexedDB. Whereas with both LocalStorage and in-memory ("regular object"), he barely misses a beat when you insert the same number of objects. I took a performance snapshot and attached it (profile.json), but I'm seeing roughly the same thing as you, except that the "promise callback" is taking around 4 seconds on my machine, corresponding to the amount of time Kirby is blocked. The timer is indeed a bit inaccurate. Is there a better way to calculate the number of dropped frames during the GIF animation? requestAnimationFrame()? Side note: this is the first time I've reported a bug on Firefox's IndexedDB, which is saying a lot because I've probably reported a dozen IndexedDB bugs on the other browsers. I've always been super happy with your work (especially the performance improvements over the past couple years!). This result only surprises me because I've always been told that IndexedDB is preferable to LocalStorage since LocalStorage blocks the DOM, but here I'm seeing nearly the opposite.

Nolan Lawson

Reporter

Comment 6

•

10 years ago

Attached file Profile of adding 100000 docs to IndexedDB — Details

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 7

•

10 years ago

So, fwiw, the reason the animated GIF freezes is because we don't do off-main-thread animations for GIFs, and the main thread is stalled. It's possible that the loop falls off the JIT or something ... we should investigate.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 8

•

10 years ago

Is this something either of you would be interested in profiling?

Flags: needinfo?(bzbarsky)

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 9

•

10 years ago

I see 33% of total in IDBObjectStore::AddOrPut, and when focus on that 33% (so that it is 100%) - almost 50% is ___memmove_ssse3_back somewhere in IDBTransaction::StartRequest. SendPBackgroundIDBRequestConstructor is 12% - Structured clone handling 11% (in GetAddInfo) (of that 11, 3.5 is VectorBase::growStorageBy) - IDBRequest::Create 12%, (of that 12, 6 is IDBRequest::CaptureCaller)

Flags: needinfo?(bugs)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 10

•

10 years ago

The memmove is likely also IPDL overhead. What is the other 66% in? JS?

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 11

•

10 years ago

That 33% was actually with all the threads. So of main thread time it is 77% and the rest is random JS or Promise handling.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 12

•

10 years ago

That's an important detail! So this means it's basically all IPDL overhead.

Boris Zbarsky [:bzbarsky]

Comment 13

•

10 years ago

So I poked a bit at this too, kinda cheating (running in non-e10s mode, but this stuff mostly uses PBackground anyway, so I think that's OK). I see significant mainthread time in the following places: 1) Under ipc::DoWorkRunnable::Run. This is about 40% (or more; my profile includes some painting while I was mousing over to the right button) of the mainthread time. a) About 1/5 of that is mozilla::dom::indexedDB::BackgroundRequestChild::Recv__delete__ doing the DispatchSuccessEvent thing (converting idb keys to jsvals, firing the actual DOM events, etc). b) About 3/5 is mozilla::dom::indexedDB::PBackgroundIDBTransactionChild::RemoveManagee, almost all of this under nsTArray_Impl<PBackgroundIDBRequestChild*>::RemoveElementSorted calling memmove. Sounds like we have tons of transactions and end up with nice O(N^2) behavior here. c) About 1/5 is in other stuff here and there (IPC overhead with MessageChannel::CxxStackFrame, etc). In an e10s build I guess this would happen in the parent process, which might help some, if these calls from the child are non-blocking. 2) Another 40%+ under IDBObjectStoreBinding::Put. This includes: a) About 2/5 of this is under PBackgroundIDBTransactionChild::SendPBackgroundIDBRequestConstructor. This is doing MessageChannel::Send (locks, SendMessage) and doing memmoves via nsTArray_Impl<PBackgroundIDBRequestChild*>::InsertElementAt. Yay more O(N^2) stuff. b) About 1/4 is under IDBObjectStore::GetAddInfo, mostly doing the JS structured clone write. Something like 1/3 of that time is just overhead getting to the internal JS "write" impl. :( c) 1/5 under IDBRequest::Create, which is doing various stuff like DOMEventTargetHelper::BindToOwner, but half of it is nsJSUtils::GetCallingLocation which does JS::DescribeScriptedCaller. This is mostly the frameiter stuff; using SavedStack won't help much here I expect. Getting rid of the need to do this, on the other hand, might help. ;) d) A bit of long-tail stuff.

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 14

•

10 years ago

Oh, so conclusions: We should fix the data structure we use for storing Managees to deal with having lots of them, and then remeasure/reprofile... ;)

Boris Zbarsky [:bzbarsky]

Updated

•

10 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

Nathan Froyd [:froydnj]

Updated

•

10 years ago

Depends on: 1212027

Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)

Updated

•

9 years ago

Keywords: perf

OS: Unspecified → All

Hardware: Unspecified → All

Version: 43 Branch → unspecified

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 15

•

9 years ago

Per comment 14 looks like we need more profiling here.

Flags: needinfo?(bzbarsky)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 16

•

9 years ago

Zoom says 32% time in main thread goes to _raw_write_unlock_irqrestore whatever it means. Lots of context switches? 29% is spent under IDBObjectStore::AddOrPut. Of that 29% 35% is IDBRequest::Create, under which 19% (of that 29%) is CaptureCaller. Why we need such thing? Sort of a regression from Bug 1083285 ? Of that 29% 21% is structured cloning write. 21% goes to SendPBackgroundRequestConstructor. there is some odd 2.2% of that 29% going to LoggingString ctor.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 17

•

9 years ago

Most of the interesting stuff happens on a background thread. An idle main thread would mean this bug is fixed though.

Boris Zbarsky [:bzbarsky]

Comment 18

•

9 years ago

OK, so I do still see a pause in the animation if I insert 100000 things on the given page. Happens in both e10s and non-e10s mode. A profile, in non-e10s mode (let me know if you think an e10s mode one would be useful) shows time split about evenly across three threads: 1) The one running indexedDB::ConnectionPool::ThreadRunnable::Run (calling DoDatabaseWork, mostly). 2) The one running ipc::MessagePumpForNonMainThreads (several thiings going on in there). 3) The main thread. Restricting to just the main thread, I see 1787ms under IDBObjectStoreBinding::put calls. Going to focus on that, because the other 500ms was probably the time it took me to move from the "start the profiler" button to the "Insert docs" button. So the time under put() breaks down as follows (all percentages of the time under put(): 1) 33% IDBTransaction::StartRequest. The time is mostly under MessageChannel::Send, taking and releasing locks, doing MessagePump::ScheduleWork (8% of the time is under here, calling nsThread::PutEvent, which involves _more_ locks). Overall, MessageChannel::Seend is 25% of the time. The remainder is PBackgroundIDBTransactionChild::Write (3%), hashtable addition (2%), PBackgroundChild::Register (2%, mostly malloc). 2) 30% under IDBObjectStore::GetAddInfo. 25% of this is JSAutoStructuredCloneBuffer::write, which is mostly writing strings, traversing objects, mallocing in the destructor of JSStructuredCloneWriter, that sort of thing. 5% is KeyPath::ExtractKey, about half under Key:::EncodeString, which is mostly doing the malloc under EnsureMutable/SetCapacity. This is the char16_t* overload of EncodeString, and that EnsureMutable looks like the mBuffer.GetMutableData call to me. 3) 20% under IDBRequest::Create. 11% of this is nsJSUtils::GetCallingLocation (!). 4% under DOMEventTargetHelper:::BindToOwner, 3% cyclecollector::HoldJSObjectsImpl, 1% malloc. Apart from those big three, there's the usual long tail of stuff: 3% in binding code doing WrapObject, 3% LoggingString constructors under AddOrPut, 2% ~ObjectStoreAddPutParams, 2% RequestParams::MaybeDestroy, 2% RequestParams::operator=, 2% ObjectStoreAddPutParams::Assign, and smaller things of various sorts.

Flags: needinfo?(bzbarsky)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 19

•

9 years ago

Alright, so it sounds like the next two pieces are misc IPDL overhead (under IDBTransaction::StartRequest) and nsJSUtils::GetCallingLocation.

Ben Kelly [:bkelly, not reviewing]

Updated

•

9 years ago

Depends on: 1284369

Marion Daly [:mdaly]

Updated

•

8 years ago

Priority: -- → P3

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

Andrew Sutherland [:asuth] (he/him)

Comment 20

•

1 year ago

I'm not entirely sure what to do with this bug; a quick tl;dr of the situation is:

IndexedDB operations on a given thread must fundamentally block that thread for the duration of structured serialization calls and related plumbing bookkeeping.
Although Firefox has support for OMT (off (the) main thread) animations, animated gifs in Firefox are ticked by nsRefreshDriver (diagram on the data structure that tracks them, nsRefreshDriver::mStartTable which covers registration and ticks.) and inherently involve the main thread. It's conceivable that Firefox could instead do the ticking for animated gifs OMT, but that should probably be an enhancement bug in Core :: Graphics. (Bug 666446 was in there and added the initial ticking system I think.) Bug 717872 was "Off-main-thread animated images" which seems like it was intended for that purpose, but https://bugzilla.mozilla.org/show_bug.cgi?id=717872#c86 closing it doesn't provide a lot of context, presumably because it was a response to a bot nag.
At the time this was filed there were extra plumbing overheads, and in particular I think we may have had some structured serialization overheads we've since improved. (Although we may have improved deserialization more than serialization since deserialization can trigger GC.) I don't know that there are actionable enhancements possible at this time that don't involve things like putAll().
I think one aspect of this that definitely was addressed was runnable priorities and queues. IDB tasks/runnables for the request completions should run at a lower priority than the refresh driver ticks, so even as the main thread will be overwhelmed by runnables resulting from that, we do expect the gif to animate without much problem as long as the tasks are short. (That is, massive structured deserialization payloads could still jank the animation.)

So I don't think there's more to do directly for this bug, and I'm going to resolve this WFM. (I'm also going to resolve Ben's filed bug 1284369 because actor event targets and runnable prioritization I think addressed what he was talking about.)

Status: NEW → RESOLVED

Closed: 1 year ago

Resolution: --- → WORKSFORME

Whiteboard: dom-lws-bugdash-triage

database-comparison.zip 10 years ago Nolan Lawson 190.18 KB, application/zip		Details
Profile of adding 100000 docs to IndexedDB 10 years ago Nolan Lawson 851.02 KB, application/json		Details