Closed Bug 885205 Opened 6 years ago Closed 6 years ago

Extremely slow reflect bookmarks menu items after restoring bookmarks. And disk access abnormal. And Firefox process does not disappear from task manager on exit

Categories

(Toolkit :: Places, defect)

24 Branch
x86_64
All
defect
Not set

Tracking

()

VERIFIED FIXED
mozilla25
Tracking Status
firefox23 --- unaffected
firefox24 + verified
firefox25 --- verified

People

(Reporter: alice0775, Assigned: raymondlee)

References

Details

(Keywords: perf, regression)

Attachments

(5 files, 4 obsolete files)

Build Identifier:
http://hg.mozilla.org/mozilla-central/rev/d2a7cfa34154
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130619 Firefox/24.0 ID:20130619031048

I have a 500kb bookmarks JSON backup file. (Approx. 1400 bookmark items and 100 folders)
I found a critical problem when I try to restore the JSON backup file to newly created profile using Library.

a) Extremely slow reflect bookmarks menu items.
b) Disk access abnormal.  
c) Firefox process does not disappear from task manager on exit

Steps To Reproduce:
1. Create New Profile
2. Restore Bookmarks from JSON backup file
3. Open Bookmarks Menu
   --- Observe Number of menuitems
   --- Observe HDD access activities
4. Exit Browser
   --- Observe Firefox.exe process in Task Manager
   --- Observe HDD access activities

Actual Results:
  Extremely slow reflect bookmarks menu items.
  Disk access abnormal. 
  Firefox process does not disappear from task manager on exit.

Expected Results:
  Should reflect bookmarks menu items almost instantly.
  Low Disk access activity
  Firefox process should disappear within 2-3 sec after exit browser

Regression window(m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/a47f4e36197f
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130606 Firefox/24.0 ID:20130606024537
Bad:
http://hg.mozilla.org/mozilla-central/rev/dc8e78ed8c44
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130606 Firefox/24.0 ID:20130606121949
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a47f4e36197f&tochange=dc8e78ed8c44


Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/26eb52ade0ee
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130606 Firefox/24.0 ID:20130606040138
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/5e07d25dbd87
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130606 Firefox/24.0 ID:20130606050336
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=26eb52ade0ee&tochange=5e07d25dbd87


Regressed by:
7d5c7429fe03	Raymond Lee — Bug 865555 - Remove synchronous setCharsetForURI in BI_importJSONNode. r=mak
Summary: Extremely slow reflect bookmarks menu items. And disk access abnormal. And Firefox process does not disappear from task manager on exit → Extremely slow reflect bookmarks menu items after restoring bookmarks. And disk access abnormal. And Firefox process does not disappear from task manager on exit
The issue doesn't happen on Mac at all.  I will investigate this.
Assignee: nobody → raymond
This problem also happens on Ubuntu 12.04.
http://hg.mozilla.org/mozilla-central/rev/d2a7cfa34154
Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20130619 Firefox/24.0 ID:20130619031048
OS: Windows 7 → All
(In reply to Alice0775 White from comment #3)
> This problem also happens on Ubuntu 12.04.
> http://hg.mozilla.org/mozilla-central/rev/d2a7cfa34154
> Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20130619 Firefox/24.0
> ID:20130619031048

I can see that the slow reflect bookmarks menu items on ubuntu but I don't see the UI lockup and other odd behaviours like I saw in windows 7.  The reason is that we are using async bookmarks API so the time which takes to import all bookmarks would be slower.

I will check what happens on windows.
Attached patch WIP (obsolete) — Splinter Review
This patch can reduce the UI lockup because we use 
yield PlacesUtils.setCharsetForURI in BI_importJSONNode().  That setCharsetForURI uses mainThread.dispatch because we are waiting for bug 699844.

However, after applying this patch, the UI lockup still happens for few seconds when restoring the JSON file in this bug.

@Marco: any suggestions how to improve this?
Flags: needinfo?(mak77)
(In reply to Raymond Lee [:raymondlee] from comment #5)
> Created attachment 765891 [details] [diff] [review]
> WIP
> 
> This patch can reduce the UI lockup because we use 
> yield PlacesUtils.setCharsetForURI in BI_importJSONNode().  That
> setCharsetForURI uses mainThread.dispatch because we are waiting for bug
> 699844.
> 
> However, after applying this patch, the UI lockup still happens for few
> seconds when restoring the JSON file in this bug.
> 
http://ftp.mozilla.org/pub/mozilla.org/mozilla.org/firefox/try-builds/raymond@raysquare.com-f1c9e04943f5/

These builds contain the patch and you can try that on Windows or Linux.
Both Window and Linux builds works fine.
All problems in comment#0 are solved! 

http://hg.mozilla.org/try/rev/f1c9e04943f5
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130621 Firefox/24.0 ID:20130621113038
http://hg.mozilla.org/try/rev/f1c9e04943f5
Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20130621 Firefox/24.0 ID:20130621113906
(In reply to Alice0775 White from comment #7)
> Both Window and Linux builds works fine.
> All problems in comment#0 are solved! 
> 
> http://hg.mozilla.org/try/rev/f1c9e04943f5
> Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130621 Firefox/24.0
> ID:20130621113038
> http://hg.mozilla.org/try/rev/f1c9e04943f5
> Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20130621 Firefox/24.0
> ID:20130621113906

WAIT!, the try build is broken!
Some bookmark items are missing after restoring from JSON.

Successfully restored in Good Build of Comment#0 and Aurora23.0a2(2013-06-21).
However, Some bookmark items are missing in the Try Build.
Attached image screen shot #1
Attached image screen shot #2
And error in Browser Console:

[14:25:48.338] [Exception... "Could not convert JavaScript argument arg 1 [nsICategoryManager.deleteCategoryEntry]"  nsresult: "0x80570009 (NS_ERROR_XPC_BAD_CONVERT_JS)"  location: "JS frame :: resource://gre/components/PlacesCategoriesStarter.js :: <TOP_LEVEL> :: line 45"  data: no] @ resource://gre/components/PlacesCategoriesStarter.js:45

[14:25:48.339] too much recursion @ chrome://browser/content/browser.js:3096

[14:25:48.341] [Exception... "Could not convert JavaScript argument arg 1 [nsICategoryManager.deleteCategoryEntry]"  nsresult: "0x80570009 (NS_ERROR_XPC_BAD_CONVERT_JS)"  location: "JS frame :: resource://gre/components/PlacesCategoriesStarter.js :: <TOP_LEVEL> :: line 45"  data: no] @ resource://gre/components/PlacesCategoriesStarter.js:45
(In reply to Alice0775 White from comment #11)
> And error in Browser Console:
> 
> [14:25:48.338] [Exception... "Could not convert JavaScript argument arg 1
> [nsICategoryManager.deleteCategoryEntry]"  nsresult: "0x80570009
> (NS_ERROR_XPC_BAD_CONVERT_JS)"  location: "JS frame ::
> resource://gre/components/PlacesCategoriesStarter.js :: <TOP_LEVEL> :: line
> 45"  data: no] @ resource://gre/components/PlacesCategoriesStarter.js:45
> 
> [14:25:48.339] too much recursion @ chrome://browser/content/browser.js:3096
> 
> [14:25:48.341] [Exception... "Could not convert JavaScript argument arg 1
> [nsICategoryManager.deleteCategoryEntry]"  nsresult: "0x80570009
> (NS_ERROR_XPC_BAD_CONVERT_JS)"  location: "JS frame ::
> resource://gre/components/PlacesCategoriesStarter.js :: <TOP_LEVEL> :: line
> 45"  data: no] @ resource://gre/components/PlacesCategoriesStarter.js:45

OK, if we made the change as in attachment 765891 [details] [diff] [review], and no more async methods and then remove the return Task.spawn in BI_importJSONNode (converting back to a sync method), the "too much recursion" error wouldn't happen anymore.

Marco: any recommendations?
we must continue towards the asynchronous way, cause in future bookmarks APIs will be async, so there's no way we can keep things immediate.
The fact it may take a while more to restore bookmarks is expected, and it's unlikely the user will start a long restore and close Firefox immediately.
Surely it's worth to measure the difference, like if it takes twice the time, or 10 times the time.

Though the high disk usage is not expected, as well as the too much recursion error, we should figure what's causing these.
That said, the suggested path could surely be used as a temporary fix while we figure the issue.

I didn't have enough time to look at the whole thing so far, so I can't suggest a better path yet.
Attached patch Temp fix v1 (obsolete) — Splinter Review
Temp fix for the issue.  Remove the Task.spawn from importJSONNode()
Attachment #765891 - Attachment is obsolete: true
Attachment #766524 - Flags: review?(mak77)
(In reply to Marco Bonardo [:mak] from comment #13)
> we must continue towards the asynchronous way, cause in future bookmarks
> APIs will be async, so there's no way we can keep things immediate.
> The fact it may take a while more to restore bookmarks is expected, and it's
> unlikely the user will start a long restore and close Firefox immediately.
> Surely it's worth to measure the difference, like if it takes twice the
> time, or 10 times the time.

Without the patch, it took around 250 seconds to restore the attachment 765202 [details] on my windows machine.  After applying the patch, it took around 3.6 seconds

> 
> Though the high disk usage is not expected, as well as the too much
> recursion error, we should figure what's causing these.

For high disk usage, IMO when we restore bookmarks, they would be stored into the DB.  When it takes 3 or 4 seconds to restore all bookmarks, the high disk usage is not very noticeable when writing to DB.  However, it would a different story when restoring process takes 250 seconds as writing to DB happens across the whole period IMO.  

For too much recursion, it happens when just changing |yield PlacesUtils.setCharsetForURI| to |PlacesUtils.annotations.setPageAnnotation| and keeping BI_importJSONNode() async with |Task.spawn| see attachment 765891 [details] [diff] [review].

The error console displays few places with too much recursion error.

In BI_importJSONNode
> case PlacesUtils.TYPE_X_MOZ_PLACE_SEPARATOR:
>   id = PlacesUtils.bookmarks.insertSeparator(aContainer, aIndex);

In BI_importJSONNode
> if (aData.dateAdded)
>   PlacesUtils.bookmarks.setItemDateAdded(id, aData.dateAdded);
> if (aData.lastModified)
>   PlacesUtils.bookmarks.setItemLastModified(id, aData.lastModified);

TS_onItemAdded() in nsTaggingService.js
The too much recursion error could be solved by moving to the new Promise.jsm module that is resource://gre/modules/Promise.jsm, could you please check if that makes any difference both for the recursion error and the time it takes to restore.
Notice however that Task.jsm is not yet using the new module, so the issue may be there.
Flags: needinfo?(mak77)
(In reply to Marco Bonardo [:mak] from comment #16)
> The too much recursion error could be solved by moving to the new
> Promise.jsm module that is resource://gre/modules/Promise.jsm, could you
> please check if that makes any difference both for the recursion error and
> the time it takes to restore.
> Notice however that Task.jsm is not yet using the new module, so the issue
> may be there.

I have switched to use resource://gre/modules/Promise.jsm for BookmarkJSONUtils.jsm, PlacesUtils.jsm and Task.jsm.  The too much recursion error doesn't exist anymore but it still takes around 240 seconds to restore the same bookmarks file, around 10 seconds faster.
:mak, are we waiting to get an updated patch from :raymondlee here based on your feedback in comment #16 ?
no, the blame is on me finding time to do the actual testing and see if there's a good solution, otherwise we'll just workaround for now, as suggested by Raymond. I will have an answer shortly.
So, looks like the problem is that in importFromJSON we have a runInBatchMode, this is a call that allows us to do a big bunch of bookmarks changes with minimum cost.
By adding tasks we are doing this work asynchronously but we don't have the concept of an asynchronous batch.
So in the end everything ends up happening outside of the batch, and thus it's crazily slow.
I have a patch that makes us skip a tick at each recursive call, so that the ui is responsive, but still the disk churns like crazy due to the missing database transaction.

It's not a problem we can solve properly in Aurora so I suggest filing a separate bug about it and we'll think about a solution for Nightly separately. We could for example collect all of the needed information before actually doing the inserts in a batch.

For now we'll take Raymond's patch, though the external importJSONNode API (BJU_importJSONNode) is expected to return a promise and with the patch it's not returning it, so please return a Promise.resolve() from it, so the external API is satisfied, comment above it explaining why we can't make importJSONNode a task for now, and point the comment to the new bug filed for figuring how to batch async changes.
Comment on attachment 766524 [details] [diff] [review]
Temp fix v1

Review of attachment 766524 [details] [diff] [review]:
-----------------------------------------------------------------

r- cause it breaks the external API, see previous comment.
Though, the approach will be fine for now.
Attachment #766524 - Flags: review?(mak77) → review-
Attached patch v1 (obsolete) — Splinter Review
Filed 890203
Attachment #766524 - Attachment is obsolete: true
Attachment #771246 - Flags: review?(mak77)
Comment on attachment 771246 [details] [diff] [review]
v1

Review of attachment 771246 [details] [diff] [review]:
-----------------------------------------------------------------

I would have appreciated a diff -w, but I couldn't find anything wrong with it.
Attachment #771246 - Flags: review?(mak77) → review+
Attached patch v1 -w (obsolete) — Splinter Review
Please check this -w diff
Flags: needinfo?(mak77)
the diff -w is quite better :) Everything looks fine, thanks!
Flags: needinfo?(mak77)
Attachment #771246 - Attachment is obsolete: true
Attachment #771526 - Attachment is obsolete: true
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/0eb6d5bccc85

A reminder, a good commit message tells you what the patch is actually doing, not just restating the bug title. Please keep this in mind for the future.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/0eb6d5bccc85
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Comment on attachment 771586 [details] [diff] [review]
Patch for checkin

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 865555
User impact if declined: bookmarks restore from JSON takes many minutes instead of seconds, when it happens at startup (due to places.sqlite corruption and consequent database replacement) it may completely block the browser startup.
Testing completed (on m-c, etc.): m-c
Risk to taking this patch (and alternatives if risky): most of the patch is whitespace changes, the change itself is safe enough for an uplift
String or IDL/UUID changes made by this patch: none
Attachment #771586 - Flags: approval-mozilla-aurora?
Attachment #771586 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Alice, can you please verify this is fixed in the latest Beta and Aurora builds?
Flags: needinfo?(alice0775)
I cannot repeoduce the problem.

http://hg.mozilla.org/releases/mozilla-beta/rev/260fcd9e23a3
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0 ID:20130909203154
http://hg.mozilla.org/releases/mozilla-aurora/rev/79caa25bdfaf
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0 ID:20130910004002
Status: RESOLVED → VERIFIED
Flags: needinfo?(alice0775)
You need to log in before you can comment on or make changes to this bug.