Closed Bug 327330 Opened 18 years ago Closed 18 years ago

Speed up mork history import

Tracking

()

Status:

RESOLVED FIXED

Milestone:

Firefox 2 alpha2

People

(Reporter: bryner, Assigned: bryner)

Details

(Keywords: perf)

Attachments

(1 file, 1 obsolete file)

many improvements 18 years ago Brian Ryner (not reading) 41.44 KB, patch	brettw : review+	Details \| Diff \| Splinter Review
better patch 18 years ago Brian Ryner (not reading) 43.47 KB, patch	brettw : review+	Details \| Diff \| Splinter Review

Brian Ryner (not reading)

Assignee

Description

•

18 years ago

Currently when importing history entries, we call setPageDetails() and then addVisit() to add a visit on the last visit date.  This does more queries than we really need to since in both cases it checks whether the page already exists in history.  We could combine these into a single (non-interface) method on nsNavHistory and reduce the amount of querying.

Ben Goodger (use ben at mozilla dot org for email)

Updated

•

18 years ago

Priority: -- → P3

Target Milestone: --- → Firefox 2 alpha2

Brian Ryner (not reading)

Assignee

Comment 1

•

18 years ago

taking and resummarizing

Assignee: brettw → bryner

Summary: Optimize database access when importing history entries → Speed up mork history import

Brian Ryner (not reading)

Assignee

Comment 2

•

18 years ago

Attached patch many improvements (obsolete) — Details — Splinter Review

The changes here are somewhat extensive.  Here are the highlights:

nsMorkReader:
- Rather than keeping a per-row hashtable of column id to value, I'm now just keeping a nsTArray<nsCString> per row, with the array indexes corresponding to a global column array.  This reduces hashtable operations significantly.

- Atom map _keys_ (not values) now use a 9-byte string buffer built into the key, rather than a heap-allocated string (8 bytes for the string representation of a 32-bit key, plus a byte for the terminating null).  This reduces heap allocations significantly for parsing the atom map.  It does increase the amount of memcpy work when we grow the atom table, however, so one optimization we could do if we need to get more speed is to come up with a heuristic for initially sizing the atom table based on the size of the file.

- Don't MorkUnescape column ids which are atom keys (in practice that's all column ids)

- Consider a colon as a terminator for a row id.  Not doing this was causing us to treat table edits as new rows, probably causing duplicate entries and lost data.

nsNavHistory:
- Added new method AddPageWithVisit, which adds a new entry to history (without checking to see whether it's already there) and optionally adds a visit, doing everything with just 2 statement executions (and the statements cached).

nsMorkHistoryImporter:
- Use AddPageWithVisit, and update for nsMorkReader changes

nsStorageFormHistory:
- Update for nsMorkReader changes

Attachment #213408 - Flags: review?(brettw)

Brett Wilson

Comment 3

•

18 years ago

Comment on attachment 213408 [details] [diff] [review]
many improvements

I'm a little uncomfortable with not checking URLs to see if they already exist. However, from my timing experiments we can only check 5000 URLs for existance per second on my powerbook. My history contains 10K items, so this isn't really fast enough. You should be sure the code can not call update for the same item more than once, because we'll get really confused if this happens. Also, we should be sure that this happens before bookmark or default_places import since those will both create history items.

Attachment #213408 - Flags: review?(brettw) → review+

Brian Ryner (not reading)

Assignee

Comment 4

•

18 years ago

Attached patch better patch — Details — Splinter Review

implements the post-import duplicate pruning we talked about

Attachment #213408 - Attachment is obsolete: true

Attachment #213521 - Flags: review?(brettw)

Brett Wilson

Comment 5

•

18 years ago

Comment on attachment 213521 [details] [diff] [review]
better patch


>+nsNavHistory::InternalAddNewPage(nsIURI* aURI, const nsAString& aTitle,
>+                                 PRBool aHidden, PRBool aTyped,
>                                  PRInt32 aVisitCount, PRInt64* aPageID)
> {
>   mozStorageStatementScoper scoper(mDBAddNewPage);
>   nsresult rv = BindStatementURI(mDBAddNewPage, 0, aURI);
>   NS_ENSURE_SUCCESS(rv, rv);
> 
>+  // title
>+  rv = mDBAddNewPage->BindStringParameter(1, aTitle);

Check for isNull and use BindNullParameter(1) if so.

Attachment #213521 - Flags: review?(brettw) → review+

Brian Ryner (not reading)

Assignee

Comment 6

•

18 years ago

checked in, trunk and branch

Status: NEW → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 7

•

18 years ago

Looks like you forgot to check in the nsMorkReader.* changes, and it broke balsa.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 8

•

18 years ago

Oh, db/morkreader isn't in the tinderbox module...

Nick Thomas [:nthomas] (UTC+12)

Comment 9

•

18 years ago

biesi checked in a fix for balsa:

Index: mozilla/db/morkreader/nsMorkReader.cpp
===================================================================
RCS file: /cvsroot/mozilla/db/morkreader/nsMorkReader.cpp,v
retrieving revision 1.3
retrieving revision 1.4
diff -r1.3 -r1.4
564c564
<   nsAutoPtr< nsTArray<nsCString> > array = new nsTArray<nsCString>(aCount);
---
>   nsAutoPtr< nsTArray<nsCString> > array(new nsTArray<nsCString>(aCount));

Didn't land on the 1.8 branch yet.

(dbaron fixed the tinderbox module in bug 328979)

Brett Wilson

Comment 10

•

18 years ago

Nick's fix is now on the branch.

Gervase Markham [:gerv]

Comment 11

•

15 years ago

Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv

Component: Places → Bookmarks & History

QA Contact: places → bookmarks

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Speed up mork history import

Categories

(Firefox :: Bookmarks & History, defect, P3)

Tracking

()

People

(Reporter: bryner, Assigned: bryner)

References

Details

(Keywords: perf)

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)

Description

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Attachment

General

Description

File Name

Content Type