Closed Bug 500391 Opened 15 years ago Closed 15 years ago

When filtering results on search avoid querying tags and parent for each result

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla1.9.3a1

Tracking Flags:

Tracking

Status

status1.9.2

---

beta5-fixed

People

(Reporter: mak, Assigned: mak)

References

Details

(Keywords: perf, Whiteboard: [TSnappiness])

Attachments

(2 files, 4 obsolete files)

wip 0.1 15 years ago Marco Bonardo [:mak] 36.22 KB, patch		Details \| Diff \| Splinter Review
wip 0.2 15 years ago Marco Bonardo [:mak] 53.58 KB, patch		Details \| Diff \| Splinter Review
patch v1.0 15 years ago Marco Bonardo [:mak] 54.58 KB, patch	asaf : review-	Details \| Diff \| Splinter Review
patch v1.1 15 years ago Marco Bonardo [:mak] 56.59 KB, patch	asaf : review+	Details \| Diff \| Splinter Review
patch v1.2 15 years ago Marco Bonardo [:mak] 66.32 KB, patch		Details \| Diff \| Splinter Review
patch for 1.9.2 15 years ago Marco Bonardo [:mak] 46.30 KB, patch	beltzner : approval1.9.2+	Details \| Diff \| Splinter Review

Marco Bonardo [:mak]

Assignee

Description

•

15 years ago

Actually when filtering results on a query with searchTerms, for every node we examine we get the tags with a query and then we filter.
This means that if a query has to pass through 1000 nodes, we will execute 1000 queries...
This is obviously crazy and slow, you can see the result trying to search in Library or Sidebar with NSPR_logging enabled on storage.

In this specific case we should instead try to query for tags in the original query, so that we will get them directly from the node. see bug 486603 for further informations on why we can't do that always.

Fixing this should provide a largely better experience when using Places search fields.

Marco Bonardo [:mak]

Assignee

Updated

•

15 years ago

Whiteboard: [TSnappiness]

Marco Bonardo [:mak]

Assignee

Comment 1

•

15 years ago

And we also get the parent folder for each bookmark returned by a query...

Marco Bonardo [:mak]

Assignee

Updated

•

15 years ago

Summary: When filtering results on search avoid querying tags for each result → When filtering results on search avoid querying tags and parent for each result

Marco Bonardo [:mak]

Assignee

Comment 2

•

15 years ago

Attached patch wip 0.1 (obsolete) — Details — Splinter Review

passes tests, and should be faster, i can search in a debug build, so i suppose optimized builds will be good. I'll create some trybuild to share testing. I'll also need to measure perf when executing queries in the sidebar.

Marco Bonardo [:mak]

Assignee

Comment 3

•

15 years ago

searching seem to be faster, in debug builds it was taking about 20s, now it takes about 10s (as usual on a large db with lot of history). So this is improving.
builds will appear here:
https://build.mozilla.org/tryserver-builds/mak77@bonardo.net-try-24803e3c1131/

Marco Bonardo [:mak]

Assignee

Comment 4

•

15 years ago

Attached patch wip 0.2 (obsolete) — Details — Splinter Review

Includes an additional experimental change to the treeview.

I've taken a look around, our trees are damn slow due to the fact we hold a visibleElements list, and we populate and update that list on invalidateContainer. that means we have to getChild() each node from the resultNode, and xpcom is expensive when you getChild thousands of nodes. So i've practically moved most of the buildVisibleSection code to a new method of the container. This is a bit hackish, but i want to see how it performs, from my measurements it takes about less than half of the time.

Attachment #385887 - Attachment is obsolete: true

Marco Bonardo [:mak]

Assignee

Comment 5

•

15 years ago

To put in some number, on a debug build with large history i can open the history sidebar by last visited in 6 seconds, vs more than 20s it was taking before. wip 0.2 is actually building on tryserver.

Marco Bonardo [:mak]

Assignee

Comment 6

•

15 years ago

these have a small bug that won't allow to use uriIsPrefix queries, but should be good for testing https://build.mozilla.org/tryserver-builds/mak77@bonardo.net-try-10b834f019d3/

Marco Bonardo [:mak]

Assignee

Comment 7

•

15 years ago

So with latest patch i measured time for a search in history sidebar on a pretty large DB, without patch it took about 144s, with patch 43s... still a lot but less than a half is a good first attack. notice without any search the sidebar takes about 41s, so the search term overhead is highly reduced.

Marco Bonardo [:mak]

Assignee

Comment 8

•

15 years ago

Attached patch patch v1.0 (obsolete) — Details — Splinter Review

i forgot to tell that previous times are taken with a debug build, an optimized one will most likely take a really shorter time, really likely less than 10s.
Btw we can measure this as about a 70% improvement on search times.

if we could rewrite the query builder this code would really benefit for maintainability

Attachment #386330 - Attachment is obsolete: true

Attachment #403389 - Flags: review?(mano)

Asaf Romano (gone)

Comment 9

•

15 years ago

Comment on attachment 403389 [details] [diff] [review]
patch v1.0

Nice work, even more nice gains.

>   // This is a LEFT OUTER JOIN with moz_places since folders does not have
>   // a reference into that table.
>   rv = mDBConn->CreateStatement(NS_LITERAL_CSTRING(
>       "SELECT h.id, h.url, COALESCE(b.title, h.title), "
>         "h.rev_host, h.visit_count, h.last_visit_date, f.url, null, b.id, "
>-        "b.dateAdded, b.lastModified, b.position, b.type, b.fk, b.folder_type "
>+        "b.dateAdded, b.lastModified, b.parent, null, b.position, b.type, "
>+        "b.fk, b.folder_type "
>       "FROM moz_bookmarks b "
>       "JOIN moz_places_temp h ON b.fk = h.id "
>       "LEFT JOIN moz_favicons f ON h.favicon_id = f.id "
>       "WHERE b.parent = ?1 "
>       "UNION ALL "
>       "SELECT h.id, h.url, COALESCE(b.title, h.title), "
>         "h.rev_host, h.visit_count, h.last_visit_date, f.url, null, b.id, "
>-        "b.dateAdded, b.lastModified, b.position, b.type, b.fk, b.folder_type "
>+        "b.dateAdded, b.lastModified, b.parent, null, b.position, b.type, "
>+        "b.fk, b.folder_type "

We cannot add mFolderId just for some nodes, please update all queries.

Remember to special case the places root as you do that.

>+    nsAutoString tags;
>+    rv = aRow->GetString(kGetInfoIndex_ItemTags, tags);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+    if (tags.Length())
>+      (*aResult)->mTags.Assign(tags);
>+

We cannot do that, mTags needs to be comma-separated.


>   PRInt64 session = aRow->AsInt64(kGetInfoIndex_SessionId);
> 
>   if (aOptions->ResultType() == nsNavHistoryQueryOptions::RESULTS_AS_VISIT) {
>     *aResult = new nsNavHistoryVisitResultNode(url, title, accessCount, time,
>                                                favicon, session);
>     if (! *aResult)
>       return NS_ERROR_OUT_OF_MEMORY;
>+
>+    nsAutoString tags;
>+    rv = aRow->GetString(kGetInfoIndex_ItemTags, tags);
>+    if (tags.Length())
>+      (*aResult)->mTags.Assign(tags);
>+

ditto

Attachment #403389 - Flags: review?(mano) → review-

Marco Bonardo [:mak]

Assignee

Comment 10

•

15 years ago

Attached patch patch v1.1 (obsolete) — Details — Splinter Review

Good catches, i recall i had found some problem with those commas, but having passed some time i forgot to check :( i should have annotated that in the bug at that time.
So i investigated why we were using comma somewhere and space somewhere else, and the answer was "no reason", i also asked dietrich to be sure since he introduced that binded param, and he thinks is fine to use a comma everywhere.

- fixed comma/space issue
- avoid setting mFolderId if parent value is <= 0
- updated mDBBookmarkToUrlResult to set parent value

i've not found other queries in need of an update, btw i'll take a second look while you review this updated patch.

Attachment #403389 - Attachment is obsolete: true

Attachment #403805 - Flags: review?(mano)

Asaf Romano (gone)

Comment 11

•

15 years ago

Comment on attachment 403805 [details] [diff] [review]
patch v1.1


> PlacesSQLQueryBuilder::SelectAsURI()
> {
>+  nsNavHistory* history;
>+  history = nsNavHistory::GetHistoryService();

nit: assign on decalre:


> nsresult
> PlacesSQLQueryBuilder::SelectAsVisit()
> {
>+  nsNavHistory* history;
>+  history = nsNavHistory::GetHistoryService();

ditto.


>   // URI
>-  if (NS_SUCCEEDED(aQuery->GetHasUri(&hasIt)) && hasIt)
>+  if (NS_SUCCEEDED(aQuery->GetHasUri(&hasIt)) && hasIt) {
>     BindStatementURI(statement, index.For("uri"), aQuery->Uri());
>+    if (aQuery->UriIsPrefix()) {
>+      nsCAutoString uriString;
>+      aQuery->Uri()->GetSpec(uriString);

At first, I though it would crash if  if uriisprefix is set, but uri itself isn't, but then I saw the hasIt check. I think we should rather check aQuery->Uri() directly (actually, please file a bug to remove these hasIt* methods altogether).
 

>   nsTArray<nsTArray<nsString>*> terms;
>   ParseSearchTermsFromQueries(aQueries, &terms);
> 
>-  PRInt32 queryIndex;
>   PRUint16 resultType = aOptions->ResultType();

Please move |resultType| closer to its usage.

> 
>   // The includeFolders array for each query is initialized with its
>   // query's folders array. We add sub-folders as we check items.
>   nsTArray< nsTArray<PRInt64>* > includeFolders;
>   nsTArray< nsTArray<PRInt64>* > excludeFolders;
>-  for (queryIndex = 0;
>+  for (PRInt32 queryIndex = 0;

nit: PRUin32


> 
>-  for (PRInt32 nodeIndex = 0; nodeIndex < aSet.Count(); nodeIndex ++) 
...
>+  for (PRInt32 nodeIndex = 0; nodeIndex < aSet.Count(); nodeIndex++)
...
>-  // de-allocate the matrixes
>+  // De-allocate the temporary matrixes.
>   for (PRInt32 i = 0; i < aQueries.Count(); i++) {

ditto, while your'e at it.

>+    // Find all the folders with the annotation we are excluding and save their
>+    // item ids.  When doing filtering, if a result's parent item id matches one
>+    // of the saved item ids, the result will be excluded.

nit: result's parent's or  " ... of ..."


>-  PRBool isNull;
>-  rv = aRow->GetIsNull(kGetInfoIndex_ItemId, &isNull);
>-  NS_ENSURE_SUCCESS(rv, rv);
>-  if (!isNull) {
>-    itemId = aRow->AsInt64(kGetInfoIndex_ItemId);
>-  }
>+  PRInt64 itemId = aRow->AsInt64(kGetInfoIndex_ItemId);
>+  PRInt64 parentId = -1;
>+  if (itemId > 0)
>+    parentId = aRow->AsInt64(kGetInfoIndex_ItemParentId);
>+  else
>+    itemId = -1;
> 

itemId could actually be 0, i think (places root?). 
Also, this code path is a little confusing.  Please add some comments.



>       (*aResult)->mItemId = itemId;
>+      // Don't set mFolderId for Places root node since it would be 0, and we
>+      // could rely on a not existant parent.
>+      if (parentId > 0)

Sigh, CreateRoot(..placesroot..) should've passed "null" rather 0 :(


>+
>+    nsAutoString tags;
>+    rv = aRow->GetString(kGetInfoIndex_ItemTags, tags);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+    if (tags.Length())
>+      (*aResult)->mTags.Assign(tags);
>+

Shouldn't you use IsVoid? most URIs don't have tags set and thus 0 length is valid.

>+    nsAutoString tags;
>+    rv = aRow->GetString(kGetInfoIndex_ItemTags, tags);
>+    if (tags.Length())
>+      (*aResult)->mTags.Assign(tags);
>+

ditto.

r=mano otherwise.

Attachment #403805 - Flags: review?(mano) → review+

Marco Bonardo [:mak]

Assignee

Comment 12

•

15 years ago

(In reply to comment #11)
> >   // URI
> >-  if (NS_SUCCEEDED(aQuery->GetHasUri(&hasIt)) && hasIt)
> >+  if (NS_SUCCEEDED(aQuery->GetHasUri(&hasIt)) && hasIt) {
> >     BindStatementURI(statement, index.For("uri"), aQuery->Uri());
> >+    if (aQuery->UriIsPrefix()) {
> >+      nsCAutoString uriString;
> >+      aQuery->Uri()->GetSpec(uriString);
> 
> At first, I though it would crash if  if uriisprefix is set, but uri itself
> isn't, but then I saw the hasIt check. I think we should rather check
> aQuery->Uri() directly (actually, please file a bug to remove these hasIt*
> methods altogether).

for the follow-up, do you mean to remove all GetHas methods, or just hasIt checks?


> >   // The includeFolders array for each query is initialized with its
> >   // query's folders array. We add sub-folders as we check items.
> >   nsTArray< nsTArray<PRInt64>* > includeFolders;
> >   nsTArray< nsTArray<PRInt64>* > excludeFolders;
> >-  for (queryIndex = 0;
> >+  for (PRInt32 queryIndex = 0;
> 
> nit: PRUin32

That would then require to cast nsCOMArray.Count() everywhere(PRInt32 Count() const) or it would pollute output with bunch of compiler warnings.
Not sure why arrays count has been defined signed, but i won't change that here since does not make a difference for us.

> >+  PRInt64 itemId = aRow->AsInt64(kGetInfoIndex_ItemId);
> >+  PRInt64 parentId = -1;
> >+  if (itemId > 0)
> >+    parentId = aRow->AsInt64(kGetInfoIndex_ItemParentId);
> >+  else
> >+    itemId = -1;
> > 
> 
> itemId could actually be 0, i think (places root?). 
> Also, this code path is a little confusing.  Please add some comments.

table ids start from 1, so no itemId can ever be 0, and that's enough to check. i modified this code path to make it more clear and added comments though.

> >+      // could rely on a not existant parent.
> >+      if (parentId > 0)
> 
> Sigh, CreateRoot(..placesroot..) should've passed "null" rather 0 :(

doesn't change a lot, when you call asInt64 null would be converted to 0, you have to explicitly use GetIsNull... btw i've removed this check since i just set parentId = -1 above.

> >+    nsAutoString tags;
> >+    rv = aRow->GetString(kGetInfoIndex_ItemTags, tags);
> >+    NS_ENSURE_SUCCESS(rv, rv);
> >+    if (tags.Length())
> >+      (*aResult)->mTags.Assign(tags);
> >+
> 
> Shouldn't you use IsVoid? most URIs don't have tags set and thus 0 length is
> valid.

Not sure what that would bring us, mTags is initialized to a void string in the result code, how can 0 length be valid to overwrite that initialization? If the db string has 0 length we don't have tags, then there is no reason to change the void initialized value in the node.

Marco Bonardo [:mak]

Assignee

Comment 13

•

15 years ago

Attached patch patch v1.2 — Details — Splinter Review

fixed all comments based on IRC discussion with Mano.

Attachment #403805 - Attachment is obsolete: true

Marco Bonardo [:mak]

Assignee

Comment 14

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/98a42087c07b

the functionality is tested by previous existing unit tests. This is mostly performances in searching the Library or the sidebar (really minor improvements could appear in other points though).

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla1.9.3a1

Marco Bonardo [:mak]

Assignee

Comment 15

•

15 years ago

filed Bug 520608 about getting rid of getHasXXX methods.

Marco Bonardo [:mak]

Assignee

Comment 16

•

15 years ago

Comment on attachment 404577 [details] [diff] [review]
patch v1.2

This brings a 70% speed improvement in searches in sidebar and library. Would be pretty cool to have on 1.9.2

there are no compatibility changes and queries are already tested by our existing tests.

Attachment #404577 - Flags: approval1.9.2?

Marco Bonardo [:mak]

Assignee

Comment 17

•

15 years ago

Attached patch patch for 1.9.2 — Details — Splinter Review

this is the patch unbitrotted and ready to be pushed on 1.9.2 (with author and comment)

Mike Beltzner [:beltzner, not reading bugmail]

Comment 18

•

15 years ago

a192=beltzner, be ready to pull it out if it causes problems, though!

Mike Beltzner [:beltzner, not reading bugmail]

Updated

•

15 years ago

Attachment #415851 - Flags: approval1.9.2+

Mike Beltzner [:beltzner, not reading bugmail]

Updated

•

15 years ago

Attachment #404577 - Flags: approval1.9.2?

Dietrich Ayala (:dietrich)

Comment 19

•

15 years ago

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/9f471f8b2167

status1.9.2: --- → final-fixed

Marco Bonardo [:mak]

Assignee

Updated

•

14 years ago

Depends on: 536081

Marco Bonardo [:mak]

Assignee

Updated

•

14 years ago

Blocks: 491746

Marco Bonardo [:mak]

Assignee

Updated

•

14 years ago

Blocks: 490512

You need to log in before you can comment on or make changes to this bug.