Closed Bug 381898 Opened 13 years ago Closed 13 years ago

history menu does not match the history sidebar when grouped by last visited (because of inner content / TRANSITION_EMBED visits)

Categories

(Firefox :: Bookmarks & History, defect)

defect
Not set

Tracking

()

RESOLVED DUPLICATE of bug 375777
Firefox 3 alpha8

People

(Reporter: moco, Assigned: moco)

References

Details

(Keywords: regression)

Attachments

(3 obsolete files)

history menu does not match the history sidebar

expected results:  the history menu should be the top ten items from the history sidebar when sorted by last visited.

actually results:  they are not the same.

(note, if you try to reproduce this, watch out for bug #381896)

this looks like a regression from bug #372025:  "Duplicates in the history menu are not collapsed"

see http://lxr.mozilla.org/seamonkey/source/browser/base/content/browser-menubar.inc#334

we went from:
place="place:beginTime=-2592000000000&beginTimeRef=1&endTime=7200000000&endTimeRef=2&type=1&sort=4&maxResults=10">

to:
place="place:beginTime=-2592000000000&beginTimeRef=1&endTime=7200000000&endTimeRef=2&type=0&sort=4&maxResults=10">

one difference between switching from RESULTS_AS_VISIT to RESULTS_AS_URI is we no longer excluding TRANSITION_EMBED visits (see nsNavHistory::GetQueryResults()).

There may be other differences in the results as well.

note, in the history sidebar, I believe we collapse duplicates in the tree view.
I am unable to reproduce this problem.  curses, I should have debugged it when I saw it happening.

before the fix for bug #372025, our query was:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, 
v.visit_date, 
f.url, v.session, null 
FROM moz_places h 
JOIN moz_historyvisits v ON h.id = v.place_id 
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id 
WHERE (h.hidden <> 1 AND v.visit_type <> 4 )

after, it is:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, 
(SELECT MAX(visit_date) FROM moz_historyvisits WHERE place_id = h.id), 
f.url, null, null 
FROM moz_places h 
LEFT OUTER JOIN moz_historyvisits v ON h.id = v.place_id 
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE (h.hidden <> 1 )
GROUP BY h.id ORDER BY 6 DESC LIMIT 10

I've used SQLite Database browser on my places.sqlite to see if I have any visits that are of type TRANSITION_EMBED that are not hidden, but I don't have any.

I think for queries of type QUERY_TYPE_HISTORY, we should add v.visit_type <> 4 to our conditions.

I still think this is a legit bug, but I still haven't seen it.

but I still think this is a good idea to do this:

> I think for queries of type QUERY_TYPE_HISTORY, we should add v.visit_type <> 4
> to our conditions.
Flags: blocking-firefox3?
OS: Windows XP → All
Hardware: PC → All
Based on comment #2 clearing the nomination, Seth, please renom if you think that's wrong of us.
Flags: blocking-firefox3?
good news, this is a legit bug, and ispike's places.sqlite has a real world example.

details coming.
escuse me Seth, what is the pro of having a SELECT between selected fields as in:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, 
===> (SELECT MAX(visit_date) FROM moz_historyvisits WHERE place_id = h.id), 
f.url, null, null 

The resulting table is

id | url | title | host | visit_count | SELECT MAX(BLABLABLA | url | null | null

and SELECT MAX(BLABLABLA is always an empty field...

should not that be a simple MAX(v.visit_date) grouped by h.id? I ask you because i have seen it in different files, but i don't understand it

I'd like also to help you in benchmarking and optimizing queries but i have not a large enough db to try, so if you have one big places.sqlite could you share it on web for tests?
i got the answer by myself, the only valid request is for a big profile/places.sqlite for benchmarking

sorry for bug spam
however you can avoid the SELECT in the selection fields cause you already join on historyvisists so already have a max(visit_date) grouping on h.id. i get the same results:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, 
MAX(visit_date), f.url, null, null 
FROM moz_places h 
LEFT OUTER JOIN moz_historyvisits v ON h.id = v.place_id 
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE (h.hidden <> 1 )
GROUP BY h.id ORDER BY 6 DESC LIMIT 10
better:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, 
MAX(visit_date) AS max_visit_date, f.url, null, null 
FROM moz_places h 
LEFT OUTER JOIN moz_historyvisits v ON h.id = v.place_id 
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE h.hidden <> 1
GROUP BY h.id ORDER BY max_visit_date DESC LIMIT 10
Marco, thanks for looking into this.

We have two problems with the existing history menu query:

1) we are not ignoring items that are "inner content" visits (visit_type == 4, or TRANSITION_EMBED see nsINavHistoryService.idl)

2) it is slow

Your query in comment #7 (and #8) is much faster than the existing code, but it still has problems with TRANSITION_EMBED visits.

For the history menu, here what I've got (currently part of bug #385397):

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, v.visit_date, f.url, null, null
FROM moz_places h 
JOIN moz_historyvisits v ON h.id = v.place_id 
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id WHERE 
(h.id in (select distinct h.id from moz_historyvisits, moz_places h where place_id = 
h.id and hidden <> 1 and visit_type <> 4 order by visit_date desc limit 10)) GROUP BY h.id order by v.visit_date desc

Even if I modify your WHERE clause to "WHERE (h.hidden <> 1 and visit_type <> 4)", the results of my query and your differ because of the TRANSITION_EMBED visits.

In addition to fixing the history menu query, we need to fix the query for the history sidebar when grouped by last visited.

For last visited, we want RESULTS_AS_URI not RESULTS_AS_VISIT, and that will be tracked in bug #385245
Assignee: nobody → sspitzer
Summary: history menu does not match the history sidebar → history menu does not match the history sidebar when grouped by last visited (because of inner content)
i've tried to modify my query as 

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, MAX(visit_date) AS visit_date, f.url, null, null
FROM moz_places h
JOIN moz_historyvisits v ON h.id = v.place_id
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE h.hidden <> 1 AND visit_type <> 4
GROUP BY h.id 
ORDER BY visit_date DESC 
LIMIT 10

but i don't get any TRANSITION_EMBED... maybe my db is not complete enough. i have also tried to manual edit visit_type field to catch the problem, but no luck, it does not get visit_type = 4 rows, do you have a test case to try?

however let's see some alternative... i use your last SQL (from comment #9) as a starting point, so i hope it takes the right results :)

imho what you are doing is really this: you want to join your results only against non hidden and visit_type <> 4, so the query could be this way:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, v.visit_date, f.url, null, null
FROM moz_places h 
JOIN
(
  select place_id, max(visit_date) as visit_date
  from moz_historyvisits
  LEFT OUTER JOIN moz_places h ON place_id = h.id
  WHERE hidden <> 1 and visit_type <> 4
  GROUP BY place_id
  order by visit_date desc
  limit 10
) v ON h.id = v.place_id
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
order by v.visit_date desc

and this is (a little) better performing and more readable... 
but really here the point is that it is a JOIN with a VIEW, but the view does not exists (we create it on the fly with sql subselect) so it should perform better if you create a view in the database with the CREATE VIEW statement... than you could JOIN directly against the VIEW!
In this way probably the schema should be revamped inserting some useul VIEWs to simplify and speed up joins...

BUT:
I have tried to see your query from the other end. I suppose that for history you want historyvisits, and not places results, so you use JOIN to get results that exist in both tables... so let's invert the query, ask for results from historyvisits (should have less items than places, isn't it?) and join against places:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, MAX(v.visit_date) as visit_date, f.url, null, null
FROM moz_historyvisits v
JOIN moz_places h ON h.id = v.place_id
LEFT OUTER JOIN moz_favicons f ON  f.id = h.favicon_id
WHERE h.hidden <> 1 AND v.visit_type <> 4
GROUP BY v.place_id
ORDER BY visit_date DESC
LIMIT 10

it's similar to the first query, on my DB this gives me half time to execute against "virtual view" query and little advantage against first query.
I get same results from the three posted queries, please send me a DB test case if you obtain different results so i can test better... 
I cannot test against a big places.sqlite (don't have one big enough) but it could be a good starting point, maybe the result could be a mix between this and the view paradigm.
Marco, thanks for all the help on these issues!  

Your queries work, but only if I make a slight change:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, MAX(visit_date) AS
visit_date_2, f.url, null, null
FROM moz_places h
JOIN moz_historyvisits v ON h.id = v.place_id
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE h.hidden <> 1 AND visit_type <> 4
GROUP BY h.id 
ORDER BY visit_date_2 DESC 
LIMIT 10

or, following what the current code does:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, MAX(visit_date), f.url, null, null
FROM moz_places h
JOIN moz_historyvisits v ON h.id = v.place_id
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE h.hidden <> 1 AND visit_type <> 4
GROUP BY h.id 
ORDER BY 6 DESC 
LIMIT 10

Here is a modified version of your other query:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, visit_date, f.url,
null, null
FROM moz_places h 
JOIN
(
  select place_id, max(visit_date) as visit_date
  from moz_historyvisits
  LEFT OUTER JOIN moz_places h ON place_id = h.id
  WHERE hidden <> 1 and visit_type <> 4
  GROUP BY place_id
  order by 2 desc
  limit 10
) v ON h.id = v.place_id
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
order by 6 desc

Ordering by "visit_date" is what causes the problem for me.

For the same reason, my query in comment #9 is incorrect.  For the history menu, here is the corrected version:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, MAX(v.visit_date), f.url,
null, null
FROM moz_places h 
JOIN moz_historyvisits v ON h.id = v.place_id 
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id WHERE 
(h.id in (select distinct h.id from moz_historyvisits, moz_places h where
place_id = 
h.id and hidden <> 1 and visit_type <> 4 order by visit_date desc limit 10))
GROUP BY h.id order by 6 desc

This query (for the history meni) is much faster against my large places.sqlite than your suggested queries.

For our "last visited" history sidebar query, we want something that will match the output of:

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, MAX(visit_date), f.url, null, null
FROM moz_places h
JOIN moz_historyvisits v ON h.id = v.place_id
LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE h.hidden <> 1 AND visit_type <> 4
GROUP BY h.id 
ORDER BY 6 DESC 

Let's continue to work on ways to optimize that (perhaps using CREATE VIEW as you suggest.)

Note, to determine correctness of a query, I compare the results to the following query after I manually group by id.

SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, v.visit_date, null, null
FROM moz_places h, moz_historyvisits v
WHERE h.id = v.place_id and v.visit_type <> 4 and h.hidden <> 1
order by 6 desc

I'll follow up with answers to your other questions / comments.
Status: NEW → ASSIGNED
> ask for results from historyvisits (should have less items than places, isn't it?)

I think it is the reverse.  For example, imagine a profile where you visit www.google.com to search.  That will end up with many more visits than places.

In my real world "big" places.sqlite, I have 300k visits and about 50k places.
> please send me a DB test case if you obtain different results so i can test better... 

I'm going to have to ask permission before I share the places.sqlite file I am testing with.
> should not that be a simple MAX(v.visit_date) grouped by h.id?

to answer your earlier question, yes, I think you are right here.

updated work-in-progress patch coming...
Attached patch update wip patch (obsolete) — Splinter Review
Attachment #270616 - Attachment is obsolete: true
Attached patch updated patch (obsolete) — Splinter Review
Attachment #272071 - Attachment is obsolete: true
moving out to m8
Target Milestone: --- → Firefox 3 M8
Summary: history menu does not match the history sidebar when grouped by last visited (because of inner content) → history menu does not match the history sidebar when grouped by last visited (because of inner content / TRANSITION_EMBED visits)
Comment on attachment 273191 [details] [diff] [review]
updated patch

the fix for this bug (the part ot nsNavHistory.cpp) is part of bug #389782.

the optimization work has been moved out to bug #385245
Attachment #273191 - Attachment is obsolete: true
the fix for this ended up being done as part of bug #375777.  marking it as a duplicate.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Depends on: 375777
No longer depends on: 389782
Resolution: --- → DUPLICATE
Duplicate of bug: 375777
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
You need to log in before you can comment on or make changes to this bug.