Open
Bug 408991
Opened 17 years ago
Updated 2 years ago
Find a way to efficiently extract bookmarks paths
Categories
(Toolkit :: Places, enhancement, P3)
Toolkit
Places
Tracking
()
NEW
People
(Reporter: mak, Unassigned)
References
(Blocks 3 open bugs)
Details
(Whiteboard: [fxsearch])
This is something i already pointed out some time ago, current bookmarks implementation uses a parent-position relationship to build a n-level tree and this has some pro & cons:
pros:
- fast inserts
- compact table
cons:
- to find the grandparent of a child you have to use recursion
- to find all childrens of a folder (and his subfolders) you have to do complex loops
- to count all childrens of a folder (and his subfolders) you have to do complex loops
- you can end up with cycles
But for nested structures in an SQL database the best choice would be to use a nested set preordered tree, these are php/mysql article on the subject:
http://www.phpriot.com/articles/nested-trees-1/1
http://www.phpriot.com/articles/nested-trees-2
implementing this in other languages is not a problem since the main concept is sql based
table changes required would be adding 3 columns: level, left, right, and a index on them, but probably one could discard position
pros:
- you can always know if a node is child of another
- you find all childrens of a folder with a single query
- you count all childrens of a folder with a single query
- you easily move from one node to another
- you cannot create cycles
cons:
- less compact table (2 more columns)
- slow inserts: on every insert you have to update more than one node
Reporter | ||
Comment 1•16 years ago
|
||
an interesting document from mysql:
http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
Reporter | ||
Updated•15 years ago
|
Assignee: nobody → mak77
Reporter | ||
Updated•15 years ago
|
Summary: investigate the use of a nested tree for bookmarks table → Find a way to efficiently extract bookmarks paths (nested tree for bookmarks table?)
Comment 2•15 years ago
|
||
I would recommend trying to get a Labs' TestPilot study going to help determine just how nested our user's bookmarks hierarchies are.
The querys that will run to determine children and ancestors may not be as hindered as we think by recursion. MPTT requires some pretty hefty queries for writes, and in my experience tends to be quite fragile (with or without a comprehensive test suite).
Comment 3•15 years ago
|
||
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".
In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body contains places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.
Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.
Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Reporter | ||
Updated•15 years ago
|
Assignee: mak77 → nobody
Component: Bookmarks & History → Places
Product: Firefox → Toolkit
QA Contact: bookmarks → places
Reporter | ||
Comment 4•15 years ago
|
||
An alternative approach could be to save a cache of all the parents of each folder, in a separate closure table, this cache should be updated every time we move a folder in the hierarchy, then we should update the moved folder and all of its descendants folders.
At first glance it could be faster and require less disk space than a nested tree but it would have other disadvantages:
* harder to check coherence (with nested trees we can run some aritmetic to check if the values are out of sync)
* will only give us ancestors path, this is what we need atm, but a nested tree would also give us possibility to implement a richer bookmarks API
Nested trees are usually expensive, but we don't have so long hierarchies in bookmarks. If we can go for them (not too heavy perf loss) they are the best choice for our APIs, otherwise we can try this alternative implementation.
I was also thinking about a leaf-less nested tree implementation, where we only track folders, ignoring the fact they could have flat children, i can't tell offhand if it's easier to do (or doable at all but sounds possible), at first glance it could be faster to update but would lose some functionality (checking number of children, telling if a bookmark is descendant of a certain ancestor...). It sounds pretty similar to the closure table approach above, but should not have the coherence check problem.
So for now i'd go for the richer implementation, evaluate perf loss, and if it's not acceptable start looking into less functional implementations.
status1.9.1:
--- → wontfix
status1.9.2:
--- → wontfix
Summary: Find a way to efficiently extract bookmarks paths (nested tree for bookmarks table?) → Find a way to efficiently extract bookmarks paths
Target Milestone: Future → ---
Reporter | ||
Comment 5•15 years ago
|
||
(In reply to comment #2)
> The querys that will run to determine children and ancestors may not be as
> hindered as we think by recursion.
it will make use async damn hard, since recursion has to be done in code, so you have to query -> code -> query -> code. suppose you have 10 000 bookmarks to render and you want the path for each one of them.
Reporter | ||
Comment 6•15 years ago
|
||
unless you want to recurse at each change, and cache the value for each folder. but at this point updates will still be slow, and the stytem will still be fragile (plus we would miss a way to check coherence of the cached data). At that point i'd prefer the leaf-less implementation.
Comment 7•15 years ago
|
||
I do that when I have used MPTT, we tracked all nodes in a separate table called "hierarchy", thus making a distinction between data and hierarchy. This served us well, but we stil had to to a single join to scoop up the data:
table hierarchy
id, bookmark_id, sibling_index, has_children, depth, _right, _left, parent_id
The select: SELECT * FROM hierarchy WHERE _left BETWEEN 239 AND 278 - and of course JOIN the bookmarks
writes were expensive, but the update query can be written pretty optimized as you only have to update _left, _right and depth:
UPDATE hierarchy
SET _left = %d,
_right = %d,
depth = %d
WHERE id = %d
I think the MozStorage async api can take an array of statements and synchronously iterate. The tricky thing was always that the next update needed to operate on a row that was updated microseconds beforehand. We had some issues with postgres where this moved nodes around in random ways. The funny thing was that when did our initial testing with SQLite and it never exhibited the random bug. This screams "test driven developm,ent" of even the tinest private methods.:)
Comment 8•15 years ago
|
||
(In reply to comment #2)
> I would recommend trying to get a Labs' TestPilot study going to help determine
> just how nested our user's bookmarks hierarchies are.
Test pilot study is here: https://testpilot.mozillalabs.com/testcases/a-week-life/results.html
Reporter | ||
Comment 9•8 years ago
|
||
most of the links expired, good ref atm:
http://stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree/192462#192462
Reporter | ||
Updated•8 years ago
|
Priority: -- → P3
Reporter | ||
Updated•7 years ago
|
Reporter | ||
Updated•7 years ago
|
Whiteboard: [fxsearch]
Reporter | ||
Comment 10•7 years ago
|
||
Most of the discussion here may not be valid anymore, we still need a way to get paths, it could likely be simplified to an adjacency table. Ideally we'd need that info mirrored in memory to be able to remove GetFolderIdForItem
Updated•5 years ago
|
status-firefox73:
--- → affected
status-firefox74:
--- → affected
status-firefox75:
--- → affected
status-firefox76:
--- → affected
Updated•5 years ago
|
status-firefox73:
affected → ---
status-firefox74:
affected → ---
status-firefox75:
affected → ---
status-firefox76:
affected → ---
Reporter | ||
Comment 12•3 years ago
|
||
(In reply to Mathew Hodson from comment #11)
Is this a duplicate of bug 1591790?
it's a partial overlap. bug 1591790 made an opt-in version of this, using a recursive query. The scope here was use a non-recursive query and return the path to a bookmark for each bookmark every time. So this still applies, but it's less critical now.
Flags: needinfo?(mak)
Updated•3 years ago
|
No longer blocks: 469441
Severity: normal → --
status1.9.1:
wontfix → ---
status1.9.2:
wontfix → ---
See Also: → 1591790
You need to log in
before you can comment on or make changes to this bug.
Description
•