Closed Bug 223476 Opened 21 years ago Closed 15 years ago

large history makes Firefox painfully slow in several ways

Categories

(Firefox :: Bookmarks & History, defect, P2)

defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mozilla, Unassigned)

References

Details

(Keywords: perf, Whiteboard: [TSnappiness])

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5b) Gecko/20030909 Firebird/0.6.1+
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5b) Gecko/20030909 Firebird/0.6.1+

My history is set to save 180 days. The history.dat file is about 12 megs.
Doing things that involve the history ('Go' menu, opening History pane)
take 30 to 60 seconds, during which time everything else in the browser is
frozen, chewing up 90% of my CPU usage. This is reproducible always.

Additionally, after doing those things page loads seem to get very slow for
15 to 30 seconds, chewing up 90% of my CPU usage. This is reproducible only
sometimes.

Reproducible: Always

Steps to Reproduce:
1. Start program.
2. Click on 'Go' menu or open history pane by keystroke.
3. (Optional) Load some pages and notice how slow they are.

Actual Results:  
Everything works, but only if very patient.

Expected Results:  
No more than a second to display any menu. No more than five seconds to
display the history pane.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6a) Gecko/20031028
Firebird/0.7+

Remember for 20 days
history.dat: 3.5M

Opening the Go menu takes 3 seconds.  Opening the History panel takes 4 seconds
in a flat mode.  But opening the Go menu doesn't slow down subsequent page loads
afaict.

I think there's a startup hit but that should be a separate bug.
Depends on: 232772
Depends on: 222717
No longer depends on: 232772
also makes typing urls painfully slow due to autocomplete.  it sometimes takes
about 2 seconds to "wake up" after typing just one character.  the delay
decreases as more characters are typed, narrowing down the results.  backspacing
seems to always take at least one second no matter how many autocomplete results
there are.
history sidebar opening/searching seems to be a little faster in 0.8 than with
previous releases, but it's still sluggish.  expanding the "older than 6 days"
branch takes almost :30.  does large history affect page load/new window/tab at
all?  i've long since gotten rid of the Go menu with userChrome.css because i'm
afraid of it; accidentally mousing over it would freeze the browser up for
minutes, and it's relatively useless to me anyway.

history.dat 12.5mb
remember for 365 days (so basically, never expire)

firefox 0.8, windows 2000 on a pII 400, 368mb ram.
I have experienced this several times, but without any form of crash.  The whole
history just disappears.  I use URL autocomplete a lot and every few weeks it
just stops working because evry URL has gone from the history.  Sorry, I can't
give you any steps to reproduce, it seems random.
I think this is probably a dup of bug 193236 , or, at least, they're definitely
related.
(In reply to comment #4)
> I think this is probably a dup of bug 193236 , 
> or, at least, they're definitely related.
bug 193236 should probably be set as blocking this, like bug 222717, since those
both deal with specific effects of having a large history, whereas this bug is
about general poor performance.  this bug should also be set to OS=All and maybe
Hardware=All too - anyone see this on Macs?

(In reply to comment #3)
> The whole history just disappears.
doesn't sound like the same thing.  that sounds like your history is expiring or
getting corrupted.
Summary: large history makes Firebird painfully slow in several ways → large history makes Firefox painfully slow in several ways
I can definitely reproduce this consistently, it's very painful as the title
suggests.  Internet Explorer is WAY more responsive, which is not a good thing
for such a main feature of the browser.

Is it just inefficient code, or bad design?
Eli, I recommend perf keyword
Keywords: perf
I want to emphasize this problem. It's a pain in the ass.
This bug makes firefox unusable, it is known for a year and unfortunately made
it to version 1.0.

Even my latest athlon 3000+ / SATA HD takes 15 seconds to open go menu and
afterwards all non cached links are freezing for 3-5 seconds before starting to
load.

I had reproduced this on WinXP Pro, WinXP Pro 64, debian sarge i386, debian sid
amd64.

Never had a problem with the suite, which I am still using.

According to bug 245745, an effort is underway to convert Firefox's history to a
more efficient implementation using SQLite, a compact, open-source database
engine that follows the popular SQL relational model. The history changes are
part of a large-scale redesign of Mozilla's storage mechanism (see bug 261861)
that likely will take several months to complete. So relief is on the way,
though apparently (and unfortunately) not soon.
Firefox 1.0 on Windows 2000

the problem with history can be reproduced/misused by following html file:

<html><head><script>
 var tit="Bug";
 function writetitle()
 {
   tit+=tit;
   document.title = tit;
   setTimeout("writetitle()", 100);
 } 
 writetitle();
</script></head></html>

WARNING: the browser start to generate huge history.dat file
when firefox is terminated before the disk is full
and then started again, the autocompletion feature will prevent any browser use
to get browser working again, delete the history.dat file
I have "Save history for: 900 days" and my history.dat is 76MB large. I am using
Fedora Core 3 with firefox-1.0-2.fc3 on an 1.8GHz P4.

- The most painful problem that I am experiencing is that almost every time I
open a new page (whether in a current window or in a new one), Firefox
completely freezes (all Firefox windows become completely unresponsive and would
not even refresh) for a few seconds with CPU itilization staying at 100%. This
is covered in bug 266627, so I am marking it as a dependency. Also, my
experience suggests that the more Firefox windows I have opened, the worse this
problem seems to be.

- The second most payiful is the "Go" menu freezing the UI for ~30 seconds, as
already mentioned in previous comments.

Note that this problem appears to be Firefox-specific - I only switched from
Mozilla suite to Firefox recently and I did not experience any of these problems
with Mozilla. These (especially bug 266627) make Firefox nearly unusable for me,
so "Severity: major" is IMHO appropriate.
Severity: normal → major
Depends on: 193236, 266627
Depends on: 245745
same goes for mozilla, history file is now 20 megs, takes 3-15 secs to render
first page on p3-450 to amd64-3000+, file a new bug ?
Flags: blocking-aviary1.1?
Flags: blocking-aviary1.1? → blocking-aviary1.1-
*** Bug 293643 has been marked as a duplicate of this bug. ***
*** Bug 300693 has been marked as a duplicate of this bug. ***
Today I tried to delete some history items... It's extremaly slow. Firefox
writes history file after deleting each item! Task Manager reported 20GB of data
(!) was written by Firefox.
I'd also like to note how much of a pain in the ass this is.  I left my history
set to 20 days, and the first time I'd try to enter any URL in the Location bar,
the system freezes up for anywhere between 5 second and a minute.  And uh, I
don't have a slow computer; it's an Athlon XP 2500+ with 512 MB of RAM, and no
other browser I've used has slown down like this.
I can confirm the behavior from comment #16.

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915
Windows XP SP2
In addition, even a modest history size is painfully slow when deleting items,
both on Windows and Linux. When deleting a group of items, Firefox will cause
"chugging" disk activity that lasts for anything up to a few minutes.
No longer depends on: 266627
It happens on Mac, too. When the history gets even bigger, Firefox freezes and has to force quited.
Places replaces the current history system in current Firefox nightlies. It has different performance characteristics from the old system. The new places window being slow with large histories is bug 330906. Deleting should be much faster.
My history.mab and history.dat files are 4K and 5K lines (approx. 256 Kbytes) respectively.
*** Bug 248345 has been marked as a duplicate of this bug. ***
Depends on: 279342
*** Bug 339588 has been marked as a duplicate of this bug. ***
I experience this on Windows XP with Firefox 2.0 beta 1
=
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1b1) Gecko/20060710 Firefox/2.0b1 ID:2006071020

Twiddling bits to reflect this.

I really dislike this bug :( I'm considering deleting my precious 24 MB history.dat because of it *sniffle*
Assignee: bross2 → nobody
OS: Linux → All
QA Contact: mozilla → history
Version: unspecified → 2.0 Branch
*** Bug 354934 has been marked as a duplicate of this bug. ***
truly slow
My history is set to remember 200 days and its current size is 29MB. Firefox freezes 90 seconds when I click the History menu, it is awful but I don't want to lose my history ! Could someone fix this bug (October 2003...) ? Maybe the history file format should be optimized ?
It would be nice if, at the least, firefox could keep an extended history that isn't used for find-as-you-type URI location bar completion.

Or perhaps it would help if only the top N most used entries were searched during URI completion, where N can be changed via about:config?

caracalears, from what I've read, the history and bookmarks are getting totally redone for Firefox 3 to use SQLite (supposedly was to be in Firefox 2 but was bumped). I figure if I have waited and dealt with it for 4 years as it stands, I can wait another 9 months.
Recent Firefox alphas are already using places, so history is now stored using SQLite backend. Unfortunately, when I tried deleting random two hundred entries spread in my history for the last six months, Firefox started making a hole in my HDD by creating and deleting one entry at a time.

Imagine two hundred journaled removals from SQLite ... The current scheme of working with SQLite backend plain sucks ...
Re: "Unfortunately, when I tried deleting random two hundred entries
spread in my history for the last six months, Firefox started making a
hole in my HDD by creating and deleting one entry at a time."

OK.  That's it.  This has been open for four years and is clearly
not going to be fixed, because nobody with a clue is ever going to
work on it (as opposed to **** on it).

I just tried Firefox again after a year (Mozilla 1.7 has served me
sort-of OK for years, given the alternatives, but it's time to move on)
-- but it's still often 10+ seconds to follow a link in New Improved
Robust Firefox 2.0.0.2.  We're talking BILLIONS of machine cycles here
to do almost nothing.

I give up.

Safari.app here I come.  Open source is a clearly a dead cause here
(and I was the second person after RMS to ever write any GNU software!)
I wonder, does anyone remember History in NS3? I used it constantly ... it was, for me, an adjunct to blogging. Trim by deleting, sort variously ... the file was a wealth of information, not just data.

history.dat? Gack ... data, data everywhere, and not a thought to think.

I have managed in the past to delete some of the worst blocks of crap (huge URLs for sites not worth remembering), but it's always a crap shoot.

To quote Jamie Zawinsky (which I do while holding my nose), "it uses Mork, which is -- and I do not use these words lightly -- the single most braindamaged
file format that I have ever seen in my nineteen year career."

If someone had documented Mork somewhere, or made their documentation discoverable, I think we would have improved this long ago.

(Some folk get paid. When I got paid I didn't just file things in a bottom drawer.)
p.s. I just ran "Dork" and my 2.5M .dat ... the text file was 1.29M

Deleting just the links that were comprised of google.ad gack reduced that to .62M ... a quarter.
Deleting other gack (like adclick) reduced it to .49 ... i.e less than 1.5th!

We have lost NSGold functionality and created a gack collector.

BTW: "Re: High CPU with Beagle Thunderbird" reads in part, "the mork file parsing is the most expensive operation probably, and the most likely to loop incorrectly." Seems to me that Mork sux. Maybe that's just because of how it's been implemented. Or maybe it was just a matter of "anything but JWZ".
I've got the same complaint - big history makes for slow Firefox. I'm on Windoze, and used Filemon from SysInternals to try to discover what made my disk so active. It appears that when Firefox hits the rollover point for the history (365 days, or whatever you have set), it stops appending to the file and REWRITES THE ENTIRE FILE, slightly trimmed, every time you alter your history - i.e., visit a new page.

What a great idea. 

Not. 
(In reply to comment #35)
This is already history on the trunk builds, see bug 245745.
It will be so with Firefox 3.
No longer depends on: 279342
Depends on: 83205
for a few days minefield clean start takes ages again
so is this INVALID now since places introduction?
(it's blocking a 'places' bug)
still happening for me, even with places in there on FF3 RC2.  if i navigate while i've got the history window open, the browser freezes for ~5 seconds each time the URL changes.
Component: History → Bookmarks & History
QA Contact: history → bookmarks
FF 3.0.1 is still very slow. History menu takes a few seconds to open, for example.

A simple fix would be not to use SQL queries for menus, just just a little cache since it's only 10 items.

The history window is very slow too.
So can everybody say that the problem is still valid in Firefox 3? Then we can update the bug from the old mork system to places and set the appropriate bug properties.
It is definitely a problem in FF3. If I open my history window right now it's very slow (quad Core 2 3.1GHz, 6GB RAM, SATA HDD, XP x64) and even typing into the address bar causes some pauses.
The problem is worse with places than it was with mork.  Mork never slowed me down typing in the address bar.  Also, if I do any browsing with the history sidebar open, firefox freezes for about 45 seconds every time I navigate to a new page.
I get that too, when browsing with the history sidebar open.
Ok, we can try it. But I personally cannot reproduce the problems you mention on my slow PC. I exclude the fact that the History menu takes about 1 second to open for the first time. Maybe I have too few history entries. Any way we can count and compare them (except counting directly using SQL in the sqlite file)? I would leave the addressbar slowness out of this bug, it probably has its own bugs (like bug 373256) and I think it is being worked on.
Component: Bookmarks & History → Places
QA Contact: bookmarks → places
Version: 2.0 Branch → 3.0 Branch
open up the places library window, click on "history" in the tree view, and look at the lower-right panel.  mine has 71,545 history entries.
That method works only since FF 3.1b1. I only have 9000 entries...
Priority: -- → P2
Hardware: PC → All
Target Milestone: --- → Firefox 3.1
I only have 41,856 history entries and it's slow. 9000 is nothing if you use your browser a lot. My history setting is the default 90 days.

To see how many entries you have in 3.0.4, open the Places window by selecting Bookmarks -> Organise Bookmarks, then click on History in the list on the left, and finally click on the scroll bar. The bottom section of the window will change to show the total number of entries.

It seems like either the performance of the history database needs to be improved by orders of magnitude, or the default retention setting needs to be lowered from 90 days back to 9. Maybe the devs didn't think anyone would manage to get 40,000+ entries in 90 days, but I don't even count myself as a heavy user.
Yeah, like I said, I've got twice as many as you.  I frequently use a couple of web-based messageboards, and those really add to history size.  The history window is almost unusable, and i'm seeing significant slowdowns even in the address bar.

It seems like the real solution is to do anything places-related in a background thread so at least the whole browser session doesn't freeze up while it's being done.  I'd be content with a history panel that incrementally fills as the database is being queried provided i can use the rest of the browser while it's happening.  Not being terribly familiar with the mozilla codebase, though, I assume that's much easier said than done.
Yes, I think the original bug is fixed ("mork is slow with 9 days of history") and now got replaced by "sqlite is slow with over 9000* days of history")

(* "over 9000" is a stupid meme and not an actual number)

(In reply to comment #49)
> My history setting is the default 90 days.

Note that this is just browser.history_expire_days_min, the minimum number of days before history item can be deleted. browser.history_expire_days is by default 180.

browser.history_expire_sites is 40000 by default. Since you're exceeding that number, your history is exactly 90 days old. For someone who doesn't exceed this number, it can be up to 180 days old.

It is my humble opinion that these settings are all three times as big as they should be for the right trade off.
most painful scenario yet:

1. open up the places library window, click on "history" in the tree view
2. type in a search that has some results
3. leave it open and switch back to a browser window
4. attempt to navigate as usual

the browser freezes while it not only updates your entire history, but also while it redoes your history search.

this can make searching for a site in your history really difficult.  what i usually end up doing is opening up IE, and copying and pasting the history results there.
(In reply to comment #52)
> 1. open up the places library window, click on "history" in the tree view
> 2. type in a search that has some results
> 3. leave it open and switch back to a browser window
> 4. attempt to navigate as usual

I can confirm that too. It happens to me quite often as I usually use the history to search for a site I visited previously, but then have to navigate around a bit to see if it's the right one (e.g. the right forum for a particular post etc).
Ok guys.
Let's make some tests:)
0. how big is your places.sqlite file in your profile?
1. start FF with the history sidebar closed.
2. open the sidebar
2A. how long does it take?
2B. can you check, whether FF takes 100% CPU or is using disk heavily?
3. what grouping do you have in the sidebar? (byt date, server?) Does it make any difference?
4. leave it open, visit a new yet unvisited page
4A. how long does it take?
4B. can you check, whether FF takes 100% CPU or is using disk heavily?
Version: 3.0 Branch → Trunk
Okay, places.sqlite is 25.1MB.

Opening the sidebar takes about 2 seconds to open the History menu and then about four seconds to open the sidebar. Keep in mind this is on a 3.1GHz Core 2 Quad with 6GB of RAM.

CPU usage does peek at 100% (or rather 25%, max for one thread on a quad core) while opening. There is some heavy disk activity too. Seems like a combination of both problems.

I group by date usually. That seems to make the most sense. Grouping order does not seem to make an appreciable difference. Note that in order to check this you really need to do restarts between tests because the OS will cache the places.sqlite file making it a bit faster to open the next time.

When visiting a new page, sometimes it's more or less instant but often it can take up to 60 seconds for FF to start responding. During that time CPU is pegged at 100% and there some some disk activity.

This can also happen when re-opening items from the history, especially in a new tab.

On a possibly related note, very occasionally opening a new tab via the recently closed tabs or middle clicking the back button drop-down can cause FF to freeze for at least several seconds, perhaps because the history data is being modified.
Now that is interesting. I have only 9000 history entries, but my file is the same size. Can you check if you are hitting any of the limits mentioned in comment 51. Can you check in about:config how the prefs mentioned are set in your profile? If you are hitting the limit of browser.history_expire_sites=40000
that may cause that your history is pruned of old entries at any new entry insertion. You said you have about 41,856 entries, is this still valid?
Can anybody confirm this?
Scratch, can you check the setting of your preferences too?
My history is now at just over 42,000 and browser.history_expire_sites = 40000. Something is clearly wrong then, as presumably it should be being pruned. Perhaps it's because browser.history_expire_days_min is set to 90, so a minimum of 90 days is kept regardless of the 40,000 limit?

browser.history_expire_days = 180
browser.history_expire_days.mirror = 180
browser.history_expire_days_min = 90
browser.history_expire_sites = 40000
Yes, that is I think what comment 51 said.
Ok, and how old are your entries? If you sort by last visit, how old is your oldest entry (last visit of it)?
My earliest entry is 29/08/08. Today is 29/11/08.
(In reply to comment #57)
> My history is now at just over 42,000 and browser.history_expire_sites = 40000.
> Something is clearly wrong then, as presumably it should be being pruned.
> Perhaps it's because browser.history_expire_days_min is set to 90, so a minimum
> of 90 days is kept regardless of the 40,000 limit?

See bug 425219 wrt the Places' history expiration algorithm.
(In reply to comment #59)
> My earliest entry is 29/08/08. Today is 29/11/08.
That seems to be a bit above the 90 days. Maybe the entries really are pruned, but not directly down to 40000, but maybe in smaller chunks (or one at a time). Could you maybe look at it when visiting a new site, if the oldest entry (29/08/08) is going away (after the disk and CPU finishes)?
Without suggesting the two are identical: when I look at the gack that passes for history data in FF2 I find a tremendous number of what I can only call "stubs", bits of clutter from advertising and such. What's significant is that these don't ever seem to be deleted. They're certainly not accessible for manual deletion.

p.s. I see folk valuing their Histories, so I'll repeat: with such as NS3 History was available for manipulation very much as Bookmarks are through the Management function, not just through the sidebar. I happen to be one of those who values my browser history and certainly feel snubbed.
(In reply to comment #61)
> (In reply to comment #59)
> > My earliest entry is 29/08/08. Today is 29/11/08.
> That seems to be a bit above the 90 days. Maybe the entries really are pruned,
> but not directly down to 40000, but maybe in smaller chunks (or one at a time).
They are cleared out slowly (just a small number) on every visit, and in larger chunks when the computer is idle).

(In reply to comment #62)
> p.s. I see folk valuing their Histories, so I'll repeat: with such as NS3
> History was available for manipulation very much as Bookmarks are through the
> Management function, not just through the sidebar. I happen to be one of those
> who values my browser history and certainly feel snubbed.
History -> Show All History ?
In reply to comment #62:
This bug is about "trunk" Firefox, i.e., Fx3.1 beta, and there has been an
almost complete rewrite of the history subsystem between Fx2 and Fx3 so you
should at least compare the capabilities of Fx3 with what you see on Fx2 before
criticizing. (The latest Fx3 can be had from
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/ and calls
itself Minefield). On the following version (the latest Linux nightly):

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b3pre) Gecko/20081128
Minefield/3.1b3pre

I see a "History => View all history" menu item, which brings up a History
manager (which calls itself Library) where, among other possible actions, URLs
can be removed by hitting Del (or clicking Organize => Delete) after selecting
something.
I have to admit I didn't really explore "History => All" ... found it confounded with Bookmarks and so just shuddered and turned away. But it's definitely a substantial move in the right direction.

But what I was suggesting was the /possibility/ that, as with FF2, items that accumulate as bloat don't show up to be deleted. 
When I compare what I see in HST (using MozillaHistoryView v1.02 2007 Nir Sofer - http://www.nirsoft.net) with what appears within FF there are huge differences i.e. the bloat items in HST are not seen.

Just a suggesting ... wasn't attacking anybody's mother or first-born son.
Here is the list of features I would expect from a good history mecanism:

a) deeper than 6 days classification. I clean up hosts I don't want to keep every weekend, but that is 7 days, so the 'older than 6 days' bucket is hard to filter out. It should simply be a config point in the prefs (30 days classes for example). Don't forget there is sql lite in the back, so it is pretty powerful when using sql "group by", "order by" and sharp "where" clauses.

b) I would love that once you group by host, you could sort by url count in that host. It would help cleaning up fat and lean groups. In fact, given any grouping type, give the count in a discrete value and allow to sort.

c) I know I keep all my imdb and wikipedia visited links, because it helps showing that I already read some web page. I never want these to roll off (until a 2 years threshold maybe...). I would want a whitelist/blacklist to enter the history. This is also good to get rid of the neverending list of google urls...

But when you think about it, ultimately, it boils down to giving a time-to-live to a url in the history associated to a matching rule. Example:

*google.com/* -> TTL = 1 day (just enough to recall a search)
*imdb.com/* -> TTL = 999 days.
forums.foobar.com/* -> 15 days.
DEFAULT -> <whatever> days.

Implementation wise, it is not exactly a TTL value in the db schema but rather an expiration date and an index on that column in the table (so that rolling off , on browser start I guess, but only once per day, remains quick).
(In reply to comment #63)
> They are cleared out slowly (just a small number) on every visit

So on every visit, his firefox will go through all entries in an attempt to find "small number" that can be deleted?
Yes, that is what we are trying to find out. And this process may go through the whole table in the sqlite file (depends no how good the DB indexes are), using a lot of CPU and the resulting changes may cause lots of file writes.
Something like when then old mork file in FF2 was completely rewritten when adding or pruning entries, when you were above some limit.
But I think there were some improvements in FF3.1, which created temporary tables so that history and the location bar gets faster. Can any Places developer comment?

Comment 66, please file a separate bug (request for enhancement) for your ideas, they probably do not belong into this bug. Thanks.
Ok, bug #467157 was spun off (to seamkonkey because firefox has a far too simple history management ui (both panel and frame)).

I an thinking of writing a separate client over the sqllite file. I already queried it with freeware tools, but the schema is too complex for everyday sql query typing... (LOL).

If you have schema doc somewhere, I would appreciate a link. Thanks.
> browser.history_expire_days = 180
> browser.history_expire_days_min = 90

we will keep 90 days, if we have slots we will keep more visits, up to 180 days

> browser.history_expire_sites = 40000

This limit applies to URIS not to visits, so you will have fairly more than 40000 visits. This is a maximum cap of unique visited uris.

About how expiration work, we have ideas to change it, however it's not matter of going through ALL the history items to find what to expire, databases do not work like that. Same thing for UI, not enough resources to revise it actually, but clearly everyone wants better history interaction, so no need to push on it i guess.

I doubt all of this "comments" are pertinent to this bug itself, i would suggest you moving the discussion up to mozillazine's forum or a newsgroup, developers can follow and comment there as well.
Newsbin, a Usenet reader which has to deal with massive databases of posts which easily exceed 40,000 entries, has a nice solution to this problem.

Instead of using one massive database per group, it uses lots of smaller ones. Each database contains only posts from say 1 day. When opening a group, it knows how many days back it needs to go and so only needs to read databases going back that far, dramatically speeding up loading. It also means that to purge old records it simply deletes any database whose newest post is over a certain age.

FF could probably do something like that, perhaps by grouping histories into weekly chunks. That would eliminate slow-downs when navigating to new sites, or using the history to open recently closes tabs/recently visited sites. Older databases would only need to be loaded when using the full history view, and even then if there is no searching some metadata about the databases could prevent them all being loaded until the user actually tries to view them by scrolling the list, when using the default sort-by-date view. Cleanup of old entries would be one file deletion too.
(In reply to comment #71)
> FF could probably do something like that, perhaps by grouping histories into
> weekly chunks. That would eliminate slow-downs when navigating to new sites, or
> using the history to open recently closes tabs/recently visited sites. 

doubt it, notice awesomebar and searches (and views) are throughout the full history, not a chunk of it, that would cause complexity grow by a large amount. Our slowdown is not always due to queries, frontend trees and transaction manager are parts of the frontend that are slowing us down, and a database change does not help there. All of our queries stays under 1s, so you can see by yourself we must take care of different interactions.
(In reply to comment #72)
> doubt it, notice awesomebar and searches (and views) are throughout the full
> history, not a chunk of it, that would cause complexity grow by a large amount.
And in speed.  Loading each of those databases will take a lot longer, and looking at more databases for a query will make the query take longer too.
Okay then, so what is the solution? Better database structure and queries? Allowing more RAM to be used by SQLite?
I was looking at the schema; the moz_places (url) has it unique index. Right.

But SQL lite considers every type as just a bunch of bytes, i.e. strings (C wise). http://www.sqlite.org/datatypes.html and will simply use strcmp() or memcmp() to sort/compare thus index. So it looks to me that we are creating a difficulty when we store places urls that ALL start with "http://" or "https://".

So here is a few ideas:

a) for some reason I cannot explain, the rev_host stores the string in reverse. I sure find this pointless since the beginning of a host name is more selective than the end, but one could argue that the tail .com is less common than the head www. However, for URLs, I sure think that the end of the url must be changing a lot more that the beginning. But again, it all depends. Trick: store the urls in reverse.

b) there is more pattern searches than equality searches right? I mean, the user types a letter or number and the auto completion searches at every key stroke right? Couldn't the 1st query wait for at least 4 chars (beyond http[s]://) and the resultset be stored in a tmp table, so that subsequent keystroke only search on the tmp table instead?

Sorry if I sound naive, I don't know the firefox/seamonkey codebase at all, just a few experience to share.
Actually, now that I slept on it, I realize one important thing: the index on the url string is pointless if the sql query where clause has a LOCATE(substring) ("contains"). Url strings in the db are indexed by only 1 arithmetic dimension (which would be the suited for equality searches, which cannot suit both "startswith", "endswith" and "contains" searches. In the end, the sql lite must be doing a table scan.

Another idea:

The user's keystroke trigger only a host search, which then reveal a host id if any. Then the places revealed are only the ones with that host id. So it reduces the size of url to use LOCATE(substring) on.

It boils down to the speed of finding a host. 2 things: we don't have a host table (not normalized anyway) and we don't have a host id nor an index on it for the places table (although there is an index on the moz_places.rev_host, which again is pointless for LOCATE searches).

Assuming the existence of an host table, the problem of efficiently finding hosts that "contains" a name is similar to finding urls, although I trust there wouldn't be 40000 distinct hosts...

For finding host rapidly, I'm thinking we should exploit the equality searches or at least avoid LOCATE. The following is under the assumption that there is a speed gain in sql lite when using LIKE rather than LOCATE('mozil')=0 ). This could be worked around by user-defined functions http://www.sqlite.org/c3ref/create_function.html if using sql lite 3.

What if we added let's say 5 columns for the labels between dots within the hostname? bugzilla.mozilla.org would be stores as ["bugzilla", "mozilla", "org", null, null].

If a user typed "mozil" for example, we don't have to search for entire hostnames that contains "mozil", we just have to search for a label that startswith "mozil" (LIKE 'mozil%') in the 5 columns. 

It seems the user experience of the url auto complete would not be affected more than now (at least in seamonkey for what I can see), because if you type in "ozill", you don't get "mozilla" urls. So there is already some constraint going on the search input and matching clause.
(In reply to comment #76)
> Actually, now that I slept on it, I realize one important thing: the index on
> the url string is pointless if the sql query where clause has a
> LOCATE(substring) ("contains"). Url strings in the db are indexed by only 1
> arithmetic dimension (which would be the suited for equality searches, which
> cannot suit both "startswith", "endswith" and "contains" searches. In the end,
> the sql lite must be doing a table scan.
We do not use LOCATE anywhere in places:
http://mxr.mozilla.org/mozilla-central/search?find=%2Ftoolkit%2Fcomponents%2Fplaces%2Fsrc%2F&string=LOCATE

> It boils down to the speed of finding a host. 2 things: we don't have a host
> table (not normalized anyway) and we don't have a host id nor an index on it
> for the places table (although there is an index on the moz_places.rev_host,
> which again is pointless for LOCATE searches).
Normalization isn't always a good idea.  Normalization means you have to do more joins, which can be expensive, and in many of our queries, we try to join as little as possible because of this.  Sometimes that does result in us having an unnormalized database.

Regarding rev_host, there is in fact a reason for doing this:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/src/nsNavHistory.cpp#6741

We have other solutions to making large history making Firefox painfully slow.  Two of the more important ones are bug 453529 and bug 455555
Depends on: 453529, 455555
Good.

But I certainly hope they are not gonna stop painting visited links because they are more than 24h old though; that would completely defeat the purpose of an history, as far as I am concerned...

Let me ask you this: how the heck did mork made it so fast then?

I mean, deleting from and sorting the entire history was painful because of the structure, but I never had problem with visited links nor url autocomplete on seamonkey up to now.

Could the entire history simply fit in ram? I think so in many cases, where the file is 100MB or less (i.e. even at 1k per place, that 100'000 places!). The writes are replicated to the real table on disk and this heap table. The reads of course are made only from the heap table.

Assuming the sql lite db be preloaded in a heap table, any latency would only mean there is an abuse in CPU (some kind of order(N*logN) or worst) (you guys have a profiler, right?)
(In reply to comment #78)
> But I certainly hope they are not gonna stop painting visited links because
> they are more than 24h old though; that would completely defeat the purpose of
> an history, as far as I am concerned...
Embed visits are only visits that are loaded in subframes.

> Let me ask you this: how the heck did mork made it so fast then?
Mork wasn't fast.  You could only keep a little more than nine days of history and it'd slow down your browser horribly.

> Assuming the sql lite db be preloaded in a heap table, any latency would only
> mean there is an abuse in CPU (some kind of order(N*logN) or worst) (you guys
> have a profiler, right?)
That's not reasonable.  For starters, people already complain that Firefox takes up too much memory.  Additionally, there is going to be quite a startup cost importing that each time.  Then you get into issues about making sure everything is flushed onto disk.
Does sql lite have a tunable read cache? You could allocate a 10M for every gig of ram the user has. My point is, assuming there is no CPU abuses with the searching (BIG assumption, see below*), then the only bottleneck would be i/o, so that's worth having a look into read caches (of limited size I agree).

*For the string searches, I read about Rabin-Karp rolling hashes, a trick that can speed up a search of a short substring of m chars in a text of N chars. Instead of a worst case of (N-m)(m) char comparison, it can do the search in ~N comparisons.
Wikipedia the thing http://en.wikipedia.org/wiki/Rabin-Karp_string_search_algorithm
this needs discussions and planning on strategies, so most likely not going to make 3.1, hwv some changes could have already modified some of the behaviours here.
Target Milestone: Firefox 3.1 → Firefox 3.2a1
How about this design?

Create a class which implements in memory caching.  Keep the most frequently accessed classes in a small buffer, such as 400 URLs.

Fault the cache for the URL whenever you do an update on a URL.

the memory caching class can use a Queue<Dictionary<URLString, CachedUrlObject>> data structure.

the memory caching class should only be accessible via a singleton.

When items are pulled from the sql lite database, they are cached.

I'd like the end design to look like memcached.

Thoughts?

I'm new so if the code already does this my apologies.  I'm going to go build the code now and look at the sql lite implementation.  

Cheers.
SQLite already has it's own cache so it's not always hitting the disk...
A related issue is that a large history causes FF to be very slow immediately after starting. The disk get thrashed as you open sites up for the first time. Remove the history and everything is fine again.

To be honest I don't think there is a simple solution to this one. If there was, SQLite or some other database would surely implement it. Assuming replacing SQLite is not an option, there is only one fix I can think of:

Do queries in a separate, low priority thread. There have been calls for FF to become multithreaded for years, and despite Chrome proving it can work really well there does not seem to be much progress so far.

So, basically, we are stuffed until there is a major architectural change to FF.
Whiteboard: [TSnappiness]
No longer depends on: 83205
Target Milestone: Firefox 3.6a1 → ---
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
This bug is so generic that it can be closed now. All dependencies are fixed, history menu has been fixed in 3.5, the awesomebar has been fixed in 3.6, sidebar and library are faster in 3.5 and even faster in 3.6 (Still some work ongoing in treeviews), expiration is now async.

And we are retaining much much more history than what we were doing in 2003, still i can use the browser to type in this comment :)

At this point i'm resolving this bug as WFM, and remaining performance issues should have their own more specific bugs.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.