Closed Bug 387545 Opened 18 years ago Closed 9 years ago

Support multiple entries when disk cache hash collisions occur

Categories

(Core :: Networking: Cache, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: schapel, Unassigned)

References

(Blocks 1 open bug)

Details

Bug 175600 comment 14 states that when a disk cache hash collision occurs, the cache just dooms the older of the two entries. Handling hash collisions with chaining or open addressing would allow both entries to remain in the cache. This enhancement would help offline browsing work better.
What is used as the hash function? If you're worried about collisions in such a small cache, that sounds like the place to start...
No matter what hash function is used, there will be collisions. Currently, any collision whatsoever will doom a cache entry. Even one doomed entry can make it so that a page cannot be viewed properly in offline mode. This bug report requests that entries not be doomed on collisions so that offline browser can work better.
When hashing is used, synonym is one of most difficult problem to resolve. Ordinal approaches I know are; (a) Discard synonym, keep newly hashed item only(current one on Disk Cache) (b) Generate synonym chain, and search synonym chain upon each access. Sequential search is used initially in many cases. This is simplest, practically fast, and inexpensive way, if synonym chain length is almost always short. However, if long synonym chain happens, it produces performance issues. (c) If many synonyms can occur in many hashed values, indexing, binary-tree search etc. is introduced to avoid performance issue due to long synonym chain. In large systems, usually (b) is used initially due to cost(memory usage, CPU usage, ...). And, according to system growth, performance issue due to lo--ng synonym chain arise, then expensive way of (c) than (b) is introduced to resolve performance issue. And, if performance issue remains due to many synonyms even after (c), (d) hashing algorythm is changed in order to reduce synonyms, in order to make hashed value space larger. It's very expensive if vary large Data Base system, because re-organization of whole Data Base is required. Although Fx/Tb is very smaller system than server system, Fx/Tb is sufficiently large system because vary large but inexpensive HDD is available. If synonyms is supported with hashing, both performance and cost(memory, CPU usage) should be cared for. Note: There are already known requirements of (d) in Mozilla. - Max 8192 entries of Disk Cache (Bug 175600) - Max 256 cache entry for image data per a web page (Bug 263590) - Unexpected collisions of haching value, need to reduce sysnonym chain length (bug 290032) In addition to this bug, these bugs should be resolved at same time to support many cached data. I dont't know Fx's new "Offline cache device" has such limitations/restrictions or not.
Blocks: http_cache
Summary: Handle disk cache hash collisions better → Support multiple entries when disk cache hash collisions occur
new cache code and layout
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.