Closed Bug 224625 Opened 21 years ago Closed 21 years ago

[spellcheck] Convert mozPersonalDictionary to use nsTHashTable

Categories

(Core :: DOM: Editor, defect)

defect
Not set
minor

Tracking

()

RESOLVED FIXED

People

(Reporter: mvl, Assigned: mvl)

Details

Attachments

(1 file, 2 obsolete files)

mozPersonalDictionary is mainly just a list of strings. It is stored using an
AVLTree. That's a fast stucture.
But is also takes lots of code. Using nsTHashTable would reduce the amount of
code, and make it more readble imho. It won't make it faster. In fact, it is a
few percent slower. Both scale good.

So I suggest to convert it, to make the code cleaner and smaller. (Shared code
is good)
Urk.  I was going to argue for keeping AVLTree but I changed my mind after
looking at the code.  It needs a lot of style and naming cleanup, spellcheck
isn't actually exploiting the tree structure, and I think the child rotation
during remove doesn't correctly maintain the AVL invariant.

A hash table is good enough for doing lookups although it won't handle
suggestions (should they ever be implemented).  Although, I'm not sure an AVL
tree is very good for suggestions either.
FWIW, I do not believe that a hashtable is an efficient and useful data
structure for dictionaries. I highly recommend looking at a patricia trie. This
kind of structure is much more capable at variant and predictive matching, and
some kinds of autocompletions that are impossible with a hashtable.

--BDS
Ben, I agree that some form of trie is the right way to go for dictionaries.  I
believe that compact dawgs are more efficient than patricia tries although I'd
have to check.  However, it appears that all of the functions that would
actually require a trie are currently unimplemented.
Keep in mind this is just for the personal dictionary, not for the 'real'
dictionary. (that already uses a hash btw)
The personal dictionary is usually small. It is only used to lor an exect match
with the word to be spellchecked. No smart substring matching or whatever.
What I want to archive is to get rid of the current avl tree code. It either
needs to be cleaned up, or replaced.
Using nsTHashTable was the easiest thing to do. It will work just as well, and
make the code smaller and cleaner.
So, I don't want to use the hashtable beacause it would be better or anything,
but just to clean up the code. It's just that there already is hashtable code in
the tree.

If we want to imporve the gerenal storage, it's better to attack the myspell
code. That uses a hashtable to store a huge dictionary.
i agree with mvl/nallen... we currently don't use any capabilities trees provide
(in this case) and it's just bloating things. bsmedberg's suggestion might be
good if we ever do implement those features, but for now, hashes are so much
simpler ;)

so why don't we go ahead and take this debloatification?
Attached patch Patch v1.5 (obsolete) — Splinter Review
This patch converts mozPersonalDictionary to use a hashtable. It also has some
other small cleanups I came across.
Comment on attachment 134968 [details] [diff] [review]
Patch v1.5

r=me, per extensive patch iteration and comments on irc. but my r= is
conditional on kin accepting my review, since i'm not a spellcheck guy ;)
Attachment #134968 - Flags: review+
Attachment #134968 - Flags: superreview?(kinmoz)
Comment on attachment 134968 [details] [diff] [review]
Patch v1.5

mkaply, can you look at this?
Attachment #134968 - Flags: superreview?(kinmoz) → superreview?(mkaply)
Looks good to me, but I am not technically an sr.

get sr=kin on it.
Comment on attachment 134968 [details] [diff] [review]
Patch v1.5

Sorry for the bugspam. I didn't check the sr page...
Attachment #134968 - Flags: superreview?(mkaply) → superreview?(kinmoz)
Attached patch patch v1.6 (obsolete) — Splinter Review
Updated patch, with a few small changes, and to be applied on top of the
patches in bug 225749 and bug 226756. Most notable are some changed in the
previous patch, which are now part of the patch in bug 226756, or obsoleted by
that patch.
Attachment #134968 - Attachment is obsolete: true
Attached patch patch v1.6bSplinter Review
whoops, I missed a line. The previous patch crashes, this one shouldn't :)
Attachment #137417 - Attachment is obsolete: true
Attachment #134968 - Flags: superreview?(kinmoz)
Comment on attachment 137425 [details] [diff] [review]
patch v1.6b

dwitte, can you take a quick glance over this, if i didn't mess something up?
alecf, if you are tired of those cleanup patches, let me know :)
(as always, i will rev the interface ids when i check in after the tree opens)
Attachment #137425 - Flags: superreview?(alecf)
Attachment #137425 - Flags: review?(dwitte)
Comment on attachment 137425 [details] [diff] [review]
patch v1.6b

+  PRUint32 bytesWritten;
+  for (PRInt32 i = 0; i < array.Count(); ++i ) {
+    const nsString *key = array[i];
+    NS_ConvertUTF16toUTF8 utf8Key(*key);

would be nicer to use CopyUTF16toUTF8() here, and keep an nsCAutoString outside
the for() loop.

other than that, this looks fine. sr=alecf

I also wonder if we could start allocating some of these words like the ignore
list from an arena?
Attachment #137425 - Flags: superreview?(alecf) → superreview+
Comment on attachment 137425 [details] [diff] [review]
patch v1.6b

>diff -u5 -rp spellcheck_unicode2/src/mozPersonalDictionary.cpp spellcheck/src/mozPersonalDictionary.cpp
> /* void Save (); */
> NS_IMETHODIMP mozPersonalDictionary::Save()
> {

>+  nsStringArray array;
>+  mDictionaryTable.EnumerateEntries(AddHostToStringArray, &array);

hmm, care to prealloc the correct count for |array|? like this:

nsStringArray array(mDictionaryTable.Count());

>+NS_IMETHODIMP mozPersonalDictionary::GetWordList(nsIStringEnumerator **aWords)
> {

>+  nsStringArray *array = new nsStringArray;

same here. (nice stringenumerator fu!)

>diff -u5 -rp spellcheck_unicode2/src/mozSpellChecker.cpp spellcheck/src/mozSpellChecker.cpp
> NS_IMETHODIMP 
> mozSpellChecker::GetPersonalDictionary(nsStringArray *aWordList)
> {

>+  nsCOMPtr<nsIStringEnumerator> words;
>+  mPersonalDictionary->GetWordList(getter_AddRefs(words));
>+  
>+  PRBool hasMore;
>+  nsAutoString word;
>+  while (NS_SUCCEEDED(words->HasMore(&hasMore)) && hasMore) {
>+    words->GetNext(word);
>+    aWordList->AppendString(word);
>+  }

auch, please do something about this sometime in the near future :/

with alecf's nit picked too, r=me, happy landing!
Attachment #137425 - Flags: review?(dwitte) → review+
Checked in, with the review comments fixed, and un-bitrotted.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: