Open Bug 1017380 Opened 10 years ago Updated 2 years ago

Dictionary of first names and surnames included in Spellchecker

Categories

(Core :: Spelling checker, enhancement)

32 Branch
x86_64
Windows 7
enhancement

UNCONFIRMED

People

(Reporter: karlmernagh, Unassigned)

Details

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0 (Beta/Release)
Build ID: 20140528030219

Steps to reproduce:

It would be great if the spellchecker could be linked to a dictionary of first names and surnames, so that it stops constantly underlining names in emails and web pages that are being edited or sent.
Component: Untriaged → General
Priority: -- → P3
Summary: Dictionary of first names and surnames → Dictionary of first names and surnames included in Spellchecker
Component: General → Spelling checker
Product: Firefox → Core
I am inclined to WONTFIX this, since it would give preferential treatment to people with common English names, while others such as myself, whose names will not be in this dictionary, will still have their names flagged as misspellings.

But I do agree that the original problem reported here is pretty annoying.  Perhaps we should ignore words that begin with an upper case letter?  Not sure if hunspell has a good way to express that since the notion of upper case letters is also language specific.
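
As a minimal sketch of the capitalisation heuristic suggested above (this is not hunspell's behaviour, nor anything Firefox implements), the Python below skips any word whose first character is uppercase before consulting a word list. The names `should_skip` and `find_misspellings` and the tiny word set are hypothetical. It relies on Python's Unicode-aware `str.isupper`, which also shows why the idea is language specific: scripts without letter case never match, and in German every noun would be skipped.

```python
def should_skip(word: str) -> bool:
    """Return True when the word starts with an uppercase letter.

    Assumption for illustration: a leading capital marks a possible
    proper noun. Scripts without letter case always return False.
    """
    return bool(word) and word[0].isupper()


def find_misspellings(words, dictionary):
    """Return the words that are neither capitalised nor in `dictionary`.

    `dictionary` is a hypothetical set of known lowercase words.
    """
    return [
        w for w in words
        if not should_skip(w) and w.lower() not in dictionary
    ]


known = {"wrote", "email"}
print(find_misspellings(["Mernagh", "wrote", "teh", "email"], known))
```

One cost of this heuristic is that a misspelled word at the start of a sentence is capitalised too, so it would be skipped along with the names.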
I know what you mean, but in the databases I've looked up I was able to find Mernagh, which is a very uncommon name. Maybe pulling together a couple of databases from English-speaking countries, Europe, Asia and Africa would cover most of the common names, leaving only a small number of people with the issue.

Also, I see that new words are added to hunspell all the time, so it could be a matter of continually updating the db for a while until it has more name coverage. It will never get 100% coverage, but even 50% coverage would help a lot.
Also, maybe developers from different countries could contribute their lists of names. There are usually databases of names (both first and last) for each country stored online that can be downloaded and added to a central database.
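
To make the "central database" idea concrete, here is a rough sketch that merges several contributed name lists into the text of a hunspell-style `.dic` file, where the first line holds the entry count and each following line holds one word. The function name `build_dic` and the sample names are hypothetical, and real name databases would need per-source cleaning that is not shown.

```python
def build_dic(name_lists):
    """Merge name lists into the text of a hunspell-style .dic file.

    Duplicates and blank entries are dropped. The first output line is
    the number of entries, followed by one name per line.
    """
    names = set()
    for name_list in name_lists:
        for name in name_list:
            name = name.strip()
            if name:
                names.add(name)
    ordered = sorted(names)
    return "\n".join([str(len(ordered)), *ordered]) + "\n"


# Hypothetical contributions from two sources.
irish = ["Mernagh", "Karl"]
other = ["Karl ", "", "Ehsan"]
print(build_dic([irish, other]), end="")
```

A hunspell dictionary is a pair of files, so a matching `.aff` file would also be needed before a spellchecker could load the result.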
An alternative approach may be to sync the dictionary between devices. That means that any names that I have added to the dictionary on one device will not be underlined as misspelled on the other device. I don't know if that makes sense, but I'm just thinking of different approaches.
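
A very small sketch of what such a sync could amount to, assuming each device's personal dictionary is simply a list of words, one per line (the function name `merge_personal_dictionaries` and the sample words are hypothetical, and no real sync service is involved): the merged dictionary is the union of the words from every device.

```python
def merge_personal_dictionaries(*word_lists):
    """Return the sorted union of several personal word lists.

    Each argument is an iterable of lines, one word per line.
    Surrounding whitespace and blank lines are ignored.
    """
    merged = set()
    for words in word_lists:
        for word in words:
            word = word.strip()
            if word:
                merged.add(word)
    return sorted(merged)


# Hypothetical contents of the personal dictionary on two devices.
laptop = ["Mernagh\n", "Ehsan\n"]
phone = ["Ehsan\n", "Akhgari\n", "\n"]
print(merge_personal_dictionaries(laptop, phone))
```

A plain union never forgets a word, so removing a word on one device would not propagate; handling deletions would need extra bookkeeping that this sketch leaves out.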
(In reply to Karl Mernagh from comment #3)
> I know what you mean but from the dbs I've looked up I was able to find
> Mernagh which is very uncommon. Maybe pulling together a couple of dbs from
> English speaking countries, Europe, Asia, Africa would cover most of the
> most common names leaving only a small amount of people with the issue.

I think there are too many cultures/languages, and also too many ways to shorten names etc., for that to work in practice.  And note that all of this work should ideally be done per language (languages other than en-US have their own dictionaries, which people install as extensions, and which maybe sometimes ship with some localized builds as well).

> Also, I see that new words are added to hunspell all the time so it could be
> a matter of continually updating the db for a while until it has more name
> coverage. It will never get 100% coverage but even a 50% coverage would help
> a lot.

I'm not worried about not having 100% coverage; that's OK, since we'll never get there in practice.  I'm worried about starting with English names and never driving the effort further, and hence discriminating against some of our users.

(In reply to Karl Mernagh from comment #4)
> Also maybe developers from different countries could contribute their list
> of names. There are usually databases of names (both first and last) for
> each country stored online that can be downloaded and added to a central
> database

I don't doubt that such databases exist, but they might be in non-Latin scripts, or in Latin-based scripts that use characters which are commonly omitted or replaced in English, etc.  But of course it's at least conceivable for it to work.  :-)

(In reply to Karl Mernagh from comment #5)
> An alternative approach may be to sync the dictionary between devices. That
> means that any names that I have added to the dictionary will not be
> underlined as misspelled in the other device. Don't know if that makes sense
> but just thinking of different approaches.

I doubt that this will be practical.  :-)
Thanks Ehsan. I think I've exhausted all options here. I suppose if it was a higher priority issue the effort would be worth it but as you've said, there are too many localisations, different spellings, etc... Might be nice as an addon for en-us, en-uk, en-ie. I've time over the summer now. Might build my first addon :-)

Thanks again.
(In reply to comment #7)
> Thanks Ehsan. I think I've exhausted all options here. I suppose if it was a
> higher priority issue the effort would be worth it but as you've said, there
> are too many localisations, different spellings, etc... Might be nice as an
> addon for en-us, en-uk, en-ie. I've time over the summer now. Might build my
> first addon :-)

Thanks for filing the bug and for the discussion.  I hope I didn't come across too negative.  :-)  FWIW, I actually think this is a great exploration idea for an add-on, so if you end up working on this, please update this bug with your experiments and findings.  Maybe we'll figure out a good way to solve this some day!
Not at all Ehsan, you've been very patient and explanatory. I know how busy you are. Thanks for your time. I'll keep you posted with any experiments I do.
Severity: normal → enhancement
Priority: P3 → --
Severity: normal → S3