Open Bug 420817 Opened 16 years ago Updated 2 years ago

"add to dictionary" fails on misspellings containing Unicode punctuation

Categories

(Core :: Spelling checker, defect)

All
Linux
defect

People

(Reporter: aepalea, Unassigned)

References

Details

User-Agent:       Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9b3) Gecko/2008020513 Firefox/3.0b3
Build Identifier: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9b3) Gecko/2008020513 Firefox/3.0b3

http://www.fileformat.info/info/unicode/char/2019/index.htm

I copied a fairly ordinary word (children’s, correctly used as a plural possessive) from a web page, where the apostrophe was encoded as U+2019, into a text edit box.  The Firefox 3.0b3 spell checker underlined this word.  Not noticing that the apostrophe was a non-ASCII code point, I thought "this ought to be in the dictionary" and immediately right-clicked and chose "add to dictionary".  The word remained underlined as a spelling error.  I tried again, with the same result.  Then I noticed the oddly slanted apostrophe.

For certain Unicode characters, the spell checker is clearly tokenizing differently from the user dictionary.  The solution in this case is probably to do a better job of mapping Unicode punctuation marks into families (for example, treating U+2019 like the ASCII apostrophe U+0027).  More generally, it would be better if the spell-check and user dictionary parsers were brought into alignment.
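
To make the "families" idea concrete, here is a minimal Python sketch. It is not Firefox code: the `APOSTROPHE_FAMILY` table, `normalize_word`, and the `PersonalDictionary` class are all hypothetical names invented for illustration, and the set of code points folded together is an assumption. The point is only that the add path and the lookup path go through the same normalization.

```python
# Hypothetical family table: apostrophe-like code points folded to ASCII "'".
# Which code points belong in the family is an assumption made for this sketch.
APOSTROPHE_FAMILY = {
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u02bc": "'",  # MODIFIER LETTER APOSTROPHE
    "\uff07": "'",  # FULLWIDTH APOSTROPHE
}
_TABLE = str.maketrans(APOSTROPHE_FAMILY)


def normalize_word(word: str) -> str:
    """Fold every apostrophe-like character to the ASCII apostrophe."""
    return word.translate(_TABLE)


class PersonalDictionary:
    """Toy user dictionary whose add and lookup share one normalization."""

    def __init__(self) -> None:
        self._words = set()

    def add(self, word: str) -> None:
        self._words.add(normalize_word(word))

    def contains(self, word: str) -> bool:
        return normalize_word(word) in self._words


if __name__ == "__main__":
    d = PersonalDictionary()
    d.add("children\u2019s")
    print(d.contains("children\u2019s"), d.contains("children's"))
```

With this arrangement, adding the U+2019 spelling also covers the ASCII spelling, and the base word alone is not silently added.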


Reproducible: Always

Steps to Reproduce:
1. copy the five-character string "xxx’s" (with a U+2019 apostrophe) into a text edit box 
2. right-click the underlined word and choose "add to dictionary" 
3. notice it is still marked as a spelling error 
4. go to a new line and type "xxx" 
5. notice it is not underlined: the base word "xxx" is now in your dictionary 

Actual Results:  
Adds the base word into the dictionary, but leaves the composed word displayed as an error. 


Expected Results:  
The token as underlined should be added to my dictionary.  
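
The behaviour in the steps above can be modelled with a short Python sketch. This is an assumed model of the mismatch, not the actual Firefox implementation: the two regular expressions and the function names are hypothetical, chosen so that the checker keeps U+2019 inside a word while the add path stops at it.

```python
import re

# Assumed model only: the checker treats both ' and U+2019 as word-internal,
# while the add-to-dictionary path recognises only the ASCII apostrophe.
CHECK_TOKEN = re.compile(r"[A-Za-z]+(?:['\u2019][A-Za-z]+)*")
ADD_TOKEN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def misspelled(text, dictionary):
    """Return the tokens the (modelled) checker would underline."""
    return [t for t in CHECK_TOKEN.findall(text) if t not in dictionary]


def add_to_dictionary(selection, dictionary):
    """Store whatever the (modelled) add path tokenizes from the selection."""
    m = ADD_TOKEN.match(selection)
    if m:
        dictionary.add(m.group(0))


if __name__ == "__main__":
    words = set()
    add_to_dictionary("xxx\u2019s", words)   # stores only the base word
    print(words)
    print(misspelled("xxx\u2019s", words))   # composed word still flagged
    print(misspelled("xxx", words))          # base word now accepted
```

Under this model the two tokenizers disagree on exactly one character class, which is enough to produce both the actual result (base word added) and the persistent underline.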

It was actually hard to check the reproduction steps because FF is rather arbitrary about updating the spell checker underline during text edits.  If you press Enter in the middle of an underlined string "my|bad" (| represents the cursor), *often* (but not always) FF then shows two lines with "my" and "bad" *both* underlined.  Even stranger, if you then backspace over the new line break to rejoin the two correct words (both inappropriately underlined), you get "mybad" displayed with *no* underline (but if you then move the cursor out of the word, it is updated properly after the fact).
The flip side of bug 450602. Also related to bug 355178.
Status: UNCONFIRMED → NEW
Component: General → Spelling checker
Ever confirmed: true
Product: Firefox → Core
QA Contact: general → spelling-checker

This is still problematic on Nightly 100.

Severity: normal → S3