Closed Bug 471799 Opened 17 years ago Closed 4 years ago

Hunspell doesn't recognize misspelled words if they are in different encoding

Tracking

()

Status:

RESOLVED WORKSFORME

People

(Reporter: rail, Unassigned)

References

Details

Rail Aliiev [:rail]

Reporter

Description

•

17 years ago

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090101 Minefield/3.2a1pre Ubiquity/0.1.4

Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2a1pre) Gecko/20090101 Minefield/3.2a1pre Ubiquity/0.1.4

Hunspell cannot properly handle words that use an encoding not "compatible" with the current dictionary encoding. At least this is true for inline spell checking.

Reproducible: Always

Steps to Reproduce:

1. Set the current dictionary to en-US (which uses ISO8859-1).

2. Write a _wrong_ word that uses non-Western symbols. For example, I use the Russian word "тестх" (the right one is "тест").

Actual Results: The wrong word is not underlined.

Expected Results: The wrong word should be underlined.

The problem can be worked around by changing the encoding of the en-US dictionary (s/SET ISO8859-1/SET UTF-8/ and recode if needed). I have not investigated, but I think something is lost during character conversion within the hunspell module.

BTW, hunspell as used in OpenOffice.org returns the expected result.
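To illustrate the suspected cause, here is a minimal Python sketch. It is not Hunspell's or Mozilla's actual code, and the helper name `representable` is hypothetical. It only shows that the Cyrillic test word has no representation in ISO8859-1, so any conversion into the dictionary's encoding must either fail or lose data, while UTF-8 can carry it.

```python
def representable(word: str, dict_encoding: str) -> bool:
    """Return True if every character of `word` can be encoded in `dict_encoding`.

    Hypothetical helper for illustration only; it does not mirror any
    Hunspell or Gecko function.
    """
    try:
        word.encode(dict_encoding)
        return True
    except UnicodeEncodeError:
        return False


# An ASCII word fits into the en-US dictionary encoding.
ascii_fits = representable("test", "iso-8859-1")

# The misspelled Russian word from the report does not.
cyrillic_fits_latin1 = representable("тестх", "iso-8859-1")

# With a UTF-8 dictionary the same word is representable.
cyrillic_fits_utf8 = representable("тестх", "utf-8")

print(ascii_fits, cyrillic_fits_latin1, cyrillic_fits_utf8)
```

Whether the spell checker then silently skips such a word or flags it is a separate design decision; the sketch only covers the conversion step.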

Rail Aliiev [:rail]

Reporter

Updated

•

17 years ago

Version: unspecified → Trunk

Alexander L. Slovesnik

Comment 1

•

17 years ago

Confirmed with Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.9.2a1pre) Gecko/20081226 Minefield/3.2a1pre ID:20081226103856

Status: UNCONFIRMED → NEW

Ever confirmed: true

Alexey Gladkov

Comment 2

•

17 years ago

s/SET ISO8859-1/SET UTF-8/ is definitely not enough. I get the following errors with this change:

This UTF-8 encoding can't convert to UTF-16: smörgåsbord

UTF-8 encoding error. Missing continuation byte in 5. character position: soigné

So the dictionary definitely needs to be recoded.
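The failure can be reproduced outside Hunspell with a small Python sketch. It assumes the dictionary entry is stored as ISO8859-1 bytes, and the function name `recode_latin1_to_utf8` is hypothetical. Changing only the SET line amounts to reading those same bytes as UTF-8, which fails at the accented character; recoding the bytes first avoids the error.

```python
def recode_latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO8859-1 bytes as UTF-8 (hypothetical helper for illustration)."""
    return raw.decode("iso-8859-1").encode("utf-8")


# A dictionary entry as it would be stored in an ISO8859-1 file.
latin1_entry = "soigné".encode("iso-8859-1")

# Changing only the header means these bytes get interpreted as UTF-8.
bad_offset = None
try:
    latin1_entry.decode("utf-8")
    header_only_ok = True
except UnicodeDecodeError as err:
    header_only_ok = False
    bad_offset = err.start  # byte offset of the undecodable sequence

# Recoding the bytes first yields valid UTF-8 that round-trips cleanly.
recoded = recode_latin1_to_utf8(latin1_entry)
round_trip = recoded.decode("utf-8")

print(header_only_ok, bad_offset, round_trip)
```

For real dictionary files, a tool such as iconv would be the usual way to do the same conversion on both the .aff and .dic files.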

Kagami Rosylight 🏳️‍🌈🏳️‍⚧️ [:saschanaz] (they/them)

Comment 3

•

4 years ago

This seems to fall within the multiple-language spellchecking design area in some way, and if I understand it correctly, it works now.

Status: NEW → RESOLVED

Closed: 4 years ago

Depends on: 69687

Resolution: --- → WORKSFORME

Dan Minor [:dminor]

Comment 4

•

4 years ago

The behaviour was changed in Bug 1773802 for the case where multiple dictionaries are enabled. If only a single dictionary is enabled, we still don't spellcheck words that use other encodings. This was a deliberate choice to avoid changing the behaviour for users with a single dictionary.

Categories

(Core :: Spelling checker, defect)
