Closed Bug 1748408 Opened 2 years ago Closed 2 years ago

Spellcheck accuracy (at least for Hungarian) has decreased between Firefox 95 and 96 beta

Categories

(Core :: Spelling checker, defect, P2)

Firefox 96
defect

Tracking


VERIFIED FIXED
97 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox95 --- unaffected
firefox96 + verified
firefox97 + verified

People

(Reporter: szalai.kalman, Assigned: emilio)

References

(Regression)

Details

(Keywords: regression)

Attachments

(3 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:100.0) Gecko/20100101 Firefox/100.0

Steps to reproduce:

I copied a block of Hungarian text (from https://hu.wikipedia.org/wiki/Sz%C3%B6veg) into a text box:
"A szöveg megfelelője gyakorlatilag az összes európai nyelvben "Text" (különböző írásképekkel a nemzeti helyesírás miatt), ami a latin "textum" szóból ered, amely szó eredeti jelentése: szövet, szöveg. A magyarban a nyelvújítás idején a jelentést magyar szóval jelöltük. A szöveg egy összefüggő és a környezetétől jól elhatárolt vagy elhatárolható megnyilvánulás, kijelentés írott vagy tágabb értelemben nem írott de (le)írható nyelven. A nem feltétlenül írott, de leírható szövegre példa a dalszöveg, egy film szövege vagy improvizált színházi szöveg."

Actual results:

Copying works, but the spellchecker does not. In Firefox 95 the spellchecker flags only two words, "Text" and "textum", which is correct. Firefox 96 b10 flags more than 20 words as misspelled, which is incorrect.

The problem seems to be related to words that contain affixes (suffixes or prefixes).

Are other agglutinative languages affected by this?

Expected results:

The spellchecker's recognition quality got worse between Firefox 95 and 96 beta (the same applies to Nightly builds).

I expect the same correct spellchecking as in Firefox 95.

:emilio, could you check this bug?

The Bugbug bot thinks this bug should belong to the 'Core::Spelling checker' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Spelling checker
Product: Firefox → Core

I can repro this:

  1. Install Hungarian dictionary: https://addons.mozilla.org/en-US/firefox/addon/hungarian-dictionary/
  2. Copy-paste this to URL bar: data:text/html;charset=utf-8,<div contenteditable spellcheck="true" lang=hu>A sz%C3%B6veg megfelel%C5%91je gyakorlatilag az %C3%B6sszes eur%C3%B3pai nyelvben "Text" (k%C3%BCl%C3%B6nb%C3%B6z%C5%91 %C3%ADr%C3%A1sk%C3%A9pekkel a nemzeti helyes%C3%ADr%C3%A1s miatt), ami a latin "textum" sz%C3%B3b%C3%B3l ered, amely sz%C3%B3 eredeti jelent%C3%A9se: sz%C3%B6vet, sz%C3%B6veg. A magyarban a nyelv%C3%BAj%C3%ADt%C3%A1s idej%C3%A9n a jelent%C3%A9st magyar sz%C3%B3val jel%C3%B6lt%C3%BCk. A sz%C3%B6veg egy %C3%B6sszef%C3%BCgg%C5%91 %C3%A9s a k%C3%B6rnyezet%C3%A9t%C5%91l j%C3%B3l elhat%C3%A1rolt vagy elhat%C3%A1rolhat%C3%B3 megnyilv%C3%A1nul%C3%A1s, kijelent%C3%A9s %C3%ADrott vagy t%C3%A1gabb %C3%A9rtelemben nem %C3%ADrott de (le)%C3%ADrhat%C3%B3 nyelven. A nem felt%C3%A9tlen%C3%BCl %C3%ADrott, de le%C3%ADrhat%C3%B3 sz%C3%B6vegre p%C3%A9lda a dalsz%C3%B6veg, egy film sz%C3%B6vege vagy improviz%C3%A1lt sz%C3%ADnh%C3%A1zi sz%C3%B6veg.
  3. Click somewhere within the text
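
The `%C3%B6`-style sequences in the step 2 URL are the UTF-8 bytes of the Hungarian letters, percent-encoded (for example, `ö` is the bytes `C3 B6`). A minimal sketch of that encoding follows, in case the test URL needs to be rebuilt for another language. The function name is hypothetical, and it assumes that only `%` and non-ASCII bytes need escaping, which matches what the URL above does.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Firefox or Hunspell): percent-encodes '%'
// and every non-ASCII byte of a UTF-8 string, leaving other ASCII untouched.
std::string PercentEncodeNonAscii(std::string_view utf8) {
  static constexpr char kHex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(utf8.size());
  for (unsigned char byte : utf8) {
    if (byte >= 0x80 || byte == '%') {
      out += '%';
      out += kHex[byte >> 4];    // high nibble
      out += kHex[byte & 0x0F];  // low nibble
    } else {
      out += static_cast<char>(byte);
    }
  }
  return out;
}

// Usage check: "szöveg", written with explicit UTF-8 bytes for 'ö'.
inline void CheckExample() {
  assert(PercentEncodeNonAscii("sz\xC3\xB6veg") == "sz%C3%B6veg");
}
```

Prefixing the encoded text with `data:text/html;charset=utf-8,<div contenteditable spellcheck="true" lang=hu>` gives a URL of the same shape as the one in step 2.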

Firefox 95: Only shows two errors "Text" and "textum" which are correct, as the reporter said.
Firefox 97: Errors nearly everywhere.

Jari, I think you are working on spell checker, could you take a look?

> :emilio , could you check this bug?

Also adding NI to Emilio as the reporter requested.

Flags: needinfo?(jjalkanen)
Flags: needinfo?(emilio)

[Tracking Requested - why for this release]: Spell-checking in some languages regressed to the point of being useless.

I don't know why I was ni?d here but I can repro and mozregression says:

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=18985bc7fb198a9c08d899f49046210e89f96e3d&tochange=6e970ac25c219afedaf60cf1b61a323b4280dc55

Bobby can you take a look? I can try too in case you're swamped.

Flags: needinfo?(emilio) → needinfo?(bholley)
Regressed by: 1739761
Has Regression Range: --- → yes

Some dictionaries might use more memory for some words than what we were
allowing.
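
One plausible reading of this description, sketched under stated assumptions: if word entries are carved out of fixed-size chunks, a single entry larger than the chunk size cannot be placed and the allocation fails, so the word never reaches the dictionary. Letting the allocator create a chunk as large as the request avoids that. The class below is a hypothetical illustration, not the Hunspell or Firefox code; the 4096-byte chunk size, the `allow_bigger_chunks` flag, and all names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical bump ("arena") allocator. Names, sizes, and the flag are
// illustrative assumptions, not Hunspell's or Firefox's actual code.
class Arena {
 public:
  Arena(std::size_t chunk_size, bool allow_bigger_chunks)
      : chunk_size_(chunk_size), allow_bigger_chunks_(allow_bigger_chunks) {
    assert(chunk_size_ > 0);
  }

  // Returns a pointer to at least `size` bytes, or nullptr if refused.
  void* Alloc(std::size_t size) {
    if (size == 0) size = 1;
    // Round up so every returned pointer stays suitably aligned.
    constexpr std::size_t kAlign = alignof(std::max_align_t);
    size = (size + kAlign - 1) / kAlign * kAlign;

    if (size > capacity_ - used_) {
      // The current chunk (if any) cannot hold the request.
      if (size > chunk_size_ && !allow_bigger_chunks_) {
        return nullptr;  // Strict mode: oversized requests simply fail.
      }
      // Relaxed mode: make the new chunk as large as the request needs.
      const std::size_t capacity = size > chunk_size_ ? size : chunk_size_;
      chunks_.push_back(std::make_unique<unsigned char[]>(capacity));
      capacity_ = capacity;
      used_ = 0;
    }
    void* result = chunks_.back().get() + used_;
    used_ += size;
    return result;
  }

  std::size_t chunk_count() const { return chunks_.size(); }

 private:
  std::size_t chunk_size_;
  bool allow_bigger_chunks_;
  std::size_t capacity_ = 0;  // Capacity of the newest chunk.
  std::size_t used_ = 0;      // Bytes handed out from the newest chunk.
  std::vector<std::unique_ptr<unsigned char[]>> chunks_;
};
```

With `Arena strict(4096, false)`, a 5000-byte request returns `nullptr`; with `Arena relaxed(4096, true)`, the same request succeeds by getting a chunk of its own. In this sketch, small entries still share chunks, so the fragmentation benefit of chunking is kept for the common case.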

Assignee: nobody → emilio

I suspect this might be connected to bug 1737396

Unfortunately I haven't worked on the spell checker. Maybe Olli would know more?

Flags: needinfo?(jjalkanen)

Let's see if Emilio's patch works there too. Thank you Emilio!

Severity: -- → S2
Priority: -- → P2
Flags: needinfo?(bholley)
Blocks: 1748699
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c6d0faf06391
Allow bigger chunks in hunspell. r=bholley

Comment on attachment 9257518 [details]
Bug 1748408 - Allow bigger chunks in hunspell. r=bholley

Beta/Release Uplift Approval Request

  • User impact if declined: Some spell checkers would malfunction
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: comment 4
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Simple tweak to regressing bug to avoid going over the chunk size.
  • String changes made/needed: none
Attachment #9257518 - Flags: approval-mozilla-release?
Flags: qe-verify+

Just a question regarding the high memory fragmentation of the Hunspell dictionary. Is it possible to fix the memory fragmentation and higher memory usage in Hunspell upstream, or did this cause problems only in sandboxing?

(for comment 12)

Flags: needinfo?(bholley)

(In reply to Kami from comment #12)

> Just a question regarding the high memory fragmentation of the Hunspell dictionary. Is it possible to fix the memory fragmentation and higher memory usage in Hunspell upstream, or did this cause problems only in sandboxing?

The issue is not specific to sandboxing, though the net amount of fragmentation will be allocator-dependent (wasi-libc uses dlmalloc, so that's where the measurements came from). This would be a reasonable change to take upstream if the maintainers are interested.

Flags: needinfo?(bholley)
Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch

Could we get this fix into Firefox 96 Branch too?

QA Whiteboard: [qa-triaged]

Reproduced the issue with Firefox 97.0a1 (20220104214425) on Windows 10x64 using the STR from comment 4.
The issue is no longer reproducible with 97.0a1 (20220106034727) on Windows 10x64, macOS 10.15 and Ubuntu 20.04. Only errors for Text and textum words are displayed.

Comment on attachment 9257518 [details]
Bug 1748408 - Allow bigger chunks in hunspell. r=bholley

Approved for 96.0rc2

Attachment #9257518 - Flags: approval-mozilla-release? → approval-mozilla-release+

Verified fixed with Firefox 96.0 RC2 (20220106144528) on Windows 10x64, macOS 10.15 and Ubuntu 20.04.

Status: RESOLVED → VERIFIED
Flags: qe-verify+

Works well for me in Nightly.

See Also: → 1737396
Blocks: 1758626