Closed Bug 525581 Opened 15 years ago Closed 15 years ago

crashes [@ AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)] - [@ AffixMgr::isRevSubset(char const*, char const*, int) ]

Categories

(Core :: Spelling checker, defect, P1)

x86
Windows XP
defect

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1
Tracking Status
status1.9.2 --- beta4-fixed

People

(Reporter: dbaron, Assigned: dbaron)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: [crashkill][crashkill-fix][fixed-in-hunspell-1.2.12])

Crash Data

Attachments

(3 files, 1 obsolete file)

Whiteboard: [#20 Firefox 3.6b1 topcrash]
...that's 3.6b1 in testing before it shipped, I should say.
Whiteboard: [#20 Firefox 3.6b1 topcrash]
regression, no?

also appears in Thunderbird:3.1a1pre, first build is 2009102000.
FF appears as early as 20090624 (if one excludes 2008070200 as an aberration)
Severity: normal → critical
might be in 3.5.x but certainly at much lower volume.

distribution of all versions where the AffixMgr::suffix_check crash was found on 20091111-crashdata.csv
 245 Firefox 3.6b1
 109 Firefox 3.6b2
   9 Firefox 3.7a1pre
   3 Firefox 3.6b3pre
   1 Firefox 3.5.5
[no other releases reported]

the domains reported in the crash reports are *very* highly dominated by sites in poland.

domains of sites
  58 http://nasza-klasa.pl
  17 \N//
  13 http://www.google.pl
  12 http://www.allegro.pl
   8 http://mail.google.com
   8 http://[iedzik.pl
   7 about:blank//
   6 http://translate.google.pl
   6 http://allegro.pl
   5 http://www.youtube.com
   5 http://www.fotka.pl
   5 http://wiadomosci.onet.pl
   5 http://poczta.o2.pl
   5 http://mail10.tlen.pl
AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)
        http://crash-stats.mozilla.com/report/index/81d39977-33b0-4ca3-a212-415902091111
        I was changing dictionary from english to polish
        http://www.dobreprogramy.pl/Firefox-Beta-do-pobrania,Aktualnosc,15293.html#komentarze
        Firefox 3.6b2 Windows NT 6.1.7600
         525581

AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)
        http://crash-stats.mozilla.com/report/index/04e80bcd-9348-4012-aa17-b8af02091110
        "Bonjour,  |  | J'étais en train d'écrire un courriel. J'ai voullu changer la langue du correcteur d'orthographe pour passer du français à l'italien en passant par le menu context
uel (clic droit). Au moment où j'ai cliqué sur ""italien"", firefox s'est fermé.  |  | Merci pour votre super travail avec Firefox."
        https://mail.google.com/mail/? [query string removed]
        Firefox 3.6b1 Windows NT 6.0.6002 Service Pack 2
         525581
Keywords: regression
this one has really exploded in the since the 11/6. that's when 3.6b1 when over 200k adu's

3.6b*
adu's  #crashes signature  date

 53k   3   total crashes for AffixMgr::suffix_check on 20091024-crashdata.csv
       6   total crashes for AffixMgr::suffix_check on 20091025-crashdata.csv
       4   total crashes for AffixMgr::suffix_check on 20091026-crashdata.csv
       4   total crashes for AffixMgr::suffix_check on 20091027-crashdata.csv
       5   total crashes for AffixMgr::suffix_check on 20091028-crashdata.csv
       4   total crashes for AffixMgr::suffix_check on 20091029-crashdata.csv
       2   total crashes for AffixMgr::suffix_check on 20091030-crashdata.csv
 65k  12   total crashes for AffixMgr::suffix_check on 20091031-crashdata.csv
 89k  15   total crashes for AffixMgr::suffix_check on 20091101-crashdata.csv
123k  28   total crashes for AffixMgr::suffix_check on 20091102-crashdata.csv
158k  43   total crashes for AffixMgr::suffix_check on 20091103-crashdata.csv
180k  55   total crashes for AffixMgr::suffix_check on 20091104-crashdata.csv
198k  92   total crashes for AffixMgr::suffix_check on 20091105-crashdata.csv
213k 304   total crashes for AffixMgr::suffix_check on 20091106-crashdata.csv
198k 250   total crashes for AffixMgr::suffix_check on 20091107-crashdata.csv
209k 369   total crashes for AffixMgr::suffix_check on 20091108-crashdata.csv
241k 428   total crashes for AffixMgr::suffix_check on 20091109-crashdata.csv
248k 371   total crashes for AffixMgr::suffix_check on 20091110-crashdata.csv
260k 367   total crashes for AffixMgr::suffix_check on 20091111-crashdata.csv
should have mentioned also that prior to 10/24 the daily crash volume was steady at 0 to 9 crashes per day for sept. and oct.
just about all the stacks I've looked at look something like

Frame  	Module  	Signature [Expand]  	Source
0 	xul.dll 	AffixMgr::suffix_check 	extensions/spellcheck/hunspell/src/affixmgr.cpp:2485
1 	xul.dll 	AffixMgr::affix_check 	extensions/spellcheck/hunspell/src/affixmgr.cpp:2789
2 	xul.dll 	Hunspell::checkword 	extensions/spellcheck/hunspell/src/hunspell.cpp:655
3 	xul.dll 	xul.dll@0x3ad162 	
4 	xul.dll 	mozHunspell::Check 	extensions/spellcheck/hunspell/src/mozHunspell.cpp:433
5 	xul.dll 	mozSpellChecker::CheckWord 	extensions/spellcheck/src/mozSpellChecker.cpp:145
6 	xul.dll 	nsEditorSpellCheck::CheckCurrentWordNoSuggest 	editor/composer/src/nsEditorSpellCheck.cpp:300
7 	xul.dll 	mozInlineSpellChecker::DoSpellCheck 	extensions/spellcheck/src/mozInlineSpellChecker.cpp:1407
8 	xul.dll 	mozInlineSpellChecker::ResumeCheck 	extensions/spellcheck/src/mozInlineSpellChecker.cpp:1476
9 	xul.dll 	mozInlineSpellResume::Run 	extensions/spellcheck/src/mozInlineSpellChecker.cpp:510
Whiteboard: [crashkill]
ryanvm, any ideas on this one?
regression from  Bug 454108 ?
not reproducible on macos. must be windows only.

I'll test on W7, but I don't have XP.
Cannot reproduce on Win7. Anyone with XP?
It seems, Hunspell with the Polish dictionary has problem with some Polish web pages. Is there any reproducible input (a Polish web page) for this bug? 
Thanks, László
I failed to reproduce it using Fx3.6b2 pl on Win7. It may be Win XP/Vista only.
the signature  AffixMgr::isRevSubset(char const*, char const*, int) might also be the same problem  reports of that are on the rise too.

http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.6b2&query_search=signature&query_type=exact&date=&range_value=1&range_unit=weeks&do_query=1&signature=AffixMgr%3A%3AisRevSubset%28char%20const*%2C%20char%20const*%2C%20int%29

here are comments and more ideas for testing


20091112-crashdata.csv

AffixMgr::isRevSubset(char const*, char const*, int)
	http://crash-stats.mozilla.com/report/index/959e3e28-39a6-47bb-bdd3-54af22091112
	"pharse ""się, że "" causes crash - hitting space after word ""że"" in phpbb form for sending new messages to forums"
	about:sessionrestore
	Firefox 3.6b2 Windows NT 5.1.2600 Dodatek Service Pack 3
	 

AffixMgr::isRevSubset(char const*, char const*, int)
	http://crash-stats.mozilla.com/report/index/9abe8c51-7fb8-4e45-ac57-a12592091112
	x
	http://www.futbol.pl/artykul/68032.html
	Firefox 3.6b2 Windows NT 5.1.2600 Dodatek Service Pack 3
	 
20091113-crashdata.csv

AffixMgr::isRevSubset(char const*, char const*, int)
	http://crash-stats.mozilla.com/report/index/5305e26e-28a0-4bab-8f41-e3f5a2091113
	Jak na forum chciałem wyjustować tekst to się przeglądarka wyłączyła.
	\N
	Firefox 3.6b2 Windows NT 5.1.2600 Dodatek Service Pack 3
	 
20091114-crashdata.csv
20091115-crashdata.csv
20091116-crashdata.csv

AffixMgr::isRevSubset(char const*, char const*, int)
	http://crash-stats.mozilla.com/report/index/1b0ebd59-8fa9-459a-a4e6-339682091116
	Nie wiem co się dzieje, pisałam w notlogu i zniknęło samo, druga sprawa strasznie długo trzeba czekać na otworzenie Firefoksa.
	http://pl.netlog.com/go/messages/send/ ---sanitized user data
	Firefox 3.6b2 Windows NT 5.1.2600 Dodatek Service Pack 3
	 

AffixMgr::isRevSubset(char const*, char const*, int)
	http://crash-stats.mozilla.com/report/index/81df7f6a-9f21-4bf1-9a3e-e4eba2091116
	The sudden lock in time the previewer of writing down the textu in textual field. | Nagłe zamknięcie się przeglądarki w czasie wpisywania textu w pole tekstowe.
	http://nasza-klasa.pl/profile/12087199/gallery/144
	Firefox 3.6b2 Windows NT 5.1.2600 Dodatek Service Pack 3
	 
20091117-crashdata.csv

AffixMgr::isRevSubset(char const*, char const*, int)
	http://crash-stats.mozilla.com/report/index/4bee3ef1-cfc9-41fb-9968-cfb462091117
	na www.tlen.pl
	http://mail10.tlen.pl/
	Firefox 3.6b2 Windows NT 5.1.2600 Dodatek Service Pack 3
signature list
 403 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)

os breakdown
 152 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 5.1.2600 Dodatek Service Pack 3
  68 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 6.1.7600
  54 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 5.1.2600 Dodatek Service Pack 2
  53 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 6.0.6002 Service Pack 2
  25 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 6.1.7100
  14 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 5.1.2600 Service Pack 3
  13 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 6.0.6000
  10 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 6.0.6001 Service Pack 1
   5 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 5.1.2600
   4 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 5.1.2600 Service Pack 2
   2 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 6.1.7229
   2 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Windows NT 5.1.2600 Dodatek Service Pack. 1
   1 AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char) Linux 0.0.0 Linux 2.6.31.5-0.1-default #1 SMP 2009-10-26
 15:49:03 +0100 i686 GNU/Linux

signature list
  69 AffixMgr::isRevSubset(char const*, char const*, int)

os breakdown
  31 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 5.1.2600 Dodatek Service Pack 3
  13 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 5.1.2600 Dodatek Service Pack 2
   8 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 6.1.7600
   8 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 6.0.6002 Service Pack 2
   3 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 6.1.7100
   3 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 6.0.6000
   1 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 5.1.2600 Service Pack 3
   1 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 5.1.2600 Service Pack 2
   1 AffixMgr::isRevSubset(char const*, char const*, int) Windows NT 5.1.2600 Dodatek Service Pack. 1

distribution of all versions where the AffixMgr::isRevSubset crash was found on 20091117-crashdata.csv
  65 Firefox 3.6b2
   4 Firefox 3.7a1pre
Attached file more ideas for testing
lots of ideas on how to test, but they need some translation from polish

one that was in english suggests...

       "Hello, polish words cause a crash. ""ą, ę"""

see attachment for more.
http://people.mozilla.com/crash_analysis/20091118/20091118_Firefox_3.6b2-interesting-addons.txt doesn't suggest there is any older version of the dictionary that might be installed and a problem.   do we ship a dictionary with pl ?

that would help to confirm its only the latest dictionary that is a problem.
(In reply to comment #16)
> Created an attachment (id=413274) [details]
> more ideas for testing
> 
> lots of ideas on how to test, but they need some translation from polish

So here comes the translation.
There are reports of crashes while surfing the polish web pages.
Much of the cases relate to writing messages at social networking sites, or just entering a text in a text box. There are some indications of a crash happening in the moment of typing a polish special character (witch i do not dare to type now).
I won't mention the comments that could be described as... let's say offending.

Hope it helps.
however hard this may be we need some one to do this...

happening in the moment of typing a polish special character (witch i do not
dare to type now).

maybe just typing those out in a text file in another program or earlier version of firefox, and attaching that file to this bug can help us find a crash test helper.

@Wojciech Moch  -- thanks a lot for the translation!
@chris hofmann -- Any time! 

Today I can give you some more info about this. I made some experimenting, and found some astonishing results. All of my tests were made under Windows Vista.
1. Firefox 3.5.5 EN + polish dictionary - no crashes, every works perfectly.
2. Firefox 3.5.5 PL + polish dictionary - no crashes, every works perfectly.
3. Firefox 3.6b3 EN + polish dictionary - no crashes, every works perfectly.
4. Firefox 3.6b3 PL + polish dictionary - occasional crashes, not all words are correctly recognized by the dictionary. Try typing "Oczywiście" - the word meaning "Of course" is marked as wrong although it is written right. The same happens with some other words containing polish special characters. It looks like the dictionary does not like the characters "ś" and "ł", and marks a lot of such word ad incorrect, but some of them seem to be all right. Strange indeed.
So it looks like, the problem is caused by the polish locale and polish dictionary combination.
If you're saying fx3.x pl + polish dictionary, you're talking about the dictionary we ship with the polish localization, and if you're saying en+polish dictionary, you're talking about the dictionary from AMO?

Which version on AMO are you using?
Honestly, i am not sure. In all cases the dictionary that was installed was: https://addons.mozilla.org/pl/firefox/addon/3052.
I know, that i had to install it separately for the EN versions of Firefox, but for the polish versions, i think it was installed by default. If it is not the case, then it seems, i had to install it somewhere along the way.
I can try to make a one more clean installation, but not until tomorrow.

Best regs
Wojtek Moch
The dictionary should show in your add-ons manager, including a version number.

Anyway, I got a suspicion: The dictionary we ship hasn't been updated in a while, so something with the old version of the dic with the new hunspell goes interesting.

Maybe a diff on the pl.aff can hint at something?
The version of the dictionary I'm using here at work is 1.0.20090810.
It is running at Firefox 3.6b3 PL, and i did not receive any updates for it for the last few... months?
As for the version of all other combinations, i will be able to check them as soon as i return home, but am quite confident that the version number is the same.

Best regs
Wojtek
we don't record locale in the crash data (maybe some day soon), so figuring out which locales might be affected by this is going to be tricky.  I've seen comments like

AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)
        http://crash-stats.mozilla.com/report/index/0ee592d2-d0ca-4a4d-a665-267f92090812
        Crashed after switching spell checking language from English to Armenian
        Firefox 3.5.2 Mac OS X 10.5.8 9L30


and comments in French like

AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)
        http://crash-stats.mozilla.com/report/index/04e80bcd-9348-4012-aa17-b8af02091110
        "Bonjour,  |  | J'étais en train d'écrire un courriel. J'ai voullu changer la langue du correcteur d'orthographe pour passer du français à l'italien en passant par le 
menu contextuel (clic droit). Au moment où j'ai cliqué sur ""italien"", firefox s'est fermé.  |  | Merci pour votre super travail avec Firefox."
        Firefox 3.6b1 Windows NT 6.0.6002 Service Pack 2
         525581

so I think its beyond just a 3.6b3-polish dictionary combination problem.  

I'll try to figure out a way to scan the crash data to isolate other locale/dictionary pairs that might be a problem.
I wonder if we can fuzz against :check in the spellchecking engine, http://mxr.mozilla.org/mozilla1.9.2/source/extensions/spellcheck/idl/mozISpellCheckingEngine.idl?mark=92-92#89. If we're lucky that's accessible from something as low end as xpcshell?
(In reply to comment #26)
> 
> so I think its beyond just a 3.6b3-polish dictionary combination problem.  
> 
> I'll try to figure out a way to scan the crash data to isolate other
> locale/dictionary pairs that might be a problem.

I'm blocking on good ideas for figuring this out.  The only short term thing I can think of are brute force menthods.

1) aleart all localization teams to check the attachment for comments that look like they might be in your language to suspect if there is a problem in your localization

2) aleart all localization teams to test specifically in this area and watch for reported bugs

3) develop some kind of automation hack that would auto enter all characters the alphabets of each local to figure out which characters might be causing problems.

4) look at the current set of known characters that are causing problems to figure out if there is any pattern there.

item #3 sounds pretty hard but maybe there is a creative way to do this.  that might be the only reliable way to scope the problem if we need targeted fixes by local to fix dictionaries.  the better thing would be to fix this in the spell check engine if we can isolate the bug to something there, but that fix would also benefit from some kind of item #3 testing.
s/aleart/alert
pt-BR and pt-PT as a candidate that should get testing.

20091108-crashdata.csv
	AffixMgr::AffixMgr(char const*, Has
	http://crash-stats.mozilla.com/report/index/bf613312-3647-4c55-8db5-a4c172091108
	nao percebo
I'm not sure what locale this one is.  Arabic?

20090727-crashdata.csv

	AffixMgr::isRevSubset(char const*, 
	http://crash-stats.mozilla.com/report/index/b98914fe-1d6a-44df-b134-8cadc2090726
     لقد انقطع عملي	

I'm casting the widest net on AffixMgr stack signatures just so we don't miss something by mistake.  Extra testing by all locales in this area can't hurt.
yes, that's arabic ( he says he lost his work in the crash I think)
This needs a testcase and an owner, pronto. Looks like there's enough meat in the comments to be able to generate one.
Flags: blocking1.9.2? → blocking1.9.2+
Keywords: testcase-wanted
Priority: -- → P1
I pulled six minidumps for this one (actually a little before Beltzner's comment) and I've started poking at them.

Of interest is that the word being checked does appear to be stored on the stack in mozInlineSpellChecker::DoSpellCheck; thus the parameter to nsEditorSpellCheck::CheckCurrentWordNoSuggest can be examined in a minidump.

http://crash-stats.mozilla.com/report/index/0a4554c7-a920-4326-a910-20bf62091118
the word is "żeby" (U+017C U+0065 U+0062 U+0079)

http://crash-stats.mozilla.com/report/index/6d55ffdb-fbcc-412b-9bad-e79f42091118
the word is "że" (U+017C U+0065)

http://crash-stats.mozilla.com/report/index/8a8fa9cc-44cb-4c07-9585-4be962091118
the word is "że" (U+017C U+0065)

http://crash-stats.mozilla.com/report/index/977ecdd1-c02c-4a31-95c9-31cc62091118
the word is "żeby" (U+017C U+0065 U+0062 U+0079)

http://crash-stats.mozilla.com/report/index/dac35395-c3f5-44bf-88df-440a42091118
the word is "że" (U+017C U+0065)

http://crash-stats.mozilla.com/report/index/fa7cab9a-5f08-4b63-98e9-735642091118
the word is "że" (U+017C U+0065)

I think I see a bit of a pattern here.
Based on that, I still couldn't figure out steps to reproduce using a Polish Linux Firefox 3.6b3 build (and I tried with the extension in comment 23 as well).  Maybe somebody else can, though.  And I'll try on Windows in a bit (although this crash is cross-platform according to crash-stats).


Wojtek Moch, is there some particular combination of words you can enter to make it crash?  Maybe something related to the words in comment 34?
Other interesting data from the minidumps:  the first two parameters to affix_check and suffix_check (they take char* rather than PRUnichar*) are:
 * a string whose bytes are the word as described above, except with the "ż" replaced by a null
 * the length 0 (which does correspond to a string whose first byte is null)
I looked at the URL list for this crash to try to figure out what localization users were using by searching the URL list for the "org.mozilla:locale" strings that we send to google for the start page and search box.

In other words, I took 8 days worth of the URLs for this crash (November 12-19, inclusive) and ran:

grep org.mozilla | sed 's/%3A/:/g' | sed 's/.*org\.mozilla://;s/:official.*//' | sort | uniq -c

on it, leading to the result:

      5 en-US
      1 fr
      1 ja
     42 pl


I had actually been thinking that the cause of the crash might be the empty string going through, and that it was the result of users typing in Polish with some other language's dictionary.  But that doesn't seem to be the case, at least based on this (although it's a pretty small sample).

(Our hunspell glue code's encoding handling is pretty broken; if a word to be spell-checked can't be converted to the dictionary's encoding (which varies by language), it seems like it converts only the initial substring that can be converted and spell-checks that instead.  It should just report the word as misspelled!)
I tried running with the polish dictionary under valgrind and discovered that the code in get_current_cs in csutil.cpp has serious problems, although I don't think it's related to this crash.
(In reply to comment #37)
> (Our hunspell glue code's encoding handling is pretty broken; if a word to be
> spell-checked can't be converted to the dictionary's encoding (which varies by
> language), it seems like it converts only the initial substring that can be
> converted and spell-checks that instead.  It should just report the word as
> misspelled!)

Actually, here I was misreading and I think it's ok.

(In reply to comment #38)
> I tried running with the polish dictionary under valgrind and discovered that
> the code in get_current_cs in csutil.cpp has serious problems, although I don't
> think it's related to this crash.

Now I'm thinking that this actually *is* related to the crash, and whether you crash would depend on what happens to be in certain uninitialized memory at hunspell startup time.
David: can you own this? I mean, you seem to be .. ;)
Yeah, I'm working on a patch that I think might fix it, although I'm not sure how I'll tell for sure if it did.
Assignee: nobody → dbaron
And, to clarify the current decoder behavior going through the current code:

The ISO-8859-1 decoder consumes 256 characters and outputs 256.

The ISO-8859-2 decoder consumes 128 characters and outputs 127, leaving the other 129 uninitialized.

I think what may be leading to problems is when one of those uninitialized characters is 0 and hunspell gets confused as a result.
Attached patch possible patchSplinter Review
I think there's a pretty good chance that this patch will fix the problem.

It definitely fixes the problem that the upper 129 entries of the array generated in this function are left uninitialized when the dictionary encoding is ISO-8859-2, which it is for the Polish dictionary.

My theory is that certain patterns of bad data in those entries (or perhaps inconsistency between different times the function is called) can lead to crashes.


The code in question is attempting to build an array of in-encoding case equivalents.  The old code did this by doing encoding conversion on an array of bytes 0-255.  I'm replacing that code with code that does the encoding conversion separately for each byte, doing more rigorous error checking, and assuming that if there's an error, the byte's case mappings should be the identity mapping.

I'm reasonably sure this will be slower, but it should also be more correct.  Believe it or not, we actually initialize this array many times, though we should probably fix that and just initialize it once-per-encoding or once-per-encoding-change.  But I'm not planning to worry about the performance implications right now; it's probably not a big deal.
Comment on attachment 413745 [details] [diff] [review]
possible patch

I'm going to request review on this patch from two people:

First, from jst, since I know he's around and can review it so that I can get this in reasonably quickly.
Comment on attachment 413745 [details] [diff] [review]
possible patch

And, second, from smontagu, who I would like to have a look at this as well, although I may not wait for his comments before landing, since I'd like to get testing on this as soon as possible.  (And, in particular, if we build a new weekly beta this weekend, I'd like to get this in.)
Attachment #413745 - Flags: review?(smontagu)
Attachment #413745 - Flags: review?(jst) → review+
We're getting a few crash reports per day from nightlies (although there aren't any from today yet, but perhaps that's because users in Poland probably aren't likely to get nightlies until morning their time, i.e., a few hours from now):

3.6b4pre nightlies (click the "Table" tab after loading):
http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.6b4pre&query_search=signature&query_type=startswith&query=Affix&date=&range_value=1&range_unit=weeks&do_query=1&signature=AffixMgr%3A%3Asuffix_check%28char%20const*%2C%20int%2C%20int%2C%20AffEntry*%2C%20char**%2C%20int%2C%20int*%2C%20unsigned%20short%2C%20unsigned%20short%2C%20char%29

3.7a1pre nightlies (click the "Table" tab after loading):
http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.7a1pre&query_search=signature&query_type=startswith&query=Affix&date=&range_value=1&range_unit=weeks&do_query=1&signature=AffixMgr%3A%3Asuffix_check%28char%20const*%2C%20int%2C%20int%2C%20AffEntry*%2C%20char**%2C%20int%2C%20int*%2C%20unsigned%20short%2C%20unsigned%20short%2C%20char%29

so it may be possible to get a sense of whether this worked based on nightlies alone.  (Note that if we bump the version number it will break that query.)
l10n nightlies come in in the evening CET, so the nightlies with this fix will arrive here tomorrow late-afternoon/evening.
(In reply to comment #42)
> And, to clarify the current decoder behavior going through the current code:
> 
> The ISO-8859-1 decoder consumes 256 characters and outputs 256.
> 
> The ISO-8859-2 decoder consumes 128 characters and outputs 127, leaving the
> other 129 uninitialized.

This decoder behavior of the ISO-8859-2 decoder has changed since 1.9.1.  (I just tested on the Firefox 3.5.5 release tag, and there it did 256/256.)
I was trying to figure out when the encoder behavior change, and I discovered that I only get the encoder behavior that causes the problem I found if I install the Polish *chrome*.  (In other words, replace the chrome/ directory next to Firefox with one from a Polish build.)

It's not at all clear to me why that matters; it really shouldn't change encoder behavior.  (So my statement that the encoders didn't behave that way on 1.9.1 may be wrong.)
(In reply to comment #51)
> I was trying to figure out when the encoder behavior change, and I discovered
> that I only get the encoder behavior that causes the problem I found if I
> install the Polish *chrome*.  (In other words, replace the chrome/ directory
> next to Firefox with one from a Polish build.)
> 
> It's not at all clear to me why that matters; it really shouldn't change
> encoder behavior.  (So my statement that the encoders didn't behave that way on
> 1.9.1 may be wrong.)

The reason for *that* is that http://hg.mozilla.org/mozilla-central/rev/5953efc48779 essentially made the stateless decoders stateful for callers who don't call SetInputErrorBehavior, so the decoders that are singletons cache that from their previous user.  That seems pretty bad.
I think fixing the underlying decoder problem here is probably a release blocker; I would expect it to cause problems with Web pages in areas that use affected encodings; I'm surprised we haven't heard more about such problems, but I'm not sure how much Web-compatibility testing we've gotten on this branch yet, especially in non-English-speaking parts of the world.

In any case, here's a patch to remove the concept of stateless decoders, which we've always gone without for the many single-byte encodings that don't start with "ISO-8859".  The only one that's still truly stateless is ISO-8859-1, since we implement it as CP1252, i.e., without any holes in the table, which means the error behavior doesn't matter since there are never errors.  That might be worth a special case, but if it is, we probably want to improve the performance issues in general.  (I actually have a patch lying around to make the charset converter manager cache factories so it doesn't need to go through the service manager every time.  The other issue is sharing the work of CreateFastTable.)  I'm not sure how much performance improvement we have time to do in time for release, though.

Thoughts?

I've pushed this to try to see if it shows up on the perf numbers.
Though given that Henri put a mutex around the decoder hash two months ago and that didn't register, I'd sort of expect this wouldn't either, since that mutex made the CCM's hash probably-not-much-faster than the service manager (which is really also just a hash table with a mutex around it).  (That also means my patch probably isn't much use anymore either.)

It still might be worth sharing the table initialization somehow.
There are a few situations where the decoder lookup is noticeable (e.g. innerHTML sets).  Usually whatever is being decoded is big enough that the lookup cost is not relevant.

If we see this showing a regression on innerHTML benchmarks, we could try to do perf work as a followup in a minor release, I think.
I filed bug 530328 on the stateless decoder problem in general rather than continuing this bug for it.
After bug 530328 is fixed, you could change to 
 decoder->SetInputErrorBehavior(decoder->kOnError_Recover);
and revert the rest of the patch, right?
(In reply to comment #58)
> After bug 530328 is fixed, you could change to 
>  decoder->SetInputErrorBehavior(decoder->kOnError_Recover);
> and revert the rest of the patch, right?

I could, although this approach does seem a bit more robust.  (For example, I think it handles better the case of encodings that have one of a pair of case-equivalents but not the other, such as the ÿ in ISO-8859-1.)
Summary: crashes [@ AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)] → crashes [@ AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)] - [@AffixMgr::isRevSubset(char const*, char const*, int) ]
Attachment #413745 - Flags: review?(smontagu) → review+
Summary: crashes [@ AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)] - [@AffixMgr::isRevSubset(char const*, char const*, int) ] → crashes [@ AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)] - [@ AffixMgr::isRevSubset(char const*, char const*, int) ]
I think I'm comfortable marking this fixed based on the nightly crash data.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a1
Whiteboard: [crashkill] → [crashkill][crashkill-fix]
Whiteboard: [crashkill][crashkill-fix] → [crashkill][crashkill-fix][fixed-in-hunspell-1.2.12]
Depends on: 579649
Crash Signature: [@ AffixMgr::suffix_check(char const*, int, int, AffEntry*, char**, int, int*, unsigned short, unsigned short, char)] [@ AffixMgr::isRevSubset(char const*, char const*, int) ]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: