Closed Bug 983817 Opened 6 years ago Closed 5 years ago

[x86 Linux Ubuntu packages] crashes during spellchecker initialization (various textfield/textarea interactions/message compose) due to compiler bug using a 32-bit read for a read from an array of unsigned short

Categories

(Core :: Spelling checker, defect, critical)

27 Branch
x86
Linux
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla38
Tracking Status
firefox36 --- fixed
firefox37 --- fixed
firefox38 --- fixed
firefox-esr31 --- fixed

People

(Reporter: snailtsunami, Assigned: dbaron)

References

Details

(Keywords: crash, intl, topcrash-thunderbird, Whiteboard: [regression:TB31][tbird crash])

Crash Data

Attachments

(6 files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:27.0) Gecko/20100101 Firefox/27.0 (Beta/Release)
Build ID: 20140218140052

Steps to reproduce:

Clicked on a text field on github's web site and hit ctrl + v to paste a long public key into the text field, very quickly.


Actual results:

Browser stopped responding and crashed after a minute.


Expected results:

Text pasted; browser not crash.
I saw this happen; I'm wondering from the crash report if it may have been Firefox trying to spell-check the public key.  Or, it may have something to do with text fields in a form and switching fields quickly.
Status: UNCONFIRMED → NEW
Component: Untriaged → Spelling checker
Ever confirmed: true
Product: Firefox → Core
I was directed to this report from this crash report [1] on my system.

On my system, firefox is crahsing regularly (several times a day!) when I click a text field, or when I change the spell-check language. This is not systematic though. I am using 2 different languages in FF text fields, and I switch between them several times a day (maybe 20 times?). About 10% of the time FF will crash. This is highly unreliable. As a consequence I systematically copy/paste the content of the text field before I switch spell-check language (as the crash lose the last edit). This is truly frustrating.

This bug appeared somewhere around FF26-FF27 about. It used to work fine before.



[1] https://crash-stats.mozilla.com/report/index/e043de51-0fc8-4ddb-93ae-ab7c82140331
Duplicate of this bug: 995356
From the comments in other crashes, this may be an issue with focusing on a form's text field rather than a spelling check error. The crashes only happen on Linux and is still happening in builds for Firefox 29 from the end of April.  There have been around 600 crashes with this signature in the last 7 days 

More crash reports: https://crash-stats.mozilla.com/report/list?signature=flag_qsort&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2014-05-03+21%3A00%3A00&range_value=1#tab-sigsummary
Component: Spelling checker → Layout: Form Controls
This is a crash in the hunspell code we use to spell check.

To people who can reproduce: what dictionaries do you have installed?  Does someone have steps to reproduce?
Component: Layout: Form Controls → Spelling checker
I have French, English (UK) and English (US)
Thanks!  It would be nice if someone can try installing those dictionaries and get us some steps to reproduce.
Keywords: qawanted
Unfortunately I have no steps to reproduce. On my system it is pretty random, which makes the issue even more frustrating (and a real pain). "May I click this field? Will FF crash this time? Or should I first copy the field content in the clipboard?". See also my comment in bug 995356.
(In reply to comment #9)
> Unfortunately I have no steps to reproduce. On my system it is pretty random,
> which makes the issue even more frustrating (and a real pain). "May I click
> this field? Will FF crash this time? Or should I first copy the field content
> in the clipboard?". See also my comment in bug 995356.

One thing that you can try is to disable the English (UK) and French dictionaries one by one and see if disabling one of them will make the crash go away.  I strongly suspect that this is due to a corrupted dictionary file.
How can I disable dictionaries?
(I can't remember how I got them here)
(In reply to comment #11)
> How can I disable dictionaries?
> (I can't remember how I got them here)

If you go to about:addons, do you see them listed either under Dictionaries or Extensions?
(In reply to comment #12)
> If you go to about:addons, do you see them listed either under Dictionaries
> or Extensions?

no.
There is one French dictionary listed, but it is disabled.
The fact that the crash addresses all end in five zeros is highly suspicious; is flag_qsort reading a word further than it ought to, and thus intermittently crossing a page boundary when the array it's sorting bumps up against the edge of that page?
So, flags_qsort expects begin to be the first index and end to be 1 greater than the last index.  (It thus does more work than needed on one element arrays.)  During the loop it maintains the invariants that all values in the range [begin + 1, l) are less than pivot and all values in [r, end) are greater than pivot.  These ranges both might be empty.  It exits the loop when l == r, which ensures that l is always a valid index; r might be equal to end and thus not a valid index.

Then, after the loop, l is set to one less than r.  If begin + 1 == end, then l == begin and r == end, since the while loop was never entered.

So the code of flag_qsort itself looks ok to me, or at least if there's a problem, I haven't seen it.  It seems somewhat unlikely for the compiler to misoptimize.


The caller that matters is HashMgr::load_tables, which uses an allocation made in decode_flags, which does the allocation and gives the caller both the pointer and he length, so it looks like it's passing the right size as well.


Integer overflow seems somewhat unlikely, although I suppose it's possible.


I really wish the crash reports had at least line number information.
The format of the .dic file is basically like this:

<N> # denoting the number of lines
... # followed by N lines

And hunspell doesn't perform any bounds checking on the contents of the file, and in the past I've seen at least one crash which was caused by a dictionary file which got this wrong.

Something like this would be my first guess as to what's happening here.
This is what I get from my hunspell dictionaries:

$ cd /usr/share/hunspell
$ ls *.dic
en_GB.dic  en_US.dic  fr.dic

$ head -1 en_GB.dic  && wc -l en_GB.dic 
56506
56507 en_GB.dic

$ head -1 en_US.dic  && wc -l en_US.dic 
62154
62155 en_US.dic

$ head -1 fr.dic  && wc -l fr.dic 
63062
63063 fr.dic
Can you please tar up the .dic and .aff files there and attach it to the bug, and perhaps include some links to pages where you experience this crash?
Sure.

Here is a tarball of my /usr/shar/hunspell directory.

I remember that I removed manually the dictionaries that I did not wanted (like French local variants).

fr.*    comes from ubuntu package hunspell-fr
en_US.* comes from ubuntu package hunspell-en-us
en_GB.* comes from ubuntu package myspell-en-gb
The pages where I experience crash are pretty random, as far as I recall. Of course frequently visited pages produce most crashes, like gmail or facebook just to name two.
(can you see the crashing page from my crash reports?)
Are all the people seeing this using Ubuntu's builds of Firefox (or some other distro?), or are any of you seeing this in Mozilla-generated builds?  (The build IDs in crash stats don't match seem to match our builds.)
https://crash-stats.mozilla.com/report/index/f84c892b-3d01-46f5-aa98-e64d92140501 shows (using minidump_stackwalk):

Thread 0 (crashed)
 0  libxul.so + 0x15ecb6e
    eip = 0xb4bb5b6e   esp = 0xbfbf43d0   ebp = 0x8e5ffffe   ebx = 0xb6dcaef4
    esi = 0x00000002   edi = 0x82a14014   eax = 0x00000000   ecx = 0xb6dc532a
    edx = 0x00000001   efl = 0x00210282

(didn't get symbols set up)

This is consistent with the disassembly of 
http://mirrors.kernel.org/ubuntu/pool/main/f/firefox/firefox_29.0+build1-0ubuntu0.13.10.3_i386.deb
(that's the package for saucy), but not consistent with the disassembly of the Mozilla official Firefox 29 build.
This shows where the bug is.  At the time of the crash we're loading flags[l] with a 32-bit read in order to compare it to pivot.

Comparing with the registers in comment 23, we can see that:
  l == 1 (%edx)
  r == 2 (%esi)
  &flags[l] == 0x8e5fffe (%ebp)
  begin == 0 (%eax)
  pivot == 0x532a (%cx)


In any case, I think this is pretty clearly a compiler bug.  Not sure who we bug about such things, given that it's in Ubuntu's builds.  (From the kernel versions I saw, it looks like it's showing up for precise and saucy.)
Who's the right person to talk to if the most frequent Firefox crash on Linux is a bug in the compiler used to build the Ubuntu packages?

(Note that I've made no attempt so far to reduce a testcase for the compiler bug.  It might or might not be fixed in trunk gcc, and I don't know what gcc options are needed.  I'm not going to have the time to do so, either.)
Flags: needinfo?(chrisccoulson)
... oh, and there are definitely trusty kernel versions in that list as well.
Summary: Crashed on pasting a public rsa key into a text field → [x86 Linux Ubuntu packages] crashes during spellchecker initialization (various textfield/textarea interactions) due to compiler bug using a 32-bit read for a read from an array of unsigned short
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-7) from comment #28)
> Who's the right person to talk to if the most frequent Firefox crash on
> Linux is a bug in the compiler used to build the Ubuntu packages?

maybe you could be in touch with the ubuntu firefox package maintainer


(I am using a mainstream kernel, 3.14.0-031400rc6-generic)
Keywords: crash
(In reply to pionchon.luc from comment #31)
> maybe you could be in touch with the ubuntu firefox package maintainer

I *think* that's who I made the needinfo request to in comment 28.
Hi,

I'll talk to our compiler guy about this. It probably won't be me who looks at it as I'm working on another project now.
Flags: needinfo?(chrisccoulson)
Could somebody running the Firefox that has this crash (i.e., 32-bit Ubuntu packages) attach the contents of about:buildconfig to this bug?  (That is, just type "about:buildconfig" in the URL bar, save it to a file, and use the "Add an attachment" link above.)
Er, never mind, I can extract it from the package in comment 23.
We ought to be able to work around this, and probably should, given the lack of response.

If somebody has a setup that can reproduce the compiler bug, there might be a straightforward workaround such as inserting |volatile| somewhere or similar trivial rearrangement of code.

If not, we ought to be able to pad these arrays by 2 bytes, #ifdef linux and gcc.
Attached file about:buildconfig.html
Buldconfig info from my Firefox, which I think is affected by this bug.
Great to see progress.  

#6 crash for Thunderbird 31, and for a linux crash to have such a high rank in all of Thunderbird is a pretty big deal.  Typical crash comment is similar to prior comments in the bug - changing the spell-check language.

Some linux users with this crash signature also see signature  g_list_sort_with_data but the stack is different. For example bp-c571441b-ee2b-4622-a4e5-ed3532140809 "Upon closing Thunderbird it crashed."  Different bug?
Severity: normal → critical
Summary: [x86 Linux Ubuntu packages] crashes during spellchecker initialization (various textfield/textarea interactions) due to compiler bug using a 32-bit read for a read from an array of unsigned short → [x86 Linux Ubuntu packages] crashes during spellchecker initialization (various textfield/textarea interactions/message compose) due to compiler bug using a 32-bit read for a read from an array of unsigned short
Whiteboard: [regression:TB31][tbird crash]
My biggest concern for review of this patch is whether the #ifdef will
correctly catch what Ubuntu is using to compile Firefox.  Does anybody
know how to confirm that Ubuntu is compiling with gcc, and that these
#ifdefs are correct?
Attachment #8561105 - Flags: review?(ehsan)
Assignee: nobody → dbaron
Status: NEW → ASSIGNED
Comment on attachment 8561105 [details] [diff] [review]
Pad heap allocations passed to flag_qsort() on x86 Linux to work around gcc bug affecting Ubuntu packages

Hmmm, given that ehsan's away for a bit, transferring review to froydnj.

(I'd hope to get this in to beta, although I really should have tried to do this many releases ago, although I was hoping somebody else would.)
Attachment #8561105 - Flags: review?(ehsan) → review?(nfroyd)
Comment on attachment 8561105 [details] [diff] [review]
Pad heap allocations passed to flag_qsort() on x86 Linux to work around gcc bug affecting Ubuntu packages

Review of attachment 8561105 [details] [diff] [review]:
-----------------------------------------------------------------

Yuck.  The #ifdef checks are correct, fwiw.
Attachment #8561105 - Flags: review?(nfroyd) → review+
Comment on attachment 8561105 [details] [diff] [review]
Pad heap allocations passed to flag_qsort() on x86 Linux to work around gcc bug affecting Ubuntu packages

Approval Request Comment
[Feature/regressing bug #]: not a regression in our codebase
[User impact if declined]: #3 topcrash on Linux, specific to 32-bit Ubuntu-distributed builds.  Firefox will randomly crash on 32-bit Linux builds the first time the user uses a textarea or otherwise does something that initializes the spellchecker.  (It only crashes a small percentage of the time, but it affects a large number of users.)
[Describe test coverage new/current, TreeHerder]:  None.  Just landed on mozilla-inbound.  I don't know of any way to test that the fix works without shipping it on the release channel.
[Risks and why]: Low risk; it's padding a few allocations in the spellcheck code with 2 extra bytes on all 32-bit Linux builds.
[String/UUID change made/needed]: no
Attachment #8561105 - Flags: approval-mozilla-beta?
Attachment #8561105 - Flags: approval-mozilla-aurora?
CCing the Hunspell maintainers to make sure this patch gets upstreamed as well.
https://hg.mozilla.org/mozilla-central/rev/80d3d1eef2f6
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Attachment #8561105 - Flags: approval-mozilla-beta?
Attachment #8561105 - Flags: approval-mozilla-beta+
Attachment #8561105 - Flags: approval-mozilla-aurora?
Attachment #8561105 - Flags: approval-mozilla-aurora+
magnus, it seems like we may want this for ESR. Do you agree?

#5 crash for https://crash-stats.mozilla.com/topcrasher/products/Thunderbird/versions/31.4.0
Flags: needinfo?(mkmelin+mozilla)
Keywords: qawanted
Comment on attachment 8561105 [details] [diff] [review]
Pad heap allocations passed to flag_qsort() on x86 Linux to work around gcc bug affecting Ubuntu packages

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: top crasher. 

See comment 43 for risk evaluation. (Low)
Flags: needinfo?(mkmelin+mozilla)
Attachment #8561105 - Flags: approval-mozilla-esr31?
Comment on attachment 8561105 [details] [diff] [review]
Pad heap allocations passed to flag_qsort() on x86 Linux to work around gcc bug affecting Ubuntu packages

Accepting for ESR 31 since it's a topcrash and affects many users.
Attachment #8561105 - Flags: approval-mozilla-esr31? → approval-mozilla-esr31+
I added this change to the patches directory: https://hg.mozilla.org/integration/mozilla-inbound/rev/99814e9730de
I am not seeing this in esr TB31.6.0  v.fixed and thanks for the uplift
Status: RESOLVED → VERIFIED
Original problem IIUC here is around qsort, which is no longer used in hunspell git master now in favour of standard c++ std::sort of removal of a pile of dodgy casting to force the flags into qsort
You need to log in before you can comment on or make changes to this bug.