Closed Bug 564608 Opened 10 years ago Closed 10 years ago

Update Hunspell to 1.2.11

Categories

(Core :: Spelling checker, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla1.9.3a5

People

(Reporter: RyanVM, Assigned: RyanVM)

References

(Depends on 1 open bug)

Details

Attachments

(1 file, 2 obsolete files)

Attached patch Update Hunspell to 1.2.11 (obsolete) — Splinter Review
Hunspell has been updated to version 1.2.11. Numerous fixes have gone in between 1.2.8 (currently in-tree) and 1.2.11 for valgrind and coverity issues. See the bug dependencies for a sampling of some of the issues. Hunspell 1.2.11 resolves many stability issues and I would recommend we take this on 1.9.1 and 1.9.2 as well once this bakes on the trunk.

The attached patch builds locally on VC10. I haven't been able to push it to the tryserver yet since I don't have access and haven't found a volunteer to push it on my behalf.

I am getting one build error that I'm not sure how to resolve. Ted, do you have any insight on this?
affixmgr.cpp
c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\typeinfo(157) : warning C4275: non dll-interface class 'stdext::exception' used as base for dll-interface class 'std::bad_cast'
c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\exception(218) : see declaration of 'stdext::exception'
c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\typeinfo(156) : see declaration of 'std::bad_cast'
c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\typeinfo(178) : warning C4275: non dll-interface class 'stdext::exception' used as base for dll-interface class 'std::bad_typeid'
c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\exception(218) : see declaration of 'stdext::exception'
c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\typeinfo(177) : see declaration of 'std::bad_typeid'

Requested review from Smaug since I don't know of anyone else doing spellcheck reviews these days.
Attachment #444257 - Flags: review?(Olli.Pettay)
Tryserver builds are available from the link below. Interestingly, the build warnings mentioned above don't appear in the TryServer log, so they may be VC10 only.
https://build.mozilla.org/tryserver-builds/tnikkel@gmail.com-try-1cec62a173bf/
cjones: is that build error related to our STL wrapping?
Those are just warnings, but yes, they're bug 556886.
I think at the very least, we need to make sure that this update either doesn't modify code wrapped in MOZILLA_CLIENT ifdefs, or if it does, we have a good reason for doing so.  Also, you need to make sure that the changes landed to the extensions/spellcheck/hunspell/src directory after the update to 1.2.8 are preserved:

changeset:   35181:6282cd265a5c
user:        L. David Baron <dbaron@dbaron.org>
date:        Fri Nov 20 17:21:12 2009 -0800
summary:     Do unicode conversion separately for each byte in the encoding so encoder/decoder errors don't skew the results or leave them uninitialized.  (Bug 525581)  r=jst

changeset:   31811:a0b2a2ffa124
user:        Benjamin Smedberg <benjamin@smedbergs.us>
date:        Tue Aug 25 08:59:31 2009 -0700
summary:     Followup to bug 398573 - remove REQUIRES from the tree since it is no longer used... automatically generated patch, rs=ted

changeset:   30709:786c89deb042
user:        Michael Kohler <michaelkohler@live.com>
date:        Mon Jul 27 10:46:59 2009 +0200
summary:     Bug 106386 - Correct misspellings in source code (old); Patch 1; r=timeless

The only significant one is the last change (6282cd265a5c) though.
Great points, Ehsan. I'll look into those and report back.
786c89deb042 and a0b2a2ffa124 are not affected by this patch.
6282cd265a5c (bug 525581) has not been ported upstream. That will need to be resolved before this can land.
Attached patch Update Hunspell to 1.2.11 v2 (obsolete) — Splinter Review
Updated patch to include the patch from bug 525581. It's also been upstreamed so that future releases (1.2.12 and on) will have it.
Attachment #444257 - Attachment is obsolete: true
Attachment #446911 - Flags: review?(Olli.Pettay)
Attachment #444257 - Flags: review?(Olli.Pettay)
Sorry for the spam. The previous patch had a whitespace issue. I also added proper hg mq patch information to it.
Attachment #446911 - Attachment is obsolete: true
Attachment #446913 - Flags: review?(Olli.Pettay)
Attachment #446911 - Flags: review?(Olli.Pettay)
I guess you should still explain the changes to the MOZILLA_CLIENT code.

I'm also concerned that OOo updated to just 1.2.9 just recently, http://www.openoffice.org/issues/show_bug.cgi?id=110191.
From what I can tell, most of the changes were header cleanup and a couple fixes for leak/crash bugs in the dependency list.

More specifically, the header changes were primarily deleting ifndef MOZILLA_CLIENT code, so that shouldn't be an issue here. Otherwise, the big MOZILLA_CLIENT code change was the erroneous reverting of bug 525581 in the v1 patch, which is now fixed with v2a, yes? Were there any specific changes that concerned you in particular? Maybe Caolan can speak more directly to your questions. Still, there's no problem from my perspective with the changes being reviewed if need be.

Regarding OOo shipping 1.2.9, I'm not sure I understand your concern over that with respect to this bug. 1.2.10 was the first release to fix a lot of the coverity/valgrind bugs in the dependency list here (and some that aren't). 1.2.11 was a relatively minor update over 1.2.10. Upgrading to just 1.2.9 would require removing a lot of the dependency list, which doesn't make any sense to me.
Regarding downstreaming changes that didn't ship in OOo yet, you asked about lowering the review requirements in .planning, and one argument to do so would be "is shipping in OOo to end user since ...". One particular advantage there would be exposing the new code on dictionaries in the wild.

Re MOZILLA_CLIENT code, I've read that ifdef the other way around, so I was more cautious than necessary there. There's at least one #include <vector> which is using the non-MOZILLA_CLIENT pattern. Not sure if our current c++ guidelines like that.

Derailing from this bug a bit, sorry. Looking at http://hunspell.cvs.sourceforge.net/viewvc/hunspell/hunspell/tests/, would it make sense to have a way to run those from xpcshell or so? I'm not sure I'd advocate for downstreaming those tests and run them constantly, but being able to run them would be nice. Not exactly sure how to get the dictionaries in.
Ryan, could you perhaps answer to comment 11, especially, is it ok to use <vector>? Could you ask cjones or bsmedberg.
(I *think* it is ok nowadays.)
<vector> hasn't been explicitly OKed (bug 556701), but apparently we use it in other places so it's listed in our stl-headers wrapper file anyway:
http://mxr.mozilla.org/mozilla-central/source/config/stl-headers
Comment on attachment 446913 [details] [diff] [review]
Update Hunspell to 1.2.11 v2a


>@@ -296,16 +289,19 @@ AffixMgr::~AffixMgr()
>   free_utf_tbl();
>   if (lang) free(lang);
>   if (wordchars) free(wordchars);
>   if (wordchars_utf16) free(wordchars_utf16);
>   if (ignorechars) free(ignorechars);
>   if (ignorechars_utf16) free(ignorechars_utf16);
>   if (version) free(version);
>   checknum=0;
>+#ifdef MOZILLA_CLIENT
>+  delete [] csconv;
>+#endif
Could you explain why this is ifdef MOZILLA_CLIENT

 
>         // FIRST WORD
>         *presult = '\0';
>-        if (partresult) strcat(presult, partresult);
>+        if (partresult) mystrcat(presult, partresult, MAXLNLEN);
Just wondering, what is mystrcat.


>diff --git a/extensions/spellcheck/hunspell/src/affixmgr.hxx b/extensions/spellcheck/hunspell/src/affixmgr.hxx
>--- a/extensions/spellcheck/hunspell/src/affixmgr.hxx
>+++ b/extensions/spellcheck/hunspell/src/affixmgr.hxx
>@@ -52,42 +53,40 @@
>  * the provisions above, a recipient may use your version of this file under
>  * the terms of any one of the MPL, the GPL or the LGPL.
>  *
>  ******* END LICENSE BLOCK *******/
> 
> #ifndef _AFFIXMGR_HXX_
> #define _AFFIXMGR_HXX_
> 
>-#ifdef MOZILLA_CLIENT
>-#ifdef __SUNPRO_CC // for SunONE Studio compiler
>-using namespace std;
>-#endif
>+#include "hunvisapi.h"
>+
> #include <stdio.h>
>-#else
>-#include <cstdio>
>-#endif
So this change is ok for SunONE? Or do we not support that anymore? or what?

>diff --git a/extensions/spellcheck/hunspell/src/csutil.cpp b/extensions/spellcheck/hunspell/src/csutil.cpp
>--- a/extensions/spellcheck/hunspell/src/csutil.cpp
>+++ b/extensions/spellcheck/hunspell/src/csutil.cpp
>@@ -49,30 +51,23 @@
>  * use your version of this file under the terms of the MPL, indicate your
>  * decision by deleting the provisions above and replace them with the notice
>  * and other provisions required by the GPL or the LGPL. If you do not delete
>  * the provisions above, a recipient may use your version of this file under
>  * the terms of any one of the MPL, the GPL or the LGPL.
>  *
>  ******* END LICENSE BLOCK *******/
> 
>-#ifndef MOZILLA_CLIENT
>-#include <cstdlib>
>-#include <cstring>
>-#include <cstdio>
>-#include <cctype>
>-#else
> #include <stdlib.h> 
> #include <string.h>
> #include <stdio.h> 
> #include <ctype.h>
>-#endif
>-
>+
>+#include "csutil.hxx"
> #include "atypes.hxx"
>-#include "csutil.hxx"
> #include "langnum.hxx"
> 
> #ifdef OPENOFFICEORG
> #  include <unicode/uchar.h>
> #else
> #  ifndef MOZILLA_CLIENT
> #    include "utf_info.cxx"
> #    define UTF_LST_LEN (sizeof(utf_lst) / (sizeof(unicode_info)))
>@@ -88,26 +83,16 @@
> #include "nsICharsetConverterManager.h"
> #include "nsUnicharUtilCIID.h"
> #include "nsUnicharUtils.h"
> 
> static NS_DEFINE_CID(kCharsetConverterManagerCID, NS_ICHARSETCONVERTERMANAGER_CID);
> static NS_DEFINE_CID(kUnicharUtilCID, NS_UNICHARUTIL_CID);
> #endif
> 
>-#ifdef MOZILLA_CLIENT
>-#ifdef __SUNPRO_CC // for SunONE Studio compiler
>-using namespace std;
>-#endif
>-#else
>-#ifndef W32
>-using namespace std;
>-#endif
>-#endif
>-
Same here.

>diff --git a/extensions/spellcheck/hunspell/src/hashmgr.cpp b/extensions/spellcheck/hunspell/src/hashmgr.cpp
>--- a/extensions/spellcheck/hunspell/src/hashmgr.cpp
>+++ b/extensions/spellcheck/hunspell/src/hashmgr.cpp
>@@ -49,41 +50,24 @@
>  * use your version of this file under the terms of the MPL, indicate your
>  * decision by deleting the provisions above and replace them with the notice
>  * and other provisions required by the GPL or the LGPL. If you do not delete
>  * the provisions above, a recipient may use your version of this file under
>  * the terms of any one of the MPL, the GPL or the LGPL.
>  *
>  ******* END LICENSE BLOCK *******/
> 
>-#ifndef MOZILLA_CLIENT
>-#include <cstdlib>
>-#include <cstring>
>-#include <cstdio>
>-#include <cctype>
>-#else
> #include <stdlib.h> 
> #include <string.h>
> #include <stdio.h> 
> #include <ctype.h>
>-#endif
> 
>+#include "hashmgr.hxx"
>+#include "csutil.hxx"
> #include "atypes.hxx"
>-#include "csutil.hxx"
>-#include "hashmgr.hxx"
>-
>-#ifdef MOZILLA_CLIENT
>-#ifdef __SUNPRO_CC // for SunONE Studio compiler
>-using namespace std;
>-#endif
>-#else
>-#ifndef W32
>-using namespace std;
>-#endif
>-#endif
And here.

>@@ -543,36 +538,50 @@ int Hunspell::spell(const char * word, i
>   }
> 
>   if (rv) return 1;
> 
>   // recursive breaking at break points
>   if (wordbreak) {
>     char * s;
>     char r;
>-    int corr = 0;
>+    int nbr = 0;
>     wl = strlen(cw);
>     int numbreak = pAMgr ? pAMgr->get_numbreak() : 0;
>+
>+    // calculate break points for recursion limit
>+    for (int j = 0; j < numbreak; j++) {
>+      s = cw;
>+      do {
>+      	s = (char *) strstr(s, wordbreak[j]);
>+      	if (s) { 
>+		nbr++;
>+		s++;
>+	}
>+      } while (s);
>+    } 
>+    if (nbr >= 10) return 0;
>+
Strange tab characters in the code, but I assume those are there
in the original code.
(In reply to comment #14)
> Could you explain why this is ifdef MOZILLA_CLIENT
This change is per bug 465612 comment 2:
"There is moz specific code in hunspell to use a dynamic result from get_current_cs which is why it doesn't show up on testing of the standalone version. While I'm at it, there's a mix of malloc/delete as well so also changed to use new[] delete[]"

> Just wondering, what is mystrcat.
mystrcat is a specialized string concatanation function defined in csutil.cpp/hxx
http://mxr.mozilla.org/mozilla-central/source/extensions/spellcheck/hunspell/src/csutil.cpp#317
From the CVS logs, the mystrcat change was to "fix morphological analysis of long numbers combined with dash, eg. 45-00000045" - Nemeth/Caolan, please explain in greater detail if that's not enough of an answer as that's about the best I can offer :)

> So this change is ok for SunONE? Or do we not support that anymore? or what?
The SunONE changes were to simplify includes for std:: vs not according to the changelog. From what I can tell, all this is doing is making things uniform for all compilers instead of special casing SunONE. Caolan, can you comment further on this since there's not a ton on context on that change?

> Strange tab characters in the code, but I assume those are there
> in the original code.
Correct. Agreed that the brackets are a bit oddly positioned there.
caolan->ryanvm: re SunONE, yeah simplifying the ifdef spew. We build on OOo with Sun's compilers all the time, of course with the other non MOZILLA_CLIENT branches, its to reduce the amount of special casing going on.

Tabs/spaces are quite inconsistent across the source alright, should stick to a single approach, but its nothing new :-)
Comment on attachment 446913 [details] [diff] [review]
Update Hunspell to 1.2.11 v2a

Assuming using <vector> is ok. Please ask cjones.
Attachment #446913 - Flags: review?(Olli.Pettay) → review+
Depends on: 556701
Yes, fwiw I'm fine with that.
Thanks for the review. Let's get this landed so we can get it in for a5 to flesh out any potential l10n regressions it might cause. If everyone's happy with the results, I'll request approval for 1.9.2 landing. The v2a patch should apply cleanly to 1.9.2 without modification.

David, would bug 525581 be safe to port to 1.9.1? If so, I'll need to create a 1.9.1-specific patch with that patch included. Like I mentioned in comment 7, the patch has already been upstreamed for future Hunspell releases anyway.
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/c5b3d7beca8c
Status: NEW → RESOLVED
Closed: 10 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a5
Depends on: 571728
You need to log in before you can comment on or make changes to this bug.