Closed Bug 552050 Opened 14 years ago Closed 13 years ago

startup crash [@ memcpy | fts3SegWriterAdd] and [@ fts3SegWriterAdd] (Mac) and [@ fastcopy_I] - [@ fts3SegWriterAdd]

Categories

(MailNews Core :: Backend, defect)

x86
All
defect
Not set
critical

Tracking

(blocking-thunderbird3.1 .10+, thunderbird3.1 .10-fixed)

VERIFIED FIXED
Tracking Status
blocking-thunderbird3.1 --- .10+
thunderbird3.1 --- .10-fixed

People

(Reporter: wsmwk, Assigned: asuth)

Details

(4 keywords, Whiteboard: [gs][gssolved][workaround http://getsatisfaction.com/mozilla_messaging/topics/crash_and_burning#reply_3927615 ])

Attachments

(2 files)

crash [@ memcpy | fts3SegWriterAdd]

#14 for v3.0.3.  not in the top 100 for v3.0.1.

unclear if something regressive has crept into v3.0.3, or perhaps socorro changed such that the signature is now visible. but the sig first appears with 2010011900 3.1a1 builds, e.g. bp-bbd02f42-a470-44de-9aed-f54be2100303
so =topcrash + regression until we find otherwise

appears to be mostly startup crashes, and multiple crashes per person no doubt based on http://crash-stats.mozilla.com/report/list?product=Thunderbird&query_search=signature&query_type=exact&query=memcpy%20|%20fts3SegWriterAdd&date=&range_value=4&range_unit=weeks&process_type=all&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&signature=memcpy%20|%20fts3SegWriterAdd

bp-b58172d2-94a7-43f7-a5f1-abdc12100312  v3.0.3
0	mozcrt19.dll	memcpy	 memcpy.asm:271
1	sqlite3.dll	fts3SegWriterAdd	db/sqlite3/src/sqlite3.c:105163
2	sqlite3.dll	fts3MergeCallback	db/sqlite3/src/sqlite3.c:105414
3	sqlite3.dll	sqlite3Fts3SegReaderIterate	db/sqlite3/src/sqlite3.c:105531
4	sqlite3.dll	fts3SegmentMerge	db/sqlite3/src/sqlite3.c:105703
5	sqlite3.dll	fts3AllocateSegdirIdx	db/sqlite3/src/sqlite3.c:104278
6	sqlite3.dll	fts3SegmentMerge	db/sqlite3/src/sqlite3.c:105660
7	sqlite3.dll	fts3AllocateSegdirIdx	db/sqlite3/src/sqlite3.c:104278
8	sqlite3.dll	sqlite3Fts3PendingTermsFlush	db/sqlite3/src/sqlite3.c:105747
9	sqlite3.dll	sqlite3VtabSync	db/sqlite3/src/sqlite3.c:86328
10	sqlite3.dll	vdbeCommit	db/sqlite3/src/sqlite3.c:48783
11	sqlite3.dll	sqlite3VdbeHalt	db/sqlite3/src/sqlite3.c:49241
12	sqlite3.dll	sqlite3VdbeExec	db/sqlite3/src/sqlite3.c:54843
13	sqlite3.dll	winMutexLeave	db/sqlite3/src/sqlite3.c:15554
14	sqlite3.dll	sqlite3LockAndPrepare	db/sqlite3/src/sqlite3.c:78833
15	sqlite3.dll	winMutexEnter	db/sqlite3/src/sqlite3.c:15513
16	thunderbird.exe	mozilla::storage::AsyncExecuteStatements::executeStatement	storage/src/mozStorageAsyncStatementExecution.cpp:354
17	thunderbird.exe	mozilla::storage::AsyncExecuteStatements::executeAndProcessStatement	storage/src/mozStorageAsyncStatementExecution.cpp:304
18	thunderbird.exe	mozilla::storage::AsyncExecuteStatements::executeAndProcessStatement	storage/src/mozStorageAsyncStatementExecution.cpp:292
19	thunderbird.exe	mozilla::storage::AsyncExecuteStatements::Run	storage/src/mozStorageAsyncStatementExecution.cpp:579
20	xpcom_core.dll	nsThread::ProcessNextEvent	xpcom/threads/nsThread.cpp:527
correction: does NOT appear in 3.0.1 at all.
from bug 536312 comment 1 ...

did bug 536312 morph into Bug 552050?? I am not able to determine what fixed 536312, if anything.  Nor do we know what caused it.  But bug 552050 appears when 536312 disappears: 552050 starts with v3.0.2/.3 and 536312 ends with 3.0.1, and they have approximately the same crash rank. And so there is the hint of a link.

This bug has a crash on trunk, bp-874c9c73-959e-48d2-8b35-c2e742100322 3.1b1 (yogesh) (unlike 536312 which had none)
We upgraded to SQLite 3.6.22 from 3.6.20 starting with 3.0.2:

http://hg.mozilla.org/releases/mozilla-1.9.1/rev/6dc036c10334
Andrew, shawn, can we determine whether this crash is fixed in a newer sqlite? (other than by a reporter testing trunk with a reproducible crasher)  And, is an update to sqlite 3.6.23.1 possibly in the cards for the 1.9.2 branch?

added Mac-only crash to summary, which is @ fts3SegWriterAdd
#20 rank crash for v3.1.4, when combining the two crash sigs' numbers.
#4 for 3.0.7
>50% of crashes are startup, but I imagine many of them are repeats (from same user)

We have a reporter, Ron, at http://gsfn.us/t/1ha7d with bp-88430d7b-eb89-48b6-a249-a8b242100921
OS: Windows Vista → All
Summary: crash [@ memcpy | fts3SegWriterAdd] → crash [@ memcpy | fts3SegWriterAdd] and [@ fts3SegWriterAdd] (Mac)
Whiteboard: [gs]
(In reply to comment #4)
> Andrew, shawn, can we determine whether this crash is fixed in a newer sqlite?
> (other than by a reporter testing trunk with a reproducible crasher)  And, is
> an update to sqlite 3.6.23.1 possibly in the cards for the 1.9.2 branch?
Looking at getting SQLite 3.7.1 landed on 1.9.2, but no good way to determine if it's fixed without steps to reproduce.
this sig now #1 - increased greatly compared to v3.1.4. in v3.1.5 on raw numbers this is 10x the #2 crash. http://crash-stats.mozilla.com/topcrasher/byversion/Thunderbird/3.1.5

I've just finished an IRC conversation with someone who updated to v3.1.5 and started crashing on startup. He may have some testcases for us.  Don't know yet if he will also crash if he reverts to v3.1.4

http://gsfn.us/t/1okj4 is a new posting in the last 24 hr.

still developing more info
Summary: crash [@ memcpy | fts3SegWriterAdd] and [@ fts3SegWriterAdd] (Mac) → startup crash [@ memcpy | fts3SegWriterAdd] and [@ fts3SegWriterAdd] (Mac)
(In reply to comment #6)
> this sig now #1 - increased greatly compared to v3.1.4. in v3.1.5 on raw
> numbers this is 10x the #2 crash.
> http://crash-stats.mozilla.com/topcrasher/byversion/Thunderbird/3.1.5

> I've just finished an IRC conversation with someone who updated to v3.1.5 and
> started crashing on startup. He may have some testcases for us.  Don't know yet
> if he will also crash if he reverts to v3.1.4

bp-258d01f6-a47f-466f-a4aa-16c852101020 (nootrope on IRC)

> http://gsfn.us/t/1okj4 is a new posting in the last 24 hr.
bp-9b50085a-9ca7-41cd-8286-11a9e2101020

> still developing more info

bp-048c39de-784d-4203-b6ac-a71532101019 (porte) been working fine for two months [don't know which version yet]. as of today, it crashes whenever I try to open it.
nootrope = alberto, who is now cc'd on this bug and has supplied potential testcases.


there are two stack variations that I can distinguish so far ...

(In reply to comment #7)
> > if he will also crash if he reverts to v3.1.4
> 
> bp-258d01f6-a47f-466f-a4aa-16c852101020 (nootrope on IRC)

variation #1 matches comment 0's crash and  bp-258d01f6-a47f-466f-a4aa-16c852101020 (nootrope on IRC)
   
0		@0xffff0b3c	
1	libsqlite3.dylib	fts3SegWriterAdd	db/sqlite3/src/sqlite3.c:113816
2	libsqlite3.dylib	sqlite3Fts3SegReaderIterate	db/sqlite3/src/sqlite3.c:114184
3	libsqlite3.dylib	fts3SegmentMerge	db/sqlite3/src/sqlite3.c:114354
4	libsqlite3.dylib	fts3AllocateSegdirIdx	db/sqlite3/src/sqlite3.c:112931
5	libsqlite3.dylib	sqlite3Fts3PendingTermsFlush	db/sqlite3/src/sqlite3.c:114400
6	libsqlite3.dylib	fts3PendingTermsDocid	db/sqlite3/src/sqlite3.c:112726
7	libsqlite3.dylib	fts3UpdateMethod	db/sqlite3/src/sqlite3.c:114732
8	libsqlite3.dylib	sqlite3Step	db/sqlite3/src/sqlite3.c:65285
9	libsqlite3.dylib	sqlite3_step	db/sqlite3/src/sqlite3.c:58000
10	thunderbird-bin	mozilla::storage::AsyncExecuteStatements::executeStatement	storage-backport/src/mozStorageAsyncStatementExecution.cpp:345
11	thunderbird-bin	mozilla::storage::AsyncExecuteStatements::executeAndProcessStatement	storage-backport/src/mozStorageAsyncStatementExecution.cpp:292
12	thunderbird-bin	mozilla::storage::AsyncExecuteStatements::bindExecuteAndProcessStatement	storage-backport/src/mozStorageAsyncStatementExecution.cpp:274
13	thunderbird-bin	mozilla::storage::AsyncExecuteStatements::Run	storage-backport/src/mozStorageAsyncStatementExecution.cpp:597 


> > http://gsfn.us/t/1okj4 is a new posting in the last 24 hr.
> bp-9b50085a-9ca7-41cd-8286-11a9e2101020

variation #2 bp-9b50085a-9ca7-41cd-8286-11a9e2101020 


> > still developing more info
> 
> bp-048c39de-784d-4203-b6ac-a71532101019 (porte) been working fine for two
> months [don't know which version yet]. as of today, it crashes whenever I try
> to open it.
bp-048c39de-784d-4203-b6ac-a71532101019  same as variation #1


fastcopy_I matches variation #2 from http://gsfn.us/t/1olhn is bp-96b90551-a84e-44fb-94c4-60451210102 
 (12 of 15 fastcopy_I v3.1.5 crashes thus far are fts3SegWriterAdd)

0	mozcrt19.dll	fastcopy_I	
1	mozcrt19.dll	_VEC_memcpy	
2	mozcrt19.dll	arena_ralloc	objdir-tb/mozilla/memory/jemalloc/crtsrc/jemalloc.c:4412
3	sqlite3.dll	fts3SegWriterAdd	db/sqlite3/src/sqlite3.c:113816
4	sqlite3.dll	fts3MergeCallback	db/sqlite3/src/sqlite3.c:114067
5	sqlite3.dll	sqlite3Fts3SegReaderIterate	db/sqlite3/src/sqlite3.c:114184
6	sqlite3.dll	fts3SegmentMerge	db/sqlite3/src/sqlite3.c:114356
7	sqlite3.dll	fts3AllocateSegdirIdx	db/sqlite3/src/sqlite3.c:112931
8	sqlite3.dll	sqlite3Fts3PendingTermsFlush	db/sqlite3/src/sqlite3.c:114400
9	sqlite3.dll	sqlite3VtabSync	db/sqlite3/src/sqlite3.c:94013
10	sqlite3.dll	vdbeCommit	db/sqlite3/src/sqlite3.c:56036
11	sqlite3.dll	sqlite3VdbeHalt	db/sqlite3/src/sqlite3.c:56503
12	sqlite3.dll	sqlite3VdbeExec	db/sqlite3/src/sqlite3.c:60315
13	mozcrt19.dll	arena_dalloc	objdir-tb/mozilla/memory/jemalloc/crtsrc/jemalloc.c:4230
14	mozcrt19.dll	free	objdir-tb/mozilla/memory/jemalloc/crtsrc/jemalloc.c:6017
15	sqlite3.dll	sqlite3_free	db/sqlite3/src/sqlite3.c:17542
16	sqlite3.dll	winMutexEnter	db/sqlite3/src/sqlite3.c:16999
17	sqlite3.dll	sqlite3_step	db/sqlite3/src/sqlite3.c:58002
18	nspr4.dll	nspr4.dll@0x15fff	
19	thunderbird.exe	mozilla::storage::AsyncExecuteStatements::executeAndProcessStatement	storage-backport/src/mozStorageAsyncStatementExecution.cpp:292
20	thunderbird.exe	mozilla::storage::AsyncExecuteStatements::bindExecuteAndProcessStatement	storage-backport/src/mozStorageAsyncStatementExecution.cpp:274
21	thunderbird.exe	mozilla::storage::AsyncExecuteStatements::Run	storage-backport/src/mozStorageAsyncStatementExecution.cpp:597
22	xpcom_core.dll	nsThread::ProcessNextEvent	xpcom/threads/nsThread.cpp:527
23	xpcom_core.dll	NS_ProcessNextEvent_P	objdir-tb/mozilla/xpcom/build/nsThreadUtils.cpp:250 

fastcopy_I ... skimming a couple dozen v3.1.4 crashes
* a small percentage are fts3SegWriterAdd, like bp-9546399e-cdef-4658-a09f-e23212100928
* and a couple are memcpy | fts3NodeAddTerm  (filed Bug 605996 for that family)
Keywords: testcase
Summary: startup crash [@ memcpy | fts3SegWriterAdd] and [@ fts3SegWriterAdd] (Mac) → startup crash [@ memcpy | fts3SegWriterAdd] and [@ fts3SegWriterAdd] (Mac) and [@ fastcopy_I] - [@ fts3SegWriterAdd]
If we can get some STR here, that would be super helpful.
Our top crasher on 3.1.5 with plenty of comments in Japanese.
Contacted 15 reporters with the following email :
Hello,

I'm the quality assurance lead of Mozilla Messaging; we build Thunderbird. You've recently submitted a crash to us. We can't reproduce the crash in house.
We have a rough idea of where it happens, but we probably need some of your data to be able to reproduce it. What we are looking for is located in your profile directory (see http://support.mozillamessaging.com/en-US/kb/Profiles for information on locating your profile). From the profile, we would be interested in having the following files, if they exist:

    addons.sqlite           
    people.sqlite
    cookies.sqlite           
    permissions.sqlite
    downloads.sqlite       
    signons.sqlite
    extensions.sqlite       
    urlclassifier2.sqlite
    formhistory.sqlite       
    webappsstore.sqlite
    global-messages-db.sqlite

we'll keep your data private, we just need to collect some in order to understand why Thunderbird is crashing.

Ludovic
Ludovic, when this crash occurs, it uses fts3 code.  So this may be Thunderbird only.

I will be investigating this.
(In reply to comment #9)
> If we can get some STR here, that would be super helpful.

I currently have sqlite files that, when placed in a Thunderbird profile folder, reproduce the crash. I need to figure out which file does that. Would sending the file directly to you be good?
(In reply to comment #13)
> (In reply to comment #9)
> > If we can get some STR here, that would be super helpful.
> 
> I'm currently having sqlite files that placed in a Thunderbird profile folder
> will reproduce the crash. I need to figure out which file does that. Would
> sending the file directly to you be good ?

Yes, please.
(In reply to comment #14)
> (In reply to comment #13)
> > (In reply to comment #9)
> > > If we can get some STR here, that would be super helpful.
> > 
> > I'm currently having sqlite files that placed in a Thunderbird profile folder
> > will reproduce the crash. I need to figure out which file does that. Would
> > sending the file directly to you be good ?
> 
> Yes, please.

Done.
Note there are other fts3 crashes that don't yet have bug reports:

- sqlite3Fts3SegReaderIterate all ppc, #2 crash for Mac-only, #7 crash overall bp-9582122e-2316-4f44-a975-1b3db2101022

- fts3SegReaderNext, mostly but not entirely Mac,  #8 crasher for Macs in v3.1.5. bp-b177e128-7cb2-409a-8f30-03cd62101018

- fts3SegReaderNext bp-bfa612f7-d4bf-41e6-b1da-273432101021
This is now the #1 topcrash for Thunderbird in 3.1.5.

I note that sqlite was upgraded during the update to version 3.7.1 (bug 583611), my suspicion is that the upgrade may have made this worse.

As Ludovic has a reproducible test-case, setting this to block the next release.
blocking-thunderbird3.1: --- → .6+
Keywords: topcrash → topcrash+
(In reply to comment #8)
> nootrope = alberto cc this bug and has supplied potential testcases.

I forwarded alberto's info to Makoto Kato, but it's only messages not db files. If further testcases are needed, there's plenty of users in http://getsatisfaction.com/mozilla_messaging/topics/crash_and_burning


Assessing the run up to v3.1.5 ... No crashes in 3.1.5pre or on trunk in the run up to v3.1.5 [1], so we don't have enough testers there at the moment to surface a crash of this type. So adding in-testsuite?  There were a few crashes in 3.1.5 prior to release on 10/19 [2], but apparently not enough of an increase to alert us to the potential severity.

[1] v3.1.5pre, v3.3a1pre, v3.2a1pre : http://crash-stats.mozilla.com/query/query?product=Thunderbird&version=Thunderbird:3.3a1pre&version=Thunderbird:3.2a1pre&version=Thunderbird:3.1.5pre&range_value=8&range_unit=weeks&date=10/19/2010+06:40:45&query_search=signature&query_type=contains&query=fts3&build_id=&process_type=any&hang_type=any&do_query=1
[2] v3.1.5 http://crash-stats.mozilla.com/query/query?product=Thunderbird&version=Thunderbird:3.1.5&range_value=2&range_unit=weeks&date=10/19/2010+06:40:45&query_search=signature&query_type=contains&query=fts3&build_id=&process_type=any&hang_type=any&do_query=1
Flags: in-testsuite?
I'm told that http://www.sqlite.org/src/ci/84194c4195 may fix this issue.  drh will be e-mailing asuth an amalgamation soon for some potential test builds.  If those don't work out, I will forward the two databases I have to the SQLite team (with permission from the users) so they can reproduce and fix this crash.
Attached file amalgamation
Here's the amalgamation I was sent.  I tried to use our try server to spin a build, but it doesn't seem to work with comm-1.9.2:
  /builds/slave/tryserver-macosx/build/.mozconfig: line 9: /builds/slave/tryserver-macosx/build/build/macosx/universal/mozconfig.common: No such file or directory
  client.mk:136: /builds/slave/tryserver-macosx/build/.mozconfig.mk: No such file or directory
  make: *** No rule to make target `/builds/slave/tryserver-macosx/build/.mozconfig.mk'.  Stop.
  program finished with exit code 2

That specific build wouldn't have worked out anyways, though, because I forgot to attach my script that patches the SQLite files under the mozilla/ subdirectory to the Makefile, but it would be nice if the mechanism worked.  The specific push is here if someone wants to build on it somehow:
http://hg.mozilla.org/try-comm-central/rev/dd8ff29d555c

I ran a local build and then realized I don't have a copy of the database that crashes, just some messages that gloda doesn't want to index.  I can locally spin a build for someone who has the crashy database if I'm told what platform to build for.

As an aside, I still think it's misleading to call things like this our #1 topcrash when it appears that the crash only happens for a few very persistent (and likely very justifiably irate) users.  (Actually, scratch 'likely'.  The crash report comments are very clear cut...)

Also, the workaround on gsfn should not be to disable gloda; it should be to delete global-messages-db.sqlite.  The user can then disable gloda if they want, but the problem is database corruption, not gloda.  The trick is that the database corruption only becomes evident and crashy if gloda is active.
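A minimal sketch of that workaround, assuming Thunderbird is not running and the profile directory is already known. The helper name and the backup naming scheme are hypothetical. It moves the file aside rather than deleting it, so the corrupt database stays available as a potential testcase; gloda should then rebuild its index on the next startup.

```python
import shutil
import time
from pathlib import Path

GLODA_DB = "global-messages-db.sqlite"  # file name taken from the comment above


def set_aside_gloda_db(profile_dir):
    """Move the gloda database out of the way.

    Hypothetical helper: run it only while Thunderbird is closed.
    Returns the backup path, or None if no database was found.
    """
    db = Path(profile_dir) / GLODA_DB
    if not db.exists():
        return None
    # Keep the old file under a timestamped name instead of deleting it.
    backup = db.with_name(f"{GLODA_DB}.corrupt-{int(time.time())}")
    shutil.move(str(db), str(backup))
    return backup
```

Called as set_aside_gloda_db("/path/to/profile"), where the path is whatever the Profiles support article resolves to on the user's machine.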
I'll try and fix try server for 1.9.2 either later today or tomorrow.
(In reply to comment #20)
> As an aside, I still think it's misleading to call things like this our #1
> topcrash when it appears that the crash only happens for a few very persistent
> (and likely very justifiably irate) users.  (Actually, scratch 'likely'.  The
> crash report comments are very clear cut...)

I don't think we are off in calling this a topcrash and a severe one at that, though one might conclude that from a few crasher reports, like mi_r16 (40 occurrences), whose mental state is well beyond irate based on his/her single word, and singularly unhelpful, crash comment. 

The mean "persistence inflation" seems to me to be in the area of 5 crash reports per user, based on a quick examination of crash comments with email address and comments.  And when I commented ~a week ago and had looked at it in more detail and said "on raw numbers this is 10x the #2 crash", I failed to state that I knew there to be a high degree of inflation, which I doubted to be more than 10x. So even when cut by 90% it was very reasonable to call it the #1 crash at the time.

I'm of the opinion 10x is still a reasonable inflation estimate, which makes this a top 15 crash in today's numbers. But even if you use 40x inflation it's still top 80 and still a topcrash.
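To make the inflation arithmetic concrete, here is a rough sketch of how raw report counts could be compared with unique reporters per signature. It assumes each crash report can be reduced to a (signature, reporter) pair, which crash-stats does not expose directly; the function name and the sample data are made up for illustration.

```python
from collections import defaultdict


def inflation_by_signature(reports):
    """Summarize crash reports per signature.

    reports: iterable of (signature, reporter_id) pairs (hypothetical shape).
    Returns {signature: (raw_count, unique_reporters, inflation_factor)}.
    """
    raw = defaultdict(int)
    users = defaultdict(set)
    for signature, reporter in reports:
        raw[signature] += 1
        users[signature].add(reporter)
    return {
        sig: (raw[sig], len(users[sig]), raw[sig] / len(users[sig]))
        for sig in raw
    }


# Made-up sample: one persistent user submits four of the five reports.
sample = (
    [("fts3SegWriterAdd", "user_a")] * 4
    + [("fts3SegWriterAdd", "user_b")]
    + [("other_sig", "user_c")]
)
stats = inflation_by_signature(sample)
```

Ranking signatures by unique reporters instead of raw count is the adjustment being estimated by hand above.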
 
> Also, the workaround on gsfn should not be to disable gloda; it should be to
> delete global-messages-db.sqlite.  The user can then disable gloda if they
> want, but the problem is database corruption, not gloda.  The trick is that the
> database corruption only becomes evident and crashy if gloda is active.

Yours is the first suggestion (and there is none in gsfn) for such a workaround. And it would be great to publicize that, but do we know what's causing the corruption, and whether it's unlikely the user will experience further corruption?  If that's stated here in the bug or patch then I missed it.
(In reply to comment #22)
> I don't think we are off in calling this a topcrash and a severe one at that,
> though one might conclude that from a few crasher reports, like mi_r16 (40

... though one might conclude OTHERWISE from a few crasher reports...
(In reply to comment #22)
> I don't think we are off in calling this a topcrash and a severe one at that,
> though one might conclude that from a few crasher reports, like mi_r16 (40
> occurrences), whose mental state is well beyond irate based on his/her single
> word, and singularly unhelpful, crash comment. 

Indeed, 'tis quite a crash.  I should have characterized my comment better.  I shall try and do so now:

I feel like we are using the crash ranking numbers because they are the only numbers we have and that the numbers are implying importance or lack of importance where they should not.  My concern is that this might lead busy people to over/underestimate the importance of the problem and distract them from potentially more (generally) important things.  In this case, slowing down the SQLite people is concerning to me because the potential benefits of fully debugged write-ahead logging and a shipped FTS4 are really high for Thunderbird.

I had more analysis to contribute here, but the bottom line is that you are doing a fantastic job of keeping on top of the crashers given limited tooling and quite limited analytical support from me.  I am going to see what I can do about improving both of those things while not falling down a tooling hole, as I am prone to (appear to) do :).


> Yours is the first suggestion (and there is none in gsfn) for such a
> workaround. And it would be great to publicize that, but do we know what's
> causing the corruption, and whether it's unlikely the user will experience
> further corruption?  If that's stated here in the bug or patch then I missed
> it.

I would not expect the corruption to be deterministic within Thunderbird and SQLite/FTS3.  I would expect the probability of corruption for someone who has already experienced it to be higher; any of the following things could have caused it and could easily cause it again, but aren't things we can really do anything about:
- Power failures during hard disk writes.
- Flaky hardware.
- Frequent OS crashes not due to the above, perhaps bad drivers or such.
- Badly written anti-virus software (or evil trojans) altering the contents of the database because they saw something in the file they don't like.
Things that could be our bad but aren't obvious or deterministic:
- Memory corruption
Thanks for the great info. We'll be able to use that in communicating to users.

Without metrics from crash stats like unique reporters, deriving impact of a crash signature becomes something of an art. Overzealous reporters like mi_r16 make it even worse.  I try not to fall into the trap of conflating importance and severity (the same happens in bugzilla, where many people think severity=critical implies the bug should get fixed today). The good thing is, more information and scrutiny from everyone in bug reports helps flesh out the proper level of importance.

fwiw, this graph depicts the uptick blip we were seeing a week ago - http://crash-stats.mozilla.com/daily?form_selection=by_version&p=Thunderbird&v[]=3.1.2&throttle[]=100&v[]=3.1.5&throttle[]=100&v[]=3.1.4&throttle[]=100&v[]=&throttle[]=100&hang_type=any&os[]=Windows&os[]=Mac&os[]=Linux&date_start=2010-10-15&date_end=2010-10-27&submit=Generate
(In reply to comment #24)
> I feel like we are using the crash ranking numbers because they are the only
> numbers we have and that the numbers are implying importance or lack of
> importance where they should not.  My concern is that this might lead busy
> people to over/underestimate the importance of the problem and distract them
> from potentially more (generally) important things.  In this case, slowing down
> the SQLite people is concerning to me because the potential benefits of fully
> debugged write-ahead logging and a shipped FTS4 are really high for
> Thunderbird.

I just want to clarify. We were seeing this crash at not-very-high topcrash level prior to 3.1.4. When we released 3.1.5, the crashes jumped to the top of the list.

I'm therefore fairly confident that something regressed. I doubt that users would suddenly start submitting lots of duplicates at a particular release. As sqlite was touched/updated in 3.1.5, that seems like an obvious regression source for this bug.

That regression is why this bug now has blocking status. If it had just been there from the beginning, then I wouldn't have necessarily been as concerned.
Andrew can you try again to build a try build ?
(In reply to comment #27)
> Andrew can you try again to build a try build ?

(In reply to comment #20)
> That specific build wouldn't have worked out anyways, though, because I forgot
> to attach my script that patches the SQLite files under the mozilla/
> subdirectory to the Makefile, but it would be nice if the mechanism worked. 
> The specific push is here if someone wants to build on it somehow:
> http://hg.mozilla.org/try-comm-central/rev/dd8ff29d555c
> 
> I ran a local build and then realized I don't have a copy of the database that
> crashes, just some messages that gloda doesn't want to index.  I can locally
> spin a build for someone who has the crashy database if I'm told what platform
> to build for.

I don't actually know how to make the try server build the changes without spending a few hours looking into it.  (The changes are under mozilla/.)  I need to know what platform to spin a build for manually.
(In reply to comment #28)
 
> I don't actually know how to make the try server build the changes without
> spending a few hours looking into it.  (The changes are under mozilla/.)  I
> need to know what platform to spin a build for manually.

Mac will do.
Standard8 was a helpful fellow and made us a try-server built build so we're forgetting about my local build that I wasn't sure how to package:
http://ftp.mozilla.org/pub/mozilla.org/thunderbird/tryserver-builds/bugzilla@standard8.plus.com-9c17016890ca/tryserver-macosx/
Attached patch sqlite patch
I created a try server build with the patch that asuth linked to. I've attached it for reference, here's the build:

http://ftp.mozilla.org/pub/mozilla.org/thunderbird/tryserver-builds/bugzilla@standard8.plus.com-9c17016890ca/tryserver-macosx/
(In reply to comment #31)
> Created attachment 490549
> sqlite patch
> 
> I created a try server build with the patch that asuth linked to. I've attached
> it for reference, here's the build:
> 
> http://ftp.mozilla.org/pub/mozilla.org/thunderbird/tryserver-builds/bugzilla@standard8.plus.com-9c17016890ca/tryserver-macosx/

My tainted profile doesn't crash with that build, while the same profile still crashes with latest trunk.
I notified drh that their fix is good.  The fix is planned to be released in the official 3.7.4 release planned for Dec 8.  I have conveyed that we're good on waiting for that release.

The two related actions which are completely on my plate are:

- Make gloda notice that the FTS3 database is broken by actually paying attention to the error notifications and doing something to cause the database to be regenerated.

- Try enabling "pragma fullfsync" which only affects OS X.  The good news would be even greater reductions in the possibility of corrupt gloda databases.  The bad news is really, really, really making sure the writes hit the disk platter might have performance side-effects on other threads/processes trying to do I/O around the time of our commit.
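A minimal sketch of both ideas, using Python's sqlite3 module rather than mozStorage (which is what gloda actually goes through), so the function names here are hypothetical. `PRAGMA integrity_check` and `PRAGMA fullfsync` are standard SQLite pragmas. Note that an integrity check may not catch every kind of FTS3 segment corruption, so it is not a substitute for reacting to the errors SQLite reports at query time.

```python
import sqlite3


def database_looks_ok(path):
    """Return True if SQLite's own integrity check reports 'ok'.

    Any DatabaseError (for example "file is not a database") is treated
    as a signal that the database should be regenerated.
    """
    try:
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        return False
    return rows == [("ok",)]


def open_with_fullfsync(path):
    """Open a connection and request full fsync behaviour.

    Per the comment above, this is only expected to change behaviour on OS X.
    """
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA fullfsync = ON")
    return conn
```

A caller could run database_looks_ok() before opening the index and fall back to setting the file aside and reindexing when it returns False.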
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
(In reply to comment #33)
> - Try enabling "pragma fullfsync" which only affects OS X.  The good news would
> be even greater reductions in the possibility of corrupt gloda databases.  The
> bad news is really, really, really making sure the writes hit the disk platter
> might have performance side-effects on others threads/processes trying to do
> I/O around the time of our commit.
Pretty sure that that is really really really slow, fwiw.
blocking-thunderbird3.1: .7+ → .8+
Whiteboard: [gs] → [gs][workaround http://getsatisfaction.com/mozilla_messaging/topics/crash_and_burning#reply_3927615 ]
release 3.7.4 is now available. Will we be able to land it and have the requisite trunk _and_ branch testing/baking to ship it in Thunderbird v3.1.8?  (implies all work finished by about mid to late January)
Depends on: SQLite3.7.4
(In reply to comment #35)
> release 3.7.4 is now available. Will we be able to land it and have the
> requisite trunk _and_ branch testing/baking to ship it in Thunderbird v3.1.8? 
> (implies all work finished by about mid to late January)
That really depends on branch drivers.  I filed bug 618315 about upgrading.
Blocks: 605996
Blocks: 581992
blocking-thunderbird3.1: .8+ → .9+
anyone here with persistent crashing? 

bug 618315 aka sqlite 3.7.4 landed on trunk, and hopefully fixes some or all of the thunderbird issues.  If you are willing, now is a good time to try thunderbird development builds found at ftp://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-central/  so we get feedback on the "fix".

note - these are *untested* development builds
note - backup your profile before using

please post your results
Ludovic tested this and agreed it was fixed, therefore I'm going to mark as fixed. If we get different instances we can file a new bug if necessary.
Status: ASSIGNED → RESOLVED
blocking-thunderbird3.1: .9+ → .10+
Closed: 13 years ago
Resolution: --- → FIXED
yay, confirming topcrash is fixed. all these signatures are gone from version 3.1.10

v.fixed. posted in and closed gsfn topics
Status: RESOLVED → VERIFIED
Whiteboard: [gs][workaround http://getsatisfaction.com/mozilla_messaging/topics/crash_and_burning#reply_3927615 ] → [gs][gssolved][workaround http://getsatisfaction.com/mozilla_messaging/topics/crash_and_burning#reply_3927615 ]
Crash Signature: [@ memcpy | fts3SegWriterAdd] [@ fts3SegWriterAdd] [@ fastcopy_I] [@ fts3SegWriterAdd]