The #5 topcrash on trunk (3.7a1) is a crash in sqlite3VdbeExec. It's also present on 3.6b1, 3.5.*, and 3.0.*, although it's not as high on the list there. Crash query:
The comments that are comprehensible all mention bookmarking a page.
Possibly related to bug 520541.
I think I should just back out bug 519270 at this point. The next release of SQLite contains a fix for this crash.
Many of the stack traces at http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5.3&query_search=signature&query_type=exact&query=sqlite3VdbeExec&date=&range_value=1&range_unit=weeks&do_query=1&signature=sqlite3VdbeExec show sqlite3VdbeExec() being called directly from nsSocketTransportService::DoPollIteration() through a pointer to a function. See, for example http://crash-stats.mozilla.com/report/index/0f46c5d7-a838-4a85-b353-f00982091027
Even if DoPollIteration() had a reason to call SQLite, there ought to be several layers of intermediate functions before you get into sqlite3VdbeExec(). I'm guessing that the problem here is that DoPollIteration is invoking a faulty function pointer, which causes a jump into the middle of SQLite and results in the crash.
(In reply to comment #3)
> Even if DoPollIteration() had a reason to call SQLite, there ought to be
> several layers of intermediate functions before you get into sqlite3VdbeExec().
> I'm guessing that the problem here is that DoPollIteration is invoking a
> faulty function pointer which is causing a jump into the middle of SQLite,
> resulting the crash.
Indeed, this is true. Any idea how this is getting a bad pointer, folks (bz, jduell)?
Frames can get omitted from stack traces, particularly when compilers are doing tail-call optimization, but also for other reasons.
Do these stacks make any more sense?
Those latter traces do make sense. Frames have been omitted, yes, but in ways that are explainable as "optimization" or "inlining". But there are way too many inexplicably missing frames in http://crash-stats.mozilla.com/report/index/0f46c5d7-a838-4a85-b353-f00982091027 to blame it on the optimizer, I think.
The stack linked from comment 6 has some other issues too. DoPollIteration could conceivably call into sqlite3VdbeExec if it had a dead object for that virtual function call. But PR_GetThreadPrivate can't possibly be calling DoPollIteration.... That said, ProcessNextEvent calling DoPollIteration via nsSocketTransportService::OnProcessNextEvent does make sense...
That said, the ownership model for the SocketContext's mHandler is pretty simple: it's reference counted and the socket context holds a reference. So unless someone else is screwing up the refcounting somewhere else, I wouldn't expect this object to be dead much.
3.5.4 was out last night, and it looks like this is still a crasher, but the stacks are different.
This is on SQLite 3.6.16 now. URL field is updated.
Lots of stacks through nsNavHistory::PreparePlacesForVisitsDelete in there.
Are we sure this is not another way to hit bug 520541?
(In reply to comment #9)
> Lots of stacks through nsNavHistory::PreparePlacesForVisitsDelete in there.
Yeah - places is the biggest consumer of SQLite, so that doesn't really surprise me.
*** Bug 524709 has been marked as a duplicate of this bug. ***
Bug 524709 is believed to be the same as this bug, but shows up with a different stack because the compiler optimizations differ.
When users try to delete more than 32K rows of history, a 16-bit integer in the SQLite code generator overflows. This is the likely cause of the OP_If problem.
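To illustrate the kind of overflow described above, here is a minimal sketch in C. It is not SQLite's code: the helper names fits_in_i16 and wrap_to_i16 are hypothetical, and it only shows what happens when a count above 32767 is squeezed into a signed 16-bit field on a two's-complement machine.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if n can be stored in a signed
 * 16-bit field without loss, 0 otherwise. */
static int fits_in_i16(int32_t n) {
    return n >= INT16_MIN && n <= INT16_MAX;
}

/* Hypothetical helper: computes the value a two's-complement signed
 * 16-bit field would hold after n is truncated into it. Going through
 * uint16_t keeps the arithmetic well-defined (modulo 2^16) in C. */
static int32_t wrap_to_i16(int32_t n) {
    uint16_t low = (uint16_t)n;
    return low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low;
}
```

Under these assumptions, a count of 32767 survives intact, while 32768 wraps around to a negative value. If a value like that were used as an offset or jump target, it would point somewhere unintended, which is consistent with the symptom reported here. Widening the field to 32 bits moves that limit well out of reach.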
Sam - they can produce a build based on 3.6.16 (currently shipping in 3.5.4) with a change that should make this crash go away (changing a 16-bit integer to a 32-bit one). This should work around the issue, and they'll have a better fix in 3.6.20, which we can decide to take on 1.9.1 at a later date. Do we want them to make this build?
(In reply to comment #14)
> Do we want them to make this build?
When you file the "upgrade sqlite" bug, please be sure to mention that this is the only thing that changed in the new version we're taking and that we're specifically taking it to fix this crash. If you have a diff, that'd help Dan feel even more comfortable. :)
Thanks a lot for looking into this! (You and drh, both.)
The diff for our candidate change is here:
The above passes all of our regression tests and seems to clear the problem as I was able to reproduce it. But we'd like to spend a little more time with it looking for related problems. We'll tag the above as a release if we don't find anything wrong with it by noon (EDT) tomorrow (2009-10-30).
Yeah, no hurry on the tag. We don't need it ASAP; code freeze is scheduled for November 10 and this is a pretty simple fix.
SQLite version 3.6.16.1 is now available at
Version 3.6.16.1 is version 3.6.16 with the simple patch shown above. The
use of version 3.6.16.1 should resolve this issue.
SQLite version 3.6.20 will also resolve this issue, though in a different way.
Also includes these crashes:
Filed bug 525539 to upgrade to SQLite 3.6.16.1 on mozilla-central, 1.9.2, and 1.9.1.
Fixed in mozilla-central via bug 525539.
Fixed in mozilla-1.9.2 via bug 525539 as well.
Fixed in mozilla-1.9.1 via bug 525539 too!