Closed Bug 1376139 Opened 4 years ago Closed 4 years ago

TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js and more - SQLite: Assertion failed: (p->nRef==0)

Categories

(Thunderbird :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 56.0

People

(Reporter: jorgk-bmo, Assigned: jorgk-bmo)

References

Details

(Whiteboard: [Thunderbird-testfailure: X all debug])

Attachments

(1 file)

Crashes on all platforms in debug, but the logs look a little different:

Mac:
TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js | xpcshell return code: 5
PID 8571 | Assertion failed: (p->nRef==0), function pthreadMutexFree, file /builds/slave/tb-c-cen-m64-d-000000000000000/build/mozilla/db/sqlite3/src/sqlite3.c, line 23350.
PROCESS-CRASH | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js | application crashed [@ __pthread_kill + 0xa]

Windows:
TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js | xpcshell return code: 1
PID 3636 | Assertion failed: p->nRef==0 && p->owner==0, file c:/builds/moz2_slave/tb-c-cen-w32-d-000000000000000/build/mozilla/db/sqlite3/src/sqlite3.c, line 24074
PROCESS-CRASH | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js | application crashed [@ abort + 0x5a]

Linux:

TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js | xpcshell return code: -6
PROCESS-CRASH | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js | application crashed [@ libc-2.15.so + 0x360d5]

First seen Wed Jun 21, 2017, 9:28:39:
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=29b9a4a3fc1a9edd224ad789ad7a8e60c3a86cac

M-C last good: 7a6baa6cca3292e8099e652b64d27e74df
M-C first bad: e1e4a481b7e88dce163b9cccc2fb720320

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7a6baa6cca3292e8099e652b64d27e74df&tochange=e1e4a481b7e88dce163b9cccc2fb720320
Whiteboard: [Thunderbird-testfailure: X all debug]
Line 23350:
static void pthreadMutexFree(sqlite3_mutex *p){
  assert( p->nRef==0 );

Line 24074:
static void winMutexFree(sqlite3_mutex *p){
  assert( p );
  assert( p->nRef==0 && p->owner==0 );

Looks like bug 1371945 - https://hg.mozilla.org/mozilla-central/rev/eacdba502064

Marco and Andrew, any hints why our tests are suddenly crashing in debug mode?

If you look here, you'll see other tests crashing, all on the assertion quoted above.
Flags: needinfo?(mak77)
Flags: needinfo?(bugmail)
Summary: TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js → TEST-UNEXPECTED-FAIL | xpcshell-icaljs.ini:calendar/test/unit/test_bug494140.js and more - SQLite: Assertion failed: (p->nRef==0)
If you look here, you'll see other tests crashing, all on the assertion quoted above (now with the link):
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=064ea09a484db674bea276f4f634dd9b7bf85d24&selectedJob=109771100

Three storage related tests fail.
A couple things became a little bit more stricter:
1. prepared statements must be finalized. so if you have a statement created through createStatement, you must .finalize() it.
2. connections must be closed either with Close() or AsyncClose().

check for warnings before the assertions:

WARNING: SQL statement 'SELECT DISTINCT(zone) FROM (SELECT recurrence_id_tz AS zone FROM cal_attendees    WHERE recurrence_id_tz IS NOT NULL UNION SELECT recurrence_id_tz AS zone FROM cal_events       WHERE recurrence_id_tz IS NOT NULL UNION SELECT event_start_tz   AS zone FROM cal_events       WHERE event_start_tz   IS NOT NULL UNION SELECT event_end_tz     AS zone FROM cal_events       WHERE event_end_tz     IS NOT NULL UNION SELECT recurrence_id_tz AS zone FROM cal_properties   WHERE
03:17:57     INFO -  PID 7530
WARNING: SQL statement 'SELECT version FROM cal_tz_version LIMIT 1' (a6130a90) should have been finalized before closing the connection: file /builds/slave/tb-c-cen-lx-d-0000000000000000/build/mozilla/storage/mozStorageConnection.cpp, line 1062

also see the patch in bug 1371945, I fixed a bunch of tests in m-c, and likely you can just copy over the tactics from there.
Flags: needinfo?(mak77)
Flags: needinfo?(bugmail)
:mak's too fast.  I mid-aired, but here's my non-updated-from-midair comment:

There's 2 things going on here:

== Thing 1

calendar/providers/storage/calStorageUpgrade.jsm is failing to finalize 2 statements, as indicated by the log warnings immediately prior to the assertion failure:
I think the log lines above the assertion suggest why the assertion is firing:
15:10:57     INFO -  PID 6902 | [6902] WARNING: SQL statement 'SELECT DISTINCT(zone) FROM (SELECT recurrence_id_tz AS zone FROM cal_attendees    WHERE recurrence_id_tz IS NOT NULL UNION SELECT recurrence_id_tz AS zone FROM cal_events       WHERE recurrence_id_tz IS NOT NULL UNION SELECT event_start_tz   AS zone FROM cal_events       WHERE event_start_tz   IS NOT NULL UNION SELECT event_end_tz     AS zone FROM cal_events       WHERE event_end_tz     IS NOT NULL UNION SELECT recurrence_id_tz AS zone FROM cal_properties   WHERE
15:10:57     INFO -  PID 6902 | [6902] WARNING: SQL statement 'SELECT version FROM cal_tz_version LIMIT 1' (11a7099e0) should have been finalized before closing the connection: file /builds/slave/tb-c-cen-m64-d-000000000000000/build/mozill

The latter statement is from `ensureUpdatedTimezones` and will eventually be collected and destroyed by GC, so it at least could be intermittent based on GC.  It's currently subject to an explicit .reset() but it should just be .finalize()d instead.  The first statement may be the getZones statement from later in the file that is also reset() instead of being finalize()d and intermittent due to potential GC.

This is a legit bug in calendar and regardless of what happens for thing 2 below, it should be fixed because mozStorage is just as likely to intentionally assert on these unfinalized statements as accidentally let sqlite do it for us.  Feel free to use this bug to track your Thunderbird fix.  Actual legwork for mozStorage will happen in its own bug.

== Thing 2, recent change in mozStorage

In mozStorage we changed to hold the sharedDBMutex locked scope in our "there were unfinalized statements, we'll finalize them for you" `if (srv == SQLITE_BUSY)` clause.  We should be dropping the lock immediately prior to re-triggering sqlite3_close the second time.  That's the cause of the assertion failure.  (Noting that because of the various iterations we went through with Statement::internalFinalize, we shouldn't actually need to be holding the sharedDBMutex at all, although it's arguably a nice-to-have in the face of all the crazy stuff that could be happening.)

Notably, while look at this, it turns out sqlite3_close_v2 introduces the ability for the close to succeed and the connection to be put into a zombie state until all the statements actually get finalized.  The downside to that is we don't get to complain about the statements that should have finalized.

I'm leaving the needinfo for :mak up for context, but will file a mozStorage bug now.
Wires crossed with Andrew's reply, I'm posting this anyway.

Thanks, Marco.

(In reply to Jorg K (GMT+2) from comment #2)
> Three storage related tests fail.
OK, so let's look at those tests:
calendar/test/unit/test_bug494140.js
calendar/test/unit/test_bug759324.js
calendar/test/unit/test_storage.js

This warning appears 12 times:
15:10:57     INFO -  PID 6902 | [6902] WARNING: SQL statement 'SELECT DISTINCT(zone) FROM (SELECT recurrence_id_tz AS zone FROM cal_attendees    WHERE recurrence_id_tz IS NOT NULL UNION SELECT recurrence_id_tz AS zone FROM cal_events       WHERE recurrence_id_tz IS NOT NULL UNION SELECT event_start_tz   AS zone FROM cal_events       WHERE event_start_tz   IS NOT NULL UNION SELECT event_end_tz     AS zone FROM cal_events       WHERE event_end_tz     IS NOT NULL UNION SELECT recurrence_id_tz AS zone FROM cal_properties   WHERE
15:10:57     INFO -  PID 6902 | [6902] WARNING: SQL statement 'SELECT version FROM cal_tz_version LIMIT 1' (11a7099e0) should have been finalized before closing the connection.

Looking at code I see a .reset:
https://dxr.mozilla.org/comm-central/rev/484beb60833c61561f1c447b972e7d4f55e52633/calendar/providers/storage/calStorageUpgrade.jsm#406

Changing this to .finalize gets me further, but now it complains on 
SELECT version FROM cal_tz_version LIMIT 1
That's also .reset() and not .finalize()d.

With these two changes, those three tests pass. Patch coming.
Attached patch fin.patch (v1).Splinter Review
I spotted some more crashes in Mozmill runs. With this patch all four remaining reset() statements are turned into finalize().
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
https://hg.mozilla.org/comm-central/rev/530a4148e236774c49fe403e4fc5a2d191762474
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 56.0
You need to log in before you can comment on or make changes to this bug.