Permanent Orange: TEST-UNEXPECTED-FAIL | log4moz.js | gloda.datastore ERROR got error in _asyncTrackerListener.handleError(): 10: disk I/O error

RESOLVED FIXED in Thunderbird 26.0

Status

MailNews Core
Database
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: standard8, Assigned: standard8)

Tracking

({intermittent-failure})

Trunk
Thunderbird 26.0
intermittent-failure
Dependency tree / graph

Thunderbird Tracking Flags

(thunderbird24 fixed, thunderbird25 fixed)

Details

Attachments

(2 attachments)

(Assignee)

Description

4 years ago
Since resolving some of a big bustage, which included the xpcshell reporting more errors than it had previously, we've been seeing these types of failures in some of the gloda tests:

https://tbpl.mozilla.org/php/getParsedLog.php?id=26160366&tree=Thunderbird-Aurora

TEST-UNEXPECTED-FAIL | /Users/cltbld/talos-slave/test/build/xpcshell/tests/mailnews/db/gloda/test/unit/test_gloda_content_imap_offline.js | test failed (with xpcshell return code: 0), see following log:
...
TEST-INFO | (xpcshell/head.js) | test MAIN run_test pending (1)
2013-08-05 04:13:40	gloda.datastore	DEBUG	!! mapped 1 from mailbox://nobody@Local%20Folders/Trash
2013-08-05 04:13:40	gloda.datastore	DEBUG	!! mapped 2 from mailbox://nobody@Local%20Folders/Unsent%20Messages
2013-08-05 04:13:40	gloda.datastore	DEBUG	!! mapped 3 from imap://user@localhost/INBOX
2013-08-05 04:13:40	gloda.test	DEBUG	((( Index listener notified! aStatus = 0 waiting: false


TEST-INFO | (xpcshell/head.js) | test pending (2)

TEST-INFO | (xpcshell/head.js) | test _async_driver pending (3)

TEST-INFO | (xpcshell/head.js) | test MAIN run_test finished (3)

TEST-INFO | (xpcshell/head.js) | running event loop

TEST-UNEXPECTED-FAIL | resource:///modules/gloda/log4moz.js | 2013-08-05 04:13:40	gloda.datastore	ERROR	got error in _asyncTrackerListener.handleError(): 10: disk I/O error
 - See following stack:
JS frame :: resource:///modules/gloda/log4moz.js :: TApp_doAppend :: line 925
JS frame :: resource:///modules/gloda/log4moz.js :: App_append :: line 587
JS frame :: resource:///modules/gloda/log4moz.js :: Logger_log :: line 377
JS frame :: resource:///modules/gloda/log4moz.js :: Logger_error :: line 385
JS frame :: resource:///modules/gloda/datastore.js :: <TOP_LEVEL> :: line 1795
native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0
(Assignee)

Comment 1

4 years ago
Asuth, any idea where we can look here? I'm a bit lost as to where these are coming from or why.
Flags: needinfo?(bugmail)
could the test machines/a single test machine have full disks or have two xpcshell tests trying to run at the same time in the same dir and fighting each other or something like that?

the extended error codes (which we won't get told) are:
#define SQLITE_IOERR_READ              (SQLITE_IOERR | (1<<8))
#define SQLITE_IOERR_SHORT_READ        (SQLITE_IOERR | (2<<8))
#define SQLITE_IOERR_WRITE             (SQLITE_IOERR | (3<<8))
#define SQLITE_IOERR_FSYNC             (SQLITE_IOERR | (4<<8))
#define SQLITE_IOERR_DIR_FSYNC         (SQLITE_IOERR | (5<<8))
#define SQLITE_IOERR_TRUNCATE          (SQLITE_IOERR | (6<<8))
#define SQLITE_IOERR_FSTAT             (SQLITE_IOERR | (7<<8))
#define SQLITE_IOERR_UNLOCK            (SQLITE_IOERR | (8<<8))
#define SQLITE_IOERR_RDLOCK            (SQLITE_IOERR | (9<<8))
#define SQLITE_IOERR_DELETE            (SQLITE_IOERR | (10<<8))
#define SQLITE_IOERR_BLOCKED           (SQLITE_IOERR | (11<<8))
#define SQLITE_IOERR_NOMEM             (SQLITE_IOERR | (12<<8))
#define SQLITE_IOERR_ACCESS            (SQLITE_IOERR | (13<<8))
#define SQLITE_IOERR_CHECKRESERVEDLOCK (SQLITE_IOERR | (14<<8))
#define SQLITE_IOERR_LOCK              (SQLITE_IOERR | (15<<8))
#define SQLITE_IOERR_CLOSE             (SQLITE_IOERR | (16<<8))
#define SQLITE_IOERR_DIR_CLOSE         (SQLITE_IOERR | (17<<8))
#define SQLITE_IOERR_SHMOPEN           (SQLITE_IOERR | (18<<8))
#define SQLITE_IOERR_SHMSIZE           (SQLITE_IOERR | (19<<8))
#define SQLITE_IOERR_SHMLOCK           (SQLITE_IOERR | (20<<8))
#define SQLITE_IOERR_SHMMAP            (SQLITE_IOERR | (21<<8))
#define SQLITE_IOERR_SEEK              (SQLITE_IOERR | (22<<8))
#define SQLITE_IOERR_DELETE_NOENT      (SQLITE_IOERR | (23<<8))
#define SQLITE_IOERR_MMAP              (SQLITE_IOERR | (24<<8))

All of these things seem like bad things that shouldn't happen.
Flags: needinfo?(bugmail)
(Assignee)

Comment 3

4 years ago
Parallel running of tests hasn't been enabled. The disk size shouldn't be an issue as we are running the 2GB and 4GB folder size tests on these same machines.
(Assignee)

Updated

4 years ago
Blocks: 896738
(Assignee)

Comment 4

4 years ago
Ok, so I've been doing some debugging. It looks like this error is being generated with these three INSERT statements:

TEST-INFO | (xpcshell/head.js) | test MAIN run_test pending (1)
2028134784[108119070]: Inited async statement 'INSERT INTO folderLocations (id, folderURI, dirtyStatus, name,                                     indexingPriority) VALUES         (?1, ?2, ?3, ?4, ?5)' (0x108116158)
196612096[10811bff0]: Initialized statement 'INSERT INTO folderLocations (id, folderURI, dirtyStatus, name,                                     indexingPriority) VALUES         (?1, ?2, ?3, ?4, ?5)' (0x1081341a0)
196612096[10811bff0]: sqlite3_trace on 10b730b70 for 'INSERT INTO folderLocations (id, folderURI, dirtyStatus, name,                                     indexingPriority) VALUES         (1, 'mailbox://nobody@Local%20Folders/Trash', 0, 'Trash', -1)'
2013-08-06 14:26:48	gloda.datastore	DEBUG	!! mapped 1 from mailbox://nobody@Local%20Folders/Trash
196612096[10811bff0]: sqlite3_trace on 10b730b70 for 'INSERT INTO folderLocations (id, folderURI, dirtyStatus, name,                                     indexingPriority) VALUES         (2, 'mailbox://nobody@Local%20Folders/Unsent%20Messages', 0, 'Unsent Messages', -1)'
2013-08-06 14:26:48	gloda.datastore	DEBUG	!! mapped 2 from mailbox://nobody@Local%20Folders/Unsent%20Messages
196612096[10811bff0]: sqlite3_trace on 10b730b70 for 'INSERT INTO folderLocations (id, folderURI, dirtyStatus, name,                                     indexingPriority) VALUES         (3, 'imap://user@localhost/INBOX', 0, 'INBOX', 50)'
2013-08-06 14:26:48	gloda.datastore	DEBUG	!! mapped 3 from imap://user@localhost/INBOX
2013-08-06 14:26:48	gloda.test	DEBUG	((( Index listener notified! aStatus = 0 waiting: false

Not quite sure why yet.
(Assignee)

Comment 5

4 years ago
Created attachment 786294 [details]
Run with debugging

This is debugging with NSPR_LOG_MODULES=mozStorage:5 and an additional line:

    dbConnection.executeSimpleSQL("PRAGMA vdbe_debug = 1");

added in _createDB in datastore.js

The interesting part is the three rc=1802 lines after each of the sqlite3_trace INSERT statements I referenced in comment 4 above.

Looking at the definitions, I think that equates to SQLITE_IOERR_FSTAT.

Checking the profile directory whilst the test is running, indeed shows that the file is no longer present - something has removed it.
Huh.  So, there's no good reason for the file to go away.  Gloda logging says things like "migrate" and "purged" when it decides to nuke the database.

I thought xpcshell created a physically new directory for each fake-run-dir it uses?  So it shouldn't be any type of lingering thing, right?  And the toolkit logic that helps provide automated vacuuming should still be opt-in exclusively, so this is a big stumpy to me.

Is it possible to run dtrace scripts?  The failure you reported was on OS X, and it can easily tell us who is being a jerk.  You can probably even add a ustack() command:

see:
http://superuser.com/questions/164749/how-does-one-find-who-or-what-service-deleted-a-particular-file-or-folder-in-s
http://chihungchan.blogspot.com/2009/05/which-process-deleted-my-file.html
(Assignee)

Comment 7

4 years ago
Created attachment 786310 [details] [diff] [review]
The fix

The problem here is that the initialize for mailDirService.js is being called twice - I think I centralized it in the head_gloda.js file, but didn't realise that gloda was also pulling in head_server.js from the imap tests.

This fixes a few of the failures for me.
Assignee: nobody → mbanner
Status: NEW → ASSIGNED
Attachment #786310 - Flags: review?(Pidgeot18)
(Assignee)

Updated

4 years ago
Blocks: 901959
Attachment #786310 - Flags: review?(Pidgeot18) → review+
(Assignee)

Comment 8

4 years ago
https://hg.mozilla.org/comm-central/rev/90e049357560
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 26.0
(Assignee)

Comment 9

4 years ago
Comment on attachment 786310 [details] [diff] [review]
The fix

[Triage Comment]
Looking good on trunk (Mac debug failures down to 7!). Getting this onto branches.
Attachment #786310 - Flags: approval-comm-beta+
Attachment #786310 - Flags: approval-comm-aurora+
(Assignee)

Comment 10

4 years ago
https://hg.mozilla.org/releases/comm-aurora/rev/d856a8a20b8d
https://hg.mozilla.org/releases/comm-beta/rev/6cdc385231e8
status-thunderbird24: --- → fixed
status-thunderbird25: --- → fixed
You need to log in before you can comment on or make changes to this bug.