Closed Bug 603009 Opened 9 years ago Closed 9 years ago

Catch and report all exceptions thrown during test pilot study execution

Categories

(Mozilla Labs Graveyard :: Test Pilot, defect, P1, major)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jono, Unassigned)

Details

(Whiteboard: [approved patches landed on aurora] data, api, extension-core)

Attachments

(1 file, 2 obsolete files)

We should be catching exceptions raised by test pilot code, in the most general way possible, recording them and then reporting them as part of the data upload (in the metadata, or perhaps a new .errors section of the JSON).

This way we can find out what's going wrong on people's computers when they try to run the studies, and fix it in future revisions of the study.

There are two places to catch exceptions: The first is in remote-experiment-loader, where we have the top-level Cuddlefish context that requires and runs everything else.  Any exceptions in the main threads will bubble up to there.

But exceptions that happen in callbacks from listeners are executed from the gecko thread, so we can't catch them in cuddlefish.  The best thing to do there, I think, is to catch exceptions in the ._listen() method of the base study class.

We will probably need to add a new database table for these stored exceptions in order to not abuse the various experiment schemas too bad.  So the data store will need additional API calls for recording errors, and then all the errors should get automatically reported on upload.
Whiteboard: data, api, extension-core
Severity: normal → major
Priority: -- → P1
Target Milestone: -- → 1.1
Basic functionality added (with unit tests!) in http://hg.mozilla.org/labs/testpilot/rev/71864f9d24c9

Specifically this will catch, log and report exceptions caught in the handlers' onExperimentStartup, onExperimentShutdown, onAppStartup, onAppShutdown, onEnterPrivateBrowsing, onExitPrivateBrowsing, onNewWindow, onWindowClosed, and doExperimentCleanup methods.

It will not (yet) log errors thrown in the callbacks of listeners (that has to be done in base_classes on the server side) or errors thrown when Cuddlefish attempts to require() the study code (those are tricky to log because the experiment's data store doesn't exist yet.)

Also we're not yet getting the line number in the study code from which the exception originates.  Should look at how Cuddlefish does that and replicate.
Target Milestone: 1.1 → 1.2
Attachment #529777 - Flags: review?(dtownsend)
Better version of patch, incorporating fixes for bugs 645092 and 646122
Attachment #529777 - Attachment is obsolete: true
Attachment #529790 - Flags: review?(dtownsend)
Attachment #529777 - Flags: review?(dtownsend)
Comment on attachment 529790 [details] [diff] [review]
Patch to mozilla-central to add this feature

>diff --git a/browser/app/profile/extensions/testpilot@labs.mozilla.com/modules/experiment_data_store.js b/browser/app/profile/extensions/testpilot@labs.mozilla.com/modules/experiment_data_store.js

>       switch (this._columns[i].type) {
>-        case TYPE_INT_32: case TYPE_DOUBLE:
>-          insStmt.params[i] = datum;
>+        case TYPE_INT_32:
>+          insStmt.bindInt32Parameter(i, datum);
>+        break;
>+        case TYPE_DOUBLE:
>+          insStmt.bindDoubleParameter(i, datum);
>         break;
>         case TYPE_STRING:
>-          insStmt.params[i] = sanitizeString(datum);
>+          insStmt.bindUTF8StringParameter(i, sanitizeString(datum));
>         break;
>       }

bindXXXParameter methods are now deprecated. Fine to use for now but you should open a bug on replacing them with bindByName or bindByIndex.

>+  logException: function EDS_logException(exception) {
>+    let insertSql = "INSERT INTO " + EXCEPTION_TABLE_NAME + " VALUES (?1, ?2);";
>+    let insStmt =  this._createStatement(insertSql);
>+    // Even though the SQL says ?1 and ?2, the bind methods count from 0.
>+    // What a confusing API!
>+    insStmt.bindDoubleParameter(0, Date.now());
>+    let txt = exception.message ? exception.message : exception.toString();
>+    insStmt.bindUTF8StringParameter(1, txt);
>+    insStmt.executeAsync();
>+    insStmt.finalize(); // TODO Is this the right thing to do when calling asynchronously?

Yes I believe it is, in this form executeAsync just clones the statement and executes that I believe. Apparently it can be better to use createAsyncStatement instead, but something for another bug.
Attachment #529790 - Flags: review?(dtownsend) → review+
(In reply to comment #3)
> Created attachment 529790 [details] [diff] [review] [review]
> Patch to mozilla-central to add this feature
> 
> Better version of patch, incorporating fixes for bugs 645092 and 646122

These two patches seem to be identical and don't include those fixes. The fix for bug 645092 looks reviewed already.
Oops, this is the file I meant to attach before.
Attachment #529790 - Attachment is obsolete: true
Attachment #529799 - Flags: review?(dtownsend)
Comment on attachment 529799 [details] [diff] [review]
Patch to mozilla-central to add this feature

Looks good.
Attachment #529799 - Flags: review?(dtownsend) → review+
Landed: http://hg.mozilla.org/mozilla-central/rev/f6152e382b94
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment on attachment 529799 [details] [diff] [review]
Patch to mozilla-central to add this feature

Aurora drivers, I think we should take this so we can get the in-tree test pilot for Firefox 5.0 in sync with the version currently on AMO
Attachment #529799 - Flags: approval-mozilla-aurora?
Comment on attachment 529799 [details] [diff] [review]
Patch to mozilla-central to add this feature

Part of plan to solve the problems we encounter leading to beta.
Attachment #529799 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: data, api, extension-core → [approved patches landed on aurora] data, api, extension-core
Product: Mozilla Labs → Mozilla Labs Graveyard
You need to log in before you can comment on or make changes to this bug.