stackdriver error reporting combines unrelated errors
Categories
(Taskcluster :: Operations and Service Requests, task)
Tracking
(Not tracked)
People
(Reporter: dustin, Unassigned)
Details
Attachments
(2 files)
The headline for this error is an error from the hooks service from a few days ago. The errors in the "recent samples" list are from the auth service and are about roles, with a different error message. Yet these were combined.
https://console.cloud.google.com/errors/CJSTprrxmruauQE?time=PT1H&project=heroku-logging
Reporter
Comment 1 • 6 years ago
Increasing the time-range of that view does, indeed, show the hooks error. It appears that stackdriver only matches on the first line of the stack?
at ServerResponse.res.reply (schema.js:76)
Is that configurable?
Reporter
Comment 2 • 6 years ago
Even then, it appears to allow some slop. So basically the errors are matched on "happened in a file named X somewhere near line Y".
TypeError: Cannot read property 'user_id' of undefined
at Handler.identityFromProfile (/app/services/login/src/handlers/mozilla-auth0.js:162:47)
at Handler.getUser (/app/services/login/src/handlers/mozilla-auth0.js:90:26)
and
TypeError: Cannot read property 'fxa_sub' of undefined
at Handler.identityFromProfile (/app/services/login/src/handlers/mozilla-auth0.js:172:47)
at Handler.getUser (/app/services/login/src/handlers/mozilla-auth0.js:90:26)
are considered equivalent.
Which is sort of worse than useless.
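To make the suspected behaviour concrete, here is a minimal Python sketch of a "file named X somewhere near line Y" matcher, compared with a stricter one that also requires the same message and the exact same top frame. This is only an illustration of the guess above, not Stackdriver's actual grouping algorithm. The function names, the regular expression, and the slop of 10 lines (the distance between the two traces quoted above) are all assumptions.

```python
import posixpath
import re

# Pulls "path:line" out of a Node.js frame such as
# "at fn (/abs/path/file.js:162:47)" or "at /abs/path/file.js:321:27".
FRAME_RE = re.compile(r"\(?([^()\s:]+):(\d+)(?::\d+)?\)?\s*$")

TRACE_A = """\
TypeError: Cannot read property 'user_id' of undefined
at Handler.identityFromProfile (/app/services/login/src/handlers/mozilla-auth0.js:162:47)
at Handler.getUser (/app/services/login/src/handlers/mozilla-auth0.js:90:26)"""

TRACE_B = """\
TypeError: Cannot read property 'fxa_sub' of undefined
at Handler.identityFromProfile (/app/services/login/src/handlers/mozilla-auth0.js:172:47)
at Handler.getUser (/app/services/login/src/handlers/mozilla-auth0.js:90:26)"""


def message(trace):
    """Return the first line of the trace, i.e. the error message."""
    return trace.splitlines()[0].strip()


def top_frame(trace):
    """Return (file basename, line number) of the first 'at ...' frame, or None."""
    for line in trace.splitlines():
        if line.strip().startswith("at "):
            match = FRAME_RE.search(line)
            if match:
                return posixpath.basename(match.group(1)), int(match.group(2))
    return None


def loosely_grouped(a, b, slop=10):
    """Hypothetical matcher: same file name, top-frame lines within `slop`."""
    fa, fb = top_frame(a), top_frame(b)
    if fa is None or fb is None:
        return False
    return fa[0] == fb[0] and abs(fa[1] - fb[1]) <= slop


def strictly_grouped(a, b):
    """Stricter alternative: same message and identical top frame."""
    return message(a) == message(b) and top_frame(a) == top_frame(b)


print(loosely_grouped(TRACE_A, TRACE_B))   # True: both land in mozilla-auth0.js near line 162
print(strictly_grouped(TRACE_A, TRACE_B))  # False: different message and different line
```

Under the loose rule the `user_id` and `fxa_sub` errors collapse into one group, which matches what the console shows. Under the strict rule they stay separate.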
Reporter
Comment 3 • 6 years ago
Another instance:
https://console.cloud.google.com/errors/CMuy9diB767SgwE?time=P7D&project=heroku-logging
seems to be matching any error with "at process._tickCallback (next_tick.js:61)" in its stack. And those familiar with node stacks will notice that's the bottommost line on just about every one. This means that lots of unrelated errors are being folded into this single error record and thus not reported in such a way that we can see them.
One of the raw tracebacks is
Time:2019-03-08T15:25:40.4539092Z
at /app/node_modules/fast-azure-storage/lib/queue.js:321:27
at tryCallOne (/app/node_modules/promise/lib/core.js:37:12)
at /app/node_modules/promise/lib/core.js:123:15
at flush (/app/node_modules/asap/raw.js:50:29)
at process._tickCallback (internal/process/next_tick.js:61:11)
Note that this is also missing the first two (more useful!) lines of the error message.
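As a rough illustration of why the bottommost frame is a poor grouping key, the Python sketch below separates Node runtime frames from the rest of the raw traceback quoted above. It assumes that runtime frames can be recognised by a path starting with `internal/`, which holds for this trace but is not a general rule. The helper names are hypothetical, and nothing here reflects an option Stackdriver is known to expose.

```python
import re

# Pulls "path:line" out of a Node.js frame such as
# "at fn (/abs/path/file.js:37:12)" or "at /abs/path/file.js:321:27".
FRAME_RE = re.compile(r"\(?([^()\s:]+):(\d+)(?::\d+)?\)?\s*$")

RAW_TRACE = """\
at /app/node_modules/fast-azure-storage/lib/queue.js:321:27
at tryCallOne (/app/node_modules/promise/lib/core.js:37:12)
at /app/node_modules/promise/lib/core.js:123:15
at flush (/app/node_modules/asap/raw.js:50:29)
at process._tickCallback (internal/process/next_tick.js:61:11)"""


def frame_paths(trace):
    """Return the file path of every 'at ...' frame, top of stack first."""
    paths = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if line.strip().startswith("at ") and match:
            paths.append(match.group(1))
    return paths


def is_runtime_frame(path):
    """Assumption: Node runtime frames have paths beginning with 'internal/'."""
    return path.startswith("internal/")


def grouping_frames(trace):
    """Frames that could plausibly tell one error apart from another."""
    return [p for p in frame_paths(trace) if not is_runtime_frame(p)]


for path in grouping_frames(RAW_TRACE):
    print(path)
```

For this trace the filter drops only the `process._tickCallback` frame, the one that appears at the bottom of nearly every stack, and keeps the four `/app/node_modules/...` frames.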
Reporter
Comment 4 • 6 years ago
It looks like the original error report in comment 0 has been replaced with
https://console.cloud.google.com/errors/CKeRk8nN5_jzxgE?time=P7D&project=heroku-logging&organizationId=442341870013
and is again logging several different errors that happen to be handled through the same codepath.
Reporter
Comment 5 • 6 years ago
Now filed as a GCP support ticket.