Bug 1295536 (Closed)
Opened 7 years ago, closed 7 years ago
Exception during perfherder ingestion 'django.db.utils:OperationalError: (1054, "Unknown column 'inf' in 'field list'")'
Categories: Tree Management :: Perfherder, defect, P1
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: emorley, Assigned: wlach
References: Blocks 1 open bug
Attachments: 2 files, 2 obsolete files
django.db.utils:OperationalError: (1054, "Unknown column 'inf' in 'field list'"):
https://rpm.newrelic.com/accounts/677903/applications/4180461/traced_errors/0bb0bc77-639a-11e6-9e2c-c81f66b8ceca_12674_19988

...during perf/models.py's save():
https://github.com/mozilla/treeherder/blob/f7b3d2f67d218580d8d5be97d558c5e72e57360d/treeherder/perf/models.py#L101-L105

Example log:
https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-win64/1471329804/autoland_win8_64_test-g4-e10s-bm112-tests1-windows-build126.txt.gz

The PERFHERDER_DATA part of the log contains replicates with value `Infinity`.

Ideally:
* The performance suite in question shouldn't ever output invalid replicate values
* Even if invalid replicates are present, they shouldn't break ingestion

The tasks hitting this error are retrying (since typically OperationalError is something that should be retried), so this is contributing to the backlog of jobs on stage/prod + Heroku.
Flags: needinfo?(wlachance)
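For illustration, the failure mode described above can be reproduced without a database. This is a minimal sketch, assuming the perf payload is decoded with Python's standard `json` module, which accepts the non-standard `Infinity` token by default. The `'%.15g'` formatting is an assumption about how a database driver might render a float as a bare literal, not something taken from the Treeherder source. The payload and the helper `find_non_finite` are hypothetical.

```python
import json
import math

# Hypothetical PERFHERDER_DATA payload; only the `Infinity` replicate mirrors the bug.
raw = (
    '{"suites": [{"name": "example", "subtests": '
    '[{"name": "t", "value": 1.5, "replicates": [1.5, Infinity]}]}]}'
)

# The stdlib decoder accepts the non-standard `Infinity` token and yields float('inf').
data = json.loads(raw)
replicates = data["suites"][0]["subtests"][0]["replicates"]

# Assumed driver behaviour: a float rendered as a bare literal becomes `inf`,
# which MySQL would then read as a column name rather than a number.
rendered = "%.15g" % replicates[1]


def find_non_finite(values):
    """Return the values that are inf or NaN (hypothetical helper)."""
    return [v for v in values if not math.isfinite(v)]


bad = find_non_finite(replicates)
```

A check along the lines of `find_non_finite` before saving would be one way to keep such replicates from reaching the database.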
Comment 1 • 7 years ago (Assignee)
We should restrict the range of acceptable numbers in the schema, I guess.
Flags: needinfo?(wlachance)
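A range restriction of this kind could look like the sketch below. It assumes the JSON Schema `minimum`/`maximum` keywords; the fragment mirrors the one quoted in comment 9. `check_number` is a hypothetical, stdlib-only stand-in for a real validator such as `jsonschema` and covers only these two keywords. The symmetric negative bound in `relaxed` is an assumption; comment 13 only mentions raising the limit to a trillion.

```python
# Schema fragment as quoted in comment 9 (bounds of +/- one billion).
VALUE_SCHEMA = {
    "type": "number",
    "title": "Subtest value",
    "description": "Summary value for subtest",
    "minimum": -1000000000.0,
    "maximum": 1000000000.0,
}


def check_number(value, schema):
    """Hypothetical stand-in for a JSON Schema validator.

    Only handles numbers with 'minimum'/'maximum'; returns a list of
    error strings (empty when the value is acceptable).
    """
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return ["%r is not of type 'number'" % (value,)]
    errors = []
    if "minimum" in schema and value < schema["minimum"]:
        errors.append("%r is less than the minimum of %r" % (value, schema["minimum"]))
    if "maximum" in schema and value > schema["maximum"]:
        errors.append("%r is greater than the maximum of %r" % (value, schema["maximum"]))
    return errors


# An infinite value is rejected because inf exceeds any finite maximum.
inf_errors = check_number(float("inf"), VALUE_SCHEMA)
# The value reported in comment 9 exceeds the one-billion bound.
big_errors = check_number(1780207616, VALUE_SCHEMA)
# With a one-trillion bound (negative bound assumed symmetric) it passes.
relaxed = dict(VALUE_SCHEMA, minimum=-1e12, maximum=1e12)
relaxed_errors = check_number(1780207616, relaxed)
```

In this sketch NaN compares false against both bounds, so a range check alone would not reject it; that case would need a separate finiteness check.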
Comment 2 • 7 years ago
Comment 3 • 7 years ago (Assignee)
Comment on attachment 8781607: [treeherder] wlach:1295536 > mozilla:master
We should sync the schema changes to m-c, if this looks ok. That way talos will fail if it's producing these weird values.
Attachment #8781607 - Flags: review?(emorley)
Comment 4 • 7 years ago (Assignee)
Filed bug 1295630 about the underlying issue in the test.
Comment 5 • 7 years ago (Reporter)
Comment on attachment 8781607: [treeherder] wlach:1295536 > mozilla:master
Many thanks :-)
Attachment #8781607 - Flags: review?(emorley) → review+
Comment 6 • 7 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/07db3a801b5ded5e096d34a052576f0eb65cb7c8
Bug 1295536 - Validate that perfherder values are within acceptable ranges (#1786)

Especially make sure that we have no "infinite" values, as those can cause exceptions.
Comment hidden (mozreview-request) |
Comment 8 • 7 years ago (mozreview-review)
Comment on attachment 8781628: Bug 1295536 - Update performance schema to treeherder latest;
https://reviewboard.mozilla.org/r/72014/#review69512
this will be much better! And the values look very sane
Attachment #8781628 - Flags: review?(jmaher) → review+
Comment 9 • 7 years ago (Reporter)
(In reply to Treeherder Bugbot from comment #6)
> Commit pushed to master at https://github.com/mozilla/treeherder
>
> https://github.com/mozilla/treeherder/commit/07db3a801b5ded5e096d34a052576f0eb65cb7c8
> Bug 1295536 - Validate that perfherder values are within acceptable ranges (#1786)

In the last 30 minutes there have been 6000+ exceptions on Heroku stage (which deploys master) of form similar to:

jsonschema.exceptions:ValidationError: 1780207616 is greater than the maximum of 1000000000.0
Failed validating 'maximum' in schema['properties']['suites']['items']['properties']['subtests']['items']['properties']['value']:
    {'description': 'Summary value for subtest',
     'maximum': 1000000000.0,
     'minimum': -1000000000.0,
     'title': 'Subtest value',
     'type': 'number'}
On instance['suites'][0]['subtests'][1]['value']:
    1780207616

See: https://rpm.newrelic.com/accounts/677903/applications/14179733/filterable_errors#/table?top_facet=transactionUiName&barchart=barchart&_k=26jo7t

I'll revert this in the meantime, to unbreak ingestion on Heroku.

The jobs are also retrying when they shouldn't - we should add `jsonschema.exceptions:ValidationError` to the non-retryable exceptions list :-)
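The retry distinction raised at the end of this comment can be sketched as follows. This is not Treeherder's actual task wrapper: `run_with_retries`, `NON_RETRYABLE`, and the local `OperationalError` and `ValidationError` classes are hypothetical stand-ins used only to show the idea.

```python
class OperationalError(Exception):
    """Stand-in for a transient database error (worth retrying)."""


class ValidationError(Exception):
    """Stand-in for jsonschema.exceptions.ValidationError."""


# Exceptions that indicate bad input rather than a transient fault.
NON_RETRYABLE = (ValidationError,)


def run_with_retries(task, max_attempts=3):
    """Call task() until it succeeds, re-raising non-retryable errors at once.

    Returns a (result, attempts_used) pair.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except NON_RETRYABLE:
            raise  # retrying cannot make invalid data valid
        except Exception:
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")
```

With a policy like this, a payload that fails validation is attempted once and dropped from the queue, while transient errors still get their retries.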
Comment 10 • 7 years ago
Comment 11 • 7 years ago (Reporter)
Plus I think we should maybe not raise at all for jobs that fail to validate. If it was an API submission we could return an HTTP 400, and if it was buildbot/Pulse ingestion then just skip ingesting perf data for it.
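The two behaviours suggested here could be sketched roughly as below. All names (`validate`, `ingest_from_log`, `submit_via_api`, the local `ValidationError`) are hypothetical, and the validator is a trivial placeholder rather than the real perfherder schema check.

```python
import logging

logger = logging.getLogger(__name__)


class ValidationError(Exception):
    """Stand-in for a schema validation failure."""


def validate(perf_data):
    """Placeholder validator: rejects suites that have no name."""
    for suite in perf_data.get("suites", []):
        if "name" not in suite:
            raise ValidationError("suite is missing 'name'")


def ingest_from_log(perf_data, store):
    """Log-parser path: skip invalid perf data instead of raising.

    Returns True when the data was stored, False when it was skipped.
    """
    try:
        validate(perf_data)
    except ValidationError as exc:
        logger.warning("Skipping invalid perf data: %s", exc)
        return False
    store(perf_data)
    return True


def submit_via_api(perf_data, store):
    """API path: report invalid data to the client as HTTP 400.

    Returns a (status_code, message) pair.
    """
    try:
        validate(perf_data)
    except ValidationError as exc:
        return 400, str(exc)
    store(perf_data)
    return 200, "ok"
```

In both paths nothing is raised to the task runner, so an invalid payload cannot trigger retries or block the rest of the job's ingestion.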
Comment 12 • 7 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/8858fc144394ecd2eebf39ff14acc8a207566ee7
Revert "Bug 1295536 - Validate that perfherder values are within acceptable ranges" (#1788)

Reverts mozilla/treeherder#1786, due to: https://bugzilla.mozilla.org/show_bug.cgi?id=1295536#c9
Comment 13 • 7 years ago (Assignee)
Heh, apparently a billion isn't enough for perfherder. Let's set the limit at a trillion then. :)

I'll also make some other changes to fix the ingestion problems:
1. Don't even try to store performance artifacts which don't comply with the schema
2. Make jsonschema validation errors non-retryable
Comment 14 • 7 years ago
Updated • 7 years ago (Assignee)
Attachment #8782057 - Flags: review?(emorley)
Updated • 7 years ago (Assignee)
Attachment #8781607 - Attachment is obsolete: true
Comment 15 • 7 years ago (Reporter)
Comment on attachment 8782057: [treeherder] wlach:1295536 > mozilla:master
Looks good (just needs the import order tweak to fix the isort failure). Thank you for sorting this :-)
Attachment #8782057 - Flags: review?(emorley) → review+
Updated • 7 years ago (Reporter)
Attachment #8781745 - Attachment is obsolete: true
Comment 16 • 7 years ago
Commits pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/e97b8a349ddfa4a8fcd58c041d4f3876923602fe
Bug 1295536 - Don't try to store non-compliant perf data via logparser

https://github.com/mozilla/treeherder/commit/b9d4f8b4e1a945b729469c683bdbac59af5396c0
Bug 1295536 - Make jsonschema validation errors non-retryable

https://github.com/mozilla/treeherder/commit/c660021d2b0755e6d957b129d9988f59ec3663dd
Bug 1295536 - Validate that perfherder values are within acceptable ranges
Especially make sure that we have no "infinite" values, as those can cause exceptions.

https://github.com/mozilla/treeherder/commit/0f980d1808d8c7d14936d2184bbee867c30b4c9a
Merge pull request #1789 from wlach/1295536
Bug 1295536 - Validate that perfherder values are within acceptable ranges - take 2
Comment 17 • 7 years ago (Assignee)
Hopefully this is good now.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated • 7 years ago (Reporter)
Assignee: nobody → wlachance
Comment 18 • 7 years ago (Assignee)
Oops, we still need to update m-c to turn jobs orange when we encounter this. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment hidden (mozreview-request) |
Comment 20 • 7 years ago
Pushed by wlachance@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/31cde7bb6a9e Update performance schema to treeherder latest;r=jmaher
Comment 21 • 7 years ago
Backed out in https://hg.mozilla.org/integration/autoland/rev/8d682fddd924 for https://treeherder.mozilla.org/logviewer.html#?job_id=2348544&repo=autoland
Comment 22 • 7 years ago (Assignee)
(In reply to Phil Ringnalda (:philor) from comment #21)
> Backed out in https://hg.mozilla.org/integration/autoland/rev/8d682fddd924
> for https://treeherder.mozilla.org/logviewer.html#?job_id=2348544&repo=autoland

I'm going to reland now that we've fixed that issue in bug 1295630 (and the test is no longer producing infinite values).
Comment 23 • 7 years ago
Pushed by wlachance@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/28fe13ad5610 Update performance schema to treeherder latest;r=jmaher
Comment 24 • 7 years ago (bugherder)
https://hg.mozilla.org/mozilla-central/rev/28fe13ad5610
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED