Closed Bug 1768270 Opened 3 years ago Closed 3 years ago

Treeherder data cycling cron job fails: django.core.exceptions.FieldError: Cannot compute Count('id'): 'id' is an aggregate

Categories

(Tree Management :: Treeherder, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aryx, Unassigned)

Details

https://github.com/mozilla/treeherder/blob/ec5327138abc838fc2bbeb558a6066dd6ed5a543/treeherder/model/management/commands/cycle_data.py
succeeded on May 2, 3 and 4. No run is listed for May 5. On May 6 there were 7 attempts, and they failed. Joel, was there a deployment in that timeframe, and if so, what got deployed?

Traceback (most recent call last):
  File "./manage.py", line 36, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.7/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.7/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.7/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.7/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/app/treeherder/model/management/commands/cycle_data.py", line 65, in handle
    data_cycler.cycle()
  File "/app/treeherder/model/data_cycling/cyclers.py", line 61, in cycle
    self.cycle_interval, self.chunk_size, self.sleep_time
  File "/app/treeherder/model/models.py", line 515, in cycle_data
    submit_time=Max("submit_time"), id=Max("id"), count=Count("id")
  File "/usr/local/lib/python3.7/site-packages/django/db/models/query.py", line 397, in aggregate
    % (annotation.name, name, name)
django.core.exceptions.FieldError: Cannot compute Count('id'): 'id' is an aggregate
Flags: needinfo?(jmaher)

django upgrade 3.1.14 -> 3.2.13:
https://github.com/mozilla/treeherder/commit/3e8ebd555b3e09db011335fee46586ff72c9a263

utf-8 for mysql:
https://github.com/mozilla/treeherder/commit/8d40b3f904f498cf4ffca9adb7d6e1c9f3548ca9

javascript async library from 2.6.2 -> 2.6.3:
https://github.com/mozilla/treeherder/commit/d8b63ecd5b639d8b38e5dd613f83f2f79533c986

A few other things were in the deploy, but overall these are the changes that would not affect API calls or UI views.

Quite likely this is the Django upgrade, or it could be the utf-8 change.

:bastien, are you more familiar with django and mysql?

Flags: needinfo?(jmaher) → needinfo?(abadie)

This may come from the Django upgrade, but I don't immediately see why (no changes on the aggregation low-level API).

I tried running the offending line on my local checkout, and it runs fine.

Aryx, could you try running the following in a Django shell:

from treeherder.model.models import *
Job.objects.all().aggregate(submit_time=Max("submit_time"), id=Max("id"), count=Count("id"))

Does it raise an error? I get the normal output.

Running docker-compose exec backend ./manage.py cycle_data also works for me, but I guess I don't have enough data to really trigger it.

Flags: needinfo?(abadie) → needinfo?(aryx.bugmail)

Katie, could Bastien's first command from comment 2 be run on Treeherder production (or on prototype, though that is subject to change) and the result pasted here, whether or not it is an error? stage and tc-staging cycle without issues, which is unexpected.

Flags: needinfo?(aryx.bugmail) → needinfo?(kkleemola)

I see the same error on production running the command:

>>> Job.objects.all().aggregate(submit_time=Max("submit_time"), id=Max("id"), count=Count("id"))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/django/db/models/query.py", line 397, in aggregate
    % (annotation.name, name, name)
django.core.exceptions.FieldError: Cannot compute Count('id'): 'id' is an aggregate

The same error seems to appear in the logs for tc-stage and stage.

Flags: needinfo?(kkleemola)

Katie, the fix has been deployed on staging. Could you check whether you still see the errors in the logs (or when running data cycling)?

The command described above (in the django shell) will still fail as it contains the source of the bug.
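
The error message says that 'id' is an aggregate, which is consistent with the alias in id=Max("id") shadowing the model field that Count("id") refers to within the same aggregate() call. The sketch below is a plain-Python model of that name clash, not Django code. The helper shadowed_references, the dict layout, and the renamed aliases max_submit_time and max_id are all hypothetical and are not taken from the deployed fix.

```python
from typing import Dict, List, Tuple

# alias -> (aggregate function name, source field name).
# Hypothetical layout used only for this illustration.
AggregateSpec = Dict[str, Tuple[str, str]]


def shadowed_references(spec: AggregateSpec) -> List[str]:
    """List aggregates whose source field name is also the alias of a
    different aggregate in the same call (a conservative check that
    ignores argument order)."""
    problems = []
    for alias, (func, field) in spec.items():
        if field in spec and field != alias:
            problems.append(f"{func}('{field}')")
    return problems


# Mirrors the call from the traceback: the alias "id" collides with the
# field that Count("id") is meant to count.
failing_call = {
    "submit_time": ("Max", "submit_time"),
    "id": ("Max", "id"),
    "count": ("Count", "id"),
}

# Same aggregates with aliases that differ from the field names
# (alias names are assumptions made for this sketch).
renamed_call = {
    "max_submit_time": ("Max", "submit_time"),
    "max_id": ("Max", "id"),
    "count": ("Count", "id"),
}

print(shadowed_references(failing_call))  # ["Count('id')"]
print(shadowed_references(renamed_call))  # []
```

Under that assumption, giving the Max aggregates aliases that differ from the field names would avoid the clash. The actual change made in Treeherder is not shown in this bug.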

Flags: needinfo?(kkleemola)

No more errors in the logs and the cron is no longer failing on stage.

Flags: needinfo?(kkleemola)