Closed
Bug 857100
Opened 12 years ago
Closed 12 years ago
Investigate missing aggregations for April 1
Categories
(Socorro :: General, task)
Socorro
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: laura, Unassigned)
References
Details
We have ADU for last night, but aggregations do not appear in the UI. Work out why and fix it.
Comment 1•12 years ago
|
||
$ grep -E "DEBUG" /var/log/socorro/crontabber.log | grep -vE "FTPScraper|Emails|Bugzilla|Duplicates|ReportsClean" | grep "successful"
2013-04-01 03:07:21,081 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.SignaturesCronApp'> on 2013-04-01 10:00:02.037445+00:00
2013-04-01 03:07:22,901 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.BuildADUCronApp'> on 2013-04-01 10:00:16.365841+00:00
2013-04-01 03:08:09,145 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.CorrelationsCronApp'> on 2013-04-01 10:00:23.561458+00:00
2013-04-01 03:08:11,668 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.CrashesByUserBuildCronApp'> on 2013-04-01 10:00:21.426379+00:00
2013-04-01 03:08:12,692 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.NightlyBuildsCronApp'> on 2013-04-01 10:00:01.313985+00:00
2013-04-01 03:08:13,202 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.HomePageGraphBuildCronApp'> on 2013-04-01 10:00:49.036329+00:00
2013-04-01 03:08:14,070 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.ADUCronApp'> on 2013-04-01 10:00:48.600524+00:00
2013-04-01 03:08:14,627 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.ProductVersionsCronApp'> on 2013-04-01 10:08:14.092286+00:00
2013-04-01 03:08:15,415 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.TCBSBuildCronApp'> on 2013-04-01 10:00:49.560122+00:00
2013-04-01 03:08:16,358 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.HomePageGraphCronApp'> on 2013-04-01 10:00:47.507316+00:00
2013-04-01 03:12:18,348 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.TCBSCronApp'> on 2013-04-01 10:00:14.664714+00:00
2013-04-01 03:14:15,189 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.ExplosivenessCronApp'> on 2013-04-01 10:00:50.349892+00:00
2013-04-01 03:14:17,583 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.CrashesByUserCronApp'> on 2013-04-01 10:00:17.633244+00:00
2013-04-02 03:00:02,490 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.NightlyBuildsCronApp'> on 2013-04-02 10:00:01.313985+00:00
2013-04-02 03:00:02,873 DEBUG - MainThread - successfully ran <class 'socorro.cron.jobs.matviews.ProductVersionsCronApp'> on 2013-04-02 10:00:02.553741+00:00
Comment 2•12 years ago
|
||
(reformatting the output above to be bugzilla friendly)
$ grep -E "DEBUG" /var/log/socorro/crontabber.log | grep -vE "FTPScraper|Emails|Bugzilla|Duplicates|ReportsClean" | grep "successful"
ran SignaturesCronApp on 2013-04-01 10:00:02.037445+00:00
ran BuildADUCronApp on 2013-04-01 10:00:16.365841+00:00
ran CorrelationsCronApp on 2013-04-01 10:00:23.561458+00:00
ran CrashesByUserBuildCronApp on 2013-04-01 10:00:21.426379+00:00
ran NightlyBuildsCronApp on 2013-04-01 10:00:01.313985+00:00
ran HomePageGraphBuildCronApp on 2013-04-01 10:00:49.036329+00:00
ran ADUCronApp on 2013-04-01 10:00:48.600524+00:00
ran ProductVersionsCronApp on 2013-04-01 10:08:14.092286+00:00
ran TCBSBuildCronApp on 2013-04-01 10:00:49.560122+00:00
ran HomePageGraphCronApp on 2013-04-01 10:00:47.507316+00:00
ran TCBSCronApp on 2013-04-01 10:00:14.664714+00:00
ran ExplosivenessCronApp on 2013-04-01 10:00:50.349892+00:00
ran CrashesByUserCronApp on 2013-04-01 10:00:17.633244+00:00
ran NightlyBuildsCronApp on 2013-04-02 10:00:01.313985+00:00
ran ProductVersionsCronApp on 2013-04-02 10:00:02.553741+00:00
Comment 3•12 years ago
|
||
No cause identified yet, but only NightlyBuilds and ProductVersions daily jobs report success from the overnight. Strange.
Confirmed with `grep -E "on 2013-04-02" /var/log/socorro/crontabber.log | grep -vE "Duplicates|Bugzilla|FTPScraper|Emails|ReportsClean"`
Comment 4•12 years ago
|
||
Interesting, why didn't the other jobs run today? Or did they start and not complete?
| Reporter | ||
Comment 5•12 years ago
|
||
Kairo: don't know yet.
lonnen, peterbe: can we safely backfill so Kairo can do his work?
Comment 6•12 years ago
|
||
Yeah this is odd, here's the current tcbs-matview state from crontabbers.json:
"tcbs-matview": {
...
"last_run": "2013-04-02 10:00:02.939109",
"last_success": "2013-04-01 10:00:14.664714",
"error_count": 0,
"last_error": {}
},
In /var/log/socorro/crontabbers.log I see it logging the "last_run" time when it tried to start:
2013-04-02 03:00:02,939 DEBUG - MainThread - about to run <class 'socorro.cron.jobs.matviews.TCBSCronApp'>
However there is no corresponding "successfully ran ..." log after that, and no errors either.
Comment 7•12 years ago
|
||
@Laura -- Backfill anytime, it won't destroy anything. All the info we're going to get is already in those logs.
Comment 8•12 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #6)
> Yeah this is odd, here's the current tcbs-matview state from
> crontabbers.json:
>
> "tcbs-matview": {
> ...
> "last_run": "2013-04-02 10:00:02.939109",
> "last_success": "2013-04-01 10:00:14.664714",
> "error_count": 0,
> "last_error": {}
> },
>
> In /var/log/socorro/crontabbers.log I see it logging the "last_run" time
> when it tried to start:
>
> 2013-04-02 03:00:02,939 DEBUG - MainThread - about to run <class
> 'socorro.cron.jobs.matviews.TCBSCronApp'>
>
> However there is no corresponding "successfully ran ..." log after that, and
> no errors either.
This could be very bad. It means that the backfill concluded that there was nothing to run.
The "about to run" message happens if the dependencies are all passing and the next_run says it's time to run. Then, when it goes into the backfill machinery it does this based on last_success to know how many days it's going to catch up on.
So, since this is run at "2013-04-02 03:00:02" (which is 10:00:02 in UTC) that means 24 hours has NOT passed since "2013-04-01 10:00:14.664714". (see, 23 hours, 59 minutes and 47 seconds)
I fear this is a bug. Here in fact: https://github.com/mozilla/socorro/blob/master/socorro/cron/base.py#L163
What it does is that it avoids any drift by forcing the time to be "10:00" but it forgets the seconds (and milliseconds).
So, this means it should kick in tomorrow but it'll be 1 day off. I'll write a patch.
Status: NEW → ASSIGNED
Comment 9•12 years ago
|
||
New pull request: https://github.com/mozilla/socorro/pull/1174
It's late in the afternoon for heavy date logic but I think it's solid.
Comment 10•12 years ago
|
||
Commit pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/f6342821711f55053156d9fbd7fa4d645ba23c9f
fixes bug 857100 - reset time by seconds and microseconds too, r=rhelmer
Updated•12 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•