Closed
Bug 1075176
Opened 11 years ago
Closed 11 years ago
Pushlog ingestion sometimes caches the wrong revision as the latest for each repo
Categories
(Tree Management :: Treeherder, defect, P1)
Tree Management
Treeherder
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: mdoglio)
References
Details
Attachments
(1 file)
Ryan noticed that two pushes were missing from b2g32 and some also from mozilla-aurora.
It is my understanding that we should be doing something like:
1) Check cached value for last-seen-rev for repo X
2) Poll json-pushes for &fromchange=that-value
3) Attempt to insert those pushes into the DB
4) Only if the insert of all records was successful, update last-seen-rev in the cache to be the latest push revision seen
Given this is failing, we're either:
* Updating the cached value when the insert hadn't completed successfully
* Updating the cached value to the wrong revision (ie: we picked the revision incorrectly from the array of pushes)
| Reporter | ||
Comment 1•11 years ago
|
||
jeads think this may be the cause:
https://github.com/mozilla/treeherder-service/blob/master/treeherder/etl/pushlog.py#L93
> sorted_pushlog = sorted(extracted_content.values(),
> key=itemgetter('date'), reverse=True)
> last_push = sorted_pushlog[0]
I think sorting by date is incorrect - we should sort by pushID (which is actually the original order).
| Reporter | ||
Comment 2•11 years ago
|
||
(In reply to Ed Morley [:edmorley] from comment #1)
> I think sorting by date is incorrect - we should sort by pushID (which is
> actually the original order).
Or more:
last_push = extracted_content.values()[-1]
| Assignee | ||
Updated•11 years ago
|
Assignee: nobody → mdoglio
| Assignee | ||
Updated•11 years ago
|
Status: NEW → ASSIGNED
| Assignee | ||
Comment 3•11 years ago
|
||
Attachment #8498047 -
Flags: review?(emorley)
| Reporter | ||
Comment 4•11 years ago
|
||
Comment on attachment 8498047 [details] [review]
Github PR #235 on treeherder-service
This patch is definitely the right way to do this, so we should take it either way - however it got me thinking:
Surely if we pick the wrong push as the top push (ie like we do prior to this patch), and so had a "wrong" value for top_revision - surely this would just mean on the next poll, we request _more_ pushes than we need (since we'll end up requesting some we've already seen) - and as such, how are we losing data?
eg:
1) We poll json-pushes and get pushes ["A","B","C"].
2) We incorrectly store the rev for push "B" as the top revision.
3) On next poll we call json-pushes?fromchange=REV-OF-B, which returns ["C","D","E"].
4) We then attempt to insert those pushes, notice that "C" is already handled and the revisionhash in the cache, so only insert "D" and "E".
Unless at step#4 we're actually aborting the entire insert after spotting the dupe "C", rather than assessing each push individually?
Attachment #8498047 -
Flags: review?(emorley) → review+
| Reporter | ||
Comment 5•11 years ago
|
||
(In reply to Ed Morley [:edmorley] from comment #4)
> Unless at step#4 we're actually aborting the entire insert after spotting
> the dupe "C", rather than assessing each push individually?
If this is the case, then whilst the attached patch should definitely fix most cases where this would be exposed, I guess we're still at risk of race conditions between workers, eg:
1) Worker 1 polls json pushes and gets ["A","B","C"], but the polling took ages, so it hasn't yet inserted into the DB before worker 2 is spawned
2) Worker 2 polls json pushes and gets ["A","B","C","D"] since someone has just pushed. However it hasn't yet inserted this into the DB
3) Worker 1 finally is able to insert ["A","B","C"] and updated top_revision to that of push C.
4) Worker 2 tries to insert ["A","B","C","D"], however if the guess from comment 4 is correct, it will fail to insert "D" (due to the dupes of A-C), but it still then updates top_revision to that of push D.
5) Worker 3 is spawned later, sees the top_revision of D, so does a &fromchange=REV-OF-D, which will only return "E" onwards.
-> We've missed "D".
Comment 6•11 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder-service
https://github.com/mozilla/treeherder-service/commit/980ccd36b531dee3823ce850669fb5d0c03c3a0d
Bug 1075176 - use pushId instead of commit date
When fetching the pushlog we cache the top revision of the last push fetched
to subsequently request deltas. To determine the last push of a pushlog
response, we need to use the pushId.
https://github.com/mozilla/treeherder-service/commit/43e47cd7d17684a61cfc485baf5c871f38f4f1cb
Merge pull request #235 from mozilla/bug-1075176-use-pushId-to-sort-pushlog
Bug 1075176 - use pushId instead of commit date
| Reporter | ||
Comment 7•11 years ago
|
||
| Reporter | ||
Comment 8•11 years ago
|
||
Deployed to prod. (But leaving this bug open per comment 7)
http://treeherderadm.private.scl3.mozilla.com/chief/treeherder.prod/logs/master.1412237689
| Reporter | ||
Comment 9•11 years ago
|
||
jeads, Mauro is PTO today - could you take a look at the remainder of this bug? :-)
Flags: needinfo?(jeads)
| Reporter | ||
Comment 10•11 years ago
|
||
In fact, let's make this bug specific to the problem the PR solves, and break out the other issue to a new bug.
Blocks: 1076750
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: needinfo?(jeads)
Resolution: --- → FIXED
Summary: Pushlog ingestion can still miss pushes → Pushlog ingestion sets the cached top_revision to the wrong value
| Reporter | ||
Updated•11 years ago
|
Summary: Pushlog ingestion sets the cached top_revision to the wrong value → Pushlog ingestion sometimes caches the wrong revision as the latest for each repo
| Reporter | ||
Comment 11•11 years ago
|
||
Filed bug 1076752.
You need to log in
before you can comment on or make changes to this bug.
Description
•