Closed Bug 1292071 Opened 4 years ago Closed 3 years ago
lint jobs show errors on later pushes, not on the one which landed the problematic code
2016-08-03 01:00 +0000 Bug 1180944 landed as https://hg.mozilla.org/integration/fx-team/rev/474f38fc48be9f97b49b084ebf0b59293b81cf16 Eslint passes: https://treeherder.mozilla.org/#/jobs?repo=fx-team&revision=474f38fc48be&filter-searchStr=eslint 2016-08-03 01:53 +0000 Bug 1285863 landed as https://hg.mozilla.org/integration/fx-team/rev/4bd1976fbb4a Eslint passes: https://treeherder.mozilla.org/#/jobs?repo=fx-team&revision=555caa0082c3&filter-searchStr=eslint With the next merge 13 hours later, eslint returns errors for these changes: https://treeherder.mozilla.org/#/jobs?repo=fx-team&revision=4f072e4cc25ac1b0d68887e58f3ee2d5912a8173&filter-searchStr=eslint So does ESlint lint outdated files? Tomcat mentioned he observed this behavior already one day earlier.
The test has been fixed in the next commit: https://treeherder.mozilla.org/#/jobs?repo=fx-team&revision=41072825cb0d9bb0914bd211c54bd6dc51a917fb Kept an eye on this and noticed that Eslint passed each time after that.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Severity: normal → major
Status: RESOLVED → REOPENED
Component: Buildduty → General Automation
QA Contact: bugspam.Callek → catlee
Resolution: FIXED → ---
A local test running |mach lint path/to/file.js| detects a change which violates eslint rules, so this might be a releng issue.
This issue also affects Try runs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ab943d749cb9aadba84e24d10105585067cd4c23 should have warned about https://hg.mozilla.org/try/rev/7c17fe4f669d0bc8c904475d2f03d0a3754d50c1#l2.162
Huh, this is weird. For people following along you can this locally: $ hg update 1de5e07bed66 $ ./mach lint -l eslint devtools/client/sourceeditor/editor.js /home/ahal/hg/mozilla-central/devtools/client/sourceeditor/editor.js 1296:9 error 'cm' is defined but never used. no-unused-vars (eslint) ✖ 1 problem (1 error, 0 warnings) That revision corresponds to this push which has a green ES task: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=1de5e07bed6601d2647e88ed2690b3f30418d8aa&filter-searchStr=ES
So I tried to reproduce this with an interactive loaner, I got errors when trying to update the repo like this: abort: No such file or directory: '/home/worker/checkouts/gecko/media/webrtc/signaling/src' The file it aborted on was different each time, and then finally the update worked. This might explain why |mach lint| passed (if the update aborts, files will be missing). The only problem is this error doesn't show up in the log . Is it somehow possible the log isn't displaying stderr? Gps, any ideas? Could this be vcs related?  https://public-artifacts.taskcluster.net/BpIOCGC7Rxu7d1LvrtSqJQ/0/public/logs/live_backing.log
This is not more than a gut feeling. When fiddling around with eslint upgrades I was running eslint and getting no errors or warnings. I didn't suspect a problem until it wasn't finding an obvious issue. I don't remember what I did then, it may have been an eslint configuration or dependency issue, but afterwards it was finding the issues as expected and also showed a few more that were hidden before. I suspect also that stderr is not displayed or otherwise swallowed. I recently filed/fixed bug 1303725 which may be marginally related, but the error case there is much more obvious.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=555269567efe&selectedJob=28820699 is a try push of mozilla-aurora, being given an eslint failure in a file which does not exist on aurora, a failure which did not exist when the test file landed a week ago, so I was apparently given the results from someone's random try push based on another branch ten or fifteen days ago. Having a hard time believing this is a suite that should be visible.
This is almost certainly VCS related. This is definitely a P1. I have some ideas what's going on. I'll look around.
Priority: -- → P1
(In reply to Andrew Halberstadt [:ahal] from comment #6) > So I tried to reproduce this with an interactive loaner, I got errors when > trying to update the repo like this: > abort: No such file or directory: > '/home/worker/checkouts/gecko/media/webrtc/signaling/src' > > The file it aborted on was different each time, and then finally the update > worked. That's *very* interesting. Do you have exact steps to reproduce? I'd appreciate knowing which bundle file it is downloading and which revision(s) you are updating to/between. > The only problem is this error doesn't show up in > the log . Is it somehow possible the log isn't displaying stderr? stderr should be captured by run-task. Not sure why you would be seeing warnings in the interactive worker but not in the TC task log.
Flags: needinfo?(gps) → needinfo?(ahalberstadt)
STR were something like: 1) Request interactive loaner of ES task 2) Copy paste the exact "hg robustcheckout" command from the log.. that failed on abort 3) Run "hg update" manually to the push revision, update failed again 4) Tried again, update worked I don't have time to try reproducing again now (and tomorrow is a Canadian holiday), but if I had to guess I don't think it is very reproducible. There is also a very good chance that this is completely unrelated to the ES problem at hand, and instead related to environment differences when using one-click loaner.
Yeah, tried and wasn't able to reproduce again. But for posterity, after connecting to , I ran: > /usr/bin/hg --config hostsecurity.hg.mozilla.org:fingerprints=sha256:8e:ad:f7:6a:eb:44:06:15:ed:f3:e4:69:a6:64:60:37:2d:ff:98:88:37:bf:d7:b8:40:84:01:48:9c:26:ce:d9 robustcheckout --sh arebase /home/worker/hg-shared --purge --upstream https://hg.mozilla.org/mozilla-unified --revision 1de5e07bed6601d2647e88ed2690b3f30418d8aa https://hg.mozilla.org/mozilla-central/ /home/worker/checkouts/gecko  https://tools.taskcluster.net/one-click-loaner/connect/#Unf8_WdiQw-FT31vJ6rVsg
This also hits flake8: https://treeherder.mozilla.org/logviewer.html#?job_id=5000148&repo=autoland It's only orange on that push not before or after. jmaher says that the file is missing from the repo and ahal removed it last week or so. Failure messages: TEST-UNEXPECTED-ERROR | /home/worker/checkouts/gecko/testing/mochitest/runtestsb2g.py:222:100 | line too long (113 > 99 characters) (E501) TEST-UNEXPECTED-ERROR | /home/worker/checkouts/gecko/testing/mochitest/runtestsb2g.py:315:100 | line too long (114 > 99 characters) (E501)
This has been removed a week ago (see bug 1308202): https://hg.mozilla.org/integration/mozilla-inbound/rev/5c28890906c3
Yeah, if this is a VCS problem (which seems likely), then it'll affect all lint jobs. Maybe as a band-aid fix we can get the lint jobs to run an `hg purge` before running. So to clarify, is files being linted that don't exist anymore the only issue here? I.e, we've never seen a file that still exists but was checked out at the wrong revision?
Previous eslint failures have all been for already existing or new files, not for ones which got removed.
Hm, in that case a purge wouldn't be good enough, I'm not sure how to debug this further. Gps, you said you might have some ideas in comment 9?
Just a heads up, aside from possible VCS issues see bug 1309963. mach lint is swallowing some errors, so in case something else is borked running the linter it won't show up as an error.
Thanks for bringing that to my attention, but I don't think bug 1309963 is related as we've seen this with flake8 and the wpt linter as well.
Flake8 issue similar to the one from yesterday: https://treeherder.mozilla.org/logviewer.html#?job_id=5054746&repo=autoland
Summary: eslint job shows errors on later pushes, not on the one which landed the problematic code → lint jobs show errors on later pushes, not on the one which landed the problematic code
Now that I'm back from traveling I can look into this.
Assignee: nobody → gps
Status: REOPENED → ASSIGNED
I suspect this has to do with mixing and matching Mercurial store and checkout caches on worker instances due to workers running multiple tasks concurrently. (The lint tasks allow up to 4 simultaneous tasks whereas most other worker types only allow running a single task.) I'll try to (dis)prove this within a few hours.
eslint and flake8 hidden on all trees, we can unhide them when they actually run on the revision they claim and when someone has cleaned up the current bustage and when someone has cleaned up all the bustage they'll pick up between now and then.
Comment on attachment 8802361 [details] Bug 1292071 - Verify VCS checkout is pristine; https://reviewboard.mozilla.org/r/86776/#review86244
Attachment #8802361 - Flags: review?(dustin) → review+
Comment on attachment 8802362 [details] Bug 1292071 - Put Mercurial store on same cache as checkout; https://reviewboard.mozilla.org/r/86778/#review86246
Attachment #8802362 - Flags: review?(dustin) → review+
Pushed by email@example.com: https://hg.mozilla.org/integration/autoland/rev/4d50a677b989 Verify VCS checkout is pristine; r=dustin https://hg.mozilla.org/integration/autoland/rev/332a08725ed0 Put Mercurial store on same cache as checkout; r=dustin
You need to log in before you can comment on or make changes to this bug.