Closed Bug 1734173 Opened 4 years ago Closed 4 years ago

Airflow task probe_scraper.probe_scraper failing on 2021-10-05

Categories

(Data Platform and Tools :: General, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: gleonard)

Details

(Whiteboard: [airflow-triage])

The Airflow task probe_scraper.probe_scraper failed on 2021-10-05

Log from stackdriver:

Info
2021-10-05 05:21:46.736 EDT
base
Getting commits for repository mlhackweek-search
Error
2021-10-05 05:21:46.956 EDT
base
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/app/probe_scraper/runner.py", line 592, in <module>
    main(
  File "/app/probe_scraper/runner.py", line 499, in main
    load_glean_metrics(cache_dir, out_dir, repositories_file, dry_run, glean_repos)
  File "/app/probe_scraper/runner.py", line 279, in load_glean_metrics
    commit_timestamps, repos_metrics_data, emails = git_scraper.scrape(
  File "/app/probe_scraper/scrapers/git_scraper.py", line 184, in scrape
    ts, commits = retrieve_files(repo_info, folder)
  File "/app/probe_scraper/scrapers/git_scraper.py", line 105, in retrieve_files
    repo = git.Repo.clone_from(repo_info.url, repo_info.name)
  File "/usr/local/lib/python3.9/site-packages/git/repo/base.py", line 1017, in clone_from
    return cls._clone(git, url, to_path, GitCmdObjectDB, progress, multi_options, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/git/repo/base.py", line 958, in _clone
    finalize_process(proc, stderr=stderr)
  File "/usr/local/lib/python3.9/site-packages/git/util.py", line 328, in finalize_process
    proc.wait(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/git/cmd.py", line 408, in wait
    raise GitCommandError(self.args, status, errstr)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
Error
2021-10-05 05:21:46.960 EDT
base
 cmdline: git clone -v https://github.com/mozilla/mlhackweek2021 mlhackweek-search
Error
2021-10-05 05:21:46.960 EDT
base
 stderr: 'Cloning into 'mlhackweek-search'...
Error
2021-10-05 05:21:46.960 EDT
base
fatal: could not read Username for 'https://github.com': No such device or address

Looks like https://github.com/mozilla/mlhackweek2021 disappeared. I think the quickest mitigation is to bring it back. Glenda, is that possible? If not, we could take it out of repositories.yaml (I don't think that would result in anything bad happening).
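
For future triage, a rough sketch of a pre-flight check that would flag a vanished or private repository before the clone step aborts the whole run. This is not part of probe-scraper: `repo_is_reachable` and `split_reachable` are hypothetical helper names, and the mapping of repository name to URL only assumes that repositories.yaml provides both. It relies on `git ls-remote` and on the `GIT_TERMINAL_PROMPT=0` environment variable, which makes git fail immediately instead of asking for a username.

```python
import os
import subprocess
from typing import Callable, Dict, List, Tuple


def repo_is_reachable(url: str, run: Callable = subprocess.run) -> bool:
    """Return True if `git ls-remote` can read the remote without credentials.

    `run` is injectable so the check can be exercised without network access.
    """
    # Disable the interactive credential prompt so a private or deleted
    # repository produces a non-zero exit code right away.
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    result = run(
        ["git", "ls-remote", url, "HEAD"],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def split_reachable(
    repos: Dict[str, str], run: Callable = subprocess.run
) -> Tuple[List[str], List[str]]:
    """Partition {name: url} into (reachable names, unreachable names)."""
    reachable: List[str] = []
    unreachable: List[str] = []
    for name, url in repos.items():
        if repo_is_reachable(url, run=run):
            reachable.append(name)
        else:
            unreachable.append(name)
    return reachable, unreachable
```

Something like `split_reachable({"mlhackweek-search": "https://github.com/mozilla/mlhackweek2021"})` could then be logged as a warning, or used to skip the affected repository, rather than failing the task. Whether skipping is acceptable for downstream consumers is a separate question.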

Flags: needinfo?(gleonard)

https://github.com/mozilla/mlhackweek2021 access has been reset to public

Flags: needinfo?(gleonard)

Thanks Glenda, I've retriggered a probe-scraper run. I'll resolve this when that's complete, hopefully later today.

Assignee: nobody → gleonard

Probe scraper is good now. A downstream job in this DAG is failing (bug 1734390), but that's unrelated to the issues here.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Component: Datasets: General → General