Extend indexer timeout limit to compensate for config4 hitting the timeout limit 2 days in a row
Categories
(Webtools :: Searchfox, task)
Tracking
(Not tracked)
People
(Reporter: asuth, Assigned: asuth)
Attachments
(2 files)
config4 hit our 10-hour indexer timeout on both July 26th and July 27th; I've just deleted the crontab for the July 28th indexing run so it should complete. config4 was already a bit close to the limit when I added cypress to it in bug 1901117. Per https://bugzilla.mozilla.org/show_bug.cgi?id=1901117#c2 we expected to be at 8.75 hrs of 10 hrs, but it appears that the landing of https://github.com/mozsearch/mozsearch/pull/679 to process macros probably slowed down indexing enough to push us over the limit. I'll gather some more statistics for the next comment.
Comment 1•6 months ago
It looks like my intuition about the problem was slightly wrong. The changes in indexing time for mozilla-central are basically in the noise, but there was a massive increase in LLVM build time. This is consistent with the m-c linux x64 searchfox build jobs: a job from before the patch had a runtime of 60 minutes and a job from after the landing had a runtime of 83 minutes, an increase of ~38%.
Before the macro patch landed, our LLVM indexing time, including build, was 3:56:25, of which 3:38:32 was build time. After, the total indexing time was 5:05:01, of which 4:45:54 was build time, an increase of ~31% in build time.
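As a quick check on the percentages above, here is a minimal sketch that converts the H:MM:SS figures quoted in this comment to seconds and computes the relative growth. The helper names `to_seconds` and `percent_increase` are made up for illustration and are not part of mozsearch.

```python
def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string such as '3:38:32' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def percent_increase(before: str, after: str) -> float:
    """Relative growth, in percent, from one H:MM:SS duration to another."""
    old, new = to_seconds(before), to_seconds(after)
    return (new - old) / old * 100


# LLVM figures quoted in this comment (before vs. after the macro patch).
build_growth = percent_increase("3:38:32", "4:45:54")
total_growth = percent_increase("3:56:25", "5:05:01")

print(f"build.sh: +{build_growth:.1f}%")  # prints "build.sh: +30.8%"
print(f"total:    +{total_growth:.1f}%")  # prints "total:    +29.0%"
```

The build-time growth rounds to the ~31% quoted above; the total indexing time grows slightly less (about 29%) because the non-build steps barely changed.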
The indexer timestamps in the log put us at a smidge under 10 hours, but the log timestamp itself is from 10h04m, so I think we might have failed just as the web-server was coming up.
In any event, I will raise the indexer limit to 12 hours. This restores our buffer, and we do have 3 full m-c clones on there whose runtime will likely increase a bit in the future, as will LLVM's. A more targeted approach would be to set the timeout threshold at the top level of each config file, but in general we have not had many cases of indexers failing in a way that leaves them sitting idle. The caveat is that an unexpected cargo build failure on the web server does manifest as the indexer hanging while it waits forever for the web-server. That is probably a case where we actually want to improve the mechanism so that the web-server can convey that it is permanently broken and the indexer can promptly trigger a failure and shut them both down. But that has been pretty rare, and we normally address it within a day.
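To make the suggested improvement concrete, here is a rough sketch of an indexer-side wait loop that fails fast when the web-server reports it is permanently broken, rather than waiting forever. Everything here is hypothetical: the `poll_status` callback, the `"ready"` / `"broken"` status strings, and the `WebServerBroken` exception are assumptions for illustration, not how mozsearch currently communicates between the indexer and the web-server.

```python
import time
from typing import Callable


class WebServerBroken(RuntimeError):
    """Hypothetical signal: the web-server says it cannot recover."""


def wait_for_web_server(
    poll_status: Callable[[], str],
    deadline_seconds: float,
    poll_interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until the web-server is ready, is broken, or the deadline passes.

    poll_status is assumed to return "ready", "broken", or any other string
    (for example "starting") while the web-server is still coming up.
    """
    start = clock()
    while True:
        status = poll_status()
        if status == "ready":
            return
        if status == "broken":
            # Fail promptly so both machines can be shut down.
            raise WebServerBroken("web-server reported a permanent failure")
        if clock() - start >= deadline_seconds:
            raise TimeoutError("web-server did not become ready in time")
        sleep(poll_interval)
```

With something like this, a cargo build failure on the web-server would surface as an immediate `WebServerBroken` error instead of the indexer idling until the overall timeout fires.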
BEFORE
├── llvm
│ └──
│ script               time since start   apparent duration
│ ────────────────────────────────────────────────────────────
│ find-repo-files.py   0:00:00            0:01:07
│ build.sh             0:01:07            3:38:32
│ scip-analyze.sh      3:39:39            0:00:00
│ js-analyze.sh        3:39:39            0:00:01
│ html-analyze.sh      3:39:40            0:00:01
│ css-analyze.sh       3:39:41            0:00:00
│ idl-analyze.sh       3:39:41            0:00:00
│ ipdl-analyze.sh      3:39:41            0:00:01
│ replace-aliases.sh   3:39:42            0:00:16
│ crossref.sh          3:39:58            0:09:30
│ output.sh            3:49:28            0:03:23
│ build-codesearch.py  3:52:51            0:02:27
│ compress-outputs.sh  3:55:18            0:01:07
│ check-index.sh       3:56:25
AFTER
├── llvm
│ └──
│ script               time since start   apparent duration
│ ────────────────────────────────────────────────────────────
│ find-repo-files.py   0:00:00            0:01:09
│ build.sh             0:01:09            4:45:54
│ scip-analyze.sh      4:47:03            0:00:00
│ js-analyze.sh        4:47:03            0:00:01
│ html-analyze.sh      4:47:04            0:00:00
│ css-analyze.sh       4:47:04            0:00:00
│ idl-analyze.sh       4:47:04            0:00:00
│ ipdl-analyze.sh      4:47:04            0:00:02
│ replace-aliases.sh   4:47:06            0:00:19
│ crossref.sh          4:47:25            0:10:10
│ output.sh            4:57:35            0:03:51
│ build-codesearch.py  5:01:26            0:02:26
│ compress-outputs.sh  5:03:52            0:01:09
│ check-index.sh       5:05:01
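To see which steps account for the slowdown, the two LLVM tables can be compared row by row. The following is a small sketch, assuming rows of the form "script, time since start, apparent duration" separated by whitespace; `parse_durations` and `to_seconds` are illustrative helpers rather than mozsearch tooling, and only a few rows from the tables above are included for brevity.

```python
def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_durations(table: str) -> dict[str, int]:
    """Map script name to apparent duration (seconds) for each data row."""
    durations = {}
    for line in table.splitlines():
        fields = line.replace("│", " ").split()
        # Data rows have exactly three fields; header and rule lines do not.
        # check-index.sh, the final step, has no duration and is skipped.
        if len(fields) == 3 and all(f.count(":") == 2 for f in fields[1:]):
            durations[fields[0]] = to_seconds(fields[2])
    return durations


BEFORE = """
│ build.sh 0:01:07 3:38:32
│ crossref.sh 3:39:58 0:09:30
│ output.sh 3:49:28 0:03:23
│ check-index.sh 3:56:25
"""

AFTER = """
│ build.sh 0:01:09 4:45:54
│ crossref.sh 4:47:25 0:10:10
│ output.sh 4:57:35 0:03:51
│ check-index.sh 5:05:01
"""

before, after = parse_durations(BEFORE), parse_durations(AFTER)
deltas = {name: after[name] - before[name] for name in before}

# Largest slowdown first.
for name, delta in sorted(deltas.items(), key=lambda item: -item[1]):
    print(f"{name:<14} +{delta}s")
```

For these rows, build.sh grows by 4042 seconds (about 67 minutes) while crossref.sh and output.sh grow by 40 and 28 seconds, which matches the observation that the increase is almost entirely in the build.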
m-c indexer times for comparison
Before, July 25th
├── mozilla-central
│ └──
│ script               time since start   apparent duration
│ ────────────────────────────────────────────────────────────
│ find-repo-files.py   0:00:00            0:02:41
│ build.sh             0:02:41            0:17:53
│ scip-analyze.sh      0:20:34            0:01:45
│ js-analyze.sh        0:22:19            0:01:04
│ html-analyze.sh      0:23:23            0:00:55
│ css-analyze.sh       0:24:18            0:00:03
│ idl-analyze.sh       0:24:21            0:00:47
│ ipdl-analyze.sh      0:25:08            0:00:04
│ replace-aliases.sh   0:25:12            0:00:17
│ crossref.sh          0:25:29            0:25:47
│ output.sh            0:51:16            0:08:58
│ build-codesearch.py  1:00:14            0:03:53
│ compress-outputs.sh  1:04:07            0:06:18
│ check-index.sh       1:10:25
After, July 26th
├── mozilla-central
│ └──
│ script               time since start   apparent duration
│ ────────────────────────────────────────────────────────────
│ find-repo-files.py   0:00:00            0:02:39
│ build.sh             0:02:39            0:17:39
│ scip-analyze.sh      0:20:18            0:01:40
│ js-analyze.sh        0:21:58            0:01:00
│ html-analyze.sh      0:22:58            0:00:58
│ css-analyze.sh       0:23:56            0:00:03
│ idl-analyze.sh       0:23:59            0:00:45
│ ipdl-analyze.sh      0:24:44            0:00:03
│ replace-aliases.sh   0:24:47            0:00:18
│ crossref.sh          0:25:05            0:22:19
│ output.sh            0:47:24            0:08:41
│ build-codesearch.py  0:56:05            0:03:43
│ compress-outputs.sh  0:59:48            0:06:02
│ check-index.sh       1:05:50
After, July 27th
├── mozilla-central
│ └──
│ script               time since start   apparent duration
│ ────────────────────────────────────────────────────────────
│ find-repo-files.py   0:00:00            0:02:35
│ build.sh             0:02:35            0:21:06
│ scip-analyze.sh      0:23:41            0:01:40
│ js-analyze.sh        0:25:21            0:00:58
│ html-analyze.sh      0:26:19            0:00:58
│ css-analyze.sh       0:27:17            0:00:02
│ idl-analyze.sh       0:27:19            0:00:58
│ ipdl-analyze.sh      0:28:17            0:00:03
│ replace-aliases.sh   0:28:20            0:00:28
│ crossref.sh          0:28:48            0:26:48
│ output.sh            0:55:36            0:09:41
│ build-codesearch.py  1:05:17            0:03:36
│ compress-outputs.sh  1:08:53            0:05:57
│ check-index.sh       1:14:50