Parallelize revision creation and diff property calls in submit for faster stack submission
Categories
(Conduit :: moz-phab, enhancement)
Tracking
(Not tracked)
People
(Reporter: Sylvestre, Assigned: Sylvestre)
References
(Blocks 1 open bug)
Details
Attachments
(9 files, 7 obsolete files)
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review |
moz-phab submit currently makes all Conduit API calls sequentially, which is slow for large stacks. This bug tracks parallelizing two key phases of the submit workflow:
-
Parallelize set_diff_property calls: Extract
build_diff_property_argsfromset_diff_propertyso that data building (which may shell out to git-cinnabar viaget_public_node) stays in the main thread, while the actual API calls are deferred and executed in parallel via a ThreadPoolExecutor at the end of submit. -
Parallelize revision creation: Restructure
_submitinto 4 distinct phases:- Phase 1: Create and submit diffs (sequential — touches local VCS)
- Phase 2: Create/update revisions (parallel — no cross-commit dependency after splitting out parent linking)
- Phase 3: AI review, amend commits, build diff metadata (sequential)
- Phase 4: Set diff properties and parent links (parallel)
The key insight is that by splitting
parents.addout ofcreate_revision/update_revisionand deferring it to Phase 4, every revision creation call becomes independent and can run in parallel.
Thread safety: SimpleCache is made thread-safe with a threading.Lock to support concurrent Conduit calls.
Benchmark (simulated 0.3s API latency):
- 5-commit stack: 3.30s → 2.41s (−27%)
- 10-commit stack: 6.61s → 4.81s (−27%)
| Assignee | ||
Comment 1•1 month ago
|
||
Extract build_diff_property_args for thread-safe parallel submission:
split set_diff_property into build_diff_property_args (data building,
which may shell out to git-cinnabar via get_public_node) and the API
call, so the main thread builds all payloads before worker threads
make the network requests.
| Assignee | ||
Comment 2•1 month ago
|
||
Restructure _submit into 4 phases:
- Create and submit diffs (sequential — touches local VCS)
- Create/update revisions (parallel — no cross-commit dependency)
- AI review, amend commits, build diff metadata (sequential)
- Set diff properties and parent links (parallel)
Phase 2 is now parallel because the previous commit split out
parent linking (parents.add) from create_revision. Without that
dependency on the previous commit's rev_phid, every revision
creation call is independent.
Benchmark (simulated 0.3s API latency):
5-commit stack: 3.30s → 2.41s (−27%)
10-commit stack: 6.61s → 4.81s (−27%)
| Assignee | ||
Comment 3•1 month ago
|
||
The reviewer data is already present in the revision dict preloaded at the
top of validate_commit_stack. Check
revision["attachments"]["reviewers"]["reviewers"] directly instead of
making two per-commit conduit.has_revision_reviewers() calls (each of
which does a get_revisions lookup).
| Assignee | ||
Comment 4•1 month ago
|
||
validate_commit_stack already computes bug_id_changed from the preloaded
revision data but discards it. Store it on the Commit dataclass and read
it in update_revision() to eliminate the extra get_revisions() call
during Phase 2.
| Assignee | ||
Comment 5•1 month ago
|
||
Brings back the benchmark framework originally added on a sibling branch
(afe13f4) so the ongoing series of submit/patch performance changes can
be measured per-commit. Includes the framework harness, connection
pooling benchmark, and patch benchmark. The submit benchmark is added
in a follow-up commit so its baseline reflects the already-landed
has_revision_reviewers and bug_id_changed changes.
Run with:
python -m benchmarks
python -m benchmarks submit
python -m benchmarks patch
python -m benchmarks connection_pooling
| Assignee | ||
Comment 6•1 month ago
|
||
edit_revision() refetched the revision just to read the current status,
even though validate_commit_stack had already seen it. Store it on the
Commit dataclass and pass it through as an optional kwarg; the fetch
path remains as a fallback for callers that don't set it.
Also adds bench_submit.py, reflecting the cumulative effect of:
- Step 1: drop has_revision_reviewers from validation
- Step 2: persist bug_id_changed (no refetch in update_revision)
- Step 3: persist existing_status (no refetch in edit_revision)
Benchmark (simulated 50ms per API call):
Commits Before (parallel) After (parallel) Speedup
1 504ms 404ms 1.24x
3 1007ms 806ms 1.25x
5 1559ms 1258ms 1.24x
10 3013ms 2411ms 1.25x
End-to-end submit speedup vs fully sequential (5-commit stack):
Before: 1.48x -> After: 1.83x
| Assignee | ||
Comment 7•1 month ago
|
||
validate_commit_stack called check_for_invalid_reviewers() once per
commit, which triggered one user.query + one project.search per commit.
Collect all unique reviewer names and groups up front, pre-warm the
get_users / get_groups caches with a single batch call each, then the
per-commit validation becomes pure cache hits.
Uses normalise_reviewer(..., strip_group=False) to match the cache keys
used by check_for_invalid_reviewers.
Benchmark (simulated 50ms per API call):
Commits Before (parallel) After (parallel) Speedup
1 404ms 404ms 1.00x
3 806ms 606ms 1.33x
5 1258ms 857ms 1.47x
10 2411ms 1509ms 1.60x
End-to-end submit speedup vs fully sequential (5-commit stack):
Before: 1.83x -> After: 2.69x
| Assignee | ||
Comment 8•1 month ago
|
||
AI review is a standalone API call that only depends on commit.rev_id,
so all eligible commits can be requested concurrently. Collect them
during the main loop and fan them out via a ThreadPoolExecutor after
all revisions are created/updated.
Each future's result is handled independently so one failure doesn't
abort the others. Individual ai_review_state is still set per-commit.
Benchmark (simulated 50ms per API call):
Commits Before (parallel) After (parallel) Speedup
1 404ms 405ms 1.00x
3 606ms 507ms 1.20x
5 857ms 659ms 1.30x
10 1509ms 1111ms 1.36x
End-to-end submit speedup vs fully sequential (5-commit stack):
Before: 2.69x -> After: 3.50x
| Assignee | ||
Comment 9•1 month ago
|
||
conduit.check(), repo.check_vcs(), and repo.is_worktree_clean() are
independent of each other: one is a network ping, the other two are
local VCS queries. Run them concurrently via a ThreadPoolExecutor so
the local checks overlap with the Phabricator round-trip.
Saves ~50-200ms on every patch invocation. The benchmark simulation
only models network latency, so the wall-clock delta isn't visible
there -- but real invocations over a live network benefit directly.
| Assignee | ||
Comment 10•1 month ago
|
||
Mirrors the existing caching pattern in get_revisions. Diffs are
immutable once submitted, so a per-process cache keyed by PHID (and
a PHID index keyed by diff ID) is safe. Partial lookups only query
Phabricator for the uncached subset.
No visible impact on the simulated submit/patch benchmarks (each
diff is typically fetched once per invocation), but helps any code
path -- or future change -- that repeats a lookup within a session.
| Assignee | ||
Comment 11•1 month ago
|
||
Every Conduit API call now enforces a 120-second socket timeout, so a
hung network connection can no longer stall moz-phab indefinitely.
For idempotent read-only methods (conduit.ping, user.whoami,
user.query, project.search, *.search, differential.getrawdiff,
file.querychunks, edge.search), transient URLError / TimeoutError /
ConnectionError failures are retried up to 3 times with a simple
linear backoff.
Mutation methods (differential.revision.edit, differential.creatediff,
etc.) are intentionally NOT retried. A retried POST after a successful
request that lost its response could create duplicate revisions or
diffs.
No benchmark impact on the happy path; this is a reliability change.
| Assignee | ||
Comment 12•1 month ago
|
||
Extend test_commit_to_dict_json_roundtrip expectations to include
bug_id_changed and existing_status, the two fields added to the
Commit dataclass to avoid refetching validation state.
| Assignee | ||
Comment 13•1 month ago
|
||
Replace per-call urllib.request.urlopen with a shared
urllib3.PoolManager, so repeated Conduit API calls reuse the TCP+TLS
connection instead of paying a fresh handshake every time.
urllib3's own Retry is disabled (total=0) because we still do
method-aware retries in call(): idempotent read methods are retried,
mutations (differential.revision.edit, differential.creatediff, etc.)
are not, to avoid duplicate POSTs if a successful request lost its
response.
Benchmark (20 sequential calls, 30ms simulated RTT):
urllib.request : 1969ms, 20 TLS handshakes
urllib3 pool : 1628ms, 1 TLS handshake (1.21x speedup)
| Assignee | ||
Comment 14•1 month ago
|
||
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Updated•1 month ago
|
| Assignee | ||
Comment 15•1 month ago
|
||
The previous bench_submit/bench_patch replayed a hand-rolled sequence of
time.sleep() calls approximating the Conduit API call pattern. The
approximation ignored the real SimpleCache, so calls that are cache hits
in production (e.g. the get_revisions lookup inside edit_revision after
validate_commit_stack has preloaded the stack) were counted as full
network trips. That made the measured speedups of any change which only
eliminated a cache-hit lookup look real when they were zero on the wire.
Replace the sleep-sequence benchmarks with a shared harness
(benchmarks/conduit_harness.py) that monkey-patches ConduitAPI.call with
a function that sleeps the simulated latency and returns fake response
payloads. The real mozphab.conduit helpers, the real SimpleCache, and
the real threadpool scheduling all run. Cache hits therefore cost
nothing and only genuine network trips count toward wall time.
bench_submit now drives validate_commit_stack + the per-commit loop +
the parallel AI-review fan-out through the real helpers, reporting wall
time and per-method call counts for 1/3/5/10-commit stacks.
bench_patch drives ping + get_revisions + get_diffs + the parallel
differential.getrawdiff download, comparing sequential vs parallel raw
diff fetch -- which is a real saving because each getrawdiff is a
network round trip, not a cache hit.
Updated•1 month ago
|
| Assignee | ||
Comment 16•1 month ago
|
||
When a stack has abandoned revisions, Phabricator's stackGraph can still
include links through them, leaving a predecessor with multiple
successors. convert_stackgraph_to_linear rejects that shape, and the
surrounding try/except re-raised unconditionally -- so --force couldn't
recover even though force_stack_transactions only needs the set of
remote PHIDs (not their order) to do its job.
Apply the same fallback the walk_llist branch already uses: in force
mode, log a warning and treat list(stack_graph.keys()) as the set of
remote PHIDs. That gives force_stack_transactions enough to unlink the
nonlinear graph and relink to match local.
Updated•1 month ago
|
Comment 17•1 month ago
|
||
Comment on attachment 9572685 [details]
Bug 2032102 - reorganise: let --force tolerate abandoned-revision ghost links
Revision D295971 was moved to bug 2034269. Setting attachment 9572685 [details] to obsolete.
Updated•1 month ago
|
Updated•29 days ago
|
Updated•29 days ago
|
Comment 18•29 days ago
|
||
Authored by https://github.com/sylvestre
https://github.com/mozilla-conduit/review/commit/518cfa2396f22163d7c41eebcc13332996cb8f5e
[main] Bug 2032102 - submit: batch reviewer lookups across all commits r=sheehan
Authored by https://github.com/sylvestre
https://github.com/mozilla-conduit/review/commit/6e31798d54283027759282a780d3a12862d42963
[main] Bug 2032102 - submit: parallelize AI review requests r=sheehan
Authored by https://github.com/sylvestre
https://github.com/mozilla-conduit/review/commit/9dd827f51578a64cab3be57352ac4a5f77fe1407
[main] Bug 2032102 - patch: parallelize initial environment checks r=sheehan
Authored by https://github.com/sylvestre
https://github.com/mozilla-conduit/review/commit/8296902e7407a6d087a2a0399e05cb2522a26825
[main] Bug 2032102 - conduit: cache diff lookups in get_diffs r=sheehan
Authored by https://github.com/sylvestre
https://github.com/mozilla-conduit/review/commit/058e1b1f624abdbdfffd16ac7dab1b15f5814c65
[main] Bug 2032102 - conduit: add socket timeout and retry idempotent calls r=sheehan
Authored by https://github.com/sylvestre
https://github.com/mozilla-conduit/review/commit/bb6cbaba4240f1a1689ed032d88d2281a2016fb6
[main] Bug 2032102 - conduit: pool HTTP connections via urllib3 r=sheehan
Comment 19•29 days ago
|
||
Authored by https://github.com/cgsheeh
https://github.com/mozilla-conduit/review/commit/5c32be2a2ae2a7ef758a6afc15c0673966e0d128
[main] testing: add missing user.query mocks (Bug 2032102)
Comment 20•29 days ago
|
||
Authored by https://github.com/cgsheeh
https://github.com/mozilla-conduit/review/commit/605544b46af252124a7e59b52235af86079b07f5
[main] tests: pass args to patch test (Bug 2032102)
Description
•