Open Bug 1773165 Opened 6 months ago Updated 4 months ago

Improve "traverse" diagramming heuristics to be smarter about core infrastructure interfaces/overrides and IPC in general

Categories

(Webtools :: Searchfox, enhancement)

enhancement

Tracking

(Not tracked)

People

(Reporter: asuth, Unassigned)

References

(Depends on 1 open bug)

Details

Initial experimentation with calls-to (which uses the "uses" edge) has found that the heuristics I added to try and create equivalence sets with overrides end very badly when the graph ends up reaching mozilla::ipc::IProtocol::OnMessageReceived or mozilla::Runnable::Run. This was not a problem in the original fancy branch implementation of TransitiveCallsDoodler because:

  • By default it would only display control-flow within the same module, where same-module was simplified to be mean the the source files containing the definition for each symbols was in the same directory.
  • A special case would let us get a single traversal into a symbol whose definition was rooted at __GENERATED__/ipc/ so we would get to the Send or Recv method and then stop.

The InternalDoodler would use both isSameSourceFileAs (based on the definition's source file) and isSameClassAs.

The SymbolInfo class also had a concept of being boring which was some hard-coded checks that dictated that we'd generally try and ignore smart-pointer related infrastructure, getters/setters, and XPCOM string code. But I think in general that was a preliminary effort that was quickly obsoleted by the same-directory rule which was serving as a proxy for same-module.

I think the potential set of next steps here is:

  1. Implement overload limits and reporting like for crossref-expand; there currently are no graph size limits because what we have so far was a first iteration, but it's way too easy for the graph to get huge. Interestingly, but not surprisingly, it seems like the dot runtime is the real long pole here. It seems reasonable to impose a hard limit on node count for this reason, at least for server-rendered graphs. If we cross a threshold, we could always kick the graph back to the client to render using a JS/WASM-based large graph viewer.
    • crossref-expand style heuristics would work here, but probably would need to be more constrained as fan-in/out is a much larger problem in a graph!
  2. Do the same-directory heuristic. It's easy!
  3. Leverage that we know the bugzilla-component for every file in the mozilla-central tree to define the module as "same bugzilla component". We could fall back to same directory for repositories without this additional mapping.
  4. Make it possible to specify the set of modules that the graph is allowed to go through based on regexp/glob-style matching.
    • With a more hierarchical understanding of modules we could do some level of automatic expansion, but it also wouldn't be that hard for the traversal to indicate what graph edges it suppressed because they touched other modules and then let the user push a button to add that to the query string and re-compute.
  5. Explicitly make traverse aware of IPC srcsym/targetsym edges and have IPC related symbols avoid going into the auto-generated machinery. That is, we want to treat SendFoo and RecvFoo as magically linked by default and completely elide the entirety of the IPC under-pinnings.
  6. Consider whether a "small multiples" style approach might be more suitable when dealing with overrides and abstract interfaces.
    • nsIInputStream::Close is an example I've been using for override-related real-world checks since Nika ran into router.py's cut-offs being too aggressive for search. I think a use-case that might make sense would be:
      • I'm interested in what's going on around nsIInputStream::Close or maybe nsIInputStream in general, there are M implementations of nsIInputStream and N consumers. There are presumably M + N state machines related to what's going on there. I'd probably rather see a total of M + N separate diagrams rather than a single diagram that challenges the layout algorithm to do something sane after typing all M + N sub-graphs together via a single node!
      • This potentially looks a lot like running the internal doodler against the callers/implementors of Close or all of the methods on the interface. Or alternately, tied to a specific member field's uses of the interface.
      • A reason why someone might want to be presented by the relatiavely large M+N set of graphs is an ability to order them by complexity and/or with some level of automated clustering or faceting based on specific methods some use/don't use and/or specific flags passed.
    • In general one could imagine that searchfox's graphing functionality is conceptually just about having pre-computed internal doodler graphs for each class or method and the act of diagramming is either stitching these graphs together or compiling them into a set of diagrams which can be traversed interactively in a 2/3/4 pane interface.
    • The SymbolGraphCollection is already explicitly modeled to support multiple graphs, so this isn't actually too hard to prototype, but the we do need a new PipelineValues type for the resulting SVG's plus the symbol metadata+defs/decls hits (which we already wanted for interactivity with the results).
Depends on: 1773767
Depends on: 1776564

Re: Runnables with a specific example of calls-to:'WorkerPrivate::ConnectMessagePort' depth:6

  • I already landed a limit enhancement that works for normal Runnables but doesn't kick in for WorkerRunnables which means that a bunch of useless control flow shows up.
  • This definitely shows that for runnables we potentially want to intervene and view their constructor as the thing that leads to their Run method being called. This would be an intervention similar in nature to what we do for IPC semantics.
    • Conceptually the same situation applies to lambdas, although our limited understanding of lambdas right now ends up with us effectively just fusing the lambda into its containing function/lambda, which works well. Once we have the "hier" logic from the prototype having the lambdas broken out might be helpful.
You need to log in before you can comment on or make changes to this bug.