Open Bug 1763005 Opened 1 year ago Updated 2 months ago

Augment searchfox-tool/insta-check command pipeline to generate markdown explanations of the pipeline, including underlying server operations and the data propagated between each pipeline stage.


(Webtools :: Searchfox, enhancement)



(Not tracked)


(Reporter: asuth, Unassigned)



(Keywords: leave-open)


(2 files)

My current plans for the query mechanism in bug 1762817 build very heavily on the searchfox-tool/insta-check pipeline mechanism. While I think this is a very reasonable approach to structuring the fundamentally complex things going on under the hood, it's still a lot of complexity! So it behooves me to implement the diagnostic tooling before moving forward with the query mechanism. A further goal would be for this mechanism to be usable when performing any server query by adding &debug=1 to the query URL.

Implementation Decisions

  • Render to markdown that gets committed into the tree whenever test cases are updated. No one should have to run anything to understand what's going on!
    • We're writing and committing markdown since it is a relatively human-readable, diff-friendly output format that can also be easily rendered to HTML.
  • Graphs will be rendered to graphviz "dot" syntax which is what will be incorporated into the markdown.
    • mermaid is another great diagram format with the advantage of easier dynamic rendering from its source syntax (no WASM), but we're making a deep commitment to graphviz and its sophisticated clustering and table rendering support. (In particular, the underlying dagre layout engine that was used by mermaid last time I checked was notably less powerful for my specific searchfox use-cases and development at that time had ceased.) So the choice of graphviz should allow more sophisticated diagrams that can potentially reuse in-tree code.
  • Ensure the generated markdown files can be viewed with their diagrams starting from github. This may mean that we add an mdbook gh-pages setup.
    • I'm fine if there's a little bit of manual effort required to keep this up-to-date initially. Specifically having to run a make command in the VM followed by some shell script outside the VM is an acceptable first step.
    • Part of my rationale on this is that ideally we'll start having searchfox index itself, hosted on, and in that case that would be the step where we do the markdown rendering, in which case the github pages automation would be moot and would potentially need to be removed.
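
To make the "dot incorporated into the markdown" decision concrete, here is a minimal sketch using only the Rust standard library. The `Stage` type, the function names, and the choice of a fenced block tagged `dot` are all assumptions made for illustration; none of them are taken from the searchfox tree, and the real pipeline is not necessarily linear.

```rust
/// Hypothetical stand-in for one stage of a pipeline; not searchfox's real type.
pub struct Stage {
    pub name: String,
}

/// Escape a label for use inside a double-quoted dot string.
fn dot_escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Render a linear pipeline as a graphviz digraph: one node per stage and
/// one edge between each pair of consecutive stages.
pub fn pipeline_to_dot(stages: &[Stage]) -> String {
    let mut out = String::from("digraph pipeline {\n  rankdir=LR;\n  node [shape=box];\n");
    for (i, stage) in stages.iter().enumerate() {
        out.push_str(&format!("  s{} [label=\"{}\"];\n", i, dot_escape(&stage.name)));
    }
    for i in 1..stages.len() {
        out.push_str(&format!("  s{} -> s{};\n", i - 1, i));
    }
    out.push_str("}\n");
    out
}

/// Wrap the dot source in a fenced block so it stays readable,
/// diff-friendly text inside the committed markdown.
pub fn dot_markdown_block(dot: &str) -> String {
    // Build the fence at runtime so this sketch contains no literal fence.
    let fence = "`".repeat(3);
    format!("{}dot\n{}{}\n", fence, dot, fence)
}
```

Calling `dot_markdown_block(&pipeline_to_dot(&stages))` yields a block that diffs line by line when a stage is added or renamed; turning it into an actual diagram is left to whatever renders the markdown.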

Desired Output

  • Graph of the pipeline topology; nice-to-have if we can hack things up so that clicking on the pipeline nodes jumps to the corresponding heading anchor.
  • Indication of what server actions were taken at each step. Ideally this could be done by integrating tokio tracing so that additional tracing instrumentation would show up automatically-ish.
  • Stable JSON dump of the pipeline state after each step of the pipeline. We already have representations for all values, so this is probably the easiest part, modulo perhaps some refactoring to avoid code duplication.
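
As an illustration of what "stable" could mean for the JSON dump, the sketch below keeps object keys in a `BTreeMap` so they are always emitted in sorted order, and prints one value per line so that diffs of the committed markdown stay small. The `Json` type and `to_stable_string` are hypothetical and written against the standard library only; a real implementation would more likely reuse the existing value representations and a JSON library.

```rust
use std::collections::BTreeMap;

/// Minimal JSON value type for this sketch (hypothetical).
pub enum Json {
    Null,
    Bool(bool),
    Num(i64),
    Str(String),
    Array(Vec<Json>),
    /// BTreeMap iterates in sorted key order, which keeps the dump stable.
    Object(BTreeMap<String, Json>),
}

/// Write a JSON string literal, escaping quotes, backslashes and control chars.
fn write_str(s: &str, out: &mut String) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
}

impl Json {
    fn write(&self, indent: usize, out: &mut String) {
        let pad = "  ".repeat(indent + 1);
        let close = "  ".repeat(indent);
        match self {
            Json::Null => out.push_str("null"),
            Json::Bool(b) => out.push_str(&b.to_string()),
            Json::Num(n) => out.push_str(&n.to_string()),
            Json::Str(s) => write_str(s, out),
            Json::Array(items) if items.is_empty() => out.push_str("[]"),
            Json::Array(items) => {
                out.push_str("[\n");
                for (i, item) in items.iter().enumerate() {
                    out.push_str(&pad);
                    item.write(indent + 1, out);
                    out.push_str(if i + 1 < items.len() { ",\n" } else { "\n" });
                }
                out.push_str(&close);
                out.push(']');
            }
            Json::Object(map) if map.is_empty() => out.push_str("{}"),
            Json::Object(map) => {
                out.push_str("{\n");
                for (i, (key, value)) in map.iter().enumerate() {
                    out.push_str(&pad);
                    write_str(key, out);
                    out.push_str(": ");
                    value.write(indent + 1, out);
                    out.push_str(if i + 1 < map.len() { ",\n" } else { "\n" });
                }
                out.push_str(&close);
                out.push('}');
            }
        }
    }

    /// Pretty-print with sorted keys, one value per line, trailing newline.
    pub fn to_stable_string(&self) -> String {
        let mut out = String::new();
        self.write(0, &mut out);
        out.push('\n');
        out
    }
}
```

Because insertion order never influences the output, two runs that produce the same pipeline state produce byte-identical dumps, which is the property the checked-in explanations need.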

So I think the way this is going to go is:

  • Straightforward "tracing" logging with idiomatic spans and events.
    • We'll just log the graphviz "dot" representation of the pipeline as a field. It can be made conditional on some kind of flag in the future if it turns out to be expensive. (We might be able to get laziness for free if we have the "dot" diagram be the Debug representation, so that if there's no subscriber the debug formatter won't be called.)
  • We'll accumulate the log using the standard "tracing_subscriber" "fmt" JSON subscriber within a with_default call.
  • We throw the JSON blob at templating and maybe we give it some custom filters/tags/blocks and it outputs our markdown.
    • While thinking about the various options for handling more complicated HTML presentation issues like presenting object (field) layouts (ex: nsINode, mozilla::dom::Element) it seemed like we might do well to adopt a templating language that we can both run on the server and locally on the client.
      • My motivation was primarily about making it easier to get test coverage for UI related features like "what will a contextmenu show for this symbol on this line" without involving webdriver. If we used a (powerful) templating language that could be shared between the client and searchfox-tool, that would make things easy. In the end, I don't think the menu use-case makes sense (and in fact we had moved away from a nunjucks generated template in the front-end that had come from DXR), but it does look like liquid is probably the right templating solution for when we need templating.
      • In particular, I think liquid's inclusion of control flow mechanisms would be helpful as opposed to handlebars' approach, which is the other rust templating engine that also has a JS implementation available.
      • I have no underlying desire to change any of our HTML generation to using liquid. It's not something I'd rule out, but my thinking is that liquid is more of an alternative to adopting something like and having a bunch of JS and web components running in the front-end. Despite, or perhaps because of, the massive searchfox fancy branch react-based prototype, I'm very much a fan of the current small vanilla JS codebase and how (pre)cache-able server rendering can help us keep latencies low and allow the HTML parser to do much of its work off the main thread.
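
The "laziness for free" idea from the plan can be sketched without the `tracing` crate at all: if the dot text is produced inside a `Debug` impl, nothing is built until something formats the value. In the sketch below, `LazyDot` and `log_field` are hypothetical names, and `log_field` is only a stand-in for a logging layer that formats a field when (and only when) someone is listening. Whether `tracing` skips the formatter in exactly this way when there is no subscriber is the assumption being illustrated here, not something this code demonstrates.

```rust
use std::cell::Cell;
use std::fmt;

/// Hypothetical wrapper whose Debug output is the dot source. The dot
/// text is only built when something actually formats the value.
pub struct LazyDot<'a> {
    pub stages: &'a [&'a str],
    /// Counts formatter calls; only here so the laziness is observable.
    pub renders: &'a Cell<u32>,
}

impl fmt::Debug for LazyDot<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.renders.set(self.renders.get() + 1);
        writeln!(f, "digraph pipeline {{")?;
        // One edge per pair of consecutive stages.
        for pair in self.stages.windows(2) {
            writeln!(f, "  \"{}\" -> \"{}\";", pair[0], pair[1])?;
        }
        write!(f, "}}")
    }
}

/// Stand-in for a logging layer: the field is formatted only when
/// `enabled` is true, so a disabled logger never pays for the dot text.
pub fn log_field(enabled: bool, field: &dyn fmt::Debug) -> Option<String> {
    if enabled {
        Some(format!("{:?}", field))
    } else {
        None
    }
}
```

With `enabled` false the render counter stays at zero; with it true, the formatter runs once and returns the digraph text, which is the cost profile the flag mentioned above would otherwise have to provide.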
Depends on: 1763241

A follow-up PR will land in the next few days to improve the quality of the explanations and check them into the tree.

The linked pull request and the followup were merged. Can this bug be closed?

(In reply to Mathew Hodson from comment #3)

> The linked pull request and the followup were merged. Can this bug be closed?

Unfortunately there's still a meaningful amount of work to do on this bug:

  • We still need to generate diagrams.
  • The logs primarily just contain template scaffolding.
  • The mechanism currently used to capture the "tracing" JSON logs does not work when we spawn tasks to run on other threads, so the "query" test cases, which are probably where the explanations would be most useful, do not include much in the way of output.
    • It may be worth revisiting the current cleverness used for the tests in favor of moving to a mechanism that will work in production for the pipeline-server to allow the logs to be tied to the underlying query. I was trying to avoid calling set_global_default because of the required extra work for the global subscriber to correlate things, but I now have a better understanding of how that could work especially in terms of interior mutability.
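
A rough sketch of the correlation idea, using only the standard library rather than `tracing_subscriber`: one shared collector receives events from every thread, and each event carries the id of the query it belongs to. The `Collector` type, the `u64` query id, and `run_query` are hypothetical; the `Mutex` is where the interior mutability comes in, since every thread holds only a shared handle.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical process-wide collector. The Mutex supplies the interior
/// mutability: callers only hold a shared handle, yet can append events.
#[derive(Clone, Default)]
pub struct Collector {
    events: Arc<Mutex<HashMap<u64, Vec<String>>>>,
}

impl Collector {
    /// Record an event under the query it belongs to.
    pub fn record(&self, query_id: u64, msg: &str) {
        let mut map = self.events.lock().unwrap();
        map.entry(query_id).or_default().push(msg.to_string());
    }

    /// Remove and return everything logged for one query.
    pub fn take(&self, query_id: u64) -> Vec<String> {
        self.events
            .lock()
            .unwrap()
            .remove(&query_id)
            .unwrap_or_default()
    }
}

/// Run each stage of a query on its own thread, all logging to `collector`.
pub fn run_query(collector: &Collector, query_id: u64, stages: &[&str]) {
    let handles: Vec<_> = stages
        .iter()
        .map(|stage| {
            let collector = collector.clone();
            let msg = format!("ran {}", stage);
            thread::spawn(move || collector.record(query_id, &msg))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

Events from different threads can arrive in any order, so a consumer that needs a stable explanation would have to sort them or attach sequence information; the sketch leaves that open.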
Keywords: leave-open

This PR addressed the 3rd bullet point of and we are now able to collect the tracing logs across all threads. The mechanism will be suitable for use in the "query" endpoint of the pipeline server so that we can provide explanations of how user queries end up getting mapped and what their contents were. That's something we'll probably want/need to add a GET query parameter for, rather than trying to do it in-domain as part of the query, since it's too late to decide to log things once we're parsing the query and we likely don't want to duplicate that logic. This doesn't need to be a secret mechanism, but it probably makes sense as an upsell once we see the results, like: "click here to see an explanation of how this query was run and to see what the performance looks like".
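
The point about deciding before the query is parsed can be sketched as a check on the raw URL query string. The `debug=1` parameter comes from the original description; the function name, the `q` parameter in the usage note, and the decision to skip percent-decoding are assumptions of this sketch.

```rust
/// Decide, from the raw URL query string alone, whether to capture logs.
/// Only an exact `debug=1` pair counts; the search query text itself is
/// never interpreted here, so logging can start before query parsing.
/// (Percent-decoding is deliberately skipped in this sketch.)
pub fn wants_debug(query_string: &str) -> bool {
    query_string
        .trim_start_matches('?')
        .split('&')
        .any(|pair| pair == "debug=1")
}
```

A request such as `q=nsINode&debug=1` would enable capture, while text inside the search query that merely resembles the flag (for example an encoded `q=debug%3D1`) would not.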

Assignee: bugmail → nobody