Closed Bug 1762992 Opened 3 years ago Closed 3 years ago

Implement "search-text" searchfox-tool local-only command to talk to the livegrep codesearch server using Tokio-stack https://github.com/hyperium/tonic

Categories

(Webtools :: Searchfox, enhancement)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: asuth, Assigned: asuth)

References

Details

Attachments

(1 file)

In order to implement the full searchfox experience in rust, we need to be able to talk gRPC to the livegrep codesearch server. We'll use https://github.com/hyperium/tonic and expose this as a new "search-text" command in searchfox-tool / insta checking. "text-search" would be more natural, but we already have "search-identifiers", and it's arguably more user-friendly for the searchfox-tool '--help' output to have the various searches clustered together. I also expect to add "search-files" soon, but I'm trying to be responsible about the scope of bugs and patches, so that will get its own bug.
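
To make the shape of this concrete, here's a minimal sketch of the tonic client side, assuming build.rs-generated bindings for livegrep's proto package. The module path, message/field names, endpoint, and query values are illustrative (based on my reading of livegrep.proto), not the landed code:

  // Sketch only: module/field names follow livegrep's livegrep.proto as I
  // understand it; treat anything not in the proto as illustrative.
  pub mod livegrep {
      // Pulls in the bindings tonic-build generates for the "livegrep"
      // proto package (see the build.rs sketch further down).
      tonic::include_proto!("livegrep");
  }

  use livegrep::{code_search_client::CodeSearchClient, Query};

  pub async fn search_text(pattern: &str) -> Result<(), Box<dyn std::error::Error>> {
      // The port is illustrative; codesearch listens wherever we configure it.
      let mut client = CodeSearchClient::connect("http://localhost:9999").await?;
      let response = client
          .search(Query {
              line: pattern.to_string(), // the regexp livegrep matches per-line
              fold_case: true,
              max_matches: 1000,
              ..Default::default()
          })
          .await?;
      for r in response.into_inner().results {
          println!("{}:{}: {}", r.path, r.line_number, r.line);
      }
      Ok(())
  }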

I'm pursuing this now. I'm struggling a little bit with how to best deal with revision control for livegrep's config.proto and livegrep.proto files.

Currently, our web-server provisioning looks like this:

  • git clone -b mozsearch-version5 https://github.com/mozsearch/livegrep inside the VM at ~/livegrep
  • bazel build that (inside the VM)
  • sudo pip3 install grpcio grpcio-tools (inside the VM)
  • directly invoke python3 -m grpc_tools.protoc to build the python bindings inside the VM at ~/livegrep-grpc3/, and then put that on the python path

That all largely works fine for python because of its lazy runtime compilation model, but at the expense of the VS Code IDE experience (outside of the VM), where we get yellow squiggles under the grpc, livegrep_pb2, and livegrep_pb2_grpc imports because VS Code can't see any of those files or their byproducts. It's also impossible for anyone to use our codesearch.py module outside of the VM, but that's not generally a problem because they can't talk to the livegrep codesearch server from outside the VM either.

For our rust tools/ dir, the compilation model really wants access to the proto files and the bindings. While I do try to only run the cargo build process inside the VM, VS Code with rust-analyzer is generally able to figure out what's going on, as long as the rust versions inside and outside the VM are similar enough, without me having pushed any buttons. Also, in theory it would be nice for someone to be able to fully build the crate outside the VM as long as they've checked out the tools/ipdl_parser submodule.

General options seem to be:

  1. Ruled out per the above: Stick with the provisioning idiom.
  2. Contribute a rust crate to livegrep that exposes the bindings. I'm not sure whether this is idiomatic, and in general it seems like it might disproportionately increase the maintenance burden for livegrep and defeat the point of such a clean API boundary.
  3. Add livegrep as a git submodule so that we can directly reference the proto files from their origin source tree. For sanity this probably does not want to live under tools/ but instead somewhere like deps/livegrep. This could potentially simplify provisioning, but I'm definitely not touching that as a first step.
    • In general I've found git submodules to be a pain, but most of that pain has come from high-churn submodules, because git pull won't update the submodule checkouts unless you do git pull --recurse-submodules or have set the submodule.recurse config flag to true. In that case everything's good; otherwise one needs to run git submodule update --init. When people fail to update the submodule, their tree status looks like they have reverted the submodule change, and it's easy to end up committing that change.
    • Our churn on ipdl_parser and livegrep is so low, and our committer base so small, that I'm not sure any of this is a problem, even if we couldn't just put git config submodule.recurse true in our provisioning script or in our installation instructions, which already include git submodule update --init.
  4. Just check the .proto files in, flagging their origin and authorship as best we can. This seems messy at best unless the proto files had explicit per-file licensing/etc. upstream.

So my tentative plan here is to add a new livegrep git submodule without changing the provisioning process at this time; I'll file a follow-up bug to track that idea.
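
For reference, the build-time half of this should just be a small build.rs invoking tonic-build. The paths below assume the deps/livegrep submodule layout described above; the exact in-tree proto locations are an assumption, so adjust to wherever livegrep actually keeps them:

  // build.rs sketch, assuming a deps/livegrep submodule checkout.
  fn main() -> Result<(), Box<dyn std::error::Error>> {
      tonic_build::configure()
          // We only need the client half; codesearch itself is the server.
          .build_server(false)
          .compile(
              &["../deps/livegrep/src/proto/livegrep.proto"],
              &["../deps/livegrep/src/proto"],
          )?;
      Ok(())
  }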

I've mostly implemented this, but I'm running into a testing logistics issue: the codesearch daemon isn't active when we run the local index checks. Deferring the check to the web server check phase isn't appropriate, because we want to be exercising the local_index server rather than the remote_server, since the pipeline-server will itself be running in local_index mode.

My tentative plan for the indexer phase, in terms of mkindex.sh, is to:

  • Assume/require that the web servers aren't running and that therefore no one is running any codesearch instances. Maybe even have it invoke killall/pkill directly.
  • Start up codesearch around its call to check-index.sh and clean it up afterwards (on success; on failure, the potential killall above might handle it at the (re)start of indexing, or maybe we add an error trap or the like).

This might help approximate a setup where systemd or some other daemon driver process is responsible for starting codesearch, noticing if it crashes, and restarting it. Said daemon driver might then also be able to kick/restart the pipeline-server. That way we avoid router.py or pipeline-server.rs having to be in charge of the codesearch process. We haven't had router.py fall over in a long, long time, so there's been no need for such changes, but as I move to expand searchfox's capabilities there's a real chance of occasional OOM-related breakages, and it would be nice to have automated self-healing for that. I feel like I've looked into tools to help with this in the past, so I'll dig up my notes.

It appears my notes about process spawning/keep-aliving were at https://bugzilla.mozilla.org/show_bug.cgi?id=1630646#c6 where I determined systemd is probably the best option available.

Some relevant crates that jump out immediately in this space are:

  • sd-notify: minimal MIT/Apache-licensed crate that exposes the sd_notify API for updating systemd about the status of a service.
  • systemd: LGPL-2.1+ (with gcc-exception-2.0) full libsystemd integration.
  • killjoy: GPL-3+ standalone tool that acts as a dbus glue layer (and so doesn't actually need to be linked into any searchfox code).

It's not clear we'd need any of these, though, as external heartbeat check scripts are likely more reliable and would avoid platform-coupling/additional deps.
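
That said, for completeness, the sd-notify integration would be tiny if we ever did want it. A sketch; the function names and status strings here are illustrative, and this only means anything when running as a systemd Type=notify service:

  use sd_notify::NotifyState;

  fn report_ready() {
      // Tell systemd the service finished starting up; `false` means
      // "leave NOTIFY_SOCKET set" so later status updates still work.
      let _ = sd_notify::notify(false, &[NotifyState::Ready]);
  }

  fn report_status(msg: &str) {
      // Free-form status line surfaced by `systemctl status`.
      let _ = sd_notify::notify(false, &[NotifyState::Status(msg)]);
  }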

See Also: → 1630646

For now, to deal with the codesearch startup issue for tests, my plan is:

  • Add a __main__ handler to codesearch.py so that it can be directly run to daemonize a codesearch instance, and also verify that the server has started up and is responsive by sending an InfoRequest message over the Info rpc method (the equivalent Rust-side probe is sketched after this list). This should let us avoid depending on the existing sleep(5), which needlessly slows the dev loop.
  • I'll also add some logic to kill any existing codesearch for the given port, plus a specific "shutdown" variant that just does the killing without the startup.
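
On the Rust side, the same readiness probe falls naturally out of the tonic bindings. A sketch reusing the hypothetical livegrep module from the earlier snippet; the retry count, interval, and endpoint handling are illustrative:

  use livegrep::{code_search_client::CodeSearchClient, InfoRequest};
  use std::time::Duration;

  // Poll the Info rpc until codesearch answers, rather than sleeping a
  // fixed 5 seconds; give up after roughly 5 seconds of retries.
  pub async fn wait_for_codesearch(endpoint: String) -> Result<(), Box<dyn std::error::Error>> {
      for _ in 0..50 {
          if let Ok(mut client) = CodeSearchClient::connect(endpoint.clone()).await {
              if client.info(InfoRequest::default()).await.is_ok() {
                  return Ok(()); // server is up and responsive
              }
          }
          tokio::time::sleep(Duration::from_millis(100)).await;
      }
      Err("codesearch never became responsive".into())
  }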
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Blocks: 1767538
See Also: → 1768341
Blocks: 1704615