Bug 983472
Opened 11 years ago
Closed 5 years ago
Make queries multi-language aware
Categories
(Webtools Graveyard :: DXR, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: nrc, Unassigned)
With the addition of more languages to DXR, we need to make queries aware of the language they work with.
I propose:
1. adding an optional 'language' field to queries. No language => language independent. (Just informational at this point).
2. Allow plugins to add queries by having all plugins require a 'query.py' file.
3. Move C++-specific queries to the Clang plugin.
That is enough to support multiple languages in DXR, with one lang per DXR instance, which is enough for me, for now (i.e., enough to index the Rust compiler). I would like to support multiple languages per instance though. So this requires a bit more thought.
We need to
4. Ensure there are no duplicate queries specified by different plugins (though perhaps we don't need this if we always indicate the language along with the query).
5. Restrict the queries shown in the search field, or indicate the language there somehow. This is the bit I'm not sure about, since we might be on a cpp page and want to do a query of Rust code, say. Maybe ordering by language and indicating the language with the query is enough. I can't think of anything better for now.
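As a rough illustration of points 1, 2 and 4 (a sketch only, not actual DXR code): each query carries an optional language, plugins register their own queries, and duplicates are rejected per (name, language) pair. The names `Query`, `QueryRegistry`, and the plugin and query names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Query:
    """A named query; language=None means language independent (point 1)."""
    name: str
    language: Optional[str] = None


class QueryRegistry:
    """Collects queries contributed by plugins, keyed by (name, language)."""

    def __init__(self) -> None:
        # Maps (query name, language) to the plugin that registered it.
        self._queries: Dict[Tuple[str, Optional[str]], str] = {}

    def register(self, plugin: str, query: Query) -> None:
        key = (query.name, query.language)
        if key in self._queries:
            # Point 4: reject a second definition for the same language.
            raise ValueError(
                f"{query.name!r} already registered by {self._queries[key]!r}"
            )
        self._queries[key] = plugin

    def for_language(self, language: str) -> List[str]:
        """Query names that apply to `language`, including
        language-independent ones."""
        return sorted(
            name
            for (name, lang) in self._queries
            if lang is None or lang == language
        )


registry = QueryRegistry()
registry.register("core", Query("path"))             # language independent
registry.register("clang", Query("derived", "c++"))  # point 3: lives in the plugin
registry.register("rust", Query("impl", "rust"))
print(registry.for_language("rust"))  # ['impl', 'path']
```

Keying on the (name, language) pair means two plugins can both offer a query with the same name as long as the languages differ, which is the "maybe we don't need this" case in point 4.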
Comment 1•11 years ago (Reporter)
Erik, are you OK with 1-4, and do you have any other ideas for 5?
Flags: needinfo?(erik)
Comment 2•11 years ago
Ah, good, nrc. I've been hoping you'd show up in channel, as I'm planning to rethink/rewrite the query system and would prefer to get the Rust stuff merged first, because I don't want you to have to bear the pain of my deltas.
I like 1 and 3. Those make sense.
Multiple langs in a codebase are very important to me. For example, I'd like to add basic indexing of the Python and JS in moz-central in the next few quarters. When you search for `fn:smoo`, you should find smoo()s defined in C++, JS, or Python. Searching for `fn:smoo lang:js` would pare it down.
So, more explicitly, I'd like to see concepts that are broadly cross-language supported as a single bucket, at least in the UI. Does that gel with you? Broadly, I'd like to move us more in the direction of "Don't worry about exactly what sort of thing you're searching for. Just type 'smoo', and we'll give you an intelligent mix. If there are way too many matches, then you can pare them down." That seems to be less surprising to new users and those migrating from MXR. See https://wiki.mozilla.org/DXR_Result_Mixing for an extremely rough expansion on the subject.
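To make the `fn:smoo` versus `fn:smoo lang:js` behaviour concrete, here is a minimal sketch. It assumes a toy in-memory index and a naive `key:value` parser; `INDEX`, `parse`, and `search` are made up for illustration and say nothing about how DXR actually stores or parses things.

```python
from typing import Dict, List, Tuple

# Hypothetical index rows: (function name, language, path).
INDEX: List[Tuple[str, str, str]] = [
    ("smoo", "c++", "src/smoo.cpp"),
    ("smoo", "js", "ui/smoo.js"),
    ("smoo", "python", "tools/smoo.py"),
    ("other", "js", "ui/other.js"),
]


def parse(query: str) -> Dict[str, str]:
    """Split whitespace-separated 'key:value' terms into a filter dict."""
    filters = {}
    for term in query.split():
        key, _, value = term.partition(":")
        filters[key] = value
    return filters


def search(query: str) -> List[str]:
    """Return paths of matching functions across all languages,
    unless a lang: filter pares the results down."""
    filters = parse(query)
    name = filters.get("fn")
    lang = filters.get("lang")  # absent => every language
    return [
        path
        for (fn, language, path) in INDEX
        if fn == name and (lang is None or language == lang)
    ]


print(search("fn:smoo"))          # all three smoo() definitions
print(search("fn:smoo lang:js"))  # ['ui/smoo.js']
```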
What's the shape of your Rust tables? Are they similar to the C++-inspired ones, or do they store vastly different data? Moving to ES next quarter might help, if they're different. It's basically made out of JSON blobs, so you can put whatever you want in there, be sparse, etc.
Can you elaborate on what you mean by #2?
Rather than #4, I could see each plugin being able to *contribute* to the eventual ES query. We could even have them contribute to the query parser grammar if need be: the framework I used is designed for that sort of thing.
5 is tricky. I could see us having more floating bubbles pop up and make fun of the user if they try to do a subclass search in JS, etc. We'll have to think about that.
I'd love to chat in realtime with you about all this stuff; it strikes me as something that would benefit from high-bandwidth communication. But hopefully that will get you through the weekend. :-) Cheers!
Flags: needinfo?(erik)
Comment 3•11 years ago (Reporter)
(In reply to Erik Rose [:erik][:erikrose] from comment #2)
> Ah, good, nrc. I've been hoping you'd show up in channel, as I'm planning to
> rethink/rewrite the query system and would prefer to get the Rust stuff
> merged first, because I don't want you to have to bear the pain of my deltas.
>
Cool, I think the queries stuff is the only thing stopping me landing the Rust stuff now (at least on the DXR side, the rustc side of things needs a little more tidying up).
I'm on PTO this week and was at a work week last week, which is why I've not been around much on irc. Back next week, will ping you to discuss...
> I like 1 and 3. Those make sense.
>
> Multiple langs in a codebase are very important to me. For example, I'd like
> to add basic indexing of the Python and JS in moz-central in the next few
> quarters. When you search for `fn:smoo`, you should find smoo()s defined in
> C++, JS, or Python. Searching for `fn:smoo lang:js` would pare it down.
>
> So, more explicitly, I'd like to see concepts that are broadly
> cross-language supported as a single bucket, at least in the UI. Does that
> gel with you? Broadly, I'd like to move us more in the direction of "Don't
> worry about exactly what sort of thing you're searching for. Just type
> "smoo", and we'll give you an intelligent mix. If there are way too many
> matches, then you can pare them down." That seems to be less surprising to
> new users and those migrating from MXR. See
> https://wiki.mozilla.org/DXR_Result_Mixing for an extremely rough expansion
> on the subject.
>
I think this sort of thing will just work. Well, at the moment. If/when we go for smarter results pages, then things will be a bit more complicated.
> What's the shape of your Rust tables? Are they similar to the C++-inspired
> ones, or do they store vastly different data? Moving to ES next quarter
> might help, if they're different. It's basically made out of JSON blobs, so
> you can put whatever you want in there, be sparse, etc.
>
They are mostly very similar, but sometimes a bit different. I see two query issues. One is stuff that just doesn't exist in another language, e.g., searching for impls in Rust, which don't exist in C++. So there are some queries which don't make sense for C++ programs at all. Mostly that is OK because I think the common use case for this is via the menu. But the drop down list of filters is another matter. The other worry is where we have the same abstract concept but different queries. I think finding overriding methods is an example. Overriding works differently in Rust, so the query uses different tables. My solution is to have a query called something like find-overriding-rust, which is a bit hacky.
> Can you elaborate on what you mean by #2?
>
Currently, some tables are language independent and some are specified by the plugin, but all queries are given in query.py. I would like to move the ones that depend on a plugin's tables to that plugin.
> Rather than #4, I could see each plugin being able to *contribute* to the
> eventual ES query. We could even have them contribute to the query parser
> grammar if need be: the framework I used is designed for that sort of thing.
>
> 5 is tricky. I could see us having more floating bubbles pop up and make fun
> of the user if they try to do a subclass search in JS, etc. We'll have to
> think about that.
>
I don't think that would be an issue exactly. My worry is that the user searches for sub-classes of X in Rust and gets nothing, because she should have searched for sub-traits. And furthermore, she doesn't get a bubble pointing this out, because the codebase also has C++ in it, so subclass of X is a totally valid query.
> I'd love to chat in realtime with you about all this stuff; it strikes me as
> something that would benefit from high-bandwidth communication. But
> hopefully that will get you through the weekend. :-) Cheers!
I'll be mostly flying this weekend, so don't expect any progress too soon :-) Let's chat next week.
Comment 4•11 years ago
Cool. Sorry about the spelling of my earlier comment; they were kicking me out of co-working at closing time. :-)
> Overriding works differently in Rust, so the query uses different tables. My
> solution is to have a query called something like find-overriding-rust, which
> is a bit hacky.
Thinking solely as a user, I'd be least surprised to get back a mix of both languages when I say "overrides:". It'll be obvious when there's a bunch of language A mixed in when I was looking for language B. Then I can add a "lang:" filter to fix that. Better still, DXR can notice that you got a mix of languages back (probably not desired) and offer you a handy language filtration widget at the top of the results (or wherever): "Show results from [Rust] | [C++]". You could depress one or more of those buttons, and it would add lang filters to (or remove them from) the textual query. I think the cross-language query will be more efficient in ES than in SQL, but that's one of the things I want to talk to you about before I start committing ES code. I'm really glad we'll have a second concrete language thought through to inform the design of any new schema. Too bad we don't have any Forth; that'd really mix it up. ;-)
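The filtration widget idea could look something like the sketch below: notice when results span several languages, offer one button per language, and have each button add or remove a `lang:` term in the textual query. The result shape (`{"lang": ...}`) and the helpers `languages_in` and `toggle_lang` are assumptions made up for this example.

```python
from typing import Iterable, List


def languages_in(results: Iterable[dict]) -> List[str]:
    """Distinct languages present in a result set, sorted."""
    return sorted({r["lang"] for r in results})


def toggle_lang(query: str, lang: str) -> str:
    """Add 'lang:<lang>' to the textual query, or remove it if present."""
    term = f"lang:{lang}"
    terms = query.split()
    if term in terms:
        terms.remove(term)
    else:
        terms.append(term)
    return " ".join(terms)


# Hypothetical results for an "overrides:" search in a mixed codebase.
results = [{"lang": "rust"}, {"lang": "c++"}, {"lang": "rust"}]
langs = languages_in(results)
if len(langs) > 1:
    # A mix came back, so offer one button per language.
    print("Show results from " + " | ".join(f"[{name}]" for name in langs))

pressed = toggle_lang("overrides:foo", "rust")
print(pressed)                       # overrides:foo lang:rust
print(toggle_lang(pressed, "rust"))  # overrides:foo
```

Keeping the button state in the query text itself means the textual query stays the single source of truth, which seems to be the intent of "add lang filters to (or remove them from) the textual query".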
I'm at a work week *next* week, so, ironically, I won't be getting much work done on DXR (though I will be standing up and talking about our excellent community and bragging about upcoming Rust support :-)). Let's talk when we're both around—maybe even next week, though it's doubtful. You can think of the rest of this comment as notes on what to talk about then.
I wouldn't even be against merging in the Rust stuff before we're ready to turn it on, if it was pretty low-touch on the other parts of the system. I fear you ending up in merge hell, as the query system is becoming a fast-moving target. Though hmm, we'll have to rethink our build system so that not everyone who wants DXR for one or two langs needs compilers for 5 just to get the thing up.
> when we go for smarter results pages, then things will be a bit more complicated.
Whyzat?
> But the drop down list of filters is another matter.
As we move toward https://wiki.mozilla.org/DXR_Result_Mixing and https://wiki.mozilla.org/DXR_Query_Language_Refresh, the Filters menu won't matter as much. You won't need it when searching for identifiers. It will be relegated to doing structural searches. That doesn't solve the problem entirely, but it means it no longer has to be so fast-access for every purpose; we could put a Rust tab in it or whatever. (That's just an extremely rough example; most hierarchical menus are horrible to use.)
Comment 5•11 years ago (Reporter)
Discussed with ErikRose and abbeyj. The short-term plan is to add a language field to queries and a pref in the config file to indicate which language to present queries from. That will allow us to land Rust/DXR. Long term, we came up with a plan that divides queries into high-level and low-level ones and presents the user with high-level queries, which are combinations of low-level (language-dependent) ones. We will update the plugin interface too, and move language-dependent queries to the plugins then.
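A minimal sketch of the short-term plan, assuming an INI-style config: a single pref selects which language's queries are presented, and language-independent queries are always shown. The section name `DXR`, the option name `query_language`, and the query names are invented for illustration; they are not real DXR settings.

```python
import configparser
from typing import List, Optional, Tuple

# Hypothetical config snippet; section and option names are illustrative only.
CONFIG_TEXT = """
[DXR]
query_language = rust
"""

# (query name, language); None marks a language-independent query.
QUERIES: List[Tuple[str, Optional[str]]] = [
    ("path", None),
    ("derived", "c++"),
    ("impl", "rust"),
]


def presented_queries(config_text: str) -> List[str]:
    """Names of the queries to present, given the language pref.
    With no pref set, every query is presented."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    pref = config.get("DXR", "query_language", fallback=None)
    return [
        name
        for (name, lang) in QUERIES
        if lang is None or pref is None or lang == pref
    ]


print(presented_queries(CONFIG_TEXT))  # ['path', 'impl']
```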
Comment 6•11 years ago
Expanding on the long-term plans: the filter names the user will type ("function:") are the high-level ones. The parser parses those and then tells each plugin "Contribute what you will to this `function` query", for example. Those contributions are merged into a single, big query. We then execute that and present the results.
At least in the long term, I see all queries being considered language-dependent and making a trip through the plugin API. If we have commonalities across languages (say the `functions` table is used identically in C++ and FooLanguage), we simply factor those query-building bits up and call them from both plugins.
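The long-term flow could be sketched roughly like this: the parser sees a high-level filter such as `function`, asks each plugin for its contribution, and merges the contributions into one query. The plugin table, the field names `lang` and `function.name`, and the helper `functions_clause` are all assumptions for the sake of the example; only the `bool`/`should`/`must`/`term` shapes follow the Elasticsearch query DSL.

```python
from typing import Callable, Dict, List

# A contribution takes the search term and returns an ES query clause.
Contribution = Callable[[str], dict]


def functions_clause(lang: str, term: str) -> dict:
    """Shared query-building bit, factored up for plugins whose
    function data has the same shape (field names are hypothetical)."""
    return {
        "bool": {
            "must": [
                {"term": {"lang": lang}},
                {"term": {"function.name": term}},
            ]
        }
    }


# Hypothetical plugins, each mapping high-level filter names to contributions.
PLUGINS: Dict[str, Dict[str, Contribution]] = {
    "clang": {"function": lambda term: functions_clause("c++", term)},
    "rust": {"function": lambda term: functions_clause("rust", term)},
    "python": {},  # contributes nothing to `function` in this sketch
}


def build_query(filter_name: str, term: str) -> dict:
    """Ask every plugin for its contribution and merge into one big query."""
    clauses: List[dict] = [
        handlers[filter_name](term)
        for handlers in PLUGINS.values()
        if filter_name in handlers
    ]
    # A document matches if it satisfies at least one plugin's clause.
    return {"bool": {"should": clauses, "minimum_should_match": 1}}


query = build_query("function", "smoo")
print(len(query["bool"]["should"]))  # 2
```

A plugin with different semantics (say, Rust overriding) would simply return a differently shaped clause from its own handler instead of calling the shared helper, so the user-facing filter name stays the same.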
Comment 7•5 years ago
DXR is no longer available. Searchfox is now replacing it.
See meta bug 1669906 & https://groups.google.com/g/mozilla.dev.platform/c/jDRjrq3l-CY for more details.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
Updated•5 years ago
Product: Webtools → Webtools Graveyard