Closed Bug 1902875 Opened 1 year ago Closed 1 year ago

Show the field definition's actual type or line in field layout

Categories

(Webtools :: Searchfox, enhancement)

enhancement

Tracking

(firefox129 fixed)

RESOLVED FIXED
Tracking Status
firefox129 --- fixed

People

(Reporter: arai, Assigned: arai)

References

(Regression)

Details

Attachments

(6 files)

derived from https://github.com/mozsearch/mozsearch/pull/758

Currently the field layout table shows the expanded and qualified type name for each field,
which is of course good for determining how and where the field size comes from,
but it doesn't work nicely with template:

  • it's sometimes too verbose
  • not all types are linked

One option is to show the source code line or the type part directly inside the table.
There are some problems:

  • if the field definition spans across multiple line, currently there's no way to determine where it starts
    • If we show single line, it may not contain the type
  • if the field comes from macro, the source code line points the macro invocation
    • This doesn't necessarily contain the field type or even field name

Possible workaround is to show both the expanded type name and the source line (e.g.
https://clicky.visophyte.org/files/screenshots/20200331-020445.png ),
but it also has a problem that both type name and source line can be long. we may need to somehow limit the width of them to make sure the
field offset/size are readable.

Here's prototype's output

the type cell is taking too much space in the GCRuntime case.

there's one thing that can be improved here, which is the existence of enum prefix differs between platform. solving the difference can reduce the amount of text by 50%

See Also: → 1902891
Depends on: 1903464

if the field definition spans across multiple line, currently there's no way to determine where it starts

  • If we show single line, it may not contain the type

here's example:

https://searchfox.org/mozilla-central/rev/b11735b86bb4d416c918e2b2413456561beff50c/js/src/gc/GCRuntime.h#1366-1373

MainThreadOrGCTaskData<Callback<JSHostCleanupFinalizationRegistryCallback>>
    hostCleanupFinalizationRegistryCallback;
MainThreadData<CallbackVector<JSWeakPointerZonesCallback>>
    updateWeakPointerZonesCallbacks;
MainThreadData<CallbackVector<JSWeakPointerCompartmentCallback>>
    updateWeakPointerCompartmentCallbacks;
MainThreadData<CallbackVector<JS::GCNurseryCollectionCallback>>
    nurseryCollectionCallbacks;

The field definition's position can be retrieved by adding the following to the clang plugin, except for the annotation after the field name (Range.getEnd() points the end of field name):

       J.objectBegin();
+      const SourceRange &Range = Field.getSourceRange();
+      J.attribute("locStart", locationToString(Range.getBegin(), 0));
+      J.attribute("locEnd", locationToString(Range.getEnd(), 0));
       J.attribute("pretty", getQualifiedName(&Field));
       J.attribute("sym", getMangledName(CurMangleContext, &Field));

Note that we do have fullRangeToString which we use for arg ranges and the nestingRange. We also have the concept of a peekRange which grew out of my initial mega-menu proposal (https://github.com/mozsearch/mozsearch/pull/11) and is currently just a line range but could also be a "full range". We never really shipped anything that leverages the peekRange by default, but if you use the query endpoint to do a fulltext search like on updateWeakPointerZonesCallbacks you can see that it uses it.

The peekRange gets explicitly expanded to try and include comments, I believe, so it could make sense to still also have a more explicit astRange which is expected to line up directly with the AST. This would potentially line up with my thoughts for how to add data-flow/taint analysis to searchfox.

As a separate thought, from the source listing it looks like you're adding this to the structured field emission on the class/interface and I think it might be a new thing for us to emit information that is already expressed on the target record for the given symbol. That said, I think this makes sense given that we are explicitly dealing with specific platform representations and the merge process for target records is somewhat lossy and it's probably most sane for the field-layout logic to have the information here rather than do a bunch more lookups and deal with the complication of post-merge data. But maybe for consistency I wonder if we should effectively embed the contents of the target record that would be emitted for the definition there? I don't have a strong feeling about that; searchfox is fundamentally pragmatic and some of the info is redundant (context/contextsym with the "target" marker if present being unnecessary), but I figure I'd float the idea.

Thank you for pointing out the existence of peekRange.
Indeed it seems to be what I want, and it seems to point the same line range in testcases.
Given we don't actually need column range for this purpose as well, and we can use lineRangeToString.

About where to put the data, I'm leaning toward putting it into structured record, given:

  • the range we want here doesn't include comment. so, not really a duplicate of peekRange, even if it can be same for most cases
  • indeed, extra lookup for each field for AnalysisTarget doesn't sound ideal

(In reply to Tooru Fujisawa [:arai] from comment #7)

About where to put the data, I'm leaning toward putting it into structured record, given:

Right, definitely makes sense for the data to live there to avoid the extra lookups.

I'm not crazy about the potential need for string parsing, but given that we potentially need to support fields having definition points in different source files from the actual class/struct definition file in pathological preprocessor situations[1]. We could potentially do something like we do for with the jumpref jumps which is arguably just relative URL syntax, but the idea could be that we could emit #100-100 (or even just #100) for a reference to line 100 in the same file as the class (although parse_line_range currently requires the -), but PATH#100-100 if the definition is in a different file. This arguably would be readable and intuitive while allowing us to handle really ugly code. Another precedent in searchfox is that in blame we use % to represent unchanged/same-path but that is potentially less intuitive to read.

1: I'm not sure there are any in the tree. But idioms like the BrowsingContexted sync fields worried me, but I guess that's fine. And maybe the hack for atoms is gone now?

Published to dev channel

https://dev.searchfox.org/mozilla-central/query/default?q=field-layout%3A%27mozilla%3A%3Adom%3A%3ABrowsingContext%27

For macros, the FieldDecl::getSourceRange doesn't contain valid data, so the location string becomes empty string.
Then, when the field's location is empty, it fallbacks to the field's identifier (DerivedSymbolInfo.get_def_lno()), and it points the macro's line.

Then, in simple case, there can be duplicate between "Name+Type" vs "Line" (I guess it should be renamed to "Definition" ? I forgot to fix it),
such as "Name" being a part of "Line", "Type" being exactly same as what's shown in "Line",
but as demonstrated above, if macro is used, or something special happens, there can be no overlap.

For macros I think that can potentially be improved by https://github.com/mozsearch/mozsearch/pull/679 where we can potentially pull the fully expanded text out.

For the short term, I think it's definitely the case that the type is over-normalized and this generally makes things less useful than they could be. Specifically:

  • Having "class" everywhere is not helpful for most types.
  • The value types of uint64_t/uint32_t/etc creating platform differences adds a lot of visual noise that is not useful.
  • And for enums making things like mozilla::RustCell<FlagsType> mFlags{0}; look like class mozilla::RustCell<unsigned int> the loss of FlagsType is significant.

I think the only case the type really seems to be a win is for the macro cases like mRefCnt where the type of class nsCycleCollectingAutoRefCnt is definitely better than the current value. So maybe the fallback for macros could be the type instead of the line for now?

Maybe a good compromise for now would be to:

  • Make the "type" column optional and defaulting to not enabled and just use "line".
  • Add (multiply-allowed) args like "--show-col" and "--hide-col" to "format-symbols" exposed as "show-col" and "hide-col" so that someone could do field-layout:'Foo::Bar' show-col:type hide-col:line:
[term.show-col]
[[term.show-col.group.semantic-format]]
command = "format-symbols"
arg.show-col = "$0"

[term.hide-col]
[[term.hide-col.group.semantic-format]]
command = "format-symbols"
arg.hide-col = "$0"

While it's possible to do dynamic JS stuff, I think it's important to always be able to link to exactly what people are seeing so instructions are never "go to this link and then push buttons". But we could surface UI in the column or similar that helps generate a revised query.

Meta: A longer term plan for me has been to have the server be able to populate a JSON structure that expresses "for the query you're looking at, these are the additional arguments that could be specified". Columns are such a first class UI thing that selecting from a list of columns probably would actually deserve more first-class treatment, but for things like diagrams where there's a random grab bag of extra terms that could be used, having the ability to have the back-end tell the front-end what they are and what the values are would be neat. And in general clap can know what the values are, etc.

This is also a case where soliciting feedback in the searchfox channel could make sense to, to see what people mainly want to use it for.

Longer term, I think these are some options:

  • Make sure clang indexer dumps the rich type information for the fields so that we can build our own type representation including pretty printing.
  • I actually think it could be really nice to show the peekRange so that we can see comments on the fields. But instead of that being in a column, we could leverage the nature of the tree table and expose the peekRange in an additional row beneath the fields. These could follow the normal expand/collapse idiom and there could be a button to expand all. (And maybe that would be a case where we could use the hash to track expansion states?)
Depends on: 1903944

Having "class" everywhere is not helpful for most types.

it seems to be controllable with Policy.SuppressTagKeyword.

https://clang.llvm.org/doxygen/structclang_1_1PrintingPolicy.html#a40f856cb8a41b0eee26f349d3aed1657

filed bug 1903944 for tweaking it.

I'm still not sure if it's possible to control uint8_t and other using/typedefs.

I'll look into making type column hidden-by-default.

No longer depends on: 1903944
See Also: → 1903944

Added the column selector

https://dev.searchfox.org/mozilla-central/query/default?q=field-layout%3A%27mozilla%3A%3Adom%3A%3AWindowContext%27

(not reflected to cached page, so using mozilla::dom::WindowContext instead)

(In reply to Tooru Fujisawa [:arai] from comment #13)

(not reflected to cached page, so using mozilla::dom::WindowContext instead)

Ah, yeah, if making changes on a running server, you'll want to do something like sudo rm -rf ~/index/nginx-cache/*.

(In reply to Tooru Fujisawa [:arai] from comment #13)

Added the column selector

https://dev.searchfox.org/mozilla-central/query/default?q=field-layout%3A%27mozilla%3A%3Adom%3A%3AWindowContext%27

Very cool! I took a peek at the branch to better understand how it was working, and wow! I very much appreciate the effort you went to in order to both make the column query arguments apply on the back-end but also the equivalent JS local changes. The addition of fetch_formatted_lines is also a nice new capability, and I appreciate the care to clean up the pre-existing duplicated logic about path checks when implementing the new method for the local server.

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #14)

(In reply to Tooru Fujisawa [:arai] from comment #13)

(not reflected to cached page, so using mozilla::dom::WindowContext instead)

Ah, yeah, if making changes on a running server, you'll want to do something like sudo rm -rf ~/index/nginx-cache/*.

Good to know that!
now it's also reflected to BrowsingContext
https://dev.searchfox.org/mozilla-central/query/default?q=field-layout%3A%27mozilla%3A%3Adom%3A%3ABrowsingContext%27

Attached file GitHub Pull Request
Assignee: nobody → arai.unmht
Status: NEW → ASSIGNED
Pushed by arai_a@mac.com: https://hg.mozilla.org/integration/autoland/rev/73f6ee2867f3 Add lineRange field to fields in structured record. r=asuth
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Regressed by: 1904323
Regressions: 1910254
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: