add a data view for structs and classes
Categories
(Webtools :: Searchfox, enhancement)
Tracking
(Not tracked)
People
(Reporter: heycam, Assigned: asuth)
Attachments
(1 file)
Comment 1 (Assignee) • 6 years ago
I've been thinking about how to do this and similar functionality (superclasses, subclasses, etc.), and I think I've come to realize that we probably need/want a new record type beyond source and target that encodes type-level information. Perhaps "semantic" or "structured". These would be generated in the analysis files at the point of definition and continue to have a loc.
I'm also thinking (less seriously) that perhaps we should try and adopt a "content-addressed" mechanism for dealing with the realities that we have a whole bunch of different platforms that we attempt to unify. The core idea is that we attempt to encode the binary layout of a given data structure into a stable JSON representation that we can cryptographically hash. The record contains a mapping from platform labels to their hash, and then a mapping from hashes to the normalized JSON string that was hashed and optionally a non-normalized form that has a straightforward stable merging heuristic. (That is, it's possible the representation we hash may have to elide things that are necessary for human comprehension, and so we need to also keep around a tuple of an exemplar-identifier and the human-friendly JSON. When merging, we would establish a total ordering over the exemplar-identifier space like less-than.)
Ideally this would make it straightforward to show when types are the same across platforms and when they differ without always having N variations for every type because we have N platforms. The idea might even be extended to help out with some of our template issues, but I'm really hand-waving on that.
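As a rough illustration of the content-addressed idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the layout fields, the platform labels, the choice of SHA-256, and the function names are hypothetical and do not come from searchfox. It normalizes each platform's layout to a stable JSON string, hashes it, and records a platform-to-hash mapping, a hash-to-JSON mapping, and the smallest platform label per hash as the exemplar.

```python
import hashlib
import json


def normalized_json(layout):
    """Serialize with sorted keys and fixed separators so equal layouts give equal strings."""
    return json.dumps(layout, sort_keys=True, separators=(",", ":"))


def content_address(layouts_by_platform):
    """Build platform -> hash, hash -> normalized JSON, and hash -> exemplar mappings."""
    platform_to_hash = {}
    hash_to_json = {}
    exemplars = {}
    for platform, layout in layouts_by_platform.items():
        text = normalized_json(layout)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        platform_to_hash[platform] = digest
        hash_to_json[digest] = text
        # Total ordering over exemplar identifiers: plain string less-than.
        previous = exemplars.get(digest)
        if previous is None or platform < previous:
            exemplars[digest] = platform
    return {"platforms": platform_to_hash, "layouts": hash_to_json, "exemplars": exemplars}


# Hypothetical layouts: the two 64-bit platforms agree, the 32-bit one differs.
layouts = {
    "macosx64": {"sizeBytes": 8, "fields": [{"name": "mPtr", "offsetBytes": 0, "sizeBytes": 8}]},
    "linux64": {"sizeBytes": 8, "fields": [{"name": "mPtr", "offsetBytes": 0, "sizeBytes": 8}]},
    "android-armv7": {"sizeBytes": 4, "fields": [{"name": "mPtr", "offsetBytes": 0, "sizeBytes": 4}]},
}
record = content_address(layouts)
```

With this shape, checking whether two platforms share a layout is a string comparison of their hashes, and a type that is identical everywhere stores a single layout no matter how many platforms there are.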
Thoughts? (More so about creating a new record type versus reusing existing record types versus emitting type information someplace other than the analysis files, etc.) :kats, I'm particularly interested in your feedback, but I can't needinfo you here to express that.
Comment 2 • 6 years ago
I think generating a new record type for this kind of information makes sense. I didn't fully follow your hash data structure explanation but in general I do agree that it would be nice to unify fields common across platforms. But we'll need UI to show platform-specific variants anyway so maybe we can defer that unification idea until later.
Comment 3 (Assignee) • 5 years ago
I've got some preliminary progress on emitting structured info. This could probably be used to dynamically re-synthesize a fields-only view as in comment 0 by re-synthesizing from the peekLines for each symbol plus inserting synthetic comment lines that describe the offset in the structure and the size of the following item. (I've got this being emitted in the JSON, although we'll probably need to iterate on how to handle the offsets in the face of superclasses, etc. clang has the info but it has so much info.)
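To make the re-synthesis idea concrete, here is a minimal Python sketch. It assumes each field record carries a peekLines string plus offsetBytes and sizeBytes numbers; that shape, the function name, and the sample data are assumptions for illustration rather than the actual emitted JSON.

```python
def synthesize_fields_view(class_name, fields):
    """Render a fields-only listing with synthetic offset/size comment lines."""
    lines = [f"class {class_name} {{"]
    for field in sorted(fields, key=lambda f: f["offsetBytes"]):
        # Synthetic comment describing where the following item lives.
        lines.append(f"  // offset {field['offsetBytes']}, size {field['sizeBytes']}")
        lines.append(f"  {field['peekLines'].strip()}")
    lines.append("};")
    return "\n".join(lines)


# Hypothetical field records for a made-up class.
example_fields = [
    {"peekLines": "uint32_t mFlags;", "offsetBytes": 16, "sizeBytes": 4},
    {"peekLines": "  RefPtr<Foo> mFoo;", "offsetBytes": 8, "sizeBytes": 8},
]
view = synthesize_fields_view("Example", example_fields)
print(view)
```

Sorting by offset keeps the listing in layout order even if the records arrive in a different order.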
Comment 4 (Assignee) • 5 years ago
Here's an example of the current structuring I have that you can see if you open https://fancy.searchfox.org/mozilla-central/sorch?path=&q=mozilla::dom::Element and use the devtools console to inspect SEARCH_RESULTS.semantic["T_mozilla::dom::Element"].meta. Right now there's no UI backing the "sorch" results, but there probably will be next weekend. (And do be careful about looking at any of the static files on fancy; the automatic analysis traversals will end up slowly sucking down the entire searchfox database in the background! I have fixes for that almost completed, but they're not up on fancy yet.)
Note that the fields are only for the immediate class, so the offsetBytes values for each field start well above 0 because those fields come after the fields of all the superclasses. One would need to walk the super chains to get those extra fields. More details can be found at https://github.com/asutherland/mozsearch/blob/fancy/docs/analysis.md#structured-records
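Walking the super chains could look roughly like the Python sketch below. It assumes a dictionary of structured records keyed by symbol, where each record has a supers list of symbol names and a fields list; those key names, the symbols, and the sample data are hypothetical simplifications and may not match the real record shape.

```python
def collect_fields(symbol, records):
    """Return (owner_symbol, field) pairs for a class and all its superclasses, supers first."""
    record = records[symbol]
    collected = []
    for super_symbol in record.get("supers", []):
        collected.extend(collect_fields(super_symbol, records))
    collected.extend((symbol, field) for field in record.get("fields", []))
    return collected


# Hypothetical records: T_Derived inherits from T_Base.
records = {
    "T_Base": {
        "supers": [],
        "fields": [{"pretty": "Base::mRefCnt", "offsetBytes": 8, "sizeBytes": 8}],
    },
    "T_Derived": {
        "supers": ["T_Base"],
        "fields": [{"pretty": "Derived::mFlags", "offsetBytes": 16, "sizeBytes": 4}],
    },
}
all_fields = collect_fields("T_Derived", records)
```

A class reached through more than one inheritance path shows up once per path in this sketch, and the sketch makes no attempt to compute where each superclass lands in the absolute layout of the root class.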
The key ideas are:
- There are "structured" record types.
- In merge-analyses.rs, we hash over their contents to determine if they're equivalent or not. If they're not, we arbitrarily pick an analysis to be canonical. Currently, this ends up being the hash associated with the last file in the merge-analyses args, explicitly because we want to avoid picking android-armv7 because it has 32-bit pointers and those are unrealistic.
- We put a platforms field on this canonical structured representation that lists what platforms had that hash. (List of strings, where the platform name is extracted from the analysis-PLATFORM/stuff relative path provided. If the regex fails, we just name them platform-#.)
- We put all the other "variants" in a variants field which contains the structured records for the given hash, also with a platforms field. This means looking at the canonical "meta" or the variant metas is basically the same.
- We do not actually expose any hashes in the JSON output we produce; it's just an internal thing.
Disclaimers:
- The Windows platform isn't in there for some reason. It may have something to do with the warning:
WARN 2020-03-01T21:52:53Z: tools::file_format::analysis: Error [SyntaxError("expected `:`", 1, 126)] trying to read analysis from file [analysis-win64/__GENERATED__/./dist/stl_wrappers/windows.h] line [5226]: [{"loc":"01416:25-27","source":1,"syntax":"","type":"LPCTSTR","pretty":"variable a0","sym":"V_2221ef5_6f6795","no_crossPCTSTR","pretty":"variable a0","sym":"V_2221ef5_6f6795","no_crossref":1}]
The biggest logistical problem with the structured records has been that previously we had a fixed-size stack buffer for us to call fgets() at https://github.com/mozsearch/mozsearch/blob/4bae5797a032bc5d0d3556a846625d97e7d5b74a/clang-plugin/MozsearchIndexer.cpp#L512. I've switched us to std::getline(), but there are impedance mismatches between C I/O and C++ I/O (no POSIX ::getline on Windows for us! C++ ifstreams really don't like to take raw FD's!), so it's possible the Windows solution is compiling and running now but generating super corrupt output or something. I need to look into things more, but will probably defer that until much later in the process.
- The problem may also just be some kind of foolish off-by-one error? (Here's hoping! :)
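As a loose analogue of why a fixed-size read buffer is a problem for long structured records, here is a small Python sketch. It is not the plugin code: it uses readline(size) to stand in for an fgets()-style bounded read, and the buffer size and sample records are made up. It only shows that a bounded read can return a record in pieces, which a reader expecting one record per read would mis-parse; it does not claim this is the cause of the Windows warning.

```python
import io

BUFFER_SIZE = 16  # hypothetical; far smaller than a real buffer, to keep the demo short


def read_fixed_chunks(stream, size):
    """Bounded reads: each call returns at most `size` characters of the current line."""
    chunks = []
    while True:
        chunk = stream.readline(size)
        if not chunk:
            break
        chunks.append(chunk)
    return chunks


def read_whole_lines(stream):
    """Unbounded reads: one entry per line, however long the line is."""
    return list(stream)


# Two made-up records; the first is longer than BUFFER_SIZE.
text = '{"loc":"1:0-3","sym":"T_Foo"}\n{"a":1}\n'
chunks = read_fixed_chunks(io.StringIO(text), BUFFER_SIZE)
lines = read_whole_lines(io.StringIO(text))
```

Here the bounded reader returns three pieces for two records, and the first piece does not end in a newline, while the unbounded reader returns exactly one entry per record.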
Comment 5 (Assignee) • 5 years ago
This is now working reasonably okay on simple cases and exciting things are happening in more complex cases.
- https://fancy.searchfox.org/mozilla-central/symbol?q=T_nsResProtocolHandler is one of the few examples of a structure with a field that only exists on one platform (Android). Screenshot at https://clicky.visophyte.org/files/screenshots/20200331-020445.png
- https://fancy.searchfox.org/mozilla-central/symbol?q=T_mozilla%3A%3Adom%3A%3AElement (the comment 0 example) looks reasonable for Element itself and nsIContent at https://clicky.visophyte.org/files/screenshots/20200331-021616.png but nsINode seems to have 2 sets of entries mashed together relating to how an anonymous union is handled at https://clicky.visophyte.org/files/screenshots/20200331-021603.png.
Disclaimer: If clicking on the links, you are strongly advised not to resize the window, as it's possible an in-content layout loop may result that really upsets the browser and/or Linux compositor. And looking up nsGlobalWindowInner is right out.
Updated (Assignee) • 2 years ago
Comment 6 (Assignee) • 1 year ago
An alpha-gated MVP landed in https://github.com/mozsearch/mozsearch/pull/692 and this should be live tomorrow. It is definitely not as polished as the client-side fancy branch prototype, but it does work for some value of working. (In particular, field-layout:'nsGlobalWindowInner' is amusing because nsISupports ends up in there a bunch of times because it's inherited from a bunch of times. This is probably reasonable except that the table isn't able to convey where the superclasses notionally end up in the absolute layout of the root class; I definitely did not capture that additional meta-information in clang yet.)
There's a ton more work to do on this, but I expect that work will be happening on different bugs that will be much more targeted, so I'm going to mark this as fixed.