Assume ".inc" file extensions are C++ files for tokenization purposes because that happens a lot and we can have C++ analysis data for them
Categories
(Webtools :: Searchfox, enhancement)
Tracking
(Not tracked)
People
(Reporter: asuth, Assigned: asuth)
Attachments
(1 file)
As a result of LLVM doing semantic indexing as part of bug 1852405, I've noticed that we don't have a tokenizer language associated with ".inc" files, which can throw away a lot of the power of the C++ indexing. Specifically, LLVM produces a file at __GENERATED__/tools/clang/include/clang/AST/Attrs.inc which is completely legal C++ code and for which we have profuse C++ analysis data, but for which the searchfox source listing is useless.
From a query against m-c for file paths ending in .inc we can see that these files are not always wholly valid C++ files. We can also end up with:
- HTML, XUL-flavored HTML, or maybe it's dead code XUL! ex: https://searchfox.org/mozilla-central/source/browser/base/content/browser-context.inc
- nasm assembly. ex: https://searchfox.org/mozilla-central/source/media/libjpeg/simd/nasm/jcolsamp.inc
- C++ Macrology DSL data payloads that our tokenizer can understand but which cannot really be parsed correctly in isolation. ex: https://searchfox.org/mozilla-central/source/build/clang-plugin/Checks.inc
The good news is:
- Most of the files are either wholly valid C++ or C++ macrology DSL payloads for which a C++-friendly tokenizer is the right answer even if it has some trouble with the DSLs.
- We categorically will not have analysis data for languages other than C++ at this time.
- The C++ tokenizer can give up and fall back to plain text for cases that are mismatches. Or maybe it won't give up and things will look silly. I'm okay with that, especially if it motivates people to stop using the preprocessor on HTML files.
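To make the proposal concrete, here is a minimal sketch of an extension-based lookup in which ".inc" is treated as C++ and anything unrecognized falls back to plain text. The table contents, the language labels, and the `tokenizer_language` name are all hypothetical and are not taken from the Searchfox source.

```python
import os

# Hypothetical extension-to-language table; NOT Searchfox's actual mapping.
EXTENSION_LANGUAGES = {
    ".cpp": "cpp",
    ".h": "cpp",
    ".inc": "cpp",  # the change proposed in this bug
    ".rs": "rust",
    ".js": "javascript",
}


def tokenizer_language(path):
    """Return the tokenizer language for a path, or "plain" if unknown."""
    _, ext = os.path.splitext(path)
    return EXTENSION_LANGUAGES.get(ext.lower(), "plain")
```

Under this sketch, the HTML and nasm ".inc" files listed above would also be handed to the C++ tokenizer, which is the trade-off accepted in this bug.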
A more elegant solution, though one that would be more work, would be to allow the analysis data to identify the language in use. That could make sense to do if someone who really cares implements analysis support for nasm assembly, etc.
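As a rough illustration of that alternative, the sketch below prefers a language reported by the analysis data and only falls back to the file extension when no such hint exists. The `analysis_language` parameter, the fallback table, and the language labels are assumptions made for illustration; the analysis data does not carry such a field today.

```python
import os

# Hypothetical fallback table keyed by extension; not the real mapping.
FALLBACK_BY_EXTENSION = {".inc": "cpp", ".cpp": "cpp", ".h": "cpp"}


def choose_language(path, analysis_language=None):
    """Prefer a language hint from analysis data, else guess from the extension."""
    if analysis_language:
        # Assumed: the analysis data names the language it was produced for.
        return analysis_language
    ext = os.path.splitext(path)[1].lower()
    return FALLBACK_BY_EXTENSION.get(ext, "plain")
```

With a hint available, a nasm ".inc" file would get a nasm tokenizer, while ".inc" files without a hint would still default to C++.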