Closed Bug 1418415 Opened 7 years ago Closed 6 years ago

Make clang plugin work in Windows

Categories

(Webtools :: Searchfox, defect)

defect
Not set
normal

Tracking

(firefox65 fixed)

RESOLVED FIXED
Tracking Status
firefox65 --- fixed

People

(Reporter: billm, Assigned: kats)

References

Details

Attachments

(4 files)

Bug 1413738 landed the clang plugin. It works on Linux and Mac but not on Windows--it crashes immediately upon being loaded.
How are you running it? I just tried doing a static-analysis try push with the indexer enabled [1] and it hit a compile error. I'm also having trouble with --enable-clang-plugin generally in my local build [2] [1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a4a65e81ec74b3ce6209af5c062114611b8ea5c [2] https://bugzilla.mozilla.org/show_bug.cgi?id=1418364
Fixed the compiler warning in bug 1418434. I'm not sure about the local build error.
Now I'm getting crashes on Mac as well: https://treeherder.mozilla.org/logviewer.html#?job_id=145718319&repo=try&lineNumber=3055 Not sure if it's the same thing you're seeing, Kats. I don't see us running static analysis builds on Mac anymore. Maybe they just regressed?
Summary: Make clang plugin work in Windows → Make clang plugin work in Windows/Mac
(In reply to Bill McCloskey (:billm) from comment #3) > Now I'm getting crashes on Mac as well: > https://treeherder.mozilla.org/logviewer. > html#?job_id=145718319&repo=try&lineNumber=3055 > > Not sure if it's the same thing you're seeing, Kats. Yeah, not sure either. > I don't see us running static analysis builds on Mac anymore. Maybe they > just regressed? That sounds plausible. I don't see static analysis builds on Mac either. I can try getting the windows build locally and try to figure out why that's crashing. Also - I downloaded the search index artifact from your try push above, and it is missing a lot of files. I suspect what's happening is that we're doing the build with sccache and it's optimizing away most of the compiles, so the indexer plugin never actually sees the code. We should probably disable sccache for the indexer jobs.
Yeah, we definitely need to disable sccache. I don't know how to do that though :-). Feel free to comment in bug 1418188 if you know how.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #4) > > I don't see us running static analysis builds on Mac anymore. Maybe they > > just regressed? > > That sounds plausible. I don't see static analysis builds on Mac either. I > can try getting the windows build locally and try to figure out why that's > crashing. It looks like static analysis builds were explicitly turned off in bug 1355118 because they're included in the cross-compile builds. And I've been trying to get windows building locally with clang as a low-priority background task but haven't had much success. Current issue I'm running into is bug 1402915 and I haven't gotten around to trying the workaround of downgrading MozillaBuild.
Ah, thanks for the pointer! I was running into the same issue, but I assumed it was just caused by my own incompetence with Windows.
> trying to get windows building locally with clang as a low-priority > background task but haven't had much success. Current issue I'm running into > is bug 1402915 and I haven't gotten around to trying the workaround of > downgrading MozillaBuild. This has been my foreground task lately; feel free to find me this week if I can help somehow.
I'm not in Austin this week :/ But I did try MozillaBuild 2.2.0 yesterday, and had the same problem. Not sure if it works properly with VC17 which is what I have installed; I noticed it had start-shell batch files for pre-17 versions but not 17. If you have any suggestions on things to try to work around bug 1402915 I'd be happy to try them.
Yeah, it's awkward. VS2017 is supposed to be auto-detected with vswhere.exe rather than start-shell files. And of course it isn't wired up correctly for clang-cl. As a workaround, I unzipped the vs2017_15.4.2.zip tooltool package locally, and in my mozconfig I pointed VSDIR there and sourced build/win64/mozconfig.vs2017. It seems to work ok.
s/VSDIR/VSPATH/
Made some progress here, I can reproduce the mozsearch plugin crash now. The first problem is that it hard-codes '/' as the path separator in a handful of places, while it's getting handed paths with '\' as the path separator. Classic windows problem. Some ifdef'ing took care of that problem. The mozsearch-enabled build continues further but crashes later at some point, haven't investigated that yet.
More progress, I got to the point where the mozsearch plugin mostly doesn't crash for me locally. (It got through most of the build, then I aborted it because it was hideously slow and I need to reboot). Try push at https://treeherder.mozilla.org/#/jobs?repo=try&revision=29fde2456ba99b0e9b9df6729f2085bb416ed323 also shows the plugin working, although the build fails and I need to investigate that.
A new try push including the macOS indexing run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=bf6ace4ce89e25b6232cb16540f1be92b6bdcbaa This provides a better picture of what's going wrong, there's a crash in the indexer plugin in some name mangling code? I'll investigate.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #14) > This provides a better picture of what's going wrong, there's a crash in the > indexer plugin in some name mangling code? I'll investigate. Looks like https://github.com/llvm-project/llvm-project-20170507/blob/cc8efd4748506684199ad4ee919efa1547439920/clang/lib/AST/MicrosoftMangle.cpp#L1331 doesn't understand __is_convertible_to(U*, T*).
Did another try push just to see where things stand now: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6d5e80139ae42e49e96e5f957d17ea210d64198f Windows seems to be affected by bug 1427808. Mac seems to be mostly working; it failed at the end but I think that's just because the save-analysis stuff from the linux mozconfig [1] needs to be added to the mac mozconfig [2] [1] https://searchfox.org/mozilla-central/rev/ad36eff63e208b37bc9441b91b7cea7291d82890/browser/config/mozconfigs/linux64/debug-searchfox-clang#18 [2] https://searchfox.org/mozilla-central/rev/ad36eff63e208b37bc9441b91b7cea7291d82890/browser/config/mozconfigs/macosx64/debug-searchfox
Depends on: 1427808
Bug 1477792 enables the build for mac. Leaving this bug for windows.
Summary: Make clang plugin work in Windows/Mac → Make clang plugin work in Windows
https://treeherder.mozilla.org/logviewer.html#?job_id=198127438&repo=mozilla-central&lineNumber=9903 is a recent windows searchfox indexing job. Seems to have a static analysis failure.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3858c582c1b25f7a7323020100c666a5ace31632&selectedJob=200794574 - same as the last link. Nika, do you know what's going on here? This is a windows-build try push with the clang plugin (and the searchfox indexer plugin, which builds on top of that) and it's emitting what appears to be a static analysis error with already_AddRefed<nsIRunnable>.
Flags: needinfo?(nika)
I think you are running into bug 1492743; glandium is having similar problems trying to upgrade everything to clang 7.
Indeed, that seems likely.
Depends on: 1492743
Flags: needinfo?(nika)
With bug 1492743 resolved by the backout of bug 1443265, I did another try push to see how things stand. I got a different error [1]: Cannot instantiate '<foo>' with non-memmovable template argument '<bar>'. I don't know if that's covered by some other pre-existing bug. [1] https://treeherder.mozilla.org/logviewer.html#?job_id=203097551&repo=try&lineNumber=18037
That's bug 1438446.
Want to give bug 1438446 comment 4 a shot? I meant to try it myself but other work has come up.
Yup I'll give that a shot.
Depends on: 1438446
I have a patch in bug 1438446 to fix that problem. Next problem [1] is that llvm doesn't seem to know how to mangle __is_convertible_to (what dmajor said in comment 16). So I guess next step here is to see if a newer llvm has that fixed, and if not, file a bug against llvm? Also it looks like we have multiple clang-cl toolchains, with the win64-clang-cl-st-an toolchain being at a newer revision than win64-clang-cl [2] so I'll try with that too. [1] https://treeherder.mozilla.org/logviewer.html#?job_id=203150512&repo=try&lineNumber=19975 [2] https://searchfox.org/mozilla-central/rev/3c85ea2f8700ab17e38b82d77cd44644b4dae703/taskcluster/ci/toolchain/windows.yml#5,47
Also FWIW my local --enable-mozsearch-plugin build didn't run into that __is_convertible_to error (it just completed after 2.5+ hours). The clang-cl I have in ~/.mozbuild/clang/bin is from tags/RELEASE_700/final 342383. Looks like our win64-clang-cl-st-an toolchain is from r317840 [1] so maybe that's not new enough. Although I have no idea how I got that version of clang in my ~/.mozbuild if it didn't come from a taskcluster artifact somewhere... [1] https://searchfox.org/mozilla-central/rev/3c85ea2f8700ab17e38b82d77cd44644b4dae703/build/build-clang/clang-win64-st-an.json#2
The compiler that bootstrap provides is the win64-clang-cl one from RELEASE_700 which is quite recent. The st-an builders are stuck on an ancient trunk revision due to bug 1427808, which I keep meaning to revisit.
Ok. It looks like the clang-tidy toolchain is also on RELEASE_700 so I'm trying that: https://treeherder.mozilla.org/#/jobs?repo=try&revision=97bce087786ff62c67d86b926836d67dedca0f34
What I said in comment 30 makes no sense. For some reason I thought win64-clang-cl was *older* than win64-clang-cl-st-an but it's not. So I guess the question is why did my local build with win64-clang-cl work fine, but it failed on try with (presumably) the same clang version?
As far as I can tell it's running exactly the same thing in my local build and on try, yet it fails on try and passes locally. I checked that all the command line arguments are the same (modulo location of my srcdir/objdir) and that I'm using the same SDK version (I was using 10.0.15063.0 initially, but I switched to 10.0.17134.0 to match try and got the same result). I looked at the preprocessed source for the unified .cpp file that clang-cl generates locally, and it does include wrl/client.h which has the is_convertible stuff. Gonna try again on latest master revisions to make sure those are in sync as well. Next step for debugging would be to try to get a windows loaner so I can get the preprocessed source from the try machine that's failing, and try it with my local compiler, and vice-versa. And just poke around on the loaner to see if there's anything different in the environment than what I have locally.
Same problem with latest master revisions; works locally but fails on try. https://treeherder.mozilla.org/#/jobs?repo=try&bugfiler=&group_state=expanded&selectedJob=203382003&revision=868883b7efd1dacfef96a15fe8bbd2e3067f656a I'm not too excited about getting a loaner since in the past that's usually turned out to be a waste of time. Going to put this on the back burner for now since I have more pressing webrender things to work on.
So I managed to work around the problem by downgrading the clang diagnostic from an error to a note. https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&revision=82e2aa09bb6ef19b6309f156a813b408d8783a23 If patching our win64-clang toolchain like this isn't cool I can fork that toolchain into a new win64-clang-mozsearch toolchain or something. But first I want to test changes on the searchfox side to pull down and index the windows code.
IMO it's fine to apply searchfox-needed patches to the main win64-clang toolchain as long as they don't pose a risk to Firefox builds (and I don't think this patch does).
Quick update: I cleaned up the searchfox-side code to download and merge the windows artifacts into the index. Ran into a few issues which I'll note here for posterity: - the taskcluster index server sometimes returns JSON files with 'Content-Encoding: gzip' and sometimes does not. Neither wget nor curl do content-decoding by default but eventually I discovered we can use curl --compressed to work around this. This spawned [1]. - the windows and linux analysis data disagree on whether opcode_table [2] should be no_crossref or not. Linux says not to crossref, windows says yes. This was causing an assertion failure at [3]. I think Linux is correct here but I just changed the merging code to handle this scenario and only set no_crossref if all inputs agree to no_crossref because I suppose this class of errors can arise again in the future if we are using different versions of clang on different platforms. - some of the JSON files in the windows analysis tarball are malformed. It looks like the file locking mechanism at [4] isn't working properly, since the output in the file had one line of analysis clobbering another, and the file seemed truncated. It might be some other problem but the rest of the code looks fine and the file locking seems like the most obvious place to start looking, since there's a windows-specific implementation that is different from the one used on mac/linux. So I'm trying to debug that now. [1] https://github.com/curl/curl/issues/3192 [2] https://searchfox.org/mozilla-central/rev/8848b9741fc4ee4e9bc3ae83ea0fc048da39979f/gfx/graphite2/src/inc/opcode_table.h#44 [3] https://github.com/mozsearch/mozsearch/blob/e71da650153a366d1beb6c41acdf500daa7e2dff/tools/src/file_format/analysis.rs#L114 [4] https://searchfox.org/mozilla-central/rev/8848b9741fc4ee4e9bc3ae83ea0fc048da39979f/build/clang-plugin/mozsearch-plugin/MozsearchIndexer.cpp#479
The malformed JSON analysis file problem was due to line endings. The JSONFormatter emits JSON with '\n' line endings, but we were reading/writing to disk in "text" mode so windows was adding \r to the line endings. And then we truncated the file to the length we expected with \n line endings so the file got lopped. Here's a try push with that fixed: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&selectedJob=208712687&revision=9a63bb8e98d2d85ffc1e37c3ee86023b38df0514 Hopefully there's no more problems lurking. One thing that's odd is how much the windows indexing time fluctuates. This try push took 174 minutes for the windows job, I've had previous runs that took 148 minutes and 106 minutes. With only minor changes to the indexer code, that's kind of surprising. Hopefully it settles at a stable value so we can trigger the searchfox cron at the right time.
This is necessary because: (a) the JSONFormatter emits a \n newline for each analysis line (b) we truncate the file to the expected length after writing it (c) on Windows writing the file in text mode replaces \n with \r\n and invalidates our computed "expected length"
While looking at this code I found a couple of places where errors could get ignored or silently discarded and result in corrupt data. This checks for the errors and fails harder. Depends on D10353
Also turn on saving of rust analysis for the windows searchfox job. Depends on D10355
Assignee: nobody → kats
Pushed by kgupta@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/82c181279b31 Read and write analysis files in binary mode. r=mccr8 https://hg.mozilla.org/integration/autoland/rev/1c5b63888650 Do more error checking. r=mccr8 https://hg.mozilla.org/integration/autoland/rev/b859b73b96df Add a patch to win64-clang to downgrade a mangling error. r=dmajor https://hg.mozilla.org/integration/autoland/rev/25749072215e Build the daily searchfox index on Windows as well. r=emilio
Depends on: 1504242
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: