Closed Bug 752004 (winclang) Opened 12 years ago Closed 6 years ago

Allow building Firefox with clang-cl on Windows

Categories

(Firefox :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: ehsan.akhgari, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(1 file)

      No description provided.
Depends on: 752008
Depends on: 752034
I assume the bug summary should be more like "allow building Firefox with Clang on Windows"? The goal is not to change our official builds or anything like that, just to make it possible?
That's correct!
Summary: Build Firefox with Clang on Windows → Allow building Firefox with Clang on Windows
Depends on: 882766
Depends on: 882770
Depends on: 882779
Depends on: 1021378
Depends on: 1021494
Depends on: 1021290
Summary: Allow building Firefox with Clang on Windows → Allow building Firefox with clang-cl on Windows
Depends on: 1022033
Depends on: 1022043
Depends on: 945582
Depends on: 1022049
Depends on: 1022050
Depends on: 1022348
Depends on: 1022349
Depends on: 1022420
Depends on: 1023058
Depends on: 1023449
Depends on: 1024097
Depends on: 1024195
Depends on: 1024459
Depends on: 1024463
Depends on: 1024465
Depends on: 1024713
Depends on: 1024833
Depends on: 1024836
Depends on: 1024842
Depends on: 1025143
Depends on: 1025324
Depends on: 1025393
Depends on: 1025900
Depends on: 1025906
Depends on: 1026129
Depends on: 1026461
Depends on: 1026718
Depends on: 1027323
Depends on: 1028613
Depends on: 1028633
Depends on: 1028679
Depends on: 1028680
Depends on: 1028684
Depends on: 1028944
Depends on: 1028945
Blocks: winasan
Depends on: 1031349
Depends on: 1032520
Depends on: 1032528
Depends on: 1032621
Depends on: 1033171
Depends on: 1033457
No longer depends on: 1032621
Depends on: 1033887
Depends on: 1034094
Depends on: 1010371
Depends on: 1034415
Depends on: 1032328
Depends on: 1034927
Depends on: 1034930
Depends on: 1036068
Depends on: 1036542
Depends on: 1038148
Depends on: 1038149
Depends on: 1038150
Depends on: 1038152
Depends on: 1038155
Depends on: 1038156
Depends on: 1038158
Depends on: 1038164
Depends on: 1038166
Depends on: 1038170
Depends on: 1038171
Depends on: 1038187
Depends on: 1038189
Depends on: 1038190
Depends on: 1038193
Depends on: 1038195
Depends on: 1038196
Depends on: 1038200
Depends on: 1038202
Depends on: 1038204
Depends on: 1038210
Depends on: 1038212
Depends on: 1038213
Depends on: 1038219
Depends on: 1038221
Depends on: 1038411
Depends on: 1038492
Depends on: 1039459
Depends on: 1039838
Depends on: 1039841
Depends on: 1039843
Depends on: 1039845
Depends on: 1040030
Depends on: 1040031
Depends on: 1040037
Depends on: 1040038
Depends on: 1040039
Depends on: 1040040
Depends on: 1040041
Depends on: 1040042
Depends on: 1040174
Depends on: 1041325
Depends on: 1041326
Depends on: 1042132
Depends on: 1047614
Depends on: 1060918
Depends on: 1061269
Depends on: 1061274
Depends on: 1063413
Depends on: 1067404
Depends on: 1068193
Depends on: 1068195
Depends on: 1068201
Depends on: 1080937
Depends on: 1080965
Depends on: 1080968
Depends on: 1081414
Depends on: 1083616
Depends on: 1084414
Depends on: 1089613
Depends on: winclang-werror
Depends on: 1090512
Depends on: 1109841
Depends on: 1117031
Depends on: 1119225
No longer depends on: 1068195
Depends on: 1167846
Depends on: 1185686
Depends on: 1186934
Depends on: 1187043
Depends on: 1188045
Blocks: 1193452
Depends on: 1196370
Depends on: 1201122
Depends on: 1201205
Depends on: 1203096
Depends on: 1202026
Depends on: 1202022
Depends on: 1232765
Depends on: 1232772
Depends on: 1233542
Brief update on this:

I spent some quality time debugging things yesterday and discovered that the entire JS engine was being compiled with MSVC while the rest of Gecko was being compiled with clang-cl; that resulted in bug 1233542.  In an ideal world, this shouldn't have mattered.  What appears to be happening is that template function instantiations produce different code under MSVC and clang-cl (unsurprising), and the linker selects which version ends up in libxul.  (Combining compilation units with different implementations of a particular template instantiation is undefined behavior, I believe.)  Substituting clang-cl's version for MSVC's version causes no end of problems, which is a little surprising; I'm not entirely certain whether this is a bug in clang-cl.  Fixing the template instantiations to only happen once would be...difficult, I think, and also bad for performance.

Even once bug 1233542 is fixed, we still fall back to MSVC for some JS engine files due to https://llvm.org/bugs/show_bug.cgi?id=25875, and that causes the same problems as before, because some template function instantiations differ between MSVC and clang-cl.

We still fall back to MSVC on a (very) few Gecko files, but those are straightforward to fix.

Most of the dependent bugs I've been filing have to do with reducing warning spam; once all those patches are in the tree, we might be able to compile Firefox with clang-cl on try (since the logs will no longer overflow due to massive warning spam).
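
To see which warnings dominate the spam, one could tally diagnostics by flag from a saved build log. The Python sketch below is illustrative only: it assumes clang-style diagnostics that end with the controlling flag in brackets (e.g. `[-Wunused-variable]`), and `count_warnings` is a hypothetical helper, not part of the Mozilla build system.

```python
import re
from collections import Counter

# Assumed diagnostic shape (clang-style), for example:
#   foo.cpp(12,3): warning: unused variable 'x' [-Wunused-variable]
WARNING_RE = re.compile(r"warning:\s.*\[(-W[^\]\s]+)\]")


def count_warnings(log_lines):
    """Count warnings per -W flag across an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def report(log_lines, top=10):
    """Return the `top` most frequent warning flags as (flag, count) pairs."""
    return count_warnings(log_lines).most_common(top)
```

Calling `report(open("build.log"))` (the log file name is a placeholder) would list the noisiest flags first, which suggests which warning to tackle next.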
Depends on: 1233732
Depends on: 1233768
Depends on: 1233981
Depends on: 1233983
Depends on: 1234860
The simple startup crashes from comment 4 will be resolved once bug 1234860 winds its way onto mozilla-central.  We still crash on startup, though, with the assert added in bug 1177819.  We don't hit that assert in an MSVC-compiled Gecko, so I'm not quite sure what's going on yet.

Painting performance in a clang-cl-compiled Gecko also appears to be terrible; the UI doesn't paint properly, and the painting appears to be somewhat slow in any event.  One more thing to debug.
Depends on: 1236577
(In reply to Nathan Froyd [:froydnj] from comment #5)
> Painting performance in a clang-cl-compiled Gecko also appears to be
> terrible; the UI doesn't paint properly, and the painting appears to be
> somewhat slow in any event.  One more thing to debug.

Can you screenshot the painting problem?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
> (In reply to Nathan Froyd [:froydnj] from comment #5)
> > Painting performance in a clang-cl-compiled Gecko also appears to be
> > terrible; the UI doesn't paint properly, and the painting appears to be
> > somewhat slow in any event.  One more thing to debug.
> 
> Can you screenshot the painting problem?

Attached.  The browser chrome doesn't get painted, and the content is only painted if you mouseover it.  Once the mouse moves outside of the browser window, the content area is painted black again.

Ctrl-L and typing a URL does seem to work, but eventually hits the crashes about native anonymous content.
(In reply to Nathan Froyd [:froydnj] from comment #5)
> The simple startup crashes from comment 4 will be resolved once bug
> 1234860 winds its way onto mozilla-central.  We still crash on startup,
> though, with the assert added in bug 1177819.  We don't hit that assert in
> an MSVC-compiled Gecko, so I'm not quite sure what's going on yet.

I think the easiest way to deal with this issue is to figure out a way to fix bug 1209680 and find the mis-compiled object file(s)...
Depends on: 1228410
Depends on: 1237476
Depends on: 1242722
Depends on: 1243455
Depends on: 1243617
I have managed to successfully build with clang-cl x86-64 both debug and optimized.  Basic browsing on Windows Server 2012 seems to work just fine.
(In reply to :Ehsan Akhgari from comment #9)
> I have managed to successfully build with clang-cl x86-64 both debug and
> optimized.  Basic browsing on Windows Server 2012 seems to work just fine.

But I'm able to get crashes browsing to yahoo.com on Windows 7.
Depends on: 1243861
Depends on: 1243918
Depends on: 1245328
Nathan, have you been seeing the painting issue with a recent build?
Flags: needinfo?(nfroyd)
Depends on: 1245053
No longer depends on: 1228410
(In reply to :Ehsan Akhgari from comment #11)
> Nathan, have you been seeing the painting issue with a recent build?

I haven't tried in a couple of weeks.  I was doing 32-bit builds and I see that your recent successes have been with 64-bit.  Have you tried the 32-bit versions as well?
Flags: needinfo?(nfroyd)
I built a 32-bit build and it seems to work fine. I'm running the mochitest browser-chrome now and it seems to be passing.
Depends on: 1246333, 1246334
Depends on: 1246549
Depends on: 1246550
Here is a try push with Windows x86 and x86-64 builds and tests in both debug and optimized: <https://treeherder.mozilla.org/#/jobs?repo=try&revision=f787abf6f20f>  The x86 debug build timed out, and the rest succeeded, but we are failing quite a few tests.
Depends on: 1243233
Depends on: 1251226
Depends on: 1251587
Depends on: 1251936
Depends on: 1254807
Depends on: 1255210
Depends on: 1255211
No longer depends on: 1255210
Depends on: 1290530
Depends on: 1296737
Depends on: 1296739
Depends on: 1296742
Depends on: 1296746
Depends on: 1298132
Depends on: 1298134
Depends on: 1298144
Depends on: 1298149
Depends on: 1298151
Depends on: 1298171
Depends on: 1298383
Depends on: 1298387
Depends on: 1298403
Depends on: 1298412
Depends on: 1298418
Depends on: 1298462
Depends on: 1298466
Depends on: 1298470
Depends on: 1298472
Depends on: 1299145
Depends on: 1300124
Depends on: 1255210
With only minimal changes and a patch or two still under review, I can build Firefox with clang-cl.  A --disable-optimize build seems OK, while an optimized build still shows the black painting problems in the screenshot attached to this bug (e.g. the newtab page is black everywhere there isn't a visible element).  Is there an obvious place to start looking for graphics bugs like that?
Flags: needinfo?(jmuizelaar)
(In reply to Nathan Froyd [:froydnj] from comment #15)
> With only minimal changes and a patch or two still under review, I can build
> Firefox with clang-cl.  A --disable-optimize build seems OK, while
> an optimized build still shows the black painting problems in the screenshot
> attached to this bug.  (e.g. the newtab page is black everywhere there isn't
> a visible element)  Is there an obvious place to start looking for graphics
> bugs like that?

This is a build without any fallback to MSVC cl, right? I don't have any real suggestions of where to start debugging it. I can try to reproduce and debug the problem next week.

However, you could try to reduce the issue by bisecting the compilation, i.e. have a wrapper around the compiler that adds -O2 only when the file being compiled is in a given list of files. With that infrastructure in place, you can reduce that list using http://delta.tigris.org/. Once we have the compilation unit narrowed down, we can probably decompose -O2 into individual passes and figure out exactly which pass is causing the change. That should give us an idea of what's going wrong.
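
The bisection wrapper described above could look roughly like the following Python sketch. Everything named here is an assumption for illustration: the script interface (list file, real compiler, then the original arguments), the helper names `rewrite_args` and `reduce_list`, and the use of `-O2`/`-Od` as the flags to swap (check what your build actually passes). `reduce_list` is a naive one-at-a-time reduction standing in for the delta tool, not its algorithm.

```python
import os
import subprocess
import sys

# Assumption: these are the optimization flags the build may pass; adjust to match.
OPT_FLAGS = {"-O1", "-O2", "-Ox", "-Od", "/O1", "/O2", "/Ox", "/Od"}
SOURCE_EXTS = (".c", ".cc", ".cpp", ".cxx")


def norm(path):
    """Normalize a path so list entries and command-line paths compare equal."""
    return os.path.normcase(os.path.normpath(path))


def rewrite_args(args, optimized, opt_flag="-O2", noopt_flag="-Od"):
    """Strip existing optimization flags, then optimize only listed sources."""
    sources = [a for a in args if a.lower().endswith(SOURCE_EXTS)]
    wanted = any(norm(s) in optimized for s in sources)
    kept = [a for a in args if a not in OPT_FLAGS]
    return kept + [opt_flag if wanted else noopt_flag]


def reduce_list(files, still_broken):
    """Drop each file whose removal still reproduces the bug (one rebuild per file)."""
    kept = list(files)
    for f in list(kept):
        trial = [x for x in kept if x != f]
        if still_broken(trial):
            kept = trial
    return kept


def main(argv):
    # Hypothetical usage: wrapper.py <list-file> <real-compiler> <args...>
    list_path, compiler, *args = argv[1:]
    with open(list_path) as fh:
        optimized = {norm(line.strip()) for line in fh if line.strip()}
    return subprocess.call([compiler] + rewrite_args(args, optimized))


if __name__ == "__main__" and len(sys.argv) >= 3:
    sys.exit(main(sys.argv))
```

Here `still_broken` would be a callback that rebuilds with the trial list and checks whether the black painting still shows up; whatever survives the reduction is the candidate set of mis-optimized files.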
Flags: needinfo?(jmuizelaar)
Depends on: 1305536
Depends on: 1305819
Depends on: 1307812
Depends on: 1311175
I tried to build m-c with clang-cl (r284701), and it failed with the error "cannot use throw with exceptions disabled" in crashreporter/jsoncpp/src/lib_json/json_value.cpp. Should I fall back to a version prior to r242176 [1], or are we following the latest clang?

[1] http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20150713/133143.html
(In reply to Ting-Yu Chou [:ting] from comment #17)
> I tried to build m-c with clang-cl (r284701), and it failed with error
> "cannot use throw with exceptions disabled" in
> crashreporter/jsoncpp/src/lib_json/json_value.cpp. Should I fall back to a
> version prior to r242176 [1], or are we following the latest clang?
> 
> [1]
> http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20150713/133143.html

We are (mostly) following the latest clang, with the caveat that m-c is not buildable by default (bug 1298418).  So you have two options:

1. Use an earlier version of clang-cl for now; or
2. Figure out what we need to do to make things work with r242176.
(In reply to Ting-Yu Chou [:ting] from comment #17)
> I tried to build m-c with clang-cl (r284701), and it failed with error
> "cannot use throw with exceptions disabled" in
> crashreporter/jsoncpp/src/lib_json/json_value.cpp. Should I fall back to a
> version prior to r242176 [1], or are we following the latest clang?
> 
> [1]
> http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20150713/133143.html

Actually, how does this break for you?  What is the command line you're using?  We don't pass -f{no-}exceptions to clang-cl, so I'm at a bit of a loss to see how this commit affects the build.
Flags: needinfo?(janus926)
Depends on: 1312309
Depends on: 1312313
Depends on: 1312543
Depends on: 1312549
Just to give an idea of how far we are from being buildable with clang-cl: with clang r284701, m-c revision 215f96861176 [1] is buildable if bugs 1312313, 1312309, 1311175, and 1298418 are addressed.

[1] https://hg.mozilla.org/mozilla-central/rev/215f9681176
Does anyone know whether we have a clang-cl build in automation for monitoring the buildable status?
(In reply to Ting-Yu Chou [:ting] from comment #22)
> Does anyone know whether we have a clang-cl build in automation for
> monitoring the buildable status?

I am working on that exact thing this quarter.
Depends on: 1314625
Depends on: 1314762
Depends on: 1316116
Depends on: 1316120
Depends on: 1316168
Depends on: 1318376
Depends on: 1319003
Depends on: 1321334
Depends on: 1321378
Depends on: 1321379
Depends on: 1321444
Depends on: 1321453
No longer depends on: 1321453
Depends on: 1321651
Depends on: 1321875
Depends on: 1324103
Depends on: 1324105
Depends on: 1324106
Depends on: 1324110
Depends on: 1335991
Depends on: 1340588
Depends on: 1343149
See Also: → linker-lld
Depends on: 1402915
Blocks: 1341525
Blocks: WinLTO
Depends on: 1422368
Depends on: 1423799
Depends on: 1427808
Blocks: 1429455
No longer blocks: 1429455
Depends on: 1439762
No longer depends on: 1324105
Considering that bug 1443590 is fixed, I'm going to have to conclude that this is fixed as well. :)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Depends on: 1474856
Depends on: 1476000
Blocks: 1485093