Add a test platform for win64 ASan

RESOLVED FIXED in mozilla55

Status

Taskcluster
Task Configuration
RESOLVED FIXED
11 months ago
6 months ago

People

(Reporter: ting, Assigned: ting)

Tracking

(Blocks: 1 bug)

unspecified
mozilla55
Dependency tree / graph

Details

(Whiteboard: [stockwell disabled])

Attachments

(9 attachments, 2 obsolete attachments)

766 bytes, application/pgp-encrypted
Details
59 bytes, text/x-review-board-request
mshal
: review+
Details | Review
59 bytes, text/x-review-board-request
dustin
: review+
Details | Review
59 bytes, text/x-review-board-request
grenade
: review+
Details | Review
59 bytes, text/x-review-board-request
ted
: review+
Details | Review
59 bytes, text/x-review-board-request
ted
: review+
Details | Review
59 bytes, text/x-review-board-request
ted
: review+
Details | Review
59 bytes, text/x-review-board-request
grenade
: review+
Details | Review
59 bytes, text/x-review-board-request
glandium
: review+
Details | Review
I'd like to add a test platform for win64 ASan.  But for now, I guess we only want to initiate tests on Try because we haven't fixed any issues that ASan reports on Windows yet and we don't know how bad the current status is.  Once it's testable on Try, we can start to fix the failed tests.
I'm not sure, but I think I only need to touch:

  taskcluster/ci/test/test-platforms.yml
  taskcluster/ci/build/windows.yml

I don't know how to limit this to Try, though.
Assignee: nobody → janus926
Status: NEW → ASSIGNED
Another thing I am not sure about: I saw two x64 opt platforms on Treeherder, "Windows 8 x64 opt" and "Windows 2012 x64 opt". Which one should I pick for ASan, 2012?
:jlund, could you give me some hints on how to achieve this? Thank you.
Flags: needinfo?(jlund)

Comment 4

11 months ago
(In reply to Ting-Yu Chou [:ting] from comment #3)
> :jlund, could you give me some hints on how to achieve this? Thank you.

Hi,

two thoughts:

1. as far as I know, we don't have asan windows builds. asan builds are linux only.
2. our tier 1 windows tests are still in buildbot (old continuous integration tool) and in the midst of being ported to taskcluster (new continuous integration tool, currently tier 3)

Do you need windows asan builds? Is that a thing? Do you need this to be tier 1 or tier 3? With tier 3, you can run them, verify on try, and self-serve your own tests but they won't ever block and fail continuous-integration tests until windows on taskcluster becomes tier 1, at some point in 2017.
Flags: needinfo?(jlund) → needinfo?(janus926)
(In reply to Jordan Lund (:jlund) from comment #4)
> 1. as far as I know, we don't have asan windows builds. asan builds are
> linux only.

Yes, but now we can build it locally for Windows, see bug 1030826 comment 2.

> 2. our tier 1 windows tests are still in buildbot (old continuous
> integration tool) and in the midst of being ported to taskcluster (new
> continuous integration tool, currently tier 3)
> 
> Do you need windows asan builds? Is that a thing?

Yes, as Windows is the top platform of our user base.

> Do you need this to be
> tier 1 or tier 3? With tier 3, you can run them, verify on try, and self
> serve your own tests but they won't ever block and fail
> continuous-integration tests until windows on taskcluster becomes tier 1,
> some point in 2017.

For now, tier 3 is fine because we're just about to start fixing the errors that ASan reports. But once the errors are fixed, I'd prefer to promote it to tier 1 (whether or not taskcluster becomes tier 1).
Flags: needinfo?(janus926)

Comment 6

11 months ago
> Do you need windows asan builds? Is that a thing?

This bug is in support of _making_ Windows ASan be a thing. :)

Comment 7

11 months ago
(In reply to David Major [:dmajor] from comment #6)
> > Do you need windows asan builds? Is that a thing?
> 
> This bug is in support of _making_ Windows ASan be a thing. :)

apologies, I read the bug as you wanting to add a Windows ASan test not build. I'll try and point you in the right direction. I'll report back by EOD

Comment 8

11 months ago
(In reply to Jordan Lund (:jlund) from comment #7)
> (In reply to David Major [:dmajor] from comment #6)
> > > Do you need windows asan builds? Is that a thing?
> > 
> > This bug is in support of _making_ Windows ASan be a thing. :)
> 
> apologies, I read the bug as you wanting to add a Windows ASan test not
> build. I'll try and point you in the right direction. I'll report back by EOD

status update: I'm currently distracted with releases in flight (it's release week). Will report back tomorrow.

Comment 9

11 months ago
no update here, will sync up on monday after release work dies down.

Comment 10

11 months ago
disclaimer: I haven't done any work with regards to windows in this new world of things.

okay, so first stop is the official documentation: http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/index.html

:dustin owns and created `mach taskgraph` and would be a great resource for scheduling and for reviewing anything under taskcluster/*
:pmoore and :grenade are both great resources for actually setting up and running windows builds and for reviewing anything about windows builds/configs


my general debugging workflow prior to pushing to try:

use a combo of the following mach commands: http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/taskgraph.html#mach-commands

example debugging:
1. find most recent try (positive test) and mozilla-inbound (negative test) decision task: the 'D' task on treeherder within every push.
2. download from that task's artifacts, parameters.yml
3. mv that file to the root of a clean checkout of mozilla-central/inbound. experiment with the try message in parameters.yml.
4. run something like: `./mach taskgraph full --json --parameters parameters.yml > /tmp/clean_central_nightly_full.json`
5. make changes to ./taskcluster/*
6. run command from step 4 and name it something like dirty_central_nightly_full.json
7. diff the files
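
Step 7 can also be scripted instead of eyeballing a raw diff. The following is only a minimal sketch, assuming each JSON file written by `./mach taskgraph full --json` is a single object keyed by task label (an assumption about the output shape). The helper names `load_taskgraph` and `diff_task_labels` are made up for illustration.

```python
import json


def load_taskgraph(path):
    """Load a taskgraph JSON dump (assumed to map task label -> task)."""
    with open(path) as f:
        return json.load(f)


def diff_task_labels(clean, dirty):
    """Return (added, removed) task labels between two taskgraph dicts."""
    clean_labels = set(clean)
    dirty_labels = set(dirty)
    added = sorted(dirty_labels - clean_labels)      # only in the dirty graph
    removed = sorted(clean_labels - dirty_labels)    # only in the clean graph
    return added, removed
```

With the two files from steps 4 and 6 loaded via `load_taskgraph`, `diff_task_labels(clean, dirty)` lists which tasks a change under ./taskcluster/* added or dropped.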


As to how this can be done, I would worry about getting 'builds' working first, then add tests. So I would do something like the following while continuously going through debug steps 1-7 above every time I make changes to anything under ./taskcluster/* 

1. Schedule it (tell taskgraph automation 'what to run'):
    a. add a new windows variant (win32-asan/opt:) stanza here: https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/build/windows.yml
    b. make it similar to https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/build/linux.yml#176-198
    c. make it so it doesn't run on any tree by default but you can still specify it in try message via `try: -b o -p win32-asan` by adding to the stanza: `run-on-projects: []`
        * note: you could put `run-on-projects: try` in your local copy but if you are going to land on inbound and you don't want this win asan to run on every `try -b all` push, leave it at `run-on-projects: []`
    d. while standing these up, set them to tier 3 so it's ignored by sheriffs and won't block trees by putting the following in the stanza under treeherder: `tier: 3`

2. Define the steps (tell mozharness automation 'how to run it'):
   a. in the above steps you will specify `custom-build-variant-cfg: asan-tc` which gets read by mozharness here: https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/building/buildbase.py#348
   b. we now need to add a mozharness config similar to the linux asan-tc file:
      * under this dir: https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/builds/releng_sub_windows_configs
      * add a file similar to this linux one but make it windows specific: https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/builds/releng_sub_linux_configs/64_asan_tc.py


3. try it on try with try message that schedules the build!
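
The effect of `run-on-projects` described in step 1c can be pictured with a toy model. This is a sketch under stated assumptions, not the real taskgraph code: the function `is_scheduled` and its `requested_explicitly` flag are hypothetical, and the real target-task logic has more rules than this.

```python
def is_scheduled(run_on_projects, project, requested_explicitly=False):
    """Toy model of step 1c (hypothetical helper, not taskgraph's API).

    A task runs when it was asked for explicitly (e.g. named in a try
    message) or when the project is listed in its run-on-projects.
    """
    return requested_explicitly or project in run_on_projects


# run-on-projects: []  -> never scheduled by default on any tree
default_on_try = is_scheduled([], "try")
# ... but still available when the try message names the platform
explicit_on_try = is_scheduled([], "try", requested_explicitly=True)
```

In this model an empty list keeps the new build off every tree, including plain `try -b all` style pushes, while leaving it reachable on demand.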

Hopefully this helps get you started. Feel free to ping me here or in #releng with follow up questions

Updated

11 months ago
Component: General Automation → Task Configuration
Product: Release Engineering → Taskcluster
QA Contact: catlee
Thanks a lot for the detailed information, will let you know if I run into any troubles.
Just walked through the steps, #2 doesn't work as custom-build-variant-cfg is an invalid property for Windows [1], it is used [2] for Linux.

[1] https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/job/mozharness.py#175
[2] https://dxr.mozilla.org/mozilla-central/source/taskcluster/scripts/builder/build-linux.sh#119
(In reply to Ting-Yu Chou [:ting] from comment #12)
> Just walked through the steps, #2 doesn't work as custom-build-variant-cfg
> is an invalid property for Windows [1], it is used [2] for Linux.

I guess what Windows needs is something like:

  testing/mozharness/configs/builds/taskcluster_firefox_win64_asan_opt.py
  browser/config/mozconfigs/win64/opt-asan

and set taskcluster_firefox_win64_asan_opt.py in run/config under win64-asan/opt.
The generated taskgraph does not contain tests like:

  "test-win64-asan/opt-cppunit",
  "test-win64-asan/opt-crashtest",
  "test-win64-asan/opt-mochitest-1"

I'm not sure where to add those. Is it taskcluster/ci/test/test-platforms.yml? If it is, how should I modify the file? There are "windows10-64-vm/debug" and "windows10-64-vm/opt", but I don't see how/where they are used.
(In reply to Ting-Yu Chou [:ting] from comment #14)
> taskcluster/ci/test/test-platforms.yml? If it is, how should I modify the
> file? There're "windows10-64-vm/debug", "windows10-64-vm/opt" but I don't
> see how/where they are used.

I guess the file is correct, and they're applied according to the build-platform, so I should put something like:

  windows2012-64-asan/opt:
      build-platform: win64-asan/opt
      test-sets:
          - common-tests

and add "windows2012-64-asan" to WORKER_TYPE [1], but I don't know what the worker type is. BTW, I never understood why it is "Windows 2012" (e.g., Windows 2012 opt), do you know why?

[1] https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/tests.py#45
NI for comment 13 and comment 15.
Flags: needinfo?(rthijssen)
Flags: needinfo?(pmoore)
I'll defer to grenade/dustin/jlund on this, this is a bit outside my area of expertise.

I can help with any questions about how taskcluster Windows workers are set up, how they execute tasks, managing privileges (both OS/system level and taskcluster level) or provide help regarding setting up new environments, customising worker behaviour, making toolchains available, managing worker configuration etc.

Regarding the implementation of specific gecko builds, the configs used in-tree and how those configs are organised, would be mostly RelEng domain knowledge (e.g. grenade/jlund/Callek/catlee/coop/kmoir/bhearsum/nthomas/... etc - or #releng in IRC), or the in-tree task scheduling mechanics in gecko via mach is best known by dustin (although several people have actively contributed there and may also be able to help, such as jlund/ahal/grenade/...).

Sorry I couldn't give an absolute answer there, but hopefully the other contact details help.
Flags: needinfo?(pmoore)
:ting you are correct re the mozharness configs in comment 13.

regarding platforms: on taskcluster we currently do all *build* work on Windows Server 2012 (both 32 bit & 64 bit builds are run on a 64 bit Windows Server 2012 os, with environment configuration for x86/x86_64 compilation/linking) and run ui *test* suite work on Windows 7 (for testing of 32 bit binaries) and Windows 10 (for testing of 64 bit binaries).

so if you want to follow the current convention, asan 64 bit builds should run on Windows Server 2012 and any test suites against the binaries from those builds should run on Windows 10.

if tests require a GPU they run on windows7-32 or windows10-64
if tests do not require a GPU they run on windows7-32-vm or windows10-64-vm

i hope that answers your questions, feel free to ping me (:grenade) on #taskcluster for anything i missed or didn't explain properly
Flags: needinfo?(rthijssen)
Created attachment 8837406 [details] [diff] [review]
wip
Attachment #8837406 - Attachment is patch: true
Comment on attachment 8837406 [details] [diff] [review]
wip

Review of attachment 8837406 [details] [diff] [review]:
-----------------------------------------------------------------

I think you've got the right idea here, just be careful what you're enabling :)

::: taskcluster/ci/test/test-platforms.yml
@@ +139,5 @@
>  
> +windows10-64-asan/opt:
> +    build-platform: win64-asan/opt
> +    test-sets:
> +        - common-tests

Note that when this is green and the `run-on-projects: []` is removed, it will add a nontrivial additional testing load.  That might be OK, just plan ahead and talk to people with budget experience (like garndt), since it could be adding a number with a bunch of zeroes to the budget.

Also, please use the `mach taskgraph` subcommands to double-check that this is not going to introduce these tasks by default anyway.  I don't remember if the test jobs (which aren't tagged with `run-on-projects: []`) will run and pull in the build jobs they depend on.
The Try run [1] still has something different from what I expect:

a. I found the test log inside the build task [2]. How can I make it like linux64 asan, where each test is listed separately under "Linux x64 asan"?

b. My try run has:

     Windows 2012 x64 opt   tc[tier3] (Bo)
     Windows 2012 x64 debug tc[tier3] (Bd)

   but I expect something similar to linux:

     Linux x64 asan         tc (Bo Bd)

c. There's also:

     windows-2012-32 opt Cc[tier2] (Clang-Tidy ClangCL)

   So I guess I should follow the convention to be:

     windows-2012-64 asan

   but I don't know how to make it.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=2b543b299a9e85cf774558531c13061b4e948c51
[2] https://public-artifacts.taskcluster.net/bKW5yEyyT9GZ2Bcgp7ljvg/0/public/logs/live_backing.log
(In reply to Ting-Yu Chou [:ting] from comment #21)
> b. My try run has:
> 
>      Windows 2012 x64 opt   tc[tier3] (Bo)
>      Windows 2012 x64 debug tc[tier3] (Bd)
> 
>    but I expect something similar to linux:
> 
>      Linux x64 asan         tc (Bo Bd)

Fixed by setting the collection to asan, i.e. changing the attribute treeherder.platform from "windows2012-64/opt" to "windows2012-64/asan".
(In reply to Ting-Yu Chou [:ting] from comment #21)
> c. There's also:
> 
>      windows-2012-32 opt Cc[tier2] (Clang-Tidy ClangCL)
> 
>    So I guess I should follow the convention to be:
> 
>      windows-2012-64 asan
> 
>    but I don't know how to make it.

It seems that this is also set by treeherder.platform, taskcluster/ci/build/windows.yml uses "windows2012-64" but taskcluster/ci/toolchain/windows.yml uses "windows-2012-64". I guess I'll just follow the convention in taskcluster/ci/build/windows.yml.
(In reply to Ting-Yu Chou [:ting] from comment #21)
> The Try run [1] still has something different from what I expect:
> 
> a. I found the test log inside the build task [2], how should I make it like
> linux64 asan which each test are listed under "Linux x64 asan" separately?

Those seem to be xpcshell tests, which also exist in the linux x64 asan build, so I shouldn't worry about them.
The build task is stuck here; I am checking why:

06:12:30     INFO - mozmake[3]: Entering directory 'z:/build/build/src/obj-firefox/toolkit/mozapps/update/tests'
06:12:30     INFO - rm -f -rf ../../../../_tests/updater/ && z:/build/build/src/obj-firefox/_virtualenv/Scripts/python.exe z:/build/build/src/config/nsinstall.py -D ../../../../_tests/updater/test_bug473417-ó/
06:12:30     INFO - for i in TestAUSReadStrings1.ini TestAUSReadStrings2.ini TestAUSReadStrings3.ini ; do \
06:12:30     INFO -   z:/build/build/src/obj-firefox/_virtualenv/Scripts/python.exe z:/build/build/src/config/nsinstall.py -t z:/build/build/src/toolkit/mozapps/update/tests/$i ../../../../_tests/updater/test_bug473417-ó/; \
06:12:30     INFO - done
06:12:30     INFO - z:/build/build/src/obj-firefox/_virtualenv/Scripts/python.exe z:/build/build/src/config/nsinstall.py -t ../../../../_tests/xpcshell/toolkit/mozapps/update/tests/data/TestAUSReadStrings.exe ../../../../_tests/updater/test_bug473417-ó/
TestAUSReadStrings.exe gets stuck because it lacks the ASan runtime DLL.  I also found that the Linux ASan build uses clang++ to link while Windows uses link.exe; see bug 1307547 comments 2 and 3.
Ehsan, do you know what's the difference between using link.exe or clang.exe to link on windows?
Flags: needinfo?(ehsan)

Comment 28

10 months ago
(In reply to Ting-Yu Chou [:ting] from comment #27)
> Ehsan, do you know what's the difference between using link.exe or clang.exe
> to link on windows?

The main difference is that if you use clang.exe, it will construct the correct linker command line, including the static library needed to link against an ASan DLL, and otherwise you need to pass that to the linker manually in LDFLAGS.  Our build system just uses link.exe directly, for reasons I'm not sure of.  I think at some point I tried to change that for clang-cl builds but it was difficult since we add linker arguments to LDFLAGS that can't be passed directly to clang-cl, so for ASan builds we always used a mozconfig which sets up the ASan library flags, for example see the one posted in bug 1307561 comment 0.

I honestly think switching to use clang-cl.exe as the linker may be too much work for very little gain.  Perhaps we should add some code to sanitize.m4 to do that for us so that we don't need to modify the mozconfig files manually?
Flags: needinfo?(ehsan)
(In reply to :Ehsan Akhgari from comment #28)
> always used a mozconfig which sets up the ASan library flags, for example
> see the one posted in bug 1307561 comment 0.

Yes, I am doing the same here. But now the other *.exe, like obj-asan\toolkit\mozapps\update\tests\TestAUSReadStrings.exe needs clang_rt.asan_dynamic-x86_64.dll to run.

> work for very little gain.  Perhaps we should add some code to sanitize.m4
> to do that for us so that we don't need to modify the mozconfig files
> manually?

Yeah, I am thinking of either 1) stripping out "-fsanitize=address" for compilations that are not for firefox.exe or 2) somehow making sure the DLL can be found. I just need to figure out how to do either.

Comment 30

10 months ago
(In reply to Ting-Yu Chou [:ting] from comment #29)
> (In reply to :Ehsan Akhgari from comment #28)
> > always used a mozconfig which sets up the ASan library flags, for example
> > see the one posted in bug 1307561 comment 0.
> 
> Yes, I am doing the same here. But now the other *.exe, like
> obj-asan\toolkit\mozapps\update\tests\TestAUSReadStrings.exe needs
> clang_rt.asan_dynamic-x86_64.dll to run.

Ah I see what the problem is.  Did you package the build?  This is supposed to take care of copying the DLL to the packaged build: <http://searchfox.org/mozilla-central/rev/12cf11303392edac9f1da0c02e3d9ad2ecc8f4d3/browser/installer/package-manifest.in#801>

> > work for very little gain.  Perhaps we should add some code to sanitize.m4
> > to do that for us so that we don't need to modify the mozconfig files
> > manually?
> 
> Yeah, I am thinking to either 1) strip out "-fsanitize=address" for the
> compilation that is not for firefox.exe or 2) somehow let the dll can be
> found. Just I need to figure out how to do those.

I think (2) is a better solution.  See above.

If packaging the build before running the test isn't an option, we can set the $PATH environment variable to where that DLL is located.
(In reply to :Ehsan Akhgari from comment #30)
> > Yeah, I am thinking to either 1) strip out "-fsanitize=address" for the
> > compilation that is not for firefox.exe or 2) somehow let the dll can be
> > found. Just I need to figure out how to do those.
> 
> I think (2) is a better solution.  See above.
> 
> If packaging the build before running the test isn't an option, we can set
> the $PATH environment variable to where that DLL is located.

I see, will go ahead with (2). I've read MSDN on the DLL search order, and yes, setting $PATH seems the easiest way to work around it, but I'm not sure how to set it for all possible *.exe invocations. I'll try to figure it out.
I am not sure whether this question is for you, but how can I add a directory to the $PATH environment variable so that it is applied globally for tasks on a test machine, for instance one of the check-test targets in objdir/toolkit/mozapps/update/tests/Makefile?

I need it to be global because it is for locating a DLL.
Flags: needinfo?(pmoore)

Comment 33

10 months ago
(In reply to :Ehsan Akhgari from comment #30)
> If packaging the build before running the test isn't an option, we can set
> the $PATH environment variable to where that DLL is located.

How is this different from any other DLL that we rely on? (mozglue, win-apiset-whatever, etc.) Needing to touch $PATH seems like it's approaching the problem from the wrong angle.
The DLLs that Firefox.exe relies on seem to be located in objdir/dist/bin and will be copied to the package by the code in comment 30.  The problem now is that other executables, e.g. objdir/toolkit/mozapps/update/tests/TestAUSReadStrings.exe, also need clang_rt.asan_dynamic-x86_64.dll (which is already in objdir/dist/bin) when we run the tests.

Comment 35

10 months ago
I wonder if we could do any of:
- Don't ASan for programs that are not in the package (this only works if the programs don't depend on Firefox DLLs, but if they do, the programs should be in dist/bin already), or
- Copy these programs to dist/bin, or
- Copy the ASan DLL to the location of these programs

Setting $PATH still seems like a really big hammer...
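
The third option (copy the ASan DLL to the location of these programs) could look roughly like the following. This is a minimal sketch, not what the build system does: `copy_dll_next_to_programs` is a hypothetical helper, and the caller has to supply the DLL path and the list of executables.

```python
import shutil
from pathlib import Path


def copy_dll_next_to_programs(dll_path, program_paths):
    """Copy one DLL into the directory of each listed program.

    Returns the destination paths that were written. Directories that
    already hold the source DLL itself, or that were already handled,
    are skipped.
    """
    dll = Path(dll_path).resolve()
    seen_dirs = set()
    copied = []
    for program in program_paths:
        target_dir = Path(program).resolve().parent
        if target_dir == dll.parent or target_dir in seen_dirs:
            continue
        seen_dirs.add(target_dir)
        dest = target_dir / dll.name
        shutil.copy2(dll, dest)  # copy2 also preserves file metadata
        copied.append(dest)
    return copied
```

Because Windows searches the directory of the executable first when resolving DLLs, a copy placed next to each test program is found without touching $PATH.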
:pmoore, if you could also help clarify comment 35...
For check-test, $PATH seems to be the one in config.status, so adding "export PATH=$PATH:/dir/of/clang/asan/dll" in mozconfig would help.
I tend to agree with dmajor, that if the dll(s) exists already somewhere in the test's task directory (e.g. under `<task_dir>\objdir` rather than `<task_dir>\dist\bin`) that it might be best to move/copy/symlink those to an existing folder in the PATH during the test setup phase (or adapt whatever installs them in the first place, to put them in the preferred location).

If they are available at build time, but not at test time, it might be best to package them up with the tests (depending on how big they are) or make them available e.g. via tooltool so that they can be downloaded as part of the test setup.

If the required DLLs are installed globally on the system (e.g. because they are included in a system clang installation, rather than being somewhere inside the task folder) and we want to make these system DLLs available to all tasks, that would be a different matter, and we might be able to set the default task PATH to include them (so long as this is unlikely to break other tasks). However, ideally we try to avoid placing toolchains on the system, but prefer that tasks extract required toolchains inside their task folder, wherever possible, to avoid conflicts and difficult-to-manage system dependencies (with tasks having varying/conflicting requirements with toolchain versions etc).
Flags: needinfo?(pmoore)

Comment 39

10 months ago
(In reply to David Major [:dmajor] from comment #35)
> I wonder if we could do any of:
> - Don't ASan for programs that are not in the package (this only works if
> the programs don't depend on Firefox DLLs, but if they do, the programs
> should be in dist/bin already), or

This hides bugs that get triggered by our compiled tests.

> - Copy these programs to dist/bin, or
> - Copy the ASan DLL to the location of these programs

Either of these sound good to me.  Honestly whichever is the easiest to implement is better IMO.

(In reply to Pete Moore [:pmoore][:pete] from comment #38)
> If they are available at build time, but not at test time, it might be best
> to package them up with the tests (depending on how big they are) or make
> them available e.g. via tooltool so that they can be downloaded as part of
> the test setup.

tooltool isn't a good solution since the DLL in question comes from our toolchain which comes from the in-tree build scripts, which we download their binaries from toolchain.  :-)

> If the required DLLs are installed globally on the system (e.g. because they
> are included in a system clang installation, rather than being somewhere
> inside the task folder) and we want to make these system DLLs available to
> all tasks, that would be a different matter, and we might be able to set the
> default task PATH to include them (so long as this is unlikely to break
> other tasks). However, ideally we try to avoid placing toolchains on the
> system, but prefer that tasks extract required toolchains inside their task
> folder, wherever possible, to avoid conflicts and difficult-to-manage system
> dependencies (with tasks having varying/conflicting requirements with
> toolchain versions etc).

We shouldn't install them globally for the same reason (they can be different for different build jobs.)
So I just copied the ASan DLL to the folder of TestAUSReadStrings.exe for running check-test. Now the build task is green [1], but all the tests fail because the test tasks can't download:

  firefox-54.0a1.en-US.win64-asan.test_packages.json

which is not what the build task generated:

  firefox-54.0a1.en-US.win64.test_packages.json

It seems the packager [2] and taskcluster [3] do not use the same package name. How should I fix this? Should I 1) define MOZ_SIMPLE_PACKAGE_NAME somewhere, 2) fix the packager, or 3) fix taskcluster?

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=3f133d70d48a55cea39191af4c6a92e2d608ab19&filter-tier=1&filter-tier=2&filter-tier=3
[2] https://dxr.mozilla.org/mozilla-central/rev/5069348353f8fc1121e632e3208da33900627214/toolkit/mozapps/installer/package-name.mk#53
[3] https://dxr.mozilla.org/mozilla-central/rev/5069348353f8fc1121e632e3208da33900627214/taskcluster/taskgraph/transforms/job/mozharness_test.py#207
Flags: needinfo?(dustin)
I really don't know the answer to that, but it does seem like the packager is in the wrong here -- it's generating a file with a name that overlaps with a different build.  But then, that's not the part I know about, so you probably expected me to say it's the part that's wrong :)
Flags: needinfo?(dustin)
(In reply to Ting-Yu Chou [:ting] from comment #40)
> It seems packager [2] and taskcluster [3] does not use the same package
> name, how should I fix this? Should I 1) define MOZ_SIMPLE_PACKAGE_NAME
> somewhere, 2) fix the packager, or 3) fix the taskcluster?

I chose (3) because I don't see a way to pass the task's build_platform to the packager; also, it's not something that the packager really needs to know.

(In reply to David Major [:dmajor] from comment #35)
> I wonder if we could do any of:
> - Don't ASan for programs that are not in the package (this only works if
> the programs don't depend on Firefox DLLs, but if they do, the programs
> should be in dist/bin already), or
> - Copy these programs to dist/bin, or
> - Copy the ASan DLL to the location of these programs

Bug 1051190 is how we do it on Mac: it scans all the files and rewrites the path to the dynamic library if a file references the ASan dylib.
(In reply to Ting-Yu Chou [:ting] from comment #42)
> (In reply to Ting-Yu Chou [:ting] from comment #40)
> > It seems packager [2] and taskcluster [3] does not use the same package
> > name, how should I fix this? Should I 1) define MOZ_SIMPLE_PACKAGE_NAME
> > somewhere, 2) fix the packager, or 3) fix the taskcluster?
> 
> I chose (3) because I don't see a way to pass task's build_platform to the
> packager, also it's not something that the packager really need to know.

Found a better solution, adding "export MOZ_PKG_SPECIAL=asan" in mozconfig.
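
To illustrate why this fixes the mismatch: the two file names quoted in comment 40 differ only by an "-asan" suffix on the platform part. The sketch below is a simplified model of that naming, not the actual logic in package-name.mk; the function `packages_json_name` and its parameters are hypothetical.

```python
def packages_json_name(app, version, locale, platform, special=None):
    """Build a test_packages.json file name (simplified, hypothetical model).

    `special` stands in for the value of MOZ_PKG_SPECIAL; when set, it is
    appended to the platform part of the name.
    """
    target = f"{platform}-{special}" if special else platform
    return f"{app}-{version}.{locale}.{target}.test_packages.json"


# Name the build task generated before the mozconfig change:
without_special = packages_json_name("firefox", "54.0a1", "en-US", "win64")
# Name the test tasks were trying to download:
with_special = packages_json_name("firefox", "54.0a1", "en-US", "win64", "asan")
```

With the special suffix set, the generated name matches what the test tasks request, so nothing on the taskcluster side has to change.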
Depends on: 1343815
Try now has many timed out tests which I can't reproduce locally, for instance:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=4115362a53e3936095241a363ca8af985ede4f9c&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=81037599

I ran the test locally by:

  1. Put tooltool.py under C:\mozilla-build
  2. Add tooltool token for public/private download in C:\builds\relengapi.tok 
  3. Run C:\mozilla-build\start-shell-msvc2015-x64.bat
  4. Execute command "c:\\mozilla-build\\python\\python.exe -u mozharness\\scripts\\desktop_unittest.py --cfg mozharness\\configs\\unittests\\win_taskcluster_unittest.py --mochitest-suite=plain-chunked --no-read-buildbot-config --installer-url https://queue.taskcluster.net/v1/task/WvKBeUs6Qwu4RY101HzP0w/artifacts/public/build/firefox-54.0a1.en-US.win64-asan.zip --test-packages-url https://queue.taskcluster.net/v1/task/WvKBeUs6Qwu4RY101HzP0w/artifacts/public/build/firefox-54.0a1.en-US.win64-asan.test_packages.json --download-symbols ondemand --total-chunk=5 --this-chunk=2" in the shell

Since loaning a t-w1064-ix (bug 1343815) seems to be impossible(?), what other options do I have to debug this? Is the way that I tested locally correct? Or should I set up my Win10 machine following https://github.com/mozilla-releng/OpenCloudConfig/blob/master/userdata/Manifest/gecko-t-win10-64-gpu.json?
Flags: needinfo?(pmoore)
Created attachment 8843151 [details] [diff] [review]
wip v2
Attachment #8837406 - Attachment is obsolete: true
Or maybe the current status of running tests by taskcluster on windows10-64 is not green?
there are many win 10 tests that are not yet green.

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=windows10&group_state=expanded&filter-tier=3
^ this search shows the few win 10 tests that are regularly green.

if it helps, i can give you some credentials for the windows 10 gpu worker type that would allow you to rdp to the tc windows 10 instances that may help debug what's happening (it's a little complicated because the tasks run as a new worker each time, so it's hard to connect as the task user, but you can get on the instance as root or generic-worker to see what's on the filesystem, or run tests from the command prompt). ping me (grenade) on irc.
(In reply to Rob Thijssen (:grenade - GMT) from comment #47)
> there are many win 10 tests that are not yet green.

Yeah, I tried and most of them timed out. :(

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c79d609647bcc939405fe8e19753020c52182513&filter-tier=1&filter-tier=2&filter-tier=3&group_state=expanded

Are there any bugs filed for this?

> if it helps, i can give you some credentials for the windows 10 gpu worker
> type that would allow you to rdp to the tc windows 10 instances that may
> help debug what's happening (it's a little complicated because the tasks run
> as a new worker each time, so it's hard to connect as the task user, but you
> can get on the instance as root or generic-worker to see what's on the
> filesystem, or run tests from the command prompt). ping me (grenade) on irc.

That'd be very helpful, thanks for that. Will try to ping you, I'm in UTC+8.
Hi :ting, if you add a gpg/pgp fingerprint to your Mozilla phonebook entry, I can email you some credentials.
(In reply to Rob Thijssen (:grenade - GMT) from comment #49)
> Hi :ting, if you add a gpg/pgp fingerprint to your Mozilla phonebook entry,
> I can email you some credentials.

I messaged you on IRC already, but just in case you didn't receive it: I've added the fingerprint. Thank you.

Clear NI to :pmoore as I'm going to debug on a worker with :grenade's help.
Flags: needinfo?(pmoore)
Created attachment 8844431 [details]
loan-t-win10-64-gpu-01.gpg

Thanks Ting! Credentials attached.
Seems it takes forever to finish the tests on the loan machine:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=f73f52f68a602d134efbecc0c4471f6c021bd635&filter-tier=1&filter-tier=2&filter-tier=3&group_state=expanded
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=910732ee0f2a008bf99c6c707fdda013803c2029&filter-tier=1&filter-tier=2&filter-tier=3&group_state=expanded
 
So I cancelled the jobs and decided to remote desktop to the instance directly, but I was unable to connect (ping shows 100% packet loss). The tests that did finish are green, though, unlike on "aws-provisioner-v1/gecko-t-win10-64-gpu".

Are there any differences between the open cloud config files loan-t-win10-64-gpu-01.json and gecko-t-win10-64-gpu.json? And how should I address the network issue? Is the instance still alive?
main difference is that loan-t-win10-64-gpu-01 is restricted to a single instance at any moment, whereas gecko-t-win10-64-gpu will have multiple running simultaneously. this is why large numbers of jobs assigned to the loaner will take longer to complete than when assigned to the regular worker type (i can up the instance limit if you need that). instances are spawned based on demand (number of pending jobs assigned to that worker type). when an instance is idle for 15 minutes or so, it is terminated (unless an active rdp session is detected) and its ip address is released back to the pool. you can keep the instance alive by keeping an rdp session open.

- to avoid queuing up all the tests, the try syntax can be changed (to eg: try: -b o -p win64 -u cppunit -t none)
- to spin up a new instance without waiting for a new build to complete, individual tests can also be retriggered from the treeherder interface
- individual tasks can be edited (inspect task > task details > Edit and recreate). you can even run arbitrary commands using this mechanism (by changing one or more of the commands in the payload)

i'm happy to go through this in a vidyo session if that's useful and practicable with our time zone differences. i spend a lot of time debugging windows taskcluster worker issues so i'm happy to share what i've learned
(In reply to Rob Thijssen (:grenade - GMT) from comment #53)
> - to spin up a new instance without waiting for a new build to complete,
> individual tests can also be retriggered from the treeherder interface

I retriggered test M1 for both try runs in comment 53 yesterday before I left the office, but they're still pending now. I'm not sure how long it takes to spin up a new instance.

> i'm happy to go through this in a vidyo session if that's useful and
> practicable with our time zone differences. i spend a lot of time debugging
> windows taskcluster worker issues so i'm happy to share what i've learned

That'd be great, but can we do it later, after I've been able to rdp to an instance and reproduce the timeouts?
Attachment #8843151 - Attachment is obsolete: true
The patches can still be reviewed while I check why the tests time out on Treeherder, because the timeouts don't seem to be related to them.

Comment 62

9 months ago
mozreview-review
Comment on attachment 8845791 [details]
Bug 1333003 part 3 - Add mozharness configs for Windows x64 ASan build jobs.

https://reviewboard.mozilla.org/r/118936/#review120970
Attachment #8845791 - Flags: review?(rthijssen) → review+

Comment 63

9 months ago
mozreview-review
Comment on attachment 8845789 [details]
Bug 1333003 part 1 - Add Windows x64 ASan mozconfigs to the tree.

https://reviewboard.mozilla.org/r/118932/#review121060

r+ assuming the filename is changed.

::: browser/config/mozconfigs/win64/opt-asan:1
(Diff revision 1)
> +MOZ_AUTOMATION_L10N_CHECK=0

macosx64 and linux64 both name the opt version "nightly-asan" instead of "opt-asan". Please rename this to "nightly-asan" for consistency.

::: browser/config/mozconfigs/win64/opt-asan:6
(Diff revision 1)
> +MOZ_AUTOMATION_L10N_CHECK=0
> +MOZ_AUTOMATION_PACKAGE_TEST=0
> +
> +. "$topsrcdir/build/mozconfig.win-common"
> +. "$topsrcdir/browser/config/mozconfigs/common"
> +

Both the macosx64 and linux64 nightly-asan configs explicitly pass in --disable-debug. Do we need to do that here? Or why not?
Attachment #8845789 - Flags: review?(mshal) → review+

Comment 64

9 months ago
mozreview-review
Comment on attachment 8845790 [details]
Bug 1333003 part 2 - Enable ASan builds and tests on Windows x64.

https://reviewboard.mozilla.org/r/118934/#review121122

::: taskcluster/ci/test/test-platforms.yml:147
(Diff revision 1)
>  #        - windows-gpu-tests
>  
> +windows10-64-asan/opt:
> +    build-platform: win64-asan/opt
> +    test-sets:
> +        - common-tests

No tests for debug -- is that intentional?

::: taskcluster/taskgraph/transforms/gecko_v2_whitelist.py:88
(Diff revision 1)
>      'win64-st-an-opt',
>      'win64-qr-debug',
>      'win64-qr-opt',
> +    'win64-asan-debug',
> +    'win64-asan-opt',
> +

stray newline
Attachment #8845790 - Flags: review?(dustin) → review+
(Assignee)

Comment 71

9 months ago
mozreview-review-reply
Comment on attachment 8845790 [details]
Bug 1333003 part 2 - Enable ASan builds and tests on Windows x64.

https://reviewboard.mozilla.org/r/118934/#review121122

> No tests for debug -- is that intentional?

I just followed what linux64-asan does, which doesn't have tests for the debug build either. I guess that's intentional because 1) we care more about ASan issues on opt builds, and 2) we don't want to add too much testing load.

> stray newline

Fixed.
(Assignee)

Comment 72

9 months ago
mozreview-review-reply
Comment on attachment 8845789 [details]
Bug 1333003 part 1 - Add Windows x64 ASan mozconfigs to the tree.

https://reviewboard.mozilla.org/r/118932/#review121060

> Both the macosx64 and linux64 nightly-asan configs explicitly pass in --disable-debug. Do we need to do that here? Or why not?

I thought "--disable-debug" was the default setting. I've added it and "--enable-optimize" back, as in the macosx64 and linux64 configs.
(In reply to Ting-Yu Chou [:ting] from comment #54)
> (In reply to Rob Thijssen (:grenade - GMT) from comment #53)
> > - to spin up a new instance without waiting for a new build to complete,
> > individual tests can also be retriggered from the treeherder interface
> 
> I retriggered test M1 for both try runs in comment 53 yesterday before I
> left the office, but they're still pending now. Not sure how long does it
> take to spin up a new instance.

I still can't manage to spin up a new instance; the retriggered jobs stayed pending until deadline-exceeded.
Flags: needinfo?(rthijssen)
yes, apologies. i haven't been able to figure it out yet either. there were some missing scopes on the new worker type, but after getting those added, i also failed to get one to spin up. will investigate again today as folks with access to provisioner logs come online.
Flags: needinfo?(rthijssen)

Comment 82

9 months ago
mozreview-review
Comment on attachment 8848373 [details]
Bug 1333003 part 7 - Add jittest-chunked to the suites so the tests are run.

https://reviewboard.mozilla.org/r/121272/#review123810
Attachment #8848373 - Flags: review?(rthijssen) → review+

Comment 83

9 months ago
mozreview-review
Comment on attachment 8845792 [details]
Bug 1333003 part 4 - Package the binary of llvm-symbolizer also on Windows.

https://reviewboard.mozilla.org/r/118938/#review123992
Attachment #8845792 - Flags: review?(ted) → review+

Comment 84

9 months ago
mozreview-review
Comment on attachment 8845793 [details]
Bug 1333003 part 5 - Include ASan runtime dll in common.tests.zip.

https://reviewboard.mozilla.org/r/118940/#review124000

::: python/mozbuild/mozbuild/action/test_archive.py:512
(Diff revision 3)
>          }
>          for path in set(generated_harness_files) - packaged_paths:
>              entry['patterns'].append(path[len('_tests') + 1:])
>          extra_entries.append(entry)
>  
> +        if buildconfig.defines['MOZ_ASAN'] and buildconfig.substs['CLANG_CL']:

Given that all the other package definitions are in the data structures at the top of the file, I think this code would be better at the top-level right after the declaration of `ARCHIVE_FILES`. You could keep your existing conditional here, but then just do
`ARCHIVE_FILES['common'].append(asan_dll)`.

I guess we haven't needed conditional test package contents like this in this file before.
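
A minimal sketch of the restructuring suggested above, assuming `ARCHIVE_FILES` maps archive names to lists of entry dicts. The `FakeBuildConfig` class, the `.get()` lookups, and the entry fields are hypothetical stand-ins so the snippet runs outside a build; the real module would read `buildconfig` directly and use whatever fields the patch already defines.

```python
# Hypothetical stand-in for the buildconfig module, so this runs standalone.
class FakeBuildConfig(object):
    defines = {'MOZ_ASAN': True}
    substs = {'CLANG_CL': True}


buildconfig = FakeBuildConfig()

# Simplified stand-in for the real ARCHIVE_FILES declaration.
ARCHIVE_FILES = {
    'common': [
        {'source': 'objdir', 'base': '_tests', 'pattern': 'modules/**'},
    ],
}

# Placed at top level, right after the ARCHIVE_FILES declaration,
# instead of inside the function that builds extra_entries.
if buildconfig.defines.get('MOZ_ASAN') and buildconfig.substs.get('CLANG_CL'):
    # Entry fields below are illustrative placeholders, not the patch's values.
    asan_dll = {
        'source': 'objdir-bin',
        'base': '',
        'pattern': 'clang_rt.asan_dynamic-x86_64.dll',
        'dest': 'bin',
    }
    ARCHIVE_FILES['common'].append(asan_dll)
```

Keeping the conditional next to the declaration means every package definition stays visible in one place, which appears to be the point of the review comment.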
Attachment #8845793 - Flags: review?(ted) → review+

Comment 92

9 months ago
mozreview-review
Comment on attachment 8845794 [details]
Bug 1333003 part 6 - Fix test scripts to run ASan on Windows.

https://reviewboard.mozilla.org/r/118942/#review124918

::: build/automation.py.in:237
(Diff revision 3)
>      env.setdefault('R_LOG_LEVEL', '6')
>      env.setdefault('R_LOG_DESTINATION', 'stderr')
>      env.setdefault('R_LOG_VERBOSE', '1')
>  
>      # ASan specific environment stuff
> -    if self.IS_ASAN and (self.IS_LINUX or self.IS_MAC):
> +    if self.IS_ASAN:

I don't think any of our test harnesses are still using automation.py, so just leave this part out of the patch. (We haven't removed it because it's still used by remoteautomation.py for Android tests.)

::: testing/mozbase/mozrunner/mozrunner/utils.py:172
(Diff revision 3)
>                  log.info("TEST-UNEXPECTED-FAIL | runtests.py | Failed to find"
>                           " ASan symbolizer at %s" % llvmsym)
>  
>              # Returns total system memory in kilobytes.
> -            # Works only on unix-like platforms where `free` is in the path.
> +            if mozinfo.isWin:
> +                totalMemory = int(

I probably would have used ctypes to call something like `GetPhysicallyInstalledSystemMemory` ( https://msdn.microsoft.com/en-us/library/windows/desktop/cc300158(v=vs.85).aspx ), but I guess this is similar to what's already here.
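
For reference, a rough sketch of the ctypes approach mentioned above. It is not what the patch does; the function name `installed_memory_kb` is made up for illustration, and the sketch simply returns None on non-Windows platforms or if the call fails.

```python
import ctypes
import sys


def installed_memory_kb():
    """Return physically installed RAM in kilobytes, or None if unavailable."""
    if sys.platform != 'win32':
        # The API only exists on Windows; callers must handle None.
        return None
    kilobytes = ctypes.c_ulonglong(0)
    # BOOL GetPhysicallyInstalledSystemMemory(PULONGLONG TotalMemoryInKilobytes)
    ok = ctypes.windll.kernel32.GetPhysicallyInstalledSystemMemory(
        ctypes.byref(kilobytes))
    return kilobytes.value if ok else None


total = installed_memory_kb()
print(total)
```

The value is already in kilobytes, which matches the unit named in the comment in utils.py, so no conversion would be needed.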

::: testing/xpcshell/runxpcshelltests.py:945
(Diff revision 3)
>  
>          usingASan = "asan" in self.mozInfo and self.mozInfo["asan"]
>          usingTSan = "tsan" in self.mozInfo and self.mozInfo["tsan"]
>          if usingASan or usingTSan:
>              # symbolizer support
> -            llvmsym = os.path.join(self.xrePath, "llvm-symbolizer")
> +            llvmsym = os.path.join(

Can you file a followup to make runcppunittests.py and runxpcshelltests.py use something from mozrunner.utils here instead of duplicating this code? It looks like we already have the code they need in `test_environment`, but maybe we need to split that out into smaller functions for these harnesses to use.
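
One possible shape for such a shared helper, as a sketch only. The function names, the `is_windows` parameter, and the ".exe" suffix handling are assumptions for illustration; the actual code in `test_environment` may differ.

```python
import os


def llvm_symbolizer_path(xre_path, is_windows):
    """Build the expected llvm-symbolizer path inside the XRE directory."""
    # Assumption: the Windows binary carries an .exe suffix.
    name = 'llvm-symbolizer.exe' if is_windows else 'llvm-symbolizer'
    return os.path.join(xre_path, name)


def asan_symbolizer_env(xre_path, is_windows, env=None):
    """Return a copy of env with ASAN_SYMBOLIZER_PATH set if the binary exists."""
    env = dict(env or {})
    llvmsym = llvm_symbolizer_path(xre_path, is_windows)
    if os.path.isfile(llvmsym):
        env['ASAN_SYMBOLIZER_PATH'] = llvmsym
    return env
```

Each harness could then call the small helper instead of repeating the path logic, leaving harness-specific logging to the caller.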
Attachment #8845794 - Flags: review?(ted) → review+
(Assignee)

Comment 100

9 months ago
mozreview-review-reply
Comment on attachment 8845794 [details]
Bug 1333003 part 6 - Fix test scripts to run ASan on Windows.

https://reviewboard.mozilla.org/r/118942/#review124918

> Can you file a followup to make runcppunittests.py and runxpcshelltests.py use something from mozrunner.utils here instead of duplicating this code? It looks like we already have the code they need in `test_environment`, but maybe we need to split that out into smaller functions for these harnesses to use.

Filed bug 1349858.
(In reply to Ting-Yu Chou [:ting] from comment #44)
> Try now has many timed out tests which I can't reproduce locally, for

This will be followed up in bug 1349420 as it is not related to the patches here.

Comment 102

9 months ago
Pushed by tchou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a89806ba0faa
part 1 - Add Windows x64 ASan mozconfigs to the tree. r=mshal
https://hg.mozilla.org/integration/autoland/rev/911cc14899c8
part 2 - Enable ASan builds and tests on Windows x64. r=dustin
https://hg.mozilla.org/integration/autoland/rev/40fcebfabb33
part 3 - Add mozharness configs for Windows x64 ASan build jobs. r=grenade
https://hg.mozilla.org/integration/autoland/rev/d88370d20b83
part 4 - Package the binary of llvm-symbolizer also on Windows. r=ted
https://hg.mozilla.org/integration/autoland/rev/42cf5ddabc8a
part 5 - Include ASan runtime dll in common.tests.zip. r=ted
https://hg.mozilla.org/integration/autoland/rev/18fd8676751a
part 6 - Fix test scripts to run ASan on Windows. r=ted
https://hg.mozilla.org/integration/autoland/rev/a796423751ce
part 7 - Add jittest-chunked to the suites so the tests are run. r=grenade
backed out for bustage like https://treeherder.mozilla.org/logviewer.html#?job_id=85864006&repo=autoland
Flags: needinfo?(janus926)

Comment 104

9 months ago
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/57d7eb9414f7
Backed out changeset a796423751ce 
https://hg.mozilla.org/integration/autoland/rev/2950eaa868a9
Backed out changeset 18fd8676751a 
https://hg.mozilla.org/integration/autoland/rev/1c3b8fbae73b
Backed out changeset 42cf5ddabc8a 
https://hg.mozilla.org/integration/autoland/rev/86339ccb672f
Backed out changeset d88370d20b83 
https://hg.mozilla.org/integration/autoland/rev/7260d62166ff
Backed out changeset 40fcebfabb33 
https://hg.mozilla.org/integration/autoland/rev/e116d10d2150
Backed out changeset 911cc14899c8 
https://hg.mozilla.org/integration/autoland/rev/36b934545efb
Backed out changeset a89806ba0faa for bustage
Relanding, thanks for backing out.
Flags: needinfo?(janus926)

Comment 109

9 months ago
Pushed by tchou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d5d4112599f2
part 1 - Add Windows x64 ASan mozconfigs to the tree. r=mshal
https://hg.mozilla.org/integration/autoland/rev/375e952bd738
part 2 - Enable ASan builds and tests on Windows x64. r=dustin
https://hg.mozilla.org/integration/autoland/rev/5715b15e33c0
part 3 - Add mozharness configs for Windows x64 ASan build jobs. r=grenade
https://hg.mozilla.org/integration/autoland/rev/70114135bd8c
part 4 - Package the binary of llvm-symbolizer also on Windows. r=ted
https://hg.mozilla.org/integration/autoland/rev/1ba027abdfc9
part 5 - Include ASan runtime dll in common.tests.zip. r=ted
https://hg.mozilla.org/integration/autoland/rev/400d409ba4ca
part 6 - Fix test scripts to run ASan on Windows. r=ted
https://hg.mozilla.org/integration/autoland/rev/3d2b2eeda8d3
part 7 - Add jittest-chunked to the suites so the tests are run. r=grenade
I had to back these out again for failures like https://treeherder.mozilla.org/logviewer.html#?job_id=85909851&repo=autoland

https://hg.mozilla.org/integration/autoland/rev/4e945c008ca2
Flags: needinfo?(janus926)
This one is odd, and I couldn't reproduce it locally...
I retriggered a few failed jobs, but all are pending. I wonder whether the test machines are still alive.
(In reply to Wes Kocher (:KWierso) from comment #110)
> I had to back these out again for failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=85909851&repo=autoland
> 
> https://hg.mozilla.org/integration/autoland/rev/4e945c008ca2

:grenade, I am not sure whom to ask about this, but do you have any ideas?

My runs were OK yesterday, but retriggering the same job today produced the error:

  OK yesterday:
    https://treeherder.mozilla.org/#/jobs?repo=try&revision=e02c27228db675ea25c93f84cf4cc4cd94e63414&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=85859637
  NG today:
    https://treeherder.mozilla.org/#/jobs?repo=try&revision=e02c27228db675ea25c93f84cf4cc4cd94e63414&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=86107015
Flags: needinfo?(janus926) → needinfo?(rthijssen)
However, retriggering the jobs of a normal win64 opt build works fine:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb735ba3af81466643771e13f69bf56e1e4e14df&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=86108046

The only difference I can tell is the size of installer.zip: the ASan build is ~125MB and the normal build is ~57MB. That can't explain why unzipping it worked yesterday but not today, though.
I don't have a good explanation, but I suspect something is wrong with the worker configuration rather than with the build or test configuration. Apologies. I'll be keeping an eye on the win 10 instances in the coming days to try to get to the bottom of this.

I ran the same test from a cmd prompt on one of the workers today and failed to reproduce the error. I also retriggered the same job (https://treeherder.mozilla.org/#/jobs?repo=try&revision=e02c27228db675ea25c93f84cf4cc4cd94e63414&filter-tier=1&filter-tier=2&filter-tier=3&filter-searchStr=x8&selectedJob=86599976) and had no problem with the installer.zip unzip stage which makes me wonder if something silly happened (like perhaps, full Z: drives). Unfortunately the system event logs in papertrail from the 23rd, when these instances were active had already been purged, when I checked today.

The Papertrail system event logs are available at: https://papertrailapp.com/groups/2488493/events?q=i-047f8a847853629f1
where i-047f8a847853629f1 in the url example above is the instance id taken from the beginning of the relevant build log.
Flags: needinfo?(rthijssen)
(In reply to Wes Kocher (:KWierso) from comment #110)
> I had to back these out again for failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=85909851&repo=autoland

Retriggered two jobs and they're green:

  https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=3d2b2eeda8d3001452d7c3f2ffc249d8a5a96e09&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=86564141
  https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=3d2b2eeda8d3001452d7c3f2ffc249d8a5a96e09&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=86564204

Also, based on comment 116, this isn't caused by the patches here. I'll reland once the last patch, which adds the ASan runtime dll to the jsshell package, gets reviewed.

Comment 118

9 months ago
mozreview-review
Comment on attachment 8851440 [details]
Bug 1333003 part 8 - Include ASan runtime dll and LLVM symbolizer in jsshell package.

https://reviewboard.mozilla.org/r/123728/#review127082

::: toolkit/mozapps/installer/upload-files.mk:79
(Diff revision 1)
>  ifdef WIN_UCRT_REDIST_DIR
>    JSSHELL_BINS += $(notdir $(wildcard $(DIST)/bin/api-ms-win-*.dll))
>    JSSHELL_BINS += ucrtbase.dll
>  endif
>  
> +ifeq (11,$(MOZ_ASAN)$(CLANG_CL))

You can just use ifdef MOZ_CLANG_RT_ASAN_LIB_PATH here.

This makes me realize we should be using the same variable in browser/installer/package-manifest.in instead of using a wildcard there. Could you do that while you're there? (You'll need to add a DEFINE in browser/installer/Makefile.in for that)
Attachment #8851440 - Flags: review?(mh+mozilla)

Comment 120

8 months ago
mozreview-review
Comment on attachment 8851440 [details]
Bug 1333003 part 8 - Include ASan runtime dll and LLVM symbolizer in jsshell package.

https://reviewboard.mozilla.org/r/123728/#review128834

One thought I just had: you should check with e.g. gerv whether this requires distributing some file in the jsshell zip to indicate the licenses for those things (likely including, in fact, the license of the js shell itself)...
Attachment #8851440 - Flags: review?(mh+mozilla) → review+
NI for comment 120.
Flags: needinfo?(gerv)
If you are adding stuff to the tree which is under its own license, you should certainly add documentation to indicate what that license is. If it's a binary blob, you should say where to find the source code.

Does that answer the question?

Gerv
Flags: needinfo?(gerv)
(In reply to Gervase Markham [:gerv] from comment #122)
> If you are adding stuff to the tree which is under its own license, you
> should certainly add documentation to indicate what that license is. If it's
> a binary blob, you should say where to find the source code.
> 
> Does the answer the question?
> 
> Gerv

I'm not adding anything to the tree, only some binaries to the archives that we generate and that can be publicly downloaded. I don't know if that makes any difference.

I don't find similar documentation in the Firefox package for the llvm binaries either (browser/installer/package-manifest.in). Because of this, I'll land the patches here for now and file a follow-up bug for the documentation.

Comment 124

8 months ago
Pushed by tchou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4c95b56c85aa
part 1 - Add Windows x64 ASan mozconfigs to the tree. r=mshal
https://hg.mozilla.org/integration/autoland/rev/ce6d44c6917e
part 2 - Enable ASan builds and tests on Windows x64. r=dustin
https://hg.mozilla.org/integration/autoland/rev/94c0bfa8d7aa
part 3 - Add mozharness configs for Windows x64 ASan build jobs. r=grenade
https://hg.mozilla.org/integration/autoland/rev/454033fe3a68
part 4 - Package the binary of llvm-symbolizer also on Windows. r=ted
https://hg.mozilla.org/integration/autoland/rev/015a440d870e
part 5 - Include ASan runtime dll in common.tests.zip. r=ted
https://hg.mozilla.org/integration/autoland/rev/3fcc92d0dcb5
part 6 - Fix test scripts to run ASan on Windows. r=ted
https://hg.mozilla.org/integration/autoland/rev/9ab7778f16d9
part 7 - Add jittest-chunked to the suites so the tests are run. r=grenade
https://hg.mozilla.org/integration/autoland/rev/060bca004d79
part 8 - Include ASan runtime dll and LLVM symbolizer in jsshell package. r=glandium

Comment 125

8 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/4c95b56c85aa
https://hg.mozilla.org/mozilla-central/rev/ce6d44c6917e
https://hg.mozilla.org/mozilla-central/rev/94c0bfa8d7aa
https://hg.mozilla.org/mozilla-central/rev/454033fe3a68
https://hg.mozilla.org/mozilla-central/rev/015a440d870e
https://hg.mozilla.org/mozilla-central/rev/3fcc92d0dcb5
https://hg.mozilla.org/mozilla-central/rev/9ab7778f16d9
https://hg.mozilla.org/mozilla-central/rev/060bca004d79
Status: ASSIGNED → RESOLVED
Last Resolved: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
This got set to Tier 3 on Treeherder so it doesn't show up by default. It showed too many errors.
The build job is set to tier 3, and I don't see them on treeherder by default:

  https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=3c68d659c2b715f811708f043a1e7169d77be2ba

Could you elaborate?
Flags: needinfo?(aryx.bugmail)
(In reply to Ting-Yu Chou [:ting] from comment #123)
> Though I don't find similar documents in firefox package for llvm binaries
> either (browser/installer/package-manifest.in). Because of this, I'll land
> the patches here for now, and file a follow up bug for the documentation.

Filed bug 1354355.

Comment 129

8 months ago
937 failures in 170 pushes (5.512 failures/push) were associated with this bug yesterday.   

Repository breakdown:
* autoland: 937

Platform breakdown:
* windows10-64: 937

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1333003&startday=2017-04-06&endday=2017-04-06&tree=all
They're actually hidden by default, not tier-3. You'll need to toggle the "Excluded Jobs" button in Treeherder's header to see them: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=3c68d659c2b715f811708f043a1e7169d77be2ba&filter-searchStr=win%20asan&exclusion_profile=false
What's the rule for hiding jobs under "Excluded Jobs"? Is it something like tests triggered by tier-3 builds?
Sorry for confusing Tier-3 with Excluded Jobs. By default, Treeherder shows Tier-1 (if these jobs fail, the patch gets backed out) and Tier-2 (if a patch causes failures, it doesn't require an immediate backout, but a prompt fix is appreciated, otherwise the job might get hidden). Because the Windows 10 x64 asan tests were mass failing, they got hidden.
Flags: needinfo?(aryx.bugmail)
Whiteboard: [stockwell disabled]
Depends on: 1354273

Comment 133

8 months ago
938 failures in 867 pushes (1.082 failures/push) were associated with this bug in the last 7 days. 

This is the #1 most frequent failure this week. 

** This failure happened more than 75 times this week! Resolving this bug is a very high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 1 week, the affected test(s) may be disabled. **  

Repository breakdown:
* autoland: 938

Platform breakdown:
* windows10-64: 938

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1333003&startday=2017-04-03&endday=2017-04-09&tree=all
(In reply to Jordan Lund (:jlund) from comment #10)
>     c. make it so it doesn't run on any tree by default but you can still
> specify it in try message via `try: -b o -p win32-asan` by adding to the
> stanza: `run-on-projects: []`
>         * note: you could put `run-on-projects: try` in your local copy but
> if you are going to land on inbound and you don't want this win asan to run
> on every `try -b all` push, leave it at `run-on-projects: []`

I did this because I intended to have win64-asan only on try, but the builds/tests are still always run for inbound and central. Did I miss anything, or what else could have gone wrong?
Flags: needinfo?(dustin)
Status: RESOLVED → REOPENED
Flags: needinfo?(dustin)
Resolution: FIXED → ---
I've updated the exclusion profile so these jobs are now only *visible* on try. Every other repository has them hidden by default.
(In reply to Dustin J. Mitchell [:dustin] from comment #20)
> Also, please use the `mach taskgraph` subcommands to double-check that this
> is not going to introduce these tasks by default anyway.  I don't remember
> if the test jobs (which aren't tagged with `run-on-projects: []`) will run
> and pull in the build jobs they depend on.

Indeed, that's exactly what's happening here.  The target task selection phase is not selecting the build tasks.  However, it is selecting the test tasks, and those require the builds, so it is performing the builds anyway.

Fixing this is a little ugly -- in taskcluster/ci/test/tests.yml there are `run-on-projects` defined for each test suite.  They're already broken out for a number of platforms in most cases (`by-test-platform`).  You could accomplish the exclusion by adding `windows10-64-asan/opt` to the alternatives for each one, with value `[]`.
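
The effect described above can be illustrated with a toy model. This is not taskgraph's actual code; the task labels, the dict layout, and the function names are hypothetical, and the 'all' alias is used here only as a stand-in for a suite that runs everywhere by default.

```python
def target_tasks(tasks, project):
    """Select tasks whose run-on-projects includes the project (or 'all')."""
    return {
        label for label, task in tasks.items()
        if project in task['run-on-projects'] or 'all' in task['run-on-projects']
    }


def with_dependencies(tasks, selected):
    """Add every task that a selected task transitively depends on."""
    result = set()
    stack = list(selected)
    while stack:
        label = stack.pop()
        if label in result:
            continue
        result.add(label)
        stack.extend(tasks[label]['dependencies'])
    return result


# Hypothetical labels: the build opts out, the test does not.
tasks = {
    'build-win64-asan/opt': {'run-on-projects': [], 'dependencies': []},
    'test-windows10-64-asan/opt-cppunit': {
        'run-on-projects': ['all'],
        'dependencies': ['build-win64-asan/opt'],
    },
}

selected = target_tasks(tasks, 'autoland')
scheduled = with_dependencies(tasks, selected)
print(sorted(selected))
print(sorted(scheduled))
```

In this model the build is never selected as a target, yet it is still scheduled because the selected test depends on it. Giving the test task an empty `run-on-projects` as well leaves both sets empty, which mirrors the proposed fix.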
No longer depends on: 1354273
Depends on: 1355359
Will address comment 136 in bug 1355359.
Status: REOPENED → RESOLVED
Last Resolved: 8 months ago
Resolution: --- → FIXED
FWIW, this is my local mozconfig:

export CC=clang-cl.exe
export CXX=clang-cl.exe
export LLVM_SYMBOLIZER="/c/w/tools/clang/r293859.x64/bin/llvm-symbolizer.exe"
ac_add_options --enable-address-sanitizer
ac_add_options --enable-debug-symbols
ac_add_options --disable-install-strip
ac_add_options --disable-jemalloc
ac_add_options --disable-crashreporter
ac_add_options --disable-profiling
ac_add_options --target=x86_64-pc-mingw32
ac_add_options --host=x86_64-pc-mingw32
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-asan
export LDFLAGS="clang_rt.asan_dynamic-x86_64.lib clang_rt.asan_dynamic_runtime_thunk-x86_64.lib"
CLANG_LIB_DIR="$(cd /c/w/tools/clang/r293859.x64/lib/clang/5.0.0/lib/windows && pwd -W)"
mk_add_options "export LIB=$LIB;$CLANG_LIB_DIR"
export MOZ_CLANG_RT_ASAN_LIB_PATH="${CLANG_LIB_DIR}/clang_rt.asan_dynamic-x86_64.dll"