Closed Bug 1360120 Opened 7 years ago Closed 6 years ago

Promote win64 asan builds to tier2

Categories

(Core :: General, defect)

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox62 --- fixed

People

(Reporter: ting, Assigned: away)

Attachments

(2 files)

I talked to :gkw, who told me that the fuzzing team would prefer win64-asan to be 1) in tier 2 and 2) green on general tests before they do any fuzzing.

I've read https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy, but I'm still not sure whether there is anything else I need to do to move the win64-asan builds, whose tests are currently disabled on non-try trees, to tier 2.
Is there anything I should do to make the win64-asan builds tier 2? I don't see any linux64-asan information in https://developer.mozilla.org/docs/Mozilla/QA/Automated_testing.
Blocks: winasan
Flags: needinfo?(wkocher)
These tasks are defined in https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/build/windows.yml#239-277

To get them running on non-try, I think you just need to remove the "run-on-projects" lines. This should get them showing up anywhere that patch lands, as tier3. Once you've shown that they meet the criteria to become tier2, just change the "tier" lines to 2.
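
The two steps described above can be sketched in Python on an in-memory copy of a task definition. This is only an illustration: the dictionary layout (a top-level "run-on-projects" key and a "tier" key nested under "treeherder") and the helper names `enable_everywhere` and `promote` are assumptions, not the actual contents of windows.yml.

```python
import copy

# Hypothetical, simplified stand-in for the win64-asan/opt entry in windows.yml.
# The real task definition has more keys and may be laid out differently.
task = {
    "description": "Win64 Opt ASAN",
    "run-on-projects": [],        # assumed: restricts where the task runs by default
    "treeherder": {"tier": 3},    # assumed location of the "tier" line
}


def enable_everywhere(task_def):
    """Step 1: return a copy without the "run-on-projects" restriction."""
    result = copy.deepcopy(task_def)
    result.pop("run-on-projects", None)
    return result


def promote(task_def, tier=2):
    """Step 2: return a copy with the Treeherder tier changed."""
    result = copy.deepcopy(task_def)
    result["treeherder"]["tier"] = tier
    return result


step1 = enable_everywhere(task)   # shows up wherever the patch lands, still tier 3
step2 = promote(step1)            # after the tier 2 criteria are met
print(step1)
print(step2)
```

Keeping the two steps separate mirrors the suggestion above: the job first runs visibly as tier 3, and the tier change is a second, independent edit.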
Flags: needinfo?(wkocher)
I am not going to enable the tests on non-try trees until bug 1326419 and bug 1347793 are fixed. What I want is to make the win64-asan builds tier 2 (only the builds). It's unclear to me what criteria I need to meet.
Depends on: 1360650
Depends on: 1361256
No longer depends on: 1360650
Once bug 1361256 lands, the build task will be back to green. I read https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy but I'm still not sure what else I should do before changing the win64-asan/opt task in windows.yml to tier 2. Could you shed some light on this?

Note that the win64-asan/opt build currently only triggers tests on Try (bug 1355359).
Flags: needinfo?(ryanvm)
For Tier 2, they need to meet:
* Has an active owner
* Usable job logs
* Has sufficient documentation

As outlined in the doc you linked to. Do they? :)
Flags: needinfo?(ryanvm) → needinfo?(janus926)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #5)
> For Tier 2, they need to meet:
> * Has an active owner

That's me.

> * Usable job logs

For now there is only a build task for win64-asan on non-Try, and I expect the compiler and build infrastructure to output usable logs. Before enabling tests on non-Try, I will make sure ASan outputs proper symbols when it catches an error.

> * Has sufficient documentation

I updated https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Firefox_and_Address_Sanitizer, so pushing a win64-asan build to Try is documented. I'll keep updating it in the future whenever needed. As for https://developer.mozilla.org/docs/Mozilla/QA/Automated_testing, I don't see any ASan-related information there, so it probably doesn't need an update.

Do those meet the requirements? I am not so sure, because most items in the wiki seem to be for test jobs.
Flags: needinfo?(janus926) → needinfo?(ryanvm)
Sounds fine to me as long as symbolification is working properly.
Flags: needinfo?(ryanvm)
Depends on: 1360650
Depends on: 1373562
Depends on: 1400754
(In reply to Ryan VanderMeulen [:RyanVM] from comment #7)
> Sounds fine to me as long as symbolification is working properly.

Symbolification works as of bug 1360650 landed, e.g., https://treeherder.mozilla.org/logviewer.html#?job_id=156314560&repo=try&lineNumber=1924:

08:20:51     INFO -  2 INFO TEST-START | devtools/client/animationinspector/test/browser_animation_animated_properties_displayed.js
08:20:55     INFO -  GECKO(1188) | =================================================================
08:20:55    ERROR -  GECKO(1188) | ==4464==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x0038a77fb314 at pc 0x7ffb781475a8 bp 0x0038a77fb220 sp 0x0038a77fb268
08:20:55     INFO -  GECKO(1188) | WRITE of size 4 at 0x0038a77fb314 thread T0
08:20:55     INFO -  GECKO(1188) | ==4464==WARNING: Failed to use and restart external symbolizer!
08:20:56     INFO -  GECKO(1188) |     #0 0x7ffb781475a7 in mozilla::PresShell::Paint z:\build\build\src\layout\base\PresShell.cpp:6334
08:20:56     INFO -  GECKO(1188) |     #1 0x7ffb77611588 in nsViewManager::ProcessPendingUpdatesPaint z:\build\build\src\view\nsViewManager.cpp:480
08:20:56     INFO -  GECKO(1188) |     #2 0x7ffb7760ff0e in nsViewManager::ProcessPendingUpdatesForView z:\build\build\src\view\nsViewManager.cpp:412
08:20:56     INFO -  GECKO(1188) |     #3 0x7ffb77614741 in nsViewManager::ProcessPendingUpdates z:\build\build\src\view\nsViewManager.cpp:1102
08:20:56     INFO -  GECKO(1188) |     #4 0x7ffb78081b4e in nsRefreshDriver::Tick z:\build\build\src\layout\base\nsRefreshDriver.cpp:2046
08:20:56     INFO -  GECKO(1188) |     #5 0x7ffb780928b0 in mozilla::RefreshDriverTimer::TickRefreshDrivers z:\build\build\src\layout\base\nsRefreshDriver.cpp:306
08:20:56     INFO -  GECKO(1188) |     #6 0x7ffb78092486 in mozilla::RefreshDriverTimer::Tick z:\build\build\src\layout\base\nsRefreshDriver.cpp:328
08:20:56     INFO -  GECKO(1188) |     #7 0x7ffb78096bbf in mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers z:\build\build\src\layout\base\nsRefreshDriver.cpp:769
08:20:56     INFO -  GECKO(1188) |     #8 0x7ffb78095c63 in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver z:\build\build\src\layout\base\nsRefreshDriver.cpp:682
08:20:56     INFO -  GECKO(1188) |     #9 0x7ffb78095566 in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync z:\build\build\src\layout\base\nsRefreshDriver.cpp:583
08:20:56     INFO -  GECKO(1188) |     #10 0x7ffb78a884e0 in mozilla::layout::VsyncChild::RecvNotify z:\build\build\src\layout\ipc\VsyncChild.cpp:68
08:20:56     INFO -  GECKO(1188) |     #11 0x7ffb70a728be in mozilla::layout::PVsyncChild::OnMessageReceived z:\build\build\src\obj-firefox\ipc\ipdl\PVsyncChild.cpp:155
08:20:56     INFO -  GECKO(1188) |     #12 0x7ffb708e9f16 in mozilla::ipc::PBackgroundChild::OnMessageReceived z:\build\build\src\obj-firefox\ipc\ipdl\PBackgroundChild.cpp:1812
08:20:56     INFO -  GECKO(1188) |     #13 0x7ffb704c0892 in mozilla::ipc::MessageChannel::DispatchAsyncMessage z:\build\build\src\ipc\glue\MessageChannel.cpp:2110
08:20:56     INFO -  GECKO(1188) |     #14 0x7ffb704bd948 in mozilla::ipc::MessageChannel::DispatchMessageW z:\build\build\src\ipc\glue\MessageChannel.cpp:2040
08:20:56     INFO -  GECKO(1188) |     #15 0x7ffb704bf255 in mozilla::ipc::MessageChannel::RunMessage z:\build\build\src\ipc\glue\MessageChannel.cpp:1886
08:20:56     INFO -  GECKO(1188) |     #16 0x7ffb704bf765 in mozilla::ipc::MessageChannel::MessageTask::Run z:\build\build\src\ipc\glue\MessageChannel.cpp:1919
08:20:56     INFO -  GECKO(1188) |     #17 0x7ffb6f52dbe8 in nsThread::ProcessNextEvent z:\build\build\src\xpcom\threads\nsThread.cpp:1040
08:20:56     INFO -  GECKO(1188) |     #18 0x7ffb6f54d956 in NS_ProcessNextEvent z:\build\build\src\xpcom\threads\nsThreadUtils.cpp:517
08:20:56     INFO -  GECKO(1188) |     #19 0x7ffb704c8a26 in mozilla::ipc::MessagePump::Run z:\build\build\src\ipc\glue\MessagePump.cpp:125
08:20:56     INFO -  GECKO(1188) |     #20 0x7ffb7042f7da in MessageLoop::RunHandler z:\build\build\src\ipc\chromium\src\base\message_loop.cc:312
08:20:56     INFO -  GECKO(1188) |     #21 0x7ffb7042f570 in MessageLoop::Run z:\build\build\src\ipc\chromium\src\base\message_loop.cc:299
08:20:56     INFO -  GECKO(1188) |     #22 0x7ffb776e1eaa in nsBaseAppShell::Run z:\build\build\src\widget\nsBaseAppShell.cpp:157
08:20:56     INFO -  GECKO(1188) |     #23 0x7ffb7785d8e0 in nsAppShell::Run z:\build\build\src\widget\windows\nsAppShell.cpp:344
08:20:56     INFO -  GECKO(1188) |     #24 0x7ffb7c41f5b6 in XRE_RunAppShell z:\build\build\src\toolkit\xre\nsEmbedFunctions.cpp:874
08:20:56     INFO -  GECKO(1188) |     #25 0x7ffb7042f7da in MessageLoop::RunHandler z:\build\build\src\ipc\chromium\src\base\message_loop.cc:312
08:20:56     INFO -  GECKO(1188) |     #26 0x7ffb7042f570 in MessageLoop::Run z:\build\build\src\ipc\chromium\src\base\message_loop.cc:299
08:20:56     INFO -  GECKO(1188) |     #27 0x7ffb7c41e9f8 in XRE_InitChildProcess z:\build\build\src\toolkit\xre\nsEmbedFunctions.cpp:700
08:20:56     INFO -  GECKO(1188) |     #28 0x7ff6e7ec26e0 in content_process_main z:\build\build\src\ipc\contentproc\plugin-container.cpp:63
08:20:56     INFO -  GECKO(1188) |     #29 0x7ff6e7ec1cc1 in NS_internal_main z:\build\build\src\browser\app\nsBrowserApp.cpp:280
08:20:56     INFO -  GECKO(1188) |     #30 0x7ff6e7ec1358 in wmain z:\build\build\src\toolkit\xre\nsWindowsWMain.cpp:112
08:20:56     INFO -  GECKO(1188) |     #31 0x7ff6e7f7a208 in __scrt_common_main_seh f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283
08:20:56     INFO -  GECKO(1188) |     #32 0x7ffba9e12773 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x180012773)
08:20:56     INFO -  GECKO(1188) |     #33 0x7ffbabcb0d60 in RtlUserThreadStart+0x20 (C:\Windows\SYSTEM32\ntdll.dll+0x180070d60)
(In reply to Ting-Yu Chou [:ting] from comment #8)
> Symbolification works as of bug 1360650 landed, e.g.,
> https://treeherder.mozilla.org/logviewer.html#?job_id=156314560&repo=try&lineNumber=1924:

I just found there were still untranslated addresses in the same Try run, like https://treeherder.mozilla.org/logviewer.html#?job_id=156314560&repo=try&lineNumber=2088.
(In reply to Ting-Yu Chou [:ting] from comment #9)
> I just found there were still untranslated addresses in the same Try run,
> like
> https://treeherder.mozilla.org/logviewer.html#?job_id=156314560&repo=try&lineNumber=2088.

I can't reproduce this locally. On Try, the only thing I know so far is that the symbolizer process returns "??".
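
One way to check a log for untranslated addresses is to classify each ASan stack frame by whether it carries a source location. The sketch below is a rough illustration, not part of any Mozilla tooling: the regular expressions, the category names, and the `classify` helper are assumptions based on the frame format in the trace quoted earlier in this bug, and the third sample line is a hypothetical example of an unsymbolized frame.

```python
import re
from collections import Counter

# "#<index> <address> <rest of line>", as seen in the trace quoted in this bug.
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-fA-F]+)\s*(.*)$")
# A frame is treated as fully symbolized when it ends in "<file>:<line>".
SOURCE_RE = re.compile(r"\S+:\d+$")


def classify(line):
    """Return 'source', 'no-source', 'unsymbolized', or None for non-frame lines."""
    match = FRAME_RE.search(line.strip())
    if match is None:
        return None
    rest = match.group(3)
    if not rest.startswith("in "):
        return "unsymbolized"      # address only, no function name
    body = rest[3:]
    if "??" in body:
        return "unsymbolized"      # assumed form of a failed lookup
    if SOURCE_RE.search(body):
        return "source"            # function plus file:line
    return "no-source"             # function name but no file:line


sample = [
    r"#0 0x7ffb781475a7 in mozilla::PresShell::Paint z:\build\build\src\layout\base\PresShell.cpp:6334",
    r"#32 0x7ffba9e12773 in BaseThreadInitThunk+0x13 (C:\Windows\System32\KERNEL32.DLL+0x180012773)",
    r"#1 0x7ffb77611588 in ?? ??:0",   # hypothetical unsymbolized frame
    "WRITE of size 4 at 0x0038a77fb314 thread T0",
]

counts = Counter(kind for kind in map(classify, sample) if kind is not None)
print(counts)
```

Running this kind of classification over a whole log would show how many frames the symbolizer failed on, which may help narrow down whether the failures are tied to particular modules.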
Depends on: 1452126
Let's revisit this!
Assignee: nobody → dmajor
Depends on: 1457523
I've deliberately left the following tests as tier-3:
- gtest (perma-OOM, likely from ASan malloc-meddling)
- xpcshell (builds need to be signed plus other failures too)
Attachment #8973703 - Flags: review?(dustin)
coop, were you able to find out if this would be okay budget-wise?
Attachment #8973705 - Flags: review?(coop)
Attachment #8973703 - Flags: review?(dustin) → review+
Attachment #8973705 - Flags: review?(coop) → review+
(In reply to David Major [:dmajor] from comment #13)
> Created attachment 8973705 [details] [diff] [review]
> Run win64-asan builds and tests on trunk and try
> 
> coop, were you able to find out if this would be okay budget-wise?

Adding another build type without a corresponding multiplex of testing is fine.
(In reply to Chris Cooper [:coop] from comment #14)
> (In reply to David Major [:dmajor] from comment #13)
> > Created attachment 8973705 [details] [diff] [review]
> > Run win64-asan builds and tests on trunk and try
> > 
> > coop, were you able to find out if this would be okay budget-wise?
> 
> Adding another build type without a corresponding multiplex of testing is
> fine.

But this _would_ add testing, or at least that was the intent of the patch. As I understand it, mochitests and friends are set to run on `built-projects` -- so when ASan becomes a built project, the tests would run too.

(Note that the tests would only run on opt, not debug. This matches Linux ASan. I *think* this is controlled by https://searchfox.org/mozilla-central/rev/b28b94dc81d60c6d9164315adbd4a5073526d372/taskcluster/taskgraph/transforms/tests.py#534)
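
The `built-projects` behaviour as I understand it can be sketched as follows. This is a simplified model, not the actual code in tests.py: the function name `resolve_run_on_projects`, the dictionary keys, and the project lists are all illustrative assumptions.

```python
def resolve_run_on_projects(test, build):
    """Hypothetical resolution of a test's "run-on-projects" setting.

    A test marked "built-projects" inherits the project list of the
    build it depends on; any other value is taken as an explicit list.
    """
    value = test.get("run-on-projects", "built-projects")
    if value == "built-projects":
        # Default to "all" when the build carries no restriction (assumption).
        return list(build.get("run-on-projects", ["all"]))
    return list(value)


mochitest = {"run-on-projects": "built-projects"}

# Illustrative values only: a restricted build before the patch,
# and a build enabled on trunk and try after it.
build_before = {"run-on-projects": []}
build_after = {"run-on-projects": ["trunk", "try"]}

print(resolve_run_on_projects(mochitest, build_before))
print(resolve_run_on_projects(mochitest, build_after))
```

Under this model, no change to the test definitions is needed: widening where the build runs widens where its tests run.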
Flags: needinfo?(coop)
(In reply to David Major [:dmajor] from comment #15)
> But this _would_ add testing, or at least that was the intent of the patch.
> As I understand it, mochitests and friends are set to run on
> `built-projects` -- so when ASan becomes a built project, the tests would
> run too.

Are you trying to talk me out of the r+? ;)

The tests _can_ run, but the tier 2 rules for ownership still apply. They need someone committed to fixing them or they'll be disabled.

Time-wise, looking at your most recent Try run (https://treeherder.mozilla.org/#/jobs?repo=try&revision=e62a779ffe063b988352b7cb399eecf6e6254c0f), you generated approx. 40 compute hours of work. 

Compare that to a recent mozilla-central merge commit (https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=93443d36d4bd53dba004f7b73430879f96daa681) that generated 2,578 compute hours of work. 

I'm not too worried about the extra 40 hours. I think the benefits of ASAN on our primary user platform are worth it.
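
To put the two figures side by side, here is the arithmetic as a short Python sketch. It assumes the 40 hours from the Try run are representative of what one push would add, which is an approximation.

```python
asan_hours = 40               # approximate compute hours from the Try run above
central_push_hours = 2578     # compute hours for the mozilla-central merge above

ratio = asan_hours / central_push_hours
print(f"Extra load per push: about {ratio:.1%}")
```

That works out to an increase of under 2% relative to the merge commit used for comparison.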
Flags: needinfo?(coop)
(In reply to Chris Cooper [:coop] from comment #16)
> Are you trying to talk me out of the r+? ;)

Just want to make sure I'm not sneaking anything past you. :)

> I'm not too worried about the extra 40 hours. I think the benefits of ASAN
> on our primary user platform are worth it.

Ok, great! Thanks for finding some concrete numbers on this.
Pushed by dmajor@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/38ec627c4758
Promote win64-asan builds and tests to tier 2. r=dustin
https://hg.mozilla.org/integration/mozilla-inbound/rev/c9221e2a82ce
Run Win64 ASan builds and tests on trunk and try. r=coop
https://hg.mozilla.org/mozilla-central/rev/38ec627c4758
https://hg.mozilla.org/mozilla-central/rev/c9221e2a82ce
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Depends on: 1467126