Closed Bug 1373562 Opened 2 years ago Closed 2 years ago

certutil returns 0xc0000142 when conduct mochitest with ASan on Windows

Categories

(Core :: General, defect)

x86_64
Windows 10
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: ting, Assigned: ting)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(2 files, 2 obsolete files)

Running mochitests on Try now have error message:

https://treeherder.mozilla.org/logviewer.html#?job_id=107228936&repo=try&lineNumber=1645

07:34:46     INFO -  TEST-SKIP | accessible/tests/mochitest/tree/test_applicationacc.xul | took 0ms
07:34:46     INFO -  mozprofile.addons WARNING | Could not install Z:\task_1497511817\build\tests\mochitest\extensions\mozscreenshots: [Errno 2] No such file or directory: 'Z:\\task_1497511817\\build\\tests\\mochitest\\extensions\\mozscreenshots\\install.rdf'
07:34:46     INFO -  INFO | runtests.py | ASan using symbolizer at Z:\task_1497511817\build\application\firefox\llvm-symbolizer.exe
07:34:47     INFO -  INFO | runtests.py | ASan running in default memory configuration
07:34:47    ERROR -  0 ERROR TEST-UNEXPECTED-FAIL | runtests.py | Certificate integration failed
07:34:47     INFO -  Buffered messages finished
07:34:47     INFO -  SUITE-END | took 0s

It's because the certutil call:

  https://dxr.mozilla.org/mozilla-central/rev/048240a074e841c425a4da4707cf8e353074ec1d/testing/mochitest/runtests.py#1603

returns 0xc0000142.

This could be a dupe of bug 1361185.
> This could be a dupe of bug 1361185.

Bug 1361185 is triggered by code in Windows 10 Creators Update. 

In that Try push, the test machine was running Windows 10, so it's quite possible to be a dupe.
Does this still happen with LLVM 305581?
Flags: needinfo?(janus926)
Great, thanks for the information, I'll check that.
I ran the test locally:

./xpcshell.exe -g c:\\w\\fx\\mc\\obj-asan\\dist\\bin -a c:\\w\\fx\\mc\\obj-asan\\dist\\bin -r c:/w/fx/mc/obj-asan/dist/bin/components/httpd.manifest -m -s -e 'const _HEAD_JS_PATH = "c:/w/fx/mc/testing/xpcshell/head.js";' -e 'const _MOZINFO_JS_PATH = "c:\\\\users\\\\ting\\\\appdata\\\\local\\\\temp\\\\xpc-profile-ilknsb\\\\mozinfo.json";' -e 'const _TESTING_MODULES_DIR = "c:\\\\w\\\\fx\\\\mc\\\\obj-asan\\\\_tests\\\\modules\\\\";' -f c:\\w\\fx\\mc\\testing\\xpcshell\\head.js -e 'const _SERVER_ADDR = "localhost"' -e 'const _HEAD_FILES = [];' -e 'const _JSDEBUGGER_PORT = 0;' -e 'const _TEST_FILE = ["c:/users/ting/appdata/local/temp/tmpaxue8i/test_not_skip.js"];' -e 'const _TEST_NAME = "test_not_skip.js"' -e '_execute_test(); quit(0);'

which I have xul.pdb and xpcshell.pdb in the same folder, but ASan still output untranslated symbol addresses. It worked before when I use r293859, I'll double check, see if there's anything wrong with dbghelp.
Flags: needinfo?(janus926)
It doesn't translate because I was running on 1607, after upgrade to 1703 it works properly:

ting@ting-nb /c/w/fx/mc
$ ./obj-asan/dist/bin/xpcshell.exe -g c:\\w\\fx\\mc\\obj-asan\\dist\\bin -a c:\\w\\fx\\mc\\obj-asan\\dist\\bin -r c:
/w/fx/mc/obj-asan/dist/bin/components/httpd.manifest -m -s -e 'const _HEAD_JS_PATH = "c:/w/fx/mc/testing/xpcshell/he
ad.js";' -e 'const _MOZINFO_JS_PATH = "c:\\\\users\\\\ting\\\\appdata\\\\local\\\\temp\\\\xpc-profile-ilknsb\\\\mozi
nfo.json";' -e 'const _TESTING_MODULES_DIR = "c:\\\\w\\\\fx\\\\mc\\\\obj-asan\\\\_tests\\\\modules\\\\";' -f c:\\w\\
fx\\mc\\testing\\xpcshell\\head.js -e 'const _SERVER_ADDR = "localhost"' -e 'const _HEAD_FILES = [];' -e 'const _JSD
EBUGGER_PORT = 0;' -e 'const _TEST_FILE = ["c:/users/ting/appdata/local/temp/tmpaxue8i/test_not_skip.js"];' -e 'cons
t _TEST_NAME = "test_not_skip.js"' -e '_execute_test(); quit(0);'

{"action":"log","time":1498182823286,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"ERROR","message":"Error: cannot open file 'c:/users/ting/appdata/local/temp/tmpaxue8i/test_not_skip.js' for reading at c:\\w\\fx\\mc\\testing\\xpcshell\\head.js:657","extra":{"source_file":"c:/users/ting/appdata/local/temp/tmpaxue8i/test_not_skip.js","stack":"load_file@c:\\w\\fx\\mc\\testing\\xpcshell\\head.js:657:7\n_load_files@c:\\w\\fx\\mc\\testing\\xpcshell\\head.js:669:3\n_execute_test@c:\\w\\fx\\mc\\testing\\xpcshell\\head.js:524:3\n@-e:1:1\n"},"stack":"load_file@c:\\w\\fx\\mc\\testing\\xpcshell\\head.js:657:7\n_load_files@c:\\w\\fx\\mc\\testing\\xpcshell\\head.js:669:3\n_execute_test@c:\\w\\fx\\mc\\testing\\xpcshell\\head.js:524:3\n@-e:1:1\n"}

{"action":"log","time":1498182823303,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test MAIN run_test pending (1)"}

{"action":"log","time":1498182823308,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test run_next_test 0 pending (2)"}

{"action":"log","time":1498182823311,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test MAIN run_test finished (2)"}

{"action":"log","time":1498182823315,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"running event loop"}

{"action":"log","time":1498182823319,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"(xpcshell/head.js) | test run_next_test 0 finished (1)"}

{"action":"log","time":1498182823322,"thread":null,"pid":null,"source":"xpcshell/head.js","level":"INFO","message":"exiting test"}
=================================================================
==6960==ERROR: AddressSanitizer: stack-use-after-scope on address 0x00a40bffef70 at pc 0x7fffec11a106 bp 0x00a40bffe5e0 sp 0x00a40bffe628
READ of size 8 at 0x00a40bffef70 thread T0
    #0 0x7fffec11a105 in FindProviderFile c:\w\fx\mc\xpcom\io\nsDirectoryService.cpp:333
    #1 0x7fffec119424 in nsDirectoryService::Get c:\w\fx\mc\xpcom\io\nsDirectoryService.cpp:364
    #2 0x7fffec1fa9f1 in mozilla::InitLateWriteChecks c:\w\fx\mc\xpcom\build\LateWriteChecks.cpp:220
    #3 0x7fffec20244b in mozilla::ShutdownXPCOM c:\w\fx\mc\xpcom\build\XPCOMInit.cpp:920
    #4 0x7fffed99a956 in XRE_XPCShellMain c:\w\fx\mc\js\xpconnect\src\XPCShellImpl.cpp:1558
    #5 0x7ff68f051690 in NS_internal_main c:\w\fx\mc\js\xpconnect\shell\xpcshell.cpp:68
    #6 0x7ff68f05134f in wmain c:\w\fx\mc\toolkit\xre\nsWindowsWMain.cpp:115
    #7 0x7ff68f0df170 in __scrt_common_main_seh f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:253
    #8 0x7ff83d6e2773 in BaseThreadInitThunk+0x13 (C:\WINDOWS\System32\KERNEL32.DLL+0x180012773)
    #9 0x7ff83da00d60 in RtlUserThreadStart+0x20 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x180070d60)

Address 0x00a40bffef70 is located in stack of thread T0 at offset 656 in frame
    #0 0x7fffed99824f in XRE_XPCShellMain c:\w\fx\mc\js\xpconnect\src\XPCShellImpl.cpp:1213

  This frame has 31 object(s):
    [32, 56) 'argsObj.i' (line 1021)
    [96, 120) 'global.i' (line 1036)
    [160, 184) 'str.i' (line 1073)
    [224, 248) 'rval.i' (line 1125)
    [288, 464) 'opts.i' (line 1131)
    [528, 536) 'pluginsDir.i' (line 1151)
    [560, 564) 'rv' (line 1218)
    [576, 577) 'aLocal' (line 1233)
    [592, 600) 'appFile' (line 1251)
    [624, 632) 'appDir' (line 1257)
    [656, 704) 'dirprovider' (line 1264) <== Memory access at offset 656 is inside this variable
    [736, 744) 'greDir' (line 1268)
    [768, 928) 'workingDir' (line 1300)
    [992, 1000) 'dir' (line 1317)
    [1024, 1040) 'ref.tmp' (line 1320)
    [1056, 1064) 'lf' (line 1335)
    [1088, 1096) 'greOmni' (line 1360)
    [1120, 1128) 'appOmni' (line 1361)
    [1152, 1160) 'servMan' (line 1376)
    [1184, 1280) 'jsapi' (line 1387)
    [1312, 1320) 'systemprincipal' (line 1403)
    [1344, 1352) 'securityManager' (line 1409)
    [1376, 1392) 'ref.tmp194' (line 1409)
    [1408, 1416) 'backstagePass' (line 1430)
    [1440, 1496) 'options' (line 1440)
    [1536, 1544) 'holder' (line 1445)
    [1568, 1592) 'glob' (line 1482)
    [1632, 1648) 'ac' (line 1494)
    [1664, 1688) 'envobj' (line 1505)
    [1728, 1888) 'workingDirectory' (line 1513)
    [1952, 2104) 'aes' (line 1525)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp, SEH and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope c:\w\fx\mc\xpcom\io\nsDirectoryService.cpp:333 in FindProviderFile
Shadow bytes around the buggy address:
  0x01d95e0ffd90: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x01d95e0ffda0: f8 f8 f8 f2 f2 f2 f2 f2 f8 f8 f8 f2 f2 f2 f2 f2
  0x01d95e0ffdb0: f8 f8 f8 f2 f2 f2 f2 f2 f8 f8 f8 f2 f2 f2 f2 f2
  0x01d95e0ffdc0: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  0x01d95e0ffdd0: f8 f8 f8 f8 f8 f8 f2 f2 f2 f2 f2 f2 f2 f2 f8 f2
=>0x01d95e0ffde0: f2 f2 04 f2 01 f2 f8 f2 f2 f2 f8 f2 f2 f2[f8]f8
  0x01d95e0ffdf0: f8 f8 f8 f8 f2 f2 f2 f2 f8 f2 f2 f2 f8 f8 f8 f8
  0x01d95e0ffe00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  0x01d95e0ffe10: f2 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f8 f8 f2 f2
  0x01d95e0ffe20: f8 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2 f8 f2 f2 f2
  0x01d95e0ffe30: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==6960==ABORTING
Moved to xpcom based on comment 7.
Assignee: nobody → nobody
Component: Tools → XPCOM
Product: NSS → Core
Version: trunk → unspecified
OK, so we pass the XPCShellDirProvider into XPCOM, and we must hold onto it outside the block that it gets used it.

The simple fix is to move the XPCShellDirProvider out one nesting level, so it will outlive the call to shutting down XPCOM.

It's a little worrisome that our normal ASan runs didn't detect this!  I guess this is a newer analysis added in a version of ASan that we don't use on infrastructure?
Assignee: nobody → nfroyd
xpcshell has a custom directory provider that is passed in when XPCOM is
initialized.  XPCOM holds a reference to this directory provider, and
said reference is not released until XPCOM is shutdown.  Unfortunately,
the lifetime of the directory provider was not scoped properly, so it
would actually be destructed prior to shutting down XPCOM, resulting in
a stack use-after-free.

To avoid this problem, we need to move the provider so that its lifetime
outlives the call to both XPCOM init and shutdown.
Attachment #8881497 - Flags: review?(janus926)
Thanks for the patch, but could you file a new bug and move the patch there? Comment 7 is the error reports from r305581, which is a newer version for resolving the certutil error.
Flags: needinfo?(nfroyd)
Filed bug 1376649.
Flags: needinfo?(nfroyd)
Comment on attachment 8881497 [details] [diff] [review]
extend lifetime of xpcshell's directory provider

Moved to bug 1376649.
Attachment #8881497 - Flags: review?(janus926)
r305581 does 1) address the certutil error, 2) pass static analysis, and 3) detect an error (comment 7). Running mochitest with the ASan binary on Taskcluster [1] though still have errors:

02:22:24     INFO -  dir: dom/console/tests
02:22:24     INFO -  INFO | runtests.py | ASan using symbolizer at Z:\task_1498615917\build\application\firefox\llvm-symbolizer.exe
02:22:24     INFO -  INFO | runtests.py | ASan running in default memory configuration
02:22:25     INFO -  =================================================================
02:22:25     INFO -  ==5476==ERROR: AddressSanitizer: access-violation on unknown address 0xffffffffffffffff (pc 0x7ff8a04ba7c2 bp 0x00bf5f1fe260 sp 0x00bf5f1fe1e0 T0)
02:22:25     INFO -  ==5476==The signal is caused by a READ memory access.
02:22:25     INFO -      #0 0x7ff8a04ba7c1  (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  AddressSanitizer can not provide additional info.
02:22:25     INFO -  SUMMARY: AddressSanitizer: access-violation (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  ==5476==ABORTING
02:22:25     INFO -  =================================================================
02:22:25     INFO -  ==3488==ERROR: AddressSanitizer: access-violation on unknown address 0xffffffffffffffff (pc 0x7ff8a04ba7c2 bp 0x003ef49fe6a0 sp 0x003ef49fe620 T0)
02:22:25     INFO -  ==3488==The signal is caused by a READ memory access.
02:22:25     INFO -      #0 0x7ff8a04ba7c1  (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  AddressSanitizer can not provide additional info.
02:22:25     INFO -  SUMMARY: AddressSanitizer: access-violation (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  ==3488==ABORTING
02:22:25     INFO -  =================================================================
02:22:25     INFO -  ==1472==ERROR: AddressSanitizer: access-violation on unknown address 0xffffffffffffffff (pc 0x7ff8a04ba7c2 bp 0x00bd919fe220 sp 0x00bd919fe1a0 T0)
02:22:25     INFO -  ==1472==The signal is caused by a READ memory access.
02:22:25     INFO -      #0 0x7ff8a04ba7c1  (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  AddressSanitizer can not provide additional info.
02:22:25     INFO -  SUMMARY: AddressSanitizer: access-violation (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  ==1472==ABORTING
02:22:25     INFO -  =================================================================
02:22:25     INFO -  ==1908==ERROR: AddressSanitizer: access-violation on unknown address 0xffffffffffffffff (pc 0x7ff8a04ba7c2 bp 0x0042aedfe480 sp 0x0042aedfe400 T0)
02:22:25     INFO -  ==1908==The signal is caused by a READ memory access.
02:22:25     INFO -      #0 0x7ff8a04ba7c1  (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  AddressSanitizer can not provide additional info.
02:22:25     INFO -  SUMMARY: AddressSanitizer: access-violation (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:25     INFO -  ==1908==ABORTING
02:22:26     INFO -  =================================================================
02:22:26     INFO -  ==3116==ERROR: AddressSanitizer: access-violation on unknown address 0xffffffffffffffff (pc 0x7ff8a04ba7c2 bp 0x005e08ffe660 sp 0x005e08ffe5e0 T0)
02:22:26     INFO -  ==3116==The signal is caused by a READ memory access.
02:22:26     INFO -      #0 0x7ff8a04ba7c1  (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:26     INFO -  AddressSanitizer can not provide additional info.
02:22:26     INFO -  SUMMARY: AddressSanitizer: access-violation (C:\Windows\SYSTEM32\ntdll.dll+0x18009a7c1)
02:22:26     INFO -  ==3116==ABORTING

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=8233340bc1bc23a685496b9b4ed1f374c704cc43&filter-tier=1&filter-tier=2&filter-tier=3&group_state=expanded&selectedJob=110259600
The error in comment 14 is from the PORT_Strcat() in NSSUTIL_MkNSSString():

http://searchfox.org/mozilla-central/rev/fdb811340ac4e6b93703d72ee99217f3b1d250ac/security/nss/lib/util/utilpars.c#1054

$ ./obj-asan/dist/bin/certutil.exe -N -d . -f .crtbpw
slotParams=0000124F2A880700 slotLen=125
[0] 123 0x00000001=[slotFlags=RSA,RC4,RC2,DES,DH,SHA1,MD5,MD2,SSL,TLS,AES,SHA256,SHA512,Camellia,SEED,RANDOM askpw=any timeout=30 ]
=================================================================
==2884==ERROR: AddressSanitizer: access-violation on unknown address 0xffffffffffffffff (pc 0x7ffbfba2a7c2 bp 0x004e041fe6e0 sp 0x004e041fe660 T0)
==2884==The signal is caused by a READ memory access.
    #0 0x7ffbfba2a7c1 in strcpy+0x21 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18009a7c1)
    #1 0x0  (<unknown module>)
    #2 0x7ffbac4d9b2e in _asan_wrap_strlen+0xde (c:\w\fx\mc\obj-asan\dist\bin\clang_rt.asan_dynamic-x86_64.dll+0x180029b2e)
    #3 0x0  (<unknown module>)
    #4 0x2372a87ffff  (<unknown module>)
    #5 0x7ffbcce8557a in NSSUTIL_MkNSSString c:\w\fx\mc\security\nss\lib\util\utilpars.c:1056
    #6 0x7ffbdc5d29f7 in legacy_ReadSecmodDB c:\w\fx\mc\security\nss\lib\softoken\legacydb\pk11db.c:609
    #7 0x7ffbdb9cf717 in NSC_ModuleDBFunc c:\w\fx\mc\security\nss\lib\softoken\pkcs11.c:2868
    #8 0x7ffbcce2e454 in SECMOD_LoadModule c:\w\fx\mc\security\nss\lib\pk11wrap\pk11pars.c:1688
    #9 0x7ffbccdb0547 in nss_Init c:\w\fx\mc\security\nss\lib\nss\nssinit.c:574
    #10 0x7ffbccdb0c6e in NSS_Initialize c:\w\fx\mc\security\nss\lib\nss\nssinit.c:889
    #11 0x7ff647f99bb5 in certutil_main c:\w\fx\mc\security\nss\cmd\certutil\certutil.c:2989
    #12 0x7ff647f98583 in main c:\w\fx\mc\security\nss\cmd\certutil\certutil.c:3706
    #13 0x7ff647fb1808 in __scrt_common_main_seh f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:253
    #14 0x7ffbf97a2773 in BaseThreadInitThunk+0x13 (C:\WINDOWS\System32\KERNEL32.DLL+0x180012773)
    #15 0x7ffbfba00d60 in RtlUserThreadStart+0x20 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x180070d60)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: access-violation (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18009a7c1) in strcpy+0x21
==2884==ABORTING
Component: XPCOM → General
Assignee: nfroyd → janus926
When NSSUTIL_MkNSSString calls strcat [1], it goes to ucrtbase!strcat, and jumps to _asan_unpoison_stack right away. Later it jumps back to ntdll!strcat (why?) and then ntdll!__entry_from_strcat_in_strcpy which is ntdll!strcpy+0x3. The problem is ntdll!strcpy+0x3 is part of the instruction that ASan intercepted:

Original ntdll!strcpy is:

  ntdll!strcpy:
  00007ffd`64e8a7a0 4c8bd9          mov     r11,rcx
  ntdll!__entry_from_strcat_in_strcpy:
  00007ffd`64e8a7a3 482bca          sub     rcx,rdx
  00007ffd`64e8a7a6 f6c207          test    dl,7

And ASan intercepted one is:

  ntdll!strcpy:
  00007ffd`64e8a7a0 ff25ce591400    jmp     qword ptr [00007ffd`64fd0174]
  00007ffd`64e8a7a6 f6c207          test    dl,7

When ntdll!strcat jumps to ntdll!__entry_from_strcat_in_strcpy (00007ffd`64e8a7a3), it executes incorrect instructions:

  00007ffd`64e8a7a3 59              pop     rcx
  00007ffd`64e8a7a4 1400            adc     al,0
  00007ffd`64e8a7a6 f6c207          test    dl,7

However the intercepted ucrbase!strcpy is still valid for jumping to ucrtbase!__entry_from_strcat_in_strcpy:

  ucrtbase!strcpy:
  00007ffd`614036a0 ebf8            jmp     ucrtbase!strcat+0x9a (00007ffd`6140369a)
  00007ffd`614036a2 d9              ???
  ucrtbase!__entry_from_strcat_in_strcpy:
  00007ffd`614036a3 482bca          sub     rcx,rdx
  00007ffd`614036a6 f6c207          test    dl,7

[1] http://searchfox.org/mozilla-central/rev/e8f4f51cd543f203e9cb861cecb7545ac43c836c/security/nss/lib/util/utilpars.c#1054
(In reply to Ting-Yu Chou [:ting] from comment #16)
> When NSSUTIL_MkNSSString calls strcat [1], it goes to ucrtbase!strcat, and
> jumps to _asan_unpoison_stack right away. Later it jumps back to
> ntdll!strcat (why?) 

ASan looks through several DLLs and detours all functions named strcat to a single replacement: https://github.com/llvm-project/llvm-project-20170507/blob/cdd318ada19f9cb67cefde21efdf46d9da23d67b/compiler-rt/lib/interception/interception_win.cc#L932

When the replacement function does "return REAL(strcat)(to, from)" then there's only one place it can jump back to, and I guess in this case it happened to be ntdll!strcat.

> The problem is ntdll!strcpy+0x3 is part of the instruction
> that ASan intercepted:

ntdll!strcpy has a jump target within its first few bytes? Yuck, that's really unfortunate. Could you file this at https://github.com/google/sanitizers/issues or https://bugs.llvm.org/ ?
ntdll!strcpy is preceded by a 9-byte nop. If we can convince the interceptor that this is an acceptable padding, then we should be able to use the HotPatch technique that only clobbers the first two bytes of strcpy (https://github.com/llvm-project/llvm-project-20170507/blob/cdd318ada19f9cb67cefde21efdf46d9da23d67b/compiler-rt/lib/interception/interception_win.cc#L62)

ntdll!strcat+0x90:
00000001`8009a780 c3              ret
00000001`8009a781 cc              int     3
00000001`8009a782 cc              int     3
00000001`8009a783 cc              int     3
00000001`8009a784 cc              int     3
00000001`8009a785 cc              int     3
00000001`8009a786 cc              int     3
00000001`8009a787 660f1f840000000000 nop   word ptr [rax+rax]
ntdll!strcpy:
00000001`8009a790 4c8bd9          mov     r11,rcx
ntdll!__entry_from_strcat_in_strcpy:
00000001`8009a793 482bca          sub     rcx,rdx

I _think_ this could be achieved by removing the first 0x66 prefix from `kHintNop10Bytes` and having it just be `kHintNop9Bytes`: https://github.com/llvm-project/llvm-project-20170507/blob/cdd318ada19f9cb67cefde21efdf46d9da23d67b/compiler-rt/lib/interception/interception_win.cc#L226
You're the best! I am verifying this locally, but it should be working. Will update later.
Did basic smoke tests, also loading FB, YouTube, CNN is good. \o/

Will you submitted a PR or do you want me to do it?
I mailed a patch to the llvm-commits list. It's waiting for review.
The patch landed in LLVM r310419.
Comment on attachment 8895670 [details]
Bug 1373562 - Bump to clang r311608 to fix the errors running ASan on Windows 10 1703.

clang-cl r310419 crashed for static analysis and ASan builds: https://treeherder.mozilla.org/#/jobs?repo=try&revision=5e84d2adf580c9cecae766187169fe15810e0017&selectedJob=122197123
Attachment #8895670 - Flags: review?(ehsan)
(In reply to Ting-Yu Chou [:ting] from comment #26)
> Comment on attachment 8895670 [details]
> Bug 1373562 - Bump to clang r310419 to fix errors on Windows 10.
> 
> clang-cl r310419 crashed for static analysis and ASan builds:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=5e84d2adf580c9cecae766187169fe15810e0017&selectedJob=1
> 22197123

https://bugs.llvm.org/show_bug.cgi?id=33997
Can we temporarily return to using -fallback while the compiler crash gets sorted out?
(In reply to David Major [:dmajor] from comment #28)
> Can we temporarily return to using -fallback while the compiler crash gets
> sorted out?

Sounds good to me, then we can move forward. But we may need to redo things like bug 1193452 to remove it. Ehsan, Nathan, what do you think?
Flags: needinfo?(nfroyd)
Flags: needinfo?(ehsan)
r=me on that.  :-)
Flags: needinfo?(ehsan)
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #30)
> r=me on that.  :-)

(As a general rule BTW I think that is OK for now, we don't need to be zealous about this for now.  I would appreciate if people file bugs for things that we fall back on if possible and make them block bug 752004 for tracking purposes...)
I'm happy for changes to be made that unblock making more things work with clang-cl.
Flags: needinfo?(nfroyd)
I am checking whether we can build Firefox by r310419 clang-cl with -fallback, and whether the binary runs properly. But the clang in browser/config/tooltool-manifests/win64/releng.manifest was removed in bug 1389398.

Is it possible to specify a different clang on tooltool to use for a Try win64-st-an or win64-asan build? Also if I want to bump to r310419, how should I make |./mach artifact toolchain win64-clang-cl| to download the updated archive?
Flags: needinfo?(mh+mozilla)
Just update the clang json, and push a try for win64-st-an.
Flags: needinfo?(mh+mozilla)
But we shouldn't use -fallback. If we need to, someone on our payroll should fix https://bugs.llvm.org/show_bug.cgi?id=33997 instead.
(Although, I'm curious whether -fallback actually works when the compiler crashes instead of "normally" failing to compile)
(also, you'd need to fix bug 1385743)
(In reply to Mike Hommey [:glandium] from comment #35)
> But we shouldn't use -fallback. If we need to, someone on our payroll should
> fix https://bugs.llvm.org/show_bug.cgi?id=33997 instead.

Is there someone on our payroll who has the experience and availability to land a fix in a reasonable timeframe? Otherwise I don't think it makes sense to block on getting the perfect solution.

(In reply to Mike Hommey [:glandium] from comment #36)
> (Although, I'm curious whether -fallback actually works when the compiler
> crashes instead of "normally" failing to compile)

It should be fine.
(In reply to David Major [:dmajor] from comment #38)
> (In reply to Mike Hommey [:glandium] from comment #35)
> > But we shouldn't use -fallback. If we need to, someone on our payroll should
> > fix https://bugs.llvm.org/show_bug.cgi?id=33997 instead.
> 
> Is there someone on our payroll who has the experience and availability to
> land a fix in a reasonable timeframe? Otherwise I don't think it makes sense
> to block on getting the perfect solution.

I have some experience fixing clang bugs, yes.  But I have zero time to devote this before we ship Quantum.  And we should *definitely* use -fallback, that is the whole reason why the flag exists.

> (In reply to Mike Hommey [:glandium] from comment #36)
> > (Although, I'm curious whether -fallback actually works when the compiler
> > crashes instead of "normally" failing to compile)
> 
> It should be fine.

It does work fine, clang-cl's driver driver spawn's clang's c1 as a separate process and watches its pid.  If the exit code is non-zero and -fallback is specified, it constructs a cl.exe jobs and runs that.
When I ran |./mach run| with my local build ASan binary, it reported an error and then the computer hanged.
This is the error, which seems OK and can be suppressed. I'll check the hang tomorrow.

==8404==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x122972e2c4e0 in thread T0:
  object passed to delete has wrong type:
  size of the allocated type:   64 bytes;
  size of the deallocated type: 40 bytes.
    #0 0x7fffaad1c82b in operator delete+0xcb (C:\w\fx\mc\obj-asan\dist\bin\clang_rt.asan_dynamic-x86_64.dll+0x18003c82b)
    #1 0x7fffa606ba0e in RefPtr<nsCSSShadowArray>::~RefPtr<nsCSSShadowArray> c:\w\fx\mc\obj-asan\dist\include\mozilla\refptr.h:78
    #2 0x7fffa0526344 in nsStyleEffects::~nsStyleEffects c:\w\fx\mc\layout\style\nsStyleStruct.cpp:4555
    #3 0x7fffa02b1f2a in nsStyleEffects::Destroy c:\w\fx\mc\layout\style\nsStyleStruct.h:3768
    #4 0x7fffa052531f in nsConditionalResetStyleData::Destroy c:\w\fx\mc\obj-asan\layout\style\nsStyleStructList.h:155
    #5 0x7fffa045971a in nsCachedStyleData::Destroy c:\w\fx\mc\layout\style\nsRuleNode.h:344
    #6 0x7fffa0458a16 in nsRuleNode::~nsRuleNode c:\w\fx\mc\layout\style\nsRuleNode.cpp:1875
    #7 0x7fffa04588ec in nsRuleNode::Destroy c:\w\fx\mc\layout\style\nsRuleNode.cpp:1818
    #8 0x7fffa04e4431 in nsStyleSet::GCRuleTrees c:\w\fx\mc\layout\style\nsStyleSet.cpp:2455
    #9 0x7fffa0284356 in mozilla::GeckoStyleContext::~GeckoStyleContext c:\w\fx\mc\layout\style\GeckoStyleContext.cpp:137
    #10 0x7fffa02840fe in mozilla::GeckoStyleContext::Destroy c:\w\fx\mc\layout\style\GeckoStyleContext.cpp:94
    #11 0x7fffa06b5022 in mozilla::UndisplayedNode::~UndisplayedNode c:\w\fx\mc\layout\base\nsFrameManager.h:45
    #12 0x7fffa06b5ba0 in nsFrameManagerBase::UndisplayedMap::Clear c:\w\fx\mc\layout\base\nsFrameManager.cpp:711
    #13 0x7fffa067603e in nsFrameManager::Destroy c:\w\fx\mc\layout\base\nsFrameManager.cpp:131
    #14 0x7fffa0675c52 in nsCSSFrameConstructor::WillDestroyFrameTree c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:9200
    #15 0x7fffa05a7034 in mozilla::PresShell::Destroy c:\w\fx\mc\layout\base\PresShell.cpp:1387
    #16 0x7fffa06a1b07 in nsDocumentViewer::DestroyPresShell c:\w\fx\mc\layout\base\nsDocumentViewer.cpp:4724
    #17 0x7fffa0690501 in nsDocumentViewer::Destroy c:\w\fx\mc\layout\base\nsDocumentViewer.cpp:1768
    #18 0x7fffa06a3aba in nsDocumentViewer::Show c:\w\fx\mc\layout\base\nsDocumentViewer.cpp:2105
    #19 0x7fffa072e6d7 in nsPresContext::EnsureVisible c:\w\fx\mc\layout\base\nsPresContext.cpp:2240
    #20 0x7fffa05c215e in mozilla::PresShell::UnsuppressAndInvalidate c:\w\fx\mc\layout\base\PresShell.cpp:3904
    #21 0x7fffa06a1d45 in nsDocumentViewer::Stop c:\w\fx\mc\layout\base\nsDocumentViewer.cpp:1831
    #22 0x7fffa2de994f in nsDocShell::Stop c:\w\fx\mc\docshell\base\nsDocShell.cpp:5599
    #23 0x7fffa2e0cccd in nsDocShell::InternalLoad c:\w\fx\mc\docshell\base\nsDocShell.cpp:10718
    #24 0x7fffa2e02d49 in nsDocShell::LoadURI c:\w\fx\mc\docshell\base\nsDocShell.cpp:1598
    #25 0x7fff9bccb0a7 in mozilla::dom::Location::SetURI c:\w\fx\mc\dom\base\Location.cpp:255
    #26 0x7fff9bcce65a in mozilla::dom::Location::SetHrefWithBase c:\w\fx\mc\dom\base\Location.cpp:532
    #27 0x7fff9bccdf72 in mozilla::dom::Location::SetHrefWithContext c:\w\fx\mc\dom\base\Location.cpp:485
    #28 0x7fff9bccdacc in mozilla::dom::Location::SetHref c:\w\fx\mc\dom\base\Location.cpp:450
    #29 0x7fff9c477681 in mozilla::dom::LocationBinding::set_href c:\w\fx\mc\obj-asan\dom\bindings\LocationBinding.cpp:96
    #30 0x7fff9c47661a in mozilla::dom::LocationBinding::genericCrossOriginSetter c:\w\fx\mc\obj-asan\dom\bindings\LocationBinding.cpp:970
    #31 0x7fffa4d4fc01 in js::InternalCallOrConstruct c:\w\fx\mc\js\src\vm\Interpreter.cpp:434
    #32 0x7fffa4d524b0 in js::CallSetter c:\w\fx\mc\js\src\vm\Interpreter.cpp:653
    #33 0x7fffa47d86e7 in js::SetPropertyIgnoringNamedGetter c:\w\fx\mc\js\src\proxy\BaseProxyHandler.cpp:245
    #34 0x7fff9dae7448 in mozilla::dom::DOMProxyHandler::set c:\w\fx\mc\dom\bindings\DOMJSProxyHandler.cpp:225
    #35 0x7fffa48a2e31 in js::Proxy::set c:\w\fx\mc\js\src\proxy\Proxy.cpp:384
    #36 0x7fffa42a7168 in JSObject::nonNativeSetProperty c:\w\fx\mc\js\src\jsobj.cpp:1047
    #37 0x7fffa3c72e11 in JS_SetProperty c:\w\fx\mc\js\src\jsapi.cpp:2703
    #38 0x7fff9d22ee06 in mozilla::dom::WindowBinding::set_location c:\w\fx\mc\obj-asan\dom\bindings\WindowBinding.cpp:1383
    #39 0x7fff9d22ca52 in mozilla::dom::WindowBinding::genericCrossOriginSetter c:\w\fx\mc\obj-asan\dom\bindings\WindowBinding.cpp:15800
    #40 0x7fffa4d4fc01 in js::InternalCallOrConstruct c:\w\fx\mc\js\src\vm\Interpreter.cpp:434
    #41 0x7fffa4d524b0 in js::CallSetter c:\w\fx\mc\js\src\vm\Interpreter.cpp:653
    #42 0x7fffa423b3f0 in js::NativeSetProperty c:\w\fx\mc\js\src\vm\NativeObject.cpp:2825
    #43 0x7fffa48cbdcb in js::Wrapper::set c:\w\fx\mc\js\src\proxy\Wrapper.cpp:153
    #44 0x7fff9bb2d993 in nsOuterWindowProxy::set c:\w\fx\mc\dom\base\nsGlobalWindow.cpp:1418
    #45 0x7fffa48a2e31 in js::Proxy::set c:\w\fx\mc\js\src\proxy\Proxy.cpp:384
    #46 0x7fffa42a7168 in JSObject::nonNativeSetProperty c:\w\fx\mc\js\src\jsobj.cpp:1047
    #47 0x7fffa4d38975 in Interpret c:\w\fx\mc\js\src\vm\Interpreter.cpp:2944
    #48 0x7fffa4d17010 in js::RunScript c:\w\fx\mc\js\src\vm\Interpreter.cpp:409
    #49 0x7fffa4d52dee in js::ExecuteKernel c:\w\fx\mc\js\src\vm\Interpreter.cpp:698
    #50 0x7fffa4d53717 in js::Execute c:\w\fx\mc\js\src\vm\Interpreter.cpp:730
    #51 0x7fffa3c913ca in ExecuteScript c:\w\fx\mc\js\src\jsapi.cpp:4651
    #52 0x7fff9bf8d61f in nsJSUtils::ExecutionContext::CompileAndExec c:\w\fx\mc\dom\base\nsJSUtils.cpp:265
    #53 0x7fff9f5efe88 in mozilla::dom::ScriptLoader::EvaluateScript c:\w\fx\mc\dom\script\ScriptLoader.cpp:2144
    #54 0x7fff9f5eb67d in mozilla::dom::ScriptLoader::ProcessRequest c:\w\fx\mc\dom\script\ScriptLoader.cpp:1802
    #55 0x7fff9f5d2d6b in mozilla::dom::ScriptLoader::ProcessScriptElement c:\w\fx\mc\dom\script\ScriptLoader.cpp:1499
    #56 0x7fff9f5cf9a6 in mozilla::dom::ScriptElement::MaybeProcessScript c:\w\fx\mc\dom\script\ScriptElement.cpp:149
    #57 0x7fff9aed48e0 in nsHtml5TreeOpExecutor::RunScript c:\w\fx\mc\parser\html\nsHtml5TreeOpExecutor.cpp:698
    #58 0x7fff9aece445 in nsHtml5TreeOpExecutor::RunFlushLoop c:\w\fx\mc\parser\html\nsHtml5TreeOpExecutor.cpp:506
    #59 0x7fff9aed939e in nsHtml5ExecutorFlusher::Run c:\w\fx\mc\parser\html\nsHtml5StreamParser.cpp:128
    #60 0x7fff993679db in mozilla::SchedulerGroup::Runnable::Run c:\w\fx\mc\xpcom\threads\SchedulerGroup.cpp:387
    #61 0x7fff99398a72 in nsThread::ProcessNextEvent c:\w\fx\mc\xpcom\threads\nsThread.cpp:1569
    #62 0x7fff9939e6e9 in NS_ProcessNextEvent c:\w\fx\mc\xpcom\threads\nsThreadUtils.cpp:521
    #63 0x7fff9a0ef994 in mozilla::ipc::MessagePump::Run c:\w\fx\mc\ipc\glue\MessagePump.cpp:97
    #64 0x7fff9a08043e in MessageLoop::RunHandler c:\w\fx\mc\ipc\chromium\src\base\message_loop.cc:312
    #65 0x7fff9a0801db in MessageLoop::Run c:\w\fx\mc\ipc\chromium\src\base\message_loop.cc:299
    #66 0x7fff9f75debc in nsBaseAppShell::Run c:\w\fx\mc\widget\nsBaseAppShell.cpp:158
    #67 0x7fff9f8a45cb in nsAppShell::Run c:\w\fx\mc\widget\windows\nsAppShell.cpp:210
    #68 0x7fffa392b881 in XRE_RunAppShell c:\w\fx\mc\toolkit\xre\nsEmbedFunctions.cpp:882
    #69 0x7fff9a08043e in MessageLoop::RunHandler c:\w\fx\mc\ipc\chromium\src\base\message_loop.cc:312
    #70 0x7fff9a0801db in MessageLoop::Run c:\w\fx\mc\ipc\chromium\src\base\message_loop.cc:299
    #71 0x7fffa392ae42 in XRE_InitChildProcess c:\w\fx\mc\toolkit\xre\nsEmbedFunctions.cpp:699
    #72 0x7ff715692363 in content_process_main c:\w\fx\mc\ipc\contentproc\plugin-container.cpp:64
    #73 0x7ff715691641 in NS_internal_main c:\w\fx\mc\browser\app\nsBrowserApp.cpp:285
    #74 0x7ff7156912d8 in wmain c:\w\fx\mc\toolkit\xre\nsWindowsWMain.cpp:115
    #75 0x7ff7157202a0 in __scrt_common_main_seh f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:253
    #76 0x7fffda782773 in BaseThreadInitThunk+0x13 (C:\WINDOWS\System32\KERNEL32.DLL+0x180012773)
    #77 0x7fffdc2e0d50 in RtlUserThreadStart+0x20 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x180070d50)

0x122972e2c4e0 is located 0 bytes inside of 64-byte region [0x122972e2c4e0,0x122972e2c520)
allocated by thread T0 here:
    #0 0x7fffaad10d91 in _asan_memmove+0x3d1 (C:\w\fx\mc\obj-asan\dist\bin\clang_rt.asan_dynamic-x86_64.dll+0x180030d91)
    #1 0x7fffaba3aed1 in moz_xmalloc c:\w\fx\mc\memory\mozalloc\mozalloc.cpp:83
    #2 0x7fffa04b30de in GetShadowData c:\w\fx\mc\layout\style\nsRuleNode.cpp:4461
    #3 0x7fffa049e73a in nsRuleNode::ComputeEffectsData c:\w\fx\mc\layout\style\nsRuleNode.cpp:10342
    #4 0x7fffa045b8b9 in nsRuleNode::WalkRuleTree c:\w\fx\mc\layout\style\nsRuleNode.cpp:2832
    #5 0x7fff9dc31d2c in nsRuleNode::GetStyleEffects<1> c:\w\fx\mc\obj-asan\dist\include\nsStyleStructList.h:155
    #6 0x7fff9e25cfc4 in nsStyleDisplay::HasFixedPosContainingBlockStyleInternal<nsStyleContext> c:\w\fx\mc\layout\style\nsStyleStructInlines.h:163
    #7 0x7fffa06534ea in nsCSSFrameConstructor::ConstructScrollableBlockWithConstructor c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5056
    #8 0x7fffa0658d32 in nsCSSFrameConstructor::ConstructScrollableBlock c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5020
    #9 0x7fffa0655401 in nsCSSFrameConstructor::ConstructFrameFromItemInternal c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:4017
    #10 0x7fffa065eed4 in nsCSSFrameConstructor::ConstructFramesFromItem c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:6409
    #11 0x7fffa0641d3c in nsCSSFrameConstructor::ProcessChildren c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:11279
    #12 0x7fffa064a7dd in nsCSSFrameConstructor::ConstructBlock c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:12499
    #13 0x7fffa0652ea0 in nsCSSFrameConstructor::ConstructNonScrollableBlockWithConstructor c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5108
    #14 0x7fffa0658d58 in nsCSSFrameConstructor::ConstructNonScrollableBlock c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5072
    #15 0x7fffa0655401 in nsCSSFrameConstructor::ConstructFrameFromItemInternal c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:4017
    #16 0x7fffa065eed4 in nsCSSFrameConstructor::ConstructFramesFromItem c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:6409
    #17 0x7fffa0641d3c in nsCSSFrameConstructor::ProcessChildren c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:11279
    #18 0x7fffa0655f9a in nsCSSFrameConstructor::ConstructFrameFromItemInternal c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:4201
    #19 0x7fffa065eed4 in nsCSSFrameConstructor::ConstructFramesFromItem c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:6409
    #20 0x7fffa0641d3c in nsCSSFrameConstructor::ProcessChildren c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:11279
    #21 0x7fffa064a7dd in nsCSSFrameConstructor::ConstructBlock c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:12499
    #22 0x7fffa0652ea0 in nsCSSFrameConstructor::ConstructNonScrollableBlockWithConstructor c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5108
    #23 0x7fffa0658d58 in nsCSSFrameConstructor::ConstructNonScrollableBlock c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5072
    #24 0x7fffa0655401 in nsCSSFrameConstructor::ConstructFrameFromItemInternal c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:4017
    #25 0x7fffa065eed4 in nsCSSFrameConstructor::ConstructFramesFromItem c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:6409
    #26 0x7fffa0641d3c in nsCSSFrameConstructor::ProcessChildren c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:11279
    #27 0x7fffa064a7dd in nsCSSFrameConstructor::ConstructBlock c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:12499
    #28 0x7fffa0653641 in nsCSSFrameConstructor::ConstructScrollableBlockWithConstructor c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5053
    #29 0x7fffa0658d32 in nsCSSFrameConstructor::ConstructScrollableBlock c:\w\fx\mc\layout\base\nsCSSFrameConstructor.cpp:5020

SUMMARY: AddressSanitizer: new-delete-type-mismatch (C:\w\fx\mc\obj-asan\dist\bin\clang_rt.asan_dynamic-x86_64.dll+0x18003c82b) in operator delete+0xcb
==8404==HINT: if you don't care about these errors you may set ASAN_OPTIONS=new_delete_type_mismatch=0
==8404==ABORTING
(In reply to Ting-Yu Chou [:ting] from comment #41)
> This is the error, which seems OK and can be suppressed. I'll check the hang
> tomorrow.

Unfortunately there seems no way to suppress it except ASAN_OPTIONS=new_delete_type_mismatch=0, see https://reviews.llvm.org/D21378.
Hmm, is that error really OK to suppress?  See this code: <https://searchfox.org/mozilla-central/rev/5ce320a16f6998dc2b36e30458131d029d107077/layout/style/nsStyleStruct.h#1050>  When nsCSSShadowArray::Release() calls |delete this;|, what would tell the compiler that it is not a memory block of sizeof(nsCSSShadowArray) that is being deleted?  That's what address sanitizer is telling us here, no?

It seems to me that the correct way to write this code would be to provide an operator delete() that computes the correct size just like operator new() and passes that to ::operator delete(), no?.  Just because jemalloc doesn't care about this information in practice doesn't mean ASAN doesn't!
I thought the check is for things like mismatched new[] and delete, but in nsCSSShadowARray we have proper new/delete. Wonder does the mismatched size cause any harm?
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog, Away 8/18) from comment #43)
> It seems to me that the correct way to write this code would be to provide
> an operator delete() that computes the correct size just like operator new()
> and passes that to ::operator delete(), no?.  Just because jemalloc doesn't
> care about this information in practice doesn't mean ASAN doesn't!

::operator delete() doesn't take a size, though.  (There's a C++14 overload that does, but that's not relevant here.)

(In reply to Ting-Yu Chou [:ting] from comment #44)
> I thought the check is for things like mismatched new[] and delete, but in
> nsCSSShadowARray we have proper new/delete. Wonder does the mismatched size
> cause any harm?

Maybe it's complaining that you have nsCSSShadowArray::operator new vs. ::operator delete.  Perhaps try adding an operator delete method to nsCSSShadowArray?
(In reply to Nathan Froyd [:froydnj] from comment #45)
> Maybe it's complaining that you have nsCSSShadowArray::operator new vs.
> ::operator delete.  Perhaps try adding an operator delete method to
> nsCSSShadowArray?

Surprisingly adding this line to nsCSSShadowArray avoid the error:

  void operator delete(void* aPtr) { ::operator delete(aPtr); }

there's a similar implementation in nsCSSValue http://searchfox.org/mozilla-central/rev/e8c36327cd8c9432c69e5e1383156a74330f11f2/layout/style/nsCSSValue.h#1071-1078.
Depends on: 1391500
Depends on: 1391511
Static check builds are still failed with "-fallback" because VC doesn't recognize source annotations like "__attribute__((annotate("moz_implicit")))":

0:55.30 clang.exe: warning: falling back to C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC\bin\amd64\cl.exe [-Wfallback]
 0:55.30 cl : Command line warning D9035 : option 'Og' has been deprecated and will be removed in a future release
 0:55.30 Unified_cpp_layout_style2.cpp
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(43): error C2059: syntax error: '('
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(43): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(43): error C2059: syntax error: ')'
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(44): error C2059: syntax error: '('
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(44): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(44): error C2535: 'int char16ptr_t::__attribute__(void)': member function already defined or declared
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(43): note: see declaration of 'char16ptr_t::__attribute__'
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(44): error C2059: syntax error: ')'
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(49): error C2059: syntax error: '('
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(49): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(49): error C2535: 'int char16ptr_t::__attribute__(void)': member function already defined or declared
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(43): note: see declaration of 'char16ptr_t::__attribute__'
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(49): error C2059: syntax error: ')'
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(69): error C2059: syntax error: '('
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(69): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(69): error C2535: 'int char16ptr_t::__attribute__(void)': member function already defined or declared
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(43): note: see declaration of 'char16ptr_t::__attribute__'
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/Char16.h(69): error C2059: syntax error: ')'
 0:55.30 c:\w\fx\mc\obj-stan\dist\include\mozilla/DebugOnly.h(39): fatal error C1903: unable to recover from previous error(s); stopping compilation
 0:55.30 clang.exe: error: clang frontend command failed with exit code 2 (use -v to see invocation)
 0:55.30 mozmake.EXE[5]: *** [c:/w/fx/mc/config/rules.mk:1062: Unified_cpp_layout_style2.obj] Error 2
Fallback mode actually disable static analysis checking, I am a bit worried about this.
(In reply to Ting-Yu Chou [:ting] from comment #51)
> Fallback mode actually disable static analysis checking, I am a bit worried
> about this.

To be clear here, fallback mode disables static analysis for the particular file that requires fallback, or on all files?  I hope it's the first one!  And if it's the first one, you're right, that is unfortunate, and perhaps we should reconsider using -fallback.

Is it possible to locally revert the offending patches in clang?
(In reply to Nathan Froyd [:froydnj] from comment #45)
> (In reply to :Ehsan Akhgari (needinfo please, extremely long backlog, Away
> 8/18) from comment #43)
> > It seems to me that the correct way to write this code would be to provide
> > an operator delete() that computes the correct size just like operator new()
> > and passes that to ::operator delete(), no?.  Just because jemalloc doesn't
> > care about this information in practice doesn't mean ASAN doesn't!
> 
> ::operator delete() doesn't take a size, though.  (There's a C++14 overload
> that does, but that's not relevant here.)

Oh right.  So there is some other C++ intricacy involved here that I don't understand.

> (In reply to Ting-Yu Chou [:ting] from comment #44)
> > I thought the check is for things like mismatched new[] and delete, but in
> > nsCSSShadowARray we have proper new/delete. Wonder does the mismatched size
> > cause any harm?
> 
> Maybe it's complaining that you have nsCSSShadowArray::operator new vs.
> ::operator delete.  Perhaps try adding an operator delete method to
> nsCSSShadowArray?

Yes, we should do this.  I wish I could remember the reason!
(In reply to Nathan Froyd [:froydnj] from comment #52)
> (In reply to Ting-Yu Chou [:ting] from comment #51)
> > Fallback mode actually disable static analysis checking, I am a bit worried
> > about this.
> 
> To be clear here, fallback mode disables static analysis for the particular
> file that requires fallback, or on all files?

Only in that one file.

> Is it possible to locally revert the offending patches in clang?

That is a lot scarier than the -fallback escape hatch IMO, and assuming that it's possible even is premature I think.  Even if it works now maintaining such a patch for future upgrades can be non-trivial depending on the nature of the patch, and evaluating the effect of such a reversal may not be much easier than fixing the bug in the first place.
Comment on attachment 8895670 [details]
Bug 1373562 - Bump to clang r311608 to fix the errors running ASan on Windows 10 1703.

https://reviewboard.mozilla.org/r/166944/#review176444
Attachment #8895670 - Flags: review?(ehsan) → review+
Comment on attachment 8899740 [details]
Bug 1373562 part 2 - Apply source annotatoins for static analysis only when the compiler is clang.

https://reviewboard.mozilla.org/r/171074/#review176446

::: gfx/skia/skia/include/config/SkUserConfig.h:161
(Diff revision 1)
>  #define SK_DISABLE_SCREENSPACE_TESS_AA_PATH_RENDERER
>  
>  #define SK_DISABLE_SLOW_DEBUG_VALIDATION 1
>  
>  #ifndef MOZ_IMPLICIT
> -#  ifdef MOZ_CLANG_PLUGIN
> +#  if defined(MOZ_CLANG_PLUGIN) && defined(__clang__)

Skia is an upstream library which we aren't allowed to modify in place like this, and in my experience upstreaming patches to Skia can be an involved process.

I think a better way to fix this would be to add something to mozilla-config.h.in to check to see if MOZ_CLANG_PLUGIN is defined and __clang__ isn't defined, and then #undef MOZ_CLANG_PLUGIN.  That way, you won't need any of these two hunks.
Attachment #8899740 - Flags: review?(ehsan) → review-
Comment on attachment 8899740 [details]
Bug 1373562 part 2 - Apply source annotatoins for static analysis only when the compiler is clang.

https://reviewboard.mozilla.org/r/171074/#review176446

> Skia is an upstream library which we aren't allowed to modify in place like this, and in my experience upstreaming patches to Skia can be an involved process.
> 
> I think a better way to fix this would be to add something to mozilla-config.h.in to check to see if MOZ_CLANG_PLUGIN is defined and __clang__ isn't defined, and then #undef MOZ_CLANG_PLUGIN.  That way, you won't need any of these two hunks.

But we're already defining MOZ stuff in this file, which presumably isn't in upstream Skia?  How does this file works with our Skia imports, then?
Yes, looks like you're right: https://skia.googlesource.com/skia/+/master/include/config/SkUserConfig.h  It looks like anything from this line on is our additions: <https://searchfox.org/mozilla-central/rev/48ea452803907f2575d81021e8678634e8067fc2/gfx/skia/skia/include/config/SkUserConfig.h#138>

Lee, what are the rules here?
Flags: needinfo?(lsalzman)
Comment on attachment 8899741 [details]
Bug 1373562 part 3 - Add compiler argument -fallback for clang-cl.

https://reviewboard.mozilla.org/r/171076/#review176534

There's finally some movement on the llvm bug. Let's wait for a few days and see if it gets fixed.
Attachment #8899741 - Flags: review?(mh+mozilla)
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog, Away 8/23) from comment #58)
> Yes, looks like you're right:
> https://skia.googlesource.com/skia/+/master/include/config/SkUserConfig.h 
> It looks like anything from this line on is our additions:
> <https://searchfox.org/mozilla-central/rev/
> 48ea452803907f2575d81021e8678634e8067fc2/gfx/skia/skia/include/config/
> SkUserConfig.h#138>
> 
> Lee, what are the rules here?

SkUserConfig.h is our particular Gecko-specific config for Skia, so nothing in there ever gets upstreamed. None the less, all changes to it are tracked as separate patches by me to ensure that when we do upgrade Skia, those changes don't get erased by accident.

If you do not like the idea of your changes accidentally being erased during a Skia upgrade, then it is best to have me review/r+ the patch, after which you can land it as normal. Once it lands, I then handle everything else with regards to setting up patches and what not.
Flags: needinfo?(lsalzman)
Comment on attachment 8899740 [details]
Bug 1373562 part 2 - Apply source annotatoins for static analysis only when the compiler is clang.

https://reviewboard.mozilla.org/r/171074/#review176582

OK, r=me then!  Please break off the Skia part into a separate patch as Lee requested.  Thanks.
Attachment #8899740 - Flags: review- → review+
https://bugs.llvm.org/show_bug.cgi?id=33997 is now fixed, and building Firefox without -fallback works with it.
Attachment #8899740 - Attachment is obsolete: true
Attachment #8899741 - Attachment is obsolete: true
The ASan build from r311608 loads FB, Youtube, CNN without any issues.
Pushed by tchou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/363f26479099
Bump to clang r311608 to fix the errors running ASan on Windows 10 1703. r=Ehsan
https://hg.mozilla.org/mozilla-central/rev/363f26479099
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.