Closed
Bug 1076446
Opened 7 years ago
Closed 7 years ago
Test DMD on TBPL (Windows)
Categories
(Core :: DMD, defect)
Tracking
()
RESOLVED
FIXED
mozilla36
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
Attachments
(1 file, 3 obsolete files)
45.06 KB,
patch
|
glandium
:
review+
|
Details | Diff | Splinter Review |
Bug 1073312 will add a DMD test, but not run it on Windows. The immediate problem is that the test uses /usr/bin/diff which doesn't work on Windows. And there might be other problems after that.
![]() |
Assignee | |
Comment 1•7 years ago
|
||
On try it fails like this: > Assertion failure: gSmallBlockActualSizeCounter == 120, at c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/memory/replace/dmd/DMD.cpp:2080 This is exactly the same failure that Catalin saw in bug 819839 comment 18. Back then he said that some of the malloc calls were ending up in the system malloc instead of the replace_malloc one; I wonder if that's still the problem. It's a plausible explanation, and I can't think of anything else that would cause this assertion failure.
![]() |
Assignee | |
Comment 2•7 years ago
|
||
I've confirmed that replacing the malloc() calls in RunTestMode() in DMD.cpp with replace_malloc() calls fixes the assertion and lets DMD's test mode run to completion. So it's definitely a problem with the malloc replacement. glandium, any idea what might be wrong? Is there some reason why malloc() calls within DMD.cpp itself wouldn't get replaced on Windows?
Flags: needinfo?(mh+mozilla)
Comment 3•7 years ago
|
||
> Is there some reason why malloc() calls within DMD.cpp itself wouldn't get replaced on Windows?
Because you're not supposed to call malloc from a replace malloc
library. You're just lucky that it happens to work on linux and mac
because of how things are, but that wouldn't work on android or
windows.
If you want to call plain malloc, you need to extract that test code
from the dmd lib and put it elsewhere. If you only care about testing
the dmd code itself and don't care about testing the replace-malloc
infrastructure itself, you can just call replace_malloc functions.
Flags: needinfo?(mh+mozilla)
![]() |
Assignee | |
Comment 4•7 years ago
|
||
> Because you're not supposed to call malloc from a replace malloc > library. You're just lucky that it happens to work on linux and mac > because of how things are, but that wouldn't work on android or > windows. Fair enough. > If you want to call plain malloc, you need to extract that test code > from the dmd lib and put it elsewhere. If you only care about testing > the dmd code itself and don't care about testing the replace-malloc > infrastructure itself, you can just call replace_malloc functions. Does replace-malloc have any other testing? If so, then the latter option would be ok.
Comment 5•7 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #4) > Does replace-malloc have any other testing? If so, then the latter option > would be ok. Currently no, but it's probably better to add testing for that than to jump into hoops to have that done somehow by the DMD test.
![]() |
Assignee | |
Comment 6•7 years ago
|
||
I've gotten further, but now I'm hitting the problem that DMD is getting empty stacks on Windows on try:
> Live {
> 30 blocks in heap block record 1 of 1
> 12,088 bytes (7,870 requested / 4,218 slop)
> 100.00% of the heap (100.00% cumulative)
> Allocated at {
> }
> }
There are supposed to be stack frame descriptions in the "Allocated at" block. Hmm.
![]() |
Assignee | |
Comment 7•7 years ago
|
||
> I've gotten further, but now I'm hitting the problem that DMD is getting > empty stacks on Windows on try: I've worked out why. We're hitting this timeout in EnsureWalkThreadReady(): http://dxr.mozilla.org/mozilla-central/source/xpcom/base/nsStackWalk.cpp#308 The problem is that DMD's test mode runs extremely early -- as soon as the first call to malloc occurs. This is probably while static initializers are running, before we reach main(). This is too early for the Windows stack-walking thread to have started, so we get empty stack traces. Catalin hit exactly this problem in bug 819839 and changed the code so that it would return empty stack traces instead of deadlocking in this case. Which is ok for general use, but no good for the test as it's currently implemented. Probably the easiest thing is to find a way to make DMD's test mode run later. I can do that by creating a JS chrome function that triggers it, and calling that from the xpcshell test. That's arguably nicer than the magic start-up thing in place anyway.
![]() |
Assignee | |
Comment 8•7 years ago
|
||
Here's a draft patch. - It moves the DMD test code out of DMD.cpp, into test/SmokeDMD.cpp. - It executes |SmokeDMD| from the xpcshell test. - It removes DMD's --mode option, because it's no longer needed. - It enables the test on Windows. On Linux and Mac it works nicely. But there are problems on Windows on try (https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=38ef0222592f). - On WinXP and Win7, SmokeDMD appears to hang silently, which causes runxpcshelltests.py to time out. - On Win8, SmokeDMD aborts or crashes silently, causing the process.run() call to fail with NS_ERROR_FILE_EXECUTION_FAILED. I'm not sure what the problem is.
![]() |
Assignee | |
Comment 9•7 years ago
|
||
> - On WinXP and Win7, SmokeDMD appears to hang silently, which causes > runxpcshelltests.py to time out. > > - On Win8, SmokeDMD aborts or crashes silently, causing the process.run() > call > to fail with NS_ERROR_FILE_EXECUTION_FAILED. On Win8 the failures vary. In https://mail.google.com/mail/u/0/#search/tryserver/1490c783401bd749 the failure was that the IsRunning() assertion at the start of SmokeDMD.cpp failed. This indicates that dmd::Init() hasn't been called, which indicates that something is going wrong with replace-malloc, because there's a malloc call just before that assertion which should trigger dmd::Init() being called. On another Win8 try run I saw aborts coming from jemalloc. And on the try run mentioned in comment 8 there was an unspecified abort or crash from SmokeDMD.
![]() |
Assignee | |
Comment 10•7 years ago
|
||
> On another Win8 try run I saw aborts coming from jemalloc.
Nb: in these ones dmd::Init() was definitely called, and the crash occurred some time after that. It's all very unpredictable.
![]() |
Assignee | |
Comment 11•7 years ago
|
||
This is green on Windows, both locally and on try. Most of the patch is taken up by moving the test code from DMD.cpp to SmokeDMD.cpp. Only very minor changes were made to that code as part of the move.
Attachment #8506008 -
Flags: review?(mh+mozilla)
![]() |
Assignee | |
Updated•7 years ago
|
Attachment #8503913 -
Attachment is obsolete: true
Comment 12•7 years ago
|
||
Comment on attachment 8506008 [details] [diff] [review] Make the DMD test work on Windows Review of attachment 8506008 [details] [diff] [review]: ----------------------------------------------------------------- ::: memory/replace/dmd/DMD.cpp @@ -2058,5 @@ > - for (int i = 0; i < seven + 8; i++) { > - s = (char*) replace_malloc(8); > - UseItOrLoseIt(s, seven); > - } > - MOZ_ASSERT(gSmallBlockActualSizeCounter == 120); Do you lose something by not having those asserts in some form in the moved code? ::: memory/replace/dmd/dmd.py @@ +502,5 @@ > > > if __name__ == '__main__': > main() > + No need to add an empty line :) ::: memory/replace/dmd/test/SmokeDMD.cpp @@ +96,5 @@ > + // certainly doesn't know this. > + // > + int n = (rand() % 10 + 1) * 9; > + int nine = (n / 10) + (n % 10); > + int seven = nine - 2; You don't need this if you build this with -O0. @@ +248,5 @@ > + free(c); > + free(e); > + Report(e2); > + free(e3); > + Maybe copy the commented free() calls corresponding to the commented cases above? ::: memory/replace/dmd/test/moz.build @@ +17,5 @@ > +DEFINES['MOZ_NO_MOZALLOC'] = True > + > +DISABLE_STL_WRAPPING = True > + > +USE_LIBS += ['dmd'] trailing whitespace. ::: memory/replace/dmd/test/test_dmd.js @@ +53,5 @@ > + data += str.value; > + } while (read != 0); > + } > + cstream.close(); // this closes fstream > + return data.replace(/\r/g, ""); // normalize line endings It probably would be better to use OS.File, although I don't know how you can do pseudo-synchronous things with it. https://developer.mozilla.org/en-US/docs/JavaScript_OS.File/OS.File_for_the_main_thread#OS.File.read ::: testing/mochitest/Makefile.in @@ +64,5 @@ > > ifdef MOZ_DMD > +TEST_HARNESS_BINS += \ > + dmd.py \ > + SmokeDMD$(BIN_SUFFIX) SmokeDMD$(BIN_SUFFIX) \ $(NULL) ::: toolkit/mozapps/installer/packager.mk @@ +601,5 @@ > endif > > +ifdef MOZ_DMD > +NO_PKG_FILES += \ > + SmokeDMD Put it all on one line, or do SmokeDMD \ $(NULL)
Attachment #8506008 -
Flags: review?(mh+mozilla) → feedback+
Comment 13•7 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #12) > ::: memory/replace/dmd/test/test_dmd.js > @@ +53,5 @@ > > + data += str.value; > > + } while (read != 0); > > + } > > + cstream.close(); // this closes fstream > > + return data.replace(/\r/g, ""); // normalize line endings > > It probably would be better to use OS.File, although I don't know how you > can do pseudo-synchronous things with it. > https://developer.mozilla.org/en-US/docs/JavaScript_OS.File/OS. > File_for_the_main_thread#OS.File.read Yoric, would you have some advice here?
Flags: needinfo?(dteller)
![]() |
Assignee | |
Comment 14•7 years ago
|
||
> Do you lose something by not having those asserts in some form in the moved
> code?
Not really. If anything goes wrong the output will be different to what is expected.
![]() |
Assignee | |
Comment 15•7 years ago
|
||
> It probably would be better to use OS.File, although I don't know how you can do pseudo-synchronous things with it.
I deliberately avoided it precisely because synchronous file-handling is appropriate here, and OS.File is entirely designed around asynchronous handling. No point trying to fit a square peg in a round hole when there is a square hole available that works just as well.
I agree with Nicholas. If what you need is synchronous, don't use OS.File.
Flags: needinfo?(dteller)
![]() |
Assignee | |
Comment 17•7 years ago
|
||
This updated patch addresses all comments, except I haven't used OS.File, and I
didn't do anything about the following suggestion because I didn't understand
it:
> Maybe copy the commented free() calls corresponding to the commented cases above?
Attachment #8509739 -
Flags: review?(mh+mozilla)
![]() |
Assignee | |
Updated•7 years ago
|
Attachment #8506008 -
Attachment is obsolete: true
![]() |
Assignee | |
Comment 18•7 years ago
|
||
Updated version that adds the commented-out free() calls back in, and tweaks the commented-out aligned_alloc() call to be more like the others.
Attachment #8509926 -
Flags: review?(mh+mozilla)
![]() |
Assignee | |
Updated•7 years ago
|
Attachment #8509739 -
Attachment is obsolete: true
Attachment #8509739 -
Flags: review?(mh+mozilla)
Comment 19•7 years ago
|
||
Comment on attachment 8509926 [details] [diff] [review] Make the DMD test work on Windows Review of attachment 8509926 [details] [diff] [review]: ----------------------------------------------------------------- ::: memory/replace/dmd/test/SmokeDMD.cpp @@ +88,5 @@ > + // eliding unused malloc() calls or unrolling loops with fixed iteration > + // counts. So we compile it with -O0 (or equivalent), which probably prevents > + // that. We also use the following variable for various loop iteration > + // counts, just in case compilers might unroll very small loops even with > + // -O0. Compilers don't unroll anything at -O0. I'm still not convinced by the "seven" thing, but meh.
Attachment #8509926 -
Flags: review?(mh+mozilla) → review+
![]() |
Assignee | |
Comment 20•7 years ago
|
||
And we recently change the Windows 8 builds to be 64-bit, and that breaks the test yet again: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=68627a877b6e There are lots of "ERROR: WalkStack64: The handle is invalid." messages. Sigh.
![]() |
Assignee | |
Comment 21•7 years ago
|
||
I ended up disabling the test on Windows 8 just so I could land this and the four other patches I had in my queue following it. https://hg.mozilla.org/integration/mozilla-inbound/rev/835fbe63da4a I filed bug 1088343 as a follow-up.
Comment 22•7 years ago
|
||
sorry had to back this out in https://treeherder.mozilla.org/ui/#/jobs?repo=mozilla-inbound&revision=de805196bbc4 since one of this changes caused perma failure on 10.8 Debug Tests like https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3275566&repo=mozilla-inbound
![]() |
Assignee | |
Comment 23•7 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #22) > sorry had to back this out in > https://treeherder.mozilla.org/ui/#/jobs?repo=mozilla- > inbound&revision=de805196bbc4 since one of this changes caused perma failure > on 10.8 Debug Tests like > https://treeherder.mozilla.org/ui/logviewer. > html#?job_id=3275566&repo=mozilla-inbound Hey glandium. Looks like some compilers do unroll some loops with -O0. Argh. Even more annoying: I would have caught this before landing if my Mac xpcshell try run hadn't taken over 26 hours to run, which was long enough that I gave up on it.
Comment 24•7 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #23) > (In reply to Carsten Book [:Tomcat] from comment #22) > > sorry had to back this out in > > https://treeherder.mozilla.org/ui/#/jobs?repo=mozilla- > > inbound&revision=de805196bbc4 since one of this changes caused perma failure > > on 10.8 Debug Tests like > > https://treeherder.mozilla.org/ui/logviewer. > > html#?job_id=3275566&repo=mozilla-inbound > > Hey glandium. Looks like some compilers do unroll some loops with -O0. Argh. That's hard to believe. I'd like to see the disassembly.
![]() |
Assignee | |
Comment 25•7 years ago
|
||
My mistake; it appears that when I have two calls to Report() on consecutive lines, clang is causing them to have identical stack traces. If I artfully insert another function call between the Report() calls, then they end up with different stack traces as expected.
![]() |
Assignee | |
Comment 26•7 years ago
|
||
Attempt 2: https://hg.mozilla.org/integration/mozilla-inbound/rev/b339acb1f7fe
Comment 27•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/b339acb1f7fe
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Updated•6 years ago
|
Flags: qe-verify-
Comment hidden (spam) |
You need to log in
before you can comment on or make changes to this bug.
Description
•