Closed Bug 964340 Opened 10 years ago Closed 10 years ago

GenerationalGC: dom/workers test crash in ZonesIter on Windows

Categories

(Core :: JavaScript Engine, defect)

x86_64
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 987160

People

(Reporter: jonco, Unassigned)

References

Details

Attachments

(7 files, 1 obsolete file)

On Windows 7 and Windows XP debug builds I'm seeing crashes running the dom/workers tests in mochitest other, for example:

05:08:06     INFO -  [Parent 3528] ###!!! ABORT: Tear-off objects remain in hashtable at shutdown.: '!mTable', file c:\builds\moz2_slave\try-w32-d-00000000000000000000\build\content\svg\content\src\nsSVGAttrTearoffTable.h, line 28
05:08:06     INFO -  [2864] WARNING: shutting down early because of crash!: file c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/dom/plugins/ipc/PluginModuleChild.cpp, line 697
05:08:06     INFO -  [2864] WARNING: plugin process _exit()ing: file c:/builds/moz2_slave/try-w32-d-00000000000000000000/build/dom/plugins/ipc/PluginModuleChild.cpp, line 662
05:08:06     INFO -  NPP_Destroy
05:08:06     INFO -  NPP_Destroy
05:08:06     INFO -  NPP_Destroy
05:08:06     INFO -  NPP_Destroy
05:08:06     INFO -  NPP_Destroy
05:08:06     INFO -  NPP_Destroy
05:08:06     INFO -  NPP_Destroy
05:08:06     INFO -  nsStringStats
05:08:06     INFO -   => mAllocCount:            167
05:08:06     INFO -   => mReallocCount:            1
05:08:06     INFO -   => mFreeCount:              88  --  LEAKED 79 !!!
05:08:06     INFO -   => mShareCount:            159
05:08:06     INFO -   => mAdoptCount:              0
05:08:06     INFO -   => mAdoptFreeCount:          0
05:08:06     INFO -   => Process ID: 2864, Thread ID: 640
05:08:06  WARNING -  TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/dom/workers/test/test_fileSubWorker.xul | application terminated with exit code 2147483651
05:08:06     INFO -  INFO | runtests.py | Application ran for: 0:06:28.434000
05:08:06     INFO -  INFO | zombiecheck | Reading PID log: c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpll4dzmpidlog
05:08:06     INFO -  ==> process 3528 launched child process 2712 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.14951fe0.2077409828 -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" tab)
05:08:06     INFO -  ==> process 3528 launched child process 2404 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.18f70df0.454271108 -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" tab)
05:08:06     INFO -  ==> process 3528 launched child process 2556 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.157bd040.185348031 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" plugin)
05:08:06     INFO -  ==> process 3528 launched child process 2540 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.18eba1a8.1846843863 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" plugin)
05:08:06     INFO -  ==> process 3528 launched child process 2576 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.ef9be78.2133968354 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" plugin)
05:08:06     INFO -  ==> process 3528 launched child process 2064 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.19755030.1205676222 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" plugin)
05:08:06     INFO -  ==> process 3528 launched child process 2904 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.1aa9f978.558499738 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" plugin)
05:08:06     INFO -  ==> process 3528 launched child process 3076 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.19e8b6e0.1841589761 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" plugin)
05:08:06     INFO -  ==> process 3528 launched child process 2864 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3528.15419d88.1387227371 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3528 "\\.\pipe\gecko-crash-server-pipe.3528" plugin)
05:08:14  WARNING -  PROCESS-CRASH | chrome://mochitests/content/chrome/dom/workers/test/test_fileSubWorker.xul | application crashed [@ js::ZonesIter::ZonesIter(JSRuntime *,js::ZoneSelector)]
05:08:14     INFO -  Crash dump filename: c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpxuos3y\minidumps\ac63b389-6979-4e2d-941e-5ab35d251f98.dmp
05:08:14     INFO -  Operating system: Windows NT
05:08:14     INFO -                    5.1.2600 Service Pack 3
05:08:14     INFO -  CPU: x86
05:08:14     INFO -       GenuineIntel family 6 model 30 stepping 5
05:08:14     INFO -       8 CPUs
05:08:14     INFO -  Crash reason:  EXCEPTION_ACCESS_VIOLATION_READ
05:08:14     INFO -  Crash address: 0x24
05:08:14     INFO -  Thread 107 (crashed)
05:08:14     INFO -   0  mozjs.dll!js::ZonesIter::ZonesIter(JSRuntime *,js::ZoneSelector) [Zone.h:78bf56b2c0fd : 346 + 0xa]
05:08:14     INFO -      eip = 0x00cd3da6   esp = 0x2f1ffc38   ebp = 0x2f1ffc44   ebx = 0x24ec9bc0
05:08:14     INFO -      esi = 0x2f1ffc60   edi = 0x1b09d090   eax = 0x24ec9d10   ecx = 0x00000000
05:08:14     INFO -      edx = 0xcdcdcdcd   efl = 0x00010246
05:08:14     INFO -      Found by: given as instruction pointer in context
05:08:14     INFO -   1  mozjs.dll!JSRuntime::~JSRuntime() [Runtime.cpp:78bf56b2c0fd : 449 + 0x14]
05:08:14     INFO -      eip = 0x010bd3a4   esp = 0x2f1ffc4c   ebp = 0x2f1ffc94
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   2  mozjs.dll!JS_NewRuntime(unsigned int,JSUseHelperThreads) [jsapi.cpp:78bf56b2c0fd : 647 + 0x6]
05:08:14     INFO -      eip = 0x00f34278   esp = 0x2f1ffc9c   ebp = 0x2f1ffca0
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   3  xul.dll!mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(unsigned int,JSUseHelperThreads) [CycleCollectedJSRuntime.cpp:78bf56b2c0fd : 447 + 0xb]
05:08:14     INFO -      eip = 0x0194a7a2   esp = 0x2f1ffca8   ebp = 0x2f1ffcb4
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   4  xul.dll!`anonymous namespace'::WorkerThreadPrimaryRunnable::Run() [RuntimeService.cpp:78bf56b2c0fd : 2519 + 0x19]
05:08:14     INFO -      eip = 0x026bdf2c   esp = 0x2f1ffcbc   ebp = 0x2f1ffe00
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   5  xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:78bf56b2c0fd : 637 + 0xd]
05:08:14     INFO -      eip = 0x01990517   esp = 0x2f1ffe08   ebp = 0x2f1ffe68
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   6  xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:78bf56b2c0fd : 263 + 0xc]
05:08:14     INFO -      eip = 0x01923d51   esp = 0x2f1ffe70   ebp = 0x2f1ffe7c
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   7  xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [MessagePump.cpp:78bf56b2c0fd : 332 + 0x9]
05:08:14     INFO -      eip = 0x01bc2780   esp = 0x2f1ffe84   ebp = 0x2f1ffea8
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   8  xul.dll!MessageLoop::RunInternal() [message_loop.cc:78bf56b2c0fd : 226 + 0x8]
05:08:14     INFO -      eip = 0x01b86342   esp = 0x2f1ffeb0   ebp = 0x2f1ffec8
05:08:14     INFO -      Found by: call frame info
05:08:14     INFO -   9  xul.dll!MessageLoop::RunHandler() [message_loop.cc:78bf56b2c0fd : 219 + 0x4]

Logs here:

https://tbpl.mozilla.org/php/getParsedLog.php?id=33625422&tree=Try#error0
https://tbpl.mozilla.org/php/getParsedLog.php?id=33625491&tree=Try#error0
Summary: GenerartionalGC: dom/workers test crash in ZonesIter on Windows → GenerationalGC: dom/workers test crash in ZonesIter on Windows
Depends on: 967589
Now the assertion failure has been cleared up, it seems that we're hitting an OOM.  Looking at the previous few tests, the memory usage seems to really ramp up:

    114 ms  735.23 MB  431.94 MB  516.53 MB chrome://mochitests/content/chrome/dom/workers/test/test_extension.xul
    161 ms  735.23 MB  431.94 MB  515.55 MB chrome://mochitests/content/chrome/dom/workers/test/test_extensionBootstrap.xul
    263 ms  792.48 MB  367.94 MB  572.48 MB chrome://mochitests/content/chrome/dom/workers/test/test_file.xul
    1.1 S   811.31 MB  334.94 MB  588.76 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileBlobPosting.xul
    339 ms  962.81 MB  202.94 MB  738.08 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileBlobSubWorker.xul
    177 ms 1000.94 MB  165.94 MB  776.11 MB chrome://mochitests/content/chrome/dom/workers/test/test_filePosting.xul
    240 ms    1.05 GB  154.91 MB  852.33 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileReadSlice.xul
    341 ms    1.31 GB   89.35 MB  985.55 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileReaderSync.xul
    189 ms    1.33 GB   89.35 MB 1004.48 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileReaderSyncErrors.xul
    388 ms    1.47 GB   72.06 MB    1.13 GB chrome://mochitests/content/chrome/dom/workers/test/test_fileSlice.xul
    350 ms    1.55 GB   26.88 MB    1.20 GB chrome://mochitests/content/chrome/dom/workers/test/test_fileSubWorker.xul

The figures are test time, and the "vsize" "vsizeMaxContiguous" and "residentFast" values taken from the test log.

For comparison this doesn't seem to happen for these tests on inbound (this is a Windows XP debug build):

    123 ms  509.67 MB  656.94 MB  285.77 MB chrome://mochitests/content/chrome/dom/workers/test/test_extensionBootstrap.xul
    180 ms  518.86 MB  647.94 MB  292.42 MB chrome://mochitests/content/chrome/dom/workers/test/test_file.xul
    811 ms  522.63 MB  641.94 MB  292.01 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileBlobPosting.xul
    206 ms  546.13 MB  620.94 MB  311.59 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileBlobSubWorker.xul
    133 ms  552.25 MB  614.94 MB  316.97 MB chrome://mochitests/content/chrome/dom/workers/test/test_filePosting.xul
    164 ms  564.50 MB  602.94 MB  327.68 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileReadSlice.xul
    206 ms  585.94 MB  581.94 MB  346.23 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileReaderSync.xul
    133 ms  589.07 MB  578.94 MB  348.99 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileReaderSyncErrors.xul
    217 ms  613.50 MB  554.94 MB  368.37 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileSlice.xul
    211 ms  766.00 MB  402.94 MB  389.92 MB chrome://mochitests/content/chrome/dom/workers/test/test_fileSubWorker.xul

This might possibly be caused by the memory leak described in bug 969012.
So the patch in bug 969012 doesn't fix this unfortunately.  There's a huge memory spike going on around the dom/workers tests.  This happens in linux builds as well but it's only causing crashes in Windows builds.
It seems the memory spike happens on linux as well as Windows, and both opt and debug builds, but it's only on Windows that it causes OOM and kills the tests.
Attached image graph.png
Graph of memory usage on Windows and linux debug builds.  The green line shows the memory spike and then crash.  On linux, memory spikes but then recovers.
I turned on full GC logging and ran M-oth on tbpl, and had it upload the log file. (This is on Linux; uploads are busted for Windows at the moment.) I thought the GC timer log had a total heap size in it, but I guess not. So here's just a plot of the "Total Time" field in each log entry, overload on the vsize graph. The total time metric is not very interesting, but the X axes match up so you can see that there was a GC before the memory spike, 2 GCs at its peak, and then several seconds *after* the GCs, the memory dropped and there were no more GCs for a long long time.

Or actually, you can't, because this is too zoomed out.
In short, I don't know why the memory usage drops on Linux, unless the background free thread is really really bogged down. Which I doubt.
Oops, I trimmed off the earlier GCs.
Attached file worker_log.txt (obsolete) —
Here's the output of running the worker/dom tests with additional tracing to show worker and nursery creation and destructions.  It looks like we reach a peak of 54 workers alive in test_fileSubWorker.xul!
Attached file worker_log.txt
Log output updated with with worker script URL and origin.
Attachment #8395682 - Attachment is obsolete: true
This was fixed by bug 987160.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: