Closed Bug 1462825 Opened 6 years ago Closed 6 years ago

Intermittent glterrain | application crashed [@ static nsresult `anonymous namespace'::internal_ScalarSnapshotter(const class mozilla::BaseAutoLock<mozilla::StaticMutex> & const, class nsDataHashtable<nsUint32HashKey,nsTArray<mozilla::Tuple<char ..

Categories

(Toolkit :: Telemetry, defect)

defect
Not set
critical

Tracking

()

RESOLVED INVALID

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: crash, intermittent-failure)

Crash Data

Filed by: apavel [at] mozilla.com https://treeherder.mozilla.org/logviewer.html#?job_id=179242688&repo=autoland https://queue.taskcluster.net/v1/task/HujG6mAoTDm05fGDvFO5WQ/runs/0/artifacts/public/logs/live_backing.log 20:40:59 INFO - PID 6856 | [Parent 6856, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 20:40:59 INFO - PID 6856 | [Child 6860, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 20:40:59 INFO - PID 6856 | [Child 6860, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 20:40:59 INFO - PID 6856 | [Parent 6856, Gecko_IOThread] WARNING: file z:/build/build/src/ipc/chromium/src/base/process_util_win.cc, line 188 20:40:59 INFO - PID 6856 | [GFX1-]: ERROR: Invalid window dimensions 120x0. Please call api.set_window_size() 20:40:59 INFO - PID 6856 | ERROR 2018-05-18T20:40:59Z: webrender::render_backend: ERROR: Invalid window dimensions 120x0. Please call api.set_window_size() 20:40:59 INFO - PID 6856 | [Parent 6856, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 20:40:59 INFO - PID 6856 | [Child 3748, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346 20:41:00 INFO - TEST-INFO | 6856: exit 1 20:41:00 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/PtjxCcVJR_yz-VJ31ELl3Q/artifacts/public/build/target.crashreporter-symbols.zip 20:41:04 INFO - mozcrash Copy/paste: C:\Users\task_1526673738\build\win32-minidump_stackwalk.exe c:\users\genericworker\appdata\local\temp\tmp5qwbgx\profile\minidumps\4c426c52-2624-42ee-bbc3-2b617275c51d.dmp c:\users\genericworker\appdata\local\temp\tmpdfpcle 20:41:08 INFO - mozcrash Saved minidump as C:\Users\task_1526673738\build\blobber_upload_dir\4c426c52-2624-42ee-bbc3-2b617275c51d.dmp 20:41:08 INFO - mozcrash Saved app info as C:\Users\task_1526673738\build\blobber_upload_dir\4c426c52-2624-42ee-bbc3-2b617275c51d.extra 20:41:08 INFO - PROCESS-CRASH | glterrain | application crashed [@ static nsresult `anonymous namespace'::internal_ScalarSnapshotter(const class mozilla::BaseAutoLock<mozilla::StaticMutex> & const, class nsDataHashtable<nsUint32HashKey,nsTArray<mozilla::Tuple<char const *,nsCOMPtr<nsIVariant>,unsigned int> > > & const, unsigned int, class nsClassHashtable<nsUint32HashKey,nsClassHashtable<nsUint32HashKey,`anonymous namespace'::ScalarBase> > & const, bool)] 20:41:08 INFO - Crash dump filename: c:\users\genericworker\appdata\local\temp\tmp5qwbgx\profile\minidumps\4c426c52-2624-42ee-bbc3-2b617275c51d.dmp 20:41:08 INFO - Operating system: Windows NT 20:41:08 INFO - 10.0.15063 20:41:08 INFO - CPU: amd64 20:41:08 INFO - family 6 model 94 stepping 3 20:41:08 INFO - 8 CPUs 20:41:08 INFO - GPU: UNKNOWN 20:41:08 INFO - Crash reason: EXCEPTION_ACCESS_VIOLATION_READ 20:41:08 INFO - Crash address: 0xffffffffffffffff 20:41:08 INFO - Process uptime: 83 seconds 20:41:08 INFO - Thread 0 (crashed) 20:41:08 INFO - 0 xul.dll!static nsresult `anonymous namespace'::internal_ScalarSnapshotter(const class mozilla::BaseAutoLock<mozilla::StaticMutex> & const, class nsDataHashtable<nsUint32HashKey,nsTArray<mozilla::Tuple<char const *,nsCOMPtr<nsIVariant>,unsigned int> > > & const, unsigned int, class nsClassHashtable<nsUint32HashKey,nsClassHashtable<nsUint32HashKey,`anonymous namespace'::ScalarBase> > & const, bool) [TelemetryScalar.cpp:8c6bdb5e72fea5c7ad660f7e81fa73533dec2fa7 : 1594 + 0x3] 20:41:08 INFO - rax = 0x00007ffe04609101 rdx = 0x000000f7e63fbd20 20:41:08 INFO - rcx = 0x00000231314b2910 rbx = 0x00000231314b2910 20:41:08 INFO - rsi = 0x0000000000000000 rdi = 0x00007ffe04609118 20:41:08 INFO - rbp = 0x000000f7e63fbd81 rsp = 0x000000f7e63fbd00 20:41:08 INFO - r8 = 0x00007ffe0431de0f r9 = 0x0000000000000020 20:41:08 INFO - r10 = 0x0000000000000000 r11 = 0x0000023126800120 20:41:08 INFO - r12 = 0x0000000000000000 r13 = 0x000000f7e63fbef0 20:41:08 INFO - r14 = 0x0000023133257d60 r15 = 0x000000f7e63fbe88 20:41:08 INFO - rip = 0x00007ffe032fce8e 20:41:08 INFO - Found by: given as instruction pointer in context 20:41:08 INFO - 1 xul.dll!TelemetryScalar::CreateSnapshots(unsigned int,bool,JSContext *,unsigned char,JS::MutableHandle<JS::Value>) [TelemetryScalar.cpp:8c6bdb5e72fea5c7ad660f7e81fa73533dec2fa7 : 2452 + 0x1b] 20:41:08 INFO - rbx = 0x00000231314b2910 rbp = 0x000000f7e63fbd81 20:41:08 INFO - rsp = 0x000000f7e63fbde0 r12 = 0x0000000000000000 20:41:08 INFO - r13 = 0x000000f7e63fbef0 r14 = 0x0000023133257d60 20:41:08 INFO - r15 = 0x000000f7e63fbe88 rip = 0x00007ffe032f92cb 20:41:08 INFO - Found by: call frame info 20:41:08 INFO - 2 xul.dll!XPTC__InvokebyIndex + 0x72 20:41:08 INFO - rbx = 0x00000231314b2910 rbp = 0x000000f7e63fbd81 20:41:08 INFO - rsp = 0x000000f7e63fbee0 r12 = 0x0000000000000000 20:41:08 INFO - r13 = 0x000000f7e63fbef0 r14 = 0x0000023133257d60 20:41:08 INFO - r15 = 0x000000f7e63fbe88 rip = 0x00007ffe00d09fa2 20:41:08 INFO - Found by: call frame info 20:41:08 INFO - Thread 1 20:41:08 INFO - 0 ntdll.dll!NtRemoveIoCompletion + 0x14 20:41:08 INFO - rax = 0x000002312826e840 rdx = 0x0000000000000020 20:41:08 INFO - rcx = 0x000002312828d260 rbx = 0x000000f7e6bffec8 20:41:08 INFO - rsi = 0x0000000000000001 rdi = 0x000000f7e6bffec0 20:41:08 INFO - rbp = 0x000000f7e6bfff10 rsp = 0x000000f7e6bffe28 20:41:08 INFO - r8 = 0x00000231267282fc r9 = 0x000000000000001b 20:41:08 INFO - r10 = 0x00007ffe2dbc95d0 r11 = 0x0000000000000246 20:41:08 INFO - r12 = 0x0000000000000000 r13 = 0x0000000000000250 20:41:08 INFO - r14 = 0x0000000000000254 r15 = 0x0000000000000000 20:41:08 INFO - rip = 0x00007ffe2db554d4 20:41:08 INFO - Found by: given as instruction pointer in context 20:41:08 INFO - 1 KERNELBASE.dll!LdrResGetRCConfig + 0x3ff 20:41:08 INFO - rbx = 0x000000f7e6bffec8 rbp = 0x000000f7e6bfff10 20:41:08 INFO - rsp = 0x000000f7e6bffe30 r12 = 0x0000000000000000 20:41:08 INFO - r13 = 0x0000000000000250 r14 = 0x0000000000000254 20:41:08 INFO - r15 = 0x0000000000000000 rip = 0x00007ffe2a199aff 20:41:08 INFO - Found by: call frame info 20:41:08 INFO - 2 kernel32.dll!LdrpLoadResourceFromAlternativeModule + 0x27c 20:41:08 INFO - rbx = 0x000000f7e6bffec8 rbp = 0x000000f7e6bfff10 20:41:08 INFO - rsp = 0x000000f7e6bfff40 r12 = 0x0000000000000000 20:41:08 INFO - r13 = 0x0000000000000250 r14 = 0x0000000000000254 20:41:08 INFO - r15 = 0x0000000000000000 rip = 0x00007ffe2cd82774 20:41:08 INFO - Found by: call frame info 20:41:08 INFO - 3 ntdll.dll!RtlUserThreadStart + 0x21 20:41:08 INFO - rsp = 0x000000f7e6bfff70 rip = 0x00007ffe2db20d61 20:41:08 INFO - Found by: stack scanning
Component: Talos → Telemetry
Product: Testing → Toolkit
Version: Version 3 → unspecified
Blocks: 1453591
Priority: P5 → P1
Hey Chris, I'm trying to investigate this because, even if the frequency is low, it looks *bad* and related to the GV changes. I think this is happening while shutting down, given that the test suite is printing the final report and we're seeing the usual IPC errors due to the channels being closed. The problem seems to happen around here [1]. |EXCEPTION_ACCESS_VIOLATION_READ| seems to suggest we're reading memory that was already freed/not available anymore. What's your take on this? [1] - https://searchfox.org/mozilla-central/rev/eb51d734a8c06ac5973291be9bcc74ed4e999bca/toolkit/components/telemetry/TelemetryScalar.cpp#1594
Flags: needinfo?(chutten)
You say "low" and I say "there's exactly one report" :) From the callstack we have there's "InvokeByIndex" which I thought meant we're being invoked by something going through IDL. Which would mean script, not GV. Either way, failing on scalar->GetValue instead of on scalarStorage->Iter might mean the scalar in the storage has been free'd but the storage we were passed wasn't. This should be impossible as we have the lock, and we only ever free through Clear(), and we never Clear without holding the lock... I'm about as lost as you are on this, I think.
Flags: needinfo?(chutten)
Priority: P1 → --
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.