Closed Bug 1755292 Opened 2 years ago Closed 2 years ago

Intermittent SUMMARY: AddressSanitizer: bad-malloc_usable_size /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_malloc_win.cpp:70 in _msize

Categories

(Toolkit :: Telemetry, defect, P5)

defect

Tracking

()

RESOLVED FIXED
99 Branch
Tracking Status
firefox-esr91 --- fixed
firefox97 --- wontfix
firefox98 --- fixed
firefox99 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: mccr8)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Filed by: ccozmuta [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=367786486&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/C2qVBsziThuwQMplgfk-AA/runs/0/artifacts/public/logs/live_backing.log


[task 2022-02-14T11:56:25.610Z] 11:56:25     INFO - TEST-START | dom/base/test/browser_chromeutils_getalldomprocesses.js
[task 2022-02-14T11:56:25.794Z] 11:56:25     INFO - GECKO(2552) | MEMORY STAT | vsize 19414312MB | vsizeMaxContiguous 66774896MB | residentFast 1017MB
[task 2022-02-14T11:56:25.795Z] 11:56:25     INFO - TEST-OK | dom/base/test/browser_chromeutils_getalldomprocesses.js | took 177ms
[task 2022-02-14T11:56:25.824Z] 11:56:25     INFO - checking window state
[task 2022-02-14T11:56:25.848Z] 11:56:25     INFO - TEST-START | dom/base/test/browser_data_documents_aboutmemory.js
[task 2022-02-14T11:56:26.201Z] 11:56:26     INFO - GECKO(2552) | =================================================================
[task 2022-02-14T11:56:26.204Z] 11:56:26    ERROR - GECKO(2552) | ==1716==ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x12a9381f86c0
[task 2022-02-14T11:56:26.730Z] 11:56:26     INFO - GECKO(2552) |     #0 0x7fffc28a7c02 in _msize /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_malloc_win.cpp:70
[task 2022-02-14T11:56:26.735Z] 11:56:26     INFO - GECKO(2552) |     #1 0x7fffc2892134 in __sanitizer::BufferedStackTrace::Unwind /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_stacktrace.h:131
[task 2022-02-14T11:56:26.736Z] 11:56:26     INFO - GECKO(2552) |     #2 0x7fffc2892134 in __asan::asan_malloc_usable_size(void const *, unsigned __int64, unsigned __int64) /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_allocator.cpp:1074
[task 2022-02-14T11:56:26.736Z] 11:56:26     INFO - GECKO(2552) |     #3 0x7fffb356fc29 in nsBaseHashtable<nsCStringHashKey,unsigned int,unsigned int,nsDefaultConverter<unsigned int,unsigned int> >::ShallowSizeOfIncludingThis /builds/worker/workspace/obj-build/dist/include/nsBaseHashtable.h:986
[task 2022-02-14T11:56:26.737Z] 11:56:26     INFO - GECKO(2552) |     #4 0x7fffb356fc29 in TelemetryOrigin::SizeOfIncludingThis(unsigned __int64 (__cdecl *)(void const *)) /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/TelemetryOrigin.cpp:607
[task 2022-02-14T11:56:26.737Z] 11:56:26     INFO - GECKO(2552) |     #5 0x7fffb35211ca in `anonymous namespace'::TelemetryImpl::CollectReports /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/Telemetry.cpp:248
[task 2022-02-14T11:56:26.738Z] 11:56:26     INFO - GECKO(2552) |     #6 0x7fffa58ee7a6 in nsMemoryReporterManager::DispatchReporter::<lambda_2>::operator() /builds/worker/checkouts/gecko/xpcom/base/nsMemoryReporterManager.cpp:1855
[task 2022-02-14T11:56:26.738Z] 11:56:26     INFO - GECKO(2552) |     #7 0x7fffa58ee7a6 in mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/xpcom/base/nsMemoryReporterManager.cpp:1853:7'>::Run /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:531
[task 2022-02-14T11:56:26.739Z] 11:56:26     INFO - GECKO(2552) |     #8 0x7fffa5afea6d in mozilla::RunnableTask::Run(void) /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:467
<...>
[task 2022-02-14T11:56:26.795Z] 11:56:26     INFO - GECKO(2552) |     #32 0x7fffa7abd1af in XPC_WN_CallMethod(struct JSContext *, unsigned int, class JS::Value *) /builds/worker/checkouts/gecko/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:923
[task 2022-02-14T11:56:26.796Z] 11:56:26     INFO - GECKO(2552) |     #33 0x7fffb3a9ec77 in CallJSNative /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:426
[task 2022-02-14T11:56:26.796Z] 11:56:26     INFO - GECKO(2552) |     #34 0x7fffb3a9ec77 in js::InternalCallOrConstruct(struct JSContext *, class JS::CallArgs const &, enum js::MaybeConstruct, enum js::CallReason) /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:512
[task 2022-02-14T11:56:26.797Z] 11:56:26     INFO - GECKO(2552) |     #35 0x7fffb3a878f4 in js::CallFromStack /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:576
[task 2022-02-14T11:56:26.797Z] 11:56:26     INFO - GECKO(2552) |     #36 0x7fffb3a878f4 in Interpret /builds/worker/checkouts/gecko/js/src/vm/Interpreter.cpp:3309
[task 2022-02-14T11:56:26.798Z] 11:56:26     INFO - GECKO(2552) | SUMMARY: AddressSanitizer: bad-malloc_usable_size /builds/worker/fetches/llvm-project/compiler-rt/lib/asan/asan_malloc_win.cpp:70 in _msize
[task 2022-02-14T11:56:26.798Z] 11:56:26     INFO - GECKO(2552) | ==1716==ABORTING
[task 2022-02-14T11:56:26.811Z] 11:56:26     INFO - GECKO(2552) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=434.749) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=428.831) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]:[G[GFX1-]: Re cFRX1eeic-v]: Reei evIe PcIePCi vcCe  lcolsoIePCs ew  cwiltoiht she rre ewaistahosno n=Ar=bAenaobnormalSsrmalShutdown (t=427.889) Crash Annota[Exiting due to channel error.
[task 2022-02-14T11:56:26.811Z] 11:56:26     INFO - GECKO(2552) | oGCrash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=11.6478) [GFX1-]: Receive IPC close with reason=AbnormalShutdown
[task 2022-02-14T11:56:27.037Z] 11:56:27     INFO - TEST-INFO | Main app process: exit 1
[task 2022-02-14T11:56:27.042Z] 11:56:27     INFO - Buffered messages logged at 11:56:25
[task 2022-02-14T11:56:27.042Z] 11:56:27     INFO - Entering test bound 
[task 2022-02-14T11:56:27.043Z] 11:56:27     INFO - Console message: [JavaScript Warning: "This page is in Quirks Mode. Page layout may be impacted. For Standards Mode use <!DOCTYPE html>." {file: "chrome://mochitests/content/browser/dom/base/test/browser_data_documents_aboutmemory.js" line: 2}]
[task 2022-02-14T11:56:27.044Z] 11:56:27     INFO - Buffered messages finished
[task 2022-02-14T11:56:27.044Z] 11:56:27    ERROR - TEST-UNEXPECTED-FAIL | dom/base/test/browser_data_documents_aboutmemory.js | application terminated with exit code 1
[task 2022-02-14T11:56:27.044Z] 11:56:27     INFO - runtests.py | Application ran for: 0:00:47.880829
[task 2022-02-14T11:56:27.045Z] 11:56:27     INFO - zombiecheck | Reading PID log: C:\Users\task_164483702816523\AppData\Local\Temp\tmp5vmsu8lxpidlog
<...>
[task 2022-02-14T11:56:39.385Z] 11:56:39     INFO - GECKO(6860) | 1644839799385	Marionette	DEBUG	2 <- [1,4,null,{"value":"content"}]
[task 2022-02-14T11:56:39.387Z] 11:56:39     INFO - GECKO(6860) | 1644839799387	Marionette	DEBUG	2 -> [0,5,"Marionette:SetContext",{"value":"chrome"}]
[task 2022-02-14T11:56:39.388Z] 11:56:39     INFO - GECKO(6860) | 1644839799388	Marionette	DEBUG	2 <- [1,5,null,{"value":null}]
[task 2022-02-14T11:56:39.390Z] 11:56:39     INFO - GECKO(6860) | 1644839799390	Marionette	DEBUG	2 -> [0,6,"WebDriver:ExecuteScript",{"script":"/* This Source Code Form is subject to the terms of the Mozilla Public\n * License, ... ewSandbox":true,"sandbox":"default","line":1988,"filename":"Z:\\task_164483702816523\\build\\tests\\mochitest\\runtests.py"}]
[task 2022-02-14T11:56:39.404Z] 11:56:39     INFO - GECKO(6860) | 1644839799403	Marionette	TRACE	[9] MarionetteCommands actor created for window id 4
[task 2022-02-14T11:56:39.464Z] 11:56:39     INFO - GECKO(6860) | 1644839799470	Marionette	TRACE	Received observer notification domwindowopened
[task 2022-02-14T11:56:39.488Z] 11:56:39     INFO - GECKO(6860) | 1644839799494	Marionette	DEBUG	2 <- [1,6,null,{"value":null}]
[task 2022-02-14T11:56:39.501Z] 11:56:39     INFO - GECKO(6860) | 1644839799500	Marionette	DEBUG	2 -> [0,7,"Marionette:SetContext",{"value":"content"}]
[task 2022-02-14T11:56:39.501Z] 11:56:39     INFO - GECKO(6860) | 1644839799500	Marionette	DEBUG	2 <- [1,7,null,{"value":null}]
[task 2022-02-14T11:56:39.604Z] 11:56:39     INFO - GECKO(6860) | 1644839799609	Marionette	DEBUG	2 -> [0,8,"WebDriver:DeleteSession",{}]
[task 2022-02-14T11:56:39.616Z] 11:56:39     INFO - GECKO(6860) | 1644839799616	Marionette	DEBUG	2 <- [1,8,null,{"value":null}]
[task 2022-02-14T11:56:39.629Z] 11:56:39     INFO - runtests.py | Waiting for browser...
[task 2022-02-14T11:56:39.646Z] 11:56:39     INFO - GECKO(6860) | 1644839799645	Marionette	DEBUG	Closed connection 2
[task 2022-02-14T11:56:40.121Z] 11:56:40     INFO - TEST-START | dom/fetch/tests/browser_blobFromFile.js
[task 2022-02-14T11:56:41.208Z] 11:56:41     INFO - GECKO(6860) | MEMORY STAT heapAllocated not supported in this build configuration.
[task 2022-02-14T11:56:41.217Z] 11:56:41     INFO - GECKO(6860) | MEMORY STAT | vsize 19414349MB | vsizeMaxContiguous 68737887MB | residentFast 869MB
[task 2022-02-14T11:56:41.218Z] 11:56:41     INFO - TEST-OK | dom/fetch/tests/browser_blobFromFile.js | took 1094ms
[task 2022-02-14T11:56:41.262Z] 11:56:41     INFO - checking window state
[task 2022-02-14T11:56:41.281Z] 11:56:41     INFO - GECKO(6860) | JavaScript error: resource://gre/modules/URLQueryStrippingListService.jsm, line 90: TypeError: can't access property "removeEventListener", Services.cpmm.sharedData is null
[task 2022-02-14T11:56:41.865Z] 11:56:41     INFO - GECKO(6860) | ###!!! [Parent][PImageBridgeParent] Error: RunMessage(msgname=PImageBridge::Msg_WillClose) Channel closing: too late to send/recv, messages will be lost
[task 2022-02-14T11:56:42.150Z] 11:56:42     INFO - GECKO(6860) | [Parent 6680, IPC I/O Parent] WARNING: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_win.cc:167
[task 2022-02-14T11:56:42.155Z] 11:56:42     INFO - GECKO(6860) | [Parent 6680, IPC I/O Parent] WARNING: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_win.cc:167
[task 2022-02-14T11:56:42.160Z] 11:56:42     INFO - GECKO(6860) | [Parent 6680, IPC I/O Parent] WARNING: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_win.cc:167
[task 2022-02-14T11:56:42.896Z] 11:56:42     INFO - GECKO(6860) | Completed ShutdownLeaks collections in process 132
[task 2022-02-14T11:56:42.957Z] 11:56:42     INFO - GECKO(6860) | Completed ShutdownLeaks collections in process 3952
[task 2022-02-14T11:56:42.971Z] 11:56:42     INFO - GECKO(6860) | Completed ShutdownLeaks collections in process 1268
[task 2022-02-14T11:56:43.083Z] 11:56:43     INFO - GECKO(6860) | Completed ShutdownLeaks collections in process 6292
[task 2022-02-14T11:56:43.368Z] 11:56:43     INFO - GECKO(6860) | Completed ShutdownLeaks collections in process 6680
[task 2022-02-14T11:56:43.380Z] 11:56:43     INFO - TEST-START | Shutdown
Component: JavaScript Engine → Telemetry
Product: Core → Toolkit

This is happening on this line in the TelemetryOrigin memory reporter:

n += origins.ShallowSizeOfIncludingThis(aMallocSizeOf);

Usually this means a pointer was passed in to malloc_usable_size that wasn't allocated by malloc.

should this maybe be ShallowSizeofExcludingThis? I assume the values that we're iterating over are a OriginBag&, and the origin bag this is allocated as part of the actual hashtable, which should be accounted for by the gMetricToOriginBag->ShallowSizeOfIncludingThis(aMallocSizeOf); call earlier. That being said, I have no idea why this isn't failing all of the time.

The invalid pointer is 320 bytes inside something allocated in PLDHashTable::MakeEntryHandle() which I think is consistent with that.

My understanding of memory reporting is, ahem, shallow, so it is entirely likely that I got it wrong.

As for why it might not happen all the time, gMetricToOriginBag is only non-empty if you catch it between Origin Telemetry being used and the snapshot being read out for ping payload creation. It might be that this isn't deterministically so.

Assignee: nobody → continuation
Status: NEW → ASSIGNED

Ideally we'd use a higher-level SizeOfIncludingThis for this hashtable (bug 1689214) but I started poking at that and was getting bogged down in compile times, so I'll just fix the immediate issue here.

See Also: → 1689214
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/19bd07d35dbb
Don't double-count the entry memory in the TelemetryOrigin reporter. r=nika
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 99 Branch

Comment on attachment 9264413 [details]
Bug 1755292 - Don't double-count the entry memory in the TelemetryOrigin reporter.

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Possible crashes in rare situations. This code has been around a few years and we've only seen this crash in automation once AFAICT.
  • User impact if declined: see above
  • Fix Landed on Version: 99
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Minor change in how memory reporting is done for telemetry. The main risk is that it isn't really possible to test.

Beta/Release Uplift Approval Request

  • User impact if declined:
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky):
  • String changes made/needed: none
Attachment #9264413 - Flags: approval-mozilla-esr91?
Attachment #9264413 - Flags: approval-mozilla-beta?

Comment on attachment 9264413 [details]
Bug 1755292 - Don't double-count the entry memory in the TelemetryOrigin reporter.

Approved for 98 beta 9, thanks.

Attachment #9264413 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9264413 [details]
Bug 1755292 - Don't double-count the entry memory in the TelemetryOrigin reporter.

Approved for 91.7esr.

Attachment #9264413 - Flags: approval-mozilla-esr91? → approval-mozilla-esr91+
See Also: → 1759648
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: