Closed Bug 1649974 Opened 4 years ago Closed 4 years ago

Intermittent leakcheck | rdd 3544 bytes leaked (AbstractThread, ActorLifecycleProxy, CondVar, IPC::Channel, Mutex, ...)

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla80
Tracking Status
firefox80 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jya)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

Attachments

(2 files)

looks like the PRemoteDecoderManagerParent gets destroyed in the rdd too late, after xpcom has shutdown.

This would prevent the task to free it to run.

I had found a similar problem with the canvas code; all IPC actors duplicate similar code, seems I missed one :(

keeping ni? for now

It worked until now as IPC's MessageChannel was only used with background taskqueue; which use a threadpool made of a single thread only.

Assignee: nobody → jyavenard
Status: NEW → ASSIGNED

Sync dispatch are being used; due to how the background taskqueue is setup this could cause a hang if if also called from a background taskqueue.

Depends on D82499

Flags: needinfo?(jyavenard)
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3c5157a9e872
P1. Make WeakPtr work with generic taskqueue. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/fac1e5a07a6c
P2. Use a TaskQueue with media threadpool for RemoteDecoderManagerParent. r=mattwoodrow

Backed out 5 changesets (bug 1650996, bug 1649974) for hazard failures

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&searchStr=linux%2Cx64%2Cdebug%2Chazard-linux64-haz%2Fdebug%2C%28h%29&fromchange=d3f7f11b80ea63a55d4d381b88cd758f12714a06&selectedTaskRun=P-lFOs1KSvmTMZ9Xz_CkWA.0&tochange=47a51ee437a2b271872fca1b25b649c36509fb35

Backout link: https://hg.mozilla.org/integration/autoland/rev/d8abefec5bba959f42bac66237b9eabe2d04dfaa

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=308963824&repo=autoland&lineNumber=73953

...
[task 2020-07-08T12:51:30.770Z] Running explain to generate ('hazards.txt', 'unnecessary.txt', 'refs.txt')
[task 2020-07-08T12:51:30.770Z] LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/builds/worker/workspace/obj-haz-shell/dist/bin" PATH="/builds/worker/fetches/sixgill/usr/bin:${PATH}" XDB='/builds/worker/fetches/sixgill/usr/bin/xdb.so' SOURCE='/builds/worker/checkouts/gecko' ANALYZED_OBJDIR='/builds/worker/workspace/obj-analyzed' python2.7 /builds/worker/checkouts/gecko/js/src/devtools/rootAnalysis/explain.py rootingHazards.txt gcFunctions.txt hazards.txt unnecessary.txt refs.txt
[task 2020-07-08T12:51:33.400Z] Wrote explained_hazards.tmp
[task 2020-07-08T12:51:33.400Z] Wrote unnecessary.tmp
[task 2020-07-08T12:51:33.400Z] Wrote refs.tmp
[task 2020-07-08T12:51:33.400Z] Found 2 hazards 200 unsafe references 0 missing
[task 2020-07-08T12:51:33.401Z] Running heapwrites to generate heapWriteHazards.txt
[task 2020-07-08T12:51:33.401Z] LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/builds/worker/workspace/obj-haz-shell/dist/bin" PATH="/builds/worker/fetches/sixgill/usr/bin:${PATH}" XDB='/builds/worker/fetches/sixgill/usr/bin/xdb.so' SOURCE='/builds/worker/checkouts/gecko' ANALYZED_OBJDIR='/builds/worker/workspace/obj-analyzed' /builds/worker/workspace/obj-haz-shell/dist/bin/js /builds/worker/checkouts/gecko/js/src/devtools/rootAnalysis/analyzeHeapWrites.js > heapWriteHazards.txt
[task 2020-07-08T12:52:01.488Z] + check_hazards /builds/worker/workspace/analysis
[task 2020-07-08T12:52:01.489Z] + set +e
[task 2020-07-08T12:52:01.489Z] ++ grep -c 'Function.*has unrooted.*live across GC call' /builds/worker/workspace/analysis/rootingHazards.txt
[task 2020-07-08T12:52:01.490Z] + NUM_HAZARDS=2
[task 2020-07-08T12:52:01.490Z] ++ grep -c '^Function.*takes unsafe address of unrooted' /builds/worker/workspace/analysis/refs.txt
[task 2020-07-08T12:52:01.492Z] + NUM_UNSAFE=200
[task 2020-07-08T12:52:01.492Z] ++ grep -c '^Function.* has unnecessary root' /builds/worker/workspace/analysis/unnecessary.txt
[task 2020-07-08T12:52:01.494Z] + NUM_UNNECESSARY=1314
[task 2020-07-08T12:52:01.494Z] ++ grep -c '^Dropped CFG' /builds/worker/workspace/analysis/build_xgill.log
[task 2020-07-08T12:52:01.618Z] + NUM_DROPPED=0
[task 2020-07-08T12:52:01.619Z] ++ perl -lne 'print $1 if m!found (\d+)/\d+ allowed errors!' /builds/worker/workspace/analysis/heapWriteHazards.txt
[task 2020-07-08T12:52:01.621Z] + NUM_WRITE_HAZARDS=0
[task 2020-07-08T12:52:01.621Z] ++ grep -c '^Function.*expected hazard.*but none were found' /builds/worker/workspace/analysis/rootingHazards.txt
[task 2020-07-08T12:52:01.622Z] + NUM_MISSING=0
[task 2020-07-08T12:52:01.622Z] + set +x
[task 2020-07-08T12:52:01.622Z] TinderboxPrint: rooting hazards<br/>2
[task 2020-07-08T12:52:01.622Z] TinderboxPrint: (unsafe references to unrooted GC pointers)<br/>200
[task 2020-07-08T12:52:01.622Z] TinderboxPrint: (unnecessary roots)<br/>1314
[task 2020-07-08T12:52:01.622Z] TinderboxPrint: missing expected hazards<br/>0
[task 2020-07-08T12:52:01.622Z] TinderboxPrint: heap write hazards<br/>0
[task 2020-07-08T12:52:01.624Z] TEST-UNEXPECTED-FAIL | hazards | unrooted 'scopedArr' of type 'mozilla::dom::TypedArray<unsigned char, js::UnwrapUint8ClampedArray, JS_GetUint8ClampedArrayData, js::GetUint8ClampedArrayLengthAndData, JS_NewUint8ClampedArray>' live across GC call at dom/canvas/WebGLTextureUpload.cpp:947
[task 2020-07-08T12:52:01.624Z] TEST-UNEXPECTED-FAIL | hazards | unrooted 'obj' of type 'JSObject*' live across GC call at js/xpconnect/src/XPCJSRuntime.cpp:3253
[task 2020-07-08T12:52:01.624Z] TEST-UNEXPECTED-FAIL | hazards | 2 rooting hazards detected
[task 2020-07-08T12:52:01.624Z] TinderboxPrint: documentation<br/><a href='https://wiki.mozilla.org/Javascript:Hazard_Builds#Diagnosing_a_rooting_hazards_failure'>static rooting hazard analysis failures</a>, visit "Inspect Task" link for hazard details
[task 2020-07-08T12:52:01.624Z] + onexit
[task 2020-07-08T12:52:01.624Z] + grab_artifacts /builds/worker/workspace/analysis /builds/worker/artifacts
[task 2020-07-08T12:52:01.624Z] + local analysis_dir
[task 2020-07-08T12:52:01.624Z] + analysis_dir=/builds/worker/workspace/analysis
...
Flags: needinfo?(jyavenard)
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/69bacb6e8287
P1. Make WeakPtr work with generic taskqueue. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/4d7301e8eeb3
P2. Use a TaskQueue with media threadpool for RemoteDecoderManagerParent. r=mattwoodrow
Backout by malexandru@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b46fb0aefee0
Backed out 2 changesets as requested by jyavenard.
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6e475ddbd18c
P1. Make WeakPtr work with generic taskqueue. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/89d5cabfa2df
P2. Use a TaskQueue with media threadpool for RemoteDecoderManagerParent. r=mattwoodrow

I have about 20 try push with that same test being green :(

Flags: needinfo?(jyavenard)

So I added

JS::AutoSuppressGCAnalysis suppress;

bool nsAutoOwningEventTarget::IsCurrentThread() const {
// This API is only ever used for debug builds and has no JS implementation
// Avoid triggering the Hazard static-analyzer.
JS::AutoSuppressGCAnalysis suppress;
return mTarget->IsOnCurrentThread();
}

This causes crashes on Android:
Mozilla crash reason: MOZ_ASSERT(Storage<T>::initialized())
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=309101119&repo=autoland&lineNumber=1570-1597

Seems that this can only be used in JS thread ; nsAutoOwningEventTarget is used outside of that (and so is WeakPtr)

Suggestions?

Flags: needinfo?(sphink)

For background info; this workaround of the GC analyser was due as it gets confused with:

Function '_ZN7mozilla12WebGLTexture8TexImageEjjjRKNS_5avec3IjEES4_RKNS_5webgl11PackingInfoERKNS_14TexImageSourceERKNS_3dom17HTMLCanvasElementE$void mozilla::WebGLTexture::TexImage(uint32, uint32, uint32, const mozilla::avec3<unsigned int>, const mozilla::avec3<unsigned int>, mozilla::webgl::PackingInfo*, mozilla::TexImageSource*, mozilla::dom::HTMLCanvasElement*)' has unrooted 'scopedArr' of type 'mozilla::dom::TypedArray<unsigned char, js::UnwrapUint8ClampedArray, JS_GetUint8ClampedArrayData, js::GetUint8ClampedArrayLengthAndData, JS_NewUint8ClampedArray>' live across GC call '_ZNK7mozilla7WeakPtrINS_12WebGLContextEEptEv$T* mozilla::WeakPtr<T>::operator->() const [with T = mozilla::WebGLContext]' at dom/canvas/WebGLTextureUpload.cpp:947
WebGLTextureUpload.cpp:945: Call(1,2, scopedArr.__ct_comp ())
WebGLTextureUpload.cpp:947: Call(2,3, __temp_2 := this*.field:0.mContext.operator->()) [[GC call]]
WebGLTextureUpload.cpp:947: Call(3,4, __temp_3*.__ct_comp (,imageTarget*))
WebGLTextureUpload.cpp:947: Call(4,5, __temp_1 := __temp_2*.From(canvas*,__temp_3*,claimedSize*,src*,scopedArr))
GC Function: _ZNK7mozilla7WeakPtrINS_12WebGLContextEEptEv$T* mozilla::WeakPtr<T>::operator->() const [with T = mozilla::WebGLContext]
T* mozilla::detail::WeakReference<T>::get() const [with T = mozilla::WebGLContext]
uint8 nsAutoOwningEventTarget::IsCurrentThread() const
uint8 nsIEventTarget::IsOnCurrentThread()
nsIEventTarget.IsOnCurrentThreadInfallible
unresolved nsIEventTarget.IsOnCurrentThreadInfallible

nsIEventTarget.IsOnCurrentThreadInfallible is marked as "noscript" ; it can't cause a GC; this is a native interface only.

Should I open a bug with the Hazard test instead?

I think that something like:

  class SuppressGCAnalysis {
  } JS_HAZ_GC_SUPPRESSED suppress;

will do the trick for now.

Ah! Sorry for the comments in the other bug. I see now why you went this route.

I have had problems before where if the class is too simple (in particular the destructor), the compiler discards it before the hazard analysis sees it. I've got a try push now in case your version doesn't work.

But this is too messy for anyone running into this sort of thing to use. I'll do another push where I annotate this function in a separate file to avoid cluttering up the code there.

I may need to create a variant of AutoSuppressGCAnalysis that omits the dynamic check, so that it can be thread-neutral.

Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b5d9ff5a3e1c
P1. Make WeakPtr work with generic taskqueue. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/4b96894ecdd3
P2. Use a TaskQueue with media threadpool for RemoteDecoderManagerParent. r=mattwoodrow
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80
Flags: needinfo?(sphink)
See Also: → 1653060
See Also: → 1672510
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: