Closed Bug 1634259 Opened 4 years ago Closed 4 years ago

High Frequency PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::CycleCollectedJSContext::RecursionDepth() const]

Categories

(Core :: DOM: Workers, defect, P2)

Tracking

RESOLVED FIXED
mozilla78
Tracking Status
firefox-esr68 --- unaffected
firefox75 --- unaffected
firefox76 --- unaffected
firefox77 --- wontfix
firefox78 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: asuth)

References

(Regression)

Details

(Keywords: crash, intermittent-failure, regression, Whiteboard: [retriggered][stockwell disable-recommended])

Crash Data

Attachments

(2 files)

Filed by: rmaries [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=300155975&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/KBxj2JD2RH-RWse-hPnvcg/runs/0/artifacts/public/logs/live_backing.log


[task 2020-04-30T06:09:37.830Z] 06:09:37 INFO - TEST-START | browser/components/uitour/test/browser_showMenu.js
[task 2020-04-30T06:09:54.102Z] 06:09:54 INFO - mozcrash Copy/paste: /builds/worker/fetches/minidump_stackwalk/minidump_stackwalk /tmp/tmpTFrZCQ.mozrunner/minidumps/4a84e97a-e105-81e3-7f03-785e8ee3ff44.dmp /builds/worker/workspace/build/symbols
[task 2020-04-30T06:09:59.301Z] 06:09:59 INFO - mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/4a84e97a-e105-81e3-7f03-785e8ee3ff44.dmp
[task 2020-04-30T06:09:59.379Z] 06:09:59 INFO - PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::CycleCollectedJSContext::RecursionDepth() const]
[task 2020-04-30T06:09:59.380Z] 06:09:59 INFO - Crash dump filename: /tmp/tmpTFrZCQ.mozrunner/minidumps/4a84e97a-e105-81e3-7f03-785e8ee3ff44.dmp
[task 2020-04-30T06:09:59.381Z] 06:09:59 INFO - Operating system: Linux
[task 2020-04-30T06:09:59.382Z] 06:09:59 INFO - 0.0.0 Linux 4.4.0-1014-aws #14taskcluster1-Ubuntu SMP Tue Apr 3 10:27:00 UTC 2018 x86_64
[task 2020-04-30T06:09:59.382Z] 06:09:59 INFO - CPU: amd64
[task 2020-04-30T06:09:59.383Z] 06:09:59 INFO - family 6 model 85 stepping 7
[task 2020-04-30T06:09:59.383Z] 06:09:59 INFO - 2 CPUs
[task 2020-04-30T06:09:59.384Z] 06:09:59 INFO -
[task 2020-04-30T06:09:59.385Z] 06:09:59 INFO - GPU: UNKNOWN
[task 2020-04-30T06:09:59.385Z] 06:09:59 INFO -
[task 2020-04-30T06:09:59.386Z] 06:09:59 INFO - Crash reason: SIGSEGV /SEGV_MAPERR
[task 2020-04-30T06:09:59.386Z] 06:09:59 INFO - Crash address: 0x5b70
[task 2020-04-30T06:09:59.387Z] 06:09:59 INFO - Process uptime: not available
[task 2020-04-30T06:09:59.387Z] 06:09:59 INFO -
[task 2020-04-30T06:09:59.388Z] 06:09:59 INFO - Thread 27 (crashed)
[task 2020-04-30T06:09:59.388Z] 06:09:59 INFO - 0 libxul.so!mozilla::CycleCollectedJSContext::RecursionDepth() const [CycleCollectedJSContext.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 520 + 0x0]
[task 2020-04-30T06:09:59.389Z] 06:09:59 INFO - rax = 0x0000000000000000 rdx = 0x00007f39e108a5d0
[task 2020-04-30T06:09:59.389Z] 06:09:59 INFO - rcx = 0x00007f39fd1b9320 rbx = 0x0000000000000000
[task 2020-04-30T06:09:59.390Z] 06:09:59 INFO - rsi = 0x0000000000000001 rdi = 0x0000000000000000
[task 2020-04-30T06:09:59.391Z] 06:09:59 INFO - rbp = 0x00007f39de6fdd60 rsp = 0x00007f39de6fdd50
[task 2020-04-30T06:09:59.391Z] 06:09:59 INFO - r8 = 0x0000000000727736 r9 = 0x0000000000000003
[task 2020-04-30T06:09:59.392Z] 06:09:59 INFO - r10 = 0x0006b40bdf835ec1 r11 = 0x00000000ffffffff
[task 2020-04-30T06:09:59.392Z] 06:09:59 INFO - r12 = 0x0000000000000000 r13 = 0x00007f39de6fde80
[task 2020-04-30T06:09:59.393Z] 06:09:59 INFO - r14 = 0x0000000000000000 r15 = 0x00007f39e6699e00
[task 2020-04-30T06:09:59.393Z] 06:09:59 INFO - rip = 0x00007f39e90af19b
[task 2020-04-30T06:09:59.394Z] 06:09:59 INFO - Found by: given as instruction pointer in context
[task 2020-04-30T06:09:59.395Z] 06:09:59 INFO - 1 libxul.so!mozilla::dom::WorkerPrivate::OnProcessNextEvent() [WorkerPrivate.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 2988 + 0xd]
[task 2020-04-30T06:09:59.396Z] 06:09:59 INFO - rbx = 0x00007f39e0712000 rbp = 0x00007f39de6fdd80
[task 2020-04-30T06:09:59.396Z] 06:09:59 INFO - rsp = 0x00007f39de6fdd70 r12 = 0x0000000000000000
[task 2020-04-30T06:09:59.396Z] 06:09:59 INFO - r13 = 0x00007f39de6fde80 r14 = 0x0000000000000000
[task 2020-04-30T06:09:59.397Z] 06:09:59 INFO - r15 = 0x00007f39e6699e00 rip = 0x00007f39ebc2070d
[task 2020-04-30T06:09:59.397Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.397Z] 06:09:59 INFO - 2 libxul.so!mozilla::dom::WorkerThread::Observer::OnProcessNextEvent(nsIThreadInternal*, bool) [WorkerThread.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 356 + 0x9]
[task 2020-04-30T06:09:59.397Z] 06:09:59 INFO - rbx = 0x00007f39e6699e00 rbp = 0x00007f39de6fdda0
[task 2020-04-30T06:09:59.398Z] 06:09:59 INFO - rsp = 0x00007f39de6fdd90 r12 = 0x0000000000000000
[task 2020-04-30T06:09:59.398Z] 06:09:59 INFO - r13 = 0x00007f39de6fde80 r14 = 0x0000000000000000
[task 2020-04-30T06:09:59.398Z] 06:09:59 INFO - r15 = 0x00007f39e6699e00 rip = 0x00007f39ebc31379
[task 2020-04-30T06:09:59.398Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.398Z] 06:09:59 INFO - 3 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 1107 + 0x3b]
[task 2020-04-30T06:09:59.399Z] 06:09:59 INFO - rbx = 0x00007f39de6fdee0 rbp = 0x00007f39de6fe300
[task 2020-04-30T06:09:59.399Z] 06:09:59 INFO - rsp = 0x00007f39de6fddb0 r12 = 0x0000000000000000
[task 2020-04-30T06:09:59.399Z] 06:09:59 INFO - r13 = 0x00007f39de6fde80 r14 = 0x00007f39e10093c0
[task 2020-04-30T06:09:59.399Z] 06:09:59 INFO - r15 = 0x00007f39e6699e00 rip = 0x00007f39e9189894
[task 2020-04-30T06:09:59.400Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.400Z] 06:09:59 INFO - 4 libxul.so!NS_ProcessPendingEvents(nsIThread*, unsigned int) [nsThreadUtils.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 429 + 0xf]
[task 2020-04-30T06:09:59.400Z] 06:09:59 INFO - rbx = 0x0000000000000000 rbp = 0x00007f39de6fe350
[task 2020-04-30T06:09:59.400Z] 06:09:59 INFO - rsp = 0x00007f39de6fe310 r12 = 0x00007f39e10093c0
[task 2020-04-30T06:09:59.401Z] 06:09:59 INFO - r13 = 0x00007f39de6fe31f r14 = 0x00000000ffffffff
[task 2020-04-30T06:09:59.401Z] 06:09:59 INFO - r15 = 0x000000000102d69c rip = 0x00007f39e9187ce3
[task 2020-04-30T06:09:59.401Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.401Z] 06:09:59 INFO - 5 libxul.so!mozilla::dom::WorkerPrivate::ClearMainEventQueue(mozilla::dom::WorkerPrivate::WorkerRanOrNot) [WorkerPrivate.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 3518 + 0xd]
[task 2020-04-30T06:09:59.401Z] 06:09:59 INFO - rbx = 0x0000000000000000 rbp = 0x00007f39de6fe390
[task 2020-04-30T06:09:59.402Z] 06:09:59 INFO - rsp = 0x00007f39de6fe360 r12 = 0x0000000080004005
[task 2020-04-30T06:09:59.402Z] 06:09:59 INFO - r13 = 0x00001edb1e635b4e r14 = 0x0000000000000001
[task 2020-04-30T06:09:59.402Z] 06:09:59 INFO - r15 = 0x00007f39e0712000 rip = 0x00007f39ebc217cf
[task 2020-04-30T06:09:59.402Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.402Z] 06:09:59 INFO - 6 libxul.so!mozilla::dom::WorkerPrivate::ScheduleDeletion(mozilla::dom::WorkerPrivate::WorkerRanOrNot) [WorkerPrivate.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 3363 + 0xb]
[task 2020-04-30T06:09:59.402Z] 06:09:59 INFO - rbx = 0x00007f39e0712000 rbp = 0x00007f39de6fe3d0
[task 2020-04-30T06:09:59.403Z] 06:09:59 INFO - rsp = 0x00007f39de6fe3a0 r12 = 0x0000000080004005
[task 2020-04-30T06:09:59.403Z] 06:09:59 INFO - r13 = 0x00001edb1e635b4e r14 = 0x0000000000000001
[task 2020-04-30T06:09:59.403Z] 06:09:59 INFO - r15 = 0x00007f39e6571880 rip = 0x00007f39ebc19ab2
[task 2020-04-30T06:09:59.403Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.404Z] 06:09:59 INFO - 7 libxul.so!mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run() [RuntimeService.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 2334 + 0x1e]
[task 2020-04-30T06:09:59.404Z] 06:09:59 INFO - rbx = 0x00007f39de6fe620 rbp = 0x00007f39de6fe5e0
[task 2020-04-30T06:09:59.404Z] 06:09:59 INFO - rsp = 0x00007f39de6fe3e0 r12 = 0x0000000080004005
[task 2020-04-30T06:09:59.404Z] 06:09:59 INFO - r13 = 0x00001edb1e635b4e r14 = 0x00007f39e0712000
[task 2020-04-30T06:09:59.404Z] 06:09:59 INFO - r15 = 0x00007f39e6571880 rip = 0x00007f39ebc069e2
[task 2020-04-30T06:09:59.405Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.405Z] 06:09:59 INFO - 8 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 1200 + 0x11]
[task 2020-04-30T06:09:59.405Z] 06:09:59 INFO - rbx = 0x00007f39de6fe620 rbp = 0x00007f39de6feb40
[task 2020-04-30T06:09:59.405Z] 06:09:59 INFO - rsp = 0x00007f39de6fe5f0 r12 = 0x00007f39e1009488
[task 2020-04-30T06:09:59.406Z] 06:09:59 INFO - r13 = 0x00001edb1e635b4e r14 = 0x00007f39e10093c0
[task 2020-04-30T06:09:59.406Z] 06:09:59 INFO - r15 = 0x00007f39de6fe6c0 rip = 0x00007f39e9189f36
[task 2020-04-30T06:09:59.407Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.408Z] 06:09:59 INFO - 9 libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 481 + 0xc]
[task 2020-04-30T06:09:59.408Z] 06:09:59 INFO - rbx = 0x0000000000000000 rbp = 0x00007f39de6feb70
[task 2020-04-30T06:09:59.409Z] 06:09:59 INFO - rsp = 0x00007f39de6feb50 r12 = 0x00007f39de6feb88
[task 2020-04-30T06:09:59.409Z] 06:09:59 INFO - r13 = 0x00007f39e6571820 r14 = 0x00007f39de6feb57
[task 2020-04-30T06:09:59.409Z] 06:09:59 INFO - r15 = 0x00007f39e10093c0 rip = 0x00007f39e918db9f
[task 2020-04-30T06:09:59.410Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.410Z] 06:09:59 INFO - 10 libxul.so!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [MessagePump.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 302 + 0xa]
[task 2020-04-30T06:09:59.410Z] 06:09:59 INFO - rbx = 0x00007f39de6fec68 rbp = 0x00007f39de6febc0
[task 2020-04-30T06:09:59.410Z] 06:09:59 INFO - rsp = 0x00007f39de6feb80 r12 = 0x00007f39de6feb88
[task 2020-04-30T06:09:59.411Z] 06:09:59 INFO - r13 = 0x00007f39e6571820 r14 = 0x00007f39e6571800
[task 2020-04-30T06:09:59.411Z] 06:09:59 INFO - r15 = 0x00007f39e10093c0 rip = 0x00007f39e977fe2b
[task 2020-04-30T06:09:59.411Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.412Z] 06:09:59 INFO - 11 libxul.so!MessageLoop::RunInternal() [message_loop.cc:83beb87d9f6945ccddefe62341b5536398eb72a0 : 315 + 0x17]
[task 2020-04-30T06:09:59.412Z] 06:09:59 INFO - rbx = 0x00007f39de6fec68 rbp = 0x00007f39de6fec00
[task 2020-04-30T06:09:59.412Z] 06:09:59 INFO - rsp = 0x00007f39de6febd0 r12 = 0x00007f39de6fec60
[task 2020-04-30T06:09:59.413Z] 06:09:59 INFO - r13 = 0x00007f39e10093c0 r14 = 0x00007f39de6fec10
[task 2020-04-30T06:09:59.413Z] 06:09:59 INFO - r15 = 0x00007f39e1009400 rip = 0x00007f39e972db89
[task 2020-04-30T06:09:59.413Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.413Z] 06:09:59 INFO - 12 libxul.so!MessageLoop::Run() [message_loop.cc:83beb87d9f6945ccddefe62341b5536398eb72a0 : 290 + 0x8]
[task 2020-04-30T06:09:59.414Z] 06:09:59 INFO - rbx = 0x00007f39de6fec68 rbp = 0x00007f39de6fec40
[task 2020-04-30T06:09:59.414Z] 06:09:59 INFO - rsp = 0x00007f39de6fec10 r12 = 0x00007f39de6fec60
[task 2020-04-30T06:09:59.414Z] 06:09:59 INFO - r13 = 0x00007f39e10093c0 r14 = 0x00007f39de6fec10
[task 2020-04-30T06:09:59.415Z] 06:09:59 INFO - r15 = 0x00007f39e1009400 rip = 0x00007f39e972dae3
[task 2020-04-30T06:09:59.415Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.415Z] 06:09:59 INFO - 13 libxul.so!nsThread::ThreadFunc(void*) [nsThread.cpp:83beb87d9f6945ccddefe62341b5536398eb72a0 : 444 + 0x8]
[task 2020-04-30T06:09:59.415Z] 06:09:59 INFO - rbx = 0x00007f39de6fec68 rbp = 0x00007f39de6fee60
[task 2020-04-30T06:09:59.416Z] 06:09:59 INFO - rsp = 0x00007f39de6fec50 r12 = 0x00007f39de6fec60
[task 2020-04-30T06:09:59.416Z] 06:09:59 INFO - r13 = 0x00007f39e10093c0 r14 = 0x00007f39de6fec68
[task 2020-04-30T06:09:59.416Z] 06:09:59 INFO - r15 = 0x00007f39e1009400 rip = 0x00007f39e9187556
[task 2020-04-30T06:09:59.417Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.417Z] 06:09:59 INFO - 14 libnspr4.so!_pt_root [ptthread.c:83beb87d9f6945ccddefe62341b5536398eb72a0 : 201 + 0x7]
[task 2020-04-30T06:09:59.417Z] 06:09:59 INFO - rbx = 0x00007f39e078f040 rbp = 0x00007f39de6feeb0
[task 2020-04-30T06:09:59.418Z] 06:09:59 INFO - rsp = 0x00007f39de6fee70 r12 = 0x00007f39de6ff630
[task 2020-04-30T06:09:59.418Z] 06:09:59 INFO - r13 = 0x0000000000000000 r14 = 0x0000000000002804
[task 2020-04-30T06:09:59.418Z] 06:09:59 INFO - r15 = 0x00007f39de6ff700 rip = 0x00007f39fd38a897
[task 2020-04-30T06:09:59.418Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.419Z] 06:09:59 INFO - 15 libpthread.so.0 + 0x76db
[task 2020-04-30T06:09:59.419Z] 06:09:59 INFO - rbx = 0x0000000000000000 rbp = 0x0000000000000000
[task 2020-04-30T06:09:59.419Z] 06:09:59 INFO - rsp = 0x00007f39de6feec0 r12 = 0x00007f39de6fef80
[task 2020-04-30T06:09:59.419Z] 06:09:59 INFO - r13 = 0x0000000000000000 r14 = 0x00007f39e078f040
[task 2020-04-30T06:09:59.420Z] 06:09:59 INFO - r15 = 0x00007ffe08e125a0 rip = 0x00007f39fcfa66db
[task 2020-04-30T06:09:59.420Z] 06:09:59 INFO - Found by: call frame info
[task 2020-04-30T06:09:59.421Z] 06:09:59 INFO - 16 libc.so.6 + 0x12188f
[task 2020-04-30T06:09:59.421Z] 06:09:59 INFO - rsp = 0x00007f39de6fef80 rip = 0x00007f39fbf8488f
[task 2020-04-30T06:09:59.422Z] 06:09:59 INFO - Found by: stack scanning

Flags: needinfo?(kmaglione+bmo)
Regressed by: 1594572
Summary: Intermittent PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::CycleCollectedJSContext::RecursionDepth() const] → High Frequency PROCESS-CRASH | Main app process exited normally | application crashed [@ mozilla::CycleCollectedJSContext::RecursionDepth() const]
Has Regression Range: --- → yes
Keywords: regression

Set release status flags based on info from the regressing bug 1594572

Assignee: nobody → kmaglione+bmo
Flags: needinfo?(kmaglione+bmo)
Whiteboard: [retriggered]
Component: Tours → DOM: Workers
Product: Firefox → Core

This looks to be the same as bug 1634995 comment 2. Kris, do you prefer to keep this one?

Flags: needinfo?(kmaglione+bmo)

(Copied from bug 1634995)

I see a missing nullptr check here: CycleCollectedJSContext::Get() can return nullptr, but the caller does not check for it, so the subsequent access to mOwningThread may fail.

I see many places in WorkerPrivate.cpp where CycleCollectedJSContext::Get() is used as if it always returns a healthy (raw!) pointer. Either we make sure that this is the case, or we should check all call sites for proper error handling.

Crash Signature: [@ mozilla::CycleCollectedJSContext::RecursionDepth() const] → [@ mozilla::CycleCollectedJSContext::RecursionDepth() const] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent | mozilla::dom::WorkerThread::Observer::OnProcessNextEvent | NS_ProcessPendingEvents]
Crash Signature: [@ mozilla::CycleCollectedJSContext::RecursionDepth() const] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent | mozilla::dom::WorkerThread::Observer::OnProcessNextEvent | NS_ProcessPendingEvents] → [@ mozilla::CycleCollectedJSContext::RecursionDepth() const] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent | mozilla::dom::WorkerThread::Observer::OnProcessNextEvent | NS_ProcessPendingEvents] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent()]

In my opinion, the crash in bug looks different than the [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent()] one. The one here looks like we're nesting event loops too deeply, whereas with the signature it is only nested once.

Crash Signature: [@ mozilla::CycleCollectedJSContext::RecursionDepth() const] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent | mozilla::dom::WorkerThread::Observer::OnProcessNextEvent | NS_ProcessPendingEvents] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent()] → [@ mozilla::CycleCollectedJSContext::RecursionDepth() const] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent | mozilla::dom::WorkerThread::Observer::OnProcessNextEvent | NS_ProcessPendingEvents] [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent()]
Flags: needinfo?(bugmail)

(In reply to Andrew McCreight [:mccr8] from comment #11)

In my opinion, the crash in bug looks different than the [@ mozilla::dom::WorkerPrivate::OnProcessNextEvent()] one. The one here looks like we're nesting event loops too deeply, whereas with the signature it is only nested once.

Well, it is the exact same line of code that triggers the nullptr dereference. So there might be a common local fix for the crash, but with different consequences for other parts of the code when the underlying condition occurs. I would want to improve the local robustness first and then look for the fallout.

I see three local options:

  1. If CycleCollectedJSContext::Get() is never supposed to fail, we should probably change its signature/behavior.
  2. If it is only supposed to not fail while a WorkerPrivate is living, we might want to wrap it with a local function ensuring this invariant.
  3. If it is supposed to possibly fail any time, we should improve error handling in all call sites AND look out for fallouts of this error handling.
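
To make options 2 and 3 concrete, here is a minimal, self-contained sketch. It does not use the real Gecko classes: `FakeJSContext`, `tCurrentContext`, `GetContext()`, `GetContextChecked()` and `TryRecursionDepth()` are all hypothetical stand-ins, and the thread-local slot only imitates the fact that `CycleCollectedJSContext::Get()` can return nullptr.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for mozilla::CycleCollectedJSContext; NOT the real class.
class FakeJSContext {
 public:
  explicit FakeJSContext(uint32_t aDepth) : mDepth(aDepth) {}
  uint32_t RecursionDepth() const { return mDepth; }

 private:
  uint32_t mDepth;
};

// Hypothetical per-thread slot. It stays null until a context is registered,
// which imitates Get() returning nullptr on a thread without a context.
thread_local FakeJSContext* tCurrentContext = nullptr;

FakeJSContext* GetContext() { return tCurrentContext; }

// Option 2 (sketch): a local accessor that enforces the invariant
// "a context exists whenever this is called".
FakeJSContext& GetContextChecked() {
  FakeJSContext* cx = GetContext();
  assert(cx && "caller relies on a live context");
  return *cx;
}

// Option 3 (sketch): make the failure explicit so every call site has to
// handle it. Returns false and leaves *aDepthOut untouched if there is no
// context on this thread.
bool TryRecursionDepth(uint32_t* aDepthOut) {
  FakeJSContext* cx = GetContext();
  if (!cx) {
    return false;
  }
  *aDepthOut = cx->RecursionDepth();
  return true;
}
```

With option 2 a violated invariant fails loudly at the accessor instead of deep inside a member access; with option 3 the fallout moves to the callers, which is the part that would need auditing.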

:asuth, what do you think?

Taking bug per slack conversation with :kmag.

Assignee: kmaglione+bmo → bugmail
Status: NEW → ASSIGNED
Flags: needinfo?(kmaglione+bmo)
Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork:owner]

Hi Andrew, are there updates here?

I looked at about 25 failure logs, and the failing tests are browser_ext_windows_size.js and browser_readerMode_readingTime.js. I created a disabling patch for those 2 tests. Let us know if we can land it.

Because this bug's Severity has not been changed from the default since it was filed, and its Priority is -- (none), indicating it has not been previously triaged, the bug's Severity is being updated to -- (default, untriaged).

Severity: normal → --
Flags: needinfo?(bugmail)

Andrew, can we land the disabling patch until there is a fix?

Flags: needinfo?(bugmail)

Bug 1594572 attempted to fix a shutdown edge-case where a worker that was
started late enough would fail to create a PBackground connection and
return early. This early return, however, left the worker never scheduled
for deletion, resulting in a shutdown hang. See my restating block at
https://phabricator.services.mozilla.com/D73134#2224963 for more context.

The problem with that fix was that its attempt to clear the main event queue
ran afoul of pre/post event-processing hooks intended to ensure that control
runnables are processed if some non-worker code goes and spins a nested
event loop on the worker thread. But because we never got around to
creating a cycle collected JS runtime on the thread, the call to get it
would return null and when we tried to get the recursion depth of a null
pointer, we would crash a little.

This patch addresses the immediate regression by adding a helper that returns
a depth of 1 in this edge-case. It also fixes another problem with the fix
that is now more obvious having reviewed bug 1636147... we were failing to
mark the status of the worker as dead and drain any control runnables, which
could have resulted in the CrashIfHangingRunnable usage not working out right.
(Note that CrashIfHangingRunnable does handle Cancel() correctly, so the
fact that we end up canceling the runnable shouldn't break things.)
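
As a rough illustration of the "helper that returns a depth of 1" idea, here is a
self-contained sketch. It is not the landed patch: FakeJSContext, tContext,
EffectiveRecursionDepth() and IsNestedEventLoop() are hypothetical names, and
the nesting check is only an assumption about how an event-processing hook
might consume the depth.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for mozilla::CycleCollectedJSContext; NOT the real class.
struct FakeJSContext {
  uint32_t mRecursionDepth;
  uint32_t RecursionDepth() const { return mRecursionDepth; }
};

// Hypothetical per-thread slot. It is null when the worker bailed out before
// a cycle collected JS runtime was ever created on the thread.
thread_local FakeJSContext* tContext = nullptr;

// Hypothetical helper: ask the context for the depth when there is one,
// otherwise report 1, i.e. "only the outermost event loop is running".
uint32_t EffectiveRecursionDepth() {
  FakeJSContext* cx = tContext;
  if (cx) {
    return cx->RecursionDepth();
  }
  return 1;
}

// Assumed shape of the check a pre-event hook could make: only a depth
// greater than 1 means somebody spun a nested event loop.
bool IsNestedEventLoop() {
  uint32_t depth = EffectiveRecursionDepth();
  assert(depth >= 1);
  return depth > 1;
}
```

With no context registered the hook sees a depth of 1 and treats the loop as
not nested, instead of dereferencing a null pointer.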

I have a fix up for review, and perry is available to review it today, so let's just try the fix. Try push is at:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=607d6485f17418ccb448bed5a5cc1af1fcda0eb3

Flags: needinfo?(bugmail)
Pushed by bugmail@asutherland.org:
https://hg.mozilla.org/integration/autoland/rev/bd7c4e7e5994
Improve worker shutdown edge case handling. r=perry
Severity: -- → S3
Priority: -- → P2
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla78
Regressions: 1638170