Closed Bug 1406547 Opened 7 years ago Closed 6 years ago

Crash in `anonymous namespace''::ScriptLoaderRunnable::LoadingFinished

Categories

(Core :: DOM: Workers, defect, P3)

57 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- disabled
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- wontfix
firefox60 --- wontfix
firefox61 --- fixed

People

(Reporter: philipp, Assigned: bkelly)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-aeb6165d-b69a-470c-a639-7768d0171005.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	`anonymous namespace'::ScriptLoaderRunnable::LoadingFinished 	dom/workers/ScriptLoader.cpp:633
1 	xul.dll 	`anonymous namespace'::ScriptLoaderRunnable::DataReceivedFromCache 	dom/workers/ScriptLoader.cpp:1291
2 	xul.dll 	`anonymous namespace'::CacheScriptLoader::OnStreamComplete 	dom/workers/ScriptLoader.cpp:1787
3 	xul.dll 	mozilla::net::nsStreamLoader::OnStopRequest(nsIRequest*, nsISupports*, nsresult) 	netwerk/base/nsStreamLoader.cpp:104
4 	xul.dll 	nsInputStreamPump::OnStateStop() 	netwerk/base/nsInputStreamPump.cpp:730
5 	xul.dll 	nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) 	netwerk/base/nsInputStreamPump.cpp:447
6 	xul.dll 	nsInputStreamReadyEvent::Run() 	xpcom/io/nsStreamUtils.cpp:97
7 	xul.dll 	mozilla::ThrottledEventQueue::Inner::ExecuteRunnable() 	xpcom/threads/ThrottledEventQueue.cpp:193
8 	xul.dll 	mozilla::ThrottledEventQueue::Inner::Executor::Run() 	xpcom/threads/ThrottledEventQueue.cpp:79
9 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1039
10 	xul.dll 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:97
11 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:301
12 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:319
13 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:299
14 	xul.dll 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp:158
15 	xul.dll 	nsAppShell::Run() 	widget/windows/nsAppShell.cpp:230
16 	xul.dll 	XRE_RunAppShell() 	toolkit/xre/nsEmbedFunctions.cpp:880
17 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:269
18 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:319
19 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:299
20 	xul.dll 	XRE_InitChildProcess(int, char** const, XREChildData const*) 	toolkit/xre/nsEmbedFunctions.cpp:705
21 	firefox.exe 	content_process_main(mozilla::Bootstrap*, int, char** const) 	ipc/contentproc/plugin-container.cpp:63
22 	firefox.exe 	NS_internal_main(int, char**, char**) 	browser/app/nsBrowserApp.cpp:285
23 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:115
24 	firefox.exe 	__scrt_common_main_seh 	f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253
25 	kernel32.dll 	BaseThreadInitThunk 	
26 	ntdll.dll 	RtlUserThreadStart

these are cross-platform crash reports with MOZ_RELEASE_ASSERT(mWorkerPrivate->PrincipalURIMatchesScriptURL()) that got added in bug 1340652.
Well, bug 1340652 landed in FF54.  If these crashes are just now showing up its probably a more recent change to our principal handling that is causing a problem now.

The only bug I know of which touched things in this area is bug 863246, but I don't know how.  Maybe some of the data URL principal changes.

I guess we should maybe convert PrincipalURIMatchesScriptURL() into AssertPrincipalURIMatchesScriptURL() so we can have separate assertions for each case we check.  Right now its hard to know which condition is failing.

It would also be interesting to know the URL of the crashing worker script.  I don't know if there is any way to add that to the crash report data or not.  If we could reproduce then we could bisect, etc.
The other possible problem here is that some code is accidentally setting a principal on the WorkerPrivate that contains a document URL that does not match our baseURL.  We try hard to sanitize the principal into the worker, but maybe that broke.

We could try changing the URL check at the end of PrincipalURIMatchesScriptURL to an origin check instead.
Flags: needinfo?(bkelly)
Philipp, are there any common URLs (something like Twitter or Facebook or something that isn't personally identifiable in any way) you can see in the crash reports?
Flags: needinfo?(madperson)
sorry as a volunteer i don't have access to those kind of sensitive fields in socorro...
marco, can you perhaps help in this case?
Flags: needinfo?(madperson) → needinfo?(mcastelluccio)
Most crashes have empty URLs.
The most occuring non-empty one (with 4 reports) is https://www.haaretz.co.il/.
Flags: needinfo?(mcastelluccio)
OK, thanks. Low crash volume and no clear STR makes me think this is a P2. I've subscribe to push notifications from haaretz.co.il to see if I can reproduce.
Priority: -- → P2
:overholt, have you been able to reproduce? 

If not, I suggest we triage this to P3 instead of P2 as it seems this isn't a high crasher.
Flags: needinfo?(overholt)
(In reply to Marion Daly [:mdaly] from comment #7)
> :overholt, have you been able to reproduce? 

I got notifications from that website for a while but never got a crash.
Flags: needinfo?(overholt)
Priority: P2 → P3
in crash-stats the reports seem to be pretty much contained to users on nightly and devedition for what it's worth.
Crash Signature: [@ `anonymous namespace''::ScriptLoaderRunnable::LoadingFinished] [@ (anonymous namespace)::ScriptLoaderRunnable::LoadingFinished] → [@ `anonymous namespace'::ScriptLoaderRunnable::LoadingFinished] [@ `anonymous namespace''::ScriptLoaderRunnable::LoadingFinished] [@ (anonymous namespace)::ScriptLoaderRunnable::LoadingFinished]
(In reply to [:philipp] from comment #9)
> in crash-stats the reports seem to be pretty much contained to users on
> nightly and devedition for what it's worth.

That's expected since its a diagnostic assertion.  The assertion is disabled on release/beta.
Crash Signature: [@ `anonymous namespace'::ScriptLoaderRunnable::LoadingFinished] [@ `anonymous namespace''::ScriptLoaderRunnable::LoadingFinished] [@ (anonymous namespace)::ScriptLoaderRunnable::LoadingFinished] → [@ `anonymous namespace'::ScriptLoaderRunnable::LoadingFinished] [@ `anonymous namespace''::ScriptLoaderRunnable::LoadingFinished] [@ (anonymous namespace)::ScriptLoaderRunnable::LoadingFinished] [@ mozilla::dom::`anonymous namespace'::ScriptLoaderRunn…
Crash Signature: namespace'::ScriptLoaderRunnable::LoadingFinished ] → namespace'::ScriptLoaderRunnable::LoadingFinished ] [@ mozilla::dom::(anonymous namespace)::ScriptLoaderRunnable::LoadingFinished ]
I wonder if we should be doing a less restrictive equality check here:

https://searchfox.org/mozilla-central/rev/3abf6fa7e2a6d9a7bfb88796141b0f012e68c2db/dom/workers/WorkerLoadInfo.cpp#386

For example, the spec of the principal may not match the loaded URI exactly?  Perhaps we should instead do a same-origin check here instead.

Andrea, what do you think?
Flags: needinfo?(amarchesini)
I started hitting this today with tabs containing internals.rust-lang.org pages:
https://crash-stats.mozilla.com/report/index/bafc39d7-3d5c-46c0-b5f5-346000180322#tab-details

Reloading the tab causes it to crash immediately, and trying to load internals.rlo in a new tab causes it to crash immediately. I can't reproduce on a fresh profile, so it must have something to do with either being logged in or having cache entries?
Ted, can you post your about:support here?
Flags: needinfo?(bkelly) → needinfo?(ted)
Sure.
Flags: needinfo?(ted)
Hmm... I can't reproduce.  Can you try in a new profile?  And if you can save off your current profile, that might be handy in case we can't reproduce without your particular profile state.
Flags: needinfo?(ted)
I tried using a fresh profile and I could not reproduce there, even after logging in to internals.rlo.
Flags: needinfo?(ted)
On bkelly's suggestion I put a breakpoint in WorkerLoadInfo::PrincipalURIMatchesScriptURL:
https://searchfox.org/mozilla-central/rev/c217fbde244344fedfd07b57a740c694a456dbca/dom/workers/WorkerLoadInfo.cpp#342

It's returning false because it gets all the way to the end, and mBaseURI is https://internals.rust-lang.org/service-worker.js , but principalURI is https://internals.rust-lang.org/service-worker-49bb49a79c0c56d327e5dc0c32ba4b6469a79f13f05a5e174c02f8cb5b1df7b3.js. The site seems to have a redirect from the former URL.
Unregistering the service worker for internals.rlo from about:serviceworkers made the crash go away.
I believe what happened is:

1. The site used to just use a service-worker.js script.
2. Ted used the site and got this old service worker installed.
3. The storage for the site was lost somehow.  This could be bug 1183245 or bug 1428452.
4. The site changed to a hashed service-worker.js script name at some point and added a redirect.
5. Ted visited the site again with the old registration in place but the offline script gone.
6. We hit the fallback code in ScriptLoader that re-offlines the script in this case.
7. The assertion triggers because the redirect URL does not match the original script URL of the service worker.

Now, there are a few problems.

a. Service worker scripts are not supposed to be able to be redirected.  We prevent this in ServiceWorkerScriptCache by calling nsIHttpChannel::SetRedirectLimit(0), but AFAICT we don't do this in the ScriptLoader fallback case.
b. We should never really hit the ScriptLoader fallback case anyway.  The new script could be completely different and incompatible with the current storage situation.

I think perhaps the most sane thing to do here is:

i) Remove the ScriptLoader fallback case and instead purge the registration if we hit it.  This will prevent this situation from happening again in the future.
ii) Add some kind of check that the offlined script URL matches the registration script URL.  If they differ, then purge the registration.  This will help fix the existing profiles in a bad state.
iii) Add telemetry for (i) and (ii) to understand when the problems are corrected.

NI myself to think about what this would take.  We should probably do (i) immediately to prevent the problem from getting worse.
Flags: needinfo?(amarchesini) → needinfo?(bkelly)
See Also: → 1447663
Depends on: 1448141
AFAICT this crash hasn't triggered in a buildid since bug 1448141 shipped.  I'm going to declare this fixed.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(bkelly)
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Assignee: nobody → bkelly
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: