Closed Bug 1286895 Opened 9 years ago Closed 9 years ago

No limit on number of dedicated JS web workers in Firefox

Categories

(Core :: DOM: Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: egoktas, Assigned: baku)

References

Details

Attachments

(1 file, 1 obsolete file)

Tested on Linux, there is no limit on the number of JS dedicated web worker in Firefox. Although there is a limit on the number of web workers per page or iframe, there is no limit on the total number of web workers in an opened tab. It is possible to spawn many web workers by spawning many iframes, of which each spawn the maximum amount of web workers. Since each new dedicated web worker corresponds to a new thread in the process, spawning many web workers essentially sprays the memory with stacks. This increases the easiness of finding a stack in the program memory. Preventing the spawning of unlimited can mitigate this issue. Think of stack spraying as heap spraying.
Summary: No limit on number of web workers in Firefox → No limit on number of dedicated JS web workers in Firefox
Component: Security → DOM: Workers
Product: Firefox → Core
We used to limit worker threads but recently removed that to conform to how other browsers operate.
I see. Would it hurt if we would cap the number of web workers in a page to a certain amount, e.g. 100? I think it is an easy mitigation against stack spraying. From my experience other browsers allow tens of web workers, while FF allows thousands of web workers.
(In reply to Enes Göktaş [:egoktas] from comment #2) > From my experience other browsers allow tens of web workers, while FF allows > thousands of web workers. Are you using something to measure this? My impression chrome did not limit workers. What is the issue with "stack spraying"?
I did some empirical tests to see how many web workers other browsers would allow. On Chrome I noticed that it uses so much memory for every new web worker, such that after spawning tens of web workers the process runs out of memory for web workers and throws an error. So there is like a soft limit. Spawning many stacks increases the chance of being able to randomly hit a stack. In the case of being able to spawn an unlimited amount of stacks, the chance of missing a stack can get to zero. This gets more significant when for example a defense mechanism like SafeStack would get deployed. SafeStack is a new feature in LLVM that separates the return addresses and some other safe values from the actual stack to protect against buffer overflows. It ensures there are no pointers to it and to protect against arbitrary writes it randomizes the location of the separated stack(safestack). Stack spraying can potentially reduce its security. Also spawning many threads would slow down the system and potentially bring it to a halt if it reaches the total number of processes, e.g. on Linux. Imagine a page injected with a malicious script that just aims at DoSsing the system.
Thanks for the explanation. Sounds like we should put the limit back in with some quite large constant. For example, limit threads to 256 or 512. Andrea, what do you think?
Flags: needinfo?(amarchesini)
> Sounds like we should put the limit back in with some quite large constant. > For example, limit threads to 256 or 512. Andrea, what do you think? There are many possible DoSsing a script can do to test the limit of a browser: memory allocation, workers, are just 2 of them. The reason why we decided to remove the limit is that, if a malicious page wants to crash a browser, it can use multiple domains to workaround the limit per domain.
Flags: needinfo?(amarchesini)
But I think the issue here is a bit different from normal DoS: 1) You can DoS the entire OS by exhausting process space. Effectively a fork bomb. 2) The ability to spam stacks reduces the effectiveness of the ASLR security technique. [1] These suggest we should have *some* limit, even if its very high. [1]: https://en.wikipedia.org/wiki/Address_space_layout_randomization
Flags: needinfo?(amarchesini)
ok. I think, having a limit up to 512 should be more than enough for any webpage. I'll reintroduce the limit. Maybe adding some new telemetry IDs (in case the old ones have been removed) to see if this limit is reach by some website.
Flags: needinfo?(amarchesini)
Thanks!
Assignee: nobody → amarchesini
Status: NEW → ASSIGNED
Attached file backed_out_changeset_feb0e93ac30d (obsolete) —
This is just the backout of bug 1280241 with these changes: 1. 512 limit 2. a telemetry ID for dedicated workers.
Attachment #8782789 - Flags: review?(bkelly)
I could not find the number 512 in this patch.
wrong patch.
Attachment #8782789 - Attachment is obsolete: true
Attachment #8782789 - Flags: review?(bkelly)
Attachment #8782803 - Flags: review?(bkelly)
Attachment #8782803 - Attachment is patch: true
Comment on attachment 8782803 [details] [diff] [review] backed_out_changeset_feb0e93ac30d Review of attachment 8782803 [details] [diff] [review]: ----------------------------------------------------------------- Looks good to me. I did not reverse engineer the mechanism completely since we are restoring code that use to run before. If you made extensive changes, please separate the backout patch from your additions. (I assume you did not make extensive changes, though.) Thanks! ::: dom/workers/RuntimeService.cpp @@ +2113,5 @@ > RuntimeService::CancelWorkersForWindow(nsPIDOMWindowInner* aWindow) > { > AssertIsOnMainThread(); > > + AutoTArray<WorkerPrivate*, MAX_WORKERS_PER_DOMAIN> workers; Looks like these are about 4k on the stack. I guess thats ok. @@ +2456,5 @@ > numberOfProcessors = 1; // Must be one there somewhere > } > + uint32_t clampedValue = std::min(uint32_t(numberOfProcessors), > + gMaxWorkersPerDomain); > + clampedHardwareConcurrency.compareExchange(0, clampedValue); I thought we wanted this clamped to a smaller value to reduce finger-printing. Is that not the case? ::: modules/libpref/init/all.js @@ +139,5 @@ > pref("dom.select_events.enabled", true); > > // Whether or not Web Workers are enabled. > pref("dom.workers.enabled", true); > +// The number of workers per domain allowed to run concurrently. Maybe extend the comment that we're going for effectively infinite, while preventing abuse. ::: toolkit/components/telemetry/Histograms.json @@ +8478,5 @@ > }, > + "SERVICE_WORKER_SPAWN_GETS_QUEUED": { > + "alert_emails": ["amarchesini@mozilla.com"], > + "bug_numbers": [1286895], > + "expires_in_version": "58", Are the guidelines for when to expire telemetry? Any reason to expire these at all? Hardware is going to change over time so we may want these in place.
Attachment #8782803 - Flags: review?(bkelly) → review+
Pushed by amarchesini@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/f4c8b3de527e Reintroduce a limit on number of dedicated JS web workers in Firefox, r=bkelly
Backed this out for crashing in test_bug1241485.html, at least on Windows 7 opt: https://hg.mozilla.org/integration/mozilla-inbound/rev/0a3247080eba Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=f4c8b3de527ee6b47a3167248e8a398b6254f119 Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=34342370&repo=mozilla-inbound 01:28:20 WARNING - TEST-UNEXPECTED-FAIL | dom/workers/test/test_bug1241485.html | application terminated with exit code 1 01:28:20 INFO - runtests.py | Application ran for: 0:00:13.246000 01:28:20 INFO - zombiecheck | Reading PID log: c:\users\cltbld\appdata\local\temp\tmpocpsz3pidlog 01:28:20 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/F__ynWzdSZeb0aeG5T-izA/artifacts/public/build/firefox-51.0a1.en-US.win32.crashreporter-symbols.zip 01:28:25 INFO - mozcrash Copy/paste: C:\slave\test\build\win32-minidump_stackwalk.exe c:\users\cltbld\appdata\local\temp\tmpe1ozr8.mozrunner\minidumps\5b90142c-3ce9-46aa-95b7-84f65d563bfe.dmp c:\users\cltbld\appdata\local\temp\tmpmsng85 01:28:33 INFO - mozcrash Saved minidump as C:\slave\test\build\blobber_upload_dir\5b90142c-3ce9-46aa-95b7-84f65d563bfe.dmp 01:28:33 INFO - mozcrash Saved app info as C:\slave\test\build\blobber_upload_dir\5b90142c-3ce9-46aa-95b7-84f65d563bfe.extra 01:28:33 WARNING - PROCESS-CRASH | dom/workers/test/test_bug1241485.html | application crashed [@ mozilla::dom::workers::GetCurrentThreadJSContext()] 01:28:33 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmpe1ozr8.mozrunner\minidumps\5b90142c-3ce9-46aa-95b7-84f65d563bfe.dmp 01:28:33 INFO - Operating system: Windows NT 01:28:33 INFO - 6.1.7601 Service Pack 1 01:28:33 INFO - CPU: x86 01:28:33 INFO - GenuineIntel family 6 model 30 stepping 5 01:28:33 INFO - 8 CPUs 01:28:33 INFO - Crash reason: EXCEPTION_ACCESS_VIOLATION_READ 01:28:33 INFO - Crash address: 0x2d0 01:28:33 INFO - Process uptime: 13 seconds 01:28:33 INFO - Thread 393 (crashed) 01:28:33 INFO - 0 xul.dll!mozilla::dom::workers::GetCurrentThreadJSContext() [RuntimeService.cpp:f4c8b3de527e : 1291 + 0x5] 01:28:33 INFO - eip = 0x61bba83b esp = 0x749df80c ebp = 0x749df900 ebx = 0x7e7a0100 01:28:33 INFO - esi = 0x00000000 edi = 0x19e53b20 eax = 0x00000000 ecx = 0x0000002c 01:28:33 INFO - edx = 0x749df94c efl = 0x00010246 01:28:33 INFO - Found by: given as instruction pointer in context 01:28:33 INFO - 1 xul.dll!mozilla::dom::workers::WorkerRunnable::Run() [WorkerRunnable.cpp:f4c8b3de527e : 289 + 0x5] 01:28:33 INFO - eip = 0x61bdf377 esp = 0x749df810 ebp = 0x749df900 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 2 xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:f4c8b3de527e : 1058 + 0x6] 01:28:33 INFO - eip = 0x60c7d307 esp = 0x749df908 ebp = 0x749df970 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 3 xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:f4c8b3de527e : 290 + 0xd] 01:28:33 INFO - eip = 0x60c937bc esp = 0x749df978 ebp = 0x749df984 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 4 xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [MessagePump.cpp:f4c8b3de527e : 338 + 0x8] 01:28:33 INFO - eip = 0x60f0c64c esp = 0x749df98c ebp = 0x749df9ac 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 5 xul.dll!MessageLoop::RunHandler() [message_loop.cc:f4c8b3de527e : 225 + 0x8] 01:28:33 INFO - eip = 0x60ef948e esp = 0x749df9b4 ebp = 0x749df9e4 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 6 xul.dll!MessageLoop::Run() [message_loop.cc:f4c8b3de527e : 205 + 0x7] 01:28:33 INFO - eip = 0x60ef9281 esp = 0x749df9ec ebp = 0x749dfa04 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 7 xul.dll!nsThread::ThreadFunc(void *) [nsThread.cpp:f4c8b3de527e : 459 + 0x7] 01:28:33 INFO - eip = 0x60c804a0 esp = 0x749dfa0c ebp = 0x749dfa24 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 8 nss3.dll!_PR_NativeRunThread [pruthr.c:f4c8b3de527e : 397 + 0x6] 01:28:33 INFO - eip = 0x6a011520 esp = 0x749dfa2c ebp = 0x749dfa40 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 9 nss3.dll!pr_root [w95thred.c:f4c8b3de527e : 95 + 0xa] 01:28:33 INFO - eip = 0x6a006b2c esp = 0x749dfa48 ebp = 0x749dfa4c 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 10 ucrtbase.dll!_crt_at_quick_exit + 0x104 01:28:33 INFO - eip = 0x69d162a4 esp = 0x749dfa54 ebp = 0x749dfa88 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 11 kernel32.dll!BaseThreadInitThunk + 0x12 01:28:33 INFO - eip = 0x77743c45 esp = 0x749dfa90 ebp = 0x749dfa94 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 12 ntdll.dll!__RtlUserThreadStart + 0x27 01:28:33 INFO - eip = 0x77c837f5 esp = 0x749dfa9c ebp = 0x749dfad4 01:28:33 INFO - Found by: call frame info 01:28:33 INFO - 13 ntdll.dll!_RtlUserThreadStart + 0x1b 01:28:33 INFO - eip = 0x77c837c8 esp = 0x749dfadc ebp = 0x749dfaec 01:28:33 INFO - Found by: call frame info
Flags: needinfo?(amarchesini)
Furthermore Windows XP and 7 opt + pgo failed with dom/workers/test/test_transferable.html | uncaught exception - uncaught exception: out of memory at undefined: https://treeherder.mozilla.org/logviewer.html#?job_id=34343632&repo=mozilla-inbound
Pushed by amarchesini@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/1a91a95d5db4 Reintroduce a limit on number of dedicated JS web workers in Firefox, r=bkelly
Pushed by amarchesini@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/9e935bd97a1c Remove a test for checking the number of created workers, r=me
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Flags: needinfo?(amarchesini)
Comment on attachment 8782803 [details] [diff] [review] backed_out_changeset_feb0e93ac30d can you do a data review for this patch?
Attachment #8782803 - Flags: review?(francois)
Comment on attachment 8782803 [details] [diff] [review] backed_out_changeset_feb0e93ac30d Review of attachment 8782803 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, except for the expiry. Please use 56. ::: toolkit/components/telemetry/Histograms.json @@ +8478,5 @@ > }, > + "SERVICE_WORKER_SPAWN_GETS_QUEUED": { > + "alert_emails": ["amarchesini@mozilla.com"], > + "bug_numbers": [1286895], > + "expires_in_version": "58", The guidelines are here: https://wiki.mozilla.org/Firefox/Data_Collection In this case, can you change 58 for 56? It's not a problem renewing this later if you are still using the data.
Attachment #8782803 - Flags: review?(francois) → review-
> > + "expires_in_version": "58", Actually I prefer to have 'never' here. In this patch I increased the number of running workers per domain up to 512. If a domain has more than 512 workers, the following one are queued. It seems reasonable to have expire: never because this telemetry ID is meant to be used as alert. Feedback?
Flags: needinfo?(francois)
Benjamin, what would you recommend (re: comment 26)?
Flags: needinfo?(francois) → needinfo?(benjamin)
Has somebody committed to building and maintaining that alerting mechanism? If so this seems like a reasonable permanent quality metric.
Flags: needinfo?(benjamin)
Depends on: 1468036
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: