Bugzilla

Comment 1

•

8 years ago

We used to limit worker threads but recently removed that to conform to how other browsers operate.

Enes Göktaş [:egoktas]

Reporter

Comment 2

•

8 years ago

I see. Would it hurt if we would cap the number of web workers in a page to a certain amount, e.g. 100? I think it is an easy mitigation against stack spraying.
From my experience other browsers allow tens of web workers, while FF allows thousands of web workers.

Comment 3

•

8 years ago

(In reply to Enes Göktaş [:egoktas] from comment #2)
> From my experience other browsers allow tens of web workers, while FF allows
> thousands of web workers.

Are you using something to measure this?  My impression chrome did not limit workers.

What is the issue with "stack spraying"?

Enes Göktaş [:egoktas]

Reporter

Comment 4

•

8 years ago

I did some empirical tests to see how many web workers other browsers would allow.
On Chrome I noticed that it uses so much memory for every new web worker, such that after spawning tens of web workers the process runs out of memory for web workers and throws an error. So there is like a soft limit.
Spawning many stacks increases the chance of being able to randomly hit a stack. In the case of being able to spawn an unlimited amount of stacks, the chance of missing a stack can get to zero. This gets more significant when for example a defense mechanism like SafeStack would get deployed. SafeStack is a new feature in LLVM that separates the return addresses and some other safe values from the actual stack to protect against buffer overflows. It ensures there are no pointers to it and to protect against arbitrary writes it randomizes the location of the separated stack(safestack). Stack spraying can potentially reduce its security.

Also spawning many threads would slow down the system and potentially bring it to a halt if it reaches the total number of processes, e.g. on Linux. Imagine a page injected with a malicious script that just aims at DoSsing the system.

Comment 5

•

8 years ago

Thanks for the explanation.

Sounds like we should put the limit back in with some quite large constant.  For example, limit threads to 256 or 512.  Andrea, what do you think?

Flags: needinfo?(amarchesini)

Assignee

Comment 6

•

8 years ago

> Sounds like we should put the limit back in with some quite large constant. 
> For example, limit threads to 256 or 512.  Andrea, what do you think?

There are many possible DoSsing a script can do to test the limit of a browser: memory allocation, workers, are just 2 of them.

The reason why we decided to remove the limit is that, if a malicious page wants to crash a browser, it can use multiple domains to workaround the limit per domain.

Flags: needinfo?(amarchesini)

Comment 7

•

8 years ago

But I think the issue here is a bit different from normal DoS:

1) You can DoS the entire OS by exhausting process space.  Effectively a fork bomb.
2) The ability to spam stacks reduces the effectiveness of the ASLR security technique. [1]

These suggest we should have *some* limit, even if its very high.

[1]: https://en.wikipedia.org/wiki/Address_space_layout_randomization

Comment 8

•

8 years ago

See comment 7.

Flags: needinfo?(amarchesini)

Assignee

Comment 9

•

8 years ago

ok. I think, having a limit up to 512 should be more than enough for any webpage. I'll reintroduce the limit.
Maybe adding some new telemetry IDs (in case the old ones have been removed) to see if this limit is reach by some website.

Flags: needinfo?(amarchesini)

Comment 10

•

8 years ago

Thanks!

Assignee: nobody → amarchesini

Status: NEW → ASSIGNED

Assignee

Comment 11

•

8 years ago

Attached file backed_out_changeset_feb0e93ac30d (obsolete) — Details

This is just the backout of bug 1280241 with these changes:

1. 512 limit
2. a telemetry ID for dedicated workers.

Attachment #8782789 - Flags: review?(bkelly)

Masatoshi Kimura [:emk]

Comment 12

•

8 years ago

I could not find the number 512 in this patch.

Assignee

Comment 13

•

8 years ago

Attached patch backed_out_changeset_feb0e93ac30d — Details — Splinter Review

wrong patch.

Attachment #8782789 - Attachment is obsolete: true

Attachment #8782789 - Flags: review?(bkelly)

Attachment #8782803 - Flags: review?(bkelly)

Updated

•

8 years ago

Attachment #8782803 - Attachment is patch: true

Comment 14

•

8 years ago

Comment on attachment 8782803 [details] [diff] [review]
backed_out_changeset_feb0e93ac30d

Review of attachment 8782803 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.  I did not reverse engineer the mechanism completely since we are restoring code that use to run before.  If you made extensive changes, please separate the backout patch from your additions.  (I assume you did not make extensive changes, though.)

Thanks!

::: dom/workers/RuntimeService.cpp
@@ +2113,5 @@
>  RuntimeService::CancelWorkersForWindow(nsPIDOMWindowInner* aWindow)
>  {
>    AssertIsOnMainThread();
>  
> +  AutoTArray<WorkerPrivate*, MAX_WORKERS_PER_DOMAIN> workers;

Looks like these are about 4k on the stack.  I guess thats ok.

@@ +2456,5 @@
>        numberOfProcessors = 1; // Must be one there somewhere
>      }
> +    uint32_t clampedValue = std::min(uint32_t(numberOfProcessors),
> +                                     gMaxWorkersPerDomain);
> +    clampedHardwareConcurrency.compareExchange(0, clampedValue);

I thought we wanted this clamped to a smaller value to reduce finger-printing.  Is that not the case?

::: modules/libpref/init/all.js
@@ +139,5 @@
>  pref("dom.select_events.enabled", true);
>  
>  // Whether or not Web Workers are enabled.
>  pref("dom.workers.enabled", true);
> +// The number of workers per domain allowed to run concurrently.

Maybe extend the comment that we're going for effectively infinite, while preventing abuse.

::: toolkit/components/telemetry/Histograms.json
@@ +8478,5 @@
>    },
> +  "SERVICE_WORKER_SPAWN_GETS_QUEUED": {
> +    "alert_emails": ["amarchesini@mozilla.com"],
> +    "bug_numbers": [1286895],
> +    "expires_in_version": "58",

Are the guidelines for when to expire telemetry?  Any reason to expire these at all?  Hardware is going to change over time so we may want these in place.

Attachment #8782803 - Flags: review?(bkelly) → review+

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 15

•

8 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4c8b3de527e
Reintroduce a limit on number of dedicated JS web workers in Firefox, r=bkelly

Comment 16

•

8 years ago

Backed this out for crashing in test_bug1241485.html, at least on Windows 7 opt:

https://hg.mozilla.org/integration/mozilla-inbound/rev/0a3247080eba

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=f4c8b3de527ee6b47a3167248e8a398b6254f119
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=34342370&repo=mozilla-inbound

01:28:20  WARNING -  TEST-UNEXPECTED-FAIL | dom/workers/test/test_bug1241485.html | application terminated with exit code 1
01:28:20     INFO -  runtests.py | Application ran for: 0:00:13.246000
01:28:20     INFO -  zombiecheck | Reading PID log: c:\users\cltbld\appdata\local\temp\tmpocpsz3pidlog
01:28:20     INFO -  mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/F__ynWzdSZeb0aeG5T-izA/artifacts/public/build/firefox-51.0a1.en-US.win32.crashreporter-symbols.zip
01:28:25     INFO -  mozcrash Copy/paste: C:\slave\test\build\win32-minidump_stackwalk.exe c:\users\cltbld\appdata\local\temp\tmpe1ozr8.mozrunner\minidumps\5b90142c-3ce9-46aa-95b7-84f65d563bfe.dmp c:\users\cltbld\appdata\local\temp\tmpmsng85
01:28:33     INFO -  mozcrash Saved minidump as C:\slave\test\build\blobber_upload_dir\5b90142c-3ce9-46aa-95b7-84f65d563bfe.dmp
01:28:33     INFO -  mozcrash Saved app info as C:\slave\test\build\blobber_upload_dir\5b90142c-3ce9-46aa-95b7-84f65d563bfe.extra
01:28:33  WARNING -  PROCESS-CRASH | dom/workers/test/test_bug1241485.html | application crashed [@ mozilla::dom::workers::GetCurrentThreadJSContext()]
01:28:33     INFO -  Crash dump filename: c:\users\cltbld\appdata\local\temp\tmpe1ozr8.mozrunner\minidumps\5b90142c-3ce9-46aa-95b7-84f65d563bfe.dmp
01:28:33     INFO -  Operating system: Windows NT
01:28:33     INFO -                    6.1.7601 Service Pack 1
01:28:33     INFO -  CPU: x86
01:28:33     INFO -       GenuineIntel family 6 model 30 stepping 5
01:28:33     INFO -       8 CPUs
01:28:33     INFO -  Crash reason:  EXCEPTION_ACCESS_VIOLATION_READ
01:28:33     INFO -  Crash address: 0x2d0
01:28:33     INFO -  Process uptime: 13 seconds
01:28:33     INFO -  Thread 393 (crashed)
01:28:33     INFO -   0  xul.dll!mozilla::dom::workers::GetCurrentThreadJSContext() [RuntimeService.cpp:f4c8b3de527e : 1291 + 0x5]
01:28:33     INFO -      eip = 0x61bba83b   esp = 0x749df80c   ebp = 0x749df900   ebx = 0x7e7a0100
01:28:33     INFO -      esi = 0x00000000   edi = 0x19e53b20   eax = 0x00000000   ecx = 0x0000002c
01:28:33     INFO -      edx = 0x749df94c   efl = 0x00010246
01:28:33     INFO -      Found by: given as instruction pointer in context
01:28:33     INFO -   1  xul.dll!mozilla::dom::workers::WorkerRunnable::Run() [WorkerRunnable.cpp:f4c8b3de527e : 289 + 0x5]
01:28:33     INFO -      eip = 0x61bdf377   esp = 0x749df810   ebp = 0x749df900
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   2  xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:f4c8b3de527e : 1058 + 0x6]
01:28:33     INFO -      eip = 0x60c7d307   esp = 0x749df908   ebp = 0x749df970
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   3  xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:f4c8b3de527e : 290 + 0xd]
01:28:33     INFO -      eip = 0x60c937bc   esp = 0x749df978   ebp = 0x749df984
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   4  xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [MessagePump.cpp:f4c8b3de527e : 338 + 0x8]
01:28:33     INFO -      eip = 0x60f0c64c   esp = 0x749df98c   ebp = 0x749df9ac
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   5  xul.dll!MessageLoop::RunHandler() [message_loop.cc:f4c8b3de527e : 225 + 0x8]
01:28:33     INFO -      eip = 0x60ef948e   esp = 0x749df9b4   ebp = 0x749df9e4
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   6  xul.dll!MessageLoop::Run() [message_loop.cc:f4c8b3de527e : 205 + 0x7]
01:28:33     INFO -      eip = 0x60ef9281   esp = 0x749df9ec   ebp = 0x749dfa04
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   7  xul.dll!nsThread::ThreadFunc(void *) [nsThread.cpp:f4c8b3de527e : 459 + 0x7]
01:28:33     INFO -      eip = 0x60c804a0   esp = 0x749dfa0c   ebp = 0x749dfa24
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   8  nss3.dll!_PR_NativeRunThread [pruthr.c:f4c8b3de527e : 397 + 0x6]
01:28:33     INFO -      eip = 0x6a011520   esp = 0x749dfa2c   ebp = 0x749dfa40
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -   9  nss3.dll!pr_root [w95thred.c:f4c8b3de527e : 95 + 0xa]
01:28:33     INFO -      eip = 0x6a006b2c   esp = 0x749dfa48   ebp = 0x749dfa4c
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -  10  ucrtbase.dll!_crt_at_quick_exit + 0x104
01:28:33     INFO -      eip = 0x69d162a4   esp = 0x749dfa54   ebp = 0x749dfa88
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -  11  kernel32.dll!BaseThreadInitThunk + 0x12
01:28:33     INFO -      eip = 0x77743c45   esp = 0x749dfa90   ebp = 0x749dfa94
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -  12  ntdll.dll!__RtlUserThreadStart + 0x27
01:28:33     INFO -      eip = 0x77c837f5   esp = 0x749dfa9c   ebp = 0x749dfad4
01:28:33     INFO -      Found by: call frame info
01:28:33     INFO -  13  ntdll.dll!_RtlUserThreadStart + 0x1b
01:28:33     INFO -      eip = 0x77c837c8   esp = 0x749dfadc   ebp = 0x749dfaec
01:28:33     INFO -      Found by: call frame info

Flags: needinfo?(amarchesini)

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 17

•

8 years ago

Furthermore Windows XP and 7 opt + pgo failed with dom/workers/test/test_transferable.html | uncaught exception - uncaught exception: out of memory at undefined:

https://treeherder.mozilla.org/logviewer.html#?job_id=34343632&repo=mozilla-inbound

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 18

•

8 years ago

Sorry, meant only Windows XP.

Comment 19

•

8 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1a91a95d5db4
Reintroduce a limit on number of dedicated JS web workers in Firefox, r=bkelly

Comment 20

•

8 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9e935bd97a1c
Remove a test for checking the number of created workers, r=me

Ryan VanderMeulen [:RyanVM]

Comment 21

•

8 years ago

Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/328618be592c
Fix a talos regression, r=me

Comment 22

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/1a91a95d5db4

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox51: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla51

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 23

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/9e935bd97a1c
https://hg.mozilla.org/mozilla-central/rev/328618be592c

Assignee

Updated

•

8 years ago

Flags: needinfo?(amarchesini)

François Marier [:francois]

Assignee

Comment 24

•

8 years ago

Comment on attachment 8782803 [details] [diff] [review]
backed_out_changeset_feb0e93ac30d

can you do a data review for this patch?

Attachment #8782803 - Flags: review?(francois)

Comment 25

•

8 years ago

Comment on attachment 8782803 [details] [diff] [review]
backed_out_changeset_feb0e93ac30d

Review of attachment 8782803 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, except for the expiry. Please use 56.

::: toolkit/components/telemetry/Histograms.json
@@ +8478,5 @@
>    },
> +  "SERVICE_WORKER_SPAWN_GETS_QUEUED": {
> +    "alert_emails": ["amarchesini@mozilla.com"],
> +    "bug_numbers": [1286895],
> +    "expires_in_version": "58",

The guidelines are here: https://wiki.mozilla.org/Firefox/Data_Collection

In this case, can you change 58 for 56? It's not a problem renewing this later if you are still using the data.

Attachment #8782803 - Flags: review?(francois) → review-