If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Land cache API file descriptor exhaustion test

RESOLVED FIXED in Firefox 57

Status

()

Core
DOM
P1
normal
RESOLVED FIXED
a month ago
26 days ago

People

(Reporter: bkelly, Assigned: tt)

Tracking

(Blocks: 1 bug)

57 Branch
mozilla57
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox57 fixed)

Details

Attachments

(1 attachment, 4 obsolete attachments)

(Reporter)

Description

a month ago
+++ This bug was initially created as a clone of Bug #1397128 +++

Forking this off from bug 1397128 to land the file descriptor test.  I want to land the main fixes from bug 1397128 today so it has more bake time on nightly before the final merge to beta next week.
(Reporter)

Comment 1

a month ago
Created attachment 8908655 [details] [diff] [review]
TestTonsOfFD.patch

This is the last version of the test from bug 1397128.  Some comments in bug 1397128 comment 95 and bug 1397128 comment 96.
(Assignee)

Comment 2

a month ago
try with patch addressed the comment (before patches in bug 1397128 landed): https://treeherder.mozilla.org/#/jobs?repo=try&revision=d7c80e6283d1cfb7f0b4e9be0b5cfb18c91b4b76

try with patch addressed the comment (after the patches landed): https://treeherder.mozilla.org/#/jobs?repo=try&revision=d2af6455eb37a04c0f13f807700cdc8614f4a820
(Assignee)

Comment 3

a month ago
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #2)
I forgot to apply bug 1397128 comment 96. Note to myself to do it later.
(Assignee)

Comment 4

a month ago
Created attachment 8908953 [details] [diff] [review]
tons_fd.patch

Update patch for addressing comment 95 (not addressing #96 yet)
Attachment #8908655 - Attachment is obsolete: true
(Assignee)

Comment 5

28 days ago
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #2)
> try with patch addressed the comment (before patches in bug 1397128 landed):
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=d7c80e6283d1cfb7f0b4e9be0b5cfb18c91b4b76

Hmm, it seems that 5000 fds don't make the test fail on Linux, Windows and MacOS. Thus, increase it to 10240.

Push to try (without the fix in bug 1397128) again with patch address comment 96 and increasing opening file times to 10240.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=464535944c228207455f25b131107eb648d66a43
(Reporter)

Comment 6

28 days ago
Honestly I would not expect the test to fail on windows.  I think it has a very large limit there.

Also, even if we can't force it to fail in automation there is some value in the test if it fails locally.
Would it be useful to be able to just count the number of open FD's before and after the cache match calls and fail if there's been an increase (modulo a slack factor that allows for other things to be happening outside of the test's purview)?

I made the following function that definitely works on linux/android and should also work on OS X but on Windows I don't see an easy option.  (I think you need to use GetProcessHandleCount() and that's dealing with more than just file handles, plus I don't see any easy way to access it if we're removing ctypes.)


Components.utils.import("resource://gre/modules/osfile.jsm");
Components.utils.import("resource://gre/modules/AppConstants.jsm");
async function countFDs() {
  let fdDir;
  switch (AppConstants.platform) {
    case "linux":
    case "android":
      fdDir = "/proc/self/fd/";
      break;
    case "macosx":
      fdDir = "/dev/fd/";
      break;
    default:
      throw new Error("countFDs supported on this platform!");
  }
  const iter = new OS.File.DirectoryIterator(fdDir);
  let fdCount = 0;
  try {
    let batch = await iter.nextBatch();
    for (let entry of batch) {
      fdCount++;
    }
  } finally {
    iter.close();
  }
  return fdCount;
}

And then if you paste that and the following into the browser toolbox scratchpad:
  countFDs().then((fdCount) => { console.log("fd count:", fdCount); });
and run it, you get something like the following in the console:
  fd count: 152
Correction.  Since:
- The FDs get transferred to the content process and that's where we expect exhaustion.
- OS.File's operation depends on not being sandboxed, which would mean that if running the above logic in a content process, it needs to be a file:// content process.  This only happens for file:// URI top-level loads modulo edge-cases we don't want to deal with.
- OS.File probably one day will likely assert that it's only used in the parent process, meaning the above hack won't work.
- I think /dev/fd only works for the current process on OS X.  (But on linux, /proc/PID/fd/ works for everything, as long as you've got permissions, and the parent process certainly should have permission.)

This probably instead wants to be linux-only and have the parent process read the contents of the given child's /proc/PID/fd directory:


Components.utils.import("resource://gre/modules/osfile.jsm");
Components.utils.import("resource://gre/modules/AppConstants.jsm");
async function countFDsInProcess(pid) {
  let fdDir;
  switch (AppConstants.platform) {
    case "linux":
    case "android":
      fdDir = `/proc/${pid}/fd/`;
      break;
    default:
      throw new Error("countFDs supported on this platform!");
  }
  const iter = new OS.File.DirectoryIterator(fdDir);
  let fdCount = 0;
  try {
    let batch = await iter.nextBatch();
    for (let entry of batch) {
      fdCount++;
    }
  } finally {
    iter.close();
  }
  return fdCount;
}


And the usage looks like (with a hardcoded process id):
  countFDsInProcess(3513).then((fdCount) => { console.log("child fd count:", fdCount); });

In a test, you'd use either:
- From the parent, like in a browser chrome test: Use the above logic directly and find the pid using nsITabParent's osPid field[1], like is accessible via `tab.linkedBrowser.frameLoader.tabParent.osPid` given the `tab` returned by `BrowserTestUtils.openNewForegroundTab`.
- From the child, like if sticking with a mochitest like the current patch, I think you'd use SpecialPowers.loadChromeScript() to load and run the given logic in the parent and send it the pid using the args support.  You'd find the pid of the content process by doing `SpecialPowers.Services.appinfo.processID` which I believe grants magical wrapped access to Services.jsm and Services.jsm exposes the nsIXULRuntime/nsIXULAppInfo on "appinfo".

1: https://searchfox.org/mozilla-central/source/dom/interfaces/base/nsITabParent.idl#51
(Note that although Android is non-e10s, the mochitest child idiom should still work.  The xul appinfo should report its own process, and loadChromeScript just ends up loading the script in the same process in the non-e10s case, and /proc/PID/fd/ works too for your own pid.)
(Assignee)

Comment 10

28 days ago
(In reply to Ben Kelly [:bkelly] from comment #6)
> Honestly I would not expect the test to fail on windows.  I think it has a
> very large limit there.

I see. But, the test only fails on Android(not on Linux and MacOS).

> Also, even if we can't force it to fail in automation there is some value in
> the test if it fails locally.

Yes, you are right. But, I was only able to make the test fail on Ubuntu if I set the opening file times to 5000. After tweaking the value to 10240, I was able to make the test fail on my MacBookPro.
(Assignee)

Comment 11

28 days ago
(In reply to Andrew Sutherland [:asuth] from comment #8)
> In a test, you'd use either:
> - From the parent, like in a browser chrome test: Use the above logic
> directly and find the pid using nsITabParent's osPid field[1], like is
> accessible via `tab.linkedBrowser.frameLoader.tabParent.osPid` given the
> `tab` returned by `BrowserTestUtils.openNewForegroundTab`.
> - From the child, like if sticking with a mochitest like the current patch,
> I think you'd use SpecialPowers.loadChromeScript() to load and run the given
> logic in the parent and send it the pid using the args support.  You'd find
> the pid of the content process by doing
> `SpecialPowers.Services.appinfo.processID` which I believe grants magical
> wrapped access to Services.jsm and Services.jsm exposes the
> nsIXULRuntime/nsIXULAppInfo on "appinfo".
> 
> 1:
> https://searchfox.org/mozilla-central/source/dom/interfaces/base/
> nsITabParent.idl#51

Thanks for the information! I will try the second option (staying in the mochitest and using ladChromeScript) first.
(Assignee)

Comment 12

28 days ago
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #11)
> Thanks for the information! I will try the second option (staying in the
> mochitest and using ladChromeScript) first.

I was thinking like I can get the max FD number that not leads cache.match() to fail. But, I find out that I can get the number of iterating when throwing by the original method (try-catch the exception while cache match()). Anyway, I verify that :asuth's idea works! I can get the number of fd by running the script locally.
(Assignee)

Comment 13

28 days ago
Created attachment 8909742 [details] [diff] [review]
Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r?bkelly

Comment 6 changed my mind. Maybe I don't need to run out of FD on the try. 

I think it's okay to just test cache won't run out of FD if doing cache.match() around 5000 times.
Attachment #8908953 - Attachment is obsolete: true
(Assignee)

Updated

28 days ago
Attachment #8909742 - Flags: review?(bkelly)
(Assignee)

Comment 14

28 days ago
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #14)
Hi Ben,

I addressed your feedback in bug 1397128 comment 95 and bug 1397128 comment 96. Though it doesn't fail on the try even before applying your fix, it does fail on my Ubuntu locally. Could you help me to review it? Thanks!
(Reporter)

Comment 15

28 days ago
Comment on attachment 8909742 [details] [diff] [review]
Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r?bkelly

Review of attachment 8909742 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!

::: dom/cache/test/mochitest/test_cache_tons_of_fd.html
@@ +77,5 @@
> +    }
> +
> +    if (bodyText != body) {
> +      // Reduce the checking message.
> +      return Promsie.reject("The cached body doeen't be the same as original " +

s/Promsie/Promise/g

Also, I think you can just `throw()` from an async function and it will be converted to a promise rejection automatically.  This is fine, though.
Attachment #8909742 - Flags: review?(bkelly) → review+
(Assignee)

Comment 16

27 days ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=efead53ac31c95b1df78fde3da79b6fe9da7dca4
(Assignee)

Comment 17

27 days ago
(In reply to Ben Kelly [:bkelly] from comment #15)
> Also, I think you can just `throw()` from an async function and it will be
> converted to a promise rejection automatically.  This is fine, though.

Thanks! I'll change them to "throw()".
(Assignee)

Comment 18

27 days ago
Created attachment 8910158 [details] [diff] [review]
Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly
Attachment #8909742 - Attachment is obsolete: true
Attachment #8910158 - Flags: review+
(Assignee)

Comment 19

27 days ago
Created attachment 8910159 [details] [diff] [review]
[Final] Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly

Sorry, I updated the wrong patch
Attachment #8910158 - Attachment is obsolete: true
Attachment #8910159 - Flags: review+
(Assignee)

Comment 20

27 days ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2b12683dd80408ce4699d062225c77d08312776a
(Assignee)

Updated

27 days ago
Keywords: checkin-needed
(Assignee)

Updated

27 days ago
Attachment #8910159 - Attachment description: Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly → [Final] Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly

Comment 21

27 days ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/65f8abeba6c6
Add a test to verify cache's operations won't run out of fd. r=bkelly
Keywords: checkin-needed

Comment 22

26 days ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/65f8abeba6c6
Status: ASSIGNED → RESOLVED
Last Resolved: 26 days ago
status-firefox57: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.