Closed Bug 1400282 Opened 7 years ago Closed 7 years ago

Land cache API file descriptor exhaustion test

Categories

(Core :: DOM: Core & HTML, enhancement, P1)

57 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: bkelly, Assigned: tt)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 4 obsolete files)

+++ This bug was initially created as a clone of Bug #1397128 +++

Forking this off from bug 1397128 to land the file descriptor test.  I want to land the main fixes from bug 1397128 today so it has more bake time on nightly before the final merge to beta next week.
Attached patch TestTonsOfFD.patch (obsolete) — Splinter Review
This is the last version of the test from bug 1397128.  Some comments in bug 1397128 comment 95 and bug 1397128 comment 96.
try with patch addressed the comment (before patches in bug 1397128 landed): https://treeherder.mozilla.org/#/jobs?repo=try&revision=d7c80e6283d1cfb7f0b4e9be0b5cfb18c91b4b76

try with patch addressed the comment (after the patches landed): https://treeherder.mozilla.org/#/jobs?repo=try&revision=d2af6455eb37a04c0f13f807700cdc8614f4a820
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #2)
I forgot to apply bug 1397128 comment 96. Note to myself to do it later.
Attached patch tons_fd.patch (obsolete) — Splinter Review
Update patch for addressing comment 95 (not addressing #96 yet)
Attachment #8908655 - Attachment is obsolete: true
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #2)
> try with patch addressed the comment (before patches in bug 1397128 landed):
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=d7c80e6283d1cfb7f0b4e9be0b5cfb18c91b4b76

Hmm, it seems that 5000 fds don't make the test fail on Linux, Windows and MacOS. Thus, increase it to 10240.

Push to try (without the fix in bug 1397128) again with patch address comment 96 and increasing opening file times to 10240.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=464535944c228207455f25b131107eb648d66a43
Honestly I would not expect the test to fail on windows.  I think it has a very large limit there.

Also, even if we can't force it to fail in automation there is some value in the test if it fails locally.
Would it be useful to be able to just count the number of open FD's before and after the cache match calls and fail if there's been an increase (modulo a slack factor that allows for other things to be happening outside of the test's purview)?

I made the following function that definitely works on linux/android and should also work on OS X but on Windows I don't see an easy option.  (I think you need to use GetProcessHandleCount() and that's dealing with more than just file handles, plus I don't see any easy way to access it if we're removing ctypes.)


Components.utils.import("resource://gre/modules/osfile.jsm");
Components.utils.import("resource://gre/modules/AppConstants.jsm");
async function countFDs() {
  let fdDir;
  switch (AppConstants.platform) {
    case "linux":
    case "android":
      fdDir = "/proc/self/fd/";
      break;
    case "macosx":
      fdDir = "/dev/fd/";
      break;
    default:
      throw new Error("countFDs supported on this platform!");
  }
  const iter = new OS.File.DirectoryIterator(fdDir);
  let fdCount = 0;
  try {
    let batch = await iter.nextBatch();
    for (let entry of batch) {
      fdCount++;
    }
  } finally {
    iter.close();
  }
  return fdCount;
}

And then if you paste that and the following into the browser toolbox scratchpad:
  countFDs().then((fdCount) => { console.log("fd count:", fdCount); });
and run it, you get something like the following in the console:
  fd count: 152
Correction.  Since:
- The FDs get transferred to the content process and that's where we expect exhaustion.
- OS.File's operation depends on not being sandboxed, which would mean that if running the above logic in a content process, it needs to be a file:// content process.  This only happens for file:// URI top-level loads modulo edge-cases we don't want to deal with.
- OS.File probably one day will likely assert that it's only used in the parent process, meaning the above hack won't work.
- I think /dev/fd only works for the current process on OS X.  (But on linux, /proc/PID/fd/ works for everything, as long as you've got permissions, and the parent process certainly should have permission.)

This probably instead wants to be linux-only and have the parent process read the contents of the given child's /proc/PID/fd directory:


Components.utils.import("resource://gre/modules/osfile.jsm");
Components.utils.import("resource://gre/modules/AppConstants.jsm");
async function countFDsInProcess(pid) {
  let fdDir;
  switch (AppConstants.platform) {
    case "linux":
    case "android":
      fdDir = `/proc/${pid}/fd/`;
      break;
    default:
      throw new Error("countFDs supported on this platform!");
  }
  const iter = new OS.File.DirectoryIterator(fdDir);
  let fdCount = 0;
  try {
    let batch = await iter.nextBatch();
    for (let entry of batch) {
      fdCount++;
    }
  } finally {
    iter.close();
  }
  return fdCount;
}


And the usage looks like (with a hardcoded process id):
  countFDsInProcess(3513).then((fdCount) => { console.log("child fd count:", fdCount); });

In a test, you'd use either:
- From the parent, like in a browser chrome test: Use the above logic directly and find the pid using nsITabParent's osPid field[1], like is accessible via `tab.linkedBrowser.frameLoader.tabParent.osPid` given the `tab` returned by `BrowserTestUtils.openNewForegroundTab`.
- From the child, like if sticking with a mochitest like the current patch, I think you'd use SpecialPowers.loadChromeScript() to load and run the given logic in the parent and send it the pid using the args support.  You'd find the pid of the content process by doing `SpecialPowers.Services.appinfo.processID` which I believe grants magical wrapped access to Services.jsm and Services.jsm exposes the nsIXULRuntime/nsIXULAppInfo on "appinfo".

1: https://searchfox.org/mozilla-central/source/dom/interfaces/base/nsITabParent.idl#51
(Note that although Android is non-e10s, the mochitest child idiom should still work.  The xul appinfo should report its own process, and loadChromeScript just ends up loading the script in the same process in the non-e10s case, and /proc/PID/fd/ works too for your own pid.)
(In reply to Ben Kelly [:bkelly] from comment #6)
> Honestly I would not expect the test to fail on windows.  I think it has a
> very large limit there.

I see. But, the test only fails on Android(not on Linux and MacOS).

> Also, even if we can't force it to fail in automation there is some value in
> the test if it fails locally.

Yes, you are right. But, I was only able to make the test fail on Ubuntu if I set the opening file times to 5000. After tweaking the value to 10240, I was able to make the test fail on my MacBookPro.
(In reply to Andrew Sutherland [:asuth] from comment #8)
> In a test, you'd use either:
> - From the parent, like in a browser chrome test: Use the above logic
> directly and find the pid using nsITabParent's osPid field[1], like is
> accessible via `tab.linkedBrowser.frameLoader.tabParent.osPid` given the
> `tab` returned by `BrowserTestUtils.openNewForegroundTab`.
> - From the child, like if sticking with a mochitest like the current patch,
> I think you'd use SpecialPowers.loadChromeScript() to load and run the given
> logic in the parent and send it the pid using the args support.  You'd find
> the pid of the content process by doing
> `SpecialPowers.Services.appinfo.processID` which I believe grants magical
> wrapped access to Services.jsm and Services.jsm exposes the
> nsIXULRuntime/nsIXULAppInfo on "appinfo".
> 
> 1:
> https://searchfox.org/mozilla-central/source/dom/interfaces/base/
> nsITabParent.idl#51

Thanks for the information! I will try the second option (staying in the mochitest and using ladChromeScript) first.
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #11)
> Thanks for the information! I will try the second option (staying in the
> mochitest and using ladChromeScript) first.

I was thinking like I can get the max FD number that not leads cache.match() to fail. But, I find out that I can get the number of iterating when throwing by the original method (try-catch the exception while cache match()). Anyway, I verify that :asuth's idea works! I can get the number of fd by running the script locally.
Comment 6 changed my mind. Maybe I don't need to run out of FD on the try. 

I think it's okay to just test cache won't run out of FD if doing cache.match() around 5000 times.
Attachment #8908953 - Attachment is obsolete: true
Attachment #8909742 - Flags: review?(bkelly)
(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #14)
Hi Ben,

I addressed your feedback in bug 1397128 comment 95 and bug 1397128 comment 96. Though it doesn't fail on the try even before applying your fix, it does fail on my Ubuntu locally. Could you help me to review it? Thanks!
Comment on attachment 8909742 [details] [diff] [review]
Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r?bkelly

Review of attachment 8909742 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!

::: dom/cache/test/mochitest/test_cache_tons_of_fd.html
@@ +77,5 @@
> +    }
> +
> +    if (bodyText != body) {
> +      // Reduce the checking message.
> +      return Promsie.reject("The cached body doeen't be the same as original " +

s/Promsie/Promise/g

Also, I think you can just `throw()` from an async function and it will be converted to a promise rejection automatically.  This is fine, though.
Attachment #8909742 - Flags: review?(bkelly) → review+
(In reply to Ben Kelly [:bkelly] from comment #15)
> Also, I think you can just `throw()` from an async function and it will be
> converted to a promise rejection automatically.  This is fine, though.

Thanks! I'll change them to "throw()".
Sorry, I updated the wrong patch
Attachment #8910158 - Attachment is obsolete: true
Attachment #8910159 - Flags: review+
Keywords: checkin-needed
Attachment #8910159 - Attachment description: Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly → [Final] Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/65f8abeba6c6
Add a test to verify cache's operations won't run out of fd. r=bkelly
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/65f8abeba6c6
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: