Closed Bug 1400282 Opened 7 years ago Closed 7 years ago

Land cache API file descriptor exhaustion test

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla57

Tracking Flags:

Tracking

Status

firefox57

---

fixed

People

(Reporter: bkelly, Assigned: tt)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 4 obsolete files)

TestTonsOfFD.patch 7 years ago Ben Kelly [:bkelly, not reviewing] 3.56 KB, patch		Details \| Diff \| Splinter Review
tons_fd.patch 7 years ago Tom Tung [:tt, :ttung] 3.56 KB, patch		Details \| Diff \| Splinter Review
Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r?bkelly 7 years ago Tom Tung [:tt, :ttung] 4.33 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly 7 years ago Tom Tung [:tt, :ttung] 4.20 KB, patch	tt : review+	Details \| Diff \| Splinter Review
[Final] Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly 7 years ago Tom Tung [:tt, :ttung] 4.18 KB, patch	tt : review+	Details \| Diff \| Splinter Review

Ben Kelly [:bkelly, not reviewing]

Reporter

Description

•

7 years ago

+++ This bug was initially created as a clone of Bug #1397128 +++

Forking this off from bug 1397128 to land the file descriptor test.  I want to land the main fixes from bug 1397128 today so it has more bake time on nightly before the final merge to beta next week.

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 1

•

7 years ago

Attached patch TestTonsOfFD.patch (obsolete) — Details — Splinter Review

This is the last version of the test from bug 1397128.  Some comments in bug 1397128 comment 95 and bug 1397128 comment 96.

Tom Tung [:tt, :ttung]

Assignee

Comment 2

•

7 years ago

try with patch addressed the comment (before patches in bug 1397128 landed): https://treeherder.mozilla.org/#/jobs?repo=try&revision=d7c80e6283d1cfb7f0b4e9be0b5cfb18c91b4b76

try with patch addressed the comment (after the patches landed): https://treeherder.mozilla.org/#/jobs?repo=try&revision=d2af6455eb37a04c0f13f807700cdc8614f4a820

Tom Tung [:tt, :ttung]

Assignee

Comment 3

•

7 years ago

(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #2)
I forgot to apply bug 1397128 comment 96. Note to myself to do it later.

Tom Tung [:tt, :ttung]

Assignee

Comment 4

•

7 years ago

Attached patch tons_fd.patch (obsolete) — Details — Splinter Review

Update patch for addressing comment 95 (not addressing #96 yet)

Attachment #8908655 - Attachment is obsolete: true

Tom Tung [:tt, :ttung]

Assignee

Comment 5

•

7 years ago

(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #2)
> try with patch addressed the comment (before patches in bug 1397128 landed):
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=d7c80e6283d1cfb7f0b4e9be0b5cfb18c91b4b76

Hmm, it seems that 5000 fds don't make the test fail on Linux, Windows and MacOS. Thus, increase it to 10240.

Push to try (without the fix in bug 1397128) again with patch address comment 96 and increasing opening file times to 10240.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=464535944c228207455f25b131107eb648d66a43

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 6

•

7 years ago

Honestly I would not expect the test to fail on windows.  I think it has a very large limit there.

Also, even if we can't force it to fail in automation there is some value in the test if it fails locally.

Andrew Sutherland [:asuth] (he/him)

Comment 7

•

7 years ago

Would it be useful to be able to just count the number of open FD's before and after the cache match calls and fail if there's been an increase (modulo a slack factor that allows for other things to be happening outside of the test's purview)?

I made the following function that definitely works on linux/android and should also work on OS X but on Windows I don't see an easy option.  (I think you need to use GetProcessHandleCount() and that's dealing with more than just file handles, plus I don't see any easy way to access it if we're removing ctypes.)


Components.utils.import("resource://gre/modules/osfile.jsm");
Components.utils.import("resource://gre/modules/AppConstants.jsm");
async function countFDs() {
  let fdDir;
  switch (AppConstants.platform) {
    case "linux":
    case "android":
      fdDir = "/proc/self/fd/";
      break;
    case "macosx":
      fdDir = "/dev/fd/";
      break;
    default:
      throw new Error("countFDs supported on this platform!");
  }
  const iter = new OS.File.DirectoryIterator(fdDir);
  let fdCount = 0;
  try {
    let batch = await iter.nextBatch();
    for (let entry of batch) {
      fdCount++;
    }
  } finally {
    iter.close();
  }
  return fdCount;
}

And then if you paste that and the following into the browser toolbox scratchpad:
  countFDs().then((fdCount) => { console.log("fd count:", fdCount); });
and run it, you get something like the following in the console:
  fd count: 152

Andrew Sutherland [:asuth] (he/him)

Comment 8

•

7 years ago

Correction.  Since:
- The FDs get transferred to the content process and that's where we expect exhaustion.
- OS.File's operation depends on not being sandboxed, which would mean that if running the above logic in a content process, it needs to be a file:// content process.  This only happens for file:// URI top-level loads modulo edge-cases we don't want to deal with.
- OS.File probably one day will likely assert that it's only used in the parent process, meaning the above hack won't work.
- I think /dev/fd only works for the current process on OS X.  (But on linux, /proc/PID/fd/ works for everything, as long as you've got permissions, and the parent process certainly should have permission.)

This probably instead wants to be linux-only and have the parent process read the contents of the given child's /proc/PID/fd directory:


Components.utils.import("resource://gre/modules/osfile.jsm");
Components.utils.import("resource://gre/modules/AppConstants.jsm");
async function countFDsInProcess(pid) {
  let fdDir;
  switch (AppConstants.platform) {
    case "linux":
    case "android":
      fdDir = `/proc/${pid}/fd/`;
      break;
    default:
      throw new Error("countFDs supported on this platform!");
  }
  const iter = new OS.File.DirectoryIterator(fdDir);
  let fdCount = 0;
  try {
    let batch = await iter.nextBatch();
    for (let entry of batch) {
      fdCount++;
    }
  } finally {
    iter.close();
  }
  return fdCount;
}


And the usage looks like (with a hardcoded process id):
  countFDsInProcess(3513).then((fdCount) => { console.log("child fd count:", fdCount); });

In a test, you'd use either:
- From the parent, like in a browser chrome test: Use the above logic directly and find the pid using nsITabParent's osPid field[1], like is accessible via `tab.linkedBrowser.frameLoader.tabParent.osPid` given the `tab` returned by `BrowserTestUtils.openNewForegroundTab`.
- From the child, like if sticking with a mochitest like the current patch, I think you'd use SpecialPowers.loadChromeScript() to load and run the given logic in the parent and send it the pid using the args support.  You'd find the pid of the content process by doing `SpecialPowers.Services.appinfo.processID` which I believe grants magical wrapped access to Services.jsm and Services.jsm exposes the nsIXULRuntime/nsIXULAppInfo on "appinfo".

1: https://searchfox.org/mozilla-central/source/dom/interfaces/base/nsITabParent.idl#51

Andrew Sutherland [:asuth] (he/him)

Comment 9

•

7 years ago

(Note that although Android is non-e10s, the mochitest child idiom should still work.  The xul appinfo should report its own process, and loadChromeScript just ends up loading the script in the same process in the non-e10s case, and /proc/PID/fd/ works too for your own pid.)

Tom Tung [:tt, :ttung]

Assignee

Comment 10

•

7 years ago

(In reply to Ben Kelly [:bkelly] from comment #6)
> Honestly I would not expect the test to fail on windows.  I think it has a
> very large limit there.

I see. But, the test only fails on Android(not on Linux and MacOS).

> Also, even if we can't force it to fail in automation there is some value in
> the test if it fails locally.

Yes, you are right. But, I was only able to make the test fail on Ubuntu if I set the opening file times to 5000. After tweaking the value to 10240, I was able to make the test fail on my MacBookPro.

Tom Tung [:tt, :ttung]

Assignee

Comment 11

•

7 years ago

(In reply to Andrew Sutherland [:asuth] from comment #8)
> In a test, you'd use either:
> - From the parent, like in a browser chrome test: Use the above logic
> directly and find the pid using nsITabParent's osPid field[1], like is
> accessible via `tab.linkedBrowser.frameLoader.tabParent.osPid` given the
> `tab` returned by `BrowserTestUtils.openNewForegroundTab`.
> - From the child, like if sticking with a mochitest like the current patch,
> I think you'd use SpecialPowers.loadChromeScript() to load and run the given
> logic in the parent and send it the pid using the args support.  You'd find
> the pid of the content process by doing
> `SpecialPowers.Services.appinfo.processID` which I believe grants magical
> wrapped access to Services.jsm and Services.jsm exposes the
> nsIXULRuntime/nsIXULAppInfo on "appinfo".
> 
> 1:
> https://searchfox.org/mozilla-central/source/dom/interfaces/base/
> nsITabParent.idl#51

Thanks for the information! I will try the second option (staying in the mochitest and using ladChromeScript) first.

Tom Tung [:tt, :ttung]

Assignee

Comment 12

•

7 years ago

(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #11)
> Thanks for the information! I will try the second option (staying in the
> mochitest and using ladChromeScript) first.

I was thinking like I can get the max FD number that not leads cache.match() to fail. But, I find out that I can get the number of iterating when throwing by the original method (try-catch the exception while cache match()). Anyway, I verify that :asuth's idea works! I can get the number of fd by running the script locally.

Tom Tung [:tt, :ttung]

Assignee

Comment 13

•

7 years ago

Attached patch Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r?bkelly (obsolete) — Details — Splinter Review

Comment 6 changed my mind. Maybe I don't need to run out of FD on the try. 

I think it's okay to just test cache won't run out of FD if doing cache.match() around 5000 times.

Attachment #8908953 - Attachment is obsolete: true

Tom Tung [:tt, :ttung]

Assignee

Updated

•

7 years ago

Attachment #8909742 - Flags: review?(bkelly)

Tom Tung [:tt, :ttung]

Assignee

Comment 14

•

7 years ago

(In reply to Tom Tung [:tt] (OoO: 9/22 ~ 9/29) from comment #14)
Hi Ben,

I addressed your feedback in bug 1397128 comment 95 and bug 1397128 comment 96. Though it doesn't fail on the try even before applying your fix, it does fail on my Ubuntu locally. Could you help me to review it? Thanks!

Ben Kelly [:bkelly, not reviewing]

Reporter

Comment 15

•

7 years ago

Comment on attachment 8909742 [details] [diff] [review]
Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r?bkelly

Review of attachment 8909742 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!

::: dom/cache/test/mochitest/test_cache_tons_of_fd.html
@@ +77,5 @@
> +    }
> +
> +    if (bodyText != body) {
> +      // Reduce the checking message.
> +      return Promsie.reject("The cached body doeen't be the same as original " +

s/Promsie/Promise/g

Also, I think you can just `throw()` from an async function and it will be converted to a promise rejection automatically.  This is fine, though.

Attachment #8909742 - Flags: review?(bkelly) → review+

Tom Tung [:tt, :ttung]

Assignee

Comment 16

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=efead53ac31c95b1df78fde3da79b6fe9da7dca4

Tom Tung [:tt, :ttung]

Assignee

Comment 17

•

7 years ago

(In reply to Ben Kelly [:bkelly] from comment #15)
> Also, I think you can just `throw()` from an async function and it will be
> converted to a promise rejection automatically.  This is fine, though.

Thanks! I'll change them to "throw()".

Tom Tung [:tt, :ttung]

Assignee

Comment 18

•

7 years ago

Attached patch Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly (obsolete) — Details — Splinter Review

Attachment #8909742 - Attachment is obsolete: true

Attachment #8910158 - Flags: review+

Tom Tung [:tt, :ttung]

Assignee

Comment 19

•

7 years ago

Attached patch [Final] Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly — Details — Splinter Review

Sorry, I updated the wrong patch

Attachment #8910158 - Attachment is obsolete: true

Attachment #8910159 - Flags: review+

Tom Tung [:tt, :ttung]

Assignee

Comment 20

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2b12683dd80408ce4699d062225c77d08312776a

Tom Tung [:tt, :ttung]

Assignee

Updated

•

7 years ago

Keywords: checkin-needed

Tom Tung [:tt, :ttung]

Assignee

Updated

•

7 years ago

Attachment #8910159 - Attachment description: Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly → [Final] Bug 1400282: Add a Test to verify cache's operations won't run out of fd. r=bkelly

Pulsebot

Comment 21

•

7 years ago

Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/65f8abeba6c6
Add a test to verify cache's operations won't run out of fd. r=bkelly

Keywords: checkin-needed

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 22

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/65f8abeba6c6

Status: ASSIGNED → RESOLVED

Closed: 7 years ago

status-firefox57: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla57

Nobody; OK to take it and work on it

Updated

•

5 years ago

Component: DOM → DOM: Core & HTML

You need to log in before you can comment on or make changes to this bug.