Closed Bug 942111 Opened 11 years ago Closed 9 years ago

Intermittent Jetpack WindowsError: [Error 32] The process cannot access the file because it is being used by another process (" command timed out: 4500 seconds elapsed, attempting to kill")

Categories

(Add-on SDK Graveyard :: General, defect, P1)

x86
Windows XP
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: cbook, Unassigned)

References

()

Details

Attachments

(1 file)

Windows XP 32-bit mozilla-inbound debug test jetpack on 2013-11-22 00:42:53 PST for push 2ab7cff78c50

slave: t-xp32-ix-019

https://tbpl.mozilla.org/php/getParsedLog.php?id=30951440&tree=Mozilla-Inbound

We are currently seeing this quite frequently on Win XP; it seems XP wants to remove an in-use file.

WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-_bu5fk'
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-_bu5fk'
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-hhpsfi'
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-hhpsfi'
command timed out: 7200 seconds elapsed, attempting to kill
remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
Component: Jetpack SDK → General
Product: Mozilla Labs → Add-on SDK
Erik, this seems to have spiked since yesterday's update from upstream.
Flags: needinfo?(evold)
Priority: -- → P1
Blocks: 981003
Windows 7 jetpack jobs hidden for too many failures of the type in this bug; once this bug is fixed bug 981003 will unhide them on TBPL.

For now you'll need to add the &showall=1 URL parameter, e.g.:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&showall=1&jobname=Windows%207.*jetpack
Hmm so nothing has really changed on our end that I would suspect of causing a change here.

The only change to runner.py in bug 972925 was https://github.com/mozilla/addon-sdk/compare/2d29a417803206a25c87764b6a7ca1d6cfab9afc...036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43

For some reason some process has a hold on our outfile, so I'm guessing it's either Windows itself or a bug in Python for Windows (did we update this?).

We can catch these exceptions and ignore them safely, but then the temp output file is left behind.

I'm not sure what else we can do besides a `time.sleep(..)` & retry, or tracking down the process holding on to the file, which I don't think came from the SDK side.
Flags: needinfo?(evold) → needinfo?(dtownsend+bugmail)
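A minimal sketch of the `time.sleep(..)` & retry idea, assuming the harness removes its temp stdout file with `os.remove()` in Python; the function name, retry count and delay are illustrative, not the actual runner.py code:

```python
import os
import time

def remove_with_retry(path, attempts=5, delay=0.5):
    """Try to delete a temp file, retrying while Windows still has it open.

    On Windows, os.remove() raises WindowsError (winerror 32, a subclass of
    OSError) while another process holds an open handle to the file; waiting
    briefly and retrying is often enough for the handle to be released.
    """
    for _ in range(attempts):
        try:
            os.remove(path)
            return True
        except OSError:
            # Some other process still holds the file open; back off and retry.
            time.sleep(delay)
    return False
```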
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #190)
> Hmm so nothing has really changed on our end that I would suspect of causing
> a change here.
> 
> The only change to runner.py in bug 972925 was
> https://github.com/mozilla/addon-sdk/compare/
> 2d29a417803206a25c87764b6a7ca1d6cfab9afc...
> 036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43
> 
> For some reason some process has a hold on our outfile, so I'm guessing that
> it's Windows? or a bug in Python for windows (did we update this?).
> 
> We can catch these exceptions and ignore them safely, but then the temp
> output file is left behind.
> 
> I'm not sure what else we can do besides doing a `time.sleep(..)` & retry or
> tracking down the process holding on to the file, which I don't think that
> came from the SDK side.

The only other possible change would be the child-process stuff.
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #190)
> Hmm so nothing has really changed on our end that I would suspect of causing
> a change here.
> 
> The only change to runner.py in bug 972925 was
> https://github.com/mozilla/addon-sdk/compare/
> 2d29a417803206a25c87764b6a7ca1d6cfab9afc...
> 036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43
> 
> For some reason some process has a hold on our outfile, so I'm guessing that
> it's Windows? or a bug in Python for windows (did we update this?).
> 
> We can catch these exceptions and ignore them safely, but then the temp
> output file is left behind.

Let's do that for now to at least reduce the orange effect here. Can we log when it happens so we can at least see it?
Flags: needinfo?(dtownsend+bugmail)
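A sketch of that catch-and-log approach, assuming the cleanup happens in the Python harness; the function name and message are illustrative, not the code that eventually landed:

```python
import os
import sys

def cleanup_outfile(path):
    """Best-effort removal of the harness stdout temp file.

    Swallow the Windows sharing-violation error so it doesn't fail the run,
    but write a warning so the intermittent problem stays visible in the log.
    """
    try:
        os.remove(path)
    except OSError as e:
        sys.stderr.write("WARNING: could not remove %s: %s\n" % (path, e))
```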
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #191)
> (In reply to Erik Vold [:erikvold] [:ztatic] from comment #190)
> > Hmm so nothing has really changed on our end that I would suspect of causing
> > a change here.
> > 
> > The only change to runner.py in bug 972925 was
> > https://github.com/mozilla/addon-sdk/compare/
> > 2d29a417803206a25c87764b6a7ca1d6cfab9afc...
> > 036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43
> > 
> > For some reason some process has a hold on our outfile, so I'm guessing that
> > it's Windows? or a bug in Python for windows (did we update this?).
> > 
> > We can catch these exceptions and ignore them safely, but then the temp
> > output file is left behind.
> > 
> > I'm not sure what else we can do besides doing a `time.sleep(..)` & retry or
> > tracking down the process holding on to the file, which I don't think that
> > came from the SDK side.
> 
> The only other possible change would be the child-process stuff.

Hmm, do the child processes we start get killed with the Firefox process, or do they use the output log?
Flags: needinfo?(jsantell)
child_process does not do anything with the output log. Do we have a list of commits from that range? I'm not sure what happens to processes that are spawned if Firefox crashes, but they shouldn't have a hold on that output file. Some of the tests communicate with processes over stdout/stdin, and maybe some of the test scripts shouldn't be used on Windows, but they're all very basic:
https://github.com/mozilla/addon-sdk/blob/ac485e854b754e614f5c697b681968022dafe6c4/test/fixtures/child-process-scripts.js#L16-L33

Does anything look like a red flag that'd cause this?
Flags: needinfo?(jsantell)
Mossop and I were discussing possible solutions:

* Get rid of writing to a file in the test harness
* Check running processes in our test harness and log them for more insight (see the sketch after this list)
* Check whether child processes leave the Firefox process lingering during a failure
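A rough sketch of that second idea, logging which processes are still alive when the error hits; it relies on the stock Windows `tasklist` command and is purely illustrative, not part of the actual harness:

```python
import subprocess

def log_running_processes(log):
    """Dump the current Windows process list into the test log.

    Uses the `tasklist` command that ships with Windows, so no extra
    dependencies are needed; the output may show which process is still
    holding the harness temp file open.
    """
    try:
        log.write(subprocess.check_output(["tasklist"]))
    except (OSError, subprocess.CalledProcessError) as e:
        log.write("Failed to run tasklist: %s\n" % e)
```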
These errors here `internal.tracker is undefined` could probably be fixed via bug 991692's patch
(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #414)
> These errors here `internal.tracker is undefined` could probably be fixed
> via bug 991692's patch

The `internal.tracker is undefined` strings in the other comments in this bug are indeed bug 991692; however, they don't appear to be the cause of the actual failure these comments represent, which is a timeout during the jetpack test run.
(Note the bug comments here are just whatever log lines matched the TBPL regexp - and thus false positives and/or failures earlier in the log will also appear in the text posted to Bugzilla.)
Right, closing bug 991692 will just clean up these messages a bit, not solve the actual file lock issues in this thread
Assignee: nobody → evold
Assignee: evold → nobody
Blocks: 784681
Just pushed a hacky workaround for bug 1006043; possibly related.
Depends on: 1020458
Blocks: 1020473
No longer blocks: 981003
I'm pretty sure this is a dup - the 7200-second timeout issue is the only thing in common across all of these logs. Some of the logs should be broken out into separate bugs, but that will be easier with the fix made in bug 1020458, so I think we should just close this bug for now.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #1036)
> I'm pretty sure this is a dup, the 7200 timeout issue is, which is the only
> thing in common for all of these logs, so of the logs should be broken out
> in to separate bugs, but that will be easier with the fix made to bug
> 1020458, so I think we should just close this bug for now.
> 
> *** This bug has been marked as a duplicate of bug 1020458 ***

If you ignore the recent mis-stars, there are common failures in this bug - the ones described in the summary, comment 0 and e.g. comment 1022. How you wish for these to be filed is up to you; however, the failures per the summary are prolific, so another bug needs to be filed for them ASAP if you don't want to continue using this one...
https://tbpl.mozilla.org/php/getParsedLog.php?id=42141317&tree=Mozilla-B2g30-v1.4
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Summary: Intermittent Jetpack command timed out: 7200 seconds elapsed, attempting to kill | The process cannot access the file because it is being used by another process → Intermittent Jetpack WindowsError: [Error 32] The process cannot access the file because it is being used by another process (" command timed out: 4500 seconds elapsed, attempting to kill")
Attachment #8443726 - Flags: review?(jsantell) → review+
Commits pushed to master at https://github.com/mozilla/addon-sdk

https://github.com/mozilla/addon-sdk/commit/195f805729928c51b64da5bc21a602c2d602deef
Bug 942111 - Catching Intermittent Jetpack WindowsError

https://github.com/mozilla/addon-sdk/commit/674e37f09307ca3ffaff2251b21329c1c4305522
Merge pull request #1522 from erikvold/942111

Bug 942111 - Catching Intermittent Jetpack WindowsError r=@jsantell
This was a feature of the cfx-based harness, which is no longer used on the main trees.
No longer blocks: 1020473
Jesus. Talos? Builds? TB xpcshell? Could we do a shittier job of starring if we actually *tried* to misstar things?
Status: REOPENED → RESOLVED
Closed: 10 years ago → 9 years ago
Resolution: --- → WORKSFORME