Closed Bug 942111 Opened 9 years ago Closed 7 years ago

Intermittent Jetpack WindowsError: [Error 32] The process cannot access the file because it is being used by another process (" command timed out: 4500 seconds elapsed, attempting to kill")

Categories

(Add-on SDK Graveyard :: General, defect, P1)

x86
Windows XP
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: cbook, Unassigned)

References


Details

Attachments

(1 file)

Windows XP 32-bit mozilla-inbound debug test jetpack on 2013-11-22 00:42:53 PST for push 2ab7cff78c50

slave: t-xp32-ix-019

https://tbpl.mozilla.org/php/getParsedLog.php?id=30951440&tree=Mozilla-Inbound

We are currently seeing this quite frequently on Win XP; it seems XP wants to remove an in-use file.

WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-_bu5fk'
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-_bu5fk'
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-hhpsfi'
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld~1.t-x\\locals~1\\temp\\harness-stdout-hhpsfi'
command timed out: 7200 seconds elapsed, attempting to kill
remoteFailed: [Failure instance: Traceback from remote host -- Traceback (most recent call last):
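
For context on the failure mode: the harness redirects Firefox's stdout to a temp file (harness-stdout-XXXXXX) and later tries to delete it, and on Windows deleting a file that any process still has open fails with error 32. A minimal, hypothetical Python reproduction of that behaviour (not the actual harness code):

# Hypothetical reproduction sketch, not the actual harness code.
# On Windows, os.remove() on a file that is still open raises
# WindowsError [Error 32] (a sharing violation); on other platforms
# the delete simply succeeds.
import os
import tempfile

fd, path = tempfile.mkstemp(prefix="harness-stdout-")
os.write(fd, b"test output\n")

try:
    os.remove(path)          # fails on Windows while `fd` is still open
except OSError as e:
    print("could not remove %s: %s" % (path, e))
finally:
    os.close(fd)
    if os.path.exists(path):
        os.remove(path)      # succeeds once the handle is closed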
Component: Jetpack SDK → General
Product: Mozilla Labs → Add-on SDK
Erik, this seems to have spiked since yesterday's update from upstream.
Flags: needinfo?(evold)
Priority: -- → P1
Blocks: 981003
Windows 7 jetpack jobs have been hidden due to too many failures of the type in this bug; once this bug is fixed, bug 981003 will unhide them on TBPL.

For now you'll need to use the &showall=1 URL parameter, e.g.:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&showall=1&jobname=Windows%207.*jetpack
Hmm so nothing has really changed on our end that I would suspect of causing a change here.

The only change to runner.py in bug 972925 was https://github.com/mozilla/addon-sdk/compare/2d29a417803206a25c87764b6a7ca1d6cfab9afc...036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43

For some reason some process has a hold on our outfile, so I'm guessing that it's Windows? or a bug in Python for windows (did we update this?).

We can catch these exceptions and ignore them safely, but then the temp output file is left behind.

I'm not sure what else we can do besides a `time.sleep(..)` & retry, or tracking down the process holding on to the file, which I don't think came from the SDK side.
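
As a rough illustration of that sleep-and-retry idea (the helper name below is made up; this is not the actual runner.py change):

# Illustrative sketch only; `remove_with_retry` is a made-up helper name.
import os
import time

def remove_with_retry(path, attempts=5, delay=1.0):
    """Try to delete `path`, retrying while Windows still has it locked."""
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except OSError as e:
            # Sharing violation (WindowsError [Error 32]): some process
            # still holds the file open, so wait and try again.
            print("attempt %d: could not remove %s: %s" % (attempt + 1, path, e))
            time.sleep(delay)
    return False  # give up; caller can leave the temp file behind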
Flags: needinfo?(evold) → needinfo?(dtownsend+bugmail)
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #190)
> Hmm so nothing has really changed on our end that I would suspect of causing
> a change here.
> 
> The only change to runner.py in bug 972925 was
> https://github.com/mozilla/addon-sdk/compare/
> 2d29a417803206a25c87764b6a7ca1d6cfab9afc...
> 036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43
> 
> For some reason some process has a hold on our outfile, so I'm guessing that
> it's Windows? or a bug in Python for windows (did we update this?).
> 
> We can catch these exceptions and ignore them safely, but then the temp
> output file is left behind.
> 
> I'm not sure what else we can do besides doing a `time.sleep(..)` & retry or
> tracking down the process holding on to the file, which I don't think that
> came from the SDK side.

The only other possible change would be the child-process stuff.
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #190)
> Hmm so nothing has really changed on our end that I would suspect of causing
> a change here.
> 
> The only change to runner.py in bug 972925 was
> https://github.com/mozilla/addon-sdk/compare/
> 2d29a417803206a25c87764b6a7ca1d6cfab9afc...
> 036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43
> 
> For some reason some process has a hold on our outfile, so I'm guessing that
> it's Windows? or a bug in Python for windows (did we update this?).
> 
> We can catch these exceptions and ignore them safely, but then the temp
> output file is left behind.

Let's do that for now to at least reduce the orange effect here. Can we log when it happens so we can at least see it?
Flags: needinfo?(dtownsend+bugmail)
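
A minimal sketch of that "catch, log, and move on" approach, using Python's standard logging module (illustrative only; the function and logger names are made up, not the actual harness change):

import logging
import os

log = logging.getLogger("jetpack-harness")  # made-up logger name

def cleanup_outfile(path):
    """Delete the harness stdout temp file; never fail the run over a locked file."""
    try:
        os.remove(path)
    except OSError as e:
        # The temp file is left behind, but the event is logged so it
        # still shows up in the test output.
        log.warning("leaving %s behind; file is still in use: %s", path, e)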
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #191)
> (In reply to Erik Vold [:erikvold] [:ztatic] from comment #190)
> > Hmm so nothing has really changed on our end that I would suspect of causing
> > a change here.
> > 
> > The only change to runner.py in bug 972925 was
> > https://github.com/mozilla/addon-sdk/compare/
> > 2d29a417803206a25c87764b6a7ca1d6cfab9afc...
> > 036238d8bd1bfbbbc2a44467be646711e3aeed19#diff-43
> > 
> > For some reason some process has a hold on our outfile, so I'm guessing that
> > it's Windows? or a bug in Python for windows (did we update this?).
> > 
> > We can catch these exceptions and ignore them safely, but then the temp
> > output file is left behind.
> > 
> > I'm not sure what else we can do besides doing a `time.sleep(..)` & retry or
> > tracking down the process holding on to the file, which I don't think that
> > came from the SDK side.
> 
> The only other possible change would be the child-process stuff.

Hmm, do the child processes we start get killed with the firefox process or do they use the output log?
Flags: needinfo?(jsantell)
child_process does not do anything with the output log. Do we have a list of commits from that range? I'm not sure what happens to processes that are spawned if Firefox crashes, but they shouldn't have a hold on that output file. Some of the tests communicate with processes over stdout/stdin, and maybe some of the test scripts shouldn't be used on Windows, but they're all very basic:
https://github.com/mozilla/addon-sdk/blob/ac485e854b754e614f5c697b681968022dafe6c4/test/fixtures/child-process-scripts.js#L16-L33

Does anything look like a red flag that'd cause this?
Flags: needinfo?(jsantell)
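
One way to answer the "do the child processes get killed with the Firefox process" question would be to snapshot and re-check them from the harness. A rough sketch, assuming a recent version of the third-party psutil package is available on the slaves (the SDK does not currently do this):

import psutil  # third-party; assumed available on the test slave

def snapshot_children(firefox_pid):
    """Record Firefox's child processes while it is still running."""
    try:
        return psutil.Process(firefox_pid).children(recursive=True)
    except psutil.NoSuchProcess:
        return []

def lingering_children(children):
    """After Firefox has exited, report which of those children survived."""
    return [child for child in children if child.is_running()]

# Usage idea: call snapshot_children() before tearing Firefox down, then
# lingering_children() after waiting for it to exit, and log the leftovers.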
Mossop and I were discussing possible solutions:

* Get rid of writing to a file for the test harness
* Check processes in our test harness and log them for more help (see the sketch below)
* Check whether child processes leave the Firefox process lingering during a failure
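
For the second item, a rough sketch of what "check processes and log them" could look like, again assuming a recent psutil on the slave (note that open-file information may require elevated privileges on Windows):

import psutil  # third-party; assumed available on the test slave

def who_has_file_open(path):
    """Return (pid, name) pairs for processes that hold `path` open."""
    holders = []
    for proc in psutil.process_iter():
        try:
            open_paths = [f.path for f in proc.open_files()]
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue  # cannot inspect this process; skip it
        if path in open_paths:
            holders.append((proc.pid, proc.name()))
    return holders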