Closed Bug 568663 Opened 15 years ago Closed 14 years ago

Intermittent failure in test_processasync.js | 1****** == 1310720 (for some value of ******)

Categories

(Testing :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla8

People

(Reporter: philor, Assigned: Waldo)

References

Details

(Keywords: intermittent-failure, Whiteboard: [inbound])

Attachments

(1 file)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1274993581.1274994961.21649.gz WINNT 5.2 mozilla-central opt test xpcshell on 2010/05/27 13:53:01 s: win32-slave30 TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\tests\test_necko\test\test_processasync.js | test failed (with xpcshell return code: 0), see following log: >>>>>>> TEST-INFO | (xpcshell/head.js) | test 1 pending HTTPD-INFO | >>> listening on port 4444, 20 pending connections TEST-INFO | (xpcshell/head.js) | test 2 pending TEST-INFO | (xpcshell/head.js) | test 2 finished Skipping 1464 Lines... HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _waitToWriteData HTTPD-INFO | *** onOutputStreamReady HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _waitToWriteData HTTPD-INFO | *** onOutputStreamReady HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _waitToWriteData HTTPD-INFO | *** onOutputStreamReady HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _waitToWriteData HTTPD-INFO | *** onOutputStreamReady HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _waitToWriteData HTTPD-INFO | *** onOutputStreamReady HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _waitToWriteData HTTPD-INFO | *** onOutputStreamReady HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _waitToWriteData HTTPD-INFO | *** onOutputStreamReady HTTPD-INFO | *** wrote 8192 bytes of data HTTPD-INFO | *** _cancelOrDispatchCancelCallback(0) HTTPD-INFO | *** cancel(0) HTTPD-INFO | *** _doneReadingSource(0x0) HTTPD-INFO | *** _finishSource(0) HTTPD-INFO | *** _cancelOrDispatchCancelCallback(0) HTTPD-INFO | *** onStopRequest async callback HTTPD-INFO | *** onStopRequest [status=0x0] HTTPD-INFO | *** closing connection 4 on port 1806 TEST-UNEXPECTED-FAIL | e:/builds/moz2_slave/mozilla-central-win32-opt-unittest-xpcshell/build/xpcshell/tests/test_necko/test/test_processasync.js | 1047552 == 1310720 - See following stack: JS frame :: e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\head.js :: do_throw :: line 257 JS frame :: e:\builds\moz2_slave\mozilla-central-win32-opt-unittest-xpcshell\build\xpcshell\head.js :: do_check_eq :: line 287 JS frame :: e:/builds/moz2_slave/mozilla-central-win32-opt-unittest-xpcshell/build/xpcshell/tests/test_necko/test/test_processasync.js :: stop_handleAsyncOrdering :: line 333 JS frame :: e:/builds/moz2_slave/mozilla-central-win32-opt-unittest-xpcshell/build/xpcshell/tests/test_necko/test/head_utils.js :: anonymous :: line 426 TEST-INFO | (xpcshell/head.js) | exiting test
Hm, did something just change for this to get a bunch of reports in the last few days, versus a few over many months before? Maybe I'll try to look at this when I get back to the office in a couple days, if it has become more reproducible.
Potentially two things changed: note that every one of those has been on the horribly named "WINNT 5.1" WinXP slaves, which we just recently started running on, plus possibly some code change - http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=625c08180421&tochange=cee083078957 is the 24 hours leading up to the first instance.
Summary: Intermittent failure in test_processasync.js | 1047552 == 1310720 → Intermittent failure in test_processasync.js | 1****** == 1310720 (for some value of ******)
(In reply to comment #7) > Hm, did something just change for this to get a bunch of reports in the last > few days, versus a few over many months before? Maybe I'll try to look at > this when I get back to the office in a couple days, if it has become more > reproducible. Jeff, did you ever get to do this? This is too frequent and it's starting to show up among the top 10 intermittent orange bugs. :(
This is now the third-most-frequent orange, and the most-frequent orange which doesn't have a patch.
So this happening means that we did not in fact get all the data. This seems bad! Looking at the log, I see "wrote 8192 bytes of data" logged 160 times, which looks correct (8192 * 160 == 1310720). So why is that stop handler not seeing it all?
(In reply to comment #157) > Jeff, did you ever get to do this? This is too frequent and it's starting > to show up among the top 10 intermittent orange bugs. :( No, I didn't get back to working on this. :-\ Taking another look now. (In reply to comment #283) > Looking at the log, I see "wrote 8192 bytes of data" logged 160 times, which > looks correct (8192 * 160 == 1310720). So why is that stop handler not > seeing it all? I'm not sure why it's not seeing it. My first thought is that maybe the code reading from the HTTP channel isn't actually being provided all the data that was written. That makes no more sense than anything else, but that onDataAvailable implementation *is* one point that does stuff but doesn't dump any diagnostics which would have pointed out a failure earlier. If other stuff is printing expected information, maybe the problem lies with the code that's not printing expected information. I've made some diagnostic tweaks and pushed them to try, will see what information that provides us: http://tbpl.mozilla.org/?tree=Try&rev=d333add6ac92 If that run doesn't work, I'll keep queueing it up until I get something.
I pushed http://hg.mozilla.org/integration/mozilla-inbound/rev/95cc67ff2903 to get some diagnostic information, because several pushes of that patch (plus one atop it for try-syntax) hadn't yet hit the xpcshell failure. Hopefully the law of large numbers will provide further information in short order.
Whee, a failure! Here's the change I made, to get the output below: 1.28 onDataAvailable: function(request, cx, inputStream, offset, count) 1.29 { 1.30 - Array.prototype.push.apply(this._data, 1.31 - makeBIS(inputStream).readByteArray(count)); 1.32 + dumpn("*** DATA AVAILABLE FOR TEST " + testArray[testIndex].path); 1.33 + dumpn("*** COUNT: " + count); 1.34 + dumpn("*** OFFSET: " + offset); 1.35 + dumpn("*** ALREADY: " + this._data.length); 1.36 + var newData = makeBIS(inputStream).readByteArray(count); 1.37 + dumpn("*** READ NOW: " + newData.length); 1.38 + Array.prototype.push.apply(this._data, newData); 1.39 + dumpn("*** UPDATED: " + this._data.length); 1.40 }, Now, for the output in the log from comment 297, last bit of input reading: HTTPD-INFO | *** DATA AVAILABLE FOR TEST http://localhost:4444/handleAsyncOrdering HTTPD-INFO | *** COUNT: 614400 HTTPD-INFO | *** OFFSET: 696320 HTTPD-INFO | *** ALREADY: 696320 HTTPD-INFO | *** READ NOW: 614400 HTTPD-INFO | *** UPDATED: 1187840 OFFSET should always equal ALREADY. That's good. Since this is the last data available, COUNT + OFFSET should equal 1310720. That's good. READ NOW should always equal COUNT. That's good. And UPDATED should equal ALREADY + READ NOW. ...and it doesn't. Instead of 614400, the delta is 1187840 - 696320 == 491520. So our bytes-push, didn't fully push. I would be confused about this, but I now see I've heard this song before in bug 586525. Indeed, 491520 == StackSpace::ARGS_LENGTH_MAX, if you unwind the constants a little. The JS engine shouldn't be doing this, bug 607371, but we can certainly hack around that for now. I'll do that Monday -- I have pressing matters to attend to today, and I have plans all day tomorrow.
Depends on: 607371
I did a grep for all uses of the string "apply" in netwerk/test/httpserver, and this fixes all uses where the argument array might have a length which could go past engine limits. Most are intrinsically lower due to the data they're working with. One in httpd.js was a little trickier -- it was below only because the length of the array was the min of its original length and Response.SEGMENT, which is well below the argument limit -- but was actually safe before this patch. In any case, this should take care of all issues in the server code and tests. I've raised the underlying issue with ECMA to see if we can get a new method added to Array to address the case of wanting to push the elements of an array, onto another array. We'll see how that goes. I think I'm also annoyed enough by this to make passing an array of too many arguments to apply() just throw an exception. But that's of course a separate bug.
Assignee: nobody → jwalden+bmo
Status: NEW → ASSIGNED
Attachment #548309 - Flags: review?(honzab.moz)
Comment on attachment 548309 [details] [diff] [review] Undo diagnostics, work around Function.prototype.apply's silent clamping behavior, add a test for one problematic instance of the problem Review of attachment 548309 [details] [diff] [review]: ----------------------------------------------------------------- Next time please larger context. r=honzab with few comments. ::: netwerk/test/httpserver/httpd.js @@ +1888,5 @@ > + > + // Large numbers of bytes may cause Array.prototype.push to be called with > + // more arguments than the JavaScript engine supports. In that case append > + // bytes in fixed-size amounts until all bytes are appended. > + for (var start = 0; start < count; start += quantum) Where |count| comes from here? Shouldn't it be assigned bytes.length? @@ +1890,5 @@ > + // more arguments than the JavaScript engine supports. In that case append > + // bytes in fixed-size amounts until all bytes are appended. > + for (var start = 0; start < count; start += quantum) > + { > + var newData = bytes.slice(start, Math.min(start + quantum, count)); Rather call this |subData| or |slice| ? @@ +1943,1 @@ > out.value = line.substring(0, length); If you process only |length| number of bytes, then you probably don't need this trim, so you can potentially store to out.value directly as well... Also maybe add a comment we splice also the CR LF bytes out from data but are not interested in them for the result? @@ -1910,2 @@ > out.value = line.substring(0, length); > - Personally I would leave this empty line, but this is your child ;) ::: netwerk/test/httpserver/test/head_utils.js @@ +364,5 @@ > + for (var start = 0; start < count; start += quantum) > + { > + var newData = bis.readByteArray(Math.min(quantum, count - start)); > + Array.prototype.push.apply(this._data, newData); > + } I think you really should have a helper method for this, but this is not necessary a review request. You can argument the apply method as well and pass also a function taking (start, stop) args to use for either slice(start, stop) or readByteArray(stop - start). safeApply(slice, method, target, length) { for (var start = 0; start < length; start += gApplyQuantum) method.apply(target, slice(start, Math.min(start + gApplyQuantum, length)); } And then do: safeApply( function(start, end) { return bis.readByteArray(end - start) }, Array.prototype.push, this._data, count); If I'm right, can be used for strings as well. Maybe add such methods to appropriate prototypes? @@ +438,4 @@ > data = [data]; > if (data.length <= 0) > throw "bad data length"; > + if (!data.every(function(v) { return /^[\x00-\xff]*$/.test(v); })) Doesn't this belong to another bug?
Attachment #548309 - Flags: review?(honzab.moz) → review+
(In reply to comment #308) > Next time please larger context. Sorry, I upload patches from .hg/patches, which gives it whatever amount of context is the default (don't think I have it overridden, although I could be wrong). > Where |count| comes from here? Shouldn't it be assigned bytes.length? Ugh, totally right, fixed. And there's no way to hit the problem here except if the OS gives the input stream from which that data's read more than 490KB of data at once, so it's impossible to write a foolproof test for it being gone. :-\ There is that raw test, which is supposed to address this, but you're still at the whim of the OS networking code. > Rather call this |subData| or |slice| ? Sounds good. > If you process only |length| number of bytes, then you probably don't need > this trim, so you can potentially store to out.value directly as well... > > Also maybe add a comment we splice also the CR LF bytes out from data but > are not interested in them for the result? Both sensible, made adjustments to this effect. > @@ -1910,2 @@ > > out.value = line.substring(0, length); > > - > > Personally I would leave this empty line, but this is your child ;) The return value of the function is really the combination of out-setting and returning true, so I slightly prefer keeping them together as here. > I think you really should have a helper method for this I'm in the process of requesting that ECMAScript add an Array.prototype.pushAll method (name and exact semantics up for debate) to the spec to address this. (An implemented-in-script version can be monkey-patched into place pretty easily to remain compatible with non-ES6 engines and such.) The lack of such a method is a pretty bad botch in my book, even ignoring this particular failure mode of push.apply as a half-workaround. > Maybe add such methods to appropriate prototypes? Yeah, that would be the plan. :-) > @@ +438,4 @@ > > data = [data]; > > if (data.length <= 0) > > throw "bad data length"; > > + if (!data.every(function(v) { return /^[\x00-\xff]*$/.test(v); })) > > Doesn't this belong to another bug? Yes, but no. I ran into this bug because when I changed the raw test to use a really, really long status line, I hit this bug -- which checked not that every character was in the [0, 256) range, but that for every character, every character in the *entire array, stringified* was in the [0, 256) range. This quadratic behavior bites quite hugely with the new test that has a 512K status line. :-) So it's necessary for the rest of the changes in the bug to work, but it could be applied separately from the other changes, as long as it landed before. I didn't think to separate it for a one-line change, and even in retrospect it doesn't seem hugely more clarifying to separate it out.
OS: Windows Server 2003 → All
Hardware: x86 → All
Whiteboard: [orange] → [orange][inbound]
Target Milestone: --- → mozilla8
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Blocks: 586525
Whiteboard: [orange][inbound] → [inbound]
Component: httpd.js → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: